<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ReCAP - Information Retrieval and Case-Based Reasoning for Robust Deliberation and Synthesis of Arguments in the Political Discourse</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ralph Bergmann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ralf Schenkel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lorik Dumani</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefan Ollinger</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Trier</institution>
          ,
          <addr-line>Behringstrasse 13, 54296 Trier</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The ReCAP project is a recently started project within the DFG priority programme Robust Argumentation Machines (RATIO). It follows the vision of future argumentation machines that support researchers, journalistic writers, and human decision makers in obtaining a comprehensive overview of current arguments and opinions on a given topic, and in developing personal, well-founded opinions justified by convincing arguments. Unlike existing search engines, which primarily operate on the textual level, such argumentation machines will reason on the knowledge level formed by arguments and argumentation structures. The focus of ReCAP is on novel contributions to, and the confluence of, methods from information retrieval and from knowledge representation and reasoning, in particular case-based reasoning. The aim is to develop methods that capture arguments in a robust and scalable manner, in particular by representing, contextualizing, and aggregating arguments and making them available to a user. Together with experts from the political domain, real-world scenarios and use cases are worked out. A corpus of semantically annotated argumentations is being created from relevant text sources and will be made available to the argumentation research community.</p>
      </abstract>
      <kwd-group>
        <kwd>Argumentation</kwd>
        <kwd>Information Retrieval</kwd>
        <kwd>Case-Based Reasoning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Argumentation is a core aspect of everyday human life, e.g. in medicine, law,
and politics. On the one hand, people search for arguments in order to make good
decisions of their own; on the other hand, they search for or form arguments to persuade
others. However, in the information age, with big data and fake news, it is almost
impossible to manually find all valid and relevant arguments for a given topic.</p>
      <p>
        An argument consists of a claim or standpoint supported or opposed by
reasons or premises [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Argument components are usually expressed in natural
language. Together, the arguments form a graph or argumentation structure.
Deliberation finds and weighs all arguments supporting or opposing some
question or topic based on the available knowledge, e.g. by assessing their strength
or factual correctness, to enable informed decision making, e.g. for a political
action. Synthesis tries to generate new arguments for an upcoming topic based
on transferring an existing relevant argument to the new topic and adapting it
to the new environment.
      </p>
      <p>This paper gives an overview of the recently started project ReCAP, which
is part of the DFG priority programme Robust Argumentation Machines
(RATIO)<sup>1</sup>. It follows the vision of future argumentation machines that support
political researchers and journalistic writers in deliberation and synthesis. Unlike
existing search engines, which primarily operate on the textual level, such
argumentation machines will reason on a knowledge level formed by (argumentative)
propositions and argumentation structures. We propose a general architecture
for an argumentation machine, with a focus on novel contributions to, and the
confluence of, methods from Information Retrieval (IR) and Knowledge Representation
and Reasoning (KRR), in particular Case-Based Reasoning (CBR). The argumentation
machine works closely with argumentation structures in natural language. To
achieve argumentative reasoning, it abstracts further away from the
text through notions of similarity, extraction of facts, validation, clustering,
generalization, and adaptation of arguments, thereby offering some form of argument
competency. Together with experts from the political domain, we develop
real-world scenarios that feed back into the development of the system. A corpus of
high-quality argumentation structures for closely related topics is being developed.</p>
      <p>Section 2 describes the deliberation and synthesis use cases that drive the
research in the project. Then, sect. 3 gives an overview of the project
before sect. 4 presents the proposed methods that are the focus of research.
Section 5 presents first results of a workshop with our cooperation partners
from the political domain. Finally, sect. 6 summarizes related work and sect. 7
concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Use Cases on Deliberation and Synthesis</title>
      <p>We now illustrate the ReCAP vision with two exemplary use cases from the domain
of political journalism. In the deliberation use case, we consider a journalist who
wants to write a survey article about a topic that is currently debated (or was
debated in the past). Typical examples of such topics from the recent past are
Brexit, accepting refugees in Europe, or a specific countermeasure against
the subprime mortgage crisis. For her survey, she wants to collect arguments
for and against the topic, and she decides to search for this information on
the Web. As of now, she would have to manually collect large amounts of
relevant documents (such as news articles, forum entries, blog posts, etc.) using a
search engine, then manually extract arguments from the documents and
cluster similar arguments; she may also want to rank the arguments so that
she can focus on the most important ones. The methods and tools that we will
develop in the ReCAP project will provide strong support in this case, so
that the journalist will be able to focus on the journalistic aspects of the problem.
The journalist only needs to provide a textual description of the topic. The ReCAP
argumentation machine will then automatically find documents with
argumentative content in which the topic is discussed and extract the (possibly complex)
argumentation structure. It will then cluster similar arguments and
argumentation structures, thus allowing for a concise overview of the discussion. Additional
modules will help assess the strength and the validity of arguments. As the
output of the search, the system will present an aggregated view of arguments or
even argumentation structures for and against the topic, weighted by popularity,
truthfulness, or persuasiveness, with the option to drill down and look at the
(textual) sources of each argument. For a deeper analysis, our system can show,
for each argument, how this argument is in turn supported or attacked by other
arguments.</p>
      <p><sup>1</sup> http://www.spp-ratio.de</p>
      <p>In the synthesis use case, the journalist attempts to forecast possible future
discussions that may emerge about a topic that is just about to become
important, based on similar discussions in the past. As a typical example, one could
ask which of the arguments for and against Brexit would still apply to an exit
of the Netherlands from the European Union. Instead of finding documents
discussing this topic and extracting arguments from them, the journalist now needs
to locate documents on similar topics (in our example, documents on Brexit) and
examine which arguments used there still apply, which need to be modified, and
which cannot be applied in the new scenario. Again, the goal of the ReCAP
project is to develop methods and tools that will support the journalist in
such tasks. Given a topic, we will first determine similar topics based on
argument similarity measures; in the example, this could be Brexit or Grexit. The
ReCAP argumentation machine then finds documents discussing these related
topics, extracts argumentation structures from them, and aggregates them. For
each argument used for such a related topic, the system will then estimate whether it is
still valid in the new context, whether it needs to be adapted or replaced by an analogous
one, or whether it does not apply in this context at all and must be removed. As a
result, a new argumentation that specifically addresses the potential exit of the
Netherlands from the EU is synthesised by reuse and analogical transfer of
existing argument structures.</p>
    </sec>
    <sec id="sec-3">
      <title>3 Project Overview</title>
      <p>The overall project vision is reflected in a preliminary view of the argumentation
machine’s architecture, depicted in Fig. 1. This figure serves as an overview of
the various basic research questions addressed in the ReCAP project as well as
of their interrelationships. The bottom part of this layered architecture shows
the textual level of the argumentation machine, addressing argument mining and
corpus construction from existing textual sources, leading to semantically
annotated argumentation graphs that reflect the document’s content on the knowledge
level. The validation of factual statements in arguments based on related text and
databases leads to further enhancements of their representation on the
knowledge level by including assessments of validity and strength as well as links
to the related textual evidence.</p>
      <p>[Fig. 1: Layered architecture of the argumentation machine. Application level: User Interaction, Context, Deliberation, Synthesis. Knowledge level: processing building blocks Similarity (Learning and Assessment), Analysis (Extraction, Clustering, Generalisation), Retrieval, and Case-Based Reasoning; argumentation base with Extracted Argumentation Graphs and Reusable, Generalised, and Validated Components (Elementary Arguments, Argumentation Schemas). Textual level: processing building blocks Source Text Retrieval and Argument Mining, and Validation; text sources Argumentative Texts, and Factual Texts and Databases. Side labels: Use Cases, Scenario Support, Evaluation Benchmarks, Annotated Corpora.]</p>
      <p>The knowledge-level reasoning is positioned on
top of the textual level. The extracted, specific argumentation graphs need to
be analysed such that their major common constituents (elementary arguments,
supporting and attacking relations, argumentation structures) are identified.
Together with the specific argumentation graphs, they will form the argumentation
base for further knowledge processing. We will investigate new similarity
measures for comparing arguments and argument structures, supported by machine
learning methods for textual similarity. Such a computational notion of
similarity is also the core of argumentation graph analysis. In addition, argument
(structure) retrieval, as required for finding suitable arguments, requires a notion
of similarity to measure the relevance of arguments. Finally, Case-Based Reasoning
aims at supporting synthesis by designing new argument structures through adapting
the best existing similar structures from the argumentation base by analogical
transfer.</p>
      <p>The upper level of the architecture encompasses the specific
application-oriented components that support deliberation and synthesis as required for the
targeted use cases. Their implementation will make use of the knowledge
processing building blocks. The context module aims at capturing, analysing, and
representing the specific user’s context, i.e., the specific issue under consideration
as well as specific beliefs and constraints of the user.</p>
      <p>We will focus our research in the first three years primarily on the
knowledge level. On the textual level, we address the collection of relevant evidence for
validation. Research on argument mining will be deferred. Thus, the
transformation of available text sources into semantically annotated argumentation graphs
is performed manually. For this purpose, we identify relevant German-language
text sources from the political domain and manually annotate the argumentation
structures that occur. In this way, a high-quality research corpus is constructed
(see sect. 5), which supports the evaluation of the developed methods.</p>
    </sec>
    <sec id="sec-4">
      <title>4 Methods</title>
      <p>We now give a brief overview of the methods being researched within the overall
architecture of the argumentation machine.</p>
      <sec id="sec-4-1">
        <title>4.1 Representation of Argumentation Structures</title>
        <p>
          We will develop a model for representing argumentation structures and their
components. An argument is a relationship between a claim and several premises,
by which the claim is either attacked or supported. An argumentation structure
consists of arguments, forming a graph. Claims and premises have a textual
representation in natural language. Our model essentially extends AIF [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]
with support for storing specific meta information on claims, premises, and their
relationships (e.g. correctness and strength measures, who used the argument,
when it was used, etc.) and for explicitly linking arguments and documents. For
each argument, we will include provenance information in the form of its exact
position in the source document, the annotator, and possibly further meta
information from the automated annotation process. Additionally, we will model
argumentation schemes [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] and their instantiation in a concrete argumentation
structure.
        </p>
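        <p>The model described above could be sketched, for illustration only, as a small set of Python data classes. This is not the project's actual schema and it does not reproduce AIF: all class and field names (Provenance, Proposition, Argument, ArgumentationGraph) and the example propositions are assumptions made for this sketch. It only shows how claims, premises, support and attack relations, meta information, and provenance might fit together.</p>
        <preformat>
```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Provenance:
    """Where an argument was found (all field names are illustrative)."""
    document_id: str
    start: int       # character offset where the argument begins
    end: int         # character offset where the argument ends
    annotator: str


@dataclass
class Proposition:
    """A claim or premise in natural language plus optional meta information."""
    text: str
    validity: Optional[float] = None   # estimated degree of truth in [0, 1]
    strength: Optional[float] = None   # estimated importance in [0, 1]


@dataclass
class Argument:
    """Relates several premises to one claim, supporting or attacking it."""
    claim: str             # key of the claim proposition
    premises: List[str]    # keys of the premise propositions
    relation: str          # "support" or "attack"
    provenance: Optional[Provenance] = None


@dataclass
class ArgumentationGraph:
    propositions: Dict[str, Proposition] = field(default_factory=dict)
    arguments: List[Argument] = field(default_factory=list)

    def add_argument(self, argument: Argument) -> None:
        # Reject arguments that refer to unknown propositions.
        for key in [argument.claim, *argument.premises]:
            if key not in self.propositions:
                raise KeyError(f"unknown proposition: {key}")
        if argument.relation not in ("support", "attack"):
            raise ValueError("relation must be 'support' or 'attack'")
        self.arguments.append(argument)

    def premises_for(self, claim: str, relation: str) -> List[str]:
        """Keys of all premises that support or attack the given claim."""
        return [p for a in self.arguments
                if a.claim == claim and a.relation == relation
                for p in a.premises]


# Made-up example propositions, not taken from the project corpus.
graph = ArgumentationGraph()
graph.propositions["c"] = Proposition("Full-day schooling should be introduced.")
graph.propositions["p1"] = Proposition("Parents can work full time.", strength=0.7)
graph.propositions["p2"] = Proposition("Children have less free time.")
graph.add_argument(Argument("c", ["p1"], "support",
                            Provenance("doc-1", 120, 148, "annotator-a")))
graph.add_argument(Argument("c", ["p2"], "attack"))
```
        </preformat>
        <p>Storing propositions by key and letting arguments refer to these keys keeps the structure a graph: the same proposition can act as a premise in one argument and as the claim of another.</p>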
        <p>
          Our methods often work on the textual representation of premises and claims.
A pre-processing step improves the understanding of a proposition in isolation
by adding further annotations, e.g. through POS tagging, mapping to an ontology,
and anaphora resolution. A model of the user context represents the user's attitude
towards certain arguments, documents, experts, and sources, similar to the
context definition by Brewka &amp; Eiter [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], thus allowing for personalised argument
ranking, retrieval, and adaptation. The context for a user group generalises this
to groups of users (e.g. right-wing populists), allowing for a targeted analysis of
the argumentation within a certain population.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2 Quantitative Properties of Arguments</title>
        <p>
          The validity or factual correctness of a proposition estimates the degree of truth
of the proposition. Validity measures will exploit the textual representation of
a premise in two different ways. In a first line of work, we will develop methods
that evaluate the validity of a proposition by connecting it to a factual
knowledge base, extending existing work on RDF facts [
          <xref ref-type="bibr" rid="ref11 ref17">11,17</xref>
          ]. We will apply and
modify pattern-based information extraction methods that extract checkable facts
from the proposition. If a fact is found in the knowledge base, this is a strong
indication that the fact is correct. We will use the YAGO knowledge base [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ],
which has the advantage of including both multilingual information and temporal
information on facts. As pure fact-checking against a knowledge base will only
work for a subset of all facts used in argumentations, we plan to investigate
an alternative approach that attempts to find the (possibly rephrased)
proposition in a large text corpus or even on the Web and estimates its correctness
based on the frequency and/or authority of sources; the latter may depend on the
context. We will build on existing methods such as those proposed by Leong
and Cucerzan [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], which focus on reformulating a factual statement, not an
argumentative proposition, and operate only on Wikipedia, exploiting specific
properties like citations and inter-article linkage. In a next step, we will consider,
for all premises linked to a specific claim, their strength or importance for the
discussion, potentially relative to a given context. The goal here is to develop a
ranking function. Popularity may be a reasonable start, but will usually not be
enough, since the strongest argument may not be given often, whereas a weak
argument that is known to many people may be given frequently.
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3 Similarity and Analysis of Argumentation Structures</title>
        <p>
          Similarity is a core concept when reasoning with argumentation
structures extracted from text. Similarity measures will be considered for different
purposes, in particular for similarity-based retrieval of graphs (see sect. 4.4) for
deliberation as well as for CBR (see sect. 4.5) to support the synthesis of new
argumentation graphs. Following the local-global principle established in CBR
[
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], the measure will be decomposed into local similarity measures assessing the
similarity of all information available in the representation. Those measures will
include combined structural and textual similarity measures [
          <xref ref-type="bibr" rid="ref25 ref27">25,27</xref>
          ], the
semantic closeness of the arguments and relation types based on related ontological
information [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ], as well as the numerical similarity of certain attributes, such
as validity and strength estimations. The global similarity is computed using
an optimization process [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] which creates an alignment between the nodes and
edges of the two argumentation graphs based on the local similarities. We will
analyse the computational complexity of this optimisation problem and develop
heuristic methods for finding good approximate solutions under acceptable time
constraints. The developed measures will be evaluated in the context of their
application and purpose, based on systematically constructed ranking experiments
with real users.
        </p>
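        <p>To make the local-global principle more concrete, the following sketch computes a global similarity between two tiny argumentation graphs from local node and edge similarities. It rests on several assumptions that are not part of the project description: graphs are plain dictionaries, token overlap (Jaccard) stands in for a learned textual similarity, the only numeric attribute is a validity value, and the alignment is found by brute force over all injective node mappings, which is feasible only for very small graphs. The example texts and validity values are made up.</p>
        <preformat>
```python
from itertools import permutations


def text_similarity(a: str, b: str) -> float:
    """Jaccard overlap of lower-cased tokens (stand-in for a learned measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta.intersection(tb)) / len(ta.union(tb))


def node_similarity(q: dict, c: dict, w_text: float = 0.8) -> float:
    """Local similarity: weighted mean of textual and validity similarity."""
    numeric = 1.0 - abs(q["validity"] - c["validity"])
    return w_text * text_similarity(q["text"], c["text"]) + (1.0 - w_text) * numeric


def global_similarity(query: dict, case: dict) -> float:
    """Best alignment score over all injective mappings of query to case nodes.

    A graph is a dict with "nodes" (id to attributes) and "edges"
    (a set of (source, target, relation) triples).
    """
    q_ids, c_ids = list(query["nodes"]), list(case["nodes"])
    # Simplification: a query larger than the case is not aligned at all.
    if not q_ids or len(q_ids) > len(c_ids):
        return 0.0
    best = 0.0
    for image in permutations(c_ids, len(q_ids)):
        mapping = dict(zip(q_ids, image))
        scores = [node_similarity(query["nodes"][q], case["nodes"][mapping[q]])
                  for q in q_ids]
        # A query edge counts as matched if its mapped counterpart exists.
        scores += [1.0 if (mapping[s], mapping[t], r) in case["edges"] else 0.0
                   for (s, t, r) in query["edges"]]
        best = max(best, sum(scores) / len(scores))
    return best


query = {
    "nodes": {
        "a": {"text": "tuition fees reduce access to university", "validity": 0.6},
        "b": {"text": "tuition fees should be abolished", "validity": 0.5},
    },
    "edges": {("a", "b", "support")},
}
other = {
    "nodes": {
        "x": {"text": "tuition fees improve teaching quality", "validity": 0.4},
        "y": {"text": "tuition fees should be kept", "validity": 0.5},
    },
    "edges": {("x", "y", "support")},
}
same = global_similarity(query, query)
different = global_similarity(query, other)
```
        </preformat>
        <p>The number of candidate mappings grows factorially with graph size, which is one reason why heuristic methods for approximate solutions are of interest.</p>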
        <p>
          Building upon this research on similarity, we will further consider additional
methods that support the analysis and decomposition of the argumentation
graphs. We will develop clustering algorithms for arguments and
argumentation graphs based on hierarchical clustering [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], such as divisive clustering using
k-medoid splitting, making use of the developed similarity measures.
Generalisation can be performed without changing the structure of the graph by replacing
one or several arguments by more general arguments [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. The previously
performed clustering provides the cluster label as a general argument for a cluster
member. In the case of hierarchical clusters, generalisation can be achieved on
several levels of generality. Finally, frequently occurring elementary
arguments, argumentation graphs, and subgraphs will be identified, extracted,
generalised (if possible), and stored in a separate part of the argument base for
reusable argumentation components.
        </p>
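        <p>As a minimal sketch of divisive clustering with k-medoid splitting, the code below recursively splits a set of arguments into two groups around a pair of medoids until every cluster is compact enough. Several choices here are assumptions for illustration: k is fixed to 2, the best medoid pair is found by exhaustive search rather than by an iterative k-medoids procedure, the stopping rule is a maximum cluster diameter, and a token-based Jaccard distance stands in for the similarity measures to be developed. The example sentences are made up.</p>
        <preformat>
```python
from itertools import combinations


def jaccard_distance(a: str, b: str) -> float:
    """1 minus the token overlap of two texts (illustrative distance)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return 1.0 - len(ta.intersection(tb)) / len(ta.union(tb))


def two_medoid_split(items, dist):
    """Split items around the medoid pair with the lowest total distance."""
    best_cost, best_groups = None, None
    for m1, m2 in combinations(items, 2):
        g1 = [x for x in items if dist(x, m2) >= dist(x, m1)]
        g2 = [x for x in items if dist(x, m1) > dist(x, m2)]
        if not g1 or not g2:
            continue  # degenerate split, e.g. two identical medoids
        cost = sum(min(dist(x, m1), dist(x, m2)) for x in items)
        if best_cost is None or best_cost > cost:
            best_cost, best_groups = cost, (g1, g2)
    return best_groups


def divisive_clustering(items, dist, max_diameter):
    """Split recursively until each cluster's largest pairwise distance is small.

    Assumes max_diameter is not negative.
    """
    diameter = max((dist(a, b) for a, b in combinations(items, 2)), default=0.0)
    if max_diameter >= diameter:
        return [list(items)]
    left, right = two_medoid_split(items, dist)
    return (divisive_clustering(left, dist, max_diameter)
            + divisive_clustering(right, dist, max_diameter))


texts = [
    "fees limit access",
    "fees limit university access",
    "smaller classes help pupils",
    "smaller classes help teachers",
]
clusters = divisive_clustering(texts, jaccard_distance, 0.5)
```
        </preformat>
        <p>Because each split is recorded by the recursion, the same procedure yields a hierarchy of clusters, which matches the idea of generalising on several levels.</p>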
      </sec>
      <sec id="sec-4-4">
        <title>4.4 Retrieval of Argumentation Structures</title>
        <p>
          When exploring the arguments for some topic, a user may want to identify other
topics where an argument was used, similar argumentation structures in
different topics, or places where a specific partial instantiation of an argumentation scheme
is used. We will identify typical information needs based on discussions with
experts. To enable such complex queries, we will define a query language based
on a graph query language like SPARQL [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] that allows the specification of strict
constraints (e.g. a minimal strength), vague constraints (e.g. textual content of
arguments), and constraints on the graph structure (e.g. relations, higher-order
relations, and scheme instantiations [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]). Since important information may be
available only in the documents from which arguments were extracted or mined,
but not in the extracted arguments, the language will also allow specifying vague
and strict constraints on source documents. The language will also include means
to refer to a predefined context. Scoring and ranking methods for the results will
combine graph-based similarity measures (sect. 4.3) with content-based scores
for conditions on documents and argument properties, building on our earlier
work [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] for knowledge bases. We plan to apply standard learning-to-rank
techniques that combine a large number of query-independent and query-dependent
features. For training these models, we will develop an annotated corpus of
structured queries together with relevant results, and we will develop tools for
constructing these relevance assessments based on crowdsourcing. We will also
develop an easy-to-use query interface with the option to explore detailed
solutions, including faceted browsing.
        </p>
      </sec>
      <sec id="sec-4-5">
        <title>4.5 Case-Based Reasoning with Argumentation Structures</title>
        <p>
          The work on similarity and retrieval is extended towards a comprehensive CBR
approach for the synthesis of argumentation graphs by reuse and adaptation
of argumentation graphs and propositions from the argumentation base. This
research builds upon previous work on CBR for legal argumentation [
          <xref ref-type="bibr" rid="ref24 ref7">7,24</xref>
          ] as
well as on previous work on process-oriented CBR [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. We will research
new adaptation methods that iteratively transform a retrieved argumentation
graph into an adapted argumentation graph that is better suited to the
query than the original graph. In particular, we will transfer the main concepts
successfully developed for the compositional and operator-based adaptation of
workflow graphs [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] to argumentation graphs. This includes methods for
learning the required adaptation knowledge from the argumentation base. For
learning adaptation operators, pairs of similar argumentation graphs from the
argumentation base will be searched and compared (e.g. by constructing a mapping between
the two graphs). The identified differences will be analysed and
turned into a formal operator description that is able to bridge this difference
between the two graphs. While the general principle underlying this learning
approach is established in CBR [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], its application to argumentation graphs
is novel. The adaptation process itself will then apply the learned
adaptation knowledge iteratively to the retrieved argumentation graph, leading
to a local search process. This search process (which can be implemented, for
example, as hill climbing or as a stochastic local search) aims at optimising two
criteria in parallel: a) the similarity of the adapted argumentation graph to the
current query (how well the synthesised argumentation matches the claim)
and b) its validity and strength.
        </p>
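        <p>The local search described above could look roughly like the following hill-climbing sketch. It is a strong simplification under stated assumptions: an argumentation graph is reduced to a set of proposition strings, an adaptation operator is a hand-written substitution of one proposition by another (in the project such operators are to be learned), query similarity is approximated by term coverage, and the two criteria are combined by a weighted sum. The function names, propositions, and validity values are all made up for illustration.</p>
        <preformat>
```python
def substitute(old, new):
    """Build an operator that swaps proposition 'old' for 'new' if applicable."""
    def apply(graph):
        if old not in graph or new in graph:
            return None  # operator is not applicable to this graph
        return (graph - {old}) | {new}
    return apply


def make_score(query_terms, validity, weight=0.5):
    """Weighted sum of a) similarity to the query and b) mean validity."""
    def score(graph):
        covered = sum(1 for term in query_terms
                      if any(term in p for p in graph))
        similarity = covered / len(query_terms)
        mean_validity = sum(validity[p] for p in graph) / len(graph)
        return weight * similarity + (1 - weight) * mean_validity
    return score


def hill_climb(start, operators, score, max_steps=100):
    """Apply the best-scoring operator until no operator improves the score."""
    current, current_score = start, score(start)
    for _ in range(max_steps):
        candidates = [op(current) for op in operators]
        candidates = [c for c in candidates if c is not None]
        if not candidates:
            break
        best = max(candidates, key=score)
        best_score = score(best)
        if current_score >= best_score:
            break  # local optimum reached
        current, current_score = best, best_score
    return current, current_score


# Made-up validity estimates for the illustration.
validity = {
    "uk: net contributor to eu budget": 0.9,
    "netherlands: net contributor to eu budget": 0.9,
    "uk: island nation": 1.0,
    "netherlands: island nation": 0.0,
}
retrieved = frozenset({"uk: net contributor to eu budget", "uk: island nation"})
operators = [
    substitute("uk: net contributor to eu budget",
               "netherlands: net contributor to eu budget"),
    substitute("uk: island nation", "netherlands: island nation"),
]
adapted, adapted_score = hill_climb(
    retrieved, operators, make_score(["netherlands"], validity))
```
        </preformat>
        <p>In this toy run the first substitution is accepted because it raises the similarity to the query without lowering validity, while the second is rejected because the transferred proposition has a low validity estimate.</p>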
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5 Workshop with Cooperation Partners</title>
      <p>We organized a workshop with expert groups in the fields of journalistic writing
(led by Dr. Damian Trilling, University of Amsterdam) and political research
(led by Dr. Lasse Cronqvist, Trier University). The goal of this workshop was,
on the one hand, to develop a comprehensive understanding of the problems and
workflows of each occupational group in order to sketch visionary tools and, on
the other hand, to elaborate concrete use cases for deliberation and synthesis.
For the latter, topics and sources of argumentative text had to be identified.</p>
      <sec id="sec-5-1">
        <title>5.1 Operational Workflows and Potential Tools</title>
        <p>Nowadays, journalism can be divided into three main branches: classic
journalism, investigative journalism, and online journalism. Classic journalism follows
an ingrained routine with superficial investigation and known sources; journalists
are short of time and need to publish multiple articles per day, so there
is less need for argumentation tools. A journalist in investigative journalism deals
with a subject rigorously. She is concerned with phenomena in society, e.g.
right-wing extremism, and needs to search forums, blogs, etc. A lot of data
needs to be extracted, and supporting tools could be helpful here. In online
journalism, articles from print media are enriched, e.g. with infographics, headlines,
etc. This work is mainly done by other journalists, who are not always
familiar with the topics and have little time to address them in depth.
Supporting tools would thus be useful for these journalists, e.g. a deliberation
tool that generates an overview of all arguments for or against a given topic.</p>
        <p>
          A simplified example of such an overview is depicted in Fig. 2, showing
arguments for and against merging the school forms Hauptschule and Realschule
in Rhineland-Palatinate, grouped by argument type. The Dutch company
Argumentenfabriek created similar argument maps (mostly in Dutch, some in
English) in a manual process [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. A potential deliberation tool could generate such
argument maps automatically for arguments in German. It could also visualize
the temporal dynamics by allowing the user to focus on arguments used in a specific time
frame.
        </p>
        <p>[Fig. 2: Simplified argument map for the question "What are arguments for and against merging the school forms Hauptschule and Realschule in Rhineland-Palatinate?". Statements are grouped by argument type and attached to "for" and "against" branches.]</p>
        <p>Consequences: Career and study orientation in Rhineland-Palatinate is commendable. There are fewer early school leavers. The problems of the pupils of the Hauptschule are shifted to the Realschule. Parents' current fear of the Hauptschule leads to an increase in registrations at Gymnasiums and to a loss of quality because of more pupils with different potentials. With regard to the threat of a skill shortage, it does not make sense to educate young people more and more.</p>
        <p>Authority opinions: Malte Blümke, state chairman of the association of teachers, takes the line that Rhineland-Palatinate does have a sustainable school system. Doris Ahnen, the minister of education, takes the line that Rhineland-Palatinate does have a sustainable school system. Nils Wiechmann, the speaker of the state executive committee, says that the planned changes underpin our socially unfair education system.</p>
        <p>Statistics (for): Rhineland-Palatinate has school classes with some of the lowest pupil numbers nationwide.</p>
        <p>
          Unlike journalists, political researchers do not follow an ingrained routine.
The way a political researcher works is often very
individual and depends on the particular case. Frequently, they work with annotation
software such as MaxQDA. They annotate 20 to 30 texts on a certain topic and then
analyze passages of a certain argument type, focusing on the content of the
text rather than on the argument structure. The workshop with the political researchers
confirmed that the tools envisioned with the journalist expert group would also
be useful for these users. Both expert groups agreed that OVA [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] is a suitable
tool for annotating arguments in texts.
        </p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2 Topics and Sources for Use Cases</title>
        <p>The topics for use cases should be relevant to society and should not be too
specific, so that enough argumentative texts can be found. To ensure a high-quality
annotation with argument structures, typical texts discussing the topic should
be neither too complicated for a non-expert nor too simplistic, and the topic
needs to have a manageable size in terms of the different arguments used. In
addition, the meaning of typical concepts used should not change
significantly over time. Since the focus of the project is on arguments in German, topics
should be limited to Germany. Especially with regard to the synthesis of arguments,
information needs to be transferable between instances of topics.</p>
        <p>We settled on the topic of education policy because it meets all these
requirements. A core characteristic of education policy is that it is the
responsibility of the federal states. Arguments are therefore clearly tied to individual
states, which makes it possible to transfer arguments between states. In order to have a diverse
starting point, we decided to begin with three federal states (Rhineland-Palatinate,
Hamburg, and Bavaria). Within this overall theme, a number of specific detailed
topics were identified, e.g. the discussion of G8 vs. G9, the question of whether children
should spend full days at school, or the debate on whether Hauptschule and Realschule
should be merged.</p>
        <p>For every topic, textual sources of arguments had to be identified and the
argument structures within them manually annotated. Since the quality of the
arguments depends directly on these sources, we had to choose them with care.
The protocols of plenary debates are a good source since they include different
viewpoints, they are of medium complexity, and large volumes of them are freely
available. News articles are another central source; they also include different
viewpoints, but are often neither openly accessible nor free to use. We will also
consider information provided by political parties and lobby groups because these
are usually rich in arguments.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Related Work</title>
      <p>
        Many corpora with annotated argumentation already exist. Important
sources for such corpora are aifdb.org (from the University of Dundee), which
provides argumentation structures in the Argument Interchange Format [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], the
IBM Debating Technologies datasets based on Wikipedia articles, corpora
provided by the UKP Lab in Darmstadt, and corpora built by the Applied CL
Discourse Lab in Potsdam. However, none of these corpora is directly tied to
specific usage scenarios and queries for deliberation and synthesis of arguments.
      </p>
      <p>
        The validation of factual statements is an important problem not only in
argumentation, but also in many other fields. ClaimBuster [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] automatically identifies
factual claims in a debate that are worth checking, but does not verify them. TruthTeller
of the Washington Post (which is now offline) matched statements made by
politicians in a speech to a database of pre-checked statements. PolitiFact (http://www.politifact.com/) performs thorough manual fact-checking for selected claims
and rates the accuracy of statements. A number of rather limited methods for
automatic validation of factual statements exist. Given an
RDF triple, some approaches convert it into various textual representations and
search for Web documents supporting the fact; recent examples include Defacto [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and Multi-Verifier [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ].
Validation of factual statements represented in textual form has seen much less
work. Existing approaches essentially rely on paraphrasing a statement and
finding it in a reference collection like Wikipedia [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Textual entailment [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] can be
seen as a way of finding other statements that entail the statement in question.
Habernal and Gurevych [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] estimate the convincingness of arguments using
neural networks.
      </p>
      <p>
        Stab et al. [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] present ArgumenText, an argument retrieval system capable
of retrieving topic-relevant sentential arguments from a large collection of diverse
Web texts for any given controversial topic. The system first retrieves relevant
documents, then it identifies arguments and classifies them as “pro” or “con”,
and presents them ranked by relevance in a web interface. Gutfreund et al.
[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] introduce a system that automatically generates arguments supporting and
contesting a given point of view about a controversial topic.
      </p>
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>In this paper, we gave an overview of the main approach and the methods
involved in the recently started ReCAP project, which aims at developing an
argumentation machine and related applications in the domain of politics. The project
will be linked with the other projects in the priority programme RATIO. In
particular, we aim to incorporate research results from projects that focus on
argument mining during later phases of our research roadmap. In this way, we
aim to complete our vision of a full argumentation machine that works without
manual preparation of textual resources.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <article-title>Example of an overview map of arguments from Argumentenfabriek</article-title>
          . https://www.argumentenfabriek.nl/media/2276/10032-tno-co2-engels.pdf.
          Last accessed 8 June
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <article-title>OVA tool developed by ARG-tech</article-title>
          . http://ova.arg-tech.org/.
          Last accessed 8 June
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>I.</given-names>
            <surname>Androutsopoulos</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Malakasiotis</surname>
          </string-name>
          .
          <article-title>A survey of paraphrasing and textual entailment methods</article-title>
          .
          <source>J. Artif. Intell. Res. (JAIR)</source>
          ,
          <volume>38</volume>
          :
          <fpage>135</fpage>
          -
          <lpage>187</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>R.</given-names>
            <surname>Bergmann</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gil</surname>
          </string-name>
          .
          <article-title>Similarity assessment and efficient retrieval of semantic workflows</article-title>
          .
          <source>Information Systems</source>
          ,
          <volume>40</volume>
          :
          <fpage>115</fpage>
          -
          <lpage>127</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Ralph</given-names>
            <surname>Bergmann</surname>
          </string-name>
          and
          <string-name>
            <given-names>Gilbert</given-names>
            <surname>Müller</surname>
          </string-name>
          .
          <article-title>Similarity-based Retrieval and Automatic Adaptation of Semantic Workflows</article-title>
          .
          <source>In Synergies Between Knowledge Engineering and Software Engineering, number 626 in Advances in Intelligent Systems and Computing</source>
          , pages
          <fpage>31</fpage>
          -
          <lpage>54</lpage>
          . Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>F.</given-names>
            <surname>Bex</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lawrence</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Snaith</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Reed</surname>
          </string-name>
          .
          <article-title>Implementing the argument web</article-title>
          .
          <source>Commun. ACM</source>
          ,
          <volume>56</volume>
          (
          <issue>10</issue>
          ):
          <fpage>66</fpage>
          -
          <lpage>73</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>L. Karl</given-names>
            <surname>Branting</surname>
          </string-name>
          .
          <article-title>A reduction-graph model of precedent in legal analysis</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>150</volume>
          (
          <issue>1</issue>
          ):
          <fpage>59</fpage>
          -
          <lpage>95</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>G.</given-names>
            <surname>Brewka</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Eiter</surname>
          </string-name>
          .
          <article-title>Argumentation context systems: A framework for abstract group argumentation</article-title>
          .
          <source>In Proc. 10th Int. Conf. on Logic Programming and Nonmonotonic Reasoning (LPNMR)</source>
          , pages
          <fpage>44</fpage>
          -
          <lpage>57</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>S.</given-names>
            <surname>Craw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Wiratunga</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Rowe</surname>
          </string-name>
          .
          <article-title>Learning adaptation knowledge to improve case-based reasoning</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>170</volume>
          (
          <issue>16</issue>
          ):
          <fpage>1175</fpage>
          -
          <lpage>1192</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>S.</given-names>
            <surname>Elbassuoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ramanath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Schenkel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sydow</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Weikum</surname>
          </string-name>
          .
          <article-title>Language-model-based ranking for queries on RDF-graphs</article-title>
          .
          <source>In Proc. 18th ACM Conf. on Information and Knowledge Management</source>
          , pages
          <fpage>977</fpage>
          -
          <lpage>986</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>D.</given-names>
            <surname>Gerber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Esteves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bühmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Usbeck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Ngonga</given-names>
            <surname>Ngomo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Speck</surname>
          </string-name>
          .
          <article-title>Defacto - temporal and multilingual deep fact validation</article-title>
          .
          <source>J. Web Sem.</source>
          ,
          <volume>35</volume>
          :
          <fpage>85</fpage>
          -
          <lpage>101</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>D.</given-names>
            <surname>Gutfreund</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Katz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Slonim</surname>
          </string-name>
          .
          <article-title>Automatic arguments construction - from search engine to research engine</article-title>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>I.</given-names>
            <surname>Habernal</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          .
          <article-title>Which argument is more convincing? Analyzing and predicting convincingness of Web arguments using bidirectional LSTM</article-title>
          .
          <source>In ACL</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>J.</given-names>
            <surname>Han</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Kamber</surname>
          </string-name>
          .
          <article-title>Data Mining: Concepts and Techniques</article-title>
          . Morgan Kaufmann,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>N.</given-names>
            <surname>Hassan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Adair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hamilton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tremayne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <article-title>The quest to automate fact-checking</article-title>
          .
          <source>In Proc. of the 2015 Computation+Journalism Symposium</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>C. W.</given-names>
            <surname>Leong</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Cucerzan</surname>
          </string-name>
          .
          <article-title>Supporting factual statements with evidence from the web</article-title>
          .
          <source>In CIKM</source>
          , pages
          <fpage>1153</fpage>
          -
          <lpage>1162</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
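          <string-name>
            <given-names>Steffen</given-names>
            <surname>Metzger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Shady</given-names>
            <surname>Elbassuoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Katja</given-names>
            <surname>Hose</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ralf</given-names>
            <surname>Schenkel</surname>
          </string-name>
          .
          <article-title>S3K: Seeking statement-supporting top-k witnesses</article-title>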
          .
          <source>In Proc. 20th ACM Conf. on Information and Knowledge Management (CIKM)</source>
          , Glasgow, UK,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>G.</given-names>
            <surname>Müller</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Bergmann</surname>
          </string-name>
          .
          <article-title>Generalization of Workflows in Process-Oriented Case-Based Reasoning</article-title>
          .
          <source>In Proc. of the 28th FLAIRS Conference</source>
          , pages
          <fpage>391</fpage>
          -
          <lpage>396</lpage>
          ,
          Hollywood (Florida), USA,
          <year>2015</year>
          . AAAI Press.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>A.</given-names>
            <surname>Peldszus</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Stede</surname>
          </string-name>
          .
          <article-title>From argument diagrams to argumentation mining in texts: A survey</article-title>
          .
          <source>Int. J. Cogn. Inform. Nat. Intell.</source>
          ,
          <volume>7</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>31</lpage>
          ,
          <year>January 2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>J.</given-names>
            <surname>Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Arenas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          .
          <article-title>Semantics and complexity of SPARQL</article-title>
          .
          <source>ACM Trans. Database Syst.</source>
          ,
          <volume>34</volume>
          (
          <issue>3</issue>
          ),
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>T.</given-names>
            <surname>Rebele</surname>
          </string-name>
          et al.
          <article-title>YAGO: A multilingual knowledge base from Wikipedia, Wordnet, and Geonames</article-title>
          .
          <source>In Proc. 15th Int. Semantic Web Conference (ISWC)</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>P.</given-names>
            <surname>Resnik</surname>
          </string-name>
          .
          <article-title>Using information content to evaluate semantic similarity in a taxonomy</article-title>
          .
          <source>In Proceedings of the 14th Int. Joint Conf. on Artificial Intelligence</source>
          , pages
          <fpage>448</fpage>
          -
          <lpage>453</lpage>
          . Morgan Kaufmann,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Richter</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. O.</given-names>
            <surname>Weber</surname>
          </string-name>
          .
          <source>Case-Based Reasoning - A Textbook</source>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <given-names>Edwina L.</given-names>
            <surname>Rissland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kevin D.</given-names>
            <surname>Ashley</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Karl</given-names>
            <surname>Branting</surname>
          </string-name>
          .
          <article-title>Case-based reasoning and law</article-title>
          .
          <source>The Knowledge Engineering Review</source>
          ,
          <volume>20</volume>
          (
          <issue>3</issue>
          ):
          <fpage>293</fpage>
          -
          <lpage>298</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <given-names>R.</given-names>
            <surname>Schenkel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Theobald</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Weikum</surname>
          </string-name>
          .
          <article-title>Semantic similarity search on semistructured data with the XXL search engine</article-title>
          .
          <source>Information Retrieval</source>
          ,
          <volume>8</volume>
          (
          <issue>4</issue>
          ):
          <fpage>521</fpage>
          -
          <lpage>545</lpage>
          ,
          <year>December 2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <given-names>C.</given-names>
            <surname>Stab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Daxenberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Stahlhut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schiller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tauchmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Eger</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          .
          <article-title>ArgumenText: Searching for arguments in heterogeneous sources</article-title>
          .
          <source>In Proceedings of the 2018 Conference of the NAACL</source>
          , pages
          <fpage>21</fpage>
          -
          <lpage>25</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <given-names>M.</given-names>
            <surname>Theobald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Majumdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Schenkel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Weikum</surname>
          </string-name>
          .
          <article-title>TopX: Efficient and versatile top-k query processing for semistructured data</article-title>
          .
          <source>VLDB Journal</source>
          ,
          <volume>17</volume>
          (
          <issue>1</issue>
          ):
          <fpage>81</fpage>
          -
          <lpage>115</lpage>
          ,
          <year>January 2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <given-names>D. N.</given-names>
            <surname>Walton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Reed</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Macagno</surname>
          </string-name>
          .
          <source>Argumentation Schemes</source>
          . Cambridge University Press,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <article-title>Multi-verifier: A novel method for fact statement verification</article-title>
          .
          <source>World Wide Web</source>
          ,
          <volume>18</volume>
          (
          <issue>5</issue>
          ):
          <fpage>1463</fpage>
          -
          <lpage>1480</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>