<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Reinforcement Learning for Dialogue Game Based Argumentation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sultan Alahmari</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tommy Yuan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniel Kudenko</string-name>
          <email>daniel.kudenko@york.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of York, Department of Computer Science</institution>
          ,
<addr-line>Deramore Lane, York YO10 5GH, UK</addr-line>
        </aff>
      </contrib-group>
      <abstract>
<p>Communication between agents is an important feature of intelligent systems. One type of communication is argumentative dialogue, in which each agent aims to change the other's point of view. Argumentative reinforcement learning (RL) agents can learn from experience via trial and error. For an agent learning to argue with other agents, it would be useful to discover argument patterns, i.e. state-action pairs, which would enable an RL agent to transfer what has been learned from one domain to another. A dialogue model is needed to manage the evolving dialogue between agents. In this paper we adopt the DE dialogue game and construct an RL agent that can win a debate with the minimum number of moves. The results so far are promising, which enables us to move forward to transfer learning in order to generalise our approach. It is anticipated that our research will contribute to the design and implementation of argumentative learning agents.</p>
      </abstract>
      <kwd-group>
        <kwd>Multi-agent systems</kwd>
        <kwd>Argumentation</kwd>
        <kwd>Dialogue system</kwd>
        <kwd>Dialogue model</kwd>
        <kwd>DE dialogue game</kwd>
        <kwd>Reinforcement learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Recently, research and applications in artificial intelligence and machine learning have
been increasing rapidly. Agents and multi-agent systems are able to improve via learning from data
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]- [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Reinforcement learning (RL) is a machine learning approach where an agent
can learn to map an action to a state in an environment in order to maximise the
cumulative reward [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. In our previous work [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], we have built an RL agent that is able to
argue against baseline agents using the abstract argumentation framework [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]- [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]
where the argument game was adopted for reasons of simplicity and flexibility [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The
results, as reported in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], are positive when an agent learns in a single argument graph
and improves its performance over time. However, when an agent applies what has been
learned to a different graph, the result is less encouraging. The difficulty with the
abstract argument representation is that argument patterns, i.e. state-action pairs, are hard
to learn without reference to the internal argument structure. This motivates us to
move to a propositional-logic-based representation and a richer dialogue model [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], where
argument patterns, such as argument schemes and sources of support, can be learned. An
influential logic-based dialogue model, the DE model [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] is used to manage the
evolving dialogue.
      </p>
      <p>The remainder of this paper is organised as follows. We will introduce the DE
dialogue model in Section 2. Next, we discuss the design of the RL agent and baseline
agents in Section 3. Section 4 provides the details for the experiments and discusses the
results. Section 5 concludes the paper and gives pointers for our intended future work.</p>
    </sec>
    <sec id="sec-2">
      <title>The “DE” Dialogue Model</title>
      <p>
        The DE dialogue model [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]- [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] was developed based on [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], which is a further
version of Mackenzie’s DC system [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The DE model facilitates agents in making moves
in an evolving dialogue. The model has five move types, quoted from
[
        <xref ref-type="bibr" rid="ref14 ref16 ref19">14, 16, 19</xref>
        ] as follows:
1. Assertions: The content of an assertion is a statement P, Q, etc., or a truth-functional
compound of statements: ¬P, If P then Q, P ∧ Q.
2. Questions: The question of the statement P is “Is it the case that P?”
3. Challenges: The challenge of the statement P is “Why is it supposed that P?” (or
briefly “Why P?”).
4. Withdrawals: The withdrawal of the statement P is “No commitment P”.
5. Resolution demands: The resolution demand of the statement P is “Resolve whether
P”.
      </p>
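      <p>As an illustration, the five locutions above can be rendered mechanically from a move type and its content. The following Python sketch is our own; the enum and template names are assumptions, not part of the DE implementation.</p>
      <preformat>
```python
from enum import Enum, auto

class MoveType(Enum):
    ASSERTION = auto()
    QUESTION = auto()
    CHALLENGE = auto()
    WITHDRAWAL = auto()
    RESOLUTION_DEMAND = auto()

# Locution templates following the wording of the five DE move types above.
TEMPLATES = {
    MoveType.ASSERTION: "{p}",
    MoveType.QUESTION: "Is it the case that {p}?",
    MoveType.CHALLENGE: "Why {p}?",
    MoveType.WITHDRAWAL: "No commitment {p}",
    MoveType.RESOLUTION_DEMAND: "Resolve whether {p}",
}

def render_move(move_type: MoveType, content: str) -> str:
    # Produce the spoken form of a move for a proposition such as "P".
    return TEMPLATES[move_type].format(p=content)
```
      </preformat>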
      <p>
        The DE model specifies one publicly inspectable commitment store (henceforth
referred to as CS) for each player. A commitment store contains statements that have been
stated or accepted by a speaker. It has an assertion list, which holds
statements the agent has explicitly stated, and a concession list, which contains
statements that have been implicitly accepted [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. The commitment rules below are taken
from [
        <xref ref-type="bibr" rid="ref14 ref16 ref19">14, 16, 19</xref>
        ]:
1. Initial commitment, CR0: The initial commitment of each participant is null.
2. Withdrawals, CRW: After the withdrawal of P, the statement P is not included in
the move maker's store.
3. Statements, CRS: After a statement P, unless the preceding event was a challenge,
P is included in the move maker's assertion list and the dialogue partner's
concession list, and ¬P will be removed from the move maker's concession list if it is
there.
4. Defence, CRYS: After a statement P, if the preceding event was Why Q?, P and If
P then Q are included in the move maker's assertion list and the dialogue partner's
concession list, and ¬P and ¬(If P then Q) are removed from the move maker's
concession list if they are there.
5. Challenges, CRY: A challenge of P results in P being removed from the move
maker's store if it is there.
      </p>
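      <p>The commitment rules above are directly executable. The sketch below is our own minimal Python illustration of two of them (CRS for statements and CRY for challenges); the class, function names and the toy syntactic negation standing in for ¬ are assumptions, not the authors' implementation.</p>
      <preformat>
```python
# Minimal sketch of a DE commitment store and two commitment rules.
# All names are illustrative, not taken from the DE implementation.

def negate(p: str) -> str:
    # Toy syntactic negation standing in for the connective ¬:
    # "not P" negates "P", and vice versa.
    return p[4:] if p.startswith("not ") else f"not {p}"

class CommitmentStore:
    def __init__(self):
        self.assertions = set()   # statements explicitly stated
        self.concessions = set()  # statements implicitly accepted

def apply_statement(speaker: CommitmentStore, partner: CommitmentStore, p: str):
    # CRS: after a statement P (not answering a challenge), P joins the
    # speaker's assertion list and the partner's concession list, and
    # the negation of P leaves the speaker's concession list if present.
    speaker.assertions.add(p)
    partner.concessions.add(p)
    speaker.concessions.discard(negate(p))

def apply_challenge(speaker: CommitmentStore, p: str):
    # CRY: a challenge of P removes P from the challenger's own store.
    speaker.assertions.discard(p)
    speaker.concessions.discard(p)
```
      </preformat>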
      <p>
        Dialogue rules that an agent must follow during the dialogue are stated in [
        <xref ref-type="bibr" rid="ref14 ref16 ref19">14,16,19</xref>
        ]
as follows:
1. RFORM: Each participant or agent can make one of the permitted types of move
in turn.
2. RREPSTAT: A mutual commitment may not be asserted until the question or
challenge has been answered.
3. RQUEST: The possible answers to the question P are “P”, “¬P” or “No
commitment”.
4. RCHALL: “Why P?” can be answered by withdrawal of P, a statement to the
challenger, or a resolution demand for any commitments of the challenger which imply
P.
5. RRESOLVE: A resolution demand can happen only if the opponent has
inconsistent statements in the commitment store.
6. RRESOLUTION: A resolution demand has to be followed by withdrawal of one
of the offending conjuncts or affirmation of the disputed consequent.
7. RLEGALCHAL: The agent can challenge the opponent with “Why P?”, unless P is on
the opponent's assertion list.
      </p>
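      <p>As a small illustration, the legality condition of rule RLEGALCHAL, as stated above, reduces to a membership test on the opponent's assertion list; the function name below is our own.</p>
      <preformat>
```python
# RLEGALCHAL as stated above: "Why P?" is a legal challenge unless P is
# already on the opponent's assertion list. Illustrative sketch only.

def legal_challenge(p: str, opponent_assertions: set) -> bool:
    return p not in opponent_assertions
```
      </preformat>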
      <p>
        There are several advantages for the adoption of the DE model. First, the model
leaves enough room for strategic formation, and strategy is essential for an agent to
make high quality dialogue contributions. Second, computational agents adopting the
DE model have been built with hard coded heuristic strategies and the model has shown
advantage over others due to its computational tractability and simple dialogue rules
(Yuan et al. 2007 [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], 2008 [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], 2011 [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]). Finally, the DE model is built upon
propositional logic. In contrast to the abstract argument game, the model is more sophisticated
and richer: the dialogue state can be represented using commitment stores and
different move types such as questions, statements and challenges, whereas in an abstract
argument game the state representation is restricted to nodes and arcs. It is our expectation
that the DE game can facilitate a fruitful learning experience for a computational agent.
The design of the RL agent that operates on the DE model is discussed next.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Agent Design</title>
      <p>This section provides the design details for the RL agent and the baseline agents that
have been used to facilitate the DE dialogue.</p>
      <p>3.1 Reinforcement Learning Agent</p>
      <p>
        Reinforcement learning is a type of machine learning where an agent interacts with an
environment by mapping states to actions in order to maximise the cumulative
reward [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The agent needs to learn a state-action policy via trial and error. Our RL
agent engages in a DE dialogue with another computational agent. The outcome of the
dialogue is that one agent persuades the other to give up its original point of view. To enable
reinforcement learning, we need to design the state, actions and reward for the agent.
      </p>
      <p>The RL agent needs to observe the environment then decides the kind of actions
to undertake. In the DE dialogue model, an important state variable is the commitment
store which records what an agent has said or implicitly accepted during the dialogue.
Commitment stores are updated via agent actions through the commitment rules. A
further state variable is the dialogue history which contains the dialogue situation an
agent is facing (e.g. the previous move). Only the previous move is used as a state
variable, for reasons of simplicity. Since each participant has its own commitment store,
we define the dialogue state as (previous move ∪ CS1 ∪ CS2), where CS1 is the commitment
store of one dialogue participant and CS2 is that of the dialogue partner, to maximise
the possibility that each state is unique.</p>
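      <p>The state defined above, combining the previous move with both commitment stores, can be made hashable so that it keys a Q-table. The sketch below is our own minimal rendering; all names are illustrative assumptions.</p>
      <preformat>
```python
def make_state(previous_move: str, cs1: set, cs2: set):
    # frozensets make the commitment stores hashable and order-insensitive,
    # so the triple (previous move, CS1, CS2) can serve as a Q-table key.
    return (previous_move, frozenset(cs1), frozenset(cs2))

q_table = {}  # maps (state, action) pairs to Q-values
state = make_state("Why P?", {"P", "Q"}, {"not P"})
q_table[(state, "withdraw P")] = 0.0
```
      </preformat>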
      <p>An action is a move that the RL agent chooses from the available move types in
the DE model, which are: assert, question, challenge, withdraw or resolution demand
the DE model, which are: assert, question, challenge, withdraw or resolution demand,
together with the move content, which is a proposition or a conjunction of propositions. The
agent maps states to actions; this mapping is called the policy (π).</p>
      <p>When the RL agent wins it receives a positive reward, and a negative reward when it
loses. We would also like the agent to win with the minimum number of moves,
by using the following reward function:</p>
      <p>R = 100 + W / L</p>
      <p>where W is the number of moves in the first winning episode (the benchmark) and
L is the number of moves in the current episode. When L is at its minimum, the
reward is maximised. We use the following Q-learning update:</p>
      <p>Q(st, at) ← Q(st, at) + α [rt+1 + γ maxa Q(st+1, a) − Q(st, at)]</p>
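      <p>A minimal Python sketch of this reward function and Q-learning update follows, using the learning rate and discount factor reported in Section 4; the function and variable names are our own illustrative assumptions.</p>
      <preformat>
```python
from collections import defaultdict

ALPHA, GAMMA = 0.9, 0.9   # learning rate and discount factor (Section 4)
Q = defaultdict(float)    # maps (state, action) pairs to Q-values

def terminal_reward(w: int, l: int) -> float:
    # R = 100 + W/L, where W is the move count of the first winning episode
    # (the benchmark) and L is the move count of the current episode.
    return 100 + w / l

def q_update(state, action, reward, next_state, next_actions):
    # Tabular Q-learning:
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```
      </preformat>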
      <p>We offer the immediate reward rt+1 = 0.01 to encourage the RL agent to choose
the minimum moves to win the game.</p>
      <p>
        Our RL agent adopts the knowledge base as shown in Fig. 1 [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
      <p>
        Moore argues that the knowledge base should provide statements that can be used
to answer questions and support other statements, as well as providing statements to
rebut other statements [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. We also add features such as argument schemes
and evidence sources to the knowledge base, with the expectation that an RL agent will
be able to recognise the argument patterns via these features. An argument scheme
(such as argument from consequence [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]) is represented in a parallelogram. An item of
evidence (such as newspapers, magazines or scientific papers) is represented in a circle.
For example, an RL agent could learn the reliability of a piece of supporting evidence from the
environment, as seen in Fig. 2.
To facilitate the DE dialogue with the RL agent discussed above, two baseline agents
have been built, one with a fixed strategy and one with a random strategy [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. For the
fixed strategy agent, the set of heuristics was based on three levels of decisions taken
from [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]- [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]:
1. Retain or change the current focus.
2. Build own view or demolish the user's view.
3. Select a method to fulfil the objectives set at levels (1) and (2) [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>
        Yuan et al., in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], clarify the three levels as follows: “Levels (1) and
(2) refer to strategies which apply only when the computer is facing a statement or
withdrawal, since in all other cases the computer must respond to the incoming move.
Level (3) refers to tactics used to reach the aims fixed at levels (1) and (2). Details of
the level 3 strategies can be found from”.
      </p>
      <p>A random agent chooses a move randomly from the set of legally available
moves under the DE game rules.</p>
    </sec>
    <sec id="sec-4">
      <title>Experiment and Result</title>
      <p>
        To examine the performance of the RL agent discussed in the previous section, we
have conducted a number of experiments. The baseline agents discussed above have
been used to play against the RL agent for evaluation purposes. The discussion topic is
capital punishment, adapted from [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. One participant needs to persuade the
other to accept its point of view in order to resolve the conflict.
      </p>
      <p>We set the learning rate, discount factor and epsilon to 0.9, 0.9 and 0.3
respectively, and set a game as 4,000 episodes. At episode zero, where the Q-table
values are all zero, the learning agent behaves like a random agent, so we ran episode
zero 30 times to avoid lucky choices. We need to trade off between exploration
and exploitation to evaluate what has been learned; exploitation therefore takes place every 100
episodes, with epsilon reset to 0 and the run repeated 30 times. The performance was
then measured by calculating how much reward each agent gained every 100 episodes,
as we were interested in exploitation when the learning agent makes a decision based
on the value in the Q-table.</p>
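      <p>The exploration/exploitation schedule described above can be sketched in Python as follows; the function and parameter names are our own illustrative assumptions, with epsilon reset to 0 on evaluation episodes.</p>
      <preformat>
```python
import random

def epsilon_for(episode: int, base_eps: float = 0.3, eval_every: int = 100) -> float:
    # Exploit (epsilon = 0) on evaluation episodes, explore otherwise.
    return 0.0 if episode % eval_every == 0 else base_eps

def choose_action(q: dict, state, actions: list, epsilon: float):
    # Epsilon-greedy selection over the legally available moves.
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))
```
      </preformat>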
      <p>Indeed, as the reward function above shows, we were interested in the long-term
reward of winning with the minimum number of moves. Therefore, we made our
learning agent play two games, one against a fixed strategy agent and another against a
random agent. The results look promising against both baseline agents, as can be seen
in Figs. 3 and 4.</p>
      <p>Both figures illustrate that the learning curves rise significantly as the number of
episodes increases. This demonstrates that the agent is able to learn to argue
against the baseline agents and win most of the games after converging.</p>
      <p>Further, the reward function encourages the RL agent to win with the minimum
number of moves against the fixed strategy agent as shown in Fig. 5, and likewise
against random agent as shown in Fig. 6. Both figures demonstrate that the minimum
number of moves converges after around 750 episodes.</p>
      <p>
        Overall, the experiments show that the RL agent has been able to learn to argue
against different baseline agents with different argument goals, e.g. winning, or winning
with a minimal number of moves.
We have used the DE dialogue model to engage our RL agent in playing against baseline
agents and learning how to argue. The results are promising given the RL agent's
improved performance against the baseline agents. This encourages us to continue with
the investigation of transfer learning, in order to apply what has been learned from one
argument domain to another, e.g. Brexit. We are also planning to study dialogue
quality attributes such as coherence and fluency [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]- [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and incorporate them into the
reward function, then study the consequences.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Sultan</given-names>
            <surname>Alahmari</surname>
          </string-name>
          , Tommy Yuan, and
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Kudenko</surname>
          </string-name>
          .
          <article-title>Reinforcement learning for abstract argumentation: Q-learning approach</article-title>
          .
          <source>In Adaptive and Learning Agents workshop (at AAMAS</source>
          <year>2017</year>
          ),
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Sultan</given-names>
            <surname>Alahmari</surname>
          </string-name>
          , Tommy Yuan, and
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Kudenko</surname>
          </string-name>
          .
          <article-title>Reinforcement learning for argumentation: Describing a phd research</article-title>
          .
          <source>In Proceedings of the 17th Workshop on Computational Models of Natural Argument (CMNA17)</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Sultan</given-names>
            <surname>Alahmari</surname>
          </string-name>
          , Tommy Yuan, and Daniel Kudenko.
          <article-title>Policy generalisation in reinforcement learning for abstract argumentation</article-title>
          .
          <source>In Proceedings of the 18th Workshop on Computational Models of Natural Argument (CMNA18)</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Pedro</given-names>
            <surname>Domingos</surname>
          </string-name>
          .
          <article-title>A few useful things to know about machine learning</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>55</volume>
          (
          <issue>10</issue>
          ):
          <fpage>78</fpage>
          -
          <lpage>87</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Phan</given-names>
            <surname>Minh Dung</surname>
          </string-name>
          .
          <article-title>On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games</article-title>
          .
          <source>Artificial intelligence</source>
          ,
          <volume>77</volume>
          (
          <issue>2</issue>
          ):
          <fpage>321</fpage>
          -
          <lpage>357</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Tim</given-names>
            <surname>Kelly</surname>
          </string-name>
          and
          <string-name>
            <given-names>Rob</given-names>
            <surname>Weaver</surname>
          </string-name>
          .
          <article-title>The goal structuring notation-a safety argument notation</article-title>
          .
          <source>In Proceedings of the dependable systems and networks 2004 workshop on assurance cases, page 6</source>
          .
          <string-name>
            <surname>Citeseer</surname>
          </string-name>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Jiwei</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Will</given-names>
            <surname>Monroe</surname>
          </string-name>
          , Alan Ritter, Michel Galley,
          <string-name>
            <given-names>Jianfeng</given-names>
            <surname>Gao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Dan</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          .
          <article-title>Deep reinforcement learning for dialogue generation</article-title>
          .
          <source>arXiv preprint arXiv:1606.01541</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Jim</surname>
            <given-names>D</given-names>
          </string-name>
          <string-name>
            <surname>Mackenzie</surname>
          </string-name>
          .
          <article-title>Question-begging in non-cumulative systems</article-title>
          .
          <source>Journal of philosophical logic</source>
          ,
          <volume>8</volume>
          (
          <issue>1</issue>
          ):
          <fpage>117</fpage>
          -
          <lpage>133</lpage>
          ,
          <year>1979</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>David John Moore.</surname>
          </string-name>
          <article-title>Dialogue game theory for intelligent tutoring systems</article-title>
          .
          <source>PhD thesis</source>
          , Leeds Metropolitan University,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Henry</given-names>
            <surname>Prakken</surname>
          </string-name>
          .
          <article-title>Coherence and flexibility in dialogue games for argumentation</article-title>
          .
          <source>Journal of logic and computation</source>
          ,
          <volume>15</volume>
          (
          <issue>6</issue>
          ):
          <fpage>1009</fpage>
          -
          <lpage>1040</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. Richard S Sutton and
          <string-name>
            <given-names>Andrew G</given-names>
            <surname>Barto</surname>
          </string-name>
          .
          <article-title>Introduction to reinforcement learning</article-title>
          , volume
          <volume>135</volume>
          . MIT press Cambridge,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Douglas</given-names>
            <surname>Walton</surname>
          </string-name>
          .
          <article-title>Argumentation schemes for presumptive reasoning</article-title>
          .
          <source>Routledge</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>Michael</given-names>
            <surname>Wooldridge</surname>
          </string-name>
          .
          <article-title>An introduction to multiagent systems</article-title>
          . John Wiley &amp; Sons,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Tangming</surname>
            <given-names>Yuan</given-names>
          </string-name>
          , David Moore,
          <string-name>
            <given-names>and Alec</given-names>
            <surname>Grierson</surname>
          </string-name>
          .
          <article-title>Computational agents as a test-bed to study the philosophical dialogue model “DE”: A development of Mackenzie's DC</article-title>
          .
          <source>Informal Logic</source>
          ,
          <volume>23</volume>
          (
          <issue>3</issue>
          ),
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Tangming</surname>
            <given-names>Yuan</given-names>
          </string-name>
          , David Moore,
          <string-name>
            <given-names>and Alec</given-names>
            <surname>Grierson</surname>
          </string-name>
          .
          <article-title>A human-computer debating system prototype and its dialogue strategies</article-title>
          .
          <source>International Journal of Intelligent Systems</source>
          ,
          <volume>22</volume>
          (
          <issue>1</issue>
          ):
          <fpage>133</fpage>
          -
          <lpage>156</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Tangming</surname>
            <given-names>Yuan</given-names>
          </string-name>
          , David Moore,
          <string-name>
            <given-names>and Alec</given-names>
            <surname>Grierson</surname>
          </string-name>
          .
          <article-title>A human-computer dialogue system for educational debate: A computational dialectics approach</article-title>
          .
          <source>International Journal of Artificial Intelligence in Education</source>
          ,
          <volume>18</volume>
          (
          <issue>1</issue>
          ):
          <fpage>3</fpage>
          -
          <lpage>26</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Tangming</surname>
            <given-names>Yuan</given-names>
          </string-name>
          , David Moore,
          <string-name>
            <given-names>and Alec</given-names>
            <surname>Grierson</surname>
          </string-name>
          .
          <article-title>Assessing debate strategies via computational agents</article-title>
          .
          <source>Argument and Computation</source>
          ,
          <volume>1</volume>
          (
          <issue>3</issue>
          ):
          <fpage>215</fpage>
          -
          <lpage>248</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Tangming</surname>
            <given-names>Yuan</given-names>
          </string-name>
          , David Moore,
          <string-name>
            <given-names>Chris</given-names>
            <surname>Reed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Ravenscroft</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Nicolas</given-names>
            <surname>Maudet</surname>
          </string-name>
          .
          <article-title>Informal logic dialogue games in human-computer dialogue</article-title>
          .
          <source>The Knowledge Engineering Review</source>
          ,
          <volume>26</volume>
          (
          <issue>2</issue>
          ):
          <fpage>159</fpage>
          -
          <lpage>174</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>Tommy</given-names>
            <surname>Yuan</surname>
          </string-name>
          .
          <article-title>Human-Computer Debate, a Computational Dialectics Approach</article-title>
          .
          <source>PhD thesis</source>
          ,
          <source>Unpublished Doctoral Dissertation</source>
          , Leeds Metropolitan University,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>