<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>I Know What You're Doing: A Case Study on Case-Based Opponent Modeling and Low-Level Action Prediction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Thomas Gabel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eicke Godehardt</string-name>
          <email>godehardtg@fb2.fra-uas.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
<institution>Faculty of Computer Science and Engineering, Frankfurt University of Applied Sciences, 60318</institution>
          <addr-line>Frankfurt am Main</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <fpage>13</fpage>
      <lpage>22</lpage>
      <abstract>
<p>This paper focuses on an investigation of case-based opponent player modeling in the domain of simulated robotic soccer. While in previous and related work it has frequently been claimed that the prediction of low-level actions of an opponent agent in this application domain is infeasible, we show that, at least in certain settings, an online prediction of the opponent's actions can be made with high accuracy. We also stress why the ability to know the opponent's next low-level move can be of enormous utility to one's own playing strategy.</p>
      </abstract>
      <kwd-group>
        <kwd>RoboCup [12] is an international research initiative intending to expedite artificial intelligence and intelligent robotics research by defining a set of standard problems where various technologies can and ought to be combined to solve them. Annually, there are championship tournaments in several leagues, ranging from rescue tasks over real soccer-playing robots to simulated ones.</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Recognizing and predicting agent behavior is of crucial importance, specifically
in adversarial domains. The case study presented in this paper is concerned with
the prediction of the low-level behavior of agents in the highly dynamic,
heterogeneous, and competitive domain of robotic soccer simulation (RoboCup).
Case-based reasoning represents one of the potentially useful methodologies for
accomplishing the analysis of the behavior of a single or a team of agents. In this
sense, the basic idea of our approach is to make a case-based agent observe its
opponent and, in an online fashion, i.e. during real game play, build up a case
base to be used for predicting the opponent's future actions.</p>
      <p>In Section 2, we introduce the opponent modeling problem, point to related
work, and argue why knowing an opponent's next low-level actions can be
beneficial. The remainder of the paper then outlines our case-based methodology
(Section 3), reviews the experimental results we obtained (Section 4), and
summarizes and discusses our findings (Section 5).</p>
      <sec id="sec-1-1">
        <title>Robotic Soccer Simulation</title>
        <p>
          The focus of the paper at hand is laid upon RoboCup's 2D Simulation League,
where two teams of simulated soccer-playing agents compete against one another
using the Soccer Server [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], a real-time soccer simulation system.
        </p>
        <p>The Soccer Server allows autonomous software agents written in an arbitrary
programming language to play soccer in a client/server-based style: the server
simulates the playing field, communication, the environment and its dynamics,
while the clients (eleven autonomous agents per team) connect to the server
and are permitted to send their intended actions (e.g. a parameterized kick or
dash command) once per simulation cycle to the server via UDP. Then, the
server takes all agents' actions into account, computes the subsequent world
state, and provides all agents with (partial) information about their environment
via appropriate messages over UDP.</p>
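<p>As a schematic sketch of this client/server interface (the helper names are our own; the exact per-command grammar and handshake are defined by the Soccer Server manual, which uses a Lisp-like message syntax and 6000 as the server's default UDP port):</p>

```python
import socket

def format_command(name, *params):
    """Render a command in the server's Lisp-like message syntax,
    e.g. a parameterized dash becomes the string "(dash 80 0)"."""
    parts = " ".join(str(p) for p in params)
    return f"({name} {parts})" if parts else f"({name})"

def send_command(sock, addr, name, *params):
    """Send one intended action for the current simulation cycle via UDP."""
    sock.sendto(format_command(name, *params).encode("ascii"), addr)

# Usage (assuming a server listening on localhost at the default port):
# sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# send_command(sock, ("127.0.0.1", 6000), "dash", 80, 0)
```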
        <p>So, decision making must be performed in real-time or, more precisely, in
discrete time steps: every 100 ms the agents can execute a low-level action, and the
world state will change based on the individual actions of all players. Speaking
about low-level actions, we should make clear that the actions themselves are
"parameterized basic actions" and the agent can execute only one of them per
time step:
- dash(x, θ) lets the agent accelerate along its current body orientation by
relative power x ∈ [0, 100] (if it does not accelerate, its velocity decays)
into direction θ ∈ (−180°, 180°] relative to its body orientation;
- turn(θ) makes the agent turn its body by θ ∈ (−180°, 180°], where, however,
the Soccer Server reduces θ depending on the player's current velocity in
order to simulate an inertia moment;
- kick(x, θ) has an effect only if the ball is within the player's kick range
(1.085 m around the player) and yields a kick of the ball by relative power
x ∈ [0, 100] into direction θ ∈ (−180°, 180°];
- there exist a few further actions (like tackling¹, playing foul, or, for the
goalkeeper, catching the ball) whose exact description is beyond scope.
Given this short description of the most important low-level actions that can
be employed by the agent, it is clear that these basic actions must be combined
cleverly in consecutive time steps in order to create "higher-level actions" like
intercepting balls, playing passes, doing dribblings, or marking players. We will
call those higher-level actions skills in the remainder of this paper.</p>
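<p>The one-action-per-cycle interface above can be summarized in a small sketch; the class and function names are our own illustration, not part of the Soccer Server protocol:</p>

```python
from dataclasses import dataclass

# Illustrative model of the three most important parameterized basic
# actions described above. An agent may submit only one per 100 ms cycle.

@dataclass
class Dash:
    power: float      # x, relative power in [0, 100]
    direction: float  # theta in (-180, 180], relative to body orientation

@dataclass
class Turn:
    moment: float     # theta in (-180, 180]; the server reduces it at speed

@dataclass
class Kick:
    power: float      # x in [0, 100]; effective only in the 1.085 m kick range
    direction: float  # theta in (-180, 180]

def valid_action(a) -> bool:
    """Check an action's parameters against the standard rule ranges."""
    if isinstance(a, Turn):
        return (a.moment > -180.0) and (180.0 >= a.moment)
    ok_power = (a.power >= 0.0) and (100.0 >= a.power)
    ok_dir = (a.direction > -180.0) and (180.0 >= a.direction)
    return ok_power and ok_dir
```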
        <p>
          Robotic soccer represents an excellent testbed for machine learning,
including approaches that involve case-based reasoning. For example, several research
groups have dealt with the task of learning parts of a soccer-playing agent's
behavior autonomously (for instance [
          <xref ref-type="bibr" rid="ref3 ref8 ref9">9, 8, 3</xref>
          ]). In [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], as another example, we
specifically addressed the issue of using CBR for the development of a player
agent skill for intercepting balls.
¹ To tackle for the ball with the low-level action tackle(θ) means to straddle for the ball
and thus change its velocity, even if it is not in the player's immediate kick range;
such an action succeeds only with limited probability, which decreases the farther
the ball is away from the agent.
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>Related Work on Opponent Modeling</title>
        <p>
          Opponent modeling is an important factor that can contribute substantially to
a player's capabilities in a game, since it enables the prediction of future actions
of the opponent. In doing so, it also allows for adapting one's own behavior
accordingly. Case-based reasoning has been frequently used as a technique for
opponent modeling in multi-agent games [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], including the domain of robotic
soccer [
          <xref ref-type="bibr" rid="ref1 ref13">13, 1</xref>
          ].
        </p>
        <p>
          Using CBR, the authors of [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] make their simulated soccer agents recognize
currently executed higher-level behaviors of the currently ball-leading opponent
player. These include passing, dribbling, goal-kicking, and clearing. These
higher-level behaviors correspond to what we refer to as skills, i.e. action sequences that
are executed over a dozen or more time steps. This longer time horizon allows
the agent to take appropriate countermeasures.
        </p>
        <p>
          The authors of [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] also deal with the case-based recognition of skills
(higher-level behaviors, to be exact the shoot-on-goal skill) executed by an opponent
soccer player, focusing on the appropriate adjustment of the similarity measure
employed. While we do also think opponent modeling is useful for counteracting
adversary agents, we disagree with these authors' claim that "in a
complex domain such as RoboCup it is infeasible to predict an agent's behavior
in terms of primitive actions". Instead, we will show empirically that such a
low-level action prediction can be achieved during an ongoing play using case-based
methods. To this end, the work presented in this paper is also related to the work
by Floyd et al. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], whose goal is to mimic the overall behavior of entire soccer
simulation teams, be it for the purpose of analysis or for rapid prototyping when
developing one's own team, without putting too much emphasis on whether the
imitators yield competitive behavior.
        </p>
      </sec>
      <sec id="sec-1-3">
        <title>Related Previous Work</title>
        <p>What is the use of knowing exactly whether an opponent is going to execute a
kick(40, 30°) or a dash(80, 0°) low-level action next? This piece of information
certainly does not reveal whether this opponent's intention is to play a pass (and
to which teammate) in the near future or to dribble along. Clearly, for answering
questions like that, the approaches listed in the previous section are potentially
more useful. But knowing the opposing agent's next low-level actions is extremely
useful when knowing the next state on the field is essential (cf. Figure 1 for an
illustration).</p>
        <p>
          In [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], we considered a soccer simulation defense scenario of crucial
importance: we focused on situations where one of our players had to interfere with and
disturb an opponent ball-leading player in order to scotch the opponent team's
attack at an early stage and, even better, to eventually conquer the ball,
initiating a counterattack. We employed a reinforcement learning (RL) methodology
that enabled our agents to autonomously acquire such an aggressive duel
behavior, and we successfully embedded it into our soccer simulation team's defensive
strategy. So, the goal was to learn a so-called "duelling skill" (i.e. a higher-level
behavior which in the end yields a sequence of low-level actions) which made
our agent conquer the ball from the ball-leading opponent.
        </p>
        <p>[Fig. 1. Current player velocity and current ball velocity.]</p>
        <p>An important feature of the soccer simulation domain is that the model of
the environment is known. This means that, given, for example, the current position
and velocity of the ball, it is possible for any agent to calculate the position of the
ball in the next time step (because the implementation of the physical simulation
by the Soccer Server is open source²). As a second example, when knowing its
own current position, velocity, and body angle, and issuing a turn(68°) low-level
action, the agent can precalculate the position, velocity, and body orientation it
will have in the next step. Or, finally, when the agent knows the position and
velocity of the ball, it can precalculate the ball's position and velocity in the
next step for any kick(x, θ) command that it might issue.</p>
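<p>Since the simulation's physics are open source, such a one-step lookahead amounts to a few lines of arithmetic. The following sketch (our own simplified model, using the server's default ball decay of 0.94 and ignoring the small noise the server adds) precalculates the ball's next position and velocity:</p>

```python
BALL_DECAY = 0.94  # Soccer Server's default ball velocity decay per cycle

def next_ball_state(pos, vel, accel=(0.0, 0.0)):
    """One-step lookahead for the ball, sketching the server dynamics in
    simplified form: a kick contributes an acceleration 'accel' that is
    added to the velocity before the ball moves; afterwards the velocity
    decays by BALL_DECAY."""
    vx = vel[0] + accel[0]
    vy = vel[1] + accel[1]
    new_pos = (pos[0] + vx, pos[1] + vy)
    new_vel = (BALL_DECAY * vx, BALL_DECAY * vy)
    return new_pos, new_vel
```

<p>For a freely rolling ball, accel stays (0, 0); the same pattern, with the player's own decay factor, covers the turn(68°) self-prediction example from the text.</p>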
        <p>Knowing the model of the environment (formally, the transition function
p : S × A × S → ℝ, where p(s, a, s′) tells the probability of ending up in the next
state s′ when executing action a in the current state s) is extremely
advantageous in reinforcement learning, since then model-based instead of model-free
learning algorithms can be applied, which typically comes along with a pleasant
simplification of the learning task.
² In practice, the Soccer Server adds some noise to all low-level actions executed, but
this is of minor importance to our concerns.</p>
        <p>
          So, in soccer simulation the transition function p (the model of the environment)
is given, since the way the Soccer Server simulates a soccer match is known. In
the above-mentioned "duelling task", however, the situation is aggravated: here,
we have to consider the influence of an opponent whose next actions cannot be
controlled. In [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], we stated that the opponent's next (low-level) actions can
"hardly be predicted [which] makes it impossible to accurately anticipate the
successor state", knowing which is, as pointed out, extremely useful in RL. In
the paper at hand, we will show that predicting the opponent's next low-level
action might be easier than expected. As a consequence,
- in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] we had to rely on a rough approximation of p that merely takes into
account the part of the state that can be influenced directly by the learning
agent and ignores the part of the future state that is under direct
control of the ball-leading opponent (e.g. the position of the ball in the next
state). This corresponded to the unrealistic assumption of an opponent that
never takes any action (cf. Figure 1, bottom left);
- in future work we can employ a much more accurate version of p based on
the case-based prediction of the opponent's low-level actions described in the
next section.
        </p>
        <p>Case-Based Prediction of Low-Level Actions
In what follows, we differentiate between an opponent (OPP) agent whose next
low-level actions are to be predicted and (our) case-based agent (CBA)
that essentially observes the opponent and that is going to build up a case base
to be used for the prediction of OPP's actions.</p>
        <p>When approaching the opponent modeling problem as a case-based reasoning
problem, the goal of the case-based agent is to correctly predict the next action of
its opponent given a characterization of the current situation. Stated differently,
the current state of the system (including the case-based agent itself, its opponent,
as well as all other relevant objects) represents a new query q. CBA's case base
C is made up of cases c = (p, s) whose problem parts p correspond to other,
older situations and corresponding solutions s which describe the action OPP
has taken in situation p. Next, the case-based agent will search its case base for
that case ĉ = (p̂, ŝ) ∈ C (or for a set of k such cases) whose problem part features
the highest similarity to the current problem q and employ its solution ŝ as the
current prediction of the opponent's next action.</p>
      </sec>
      <sec id="sec-1-4">
        <title>Problem Modeling</title>
        <p>In the context of this case study we focus on dribbling opponents, i.e. the
opponent has the ball in its kick range and moves along while keeping the ball
within its kick range all the time. Stated differently, we focus on situations
where OPP behaves according to some "dribble skill" (a higher-level dribbling
behavior). Consequently, OPP executes in each time step one of the three
actions kick(x, θ), dash(x, θ), or turn(θ). The standard rules of the simulation
allow x to be from [0, 100] and θ from (−180°, 180°] for kicks and turns. For
dashes, θ is allowed to take one out of eight values (multiples of 45°). In almost
all cases occurring during normal play, however, a dribbling player is heading
more or less towards his opponent's goal, which is why the execution of low-level
turn actions represents an exceptional case. Therefore, for the time being, we
leave turn actions aside and focus on the correct prediction of dashes and kicks,
including their parameters x and θ.</p>
        <p>Case Structure. The state of the dribbling opponent (OPP) can be characterized
by the x and y position of the ball within its kick range (pos_{b,x} and pos_{b,y})
relative to the center of OPP, as well as the x and y components of the ball's
velocity (vel_{b,x} and vel_{b,y}; of course, these values are also relative to OPP's
body orientation). Moreover, OPP's x and y velocities (vel_{p,x} and vel_{p,y}) are
of relevance, making six features in total. The seventh relevant feature, OPP's
current body orientation θ_p, can be skipped due to the arguments mentioned
in the preceding paragraph. Furthermore, the y component of OPP's velocity
vector, vel_{p,y}, is, in general, zero, since a dribbling player almost always dribbles
along its current body orientation. While this allows us to also skip the sixth
feature, we remove a redundancy in the remaining features (and thus arrive at
only four of them) by changing to a relative state description that incorporates
some background knowledge³ from the simulation. Hence, the problem part p of
a case c = (p, s) is a four-tuple p = (pos^n_{b,x}, pos^n_{b,y}, vel^n_{b,x}, vel^n_{b,y}) with
pos^n_{b,x} = pos_{b,x} + 0.94 vel_{b,x} − 0.4 vel_{p,x}
pos^n_{b,y} = pos_{b,y} + 0.94 vel_{b,y} − 0.4 vel_{p,y}
vel^n_{b,x} = 0.94 vel_{b,x} − 0.4 vel_{p,x}
vel^n_{b,y} = 0.94 vel_{b,y} − 0.4 vel_{p,y}
where all components characterize the next state as it would arise if the agent
did not take any action (cf. Figure 1).</p>
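<p>Under the stated decay factors (0.94 for the ball, 0.4 for the player), the four-tuple problem part can be computed as follows; the function name and the vector representation are our own illustration:</p>

```python
BALL_DECAY = 0.94    # how the Soccer Server decays the ball's velocity
PLAYER_DECAY = 0.4   # how it decays the player's velocity

def problem_part(pos_b, vel_b, vel_p):
    """Compute the four-tuple problem part p of a case: pos_b is the
    ball's position relative to OPP's center, vel_b and vel_p are the
    ball's and OPP's velocities, all relative to OPP's body orientation.
    The result characterizes the next state as it would arise if the
    agent took no action."""
    pos_n_x = pos_b[0] + BALL_DECAY * vel_b[0] - PLAYER_DECAY * vel_p[0]
    pos_n_y = pos_b[1] + BALL_DECAY * vel_b[1] - PLAYER_DECAY * vel_p[1]
    vel_n_x = BALL_DECAY * vel_b[0] - PLAYER_DECAY * vel_p[0]
    vel_n_y = BALL_DECAY * vel_b[1] - PLAYER_DECAY * vel_p[1]
    return (pos_n_x, pos_n_y, vel_n_x, vel_n_y)
```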
        <p>The solution s of a case c = (p, s) consists of a class label l ("dash" or "kick")
as well as two accompanying real-valued attributes for the power x and angle θ
parameters of the respective action. Thus, the solution is a triple s = (l, x, θ).</p>
      </sec>
      <sec id="sec-1-5">
        <title>Implementing the CBR Cycle</title>
        <p>The case-based agent CBA observes its opponent OPP and, in doing so, builds
up its case base. Note that all agents in soccer simulation act on incomplete and
uncertain information. Their visual input consists of noisy information about
objects in their limited field of vision. However, if the observed opponents are
near and constantly focused on, CBA is provided with sufficiently accurate
visual state information. In order to fill in the contents of the cases' solution parts,
however, CBA must apply the inverse dynamics of the soccer simulation. If CBA,
for example, observes that the velocity vector of the ball has changed at
time t + 1 as in the bottom right part of Figure 1, then it can conclude that OPP
executed a kick(50, 90°) action at time t and can use that information to
complete the case it created at time step t.
³ Knowledge about how the Soccer Server decays objects.</p>
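<p>The inverse-dynamics step can be sketched as follows. We invert a simplified kick model (assumption: a nominal kick power rate of 0.027, ignoring the distance- and angle-dependent reduction the server actually applies) to recover the kick parameters from the observed change of the ball's velocity:</p>

```python
import math

BALL_DECAY = 0.94        # ball velocity decay per cycle
KICK_POWER_RATE = 0.027  # nominal power-to-acceleration factor (assumption)

def infer_kick(vel_t, vel_t1, body_dir_deg):
    """From the ball velocity before (vel_t) and after (vel_t1) a cycle,
    recover the acceleration a kick must have imparted and thus the
    kick(x, theta) command OPP issued (theta relative to OPP's body
    orientation, given in degrees)."""
    # vel_t1 = BALL_DECAY * (vel_t + accel)  =>  accel = vel_t1/decay - vel_t
    ax = vel_t1[0] / BALL_DECAY - vel_t[0]
    ay = vel_t1[1] / BALL_DECAY - vel_t[1]
    power = math.hypot(ax, ay) / KICK_POWER_RATE
    theta = math.degrees(math.atan2(ay, ax)) - body_dir_deg
    # normalize theta into (-180, 180]
    while theta > 180.0:
        theta -= 360.0
    while -180.0 >= theta:
        theta += 360.0
    return power, theta
```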
        <p>With ongoing observation of dribbling opponent players, CBA's case base C
grows and becomes more and more competent. Therefore, after |C| exceeds some
threshold, CBA can utilize its case base and query it to find a prediction of the
action that OPP is going to take in the current time step.</p>
        <p>
          Retrieval and Similarity Measures. We model the problem similarity using the
local-global principle [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] with identical local similarity measures for all problem
attributes, sim_i(q_i, c_i) = (1 − |q_i − c_i| / (max_i − min_i))², where min_i and max_i denote the
minimum and maximum value of the domain of the i-th feature. The global similarity
is formed as a weighted average according to
Sim(q, c) = (Σ_{i=1}^n w_i sim_i(q_i, c_i)) / (Σ_{i=1}^n w_i),
where the attributes pos^n_{b,x} and pos^n_{b,y} are weighted twice as much as vel^n_{b,x}
and vel^n_{b,y}.
        </p>
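<p>A minimal sketch of this similarity computation (the exact form of the local measure is our reconstruction of the garbled formula: the squared, range-normalized complement of the absolute difference; weights and domain bounds are supplied by the caller):</p>

```python
def local_sim(q_i, c_i, lo, hi):
    """Local similarity of one attribute over its domain [lo, hi]."""
    return (1.0 - abs(q_i - c_i) / (hi - lo)) ** 2

def global_sim(q, c, weights, bounds):
    """Weighted-average amalgamation following the local-global principle."""
    total = sum(w * local_sim(qi, ci, lo, hi)
                for qi, ci, w, (lo, hi) in zip(q, c, weights, bounds))
    return total / sum(weights)
```

<p>With weights (2, 2, 1, 1), the position attributes count twice as much as the velocity attributes, as in the text.</p>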
        <p>We perform standard k-nearest neighbor retrieval using a value of k = 3 in
our experiments. When predicting the class of the solution, i.e. the type of the
low-level action (dash or kick), we apply majority voting, and for the prediction
of the action parameters (x and θ) we calculate the average over all cases among
the k nearest neighbors whose class label matches the majority class.</p>
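<p>The retrieval and reuse step just described can be sketched as follows; the case representation (problem, (label, power, angle)) follows the case structure described earlier, while the global similarity function is passed in by the caller:</p>

```python
from collections import Counter

def predict_action(query, case_base, sim, k=3):
    """Rank cases by similarity to the query, take the k nearest,
    majority-vote on the action label, and average the parameters
    (power x, angle theta) over the neighbors of the winning class.
    'sim(q, p)' is any global similarity measure (higher = more similar)."""
    ranked = sorted(case_base, key=lambda c: sim(query, c[0]), reverse=True)
    nearest = ranked[:k]
    majority = Counter(lbl for _, (lbl, _, _) in nearest).most_common(1)[0][0]
    matching = [(x, th) for _, (lbl, x, th) in nearest if lbl == majority]
    x_pred = sum(x for x, _ in matching) / len(matching)
    th_pred = sum(th for _, th in matching) / len(matching)
    return majority, x_pred, th_pred
```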
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Experimental Results</title>
      <p>To evaluate our approach we selected a set of contemporary soccer simulation
team binaries (top teams from recent years) and made one of their agents (OPP)
dribble for up to 2000 simulated time steps⁴. Our case-based agent CBA was
allowed meanwhile to observe OPP and build up its case base. We evaluated
CBA's performance in predicting OPP's low-level actions for increasing case
base sizes.</p>
      <p>Figure 2 visualizes the learning progress against an opponent agent from
team WrightEagle. As can be seen, compelling accuracies can be achieved both
for the correctness of the type of the action (dash or kick) and for the
corresponding action parameters. Interestingly, even the relative power / angle of
kicks can be predicted quite reliably, with a remaining absolute error of less than
ten percent / ten degrees.
⁴ The duration of a regular match is 6000 time steps.</p>
      <p>[Fig. 2. Classification error (dash vs. kick, left axis; marked values 18.7%, 13.0%, 9.0%) and errors in the predicted dash power, kick power, and kick angle (right axis), plotted over the number of cases in the case base (3 to 1000).]</p>
      <p>Figure 3 focuses on different opponent agents and highlights the fact that a
substantial improvement in action type prediction accuracy can be obtained with
as few as 100 collected cases. The baseline for all these classification experiments is
the error of the "trivial" classifier (black) that predicts each action type to be
of the majority class. The right part of Figure 3 presents the recall of both
dash and kick actions. Apparently, dashes are somewhat easier to predict than
kicks where, however, the recall of the latter is still above 65% for each of the
opponent agents considered.</p>
      <p>In Figure 4, we present aggregate numbers (averages over all opponents) that
emphasize how accurately the parameters of an action were predicted, given that
the type of the action could be identified correctly. To this end, dash angles
are disregarded, since more than 99.2% of all dash actions performed used
θ = 0°, i.e. yielded a dash forward. Here, we compare (a) an "early" case base
with only 10 cases, (b) an intermediate one⁵ with |C| = 100, as well as (c)
one that has resulted from 2000 simulated time steps and contains circa 1500
cases. Interestingly, even in (a) comparatively low errors can be obtained. In (b)
and (c), however, the resulting average absolute prediction errors become really
competitive (≈2.9 for dash powers x with x ∈ [0, 100], ≈6.3 for kick powers x
with x ∈ [0, 100], and ≈19.7° for kick angles θ with θ ∈ [0°, 360°]).</p>
      <p>[Fig. 4. Average absolute prediction errors (standard deviations in gray) for |CB| = 10 / |CB| = 100 / |CB| ≈ 1500: dash power 8.3 / 3.7 / 2.9, kick power 12.3 / 9.0 / 6.3, and kick angle (within [0°, 360°]) 35.3 / 23.6 / 19.7.]</p>
      <p>Clearly, dribbling opponents are very likely to behave differently when they are
disturbed, tackled, or attacked by a nearby opponent. Therefore, the approach
presented needs to be extended to "duelling situations" as they frequently arise
in real matches. For example, in scenarios like that the dribbler will presumably
not just dribble straight ahead, but also frequently execute turn actions (e.g. in
order to dribble around its disturber). This represents an aggravation of the
action type prediction problem, since then three instead of two classes of actions
must be considered (dash, kick, turn).</p>
      <p>While the case study presented focused solely on non-attacked dribbling
opponents, this approach can easily be transferred to related or similar situations
where knowing the opponent's next move is crucial, too. This includes, but is
not limited to, the behavior of an opponent striker when trying to perform a
shot on the goal (which typically requires a couple of time steps), the
behavior of the shooter as well as the goalkeeper during penalty shoot-outs, or the
positioning behavior of the opponent goalie (anticipating which can be essential
for the striker).</p>
      <p>As a next step, we plan to combine the presented case-based prediction of
low-level actions with the reinforcement learning-based acquisition of agent behaviors
as outlined in Section 2.3. This involves, first, solving the aggravated problem of
correctly recognizing the three different classes of low-level actions mentioned at the
beginning of this section and, second, a proper utilization of the thereby obtained
improved model when learning a higher-level duelling skill using RL. Another
interesting direction for future work is the idea to let CBA start off with some
opponent model in the form of a case base acquired offline (against, for example,
an older version of the team to be faced) and, using appropriate techniques for
case base maintenance, to successively replace old experience with new experience
gained online during the current match.
⁵ A case base of a size of about 100 to 500 cases can easily be created within the first
half of a match for most players.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Ahmadi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keighobadi-Lamjiri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nevisi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Habibi</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Badie</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Using a Two-Layered Case-Based Reasoning for Prediction in Soccer Coach</article-title>
          .
          <source>In: Proceedings of the International Conference of Machine Learning; Models, Technologies and Applications (MLMTA'03)</source>
          . pp.
          <volume>181</volume>
          -
          <fpage>185</fpage>
          . CSREA Press (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bergmann</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Richter</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmitt</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stahl</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vollrath</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Utility-Oriented Matching: A New Research Direction for Case-Based Reasoning</article-title>
          .
          <source>In: Proceedings of the 9th German Workshop on Case-Based Reasoning</source>
          . pp.
          <volume>264</volume>
          -
          <issue>274</issue>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Carvalho</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cheriton</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Reinforcement Learning for the Soccer Dribbling Task</article-title>
          .
          <source>In: Proceedings of IEEE Conference on Computational Intelligence and Games (CIG)</source>
          . pp.
          <volume>95</volume>
          -
          <fpage>101</fpage>
          . Seoul, South Korea (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Denzinger</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hamdan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Improving Modeling of Other Agents Using Stereotypes and Compactification of Observations</article-title>
          .
          <source>In: Proceedings of Third International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS)</source>
          . pp.
          <volume>1414</volume>
          -
          <fpage>1415</fpage>
          . New York, USA (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Floyd</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Esfandiari</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lam</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>A Case-Based Reasoning Approach to Imitating RoboCup Players</article-title>
          .
          <source>In: Proceedings of the 21st International Florida Artificial Intelligence Research Society Conference</source>
          . pp.
          <volume>251</volume>
          -
          <fpage>256</fpage>
          .
          Coconut Grove
          , USA (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Gabel</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Riedmiller</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>CBR for State Value Function Approximation in Reinforcement Learning</article-title>
          .
          <source>In: Proceedings of the 6th International Conference on Case-Based Reasoning (ICCBR</source>
          <year>2005</year>
          ). pp.
          <volume>206</volume>
          -
          <fpage>221</fpage>
          . Springer, Chicago, USA (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Gabel</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Riedmiller</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trost</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>A Case Study on Improving Defense Behavior in Soccer Simulation 2D: The NeuroHassle Approach</article-title>
          . In:
          <string-name>
            <given-names>L.</given-names>
            <surname>Iocchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Matsubara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Weitzenfeld</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhou</surname>
          </string-name>
          , editors,
          <source>RoboCup 2008: Robot Soccer World Cup XII, LNCS</source>
          . pp.
          <fpage>61</fpage>
          –
          <lpage>72</lpage>
          . Springer, Suzhou, China (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kalyanakrishnan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stone</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Half Field Offense in RoboCup Soccer: A Multiagent Reinforcement Learning Case Study</article-title>
          . In: RoboCup-2006: Robot Soccer World Cup X. pp.
          <fpage>72</fpage>
          –
          <lpage>85</lpage>
          . Springer Verlag, Berlin (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Kuhlmann</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stone</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Progress in Learning 3 vs. 2 Keepaway</article-title>
          . In: RoboCup-2003: Robot Soccer World Cup VII. pp.
          <fpage>694</fpage>
          –
          <lpage>702</lpage>
          . Springer, Berlin (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Noda</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Matsubara</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hiraki</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frank</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Soccer Server: A Tool for Research on Multi-Agent Systems</article-title>
          .
          <source>Applied Artificial Intelligence</source>
          <volume>12</volume>
          (
          <issue>2-3</issue>
          ),
          <fpage>233</fpage>
          –
          <lpage>250</lpage>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Steffens</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Similarity-Based Opponent Modelling Using Imperfect Domain Theories</article-title>
          .
          <source>In: Proceedings of the IEEE Symposium on Computational Intelligence and Games (CIG05)</source>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Veloso</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balch</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stone</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>RoboCup-2001: The Fifth Robotic Soccer World Championships</article-title>
          .
          <source>AI Magazine</source>
          <volume>23</volume>
          (
          <issue>1</issue>
          ),
          <fpage>55</fpage>
          –
          <lpage>68</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Wendler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bach</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Recognizing and Predicting Agent Behavior with Case-Based Reasoning</article-title>
          . In:
          <string-name>
            <given-names>D.</given-names>
            <surname>Polani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bonarini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Browning</surname>
          </string-name>
          (editors),
          <source>RoboCup 2003: Robot Soccer World Cup VII</source>
          . pp.
          <fpage>729</fpage>
          –
          <lpage>738</lpage>
          .
          Padova, Italy
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>