=Paper=
{{Paper
|id=Vol-494/paper-39
|storemode=property
|title=Electricity Market (Virtual) Agents
|pdfUrl=https://ceur-ws.org/Vol-494/masspaper2.pdf
|volume=Vol-494
|dblpUrl=https://dblp.org/rec/conf/mallow/TrigoMC09
}}
==Electricity Market (Virtual) Agents==
Paulo Trigo (LabMAg, GuIAA; DEETC, ISEL; Instituto Superior de Eng. de Lisboa, Portugal; Email: ptrigo@deetc.isel.ipl.pt)

Paulo Marques (GuIAA; DEETC, ISEL; Instituto Superior de Eng. de Lisboa, Portugal; Email: 28562@alunos.isel.ipl.pt)

Helder Coelho (LabMAg; DI, FCUL; Faculdade de Ciências da Univ. de Lisboa, Portugal; Email: hcoelho@di.fc.ul.pt)

Abstract: This paper describes a multi-agent based simulation (MABS) framework to construct an artificial electric power market populated with learning agents. The artificial market, named TEMMAS (The Electricity Market Multi-Agent Simulator), explores the integration of two design constructs: i) the specification of the environmental physical market properties, and ii) the specification of the decision-making (deliberative) and reactive agents. TEMMAS is materialized in an experimental setup involving distinct power generator companies which operate in the market and search for the trading strategies that best exploit their generating units’ resources. The experimental results show a coherent market behavior that emerges from the overall simulated environment.

I. INTRODUCTION

The start-up of nation-wide electric markets, along with their recent expansion to intercountry markets, aims at providing competitive electricity service to consumers. The new market-based power industry calls for human decision-making in order to settle the energy assets’ trading strategies. The interactions and influences among the market participants are usually described by game theoretic approaches, which are based on the determination of equilibrium points against which the actual market performance is compared [1], [2]. However, those approaches find it difficult to incorporate the ability of market participants to repeatedly probe markets and adapt their strategies. Usually, the problem of finding the equilibria strategies is relaxed (simplified) both in terms of: i) the human agents’ bidding policies, and ii) the technical and economical operation of the power system.

As an alternative to the equilibrium approaches, the multi-agent based simulation (MABS) comes forth as being particularly well fitted to analyze dynamic and adaptive systems with complex interactions among constituents [3], [4].

In this paper we describe a MABS modeling framework that provides constructs for the (human) designer to specify a dynamic environment, its resources, observable properties and its inhabitant decision-making agents. We used the framework to capture the behavior of the electricity market and to build a simulator, named TEMMAS (The Electricity Market Multi-Agent Simulator), which incorporates the operation of several generator company (GenCo) operators, each with distinct power generating units (GenUnit), and a market operator (Pool) which computes the hourly market price (driven by the electricity demand).

TEMMAS agents exhibit bounded rationality, i.e., they make decisions based on local information (partial knowledge) of the system and of other agents while learning and adapting their strategies during a simulation. The TEMMAS purpose is not to explicitly search for equilibrium points, but rather to reveal and help to understand the complex and aggregate system behaviors that emerge from the interactions of the market agents.

II. THE MABS MODELING FRAMEWORK

We describe the structural MABS constituents by means of two concepts: i) the environmental entity, which owns a distinct existence in the real environment, e.g. a resource such as an electricity producer, or a decision-making agent such as a market bidder generator company, and ii) the environmental property, which is a measurable aspect of the real environment, e.g. the price of a bid or the demand for electricity. Hence, we define the environmental entity set, $E_T = \{ e_1, \ldots, e_n \}$, and the environmental property set, $E_Y = \{ p_1, \ldots, p_m \}$. The whole environment is the union of its entities and properties: $E = E_T \cup E_Y$.

The environmental entities, $E_T$, are often clustered in different classes, or types, thus partitioning $E_T$ into a set, $P_{E_T}$, of disjoint subsets, $P^i_{E_T}$, each containing entities that belong to the same class. Formally, $P_{E_T} = \{ P^1_{E_T}, \ldots, P^k_{E_T} \}$ defines a full partition of $E_T$, such that $P^i_{E_T} \subseteq E_T$, $E_T = \cup_{i=1 \ldots k} P^i_{E_T}$ and $P^i_{E_T} \cap P^j_{E_T} = \emptyset \ \forall i \neq j$. The partitioning may be used to distinguish between decision-making agents and available resources, e.g. a company that decides the bidding strategy to pursue or a plant that provides the demanded power.

The environmental properties, $E_Y$, can also be clustered, in a similar way as for the environmental entities, thus grouping properties that are related. The partitioning may be used to express distinct categories, e.g. economical, electrical, ecological or social aspects. Another, more technical usage, is to separate constant parameters from dynamic state variables.

The factored state space representation. The state of the simulated environment is implicitly defined by the state of all its environmental entities and properties. We follow a factored representation that describes the state space as a set, $V$, of discrete state variables [5]. Each state variable, $v_i \in V$, takes on values in its domain $D(v_i)$ and the global (i.e., over $E$) state space, $S \subseteq \times_{v_i \in V} D(v_i)$, is a subset of the Cartesian product of the state variable domains. A state $s \in S$ is an
assignment of values to the set of state variables $V$. We define $f_C$, $C \subseteq V$, as a projection such that if $s$ is an assignment to $V$, $f_C(s)$ is the assignment of $s$ to $C$; we define a context $c$ as an assignment to the subset $C \subseteq V$; the initial state variables of each entity and property are defined, respectively, by the functions $init_{E_T} : E_T \to C$ and $init_{E_Y} : E_Y \to C$.

From environmental entities to resources and agents. The embodiment is central in describing the relation between the entities and the environment [6]. Each environmental entity can be seen as a body, possibly with the capability to influence the environmental properties. Based on this idea of embodiment, two higher-level concepts (decoupled from the environment, $E$, characterization) are introduced: i) agent, owning reasoning and decision-making capabilities, and ii) resource, without any reasoning capability. Thus, given a set of agents, $\Upsilon$, we define an association function $embody : \Upsilon \to E_T$, which connects an agent to its physical entity. In a similar way, given a set of resources, $\Phi$, we define the mapping function $identity : \Phi \to E_Y$. We consider that $|E| = |\Upsilon| + |\Phi|$, thus each entity is either mapped to an agent or to a resource; there is no third category.

The decision-making approach. Each agent perceives (the market) and acts (sells or buys) and there are two main approaches to develop the reasoning and decision-making capabilities: i) the qualitative mental-state based reasoning, such as the belief-desire-intention (BDI) architecture [7], which is founded on logic theories, and ii) the quantitative, decision-theoretic, evaluation of causal effects, such as the Markov decision process (MDP) support for sequential decision-making in stochastic environments. There are also hybrid approaches that combine the qualitative and quantitative formulations [8], [9].

The qualitative mental-state approaches capture the relation between high level components (e.g. beliefs, desires, intentions) and tend to follow heuristic (or rule-based) decision-making strategies, thus being better fitted to tackle large-scale problems and worse fitted to deal with stochastic environments. The quantitative decision-theoretic approaches deal with low level components (e.g., primitive actions and immediate rewards) and search for long-term policies that maximize some utility function, thus being worse fitted to tackle large-scale problems and better fitted to deal with stochastic environments.

The electric power market is a stochastic environment and we currently formulate medium-scale problems that can fit a decision-theoretic agent model. Therefore, TEMMAS adaptive agents (e.g., market bidders) follow an MDP based approach and resort to experience (sampled sequences of states, actions and rewards from simulated interaction) to search for optimal, or near-optimal, policies using reinforcement learning methods such as Q-learning [10] or SARSA [11].

III. TEMMAS DESIGN

Within the current design model of TEMMAS the electricity asset is traded through a spot market (no bilateral agreements), which is operated via a Pool institutional power entity. Each generator company, GenCo, submits (to Pool) how much energy each of its generating units, $GenUnit_{GenCo}$, is willing to produce and at what price. Thus, we have: i) the power supply system comprises a set, $E_{GenCo}$, of generator companies, ii) each generator company, GenCo, contains its own set, $E_{GenUnit_{GenCo}}$, of generating units, iii) each generating unit, $GenUnit_{GenCo}$, of a GenCo, has constant marginal costs, and iv) the market operator, Pool, trades all the GenCos’ submitted energy.

The bidding procedure conforms to the so-called “block bids” approach [12], where a block represents a quantity of energy being bid for a certain price; also, GenCos are not allowed to bid higher than a predefined price ceiling. Thus, the essential measurable aspects of the market supply are the energy price, quantity and production cost. The consumer side of the market is mainly described by the quantity of demanded energy; we assume that there is no price elasticity of demand (i.e., no demand-side market bidding).

Therefore, we have: $E_T = \{ Pool \} \cup E_{GenCo} \cup_{g \in E_{GenCo}} E_{GenUnit_g}$ and $E_Y = \{ quantity, price, productionCost \}$. The quantity refers both to the supply and demand sides of the market. The price refers both to the supply bid values and to the market settled (by Pool) value.

The $E_{GenCo}$ contains the decision-making agents. The Pool is a reactive agent that always applies the same predefined auction rules in order to determine the market price and hence the block bids that clear the market. Each $E_{GenUnit_{GenCo}}$ represents the GenCo’s set of available resources.

The resources’ specification. Each generating unit, $GenUnit_{GenCo}$, defines its marginal costs and constructs the block bids according to the strategy indicated by its generator company, GenCo. Each $GenUnit_{GenCo}$ calculates its marginal costs according to either the “WithHeatRate” [13] or the “WithCO2” [14] formulation.

The “WithHeatRate” formulation estimates the marginal cost, $MC$, by combining the variable operations and maintenance costs, $vO\&M$, the number of heat rate intervals, $nPat$, each interval’s capacity, $cap_i$, and the corresponding heat rate value, $hr_i$, and the price of the fuel, $fPrice$, being used; the marginal cost for a given $i \in [1, nPat]$ interval is given by,

$$MC_{i+1} = vO\&M + \frac{(cap_{i+1} \times hr_{i+1}) - (cap_i \times hr_i)}{blockCap_{i+1}} \times fPrice \tag{1}$$

where each block’s capacity is given by: $blockCap_{i+1} = cap_{i+1} - cap_i$.

The “WithCO2” marginal cost, $MC$, combines the variable operations and maintenance costs, $vO\&M$, the price of the fuel, $fPrice$, the CO2 cost, $CO2cost$, and the unit’s productivity, $\eta$, through the expression,

$$MC = \frac{fPrice}{\eta} \times K + CO2cost + vO\&M \tag{2}$$

where $K$ is a fuel-dependent constant factor, and $CO2cost$ is given by,

$$CO2cost = CO2price \times \frac{CO2emit}{\eta} \times K \tag{3}$$
where $CO2emit$ is the CO2 fuel’s emissions. Here all blocks have the same capacity; given a unit’s maximum capacity, $maxCap$, and a number of blocks, $nBlocks$, to sell, each block’s capacity is given by: $blockCap = maxCap / nBlocks$.

The decision-making strategies. Each generator company defines the bidding strategy for each of its generating units. We designed two types of strategies: a) the basic-adjustment, that chooses among a set of basic rigid options, and b) the heuristic-adjustment, that selects and follows a predefined well-known heuristic. There are several basic-adjustment strategies already defined in TEMMAS. Here we outline seven of those strategies, $sttg_i$ where $i \in \{ 1, \ldots, 7 \}$, available for a GenCo to apply: i) $sttg_1$, bid according to the marginal production cost of each $GenUnit_{GenCo}$ (follow heat rate curves, e.g., cf. Tables II and III), ii) $sttg_2$, make a “small” increment in the prices of all the previous-day’s block bids, iii) $sttg_3$, similar to $sttg_2$, but makes a “large” increment, iv) $sttg_4$, make a “small” decrement in the prices of all the previous-day’s block bids, v) $sttg_5$, similar to $sttg_4$, but makes a “large” decrement, vi) $sttg_6$, hold the prices of all previous-day’s block bids, vii) $sttg_7$, set the price to zero.

There are two heuristic-adjustment defined strategies: a) the “Fixed Increment Price Probing” (FIPP) that uses a percentage to increment the price of last day’s transacted energy blocks and to decrement the non-transacted blocks, and b) “Physical Withholding based on System Reserve” (PWSR) that reduces the block’s capacity, so as to decrement the next day’s estimated system reserve (difference between total capacity and total demand), and then bids the remaining energy at the maximum market price.

The agents’ decision process. The above strategies correspond to the GenCo agent’s primary actions. The GenCo has a set, $E_{GenUnit_{GenCo}}$, of generating units and, at each decision-epoch, it decides the strategy to apply to each generating unit, thus choosing a vector of strategies, $\overrightarrow{sttg}$, where the $i$th vector’s component refers to the $GenUnit^i_{GenCo}$ generating unit; thus, its action space is given by: $A = \times_{i=1}^{|E_{GenUnit_{GenCo}}|} \{ sttg_1, \ldots, sttg_7 \}_i \cup \{ FIPP, PWSR \}$. The GenCo’s perceived market share, $mShare$, is used to characterize the agent internal memory so its state space is given by $mShare \in [ 0..100 ]$. Each GenCo is an MDP decision-making agent such that the decision process period represents a daily market. At each decision-epoch each agent computes its daily profit (that is regarded as an internal reward function) and the Pool agent receives all the GenCos’ block bids for the 24 daily hours and settles the hourly market price by matching offers in a classic supply and demand equilibrium price (we assume an hourly constant demand).

TEMMAS architecture and construction. The TEMMAS agents along with the major inter-agent communication paths are represented in the bottom region of Figure 1; the top region represents the user interface that allows specifying each of the resources’ and agents’ configurable parameters. The implementation of the TEMMAS architecture followed the INGENIAS [15] methodology and used its supporting development platform. Figure 2 presents the general “agent’s perspective”, where the tasks and the goals are clustered into individual and social perspectives. Figure 3 gives additional detail on the construction of tasks and goals using INGENIAS.

[Figure: a "User Interface" region above an "Agents" region containing Generating Unit, Generator Company, Market Operator (Pool) and Buyer; legend: Marginal Cost, Sale Offers, Buying Offers, Market Results.]
Fig. 1. The TEMMAS architecture and the configurable parameters.

[Figure: diagram split into an individual perspective and a social perspective.]
Fig. 2. TEMMAS agent’s view using INGENIAS framework.

[Figure: diagram with relations labeled "satisfies", "consumes" and "uses".]
Fig. 3. TEMMAS tasks and goals specification using INGENIAS framework.

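The “WithHeatRate” expression (1) from the resources’ specification can be sketched in Python as follows. This is only a minimal illustration, not the TEMMAS implementation: the function name `block_bids` and the list-of-pairs curve layout are hypothetical, the sample data are the CO values that the paper lists in Tables I and II, and dividing the heat rate by 1000 (Btu/kWh to MMBtu/MWh) is an assumption about units, adopted because it reproduces the marginal costs tabulated in Table II.

```python
from typing import List, Tuple

# (cumulative capacity in MW, heat rate in Btu/kWh): CO unit values from Table II.
CO_CURVE: List[Tuple[float, float]] = [
    (250, 12000), (350, 10500), (400, 10080), (450, 9770), (500, 9550),
]


def block_bids(
    curve: List[Tuple[float, float]],
    fuel_price: float,   # fuel price in EUR/MMBtu (Table I)
    variable_om: float,  # variable O&M cost in EUR/MWh (Table I)
) -> List[Tuple[float, float]]:
    """Return (block capacity in MW, marginal cost in EUR/MWh) per block.

    Implements expression (1) for each pair of consecutive curve points;
    the first curve point has no marginal cost, as in Table II.
    """
    bids = []
    for (cap_prev, hr_prev), (cap_next, hr_next) in zip(curve, curve[1:]):
        # blockCap_{i+1} = cap_{i+1} - cap_i
        block_cap = cap_next - cap_prev
        # Incremental heat rate of the block, in Btu/kWh.
        incremental_heat = (cap_next * hr_next - cap_prev * hr_prev) / block_cap
        # Assumed unit conversion: Btu/kWh divided by 1000 gives MMBtu/MWh.
        marginal_cost = variable_om + (incremental_heat / 1000.0) * fuel_price
        bids.append((block_cap, marginal_cost))
    return bids


for quantity, cost in block_bids(CO_CURVE, fuel_price=1.5, variable_om=1.75):
    print(f"{quantity:.0f} MW at {cost:.2f} EUR/MWh")
```

Under these assumptions the first CO block is 350 − 250 = 100 MW, and its computed cost rounds to the 11.9 listed in Table II; the same function can be applied to the CC and GT curves with their own fuel price and O&M values.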
IV. TEMMAS ILLUSTRATIVE SETUP

We used TEMMAS to build a specific electric market simulation model. We picked the inspiration from the Iberian Electricity Market (MIBEL – “Mercado Ibérico de Electricidade”) with Portuguese (e.g., EDP - “Electricidade de Portugal”, “Turbogás”, “Tejo Energia”) and Spanish (e.g., “Endesa”, “Iberdrola”, “Union Fenosa”, “Hidro Cantábrico”, “Viesgo”, “Gas Natural”, “Elcogás”) generator companies. Regarding the total electricity capacity installed, the Iberian market is composed of a major player (Spain) and a minor player (Portugal). Our experiments exploit the combined market behavior of a major and a minor electricity market player.

We abstracted intra-nation market details and modeled each country as a single generator company (with several generating units). Figure 4 uses INGENIAS notation to depict the hierarchical structure of the electricity market; the Pool (OMEL – “Operador do Mercado Ibérico de Electricidade”) settles the market price (and coupled bids) after the bids submitted by each GenCo (PT – “Portugal” and ES – “Spain”) according to a strategy that depends on the marginal production costs of each GenUnit.

Fig. 4. An illustrative TEMMAS formulation (using INGENIAS notation).

We considered three types of generating units: i) one base load coal plant, CO, ii) one combined cycle plant, CC, to cover intermediate load, and iii) one gas turbine, GT, peaking unit. Table I shows the essential properties of each plant type and Tables II and III show the heat rate curves used to define the bidding blocks. The marginal cost was computed using expression (1); the bidding block’s quantity is the capacity increment, e.g. for CO, the 11.9 marginal cost bidding block’s quantity is 350 − 250 = 100 MW (cf. Table II, CO, top lines 2 and 1).

TABLE I
PROPERTIES OF GENERATING UNITS; THE UNITS’ TYPES ARE COAL (CO), COMBINED CYCLE (CC) AND GAS TURBINE (GT); THE O&M INDICATES “OPERATION AND MAINTENANCE” COST.

 Property       Unit       CO           CC         GT
 Fuel           -          Coal (BIT)   Nat. Gas   Nat. Gas
 Capacity       MW         500          250        125
 Fuel price     €/MMBtu    1.5          5          5
 Variable O&M   €/MWh      1.75         2.8        8

TABLE II
CO AND CC UNIT’S CAPACITY BLOCK (MW) AND HEAT RATE (Btu/kWh) AND THE CORRESPONDING MARGINAL COST (€/MWh).

 CO generating unit                  CC generating unit
 Cap.   Heat rate   Marg. cost       Cap.   Heat rate   Marg. cost
 250    12000       -                100    9000        -
 350    10500       11.9             150    7800        29.8
 400    10080       12.5             200    7200        29.8
 450    9770        12.7             225    7010        30.3
 500    9550        13.1             250    6880        31.4

TABLE III
GT UNIT’S CAPACITY BLOCK (MW) AND HEAT RATE (Btu/kWh) AND THE CORRESPONDING MARGINAL COST (€/MWh).

 GT generating unit
 Cap.   Heat rate   Marg. cost
 50     14000       -
 100    10600       44.0
 110    10330       46.2
 120    10150       48.9
 125    10100       52.5

V. EXPERIMENTS AND RESULTS

Our experiments have two main purposes: i) illustrate the TEMMAS functionality, and ii) analyze the agents’ resulting behavior, e.g. the learnt bidding policies, in light of the market specific dynamics.

We designed three experimental scenarios and Table IV shows the GenCo’s name along with its production capacity, computed according to the respective GenUnits (cf. Table I). The “active” suffix (cf. Table IV, name column) means that the GenCo searches for its GenUnits’ best bidding strategies; i.e. “active” is a policy learning agent.

TABLE IV
THE EXPERIMENT’S GenCoS AND GenUnitS.

 Exp.   GenCo name           Prod. Capac.   GenUnits
 #1     GenCo_active         875            CO & CC & GT
 #2     GenCo_major          2000           2×CO & 4×CC
 #2     GenCo_minor&active   875            3×CC & 1×GT
 #3     GenCo_major&active   2000           2×CO & 4×CC
 #3     GenCo_minor&active   875            3×CC & 1×GT

Experiment #1. The experiment sets a constant, 600 MW, hourly demand for electricity. Figure 5 shows the GenCo_active process of learning the bidding policy that gives the highest long-term profit. We used Q-learning, with an ε-greedy exploration strategy, which picks a random action with probability ε and behaves greedily otherwise (i.e., picks
the action with the highest estimated action value); we defined ε = 0.2. The learning rate of Q-learning was defined as α = 0.01 and the discount factor (which measures the present value of future rewards) was set to γ = 0.5. Figure 6 shows the bid blocks that cleared the market (at the first hour of last simulated day). As there is no market competition the cheapest, CO, bids zero, the GT sets the market price (to its ceiling) and the most expensive 200 MW are distributed among the most expensive GenUnits (CC, GT). Therefore, the GenCo_active agent found, for each perceived market share, $mShare$, the best strategy, $\overrightarrow{sttg}$, to bid its GenUnits’ energy blocks.

[Figure: plot "Profit of GenCo_active"; x-axis: Simulation Cycle (1 Day; 24 Hours), 0 to 2400; y-axis: Profit (M€), -0.5 to 2.5.]
Fig. 5. The process of learning a bid policy to maximize profit. [Exp. #1]

[Figure: plot "GenCo_active Coupled Block Bids (Day=2500; Hour=1)"; x-axis: Capacity (MW), 0 to 600; y-axis: Price (€/MWh), 0 to 180; series: Base Coal (CO), Comb. Cycle (CC), Gas Turbine (GT).]
Fig. 6. The bid policy that maximizes profit (price ceiling is 180). [Exp. #1]

Experiment #2. The experiment sets a constant, 2000 MW, hourly demand for electricity. Figure 7 shows the market share evolution while GenCo_minor&active learns to play in the market with GenCo_major, which is a larger company with a fixed strategy: “bid each block 5 € higher than its marginal cost”. We see that GenCo_minor&active gets around 18% (75 − 57) of market from GenCo_major. To earn that market the GenCo_minor&active learnt to lower its prices in order to exploit the “5 € space” offered by the GenCo_major fixed strategy.

[Figure: plot "GenCos' Market Share"; x-axis: Simulation Cycle (1 Day; 24 Hours), 0 to 100; y-axis: Market Share (%), 0 to 100; series: GenCo_major, GenCo_minor&active.]
Fig. 7. Market share evolution induced by GenCo_minor&active. [Exp. #2]

Experiment #3. In this experiment both GenCos are “active”; the remaining is the same as in experiment #2. Figure 8 shows the market share oscillation while each company reacts to the other’s strategy to win the market. Despite the competition each company learns to secure its own fringe of the market.

[Figure: plot "GenCos' Market Share"; x-axis: Simulation Cycle (1 Day; 24 Hours), 0 to 5000; y-axis: Market Share (%), 0 to 100; series: GenCo_major&active, GenCo_minor&active.]
Fig. 8. Market share evolution induced by both GenCos. [Exp. #3]

VI. CONCLUSIONS AND FUTURE WORK

This paper describes our preliminary work in the construction of a MABS framework to analyze the macro-scale dynamics of the electric power market. Although both research fields (MABS and market simulation) achieved considerable progress there is a lack of cross-cutting approaches. We used the proposed MABS framework to support our preliminary work in the construction of the TEMMAS agent-based electricity market simulator.

Hence, our contribution is twofold: i) a comprehensive formulation of MABS, including the simulated environment and the inhabiting decision-making and learning agents, and ii) a simulation model (TEMMAS) of the electric power market framed in the proposed formulation.

Our initial results reveal an emerging and coherent market behavior, thus inciting us to further extend the experimental setup with additional bidding strategies and to incorporate specific market rules, such as congestion management and pricing regulation mechanisms.

REFERENCES

[1] Berry, C., Hobbs, B., Meroney, W., O’Neill, R., Jr, W.S.: Understanding how market power can arise in network competition: a game theoretic approach. Utilities Policy 8(3) (September 1999) 139–158
[2] Gabriel, S., Zhuang, J., Kiet, S.: A Nash-Cournot model for the North American natural gas market. In: Proceedings of the 6th IAEE European Conference: Modelling in Energy Economics and Policy. (2–3 September 2004)
[3] Schuster, S., Gilbert, N.: Simulating online business models. In: Proceedings of the 5th Workshop on Agent-Based Simulation (ABS-04). (May 3–5 2004) 55–61
[4] Helleboogh, A., Vizzari, G., Uhrmacher, A., Michel, F.: Modeling dynamic environments in multi-agent simulation. JAAMAS 14(1) (2007) 87–116
[5] Boutilier, C., Dearden, R., Goldszmidt, M.: Exploiting structure in policy construction. In: Proceedings of the IJCAI-95. (1995) 1104–1111
[6] Clark, A.: Being there: putting brain, body, and world together again. MIT (1998)
[7] Rao, A., Georgeff, M.: BDI agents: From theory to practice. In: Proceedings of the First International Conference on Multiagent Systems, S (1995) 312–319
[8] Simari, G., Parsons, S.: On the relationship between MDPs and the BDI architecture. In: Proceedings of the AAMAS-06. (May 8–12 2006) 1041–1048
[9] Trigo, P., Coelho, H.: Decision making with hybrid models: the case of collective and individual motivations. International Journal of Reasoning Based Intelligent Systems (IJRIS); Inderscience Publishers (2009)
[10] Watkins, C., Dayan, P.: Q-learning. Mach. Learning 8 (1992) 279–292
[11] Sutton, R., Barto, A.: Reinforcement Learning: An Introduction. MIT Press (1998)
[12] OMIP - The Iberian Electricity Market Operator. online: ‘http://www.omip.pt’
[13] Botterud, A., Thimmapuram, P., Yamakado, M.: Simulating GenCo bidding strategies in electricity markets with an agent-based model. In: Proceedings of the 7th Annual IAEE European Energy Conference (IAEE-05). (August 28–30 2005)
[14] Sousa, J., Lagarto, J.: How market players adjusted their strategic behaviour taking into account the CO2 emission costs - an application to the Spanish electricity market. In: Proceedings of the 4th International Conference on the European Electricity Market (EEM-07), Cracow, Poland (May 23–27 2007)
[15] Gómez-Sanz, J., Fuentes-Fernández, R., Pavón, J., García-Magariño, I.: INGENIAS development kit: a visual multi-agent system development environment (BEST ACADEMIC DEMO OF AAMAS’08). In: Proceedings of the Seventh AAMAS, Estoril, Portugal (May 12-16 2008) 1675–1676