=Paper=
{{Paper
|id=Vol-1856/p03
|storemode=property
|title=Implementation of an FPGA testbed
as an educational experiment for students of
informatics
driven e-health systems
|pdfUrl=https://ceur-ws.org/Vol-1856/p03.pdf
|volume=Vol-1856
|authors=Lukas Stasytis
}}
==Implementation of an FPGA testbed
as an educational experiment for students of
informatics
driven e-health systems==
Implementation of an FPGA testbed
as an educational experiment for students of
informatics
Lukas Stasytis
Faculty of Informatics
Kaunas Technology University
Student g. 50, Kaunas 51368
e-mail: lukas.stasytis@ktu.edu
Abstract—The following paper describes a hardware imple- Fig. 1. Possible maze configurations
mentation of a simulation platform on an FPGA board to provide
a test bed for machine learning experimentation. The educational
value that such a project can have for a student in putting
theoretical programming, digital logic and computer architecture
principles to practice.
I. I NTRODUCTION
The author of this paper, while undertaking the 2nd semester
of a Computer Science program, took up the following project
as a means to put studied theoretical Digital Logic principles
to practical use beyond the typical course lab work and to
better understand how hardware is designed and implemented.
The goal of the project is to implement a two dimensional
simulation platform using the VHDL programming language
on to an FPGA (Field Programming Gate Array) board.
Machine learning implementations, like neural networks, could
then be attached to said implementation to run high speed
experiments. As such, this project is meant to lead up to an The black squares represent entities and the goal while the
artificial intelligence research project. A key focus of this grey squares represent walls. A surrounding wall is generated
paper is to show how taking up such a project was extremely around the structure, for reasons that will be discussed in the
valuable in revising old and learning new computer science level component implementation overview. The general
principles for the author. process run-down of the implementation is as follows:
(a) An external input command is given to start the simula-
tion after choosing a specific mode in which it will run.
(b) The initial maze is generated by taking into account the
II. I MPLEMENTATION OVERVIEW chosen mode.
(c) Each entity is given a random starting position and initial
As a whole, the implementation simulates a two dimensional maze information to help make the initial decision.
bit maze with ’1’ bit-wide entities that can traverse the maze (d) The simulation starts and over the course of a series of
taking turns in sequence. The maze itself can have a set counter clocks, each entity makes a decision of which
configuration to either simulate a complex maze, a flat field or direction to either move to or turn to.
a simple checker-board pattern. The entities are divided into (e) The entity decisions are sent to the component holding
two groups: the player and a list of guards. The goals of these the maze’s information to check if the decision is legal.
entities can vary, but the primary goal of the player is to reach (f) The newly generated maze information for the next
a designated tile on the maze while the goal of the guards is decision is sent back to each entity.
to catch the player. Figure 1 shows some of the possible maze (g) A turn summary is generated and sent to the player and,
configurations that the implementation covers. if chosen to, system output to store.
14
Copyright © 2017 held by the authors
(h) Based on the results of the turn, a new turn begins or the (f) The counter dictates which entity request will be handled
simulation is reset. first, allowing for a parallel and iterative request handling.
Figure 2 shows the entire structure of the implementation This is done by limiting each component’s functions to
with all its components. specific counter ticks.
(g) The level component, after checking each request -
Fig. 2. The base structure of the platform making sure the move is legal and that no collisions
are present, sends an answer back to the entities and
modifies the maze array to reflect on a passed or rejected
movement request. At the same time, a small field view of
the maze is generated for the entity to use when making
the next move. This information is sent to a transform
component.
(h) The transform component modifies the maze information
that goes to the entities. Specifically, removes entity view
vision behind walls.
(i) After each entities request has been handled and all the
answers have been delivered, a turn summary is generated
inside the level component and sent to the player and
the system output/storage. Based on this summary, the
simulation can continue or be reset due to player loss or
Going from right to left, the ten components are as follows: victory.
(a) Random number generator
(b) Maze generator
(c) Maze storage III. C OMPONENT OVERVIEW
(d) Level
(e) Main command A. State machines and the main command unit
(f) Transform
At the highest level, the platform acts as a state machine.
(g) Sub command
The main command component, which connects to each other
(h) Counter
component of the platform, has a three bit-wide variable
(i) Player
which determines which state the platform is in. Each other
(j) Guard (multiple)
component has a case statement at the very top, checking
The following is a short run down of each component and which state the platform is in to decide on a specific task
how they function in the platform as a whole before going that it will be performing. Figure 3 shows a bit dictionary of
deeper into the structure of each component and development the state signal.
principles:
(a) The main command component controls all the other
components in the system. The system inputs go directly Fig. 3. Bit dictionary of the state signal
to this component and based on those, the component
sends out a new state signal for each other component to
react to. This state signal defines what each component
should be doing - being on standby, preparing for a
simulation or running a simulation.
(b) Once the simulation request is made, the maze generator
starts generating the initial maze with the help of a
random number generation component.
(c) The newly generated maze is sent to the maze storage
component and from there, to the level component. The
maze generator continues to store new maps in the storage By using such a state configuration, modularity can be
component afterwards. achieved within the simulation to allow for different types of
(d) Initial maze information is sent to each entity from the simulations to be run without needing to recompile and re-
level component via the sub command component, which upload the hardware configuration. There exist two more main
is in charge of routing entity requests and maze controller state machine implementations within the platform, namely
(level) answers. the use of a counter to decide what state the turn is at and a
(e) With the initial configuration of each entity set, the actual state machine for the guard artificial intelligence, albeit only
simulation starts. Each entity sends out a request for a for testing purposes. Both will be explored in their respective
new move. sections.
15
B. Maze generator older maze configuration to a new one once a maze has been
The current primary maze generation module uses a re- cleared. Given how a single maze can take a few hundred
cursive backtracking generation algorithm implemented iter- clock cycles to build and a single turn takes ten cycles, we
atively using a stack of 14 bits the size of the maze path. 7 can see in practice that a map buffer can easily be achieved
for the x and y coordinates of each tile that has already been and used to not have to wait in between the player beating
traversed. Over a few hundred series of clocks, the component mazes. The maze stack also allows for pre-generated mazes
generates a full maze of a given size and then sends the maze to be input into the system from external memory.
one line at a time to the maze storage component. The E. Level component
generator itself is fed with a random number each clock cycle
The level component has four functions:
from the random number generation component and uses said
number to decide in which direction to continue digging paths. (a) Storing the current simulation maze array with all neces-
All the variables are stored in registers and the maze’s size sary information of each tile
goes up to 128*128 tiles wide, but could be adjusted with (b) Generating the starting positions of each entity and send-
only a few edits in the code. ing them out.
A list of algorithms were looked at to determine which (c) Checking for the legality of incoming movement requests
one would best suit a hardware implementation. Some of the and making changes to the maze array appropriately. That
algorithms looked at were: is, making an entity position change reflect in the actual
array that holds the maze for collision checking.
(a) Kruskal’s algorithm - very easy and quick implementa-
(d) Generating maze information for each entity when han-
tion, but produces a lot of dead-ends in the maze, which
dling a movement request
can be very difficult for an AI to overcome.
(b) Recursive division algorithm - requires very large stacks The level component is the heart of the platform that
to implement iteratively, would likely consume too much practically holds the entire simulation within itself. Entity
space on an FPGA. information and stored mazes for future simulation cycles are
the only external pieces of information that the level
The recursive backtracking algorithm was the final choice
component does not hold within itself. The component’s basic
because of its simple iterative implementation that requires a
structure is based on a series of if statements that determine
single, easy to read stack and highly branching paths, which
if the entity making a request is a player or a guard, if it is
could potentially allow for a more fluid machine learning
requesting to turn their vision or to make a movement and
process. An iterative approach was picked, because there is
if the entity has other entities in its field of view to count
no value to be gained in generating mazes over a single clock
damage checks. This is followed by a collision check by
cycle, given the comparably long simulation times that require
testing if the tile that the entity wants to move to is already
hundreds of clock cycles to complete. Additionally, a recursive
filled with an entity or not. If a collision occurs, the entity’s
method is very resource heavy and an FPGA board of any
position is simply left unmodified. A core design principle
variant is heavily limited in space.
within the level component is that the request is handled by
C. Random number generator simply modifying the incoming request signal and sending the
A simple LFSR - Linear Feedback Shift Register was used modified signal back to the entity that made the request. Figure
to create a pseudo-random number generator with minimal ef- 4 illustrates this signal.
fort. The generator starts generating numbers from the moment
an input command by the user is sent and reset every time the Fig. 4. A movement request signal breakdown
input is given again. Given how the FPGA of choice (Altera’s
DE0 Development and Education board) runs at 50 MHz clock
speeds, a human input should already add enough randomness
In the table, going from left to right, we have:
to the seed to allow for fairly random mazes to be generated.
Given the fact that different components of the FPGA board (a) ’7’ bits to store the x axis position of the entity.
can be accessed, some of which can hold highly random data (b) ’7’ bits to store the y axis position of the entity.
information, alternative, more random and automated methods (c) ’3’ bits saving the unique id of the entity.
of generating seeds might be considered in the future. Given (d) ’1’ bit telling if the entity is requesting to move or to turn
that the random number generator is separated from the actual (e) ’2’ bits telling the direction in which the entity would
maze generator, modularity is easy to accomplish. like to move or turn its vision.
Once a request arrives at the level component, the id tag
D. Maze storage is checked to determine if the requesting entity is a player
The generated mazes are stored in a four-layer wide stack or the guard, then it is checked if the entity is requesting
of mazes, each consisting of ’map width’ x ’map width’ size a move or a turn and based on that information the current
array of bits, where each one represents a wall. By saving position of the entity and the requested movement direction
mazes in a stack, rather than generating only a single maze are added together and checked for validity. Once validity has
and uploading that, we are able to instantly swap from an been confirmed, the position coordinates of the entity may or
16
Listing 1. Sub command component
may not be changed. All this information is then sent back
to the entity in an answer signal that is generated using the begin
same format as the request signal. Furthermore, a field view is if(rising edge(clk)) then
generated based on the direction the entity is facing and sent i f s t a t e ( 2 ) = ’ 1 ’ or s t a t e = ” 011 ” t h e n
to the transformation module. case counter i s
The generation itself is achieved by simply multiplying the when ” 0001 ” =>
entity coordinates by a number ranging from ’-4’ to ’4’, t o l e v e l <= sp 0 ; ( 1 )
excluding ’0’, which depends on the direction the entity is when ” 0010 ” =>
facing. This way, every required maze array tile for a field t o l e v e l <= sg 01 ;
view can be obtained with minimal if statements. This does, when ” 0011 ” =>
however, make any further mathematical operations in the t o l e v e l <= sg 02 ;
same clock cycle impossible on the generated map view. t o e n t i t i e s <= f r o m l e v e l ; ( 2
Lastly, a ’3’ bit-wide wall is present at the edges of the ) when ” 0100 ” =>
maze array to eliminate the need to worry about out-of-index t o l e v e l <= sg 03 ;
errors. In other words, rather than creating many positional t o e n t i t i e s <= f r o m l e v e l
checks to make sure that the field view does not look past ; when ” 0101 ” =>
the maze array, the array is simply expanded further and filled t o l e v e l <= sg 04 ;
with wall tiles. In doing so some register space is sacrificed t o e n t i t i e s <= f r o m l e v e l
for an easier algorithm. ; when ” 0110 ” =>
Figure 5 shows an example of what the field view looks t o l e v e l <= sg 05 ;
like. t o e n t i t i e s <= f r o m l e v e l
; when ” 0111 ” =>
Fig. 5. Entity field view visualization t o e n t i t i e s <= f r o m l e v e l
; when ” 1000 ” =>
t o e n t i t i e s <= f r o m l e v e l ; ( 3
) when o t h e r s =>
end c a s e ;
end i f ;
end i f ;
end p r o c e s s main ;
The following is a list explanation of the three noted lines:
Number 1 marks a guard entity facing northwards while
(1) A signal ’sp0’ (from the player entity) is being sent
number 2 - a player entity facing eastwards.
forward to the level component using the ’to level’ signal.
The player entity is given one extra line of vision to give
(2) Later in the turn cycle, the component is sending forward
it an advantage over the guards. The information is sent in
a request and also sending back an answer of a previous
three or four data packets, each holding the tile information
request in parallel. In this case - sending forward ’sg02’
of each specific tile in that line. The maze arrays have ’4’ bits
(second guard entity) and receiving ’sp0’ (the pre-last
assigned to each tile, marking the wall, player entity, guard
forwarded request’s answer)
entity and objective variables.
(3) At the end of the turn cycle, all that is left is for the
F. Sub command unit component to handle the remaining answer signals being
The sub command unit, located between the level and the generated by the level component.
entity components, acts as a router to make sure that the level
An important distinction is that the incoming entity requests
component is getting a single request each clock cycle and
come from specific input signals, while the answer signal to
that the entities are all receiving their answers. This unit is
the entities goes into the same ”to entities” signal, this is
not necessary, given that the entities could just be directly
because only the sub command component is inputting an
connected to the level component, however the separation
answer signal, while all six entities are inputting the request
allows to vastly simplify the level component, which is already
signals. So in the return trip, each entity can simply feed from
much more complex than the rest of the platform and so losing
the same bus and check if the answer travelling matches their
one clock cycle per turn to achieve this result is deemed
identification number, while this is not possible while making
a worthwhile trade. Listing 1 shows the code of the sub
the requests, because the signals would clash. A master-slave
command component.
communication protocol could be used to make the entity
input also come from a single signal. This is not done for
simplicity sake, since the “slave” count is very limited in the
current implementation, leaving the protocol as a potential
17
future modification if deemed necessary.
18
G. Field view transform Fig. 6. Platform simulation conveyer
The transform component is given the information that an
entity is meant to obtain from the level component and then
makes some changes to account for set rules. Specifically, the
current iteration of the implementation has a single bit
dedicated to saying that the tile is a wall. Naturally, an entity
should not be able to see what is behind a wall. Thus, a filter
is ran that removes all information that should not be visible to
the entity, because of walls obstructing vision. This is achieved
by a series of if statements. Each tile in the first few lines of
the cone is checked for holding a wall value and if it does - The way the conveyor works is that by implementing a
each tile behind it is made null, mimicking a lack of vision. counter component that counts up to a specific number of
choice and then resets, we are able to fully control what each
The reasoning behind this component is that, in the level
component is doing at each clock cycle while the simulation is
component, a clock cycle is already used up when making a
running. The conveyor greatly decreases the amount of clock
mathematical transformation on the entity’s facing direction
cycles each turn requires to be cycled by giving each entity a
and coordinates to determine which tiles to grab without
piece of information to work with. For example, in the counter
needing to input too many if statements. As such, the transfor-
= ’4’ cycle, going from up to down, it can be seen that:
mation component is separated from the level component to
make the scrambling transformation independently. This also (a) The sub command and transformation components are
allows to easily modify the component to change how the sending out the 1st guard’s request answer to be picked
transformation is done. up by said guard.
(b) The 2nd guard’s answer is being handled and sent out to
the sub command and transformation components.
H. Entities (c) The 3rd guard’s request is being sent to the level com-
ponent from the sub command.
A few practices present in object-oriented programming are (d) The 4th guard’s request is being sent from the entity to
put into use. Namely, how all the entity components are the sub command component.
similar to objects in how they store functions and all entity Additionally, at the end of the conveyor, the level component
variables and are treated the same by other hardware generates a turn summary and sends that to the player entity.
components within the system. However, unlike in true object- If a component that implements machine learning is added
oriented programming, the object count is fixed and requires to the platform, it would be using up two additional clocks
code readjustments to change, which does cause a scalability (entity sends a request, the component sends back an answer)
issue. Each entity has a separate hardware component in which after each entity has received its field view for that turn.
all information about the entity is stored. The entities are each
J. Discussion
given a unique identification tag at the start of the simulation
by a direct input command. Additionally, each component is The development process of the overviewed implementation
prepared to be connected to a separate component that would has proven to be an extremely educational experience for the
decide which move the entity would make. This is how author with each part of the system acting as a tool to revise
machine learning would be implemented into the platform. and learn new computer science principles. The following is a
By attaching, for example, a neural network to each entity. list of some of the lessons that were learned in the development
After receiving an answer to a turn, each entity would send process:
this answer with additional information from past turns to the (a) State machines.
neural network, upon which, a move can be decided. The (b) Hardware time dependency - at the hardware level, a
answer would then be generated in a single clock cycle and mathematical calculation can only be correctly made if
sent back to the entity to then generate a request to the level it is known that the variables that make up the formula
component. Lastly, the player entity is different from the guard have already been calculated. Given the nature of clock
entities, because at the end of each turn, it also receives a direct cycles, this had to be carefully looked at.
turn summary from the level component. Using this summary, (c) Random number generation - a look at how random
the entity checks if the simulation should be reset because of numbers are generated at the hardware level is required.
the objective being achieved or the player having lost. Different algorithms have to be looked at to decide which
one is most fitting for the job.
(d) Random number generator seed randomness - different
I. The counter and platform parallelism approaches can be considered while designing a hardware
chip and how there exists a possibility to implement a
Figure 6 shows the main conveyor of the simulation. hardware random number generator by using physical
19
processes of the chip, such as thermal noise and clock more cost efficient approach and will be the next focus of the
drifts, which could achieve much greater degrees of research project. The implementation does still show a way
randomness than software implementations. to simply implement an entity simulation within hardware and
(e) Signal handling, component structure and optimization acted as an excellent tool for the author to further his
- the level component requires efficient handling of the education.
movement requests to send out an answer in the same
ACKNOWLEDGEMENT
clock cycle without creating variable dependencies, but
at the same time keeping the implementation of a man- I would like to express my deepest thanks to my mentor
ageable size. For example, the transformation component Kazimieras Bagdonas for not only guiding me through every
was separated from the level component entirely to allow step of the way towards realizing this project, but also helping
for field view generation to be done very efficiently by me find my direction and develop a deep passion for learning.
using a single entity’s coordinates and direction to apply REFERENCES
a mathematical transformation and obtain a field view.
[1] E. E. Almeida, J. E. Luntz, D. M. Tilbury, “Event-condition-action
(f) Routers - a simple router had to be implemented within systems for reconfigurable logic control,” IEEE Transac- tions on
the platform. Signal handling and time planning were Automation Science and Engineering, vol. 4(2), pp.167–181, 2007.
necessary. [2] K. L. Dittrich, S. Gatziu, A. Geppert, “The active database management
system manifesto: A rulebase of adbms features,” In International
(g) An event - condition - action (ECA) system is put into Workshop on Rules in Database Systems, pp. 1–17. Springer, 1995.
practice with the answer signal handling. [3] S. S. Huang, A. Hormati, D. F. Bacon, R. Rabbah, “Liquid metal: Object-
(h) Object oriented programming principles are practiced, oriented programming across the hardware/software boundary,” In
namely by separating entities into separate components European Conference on Object-Oriented Programming, pp. 76–103.
Springer, 2008.
and treating them much like objects while developing the [4] F. Islam, M. Ali, B. Y. Majlis, “FPGA implementation of an LFSR based
platform. The answer and request handling is done by pseudorandom pattern generator for mems testing,” International Journal
state machines that work very much like object methods. of Computer Applications, vol. 75(11), pp. 30-34, 2013.
[5] B. Jun, P. Kocher, “The intel random number generator, “Cryptography
Such an implementation gives an interesting look into the Research Inc. white paper, 1999.
differences between software and hardware design. [6] M. Smiroldo, “Method and apparatus for managing a number of time slots
(i) Artificial intelligence is looked at by implementing a state during which plural bidding devices can request communication access to
a central device,” August 5, 1997. US Patent 5,654,968.
machine to the guard entities for testing purposes.
(j) Hardware parallelism implementation using a conveyor
(k) Counter implementation and usage in hardware
K. Future work
After careful review, it has become apparent that the design
is highly inefficient when fully implemented in hardware. The
current design takes up over 12000 LUTs or 80% of the entire
matrix capacity of the FPGA board used for testing the system.
As such, the entire platform will be reimplemented in software
while creating a real-time interface between a computer and
the FPGA board and implementing only the machine learning
components within the FPGA board. The current plan is to
use a RS-232 communication protocol and send out entity
field data to the FPGA and take back only the answer from
the board. If the RS-232 standard turns out to be too much
of a bottleneck in the system, more sophisticated ways of
interfacing the devices will be looked into.
IV. C ONCLUSION
In this paper, the implementation of an experimental test bed
platform as a personal educational project was covered with all
the informatics principles and mazes that were touched upon
while developing. Over the course of the implementation and
after reviewing the final design, it has become clear that having
the entire design implemented on an FPGA board is highly
impractical, because the greatest bottleneck is in the machine
learning process, not the platform simulation. Limiting the
hardware implementation to only the machine learning part
and moving the rest of the system to software would be a
20