=Paper=
{{Paper
|id=Vol-1856/p03
|storemode=property
|title=Implementation of an FPGA testbed
as an educational experiment for students of
informatics
driven e-health systems
|pdfUrl=https://ceur-ws.org/Vol-1856/p03.pdf
|volume=Vol-1856
|authors=Lukas Stasytis
}}
==Implementation of an FPGA testbed
as an educational experiment for students of
informatics
driven e-health systems==
<pdf width="1500px">https://ceur-ws.org/Vol-1856/p03.pdf</pdf>
<pre>
               Implementation of an FPGA testbed
           as an educational experiment for students of
                           informatics
                                                          Lukas Stasytis
                                                        Faculty of Informatics
                                                  Kaunas Technology University
                                                   Student g. 50, Kaunas 51368
                                                  e-mail: lukas.stasytis@ktu.edu


   Abstract—The following paper describes a hardware imple-                              Fig. 1. Possible maze configurations
mentation of a simulation platform on an FPGA board to provide
a test bed for machine learning experimentation. The educational
value that such a project can have for a student in putting
theoretical programming, digital logic and computer architecture
principles to practice.


                      I. I NTRODUCTION

   The author of this paper, while undertaking the 2nd semester
of a Computer Science program, took up the following project
as a means to put studied theoretical Digital Logic principles
to practical use beyond the typical course lab work and to
better understand how hardware is designed and implemented.
The goal of the project is to implement a two dimensional
simulation platform using the VHDL programming language
on to an FPGA (Field Programming Gate Array) board.
Machine learning implementations, like neural networks, could
then be attached to said implementation to run high speed
experiments. As such, this project is meant to lead up to an                The black squares represent entities and the goal while the
artificial intelligence research project. A key focus of this            grey squares represent walls. A surrounding wall is generated
paper is to show how taking up such a project was extremely              around the structure, for reasons that will be discussed in the
valuable in revising old and learning new computer science               level component implementation overview. The general
principles for the author.                                               process run-down of the implementation is as follows:
                                                                         (a) An external input command is given to start the simula-
                                                                              tion after choosing a specific mode in which it will run.
                                                                         (b) The initial maze is generated by taking into account the
              II. I MPLEMENTATION OVERVIEW                                    chosen mode.
                                                                         (c) Each entity is given a random starting position and initial
   As a whole, the implementation simulates a two dimensional                 maze information to help make the initial decision.
bit maze with ’1’ bit-wide entities that can traverse the maze           (d) The simulation starts and over the course of a series of
taking turns in sequence. The maze itself can have a set                      counter clocks, each entity makes a decision of which
configuration to either simulate a complex maze, a flat field or              direction to either move to or turn to.
a simple checker-board pattern. The entities are divided into            (e) The entity decisions are sent to the component holding
two groups: the player and a list of guards. The goals of these               the maze’s information to check if the decision is legal.
entities can vary, but the primary goal of the player is to reach         (f) The newly generated maze information for the next
a designated tile on the maze while the goal of the guards is                 decision is sent back to each entity.
to catch the player. Figure 1 shows some of the possible maze            (g) A turn summary is generated and sent to the player and,
configurations that the implementation covers.                                if chosen to, system output to store.

                                                                    14
Copyright © 2017 held by the authors
(h) Based on the results of the turn, a new turn begins or the             (f) The counter dictates which entity request will be handled
     simulation is reset.                                                      first, allowing for a parallel and iterative request handling.
  Figure 2 shows the entire structure of the implementation                    This is done by limiting each component’s functions to
with all its components.                                                       specific counter ticks.
                                                                          (g) The level component, after checking each request -
               Fig. 2. The base structure of the platform                      making sure the move is legal and that no collisions
                                                                               are present, sends an answer back to the entities and
                                                                               modifies the maze array to reflect on a passed or rejected
                                                                               movement request. At the same time, a small field view of
                                                                               the maze is generated for the entity to use when making
                                                                               the next move. This information is sent to a transform
                                                                               component.
                                                                          (h) The transform component modifies the maze information
                                                                               that goes to the entities. Specifically, removes entity view
                                                                               vision behind walls.
                                                                           (i) After each entities request has been handled and all the
                                                                               answers have been delivered, a turn summary is generated
                                                                               inside the level component and sent to the player and
                                                                               the system output/storage. Based on this summary, the
                                                                               simulation can continue or be reset due to player loss or
   Going from right to left, the ten components are as follows:                victory.
 (a) Random number generator
 (b) Maze generator
 (c) Maze storage                                                                         III. C OMPONENT OVERVIEW
 (d) Level
 (e) Main command                                                         A. State machines and the main command unit
  (f) Transform
                                                                            At the highest level, the platform acts as a state machine.
 (g) Sub command
                                                                          The main command component, which connects to each other
 (h) Counter
                                                                          component of the platform, has a three bit-wide variable
  (i) Player
                                                                          which determines which state the platform is in. Each other
  (j) Guard (multiple)
                                                                          component has a case statement at the very top, checking
   The following is a short run down of each component and                which state the platform is in to decide on a specific task
how they function in the platform as a whole before going                 that it will be performing. Figure 3 shows a bit dictionary of
deeper into the structure of each component and development               the state signal.
principles:
 (a) The main command component controls all the other
      components in the system. The system inputs go directly                            Fig. 3. Bit dictionary of the state signal
      to this component and based on those, the component
      sends out a new state signal for each other component to
      react to. This state signal defines what each component
      should be doing - being on standby, preparing for a
      simulation or running a simulation.
 (b) Once the simulation request is made, the maze generator
      starts generating the initial maze with the help of a
      random number generation component.
 (c) The newly generated maze is sent to the maze storage
      component and from there, to the level component. The
      maze generator continues to store new maps in the storage              By using such a state configuration, modularity can be
      component afterwards.                                               achieved within the simulation to allow for different types of
 (d) Initial maze information is sent to each entity from the             simulations to be run without needing to recompile and re-
      level component via the sub command component, which                upload the hardware configuration. There exist two more main
      is in charge of routing entity requests and maze controller         state machine implementations within the platform, namely
      (level) answers.                                                    the use of a counter to decide what state the turn is at and a
 (e) With the initial configuration of each entity set, the actual        state machine for the guard artificial intelligence, albeit only
      simulation starts. Each entity sends out a request for a            for testing purposes. Both will be explored in their respective
      new move.                                                           sections.

                                                                     15
B. Maze generator                                                      older maze configuration to a new one once a maze has been
   The current primary maze generation module uses a re-               cleared. Given how a single maze can take a few hundred
cursive backtracking generation algorithm implemented iter-            clock cycles to build and a single turn takes ten cycles, we
atively using a stack of 14 bits the size of the maze path. 7          can see in practice that a map buffer can easily be achieved
for the x and y coordinates of each tile that has already been         and used to not have to wait in between the player beating
traversed. Over a few hundred series of clocks, the component          mazes. The maze stack also allows for pre-generated mazes
generates a full maze of a given size and then sends the maze          to be input into the system from external memory.
one line at a time to the maze storage component. The                  E. Level component
generator itself is fed with a random number each clock cycle
                                                                          The level component has four functions:
from the random number generation component and uses said
number to decide in which direction to continue digging paths.          (a) Storing the current simulation maze array with all neces-
All the variables are stored in registers and the maze’s size                sary information of each tile
goes up to 128*128 tiles wide, but could be adjusted with               (b) Generating the starting positions of each entity and send-
only a few edits in the code.                                                ing them out.
   A list of algorithms were looked at to determine which               (c) Checking for the legality of incoming movement requests
one would best suit a hardware implementation. Some of the                   and making changes to the maze array appropriately. That
algorithms looked at were:                                                   is, making an entity position change reflect in the actual
                                                                             array that holds the maze for collision checking.
 (a) Kruskal’s algorithm - very easy and quick implementa-
                                                                        (d) Generating maze information for each entity when han-
     tion, but produces a lot of dead-ends in the maze, which
                                                                             dling a movement request
     can be very difficult for an AI to overcome.
 (b) Recursive division algorithm - requires very large stacks            The level component is the heart of the platform that
     to implement iteratively, would likely consume too much           practically holds the entire simulation within itself. Entity
     space on an FPGA.                                                 information and stored mazes for future simulation cycles are
                                                                       the only external pieces of information that the level
   The recursive backtracking algorithm was the final choice
                                                                       component does not hold within itself. The component’s basic
because of its simple iterative implementation that requires a
                                                                       structure is based on a series of if statements that determine
single, easy to read stack and highly branching paths, which
                                                                       if the entity making a request is a player or a guard, if it is
could potentially allow for a more fluid machine learning
                                                                       requesting to turn their vision or to make a movement and
process. An iterative approach was picked, because there is
                                                                       if the entity has other entities in its field of view to count
no value to be gained in generating mazes over a single clock
                                                                       damage checks. This is followed by a collision check by
cycle, given the comparably long simulation times that require
                                                                       testing if the tile that the entity wants to move to is already
hundreds of clock cycles to complete. Additionally, a recursive
                                                                       filled with an entity or not. If a collision occurs, the entity’s
method is very resource heavy and an FPGA board of any
                                                                       position is simply left unmodified. A core design principle
variant is heavily limited in space.
                                                                       within the level component is that the request is handled by
C. Random number generator                                             simply modifying the incoming request signal and sending the
   A simple LFSR - Linear Feedback Shift Register was used             modified signal back to the entity that made the request. Figure
to create a pseudo-random number generator with minimal ef-            4 illustrates this signal.
fort. The generator starts generating numbers from the moment
an input command by the user is sent and reset every time the                      Fig. 4. A movement request signal breakdown
input is given again. Given how the FPGA of choice (Altera’s
DE0 Development and Education board) runs at 50 MHz clock
speeds, a human input should already add enough randomness
                                                                          In the table, going from left to right, we have:
to the seed to allow for fairly random mazes to be generated.
Given the fact that different components of the FPGA board              (a) ’7’ bits to store the x axis position of the entity.
can be accessed, some of which can hold highly random data              (b) ’7’ bits to store the y axis position of the entity.
information, alternative, more random and automated methods             (c) ’3’ bits saving the unique id of the entity.
of generating seeds might be considered in the future. Given            (d) ’1’ bit telling if the entity is requesting to move or to turn
that the random number generator is separated from the actual           (e) ’2’ bits telling the direction in which the entity would
maze generator, modularity is easy to accomplish.                            like to move or turn its vision.
                                                                          Once a request arrives at the level component, the id tag
D. Maze storage                                                        is checked to determine if the requesting entity is a player
   The generated mazes are stored in a four-layer wide stack           or the guard, then it is checked if the entity is requesting
of mazes, each consisting of ’map width’ x ’map width’ size            a move or a turn and based on that information the current
array of bits, where each one represents a wall. By saving             position of the entity and the requested movement direction
mazes in a stack, rather than generating only a single maze            are added together and checked for validity. Once validity has
and uploading that, we are able to instantly swap from an              been confirmed, the position coordinates of the entity may or

                                                                  16
                                                                                         Listing 1. Sub command component
may not be changed. All this information is then sent back
to the entity in an answer signal that is generated using the            begin
same format as the request signal. Furthermore, a field view is            if(rising edge(clk)) then
generated based on the direction the entity is facing and sent              i f s t a t e ( 2 ) = ’ 1 ’ or s t a t e = ” 011 ” t h e n
to the transformation module.                                                 case counter i s
   The generation itself is achieved by simply multiplying the                         when ” 0001 ” =>
entity coordinates by a number ranging from ’-4’ to ’4’,                                 t o l e v e l <= sp 0 ; ( 1 )
excluding ’0’, which depends on the direction the entity is                            when ” 0010 ” =>
facing. This way, every required maze array tile for a field                             t o l e v e l <= sg 01 ;
view can be obtained with minimal if statements. This does,                            when ” 0011 ” =>
however, make any further mathematical operations in the                                 t o l e v e l <= sg 02 ;
same clock cycle impossible on the generated map view.                                   t o e n t i t i e s <= f r o m l e v e l ; ( 2
   Lastly, a ’3’ bit-wide wall is present at the edges of the                          ) when ” 0100 ” =>
maze array to eliminate the need to worry about out-of-index                             t o l e v e l <= sg 03 ;
errors. In other words, rather than creating many positional                             t o e n t i t i e s <= f r o m l e v e l
checks to make sure that the field view does not look past                             ; when ” 0101 ” =>
the maze array, the array is simply expanded further and filled                          t o l e v e l <= sg 04 ;
with wall tiles. In doing so some register space is sacrificed                           t o e n t i t i e s <= f r o m l e v e l
for an easier algorithm.                                                               ; when ” 0110 ” =>
   Figure 5 shows an example of what the field view looks                                t o l e v e l <= sg 05 ;
like.                                                                                    t o e n t i t i e s <= f r o m l e v e l
                                                                                       ; when ” 0111 ” =>
                Fig. 5. Entity field view visualization                                  t o e n t i t i e s <= f r o m l e v e l
                                                                                       ; when ” 1000 ” =>
                                                                                         t o e n t i t i e s <= f r o m l e v e l ; ( 3
                                                                                       ) when o t h e r s =>
                                                                              end c a s e ;
                                                                            end i f ;
                                                                          end i f ;
                                                                         end p r o c e s s main ;

                                                                           The following is a list explanation of the three noted lines:
   Number 1 marks a guard entity facing northwards while
                                                                         (1) A signal ’sp0’ (from the player entity) is being sent
number 2 - a player entity facing eastwards.
                                                                             forward to the level component using the ’to level’ signal.
   The player entity is given one extra line of vision to give
                                                                         (2) Later in the turn cycle, the component is sending forward
it an advantage over the guards. The information is sent in
                                                                             a request and also sending back an answer of a previous
three or four data packets, each holding the tile information
                                                                             request in parallel. In this case - sending forward ’sg02’
of each specific tile in that line. The maze arrays have ’4’ bits
                                                                             (second guard entity) and receiving ’sp0’ (the pre-last
assigned to each tile, marking the wall, player entity, guard
                                                                             forwarded request’s answer)
entity and objective variables.
                                                                         (3) At the end of the turn cycle, all that is left is for the
F. Sub command unit                                                          component to handle the remaining answer signals being
   The sub command unit, located between the level and the                   generated by the level component.
entity components, acts as a router to make sure that the level
                                                                            An important distinction is that the incoming entity requests
component is getting a single request each clock cycle and
                                                                         come from specific input signals, while the answer signal to
that the entities are all receiving their answers. This unit is
                                                                         the entities goes into the same ”to entities” signal, this is
not necessary, given that the entities could just be directly
                                                                         because only the sub command component is inputting an
connected to the level component, however the separation
                                                                         answer signal, while all six entities are inputting the request
allows to vastly simplify the level component, which is already
                                                                         signals. So in the return trip, each entity can simply feed from
much more complex than the rest of the platform and so losing
                                                                         the same bus and check if the answer travelling matches their
one clock cycle per turn to achieve this result is deemed
                                                                         identification number, while this is not possible while making
a worthwhile trade. Listing 1 shows the code of the sub
                                                                         the requests, because the signals would clash. A master-slave
command component.
                                                                         communication protocol could be used to make the entity
                                                                         input also come from a single signal. This is not done for
                                                                         simplicity sake, since the “slave” count is very limited in the
                                                                         current implementation, leaving the protocol as a potential
                                                                    17
future modification if deemed necessary.


                                           18
G. Field view transform                                                                     Fig. 6. Platform simulation conveyer

   The transform component is given the information that an
entity is meant to obtain from the level component and then
makes some changes to account for set rules. Specifically, the
current iteration of the implementation has a single bit
dedicated to saying that the tile is a wall. Naturally, an entity
should not be able to see what is behind a wall. Thus, a filter
is ran that removes all information that should not be visible to
the entity, because of walls obstructing vision. This is achieved
by a series of if statements. Each tile in the first few lines of
the cone is checked for holding a wall value and if it does -                  The way the conveyor works is that by implementing a
each tile behind it is made null, mimicking a lack of vision.               counter component that counts up to a specific number of
                                                                            choice and then resets, we are able to fully control what each
   The reasoning behind this component is that, in the level
                                                                            component is doing at each clock cycle while the simulation is
component, a clock cycle is already used up when making a
                                                                            running. The conveyor greatly decreases the amount of clock
mathematical transformation on the entity’s facing direction
                                                                            cycles each turn requires to be cycled by giving each entity a
and coordinates to determine which tiles to grab without
                                                                            piece of information to work with. For example, in the counter
needing to input too many if statements. As such, the transfor-
                                                                            = ’4’ cycle, going from up to down, it can be seen that:
mation component is separated from the level component to
make the scrambling transformation independently. This also                  (a) The sub command and transformation components are
allows to easily modify the component to change how the                           sending out the 1st guard’s request answer to be picked
transformation is done.                                                           up by said guard.
                                                                             (b) The 2nd guard’s answer is being handled and sent out to
                                                                                  the sub command and transformation components.
H. Entities                                                                  (c) The 3rd guard’s request is being sent to the level com-
                                                                                  ponent from the sub command.
   A few practices present in object-oriented programming are                (d) The 4th guard’s request is being sent from the entity to
put into use. Namely, how all the entity components are                           the sub command component.
similar to objects in how they store functions and all entity                  Additionally, at the end of the conveyor, the level component
variables and are treated the same by other hardware                        generates a turn summary and sends that to the player entity.
components within the system. However, unlike in true object-               If a component that implements machine learning is added
oriented programming, the object count is fixed and requires                to the platform, it would be using up two additional clocks
code readjustments to change, which does cause a scalability                (entity sends a request, the component sends back an answer)
issue. Each entity has a separate hardware component in which               after each entity has received its field view for that turn.
all information about the entity is stored. The entities are each
                                                                            J. Discussion
given a unique identification tag at the start of the simulation
by a direct input command. Additionally, each component is                     The development process of the overviewed implementation
prepared to be connected to a separate component that would                 has proven to be an extremely educational experience for the
decide which move the entity would make. This is how                        author with each part of the system acting as a tool to revise
machine learning would be implemented into the platform.                    and learn new computer science principles. The following is a
By attaching, for example, a neural network to each entity.                 list of some of the lessons that were learned in the development
After receiving an answer to a turn, each entity would send                 process:
this answer with additional information from past turns to the               (a) State machines.
neural network, upon which, a move can be decided. The                       (b) Hardware time dependency - at the hardware level, a
answer would then be generated in a single clock cycle and                        mathematical calculation can only be correctly made if
sent back to the entity to then generate a request to the level                   it is known that the variables that make up the formula
component. Lastly, the player entity is different from the guard                  have already been calculated. Given the nature of clock
entities, because at the end of each turn, it also receives a direct              cycles, this had to be carefully looked at.
turn summary from the level component. Using this summary,                   (c) Random number generation - a look at how random
the entity checks if the simulation should be reset because of                    numbers are generated at the hardware level is required.
the objective being achieved or the player having lost.                           Different algorithms have to be looked at to decide which
                                                                                  one is most fitting for the job.
                                                                             (d) Random number generator seed randomness - different
I. The counter and platform parallelism                                           approaches can be considered while designing a hardware
                                                                                  chip and how there exists a possibility to implement a
  Figure 6 shows the main conveyor of the simulation.                             hardware random number generator by using physical

                                                                       19
     processes of the chip, such as thermal noise and clock              more cost efficient approach and will be the next focus of the
     drifts, which could achieve much greater degrees of                 research project. The implementation does still show a way
     randomness than software implementations.                           to simply implement an entity simulation within hardware and
(e) Signal handling, component structure and optimization                acted as an excellent tool for the author to further his
     - the level component requires efficient handling of the            education.
     movement requests to send out an answer in the same
                                                                                                 ACKNOWLEDGEMENT
     clock cycle without creating variable dependencies, but
     at the same time keeping the implementation of a man-                  I would like to express my deepest thanks to my mentor
     ageable size. For example, the transformation component             Kazimieras Bagdonas for not only guiding me through every
     was separated from the level component entirely to allow            step of the way towards realizing this project, but also helping
     for field view generation to be done very efficiently by            me find my direction and develop a deep passion for learning.
     using a single entity’s coordinates and direction to apply                                     REFERENCES
     a mathematical transformation and obtain a field view.
                                                                         [1] E. E. Almeida, J. E. Luntz, D. M. Tilbury, “Event-condition-action
 (f) Routers - a simple router had to be implemented within                   systems for reconfigurable logic control,” IEEE Transac- tions on
     the platform. Signal handling and time planning were                     Automation Science and Engineering, vol. 4(2), pp.167–181, 2007.
     necessary.                                                          [2] K. L. Dittrich, S. Gatziu, A. Geppert, “The active database management
                                                                              system manifesto: A rulebase of adbms features,” In International
(g) An event - condition - action (ECA) system is put into                    Workshop on Rules in Database Systems, pp. 1–17. Springer, 1995.
     practice with the answer signal handling.                           [3] S. S. Huang, A. Hormati, D. F. Bacon, R. Rabbah, “Liquid metal: Object-
(h) Object oriented programming principles are practiced,                     oriented programming across the hardware/software boundary,” In
     namely by separating entities into separate components                   European Conference on Object-Oriented Programming, pp. 76–103.
                                                                              Springer, 2008.
     and treating them much like objects while developing the            [4] F. Islam, M. Ali, B. Y. Majlis, “FPGA implementation of an LFSR based
     platform. The answer and request handling is done by                     pseudorandom pattern generator for mems testing,” International Journal
     state machines that work very much like object methods.                  of Computer Applications, vol. 75(11), pp. 30-34, 2013.
                                                                         [5] B. Jun, P. Kocher, “The intel random number generator, “Cryptography
     Such an implementation gives an interesting look into the                Research Inc. white paper, 1999.
     differences between software and hardware design.                   [6] M. Smiroldo, “Method and apparatus for managing a number of time slots
 (i) Artificial intelligence is looked at by implementing a state             during which plural bidding devices can request communication access to
                                                                              a central device,” August 5, 1997. US Patent 5,654,968.
     machine to the guard entities for testing purposes.
 (j) Hardware parallelism implementation using a conveyor
(k) Counter implementation and usage in hardware

K. Future work
   After careful review, it has become apparent that the design
is highly inefficient when fully implemented in hardware. The
current design takes up over 12000 LUTs or 80% of the entire
matrix capacity of the FPGA board used for testing the system.
As such, the entire platform will be reimplemented in software
while creating a real-time interface between a computer and
the FPGA board and implementing only the machine learning
components within the FPGA board. The current plan is to
use a RS-232 communication protocol and send out entity
field data to the FPGA and take back only the answer from
the board. If the RS-232 standard turns out to be too much
of a bottleneck in the system, more sophisticated ways of
interfacing the devices will be looked into.
                      IV. C ONCLUSION
   In this paper, the implementation of an experimental test bed
platform as a personal educational project was covered with all
the informatics principles and mazes that were touched upon
while developing. Over the course of the implementation and
after reviewing the final design, it has become clear that having
the entire design implemented on an FPGA board is highly
impractical, because the greatest bottleneck is in the machine
learning process, not the platform simulation. Limiting the
hardware implementation to only the machine learning part
and moving the rest of the system to software would be a

                                                                    20

</pre>