=Paper= {{Paper |id=Vol-2581/ase2020paper3 |storemode=property |title=FPGA-Based Debugging with Dynamic SignalSelection at Run-Time |pdfUrl=https://ceur-ws.org/Vol-2581/ase2020paper3.pdf |volume=Vol-2581 |authors=Gernot Fiala,Tobias Scheipel,Werner Neuwirth,Marcel Baunach |dblpUrl=https://dblp.org/rec/conf/se/FialaSNB20 }} ==FPGA-Based Debugging with Dynamic SignalSelection at Run-Time== https://ceur-ws.org/Vol-2581/ase2020paper3.pdf
       FPGA-Based Debugging with Dynamic Signal
                Selection at Run-Time
          Gernot Fiala                         Tobias Scheipel                    Werner Neuwirth                     Marcel Baunach
  Graz University of Technology Graz University of Technology      AVL List GmbH      Graz University of Technology
          Graz, Austria                 Graz, Austria              Graz, Austria             Graz, Austria
  gernot.fiala@student.tugraz.at  tobias.scheipel@tugraz.at   Werner.Neuwirth@avl.com      baunach@tugraz.at



   Abstract—For the development of FPGA-based automotive systems, debugging of internal signals is necessary to detect errors or to analyze/visualize the operation of the field programmable gate array (FPGA) at runtime. Often, so-called "debug cores" of the FPGA vendor are used for debugging. Xilinx Vivado is a development environment offering an integrated logic analyzer for statically selected signals. However, each time these input signals are to be changed, the whole workflow (synthesis, placement, routing and generation of the bit stream) must be repeated, which is very time consuming.
   The scope of the present work is to develop a custom and more flexible FPGA-based logic debugger: The Advanced Inverter Debugger (AID) is a logic component, integrated into the system under development, that can dynamically select signals for the debugging process at run-time. The debugging process is controlled by a user interface at a workstation, communicating via UDP/IP over Ethernet. The AID is configurable with regard to start/stop triggers and sample rate for each signal, and allows long-term recording as well as visualization at the workstation. For convenient use in the development of automotive control systems, the AID is available as a Matlab component for integration into and synthesis with the target system.
   Index Terms—automotive, debugging, FPGA

                       I. INTRODUCTION

   In the automotive industry, electric engines are becoming more and more important. For testing synchronous and asynchronous engines, complex test benches are used, which are able to set up different test conditions. These test benches use FPGA-based inverters and controllers, which provide several functionalities, e.g., pulse width modulation (PWM), phase-locked loop (PLL), voltage and current control to power the different synchronous and asynchronous engines. During the development process, these inverters and controllers must continuously be checked for correct operation, which is done by debugging internal signals of the FPGA design.
   For debugging, standard debug cores of the FPGA vendor like the integrated logic analyzer (ILA) [1] provided by Xilinx Vivado [2] are commonly used. The ILA core uses statically selected input signals and settings for the debugging process. If the input signals or the debugging settings need to be changed, the internal structure of the ILA core is updated in the FPGA design. Therefore, the whole workflow (synthesis, placement, routing and generation of the bit stream) must be repeated, which is very time consuming (especially for large FPGA designs) and requires stopping and restarting the system. To be more flexible and to avoid interrupting running tests, improved debug cores are required.
   The present work introduces a custom FPGA-based logic debugger, the Advanced Inverter Debugger (AID). The AID can dynamically select signals for the debugging process at run-time. The debugging process is controlled by a user interface at a workstation. The debugging parameters are sent from the workstation to the debug core on the FPGA. The communication is done with UDP/IP (user datagram protocol/internet protocol). The debug core decodes the command information from the user interface and autonomously starts the debugging process with the given configuration. The sampled signal data is then sent from the FPGA to the workstation and monitored with the user interface. Optionally, the signal data can be logged in comma-separated values (csv) files for long-term observation and delayed analysis.
   This paper is organized as follows: Section II shows related work on debug cores. Section III gives an overview of different concepts to implement such debug cores. Section IV explains the structure and operation of the AID debug core. Section V shows the resource usage of the debug core on the FPGA. Section VI shows the test hardware on which the debug core was tested. Section VII shows the user interface, which controls the debug core and tests. Finally, Section VIII concludes the paper.

                       II. RELATED WORK

   Xilinx Vivado provides the ILA core, which allows developers to put an integrated logic analyzer into their FPGA designs. An ILA can monitor signals during the execution of the system at a predefined sampling rate if the signals meet predefined trigger conditions. The logic overhead varies depending on the selected number of samples and the defined input signals. The samples are stored on the FPGA and sent via a JTAG interface to the workstation for monitoring with Xilinx Vivado.
   Debugging and validation of logic in FPGA designs have also been considered by various researchers. In [3], a method of run-time debugging and monitoring of FPGAs is shown. An embedded microprocessor is used to monitor internal signals



      Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
of the FPGA design. The connection between the signals and the microprocessor works via the on-chip memory (OCM) bus and the processor's local bus (PLB) through the shared memory.
   Another method uses a scan-chain based approach [4]. A watch-point capability is provided by inserting a scan-chain into the FPGA design, which is configured as a shift register. With this method it is possible to change the watch-point conditions at run-time without a recompilation of the FPGA design. However, the monitored signals cannot be changed at run-time.

       III. COMMUNICATION CONCEPTS FOR THE DEBUG CORE

   The debug core was initially designed to debug internal signals of inverters and voltage and current controllers for automotive test benches. To debug these internal signals and to visualize the inverter and controller behavior, the AID provides an interface for up to 300 possible input signals. Out of these, 4 can be selected dynamically for the debugging process at run-time. The 300 input signals were chosen to have a large selection of the internal signals of the controller. Due to the high number of signals, big multiplexers are required. To lower the resource usage on the FPGA, only 4 signals can be selected concurrently for the debugging process. The debug core sends the sampled signal data continuously to the user interface, depending on the sampling frequency. Once started, this allows infinite debugging processes.
   For the inclusion of the debug core into the FPGA design, 3 different methods are discussed in this section.

A. Communication with UDP/IP and the ARM Processor
   With this approach, the communication between the debug core and the user interface at the workstation is done via UDP/IP and the processing system (PS) [5] of an ARM processor, shown in Fig. 1. The Ethernet interface of the ARM processor is used for the communication with the user interface. A UDP echo server is running as a standalone application on the ARM processor, which is responsible for receiving and sending the UDP packages. This application uses the Lightweight Internet Protocol (lwIP) library, which is a network stack for embedded systems.
   To connect the programmable logic (PL) with the PS of the ARM processor, the Advanced eXtensible Interface (AXI) [6], a general purpose input output (GPIO) [7] port, an interrupt system (enabled interrupt controller), and the direct memory access (DMA) are used. The processor RAM is used for transferring data between the PS and the PL. The AXI GPIO port enables the read operation from the RAM on the PL side. The AXI-DataMover [8] reads the debugging parameters from the RAM and routes them to the DebugCore block.
   The connection from the PL to the PS is done with interrupts and DMA. The sampled signal data is written via AXI-DataMover into the RAM and the interrupt is set. The PS processes the corresponding interrupt and reads the signal data from the RAM, builds the UDP package and sends it to the workstation with the user interface.
   The lwIP library also allows using the transmission control protocol (TCP) for the connection between the ARM processor and the workstation. However, due to lower communication overhead with UDP/IP, smaller data packages can be sent faster from the ARM processor to the workstation. Therefore, UDP/IP was selected.

Fig. 1. Communication between the debug core and the user interface with UDP/IP and the ARM processor.

B. Communication with UDP/IP and the AXI-Ethernet IP core
   This approach uses the AXI-Ethernet [9] IP core for the communication between the debug core and the user interface, shown in Fig. 2. The debugging parameters are adjusted with the user interface and sent with UDP/IP to the media access control (MAC) interface. The AXI-Ethernet IP core from Xilinx is used to receive the data and route it via an AXI-Stream (AXIS) [6] bus to the DebugCore block. The sampled signal data can be stored in registers to collect enough samples for a UDP package. Then the collected signal samples are sent via the AXIS bus to the AXI-Ethernet block, which converts the AXIS data into data for the Ethernet transceiver and builds and sends the UDP package to the workstation. The user interface processes and monitors the signal data.
   With this approach no PS or ARM processor is required. There is also no need for the DMA to access the RAM because the signal samples can be collected for a UDP package in registers on the FPGA. This might slightly increase the resource usage on the FPGA, which is acceptable.

Fig. 2. Communication between the debug core and the user interface with UDP/IP and the AXI-Ethernet IP core.
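
   Both UDP/IP concepts above collect several samples before one UDP package is built. The following Python sketch illustrates that collection step on the workstation side of the argument only; it is not the authors' implementation. The record layout (one 32-bit sample counter followed by four 32-bit signal values, little-endian) and the function names are assumptions made for illustration, since the byte-level format is not specified here.

```python
import struct

SIGNALS_PER_SAMPLE = 4  # the AID selects 4 signals concurrently (from the paper)

# Hypothetical record layout (an assumption, not taken from the paper):
# one unsigned 32-bit sample counter followed by four unsigned 32-bit
# signal values, little-endian.
RECORD_FORMAT = "<I4I"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 20 bytes per sample


def pack_samples(first_sample_number, samples):
    """Concatenate consecutive sample records into one UDP payload."""
    payload = bytearray()
    for offset, values in enumerate(samples):
        if len(values) != SIGNALS_PER_SAMPLE:
            raise ValueError("each sample needs exactly 4 signal values")
        payload += struct.pack(RECORD_FORMAT, first_sample_number + offset, *values)
    return bytes(payload)


def unpack_samples(payload):
    """Split a received payload back into (sample_number, values) tuples."""
    if len(payload) % RECORD_SIZE != 0:
        raise ValueError("payload length is not a multiple of the record size")
    return [(rec[0], rec[1:]) for rec in struct.iter_unpack(RECORD_FORMAT, payload)]
```

   The bytes object returned by pack_samples could be handed to socket.sendto on the sending side, and unpack_samples applied to the datagram received at the workstation. Collecting more samples per payload reduces the number of datagrams at the cost of more buffer space, which mirrors the register versus RAM trade-off discussed in this section.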
C. Communication with TCP/IP and the AXI-Ethernet IP core
   This approach also uses the AXI-Ethernet IP core for the communication between the debug core and the user interface, shown in Fig. 3. The debugging parameters are sent with TCP/IP from the user interface to the MAC interface. The AXI-Ethernet IP core receives the data and routes it via the AXIS bus to the DebugCore block. TCP/IP allows bigger data packages, so more signal data can be sent to the user interface. Therefore, sampled signal data can be stored in the RAM for the TCP package payload. To write the signal data into the RAM, the AXI-DataMover is used. When enough samples are collected, the data is read from the RAM and transferred to the AXI-Ethernet IP core. The Ethernet transceiver sends the TCP package to the user interface at the workstation.
   With this approach, no PS or ARM processor is required. The RAM can be accessed with the AXI-DataMover and DMA to collect the signal samples for the TCP package. This requires a unit which controls the write and read operations via the AXI-DataMover. This concept is more complex compared to the others, because memory access and the inclusion of the AXI-Ethernet IP core are required.

Fig. 3. Communication between the debug core and the user interface with TCP/IP and the AXI-Ethernet IP core.

   To use the AXI-Ethernet IP core, a licence for the Xilinx Tri-Mode Ethernet Media Access Control (TEMAC) [10] is necessary.

D. Decision for the Implementation
   The idea was to send the sampled signal data as fast as possible to the workstation to minimize the used memory resources on the FPGA. With UDP/IP, packages can be sent faster compared to TCP/IP, and potential package loss with UDP was not an issue during our evaluations and in-field tests. Both concepts with the AXI-Ethernet IP core are processor independent, which is a big advantage for the integration of the AID into different FPGA designs. Unfortunately, because the TEMAC licence was not available during development, the AXI-Ethernet IP core could not be used and neither concept based on it could be implemented. To evaluate the functionality of the AID and to analyze the communication with UDP/IP, the decision was made to implement the first concept, "Communication with UDP/IP and the ARM processor".

        IV. DESIGN AND IMPLEMENTATION OF THE DEBUG CORE

A. Structure of the Debug Core
   The debug core was developed in VHDL (very high speed integrated circuit hardware description language). It is built with different units shown in Fig. 4. The MM2S-Datamover interface is the AXIS interface for the Memory Mapped to Stream (MM2S) transfer. The interface is used when the AXI-DataMover reads the debug parameters from the RAM and transfers them to the debug core. With the unpackage (UNPKG) Module, the AXIS data is decoded into the different parameters to set up the debugging process. The decoded signals are routed to the Debug Core Module, which is the heart of the debug core. It is responsible for the trigger, signal selection and signal sampling. The Debug Core Module unit was developed with Matlab [11] SIMULINK [12]. The VHDL code was generated from the Matlab model and included in the Vivado project. The sampled signal values are transferred to the package (PKG) Samples unit, which builds the AXIS data stream. The AXIS data stream is transferred via the Stream to Memory Mapped (S2MM) Datamover interface to the AXI-DataMover, which writes the data into the RAM. The Datamove control (CTL) unit controls the AXI-DataMover operations. It sets an interrupt when the debug parameters are successfully read from the RAM and transferred to the debug core. It also signals the PS with an interrupt when signal data can be read from the RAM to build the UDP package and send it to the workstation.

Fig. 4. Debug core structure on the FPGA

B. Implementation of the debug core
   The debug core has up to 300 possible input signals, each 32 bits wide. They are combined to a signal interface and 4 of them can be selected for the debugging process.
   The signal selection is part of the Debug Core Module block. To select the 4 signals, 4 big multiplexers are used. In Matlab SIMULINK, a 300:1 multiplexer can be built very easily, but when the VHDL code is generated and synthesized, the multiplexers are built with the available F7 and F8 multiplexers of the FPGA. Since these multiplexers are commonly needed for the main functions of the controllers and inverters, our
300:1 multiplexers are designed to use Look-up Tables (LUT). This was done by using bitwise logical disjunction.
   To select the signals for the debugging process, the control information from the user interface is decoded and routed as selection signals into the signal selection blocks. Depending on the value, one of the 300 input signals is selected as output signal. The structure of the signal selection is shown in Fig. 5. The signal selection can be done at run-time.

Fig. 5. Structure of the signal selection

   The selected signals can be sampled with the adjusted sample frequency. The operation frequency is 1 MHz for the debug core, but all modules using the AXIS interface operate at a frequency of 100 MHz. The maximum sample frequency is limited to 1 MHz and the lowest sample frequency is 1 kHz. Overall, there are 25 different sampling frequencies available for the debugging process, which can be selected in the user interface. An internal counter controls the sample points depending on the adjusted sample frequency. The functionality is comparable to a frequency divider.
   The number of samples defines how long the debugging process is active. The selectable values are between 1024 and 999424 samples. The step size is 1024. An internal counter increases with each sample and when the adjusted number of samples is reached, an internal reset occurs, which stops the debugging process and resets all of the debug core sub modules. This operation mode is called normal mode. There is a second operation mode, the infinity mode, in which the sampling process is active until the stop command information is sent from the user interface to the FPGA.
   To influence the start condition, a trigger can be enabled. There is a pre- and a post-trigger available. The post-trigger activates the sampling process immediately when the trigger condition is met and the sampled signal data is sent to the user interface. The pre-trigger continuously saves 100 samples per signal into a ring buffer. Once the trigger condition is met, the signal data is sent from the ring buffer to the user interface. This also allows analysing the signals before the actual trigger event.
   To access the Block RAMs (BRAM) of the ring buffer, counters are used as write and read addresses. When the start debugging command has been received, the write address increases with each sample point. If no trigger event happens and 100 signal values are written into the BRAMs, the write address overflows and starts at 0 to overwrite the old signal values. The read address starts to increase and follows the write address with a constant gap. The read address also increases when the trigger condition is met. Then, the values of the ring buffer are used to build the UDP packages. This gives information about the signal behaviour before the trigger event occurs.
   Trigger conditions can be set for the post- and pre-trigger. The available trigger conditions are above, lower or equal to the trigger value, shown in Table I. The trigger signal is always the first selected signal. When no trigger is active, the sampling process starts directly after sending the start debugging command information from the user interface to the FPGA.

                            TABLE I
                TRIGGER TYPES OF THE DEBUG CORE

   Trigger type    Trigger condition
   above value     signal value is above trigger value
   lower value     signal value is lower than trigger value
   equal to        signal value is equal to trigger value

C. Communication between the Programmable Logic and the Processing System of the ARM processor
   The communication from the PS to the debug core on the FPGA is done with AXI-GPIO ports, shown in Fig. 4. They support a configurable I/O channel width of up to 32 bits. These AXI-GPIO ports can be addressed with driver files from the PS. AXI-GPIO ports are used to initialize the read operation from the RAM with the AXI-DataMover. AXI-GPIO ports are also used to configure the transfers between the AXI-DataMover and the RAM.
   The communication from the debug core to the PS is done with interrupts. The first interrupt signals the PS that the debug parameters were successfully read from the RAM and the AXI-GPIO port for initializing the read operation can be reset.
   The second interrupt signals the PS that the sampled signal values were successfully written into the RAM and can be read by the PS to build the UDP package and send it to the workstation. To prevent simultaneous memory access, 2 memory addresses are used alternately to write the sampled signal data into the RAM. The DataMoveCTL block controls the alternating address change. Multiple samples can be collected for the UDP transfer before the interrupt is set. The adjustments for this can be made in the IP settings of the DataMoveCTL block or with the PS and AXI-GPIO ports. Currently 32 samples are used to build the UDP package. One sample contains the package type, sample number (timestamp) and the 4 signal values.

D. Communication between the user interface and the ARM processor
   The communication between the user interface and the ARM processor is done with UDP/IP. Different package types are defined to distinguish between control information,
version number and signal data. The package type defines which information is transmitted with the UDP package. The different package types are shown in Table II. A standalone application with a UDP echo server is running on the ARM processor. It receives and sends the UDP packages.
   If a UDP package is received and the command data are written into the RAM, the AXI GPIO port to enable the read operation is set. The corresponding interrupt is processed by the interrupt system and the program waits for the interrupts to read the signal data from the RAM to build the UDP package and send it to the workstation.
   The version number request is directly processed by the PS, which sends the version number back to the user interface.

                            TABLE II
    DEFINITION OF THE PACKAGE TYPES FOR THE UDP CONNECTION

   Package type    Description
   0               command information to start the debugging process
   1               signal data
   3               command information to reset the debugging process
   5               request for the version number
   6               version number acknowledgement

        V. RESOURCE USAGE OF THE DEBUG CORE ON THE FPGA

   The signal selection logic is the biggest part of the debug core. The multiplexers with 300 input signals, each 32 bits, need a significant amount of resources on the FPGA in order to allow selection flexibility. To lower the resource usage, a debug core with 40 possible input signals was created. A comparison between the AID300 and AID40 is shown in Table III. The FPGA on the ZedBoard [13] was used to get an overview of the used resources. The biggest differences between the AID40 and AID300 are visible at the "Slice LUT", "Slice" and "LUT as Logic" counters. These differences are caused by the reduction of the input signals. However, no F7 and F8 multiplexers are required, because the signal selection logic was developed to avoid them.

                            TABLE III
     RESOURCE USAGE OF THE AID40 AND AID300 ON THE ZEDBOARD

   Resource               ZedBoard (available)    AID40    AID300
   Slice LUT              53200                   2055     5616
   Slice Register         106400                  1420     2500
   Slice                  13300                   755      1730
   LUT as Logic           53200                   2055     5616
   LUT Flip Flop Pairs    53200                   257      257
   Block RAM Tile         140                     2        2
   DSP                    220                     1        1

                       VI. TEST-HARDWARE

   The debug core was tested with the ZedBoard [13]. The ZedBoard is a development board for the Xilinx Zynq-7000 System on Chip (SoC) [14]. It contains a dual-core ARM Cortex-A9 processor and a Z-7020 FPGA [15]. Several interfaces, e.g., Universal Asynchronous Receiver Transmitter (UART), Universal Serial Bus (USB), JTAG, High Definition Multimedia Interface (HDMI), Video Graphics Array (VGA), Audio I/O and Ethernet, are supported and can be used for different kinds of applications. It also includes Double Data Rate Random-Access Memory (DDR3-RAM), an interface for Secure Digital (SD) memory cards, Light-Emitting Diodes (LEDs), switches and I/O interfaces. With the processing system the different components can be activated and the programmable logic of the FPGA can be configured.

                VII. USER INTERFACE AND TESTS

   The user interface controls the debug core on the FPGA. All adjustments for the UDP connection and the debugging process can be made here. The UDP connection is set up with the IP address, incoming and outgoing port numbers.
   To get the names of the signals of the FPGA design that are connected to the debug core, a signal configuration file generator was implemented with C#. This file generator extracts the input signal names with the interface name of the debug core of the FPGA design and maps them to the input port numbers to generate a csv configuration file. This configuration file can be loaded into the user interface to display the signal names, which can be selected for the debugging process.
   The user interface was programmed with C# and tested with the ZedBoard. Fig. 6 shows the user interface with the different settings for the debugging process. The debugging process was started with 8192 samples and a sample frequency of 1 MHz. The post-trigger is selected as trigger condition and data logging is enabled, which generates a csv file and saves the signal data with the debugging settings.

Fig. 6. Testing the debug core with the ZedBoard and the signal generator

   After every 999424 received samples, a new file is automatically created. This is important when the debugging process is running in infinity mode. The csv files with the
logged signal data can also be loaded and monitored with the
user interface for delayed analysis.
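
   The file rollover described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the C# user interface of the AID: the class name, the file naming scheme and the csv column names are hypothetical, and only the threshold of 999424 samples per file is taken from the text.

```python
import csv
import tempfile
from pathlib import Path

SAMPLES_PER_FILE = 999_424  # rollover threshold stated in the paper


class RollingCsvLogger:
    """Write samples to csv and start a new file after a fixed sample count."""

    def __init__(self, directory, prefix="aid_log", samples_per_file=SAMPLES_PER_FILE):
        self.directory = Path(directory)
        self.prefix = prefix  # hypothetical file name prefix
        self.samples_per_file = samples_per_file
        self.file_index = 0
        self.samples_in_file = 0
        self._handle = None
        self._writer = None

    def _open_next_file(self):
        self.close()
        path = self.directory / f"{self.prefix}_{self.file_index:04d}.csv"
        self._handle = path.open("w", newline="")
        self._writer = csv.writer(self._handle)
        # Hypothetical header; the paper does not specify the column names.
        self._writer.writerow(
            ["sample_number", "signal_1", "signal_2", "signal_3", "signal_4"]
        )
        self.file_index += 1
        self.samples_in_file = 0

    def log(self, sample_number, values):
        """Append one sample, opening a new file when the threshold is reached."""
        if self._writer is None or self.samples_in_file >= self.samples_per_file:
            self._open_next_file()
        self._writer.writerow([sample_number, *values])
        self.samples_in_file += 1

    def close(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None
            self._writer = None


def demo_rollover(total_samples=7, samples_per_file=3):
    """Log a few samples with a small threshold and count data rows per file."""
    with tempfile.TemporaryDirectory() as tmp:
        logger = RollingCsvLogger(tmp, samples_per_file=samples_per_file)
        for n in range(total_samples):
            logger.log(n, (n, 2 * n, 3 * n, 4 * n))
        logger.close()
        counts = []
        for path in sorted(Path(tmp).glob("*.csv")):
            with path.open(newline="") as handle:
                counts.append(sum(1 for _ in csv.reader(handle)) - 1)  # minus header
        return counts
```

   In infinity mode the sample stream has no defined end, so bounding each file in this way keeps individual logs at a size that can still be reloaded for delayed analysis.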

   To test the debug core, a signal generator is used to simulate
a complex FPGA-based automotive system. It generates 300
test signals, which are used as input signals for the debug
core. The first 20 test signals are simple counters, which start
at different values and increase with different frequencies. The
other test signals are constants to test the signal selection
of the debug core. The test setup is a hardware-in-the-loop
test in which the FPGA design runs on the FPGA of the
ZedBoard. The debugging process is started with the
debugging settings configured in the user interface. A receiving
thread and a processing thread are used to receive the UDP packets
and to process, log and monitor the signal data. This works well
for low sample frequencies and for high sample frequencies with
a low number of samples, as shown in Fig. 7, where the sample
frequency is set to 1 MHz and the number of samples is set to
60416. However, with UDP/IP, small packets can be sent
very fast from the FPGA to the workstation, and when a large
number of samples is combined with a high sample frequency
(1 MHz or 500 kHz), sample loss occurs. During the tests, the
incoming UDP packets were analyzed with Wireshark [16].
Each UDP packet arrived successfully at the workstation, but
the receiving thread is sometimes blocked by other threads
and cannot process the incoming UDP packets fast enough,
so samples are lost. This is shown in Fig. 8 for a
sample frequency of 1 MHz and 100352 samples.
UDP packets can also arrive in the wrong order because UDP
does not support packet sequencing. Therefore, the packets
are numbered by the AID upon transmission.

Fig. 7. Testing the debug core with the ZedBoard with 1 MHz sampling rate
and 60416 samples

Fig. 8. Testing the debug core with the ZedBoard with 1 MHz sampling rate
and 100352 samples

                        VIII. CONCLUSION

   This paper presents a custom FPGA-based debug core, the
Advanced Inverter Debugger (AID), which was initially developed
to debug FPGA-based inverters and controllers. The AID
is able to dynamically select signals for debugging FPGA-based
embedded automotive systems at run-time. In order to
be flexible with the signal selection, the possible number of
input signals has to be large. However, the multiplexers for the
signal selection require more resources on the FPGA, which
might not be available.
   The AID is controlled by a user interface at a workstation.
The communication is done with UDP/IP and the processing
system (PS) of the ARM processor. To transfer the data between
the programmable logic (PL) and the PS, the processor
RAM is used.
   The limitations of this approach are the receiver at the
workstation and the PS of the ARM processor. With UDP/IP,
packets can be sent very fast, but the user interface at the
workstation has problems processing the data at high sample
frequencies. The receiving thread of the user interface is
sometimes blocked, which leads to sample loss. There are also
limitations when the PS is used for other operations besides the
AID: some interrupts of the debug core might no longer be
processed, UDP packets are not sent, and samples get lost.
   In the future, the TEMAC license can be acquired to use
the AXI-Ethernet IP core. With this IP core, the communication
can be done without the PS while the flexibility of the AID
remains the same. Also, the communication can be changed
to TCP/IP. With TCP/IP, bigger packets can be sent and the
receiver has more time to process the received data.

                        ACKNOWLEDGMENT

   This research was supported by AVL List GmbH, Graz,
Austria.

                        REFERENCES

 [1] Xilinx, “Integrated logic analyzer v6.1,” Xilinx, Apr. 2016.
     [Online]. Available:
     https://www.xilinx.com/support/documentation/ip_documentation/ila/v6_1/pg172-ila.pdf
 [2] ——, “Xilinx vivado,” Xilinx. [Online]. Available:
     https://www.xilinx.com/products/designtools/vivado.html
 [3] A. Penttinen, R. Jastrzebski, R. Pöllänen, and O. Pyrhönen, “Run-time
     debugging and monitoring of fpga circuits using embedded
     microprocessor,” in 2006 IEEE Design and Diagnostics of Electronic
     Circuits and systems. Prague, Czech Republic: IEEE, Apr. 2006,
     1-4244-0185-2.
 [4] A. Tiwari and K. A. Tomko, “Scan-chain based watch-points for
     efficient run-time debugging and verification of fpga designs,” in 2003
     Proceedings of the ASP-DAC Asia and South Pacific Design Automation
     Conference. Kitakyushu, Japan: IEEE, Jan. 2003, 0-7803-7659-5.
 [5] Xilinx, “Processing system 7 v5.5,” Xilinx, May 2017. [Online].
     Available:
     https://www.xilinx.com/support/documentation/ip_documentation/processing_system7/v5_5/pg082-processing-system7.pdf
 [6] ——, “Axi reference guide,” Xilinx, July 2017. [Online]. Available:
     https://www.xilinx.com/support/documentation/ip_documentation/axi_ref_guide/latest/ug1037-vivado-axi-reference-guide.pdf
 [7] ——, “Axi gpio v2.0,” Xilinx, Oct. 2016. [Online]. Available:
     https://www.xilinx.com/support/documentation/ip_documentation/axi_gpio/v2_0/pg144-axi-gpio.pdf
 [8] ——, “Axi datamover v5.1,” Xilinx, Apr. 2017. [Online]. Available:
     https://www.xilinx.com/support/documentation/ip_documentation/axi_datamover/v5_1/pg022_axi_datamover.pdf
 [9] ——, “Axi 1g/2.5g ethernet subsystem v7.0,” Xilinx, Apr. 2017.
     [Online]. Available:
     https://www.xilinx.com/support/documentation/ip_documentation/axi_ethernet/v7_0/pg138-axi-ethernet.pdf
[10] ——, “Tri-mode ethernet mac v9.0,” Xilinx, Apr. 2018. [Online].
     Available:
     https://www.xilinx.com/support/documentation/ip_documentation/tri_mode_ethernet_mac/v9_0/pg051-tri-mode-eth-mac.pdf
[11] MathWorks, “Matlab.” [Online]. Available:
     https://de.mathworks.com/products/matlab.html
[12] ——, “Simulink.” [Online]. Available:
     https://de.mathworks.com/products/simulink.html
[13] AVNET, “Zedboard.” [Online]. Available:
     http://zedboard.org/product/zedboard
[14] Xilinx, “Zynq-7000 soc data sheet: Overview,” Xilinx, July 2018.
     [Online]. Available:
     https://www.xilinx.com/support/documentation/data_sheets/ds190-Zynq-7000-Overview.pdf
[15] ——, “Zynq-7000 soc z-7020 data sheet,” Xilinx, July 2018. [Online].
     Available:
     https://www.xilinx.com/support/documentation/data_sheets/ds187-XC7Z010-XC7Z020-Data-Sheet.pdf
[16] Wireshark-Community, “Wireshark.” [Online]. Available:
     https://www.wireshark.org/