An Approach for Model Based Testing of Augmented
Reality Applications
Porfirio Tramontana, Marco De Luca and Anna Rita Fasolino
University of Naples “Federico II”, Napoli, Italy


                                      Abstract
                                      The popularity of Augmented Reality (AR) applications has grown strongly with the worldwide
                                      success of the Pokemon Go videogame released by Niantic in 2016. However, AR offers tangible benefits
                                      in many further areas beyond entertainment, such as advertisement, education, navigation, maintenance,
                                      health, and so on. With the growing spread and success of AR applications in these fields, there has also
                                      been a growing necessity for approaches and technologies for assuring the quality of these applications,
                                      such as testing. A few technologies and frameworks have been recently proposed supporting the
                                      implementation and execution of test scripts that can be used to exercise the applications, but there still
                                      is a lack of effective techniques and tools for the automatic generation of executable test cases. In this
                                      paper, we investigate the possibility of using Model Based Testing techniques to generate executable
                                      test scripts from Finite State Machines modeling the behaviour of the GUI of AR applications, similarly
                                      to other GUI based applications. We have applied several model coverage criteria to design test suites
                                      and we have shown the feasibility of this approach by testing two small example applications involving
                                      Unity3D and Vuforia technologies.

                                      Keywords
                                      Augmented Reality, Model Based Testing, Finite State Machines




1. Introduction
Extended reality (XR) is an umbrella term to describe different kinds of technologies that are
able to merge the physical and virtual worlds. More in detail, it is possible to distinguish
between:

               • Virtual reality (immersive or non-immersive VR), where the application simulates a
                 completely different environment around the user;
               • Augmented reality (AR), where the experience enhances the real world with digital details
                 such as images, text, and animation;
               • Mixed Reality (MR), where the application combines its own digital environment with
                 the user’s real-world environment and allows them to interact with each other.

  In particular, AR is a way to provide users with a sensorial experience beyond reality.
Unlike Non-Immersive VR, which is usually implemented in the context of console
or desktop interactive applications, and Immersive VR, which needs special glasses or visors,

Joint Proceedings of RCIS 2022 Workshops and Research Projects Track, May 17-20, 2022, Barcelona, Spain
ptramont@unina.it (P. Tramontana); marco.deluca2@unina.it (M. De Luca); fasolino@unina.it (A. R. Fasolino)
                                    © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Augmented Reality is now usually deployed on smartphones in the form of Android, iOS or cross-
platform apps. In the following, we will focus on AR applications.
   Several factors have recently fueled the research and development of AR: the emergence
of dedicated AR devices and powerful development kits (such as Unity3D and UnReal), the
improvements in the performance of mobile devices and sensor integration, and advances
in computer vision (CV) technologies. The popularity of AR applications has grown strongly
with the worldwide success of the Pokemon Go AR videogame released by Niantic in
20161 . However, AR offers tangible benefits in many further areas beyond entertainment, such
as advertisement, education, navigation, maintenance, health, and so on. With the growing
spread and success of AR applications in these fields, there has also been a growing necessity
for approaches and technologies for assuring the quality of these applications, such as testing.
   Testing of AR applications can be carried out at different levels. Unit testing can be carried
out to test source code methods and involves general techniques and tools of the XUnit family.
Unit testing alone, however, does not suffice to reveal faults that are exposed only by testing
the application at the system level, through sequences of user and system events. AR applications
are event-based systems: event-based testing techniques can thus be used to test them like any
GUI-based or event-based system. A few technologies and frameworks have been recently
proposed supporting the implementation and execution of test scripts that can be used to
exercise the applications at system level. Although approaches for the functional testing of VR
applications have been recently proposed [1, 2], according to [3], more effective methods and
tools supporting the systematic design and execution of AR application testing are still needed.
   In this paper, we investigate the possibility of using Model Based approaches for testing the
behaviour of AR applications. These approaches are gaining popularity in the literature and have
recently been used successfully to teach testing activities to students [4]. We propose to use
Finite State Machine models to represent the behaviour of the GUI of AR applications, similarly
to other GUI based applications. We apply model coverage criteria to design test suites that
can be implemented as automatically executable test scripts. We demonstrate the feasibility of
this approach and report some encouraging results we obtained by testing two small example
applications involving Unity3D and Vuforia technologies.
   The paper is structured as follows. Sections 2 and 3 provide, respectively, an introduction to AR
applications and a discussion of the literature on their testing. Section 4 presents
the proposed Model Based approach for the generation of test suites, while Section 5 shows the
feasibility of the approach and the effectiveness of the test suites on two example applications.
Finally, Section 6 discusses conclusions and future works.


2. Background
AR applications are composed of a client side responsible for the rendering of a 3D environment
mixing real camera images and virtual widgets that can be statically designed or generated on
the fly when specific marker images are recognized. The behaviour of the AR application in
response to user and system events is defined by using general purpose programming languages
(such as C#).
   1
       https://pokemongolive.com/
   An Execution Engine is required for rendering and running the application in the context of
a device. The two most popular engines for the development of 3D games and applications are
Unity3D2 and UnReal3 . The image detection capability can be provided by components such as
the Vuforia Engine4 , whose Tracking service lets the client query a remote image recognition
service to detect when a specific Marker image is framed by the camera.
   In this paper we have focused our attention on Augmented Reality applications developed
with Unity and Vuforia. Each Unity project is composed of Scenes. A scene can be conceptually
represented by an instance of a GUI, that is rendered on the user device (e.g. on a smartphone)
and can be three-dimensionally navigated, exploiting the touch events and the inputs from
sensors, including motion sensors and the camera. Each scene is composed of objects, which
are specialized into GameObjects and Components and can be interconnected with each other.
GameObjects represent graphic items that can move around the scene and with which the
user can interact. Components represent parts of the GameObjects and they can be associated
with code that describes the dynamic behavior of the components and of the GameObjects. In
particular, a component may implement listeners to Events. Events include user events, system
events, and the recognition and loss of Marker images. Figure 1 shows a class diagram
depicting a metamodel of the main elements of a Unity AR application.




Figure 1: Metamodel of the elements of a Unity AR application
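
As a rough illustration only (our own sketch, not part of the Unity or Vuforia APIs), the main elements of this metamodel could be rendered as Python dataclasses; the class and field names below are assumptions chosen to mirror the figure.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Event:
    # A user event, a system event, or the recognition/loss of a Marker image.
    name: str
    kind: str  # e.g. "user", "system", "marker_found", "marker_lost"

@dataclass
class Component:
    # Part of a GameObject; may implement listeners reacting to Events.
    name: str
    listened_events: List[Event] = field(default_factory=list)

@dataclass
class GameObject:
    # A graphic item of the scene with which the user can interact.
    name: str
    components: List[Component] = field(default_factory=list)

@dataclass
class Scene:
    # A GUI instance rendered on the user device, composed of GameObjects.
    name: str
    game_objects: List[GameObject] = field(default_factory=list)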




3. Related Work
In recent years, several works investigating the issues affecting extended reality applications and
possible approaches for detecting them have been proposed in the literature.
   Lehman et al. [5] stress that AR apps are different from conventional apps in that
the augmented images and labels are generated and positioned based on the user’s behavior
and environment. As a consequence, they identify four categories of common failures in AR
applications that are difficult to detect using conventional software engineering testing. The
failures they focus on consist of object classification failures (due to the impossibility of training a
classifier for every variation of inputs that it may receive, or of controlling exactly the movement
   2
     https://unity.com/
   3
     https://www.unrealengine.com/en-US/
   4
     https://library.vuforia.com/
and behavior of the user), placement failures, resource limitation failures, and style failures.
These failures need to be detected in the wild and, to this aim, the authors proposed the ARCHIE
framework. ARCHIE collects user feedback and system state data in situ to help developers
identify and debug issues important to testers.
   In 2020, Li et al. [6] focused on bugs in XR applications deployed on the Web that exploit
the WebXR Device API. This technology enables users to interact with browsers using XR
devices. However, many WebXR applications are insufficiently tested before being released
and they suffer from various bugs that can degrade user experience or cause undesirable
consequences. To better understand the nature of bugs in WebXR applications, the authors
performed an empirical study where they collected 368 real bugs from 33 WebXR projects hosted
on GitHub. Via a seven-round manual analysis of these bugs, they built a taxonomy of WebXR
bugs according to their symptoms and root causes. They found three main types of issues: (1)
functional issues, (2) crashing issues, and (3) performance issues. Functional issues were further
classified into Application-Specific Functional Issues (consisting of unexpected behaviors often
caused by improper lifecycle event handling and erroneous design of interactive logic) and
Rendering issues (misrendered or missing objects). Crashing issues consisted
of runtime exceptions or immediate application crashes. Performance issues in WebXR projects
have various symptoms including high memory consumption, high CPU utilization, abnormal
hanging of applications, and low frame rate or resolution. They observed six major root causes
of WebXR bugs, including: (1) incompatible runtime environment, (2) event handling mistakes,
(3) improper handling of diversified user interaction mechanisms, (4) wrong arguments, (5)
buggy dependencies, and (6) redundant operations. A further study investigated quality issues
(bad smells) in Unity projects [7]. The authors proposed UnityLinter, a static analysis tool that
supports Unity video game developers in detecting seven types of bad smells.
   All the considered works show that several issues may affect the quality of XR applications.
Defining approaches and technologies for testing these applications and detecting such issues
is absolutely necessary for the XR developer community.
   As to the technologies supporting AR application testing, a few frameworks and libraries have
recently emerged and are currently available to the tester community. The AirTest framework5
allows testers to implement test cases replicating sequences of interactions with an AR application.
The airtest.core.api library can trigger different types of events on Unity3D applications,
including user events (e.g. clicks on buttons), system events (e.g. application opening and
closing) and Vuforia-related events (e.g. marker identification and disappearance). In addition,
the poco library6 included in AirTest provides methods useful to implement locators that return
references to widgets and other objects present on the GUI of the AR application under test.
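
For instance, a single test step could combine airtest.core.api calls for system events with a poco locator for a user event; the sketch below is only illustrative, and the package and widget names are hypothetical placeholders rather than those of a real application.

from airtest.core.api import auto_setup, start_app, stop_app
from poco.drivers.unity3d import UnityPoco

auto_setup(__file__)                    # connect to the default attached device
start_app("com.example.arapp")          # system event: open the app (hypothetical package name)
poco = UnityPoco()                      # attach to the Unity3D application under test
poco("StartButton").click()             # user event: click a widget located by its name
assert poco("SettingsButton").exists()  # locator-based check on the widgets currently shown
stop_app("com.example.arapp")           # system event: close the app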
   Another solution is offered by AltUnity Tester 7 , a free tool for testing applications built with
Unity. AltUnity Tester allows tests to be written in C#, Python and Java. It consists of
AltUnity Server, which exposes the objects of the GUI hierarchy by opening a TCP socket on the
device running the application and waiting for connections, and AltUnity Client, which connects
to the server and lets test code access and interact with those objects.

    5
      https://airtest.netease.com/
    6
      https://github.com/AirtestProject/Poco-SDK
    7
      https://altom.com/testing-tools/altunitytester/
   These technologies doubtless provide support for implementing and executing test cases that
can be used to exercise the applications at both unit and system level. According to [3], more
effective methods and tools supporting the design and execution of XR application testing are
needed.


4. Model Based Testing of AR applications
Model-based testing (MBT) relies on models of a system under test and/or its environment to
derive test cases for the system. It encompasses the processes and techniques for the automatic
derivation of abstract test cases from abstract models, the generation of concrete tests from
abstract tests, and the manual or automated execution of the resulting concrete test cases [8].
   An MBT process relies on some fundamental activities: (1) Modeling of the system under test,
(2) Definition of Test Selection Criteria and Test case design, (3) Implementation and execution
of the test cases in the context of the system under test. In this paper we have addressed these
activities in the context of AR applications, with a specific focus on the automation of the last
one.

4.1. Modeling the GUI of an Augmented Reality application
The behaviour of the front end of AR applications can be modeled by Finite State Machines
(FSMs), as is done for other event-based GUIs [9, 10, 11]. In this case States correspond to instances
of Scene objects with a specific set of widgets, while Transitions correspond to changes between
scenes showing different widgets. Transitions are activated by Event triggers when a possible
Guard condition is true. The Guard condition may also depend on data variables locally defined
in the Component code.
   A possible way of obtaining such an FSM model is to reverse engineer it through static
and dynamic analysis of the application. Static analysis should take into account both the
application structure (e.g. Scenes, GameObjects, Components) and the source code of the scripts
(e.g. variables, event listeners and guard conditions), in order to identify the states of the app.
Dynamic analysis can be exploited to explore the behaviour of the application at runtime and
to infer the state transitions.
   For example, Figure 2 shows the FSM modeling the behaviour of an example AR application.
This application is composed of a single Scene. When the application is started, a Language
menu is shown on the device screen. When the user specifies the preferred language, the
application goes into a Marker Waiting state where the device camera output is shown and the
Vuforia listener observes when a marker image is framed by the camera. When a marker is
recognized, the corresponding animation is shown on the device screen (AR Animation state).
When the marker disappears from the camera image, the application returns to the Marker Waiting
state. From both the Marker Waiting and the AR Animation states it is possible to return to the
Language Menu state by means of a settings button. From the Language
Menu state it is possible to quit the application. Figure 3 shows, from left to right, a screenshot
of the application GUI in the Language Menu state, the Marker that has to be recognized, and a
screenshot of the animation shown by the application when it is in the AR Animation state.
Figure 2: An Example of a FSM modeling the behaviour of an AR application




Figure 3: Details from the example AR application: the Language Menu state, the marker to be identified,
and the AR Animation state
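
For test generation purposes, the FSM of Figure 2 can be encoded as a simple data structure. The sketch below is our reading of the figure (the state and event identifiers are our own) and is expressed in Python, the language used for the test scripts in Section 4.3.

# States and transitions of the example AR application (Figure 2), encoded as
# (source state, triggering event, target state) triples.
STATES = ["LanguageMenu", "MarkerWaiting", "ARAnimation", "Quit"]
INITIAL_STATE = "LanguageMenu"

TRANSITIONS = [
    ("LanguageMenu",  "select_language", "MarkerWaiting"),  # the user chooses the preferred language
    ("MarkerWaiting", "marker_found",    "ARAnimation"),    # Vuforia recognizes the marker image
    ("ARAnimation",   "marker_lost",     "MarkerWaiting"),  # the marker disappears from the camera
    ("MarkerWaiting", "settings_button", "LanguageMenu"),   # back to the language menu
    ("ARAnimation",   "settings_button", "LanguageMenu"),   # back to the language menu
    ("LanguageMenu",  "quit",            "Quit"),           # quit the application
]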


4.2. Definition of Test Selection Criteria and Test Case Design
Testing strategies guided by specific FSM model coverage objectives can be used to define the
test suites. In particular, we have considered the following coverage objectives:

    • All States Coverage, in which each state of the FSM model is reached by at least one test
      case;
    • All Transitions Coverage, in which each transition of the FSM model is traversed by at
      least one test case;
    • All Prime Paths Coverage [12], in which each prime path of the FSM is covered by at
      least one test case.

  For each of the considered testing strategies, different test suites built to satisfy the model
coverage objectives can be defined.
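
As an illustration of how such a criterion can drive test design, the sketch below derives one abstract test path per FSM transition through a breadth-first search from the initial state. It is a deliberately simplified generator written under the (source, event, target) encoding introduced above, not the exact procedure used in this work.

from collections import deque

def shortest_path_to_edge(transitions, initial_state, target_edge):
    # Breadth-first search for the shortest event sequence from the initial
    # state that ends by traversing target_edge.
    queue = deque([(initial_state, [])])
    visited = {initial_state}
    while queue:
        state, path = queue.popleft()
        for edge in transitions:
            src, _event, dst = edge
            if src != state:
                continue
            if edge == target_edge:
                return path + [edge]
            if dst not in visited:
                visited.add(dst)
                queue.append((dst, path + [edge]))
    return None  # the transition is unreachable from the initial state

def all_transitions_test_paths(transitions, initial_state):
    # One abstract test case (sequence of transitions) per uncovered transition.
    paths, covered = [], set()
    for edge in transitions:
        if edge in covered:
            continue
        path = shortest_path_to_edge(transitions, initial_state, edge)
        if path is not None:
            paths.append(path)
            covered.update(path)  # a single path may cover several transitions
    return paths

# Example with a fragment of the FSM of Figure 2.
fsm = [
    ("LanguageMenu",  "select_language", "MarkerWaiting"),
    ("MarkerWaiting", "marker_found",    "ARAnimation"),
    ("ARAnimation",   "marker_lost",     "MarkerWaiting"),
    ("ARAnimation",   "settings_button", "LanguageMenu"),
]
for path in all_transitions_test_paths(fsm, "LanguageMenu"):
    print(" ; ".join(event for _src, event, _dst in path))

Each abstract path produced in this way can then be mapped onto a concrete, executable test script, as described in the next subsection.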

4.3. Test Case Implementation and Execution
In this step, the designed test cases have to be implemented using the features of a test automation
framework or library, in order to make them automatically executable. To this aim, some features
offered by the AirTest library can be exploited.
        In general, the implemented test cases will exploit (1) functions for pre-condition setting (the
     setup function), (2) functions for state identification, (3) a sequence of operations triggering the
     actions constituting the test case, (4) assertions to check the app behaviour, and (5) a function for
     post-condition tear down (the tearDown function).
        Listing 1 shows an excerpt of a test script written in Python and using the AirTest library.
     The test script code uses the AirTest poco object to obtain references to menus, buttons and
     other widgets on the scenes. The verify methods are used to recognize the occurrence of the
     FSM states on the basis of the values of the state variables (in this example we used the language
     variable), the widgets shown on the GUI (e.g. the button), and the identification of the image
     marker. Assertions have been inserted into the test script to evaluate the occurrence of the
     expected sequence of states. The final set of statements represents the sequence of events
     constituting the test case. In this example, after the application is started, the LanguageMenu
     state should be recognized. On this GUI the start button is clicked. When the marker reported
     in Figure 3 is recognized by Vuforia, the ARAnimation state should be reached and the test ends.

                                                  Listing 1: Excerpt of a test script code
from airtest.core.api import *
from poco.drivers.unity3d import UnityPoco
import os

poco = UnityPoco()
language = "English"
startButton = None
button = None

def setup():
    # pre-condition setting: launch the application under test
    os.system(...)

def tearDown():
    # post-condition tear down: close the application under test
    os.system(...)

def verifyLanguageMenuState():
    # state identification: the Language Menu state is recognized when the
    # language selection widgets and the start button are shown on the GUI
    global startButton
    menu = poco("LanguageSelectionGroup").children()
    startButton = poco(type='Button')
    identified = False
    if len(menu) > 0 and len(startButton) == 1 and startButton.attr('name') == "StartButton":
        identified = True
    assert_equal(identified, True)

def verifyARAnimationState():
    # state identification: the AR Animation state is recognized when the
    # animation scene, the settings button and the expected marker are shown
    global button
    scene = poco("LiftAnimation").children()
    button = poco(type='Button')
    identified = False
    if len(scene) > 1 and len(button) == 1 and button.attr('name') == "Settings":
        marker = poco("LIFT - " + language)
        identified = marker.exists()
    assert_equal(identified, True)

setup()
verifyLanguageMenuState()
startButton.click()
verifyARAnimationState()
tearDown()




5. Examples
   We have carried out two testing activities on two example AR applications with the goal of showing
the feasibility of the proposed Model Based Testing technique on AR applications and evaluating its
effectiveness.
        The considered applications were two small open source AR apps, both implemented with
Table 1
Size metrics of the two AUTs
                    Code Metrics                              FSM Metrics
         #Classes   #Methods   #LOCs   #Branches      #States   #Transitions
   A1        3          8       156        13             4          15
   A2        3          7       153        14             3           9

Table 2
Number of generated test cases and coverage metrics for the three generated test suites and the two
AUTs
             #Test Cases         State Coverage       Transition Coverage       Branch Coverage
          TS1   TS2   TS3       TS1   TS2   TS3       TS1    TS2     TS3        TS1     TS2     TS3
   A1      2     6    12        4/4   4/4   4/4       4/15   15/15   15/15      9/13    12/13   12/13
   A2      3     6     7        3/3   3/3   3/3       2/9    9/9     9/9        10/14   13/14   13/14


Unity3D and Vuforia. PointAR (A1)8 is a concept app showcasing the usage of Augmented Reality
to assist the foreign workforce with the induction process through the use of 3D animation for
visualisation and built-in translations. SafariAnimals (A2)9 is a simple AR educational game using
Unity and Vuforia where 3D renderings of animals appear and disappear when specific marker
images are observed.
   Table 1 reports some metrics about the applications under test (AUTs) and the reverse
engineered FSMs describing their behaviour. The table reports on the left part some source code
metrics (number of classes, number of methods, number of LOCs and number of branches of
the scripting source code), and on the right part some FSM metrics (the number of states and the
number of transitions). For each of the two AUTs three different test suites have been generated
from the FSM models, according to the three different coverage criteria. The test suites named
TS1 have been generated having the objective to cover all the FSM states, whereas the test suites
TS2 have been implemented to cover all the FSM transitions. Finally, the test suites named TS3
have been written to achieve the coverage of all the prime paths on the FSM model. In order to
evaluate the effectiveness of the Model based test suites, the coverage of states, transitions and
prime paths has been measured, together with the source code branch coverage. To obtain
these measures we manually inserted probes in the application source code (at each method
declaration and each control structure branch) and in the source code of the test cases (at the
state and transition identification statements).
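A minimal sketch of how such probe output could be post-processed into the coverage ratios reported below, assuming each probe simply logs a unique identifier when it is reached (the identifiers and values here are made up for illustration):

def coverage(logged_probe_ids, all_probe_ids):
    # Fraction of probes (branches, states or transitions) hit at least once.
    covered = set(logged_probe_ids) & set(all_probe_ids)
    return len(covered), len(all_probe_ids)

# Hypothetical probe identifiers for the 13 branches of A1 and the identifiers
# logged while running one of its test suites.
all_branches = [f"branch_{i}" for i in range(1, 14)]
logged = ["branch_1", "branch_2", "branch_5", "branch_5", "branch_9"]
hit, total = coverage(logged, all_branches)
print(f"Branch coverage: {hit}/{total}")   # prints "Branch coverage: 4/13"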
   Table 2 reports the number of test cases composing each test suite and the measured coverage
values for each considered AUT. The obtained results show that the test suites designed to cover
transitions or prime paths achieve better coverage than the ones aiming at covering model
states. In fact, the TS1 suites cover only a minority of the transitions and do not cover several code
branches that are instead reached by the other two test suites. On the other hand, the TS2 and
TS3 test suites also show some coverage gaps (one uncovered branch for each AUT).
   In order to understand the causes of these coverage gaps, we analyzed the branches that have
not been covered by the test suites. In AUT1 there is a branch that is not covered by any test
   8
       https://github.com/abdullahibneat/PointAR
   9
       https://github.com/abdullahibneat/SafariAnimalsAR
suite: it corresponds to the code that is activated when the default language (English) is selected
after having previously selected another language. This branch could be covered by test cases
executing longer loops between the same states, including a language change and a return to
the initial language. It is thus unsurprising that the branch has not been covered by test cases
aiming at avoiding loop repetitions. In addition, TS1 does not execute most of the branches
related to language changes since they are not necessary to discover new states of the FSM.
   In AUT2 there are selection buttons to change the animal shown on the screen while remaining
in the Recognized Marker state. The interactions with these buttons have not been triggered
by the TS1 test suite as they do not cause transitions toward new states. There is also a branch
that has not been covered by any test suite. It is activated by the condition in which an animal
with an incorrect index is selected. This condition is not feasible with the current version of the
application, thus it can be classified as dead code. For this reason, we can conclude that both
TS2 and TS3 have been able to cover all the feasible branches of the source code.
   In conclusion, we have observed how the strategies aiming at the coverage of transitions and
prime paths have been able to provide a complete coverage of states and transitions and an
almost complete coverage of the branches of the code. Although the example applications are
tiny and simple, the obtained results are promising and future work is necessary to generalize
them with respect to larger and more complex AR applications. The set of tools, the implemented
test cases and the output of their execution are available on a Github repository10 .


6. Conclusions
In this paper we have investigated the possibility of applying Model Based Testing techniques
to AR applications, exploiting their similarity with other types of GUIs to which MBT has been
successfully applied in the past. We have modeled the behaviour of the GUI of AR applications
with Finite State Machines that can be manually reverse engineered on the basis of the analyses
of the structure of the client side of the application, of its source code (including the code of
the listeners of user and system events, such as the ones related to the identification and loss
of markers, that are typical of AR applications) and of the observed behaviour. The obtained
FSM models have been exploited to design test suites aiming at covering states, transitions and
prime paths. These test suites have been implemented in the form of automatically executable test
scripts exploiting the features offered by AirTest. We have demonstrated the feasibility of this
approach and reported some encouraging results obtained on two small example applications involving Unity3D
and Vuforia technologies.
   This paper represents preliminary work, for which we plan to carry out several activities in
the future in order to extend its applicability to larger applications. We have recognized the
need to implement reverse engineering techniques and tools helping the modeling process and
supporting the automatic generation of model based test scripts. In particular, more general
solutions to the problem of state identification will be studied, together with the extension of the
support to user and system events, exploiting the features offered by more recent emulators.



   10
        https://github.com/PorfirioTramontana/MBT-AR-applications
Acknowledgments
This work was partially funded by the grant FFABR of the Italian Ministry for University and
Research (MIUR) and by the grant REYNA of the University of Naples Federico II. The authors
would also like to thank S. D. Bevilacqua for the implementation of the test suites in the context of his
Laurea Degree Thesis activity.


References
 [1] A. C. Corrêa Souza, F. L. S. Nunes, M. E. Delamaro, An automated functional testing
     approach for virtual reality applications, Software Testing, Verification and Reliability 28
     (2018). URL: https://onlinelibrary.wiley.com/doi/abs/10.1002/stvr.1690.
 [2] S. A. Andrade, F. L. S. Nunes, M. E. Delamaro, Towards the systematic testing of virtual
     reality programs, in: SVR, 2019, pp. 196–205. doi:10.1109/SVR.2019.00044 .
 [3] R. Prada, I. S. W. B. Prasetya, F. M. Kifetew, F. Dignum, T. E. J. Vos, J. Lander, J. Donnart,
     A. Kazmierowski, J. Davidson, P. M. Fernandes, Agent-based testing of extended reality
     systems, in: ICST, 2020, pp. 414–417. doi:10.1109/ICST46399.2020.00051 .
 [4] B. Marín, S. Alarcón, G. Giachetti, M. Snoeck, Tescav: An approach for learning model-
     based testing and coverage in practice, Lecture Notes in Business Information Processing
     385 LNBIP (2020) 302–317. doi:10.1007/978-3-030-50316-1_18.
 [5] S. M. Lehman, H. Ling, C. C. Tan, Archie: A user-focused framework for testing augmented
     reality applications in the wild, in: VR, 2020, pp. 903–912. doi:10.1109/VR46266.2020.00013.
 [6] S. Li, Y. Wu, Y. Liu, D. Wang, M. Wen, Y. Tao, Y. Sui, Y. Liu, An exploratory study of bugs
     in extended reality applications on the web, in: ISSRE, 2020, pp. 172–183. doi:10.1109/ISSRE5003.2020.00025.
 [7] A. Borrelli, V. Nardone, G. A. D. Lucca, G. Canfora, M. D. Penta, Detecting video game-
     specific bad smells in unity projects, in: MSR, 2020, pp. 198–208. doi:10.1145/3379597.3387454.
 [8] M. Utting, A. Pretschner, B. Legeard, A taxonomy of model-based testing approaches,
     Software Testing, Verification and Reliability 22 (2012) 297–312. doi:10.1002/stvr.456.
 [9] D. Amalfitano, A. R. Fasolino, P. Tramontana, Reverse engineering finite state machines
     from rich internet applications, in: WCRE, 2008, pp. 69–73. doi:10.1109/WCRE.2008.17 .
[10] M. Sama, S. Elbaum, F. Raimondi, D. S. Rosenblum, Z. Wang, Context-aware adaptive
     applications: Fault patterns and their automated identification, IEEE Transactions on
     Software Engineering 36 (2010) 644–661. doi:10.1109/TSE.2010.35 .
[11] D. Amalfitano, A. R. Fasolino, P. Tramontana, B. D. Ta, A. M. Memon, Mobigui-
     tar: Automated model-based testing of mobile apps, IEEE Software 32 (2015) 53–59.
     doi:10.1109/MS.2014.55 .
[12] N. Li, U. Praphamontripong, J. Offutt, An experimental comparison of four unit test criteria:
     Mutation, edge-pair, all-uses and prime path coverage, in: ICST Workshops, 2009, pp.
     220–229. doi:10.1109/ICSTW.2009.30 .