S. Götz, L. Linsbauer, I. Schaefer, A. Wortmann (Hrsg.): Software Engineering 2021 Satellite Events,
                     Lecture Notes in Informatics (LNI), Gesellschaft für Informatik, Bonn 2021 1


A Modular Architecture for Personalized Learning Content
in Anti-Phishing Learning Games

Rene Roepke,1 Vincent Drury,2 Ulrik Schroeder,1 Ulrike Meyer2


Abstract: While game-based anti-phishing education has received considerable attention in recent years, it
can attest to only minor successes. Since developed games usually contain manually created and
curated content, problems can occur when learners are faced with content they cannot easily relate to.
This may hamper the motivation of learners and thus influence the learning experience negatively.
Existing work proposes the personalization of learning games to address these problems, but does not
go beyond a conceptual contribution. This paper provides an implementation of a personalization
pipeline for two learning game prototypes and presents the modular, component-based architecture.

Keywords: Personalization; Game-based Learning; Content generation; Learner modeling


1    Introduction

As game-based learning emerges as a scalable, motivational and effective approach in
security education for end-users, different anti-phishing learning games have been developed
and reviewed by various researchers [DS16, HASB16, TMJ17, Ro20a]. While these games
aim to raise users’ awareness and educate them on the recognition of malicious URLs
or emails, the game content is often manually created with a game’s target group in mind.
Although developers can make an effort in creating suitable learning game content, the
heterogeneity of learners makes it hard to fulfill the requirements of different individuals. In
the case of anti-phishing education, where learners are presented with URLs or emails of particular
services, a lack of personalized learning content leads to various potential shortcomings.
Unknown services can hamper raising awareness since more cognitive processes are needed
for learners to transfer learned knowledge to their daily activities. In addition, for unknown
services, a learner might be unable to decide whether a given URL is benign or not, due
to a lack of reference. Both might have a negative impact on motivation and the learning
experience.
A solution to the aforementioned shortcomings is the personalization of learning game
content towards a learner’s individual characteristics. By adapting the presented services,
0 This research was supported by the research training group “Human Centered Systems Security” sponsored by
  the state of North Rhine-Westphalia.
1 RWTH Aachen University, Learning Technologies Research Group, Ahornstr. 15, 52074 Aachen, Germany

  [roepke,schroeder]@cs.rwth-aachen.de
2 RWTH Aachen University, Research Group IT-Security, Mies-van-der-Rohe Str. 15, 52074 Aachen, Germany

  [drury,meyer]@itsec.rwth-aachen.de


cb                                  Copyright © 2021 for this paper by its authors.
               Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

URLs or emails to those the learners know, e.g. by considering their browser history, installed
applications and more, the learning game could be tailored to individual learners [Ro20b].
The contribution of this paper lies in the implementation of a modular, component-based
architecture for personalized learning games for anti-phishing education.


2   Related Work
In recent years, different contributions have been made to the fields of personalized
learning and game-based learning. Concerning the intersection of these fields, Streicher and
Smeddinck [SS16] provide an overview on the most prominent terms and concepts. They
explicitly distinguish between personalization, customization and adaptivity, since they are
often used interchangeably. Personalization describes “the act of changing a system
to the needs of a specific individual user” [SS16], whereas customization targets the needs of a
user group. Adaptivity describes solely the ability of a system to change over time and can
consequently be used to achieve personalization or customization. Based on this distinction,
we were only able to find examples of personalized learning games that are based on
adaptivity rather than personalization of content [KEJ13, LKR08, Ki06].
Bakkes et al. [BTP12] present an extensive literature-based overview on personalized
gaming. They emphasize the importance of learner modelling, as it is a requirement for
personalized game experiences, and describe different types of adaptation within games, e.g.
difficulty scaling and game mechanics adaptation. While it is more common for gameplay
or difficulty to be adapted based on specific learner characteristics (e.g. learning styles
[KEJ13]), the personalization of content is less well studied. In a previous publication, we
presented the idea of a personalization pipeline for game-based anti-phishing education
[Ro20b]. As it was only conceptual work, no implementation was presented.
To supplement these findings, we broadened our search to content personalization in other
learning environments and, in particular, implementations of content personalization. Here,
Bezza et al. [BBM13] provide an overview on different methods for content personalization
in e-learning systems. They distinguish between inductive (i.e. without user intervention)
and deductive (i.e. with user intervention) user modeling. After presenting their classification
scheme for content personalization methods, they review existing contributions.

Regarding the implementation of personalized learning systems, Ismail et al. [IB18] provide
an abstract, reusable software architecture identifying four main components: (1) learner
unit, (2) knowledge unit, (3) personalization unit and (4) presentation unit. These components
are utilized with regard to modelling the user and personalizing access to learning resources.
Ismail et al. [IB18] provide abstract, conceptual descriptions of each component and note
that implementations may differ based on the implementation context. Lastly, they map
exemplary personalized learning systems to their proposed architecture.
Although existing work in the fields of personalized learning and game-based learning does
not provide explicit examples for content personalization in learning games but rather for
adaptive, personalized gameplay (e.g. [Ki06, LKR08]), suitable approaches can be drawn
from related fields. Our work builds on the idea of a personalization pipeline [Ro20b] and
the reusable architecture for personalized learning systems by Ismail et al. [IB18].


3   Concept of the Personalization Pipeline
In previous work [Ro20b], we presented the idea of a personalization pipeline for anti-
phishing learning games. The pipeline consists of three parts processing data about the
learner in order to provide personalized learning game content. The three parts are: (1)
data collection, (2) content generation and (3) content delivery. The pipeline is intended to
precede a game and provide adaptations to gameplay or the configuration of a game. Each
part of the personalization pipeline provides input for the next part.

For data collection, two different approaches can be followed: manual or automated. Both
approaches provide a learner model as their output to the content delivery module. For
anti-phishing learning games, the learner model consists of knowledge about used services
and visited websites.

To generate suitable game content, the content generation module offers different content
generators, e.g. a URL generator. These content generators are queried by the level generator
and provide content ready to be embedded into a level definition. Depending on the type of
game, additional content generators can be implemented, e.g. an email generator for games
about email phishing.
For content delivery, the generated level definitions serve as input to a level controller. The
level controller provides an interface to the game, generating levels depending on the current
game state, i.e. different levels based on the learner’s progress in the game. The interface
between level controller and game is the only connection of the personalization pipeline
with the game, making each component of the personalization pipeline modular and easy to
replace with different implementations.

Depending on the type of game, the implementation of the pipeline can vary, and thus, we
want to solidify the idea of a personalization pipeline and provide an implementation and
architecture with an interface to two game prototypes.


4   Implementation
Our implementation of the personalization pipeline follows a modular, component-based
design where each stage of the pipeline is represented by one module consisting of multiple
components. Following the single-responsibility principle, each component serves a specific
task or purpose. Data flow is managed with simple interfaces between components. Figure
1 depicts a component diagram of the architecture as well as data flow between different
components or modules.
[Figure 1: component diagram. Labels: Data collection (Automated Data Collector, Selection Interface), Content generation (URL Generator, Email Generator, ...), Content Delivery (Level Generator, Level Controller), Learner model, Level, Assets, MTLG Framework, Game.]

Fig. 1: Component diagram of the proposed architecture. Modules of the pipeline have different colors
and include multiple components. Arrows represent data flow.


We plan to evaluate the pipeline making use of two existing game prototypes for anti-phishing
education called “All sorts of Phish” and “A Phisher’s Bag of Tricks”. Both game prototypes
address the structure of URLs as well as common manipulations for phishing purposes.
While learners have to recognize the type of URL and sort them accordingly in the first
game, the second game (see Figure 2) requires learners to create malicious URLs themselves
by applying different manipulation rules.
The games are developed using the Multi Touch Learning Game framework MTLG (see footnote 3), which
supports game development using the HTML5 Canvas element and native JavaScript. The
games are implemented using the Model-View-Presenter pattern, while making use of
different core components of MTLG via the (revealing) module pattern [Os12]. Both games
are delivered using a simple web server and run fully on the client side, requiring no further
server capacities. The only exception is a logging module that can be set up to forward
event log data to a remote server for evaluation purposes.
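
The revealing module pattern mentioned above can be illustrated with a minimal sketch. The logger below is hypothetical: the names createLogger, log, flush and the forwarding callback are assumptions made for illustration and are not part of the MTLG API. It only shows how private state stays inside a closure while a small public interface is revealed.

```javascript
// Hypothetical logging module in the style of the revealing module pattern.
// All names here are illustrative assumptions, not MTLG functions.
function createLogger(forward) {
  // Private state: only reachable through the functions revealed below.
  const buffer = [];

  // Records one event in the private buffer.
  function log(eventName, payload) {
    buffer.push({ eventName, payload, time: Date.now() });
  }

  // Hands all buffered events to the forwarding callback and empties the buffer.
  function flush() {
    const events = buffer.splice(0, buffer.length);
    if (events.length > 0) {
      forward(events);
    }
    return events.length;
  }

  // Reveal only the public functions.
  return { log, flush };
}

// Usage: the callback stands in for a request to a remote evaluation server.
const sent = [];
const logger = createLogger((events) => sent.push(...events));
logger.log("levelStarted", { level: 1 });
logger.log("levelCleared", { level: 1 });
const flushedCount = logger.flush();
```

Because the forwarding step is injected as a callback, a game could run without any remote server by passing a function that discards the events.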


4.1   Data collection for learner modeling

All personalization efforts begin with the creation of a learner model, which is implemented
in the data collection module. Since we mainly distinguish between two different approaches
for data collection, i.e. manual and automatic, two components are provided in our data
collection module.
The current game prototypes support manual, deductive learner modeling implemented in a
selection interface that shows pre-defined and popular services that the learners can then
select or deselect, depending on their familiarity with the services. Services are divided
into different categories (e.g. cloud storage, shopping, payment services) and displayed
using pagination. The selection screen is developed as an independent component that can
3 https://mtlg-framework.gitlab.io/, last accessed 08-12-2020


Fig. 2: Level of the game “A Phisher’s Bag of Tricks”. The player has to create a malicious URL by
combining available URL parts via drag-and-drop.


be added to existing games to facilitate the simple deductive creation of learner models.
Contrasting this manual selection is the automated, inductive approach, which uses a browser
extension to collect services directly from a learner’s browser history. This approach is not
yet suitable for large-scale studies, as there are major privacy concerns when collecting the
browser history of end-users. We aim to handle these issues in the future by making sure
that user data never leaves the browser.

The output of either approach is a learner model specified as a JSON object that includes
information about the learner’s familiarity with a number of services. If additional aspects
need to be considered for the model, further data collection components can be integrated.
As of now, the learner model does not actively involve in-game data. However, it could be
extended, e.g. by collecting knowledge about the player’s progress or covered content to
reflect the current state of the learner during the game.
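
As a rough illustration of what such a learner model might look like, the following sketch builds one from a manual selection. The field names (services, familiar, category) and the example services are assumptions; the paper does not specify the exact schema of the object.

```javascript
// Hypothetical service catalogue; categories follow the examples in the text.
const catalogue = [
  { name: "ExampleCloud", url: "https://cloud.example.com/login", category: "cloud storage" },
  { name: "ExampleShop", url: "https://shop.example.org/signin", category: "shopping" },
  { name: "ExamplePay", url: "https://pay.example.net/login", category: "payment services" },
];

// Builds a learner model from the names the learner marked as familiar.
// The schema is an assumption made for this sketch.
function buildLearnerModel(services, selectedNames) {
  const selected = new Set(selectedNames);
  return {
    services: services.map((service) => ({
      ...service,
      familiar: selected.has(service.name),
    })),
  };
}

// Returns only the services the learner is familiar with.
function knownServices(model) {
  return model.services.filter((service) => service.familiar);
}

const learnerModel = buildLearnerModel(catalogue, ["ExampleShop", "ExamplePay"]);
// The model is plain data, so it can be serialized and passed to later stages.
const serialized = JSON.stringify(learnerModel);
```

Keeping the model as plain serializable data means that a different data collection component only has to produce the same structure to be exchangeable.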


4.2   Rule-based content generation

The next step in the personalization pipeline is content generation. In the current game
prototypes, URLs can be generated dynamically to ensure a large variety of content, even
for learner models with only few known services. The necessary functionality is provided by
the URL generator component, which is used to transform benign services and their URLs
into malicious URLs imitating those services. It can also create additional benign URLs.
The generator takes service descriptions (consisting of a name and URL) as input, performs
a number of rule-based modifications on the URLs, and outputs a number of malicious or
benign URLs and details about their creation process. The rules are based on patterns that
were extracted from real-world examples of benign and phishing URLs: Benign rules are
based on URLs from the login pages of popular websites (retrieved from Alexa, see footnote 4), while
malicious rules are based on a study of related work and verified using the popular database
PhishTank (see footnote 5). Several rules can be applied in sequence to create more complex and realistic
URLs. The generators can also be exchanged to generate different types of content; for
example, an email generator component is planned for a future game prototype.
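
The idea of applying several rules in sequence can be sketched as follows. The three rules, their names and the attacker domain are assumptions chosen for illustration; they are not the rule set used in the prototypes. The sketch relies only on the standard URL class available in browsers and Node.js.

```javascript
// Illustrative manipulation rules; each maps a URL string to a new URL string.
// Rule names and the attacker domain are assumptions made for this sketch.
const maliciousRules = {
  // Turns the original host into a subdomain of an attacker-controlled domain.
  subdomain: (url) => {
    const u = new URL(url);
    u.hostname = `${u.hostname}.attacker.example`;
    return u.toString();
  },
  // Replaces the host with an IP address from a documentation range.
  ipAddress: (url) => {
    const u = new URL(url);
    u.hostname = "192.0.2.10";
    return u.toString();
  },
  // Prepends a reassuring keyword to the path.
  pathKeyword: (url) => {
    const u = new URL(url);
    u.pathname = `/secure${u.pathname}`;
    return u.toString();
  },
};

// Applies the named rules in order and records how the URL was created.
function generateUrl(service, ruleNames) {
  const url = ruleNames.reduce(
    (current, ruleName) => maliciousRules[ruleName](current),
    service.url
  );
  return {
    service: service.name,
    url,
    malicious: ruleNames.length > 0,
    appliedRules: [...ruleNames],
  };
}

const service = { name: "ExampleShop", url: "https://shop.example.org/signin" };
const result = generateUrl(service, ["subdomain", "pathKeyword"]);
const benign = generateUrl(service, []);
```

Returning the list of applied rules alongside the URL mirrors the "details about their creation process" mentioned above and could be used by a game to explain a solution.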


4.3   Content delivery via a level controller

The last step in the pipeline is content delivery. Conceptually, the content delivery module is
the interface available to the game to request personalized content. While the pipeline was
initially described as a three-step process, here, the content delivery module takes on the role
of a controller, collecting necessary information with the help of several new components.
First, the level generator component is used to generate personalized levels by embedding
generated content into explicit level definitions for the games. The actual interface between
delivery module and game is implemented in the level controller, which triggers the creation
of new levels upon request by the game and returns appropriate level definitions. This
cyclic relation allows for on-demand level generation during the game. Level definitions are
tailored towards the specific game and include all information necessary to create a playable
level, including specific task descriptions, conditions required to clear the level and the
URLs that are to be used. Finally, the game handles the translation of level definitions into
actual playable levels, by creating the required views and game logic.
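
The interplay between game, level controller and level generator can be sketched as below. All names (createLevelGenerator, createLevelController, requestLevel) and the fields of the level definition are hypothetical, and the URL generator is replaced by a stub so that the example stays self-contained.

```javascript
// Hypothetical level generator: embeds generated URLs into a level definition.
// The definition fields are assumptions made for this sketch.
function createLevelGenerator(generateUrls) {
  return function generateLevel(learnerModel, levelNumber) {
    return {
      level: levelNumber,
      task: "Sort each URL into benign or malicious.",
      clearCondition: { minCorrect: levelNumber + 1 },
      urls: generateUrls(learnerModel, levelNumber + 2),
    };
  };
}

// Hypothetical level controller: the only object the game talks to.
function createLevelController(learnerModel, generateLevel) {
  let nextLevel = 1;

  // Called by the game whenever it needs a new level definition.
  function requestLevel() {
    const definition = generateLevel(learnerModel, nextLevel);
    nextLevel += 1;
    return definition;
  }

  return { requestLevel };
}

// Stand-in for a URL generator: cycles through the services in the model.
function stubUrls(learnerModel, count) {
  const urls = [];
  for (let i = 0; i < count; i += 1) {
    const service = learnerModel.services[i % learnerModel.services.length];
    urls.push({ service: service.name, url: service.url, malicious: false });
  }
  return urls;
}

const model = {
  services: [{ name: "ExampleShop", url: "https://shop.example.org/signin" }],
};
const controller = createLevelController(model, createLevelGenerator(stubUrls));
const first = controller.requestLevel();
const second = controller.requestLevel();
```

Since the game only calls requestLevel, the generator behind the controller can be exchanged without changes to the game code.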


5     Discussion

Besides our previous conceptual contribution [Ro20b], we considered the work of Ismail
et al. [IB18] as a basis of our implementation, since they propose a reusable software
architecture for personalized learning systems. Although the architecture is not specifically
designed for personalized learning games, Ismail et al. describe abstract main components
reusable in different implementation contexts which can be mapped to our architecture.

First, the “learner unit” in the work of Ismail et al. [IB18] can be directly mapped to
our learner model, as both are responsible for maintaining data about the learners. Next,
the “presentation unit” maps to our two games, as they are the environment in which a
learner interacts with the personalized content. While we argue that these two units can
be easily mapped to components in our architecture, mapping the “knowledge unit” and
“personalization unit” requires the clarification of a crucial difference between traditional
e-learning environments and learning games. While Ismail et al. [IB18] consider learning
resources to be courses, e-books, and similar, learning resources within a game environment
are tutorials and tasks, which in particular include the generated learning game content.
4 https://www.alexa.com/topsites, last accessed on 09-12-2020
5 https://www.phishtank.com/, last accessed on 09-12-2020

Therefore, the mapping of a “knowledge unit” is more vague, but best fits the generator
components and game assets. Lastly, the “personalization unit” cannot be mapped ideally
either, as it includes both data collection and content delivery. We argue that this distinction
improves our architecture, since our modular approach allows the exchange of individual
data collection and even level generation components.
Currently, the implementation of the pipeline and prototype games focuses only on the
personalization of services and URLs that appear in the games; other parts of the games are
fixed. In particular, progress in the game and level difficulties are not adapted to the learner.
This would require a notion of difficulty for the different levels, as well as an approach to
update the learner model to reflect progress, strengths and weaknesses of the learner. Even
though both of these requirements can already be modeled using the conceptual architecture
of the personalization pipeline, they are not currently implemented, as they are not the focus
of the research project.
Another open question is how to improve the data collection module. Currently, only the
manual selection is used for creating a learner model. The integration of the automated
approach for data collection needs to be evaluated carefully with regard to user privacy. In
particular, even though the games themselves do not leak any user information, the logging
functionality needs to be adjusted to not leak any information on URLs or services. With
both approaches implemented and functioning, a hybrid approach that uses the output of
the automated approach as input to the manual selection is also an alternative that is left to
be analysed in more detail.
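
One possible reading of the hybrid approach is that automatically collected services only preselect entries in the manual selection, which the learner then confirms or corrects. The sketch below follows that reading; the function names and data shapes are assumptions, and the list of host names stands in for the output of an automated collector.

```javascript
// Marks catalogue entries as preselected when their host was seen by the
// automated collector. Unknown hosts are ignored in this sketch.
function preselect(catalogue, automatedHosts) {
  const hosts = new Set(automatedHosts);
  return catalogue.map((service) => ({
    name: service.name,
    url: service.url,
    selected: hosts.has(new URL(service.url).hostname),
  }));
}

// Applies the learner's corrections: every toggled name flips its selection.
function applyLearnerChoices(preselection, toggledNames) {
  const toggled = new Set(toggledNames);
  return preselection.map((entry) => ({
    ...entry,
    selected: toggled.has(entry.name) ? !entry.selected : entry.selected,
  }));
}

const serviceCatalogue = [
  { name: "ExampleCloud", url: "https://cloud.example.com/login" },
  { name: "ExampleShop", url: "https://shop.example.org/signin" },
  { name: "ExamplePay", url: "https://pay.example.net/login" },
];

// Hypothetical collector output; in the envisioned setup it stays in the browser.
const preselected = preselect(serviceCatalogue, ["shop.example.org", "unknown.example.com"]);
// The learner deselects ExampleShop and adds ExamplePay.
const confirmed = applyLearnerChoices(preselected, ["ExampleShop", "ExamplePay"]);
const familiarNames = confirmed.filter((e) => e.selected).map((e) => e.name);
```

In this reading the learner keeps the final say, so the automated step only reduces manual effort.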
To summarize, the current prototypes show that the personalization pipeline proposed
previously [Ro20b] can be implemented using a modular, reusable architecture. The next
step will be to evaluate the personalized games, to compare them to their non-personalized
versions, as well as compare different data collection and personalization approaches
regarding their effect on learning outcomes and the learner’s experience.


6   Conclusion and Future Work
In this paper, we presented an architecture for personalized anti-phishing learning games.
To this end, we implemented a personalization pipeline consisting of three stages: data
collection, content generation and content delivery. Our results show that the modular
approach to pipeline and game creation makes it possible to retrofit personalized content into
existing games, and facilitates improving and changing different components as required by
the specific game.
For future work, we intend to evaluate the game prototypes and compare the personalized
games to the non-personalized setting, to answer the underlying research question whether
there are differences in learning outcomes and gaming behavior. Furthermore, we intend
to create the automated data collection as an alternative approach to the manual selection
interface and compare the approaches from a learner’s perspective.
Bibliography
[BBM13] Bezza, Assma; Balla, Amar; Marir, Farhi: An approach for personalizing learning content
        in e-learning systems: A review. In: 2013 Second International Conference on E-Learning
        and E-Technologies in Education (ICEEE). pp. 218–223, 2013.
[BTP12]    Bakkes, Sander; Tan, Chek Tien; Pisan, Yusuf: Personalised Gaming: A Motivation and
           Overview of Literature. In: Proceedings of The 8th Australasian Conference on Interactive
           Entertainment: Playing the System. IE ’12, Association for Computing Machinery, New
           York, NY, USA, 2012.
[DS16]     Dewey, Chad M.; Shaffer, Chad: Advances in information SEcurity EDucation. In: Int.
           Conf. on Electro Information Technology. IEEE, Grand Forks, pp. 133–138, 2016.
[HASB16] Hendrix, Maurice; Al-Sherbaz, Ali; Bloom, Victoria: Game Based Cyber Security
         Training: are Serious Games suitable for cyber security training? Serious Games,
         3(1):53–61, 2016.
[IB18]     Ismail, Heba; Belkhouche, Boumediene: A Reusable Software Architecture for Personal-
           ized Learning Systems. In: 2018 International Conference on Innovations in Information
           Technology (IIT). pp. 105–110, 2018.
[KEJ13]    Khenissi, Mohamed Ali; Essalmi, Fathi; Jemni, Mohamed: Toward the personalization
           of learning games according to learning styles. In: 2013 International Conference on
           Electrical Engineering and Software Applications. pp. 1–6, 2013.
[Ki06]     Kickmeier-Rust, Michael D.; Schwarz, Daniel; Albert, Dietrich; Verpoorten, Dominique;
           Castaigne, J-L; Bopp, Matthias: The ELEKTRA project: Towards a new learning
           experience. M3–Interdisciplinary aspects on digital media & education, pp. 19–48, 2006.
[LKR08]    Law, Effie Lai-Chong; Kickmeier-Rust, Michael D.: 80Days: Immersive digital educational
           games with adaptive storytelling. University of Graz, 2008.
[Os12]     Osmani, Addy: Learning JavaScript Design Patterns: A JavaScript and jQuery Developer’s
           Guide. O’Reilly Media, Inc., 2012.
[Ro20a]    Roepke, Rene; Koehler, Klemens; Drury, Vincent; Schroeder, Ulrik; Wolf, Martin R.;
           Meyer, Ulrike: A Pond Full of Phishing Games - Analysis of Learning Games for
           Anti-Phishing Education. In (Hatzivasilis, George; Ioannidis, Sotiris, eds): Model-driven
           Simulation and Training Environments for Cybersecurity. Lecture Notes in Computer
           Science, Springer International Publishing, Cham, pp. 41–60, 2020.
[Ro20b]    Roepke, Rene; Schroeder, Ulrik; Drury, Vincent; Meyer, Ulrike: Towards Personalized
           Game-Based Learning in Anti-Phishing Education. In: 2020 IEEE 20th International
           Conference on Advanced Learning Technologies (ICALT). pp. 65–66, 2020.
[SS16]     Streicher, Alexander; Smeddinck, Jan D.: Personalized and Adaptive Serious Games. In
           (Dörner, Ralf; Göbel, Stefan; Kickmeier-Rust, Michael; Masuch, Maic; Zweig, Katharina,
           eds): Entertainment Computing and Serious Games, volume 9970 of Lecture Notes in
           Computer Science, pp. 332–377. Springer International Publishing, Cham, 2016.
[TMJ17]    Tioh, Jin-Ning; Mina, Mani; Jacobson, Douglas W.: Cyber security training a survey of
           serious games in cyber security. In: 2017 IEEE Frontiers in Education Conf. (FIE). IEEE,
           Indianapolis, pp. 1–5, 2017.