=Paper= {{Paper |id=Vol-2622/paper6 |storemode=property |title=Extracting Attribute-Based Access Control Rules from Business Process Event Logs |pdfUrl=https://ceur-ws.org/Vol-2622/paper6.pdf |volume=Vol-2622 |authors=Amani Abou Rida,Nour Assy,Walid Gaaloul |dblpUrl=https://dblp.org/rec/conf/bdcsintell/RidaAG19 }} ==Extracting Attribute-Based Access Control Rules from Business Process Event Logs== https://ceur-ws.org/Vol-2622/paper6.pdf
    Extracting Attribute-Based Access Control Rules
           From Business Process Event Logs
                  Amani Abou Rida                                                Nour Assy                           Walid Gaaloul
       Computer Science Department                                Computer Science Department               Computer Science Department
   Lebanese University - Faculty of sciences                     Lebanese International University                  Telecom sudParis
              Beirut, Lebanon                                           Beirut, Lebanon                               Paris, France
        amani.abourida96@gmail.com                                    nour.assy@liu.edu.lb                 walid.gaaloul@telecom-sudparis.eu



   Abstract: Protecting sensitive information from unauthorized access is recognized as a crucial issue for today’s organizations. Identity and Access Management is one of the best-practice techniques for ensuring that the right people have access to the right systems at the right time. In particular, Attribute-Based Access Control (ABAC) models have recently gained popularity because of their capability to provide fine-grained and contextual access control that is not based on the user but on the attributes of every component in the system. Despite the benefits of adopting ABAC, it is commonly agreed that deploying an ABAC system is a complicated, time-consuming and challenging task. This is because all attributes of the system must be defined, and access rules must not only be created, but also regularly monitored and reviewed. In this paper, we propose an automated approach to extract ABAC rules from event logs which record the actual execution of business processes. Event logs capture which tasks are performed by whom and at what point in time, and what data are taken as input and output. Therefore, they provide rich information on task and data access policies. Concretely, we propose to use (i) process mining techniques in order to analyze the event log and extract useful attributes and (ii) data mining techniques in order to learn the ABAC rules. To validate our approach, we (i) developed a Java application, and (ii) performed experiments on a real-life event log. Experimental results show that our approach is efficient and feasible.

   Index Terms: Process mining, Attribute Based Access Control Model, Event logs, Association Rule Mining.

                           I. INTRODUCTION

   Organizations are increasingly becoming dependent on information technology to perform their business operations, thereby meeting their business objectives. In order to ensure reliable execution of business processes and protect data from unauthorized access, security measures need to be implemented. Identity and Access Management (a.k.a. access control) is considered a crucial security measure and a strong driving force for protecting the data, employees, and property of an organization. Roughly speaking, the purpose of access management is to grant authorized users access to appropriate data and deny access to unauthorized users.

   Recently, there has been a growing interest in Attribute-Based Access Control (ABAC) models [7], which define access control rules based on the attributes of every component in an information system. In an ABAC system, any type of attribute, such as user attributes (a.k.a. subject attributes), resource attributes and other relevant contextual attributes, such as date-time, are used to determine access. For instance, the rule “Permit managers to access financial data provided they are from finance department” would allow users with attributes of “Role=Manager” and “Department=Finance” to access data with the attributes of “Category=Financial”. This makes ABAC a flexible and fine-grained strategy to manage users’ access operations.

   In order to deploy ABAC, one has to manually define all the attributes of the system, assign attributes to each system component, and create and maintain policies (a.k.a. rules) according to the security requirements. This makes ABAC systems complicated and hard to deploy. This paper addresses the aforementioned issue by proposing an automated approach to learn ABAC rules from business process event logs. Event logs [1] record the actual execution of business processes in an organization. They capture which tasks are performed by whom and at what point in time, and what data are taken as input and output. Therefore, they provide rich information on task and data access policies. The extracted ABAC rules depict the “current state” permissions within an organization. In case of an existing ABAC model, our extracted rules can be used to monitor and review the granted permissions. Otherwise, they serve as a starting point for further refinement and customization to derive “target state” ABAC models that represent the tailored permissions to execute business processes. Concretely, our approach is divided into two main steps. In the first step, we automatically extract from the event log the attributes and relations that are required for mining the ABAC rules. In the second step, we use Association Rule Mining to learn ABAC rules from the event log.

   The remainder of the paper is organized as follows. In Section II, we present an example that will be used throughout the paper to illustrate our approach. Section III formalizes some concepts and definitions that are required for our approach. In Section IV, we detail our approach for extracting ABAC rules from event logs. Our implementation and experimental results are reported in Section V. In Section VI, we discuss some related works before we conclude in Section VII.

                           II. RUNNING EXAMPLE

   We consider a scenario that describes a process for handling a request for ticket compensation within an airline. To handle

  Copyright © 2019 for this paper by its authors. Use permitted under Creative
  Commons License Attribution 4.0 International (CC BY 4.0).



this ticket, the process should be secured so that each worker knows his tasks and whether he can access the request or not. In general, this process is shared between different process users and is accessed according to specific rules. Defining these rules manually is complicated for a company. We therefore need to improve information sharing while maintaining control of that information. In order to ease the process security experience, the process provider decides to provide attribute based access control rules that rely upon the evaluation of attributes of the subject, attributes of the object, environment attributes, and the formal relationship, access control rule or policy that defines the allowable operations for subject-object attribute combinations. The attributes needed in an ABAC model are to be extracted from the event logs. For example, consider an event log where every row corresponds to an event. These events have different properties. The first column contains the PID, which is the case ID. The second column refers to the time-stamp at which the activity is executed by the resource. The third column refers to the activity that is being executed. Then there is a column referring to the resource, which is the person executing the corresponding activity. There can also be other columns with other data such as role, cost, department, location, and status. Using the event log shown in Figure 1, we can extract a rule stating that a resource with resource.role == “Manager” AND resource.status == “2”, at environmental.dateTime == “10/2/2105 13:20”, can perform action == “decide” on the object with object.cost == “8” AND object.customerID == “1718”. Under this rule, Ellen has access to the object while Sara does not. For this reason the above event log should be analyzed in order to (i) classify the attributes according to the ABAC model and (ii) infer the set of ABAC rules. However, doing this manually is an incredibly tedious and error-prone task. Therefore, we propose in this work an automated approach for extracting ABAC rules from an event log.

             Figure 1. Excerpt of the event log

                     III. PRELIMINARIES

   This section presents two main ingredients of our approach: event logs, which are used as input of our approach (detailed in Section III-A), and attribute-based access control models, which represent the output of our approach (detailed in Section III-B).

A. Event Logs

   An event log contains the execution data of a business process and is recorded by the information system. Event logs are used by process mining techniques to discover process models, to check the conformance of a-priori process models, to detect execution errors or to observe social behaviors [1]. The structure of an event log is defined by the XES standard, which defines an event log as a set of traces [5].

   Figure 2 shows the class diagram of an event log¹. A log consists of traces and a trace consists of events. The events within a case are ordered and they can also have attributes. Examples of typical attribute names are activity, Contextual Attribute (e.g. time), Object Data (e.g. costs), and resource. An event log accumulates the execution history of one process. A log case corresponds to one process instance execution. The most common attributes in event logs are:
   • Case ID: the process instance id of the event.
   • Activity: name of the action performed in the event.
   • Event Type: the event state, such as started, paused, resumed, and completed.
   • Time-stamp: date and time at which the event has been executed, establishing an order between the events.
   • Resource: name of the resource that initiates the event.
   • Data: data attribute related to the event.

             Figure 2. Class diagram for an event log (classes: Log, Trace, Case, Event, EventAttribute, Activity, ContextualAttribute, Resource, ObjectData; an Event has one or more EventAttributes; example attributes shown: TimeStamp, Role, Cost)

   ¹ This diagram is a slightly modified version of the diagram presented in [1].

   Definition 1 (Trace, Event log, Event Attribute) [11]. Let E be the event universe, i.e., the set of all possible event identifiers [11]. An event log is a set of traces and each trace contains a number of events; a trace ⟨e1, e2, ..., en⟩ ∈ T is a sequence of events.
Events may be characterized by various attributes, e.g., an event may have a time-stamp, refer to an activity, be performed by a specific person, have associated costs, etc.




Let AN be a set of attribute names. For any event e ∈ E and name n ∈ AN, #n(e) is the value of attribute n in event e. If event e does not have an attribute named n, then #n(e) = ⊥ (null value) [11].
Each event ei ∈ E has event attributes. These event attributes correspond to the activity #activity(e) = a ∈ A, the resource #resource(e) = r ∈ R, and the time-stamp #time(e) = t ∈ D of event e. For convenience we assume the following standard attributes:
   • #activity(e) is the activity associated to event e.
   • #time(e) is the time-stamp of event e.
   • #resource(e) is the resource (user) associated to event e.
   We can also have other attributes, which are classified as contextual attributes (time, location ...) or object attributes (data type, cost, status ...). Let CA be the set of contextual attributes, where CA = {D, L, ...}, and OA be the set of object attributes, where OA = {DT, S, CS}.

B. Attribute Based Access Control Model

   An ABAC model grants or denies user requests based on arbitrary attributes of the user, attributes of the object, and environment conditions that can be recognized and are relevant to the policies at hand. The owners of the objects have the permission to establish a policy that states what operations may be performed upon those objects, by whom, and in what context those subjects may perform those operations.

        Figure 3. Class diagram for attribute based access control model (classes: Attribute, SubjectAttribute, ObjectAttribute, EnvironmentalAttribute, Policy, Rule, Operation, Subject, Object)

   The main elements in an ABAC model are the attributes, which can be about anything and anyone. These attributes fall into four different categories or functions (as in grammatical function). Figure 3 shows a class diagram for the ABAC model that contains the main categories. These categories are the following:
   • Subject attributes: attributes that describe the user who wants to access, e.g. age, status, role, job title...
   • Operation attributes: attributes that present the action or activity being done, e.g. read, delete, view, approve...
   • Object attributes: attributes that express the object (or resource) being accessed, e.g. the object type, the department, the location, the cost...
   • Contextual (environment) attributes: attributes that deal with time, location or dynamic aspects of the access control scenario.
   ABAC is also concerned with policies and rules. These rules are based on the privileges of subjects and on how resources or objects are to be protected under which environmental conditions, in order to determine whether access is allowed or not.

   Definition 2 (ABAC relation) We also define the following relation: a subject-operation-object (SOPO) permission tuple is a tuple ⟨s, op, o⟩ containing a subject s ∈ S, an operation op ∈ OP, and an object o ∈ O. This tuple means that subject s has permission to perform operation op on object o. For convenience we assume the following standard attributes:
   • #SubjectAttribute(s) is a function that returns a set of pairs (attribute name, value) of the subject.
   • #ObjectAttribute(o) is a function that returns a set of pairs (attribute name, value) of the object.
   • #EnvironmentalAttribute() is a function that returns a set of pairs (attribute name, value) of the environmental attributes.
   Definition 3 (ABAC rule) A rule ABAC-R is a tuple ⟨#SubjectAttribute(s), SOPO, #ObjectAttribute(o), #EnvironmentalAttribute()⟩. The ABAC rule is defined as follows: #SubjectAttribute(s) ∧ SOPO ∧ #EnvironmentalAttribute() ⇒ #ObjectAttribute(o).

                         IV. MINING ABAC RULES

A. Approach overview

   In this section, we present an overview of our automated approach for extracting ABAC rules from event logs. The input is an event log that contains the attributes defined in Definition 1. The output is the set of ABAC rules needed for an ABAC model.

Algorithm 1 Building an attribute based access control guidance model
  1) Input: E
  2) Output: ABAC-R
  3) for e ∈ E
  4) E-attributes = get-Attributes (e)
  5) E-relations = get-Relations (e, E-attributes)
  6) ABAC-attributes = get-Attributes (e)
  7) ABAC-relations = get-Relations (e)
  8) Mapping (E-attributes, E-relations, ABAC-attributes, ABAC-relations)
  9) E-saved = Save-To-ARFF (E-attributes)
 10) end for
 11) ABAC-R = Apriori (E-saved, minS, minC)
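The following Python fragment is a rough sketch of the overall flow of Algorithm 1 under simplifying assumptions; it is not the Java and ARFF pipeline itself. Each event is turned into a set of `category.name=value` items using a hypothetical `CATEGORY` table, frequent itemsets are searched level by level in the Apriori style, and a rule is kept only when its right-hand side holds object attributes and its left-hand side holds everything else. Support is taken as the fraction of events that contain an itemset and confidence as sup(LHS ∪ RHS) / sup(LHS), the usual association-rule definitions. `CATEGORY`, `to_items`, `support`, `frequent_itemsets`, `abac_rules` and the sample events are names and data invented for illustration.

```python
from itertools import combinations

# Hypothetical mapping from event attribute names to ABAC categories.
CATEGORY = {
    "resource": "subject", "role": "subject",
    "activity": "operation",
    "location": "environment",
    "cost": "object", "type": "object",
}


def to_items(event):
    """Turn one event (a dict) into a frozenset of 'category.name=value' items."""
    return frozenset(
        f"{CATEGORY[name]}.{name}={value}"
        for name, value in event.items() if name in CATEGORY
    )


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support}."""
    level = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    while level:
        kept = {}
        for candidate in level:
            s = support(candidate, transactions)
            if s >= min_support:
                kept[candidate] = s
        frequent.update(kept)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset.
        level = {a | b for a in kept for b in kept if len(a | b) == len(a) + 1}
        level = {
            c for c in level
            if all(frozenset(s) in kept for s in combinations(c, len(c) - 1))
        }
    return frequent


def abac_rules(events, min_support, min_confidence):
    """Rules (lhs, rhs, confidence) with only object attributes on the right."""
    transactions = [to_items(e) for e in events]
    frequent = frequent_itemsets(transactions, min_support)
    rules = []
    for itemset, sup in frequent.items():
        rhs = frozenset(i for i in itemset if i.startswith("object."))
        lhs = itemset - rhs
        if not lhs or not rhs:
            continue
        # Every subset of a frequent itemset is frequent, so lhs is present.
        confidence = sup / frequent[lhs]
        if confidence >= min_confidence:
            rules.append((lhs, rhs, confidence))
    return rules


# Hypothetical sample events.
events = [
    {"resource": "Ellen", "role": "Manager", "activity": "decide", "cost": 8},
    {"resource": "Ellen", "role": "Manager", "activity": "decide", "cost": 8},
    {"resource": "Sara", "role": "Clerk", "activity": "register", "cost": 8},
    {"resource": "Sara", "role": "Clerk", "activity": "register", "cost": 3},
]
rules = abac_rules(events, min_support=0.5, min_confidence=0.9)
for lhs, rhs, conf in rules:
    print(sorted(lhs), "=>", sorted(rhs), round(conf, 2))
```

With these four sample events, the only rules that survive relate Ellen, the Manager role and the decide activity to a cost of 8, each with confidence 1. The events recorded for Sara do not yield a rule because no object value co-occurs with her attributes in at least half of the events.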




   The algorithm proceeds in four main steps. In the first step, the event log is analyzed to extract the attributes and relations needed in our model, which are stored in E-attributes and E-relations respectively (lines 4 and 5). Then, in the second step, the attribute based access control model is also analyzed to extract the main attributes and relations, which are stored in ABAC-attributes and ABAC-relations respectively (lines 6 and 7). The third step consists of mapping the attributes and relations of event logs to the attributes and relations of attribute based access control (line 8). Finally, the last step consists of generating the set of attribute based access control rules (line 11).

B. Extracting attribute based access control rules from business process event logs

   In this section we present the approach in two main steps. In the first step, we build a mapping between event logs and ABAC models by analyzing the event log and mapping the event logs’ attributes to ABAC models’ attributes. In the second step, we apply the Apriori algorithm to extract ABAC rules.

C. Mapping between event logs and ABAC models:

   In this section, we present the two parts needed to obtain the mapping between an event log and an ABAC model. In the first part, we analyze the event log to extract the attributes needed. In the second part, we map the attributes extracted from the event log to ABAC model attributes.
  1) Analyzing the event log:
     Here we analyze the event log given as input. We aim to show how we can benefit from the event log to extract the attributes needed for creating an ABAC model. The class diagram in Figure 2 summarizes the analysis for any event log. We extended the original event log so that we can benefit from the attributes in creating the ABAC model. An event log is composed of traces that contain a number of events. Each event has a set of attributes, and we modified some of them as follows:
       • Resource: name of the resource that initiates the event. It can also contain the role of the resource, age, and the status.
       • Contextual attributes: the time-stamp, i.e. the date and time at which the event has been executed, establishing an order between the events. They can also contain location, and department ...
       • Object Data: data attributes related to the event. It can contain cost, and type ...
     After analyzing the event log, we observe that any attribute can be classified into these categories. The main point is to extract attributes that match the above elements, so that any event has at least the following attributes: activity, object data, contextual attribute, resource.
     Moreover, we can infer a Resource-Activity-ObjectData relation from these attributes, which is defined as follows:
     Definition 4 (Event relation) Resource-Activity-ObjectData (RAO) assignments: The relation RAO is defined as RAO = { (r, a, od) ∈ R × A × OD | ∃ e ∈ E, #resource(e) = r ∧ #activity(e) = a ∧ #objectData(e) = od }. The RAO relation holds if at least one event e ∈ E with resource r ∈ R that executes activity a ∈ A on object data od ∈ OD is recorded in the event log. We then say that resource r is assigned to activity a, through which object data od is accessed.
  2) Mapping event logs’ attributes to ABAC models’ attributes:
     Table I shows a mapping where a resource is said to be a subject attribute, an object data is the object attribute, a contextual attribute is the environmental attribute, and the activity is the operation.

                                  Table I
        MAPPING BETWEEN EVENT LOGS’ ATTRIBUTES AND ABAC MODELS’ ATTRIBUTES

   Event Log                               Attribute Based Access Control
   Resources                               Subject Attribute
   Object Data                             Object Attribute
   Contextual Attribute                    Environmental Attribute
   Activity                                Operation
   Resource-Activity-Object Data (RAO)     Subject-Operation-Object (SOPO)

D. Apriori-based approach for extracting ABAC rules:

   The goal here is to find associations of items that occur together more often than one would expect from a random sampling of all possibilities. Apriori starts by selecting the frequent single attributes, then generates the candidate pairs of attributes from the frequent singles, and so on, until it has found all frequent sets of attributes in the events of the event log. It uses support, a well-known metric, to compute the frequency of a set of correlated attributes in the event log. The support is defined as the fraction of events in the event log in which the correlated attributes appear together.

\[ S = \frac{|A_e|}{|E|} \tag{1} \]

   where |A_e| is the number of events that contain the correlated attributes and |E| is the number of events in the event log. The support is equal to 1 if all the events in the event log contain the correlated attributes. The support is equal to 0 if none of the events in the event log contain the corresponding attributes together. A set of correlated attributes is frequent if its support is above a given threshold mSupp.

   In the second step of the algorithm, the set of relevant attribute based access control rules in the form LHS ⇒ RHS is derived from the frequent correlated attributes. In order to keep only relevant rules, the confidence metric is computed to evaluate the probability of occurrence of a rule. The confidence of an attribute based access control rule




LHS ⇒ RHS is defined as the probability of occurrence of the attributes in the right-hand side RHS
given the attributes in the left-hand side LHS. To obtain ABAC rules as in Definition 3, we filter
the derived rules on their attributes. The left-hand side must contain the subject attribute, the
environmental attribute, and the operation that the user performs, whereas the right-hand side must
contain the object attribute that describes the accessed data. The rules therefore take the form:

   #SubjectAttribute(s) ∧ SOPO ∧ #EnvironmentalAttribute() ⇒ #ObjectAttribute(o).

   \[ C = \frac{Sup(RHS \cup LHS)}{Sup(RHS)} \qquad (2) \]

where Sup(RHS ∪ LHS) is the support of the attributes (Subject Attributes, Environmental Attribute,
Object Attributes, and Operation) in the right-hand and left-hand sides of the rule and Sup(RHS) is
the support of the attributes in the right-hand side (Object Attribute).

When applying the Apriori algorithm to these events, we obtain infrequent rules, since the time in
these rules takes numerical values and each event has a different time. The discretization of the
time-stamp, as defined in our work, is the ability to specify that a certain event needs to be
executed at a specified date. Time may be absolute or relative, and the granularity needs to be
considered.

Definition 4 (Absolute interval time-stamp): a time-stamp pattern constraint that defines a punctual
temporal structure and refers to the start and finish times of an activity [8]:
   1) Must Start On (MSO), Must Finish On (MFO): indicate the exact time at which an activity must
      be scheduled to begin or complete;
   2) Start No Earlier Than (SNET), Finish No Earlier Than (FNET): indicate the earliest possible
      time at which an activity can begin or complete;
   3) Start No Later Than (SNLT), Finish No Later Than (FNLT): indicate the latest possible time at
      which an activity is to begin or complete;
   4) Start No Earlier Than (SNET), Finish No Later Than (FNLT): indicate the earliest possible time
      at which an activity is to begin and the latest possible time at which it is to complete.

In our approach we decided to use the concept of (4), where SNET indicates the minimum time and FNLT
indicates the maximum time. Each event e ∈ E has the event attributes #timestamp(e), #resource(e),
#activity(e), and #object(e). We compare #timestamp(e) across the events e that have the same
#resource(e), #activity(e), and #object(e) to obtain the minimum time-stamp among these events. We
then set #timestamp(e) of all these events to this minimum time-stamp. The maximum time-stamp is
computed in the same way.

Definition 5 (Relative time-stamp): a time-stamp pattern constraint that refers to a dependency
between the activities. This dependency is a relationship between two activities, a_q and a_l where
q is not equal to l, in which one activity depends on the start or finish of another in order to
begin or end [6]. In our approach this relationship depends on the start event of each trace
compared with the other events in the same trace. It is defined as follows: for each trace T ∈ E in
the event log and each event e ∈ T, we obtain the start event se ∈ T, which is the first event of the
trace. The time-stamp is split into years, days, hours, minutes, and seconds. We compare
#timestamp(se) - #timestamp(e) across the events e that have the same #resource(e), #activity(e),
and #object(e) to obtain the minimum Years(e), Day(e), Hours(e), Minutes(e), Seconds(e) among these
events. We then set #timestamp(e) of all events that have the same #resource(e), #activity(e), and
#object(e) to this minimum Years(e), Day(e), Hours(e), Minutes(e), Seconds(e).

                         V. EVALUATION AND VALIDATION

We implemented our approach as a Java application that generates ABAC rules. We integrated the ProM
libraries (http://www.promtools.org/) to import and extract information from an event log. We also
used “filter events” in ProM to reduce the impact of noise on the data. ProM is an open source
framework for implementing process mining techniques. Moreover, we integrated the Weka software,
which contains a group of visualization tools and algorithms for data analysis and predictive
modeling, to generate rules using the Apriori algorithm [9]. The user can choose different support
and confidence values for the Apriori algorithm when generating the ABAC rules.

To evaluate our approach, we used a real-life event log from the BPI Challenge 2018 [12]. The data
set consists of real business processes for EU direct payments for German farmers. The event log
consists of 43809 traces over a period of three years. Tables II and III show statistics about the
number of traces, the number of events, and the number of values of each attribute before and after
filtering noise from the event log, respectively. The attributes used in the event log are Resource,
Activity, Document Type, and Time-stamp. They are classified into ABAC attributes as follows:
   • Subject Attribute: Resource
   • Object Attribute: Document Type
   • Environmental Attributes: Time-stamp
The Activity is the operation that the subject performs to access the object according to these
attributes. Moreover, Table III shows the attributes Absolute Time-stamp, years, minutes, seconds,
days, and hours, which are obtained after applying both the Relative and the Absolute Time-stamp
discretizations defined in the approach.

In the first experiment, we evaluate the quality of the approach by calculating the complexity of
the extracted rules. This is studied by counting the number of rules derived before and after
filtering the attributes to derive ABAC rules and by analyzing the impact of the Apriori support and
confidence values (detailed in Section V-A). In the second experiment, we evaluate the quality of
the approach




by measuring the completeness of the extracted rules. This is studied by counting the average number
of values extracted for the different attributes (detailed in Section V-B).

                             Table II
        STATISTICS ON THE BPI CHALLENGE 2018 EVENT LOG
                Traces                          43809
                Events                          2514266
                Values in Activity              41
                Values in Resource              165
                Values in Document Type         8

                             Table III
STATISTICS ON THE BPI CHALLENGE 2018 EVENT LOG AFTER FILTERING
                Traces                          108
                Events                          6183
                Values in Activity              34
                Values in Resource              102
                Values in Document Type         8
                Values in Absolute Time-stamp   25
                Values in Years                 3
                Values in Minutes               38
                Values in Seconds               47
                Values in Days                  7
                Values in Hours                 21

A. Complexity of the rules and parameter impact
   In this experiment, we study the complexity of our approach, which is expressed in terms of the
number of extracted rules. The higher the number of extracted rules, the higher the complexity. Our
objective is to study the impact of the different support and confidence values on the number of
extracted rules. To do so, we compute (1) the number of extracted rules before filtering, which
includes all the rules generated by the Apriori algorithm, and (2) the number of extracted rules
after filtering, which includes only the rules that contain the attributes selected for ABAC. The
average number of extracted rules for different support and confidence values is shown in Figure 4.

[Figure 4: bar chart. x-axis: the nine combinations of support S ∈ {0.1, 0.5, 0.8} and confidence
C ∈ {0.1, 0.5, 0.8}; y-axis: number of rules (0 to 250); series: number of rules before filtering,
number of rules after filtering.]
Figure 4. Bar graph that shows the number of rules before and after filtering the attributes
according to different support and confidence values

   First, the results clearly show that the number of rules decreases when the support value
increases. The same applies to the confidence values. However, the effect of the support values is
much more noticeable than that of the confidence values. This means that frequency plays an
important role in the context of access control. Highly frequent rules may refer to access control
rules that are respected in an organization, while less frequent rules may indicate a violation of
permissions and therefore need closer inspection by experts. Second, the number of rules greatly
decreases after filtering the attributes. This is because an ABAC rule is valid only if it includes
the following mandatory attributes: the name attribute of the activity, at least one attribute of
the subject (e.g. name, role, etc.), the timestamp attribute of the environmental attributes, and at
least one attribute of the object. An ABAC rule should also have the format (Subject attributes,
Activity attributes, Environmental attributes ⇒ Object attributes). By imposing such constraints, we
decrease the complexity of our approach by discarding the many rules that are generated by Apriori
but are less interesting in the context of access control.

B. Completeness of the rules and parameter impacts
   In the second experiment, we study the completeness of our approach, which is expressed in terms
of the number of attribute values extracted in the rules compared with the total number of attribute
values in the event log. On the one hand, a high number of attribute values per rule means that we
are able to cover the association between the attributes of all ABAC elements from the event log. On
the other hand, a high percentage of retrieved values per attribute means that our rules cover all
possible elements in the event log and that we are able to find all possible ABAC rules. Our
objective is to study the impact of the different support and confidence values on the number of
extracted attribute values. To do so, we study (1) the number of extracted attribute values per rule
before filtering, which includes all the rules generated by the Apriori algorithm, and (2) the
number of extracted attribute values per rule after filtering, which includes only the rules that
contain the attributes selected for ABAC. Moreover, we study the relation between the complexity and
the completeness of the extracted attribute-based access control rules. We study the average number
of values of each attribute found in the extracted rules compared to the total number of attribute
values in the event log, which can be seen in Table III. Figure 5 shows the average number of
attribute values after filtering the attributes so that the rules contain the attributes selected
for ABAC.
   First, the results clearly show that the number of attribute values in the rules, both before and
after filtering, increases when the support value decreases. The same applies to the confidence.
However, the effect of the support values is much more noticeable than that of the confidence
values. This can be explained by the fact that, at a high support threshold, rules that are not
frequent enough are not reported. This leads to the conclusion that the complexity of the ABAC rules
is positively correlated with their completeness (i.e. when the complexity increases, the
completeness increases), while both the completeness and the complexity are negatively correlated
with the support threshold value (i.e. when the support decreases, the completeness and complexity
increase).
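
As an illustration of the two measures discussed in Sections V-A and V-B, the following minimal Python sketch counts rules before and after the ABAC filter (complexity) and computes the percentage of attribute values retrieved per attribute (completeness). It is not the authors' Java/Weka implementation. The rule representation (a pair of attribute-to-value dictionaries), the helper names `is_abac_rule` and `completeness`, and the toy data are assumptions made only for this example.

```python
from collections import defaultdict

# Attribute roles, following the classification given in Section V
# (the constant names themselves are hypothetical).
SUBJECT, OPERATION, ENVIRONMENT, OBJECT = (
    "Resource", "Activity", "Time-stamp", "Document Type")


def is_abac_rule(lhs, rhs):
    """Keep only rules of the form subject AND operation AND environment => object."""
    return {SUBJECT, OPERATION, ENVIRONMENT} <= set(lhs) and OBJECT in rhs


def completeness(rules, log_values):
    """Percentage of each attribute's log values that occur in at least one rule."""
    seen = defaultdict(set)
    for lhs, rhs in rules:
        for attr, value in list(lhs.items()) + list(rhs.items()):
            seen[attr].add(value)
    return {attr: 100.0 * len(seen[attr] & values) / len(values)
            for attr, values in log_values.items()}


# Toy data, invented for illustration (not taken from the BPI Challenge 2018 log).
log_values = {
    "Resource": {"r1", "r2", "r3", "r4"},
    "Activity": {"create", "approve"},
    "Time-stamp": {"t1", "t2"},
    "Document Type": {"invoice", "report"},
}
mined_rules = [
    # Valid ABAC rule: subject, operation, and time-stamp imply the object.
    ({"Resource": "r1", "Activity": "create", "Time-stamp": "t1"},
     {"Document Type": "invoice"}),
    # Rejected by the filter: no environmental attribute on the left-hand side.
    ({"Resource": "r2", "Activity": "approve"}, {"Document Type": "report"}),
    # Rejected by the filter: the right-hand side is not an object attribute.
    ({"Activity": "create"}, {"Resource": "r3"}),
]

abac_rules = [rule for rule in mined_rules if is_abac_rule(*rule)]
complexity_before, complexity_after = len(mined_rules), len(abac_rules)
coverage = completeness(abac_rules, log_values)
```

On this toy input the filter keeps one of the three mined rules. Repeating the computation for each support and confidence setting would give the kind of comparison plotted in Figures 4 and 5.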




Therefore, one has to choose or to find a compromise between the complexity and the completeness of
the results.

[Figure 5: bar chart. x-axis: ABAC attributes extracted from the event log (Activity, Resource,
Document Type, Absolute Time-stamp, Years, days, hours, minutes, seconds); y-axis: % of values
retrieved for each attribute (0 to 35); series: the nine combinations of support S ∈ {0.1, 0.5, 0.8}
and confidence C ∈ {0.1, 0.5, 0.8}.]
Figure 5. Bar graph that shows the average number of values retrieved per attribute according to
different support and confidence values

   In conclusion, the experimental results show that the extracted attribute-based access control
rules are of good quality in terms of complexity and completeness.

                              VI. RELATED WORK

   Our work is related to two research areas: access control models (detailed in Section VI-A) and
automated extraction of access control models from event logs (detailed in Section VI-B).

A. Access Control Model
   This section describes some of the access control mechanisms that computer security architects
and administrators have developed to protect their objects by mediating requests from subjects. Each
mechanism has its own advantages and limitations, but it is important to note the evolution of these
models to fully appreciate the flexibility and applicability of the ABAC model. Some of these models
are the following:
   • MAC: Mandatory Access Control provides only the owner and guardian management of the access
     controls. This means the end user has no control over any settings that provide any privileges
     to anyone [4].
   • DAC: According to [13], Discretionary Access Control does not supply a high security assurance
     for two reasons. First, granting access is transitive. Second, DAC policies are vulnerable to
     Trojan Horse attacks. A Trojan Horse program is one that appears to be doing one thing on the
     surface but actually does something more underneath without the knowledge of the user.
   • IBAC: Identity-Based Access Control captures the identity of the subjects that want to access
     an object. This approach is considered hard to manage.
   • RBAC: The Role-Based Access Control model defines roles that carry a specific set of privileges
     and to which subjects are assigned. Access is determined by the role of the requesting subject
     and by the object owner when determining the privileges associated with each role [3]. The main
     issue with this access control model is that a user who requires access to files that they do
     not have permission for has to find another way to obtain it, since the roles are associated
     only with the position; otherwise, security managers from other organizations could possibly
     get access to files they are not authorized for [2].

B. Extracting Access Control Models from Event Logs
   Among the previous studies related to our topic, a closely related approach extracts Role-Based
Access Control (RBAC) models from event logs [10]. This approach consists of three main steps
through which the authors derive an RBAC model from an event log. In the first step, called
analysis, a process mining technique is used to extract process-related data from an event log in
XES format. In the next step, the extracted data is transformed into an in-memory RBAC model. Before
that, minor adjustments can be made to the extracted data so that the data and relationships reflect
the actual settings of the business process. In the final step, the RBAC model is exported to an
XML-based format in order to support data exchange between different applications, e.g., information
systems could implement the RBAC model, or access policy management systems could be used to enhance
it. ABAC models can be seen as a generalization of RBAC models, i.e. the core element of RBAC is the
“role”, which becomes an attribute in ABAC.

                       VII. CONCLUSION AND FUTURE WORK

   In this paper, we proposed an automated approach to extract ABAC models from event logs, which
record the actual execution of business processes. Given an event log as input, we proposed to
extract the attributes needed for an ABAC model and to learn the ABAC rules automatically. We used
process mining and data mining techniques. The output of our approach is a set of ABAC rules of the
form SubjectAttribute(s) ∧ SOPO ∧ EnvironmentalAttribute() ⇒ ObjectAttribute(o).
   We validated our approach by implementing it as a Java application. We also performed experiments
on a real-life event log from the BPI Challenge 2018. The experimental results showed that our
approach is feasible and that frequency is an important factor in the context of access control.
Highly frequent rules may indicate that the right permissions are respected in an organization. In
case our approach is used to create a new access control model, these rules help experts to set up
their access control model. On the other hand, infrequent rules may indicate violations of
permissions and need to be closely inspected by experts.
   We aim to extend our work in two directions. First, we plan to perform more experiments with
real-life event logs to validate and generalize our results. Second, there has recently been a
growing interest in cloud-based process mining solutions. This has raised privacy issues and a call
for cryptographic techniques to secure event logs. Therefore, we plan to extend our approach to
extract ABAC rules from secured event logs in order to check whether the intended security is
actually respected.

                                 REFERENCES
 [1] Wil M. P. van der Aalst. Process Mining: Data Science in Action. 2nd ed. Heidelberg: Springer,
     2016. ISBN: 978-3-662-49850-7. DOI: 10.1007/978-3-662-49851-4.




 [2] Mark Ciampa. Security+ guide to network security fundamentals. Cengage Learning, 2012.
 [3] DF Ferraiolo and DR Kuhn. “Natl Institute of Standards and Tech., Dept. of Commerce, Maryland,
     Role-Based Access Control”. In: Proceedings of 15th Natl Computer Security Conference. 1992.
 [4] Virginia Nunes Leal Franqueira. “Access control from an intrusion detection perspective”.
     In: (2006).
 [5] Christian W Günther and Eric Verbeek. “Standard definition”. In: Fluxicon Process Laboratories,
     XES Version 1 (2012).
 [6] Rania Ben Halima et al. “Scheduling Business Process Activities for Time-Aware Cloud Resource
     Allocation”. In: OTM Confederated International Conferences “On the Move to Meaningful Internet
     Systems”. Springer. 2018, pp. 445–462.
 [7] Vincent C Hu et al. “Guide to attribute based access control (ABAC) definition and
     considerations (draft)”. In: NIST special publication 800.162 (2013).
 [8] Cosmina Cristina Niculae. “Time patterns in workflow management systems”. In: BPM Center Report
     BPM-11-04, available online at
     http://bpmcenter.org/wp-content/uploads/reports/2011/BPM-11-04.pdf (2011).
 [9] Weka Team. Use WEKA in your Java code. 2016.
[10] Taivo Teder. “Extracting Role-Based Access Control Models from Business Process Event Logs”.
     In: ().
[11] Wil Van Der Aalst. Process mining: discovery, conformance and enhancement of business
     processes. Vol. 2. Springer, 2011.
[12] Van Dongen, B.F. (Boudewijn) and Borchert, F. (Florian). BPI Challenge 2018. 2018.
     DOI: 10.4121/UUID:3301445F-95E8-4FF0-98A4-901F1F204972.
[13] Younis A Younis, Kashif Kifayat, and Madjid Merabti. “An access control model for cloud
     computing”. In: Journal of Information Security and Applications 19.1 (2014), pp. 45–60.



