Extracting Attribute-Based Access Control Rules From Business Process Event Logs

Amani Abou Rida, Computer Science Department, Lebanese University - Faculty of Sciences, Beirut, Lebanon, amani.abourida96@gmail.com
Nour Assy, Computer Science Department, Lebanese International University, Beirut, Lebanon, nour.assy@liu.edu.lb
Walid Gaaloul, Computer Science Department, Telecom SudParis, Paris, France, walid.gaaloul@telecom-sudparis.eu

Abstract: Protecting sensitive information from unauthorized access is recognized as a crucial issue for today's organizations. Identity and Access Management is one of the best-practice techniques for ensuring that the right people have access to the right systems at the right time. In particular, Attribute-Based Access Control (ABAC) models have recently gained popularity because of their capability to provide fine-grained and contextual access control that is based not on the user but on the attributes of every component in the system. Despite the benefits of adopting ABAC, it is commonly agreed that deploying an ABAC system is a complicated, time-consuming and challenging task. This is because all attributes of the system must be defined, and access rules must not only be created, but also regularly monitored and reviewed. In this paper, we propose an automated approach to extract ABAC rules from event logs, which record the actual execution of business processes. Event logs capture which tasks are performed by whom and at what point in time, and what data are taken as input and output. Therefore, they provide rich information on task and data access policies. Concretely, we propose to use (i) process mining techniques to analyze the event log and extract useful attributes and (ii) data mining techniques to learn the ABAC rules. To validate our approach, we (i) developed a Java application, and (ii) performed experiments on a real-life event log. Experimental results show that our approach is efficient and feasible.

Index Terms: Process mining, Attribute-Based Access Control Model, Event logs, Association Rule Mining.

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

I. INTRODUCTION

Organizations are increasingly dependent on information technology to perform their business operations and thereby meet their business objectives. To ensure reliable execution of business processes and to protect data from unauthorized access, security measures need to be implemented. Identity and Access Management (a.k.a. access control) is considered a crucial security measure and a strong driving force for protecting the data, employees, and property of an organization. Roughly speaking, the purpose of access management is to grant authorized users access to appropriate data and to deny access to unauthorized users.

Recently, there has been a growing interest in Attribute-Based Access Control (ABAC) models [7], which define access control rules based on the attributes of every component in an information system. In an ABAC system, any type of attribute, such as user attributes (a.k.a. subject attributes), resource attributes and other relevant contextual attributes such as date-time, is used to determine access. For instance, the rule "Permit managers to access financial data provided they are from the finance department" would allow users with the attributes "Role=Manager" and "Department=Finance" to access data with the attribute "Category=Financial". This makes ABAC a flexible and fine-grained strategy for managing users' access operations.

In order to deploy ABAC, one has to manually define all the attributes of the system, assign attributes to each system component, and create and maintain policies (a.k.a. rules) according to the security requirements. This makes ABAC systems complicated and hard to deploy. This paper addresses the aforementioned issue by proposing an automated approach to learn ABAC rules from business process event logs. Event logs [1] record the actual execution of business processes in an organization. They capture which tasks are performed by whom and at what point in time, and what data are taken as input and output. Therefore, they provide rich information on task and data access policies. The extracted ABAC rules depict the "current state" permissions within an organization. If an ABAC model already exists, our extracted rules can be used to monitor and review the granted permissions. Otherwise, they serve as a starting point for further refinement and customization to derive "target state" ABAC models that represent the tailored permissions to execute business processes. Concretely, our approach is divided into two main steps. In the first step, we automatically extract from the event log the attributes and relations that are required for mining ABAC rules. In the second step, we use Association Rule Mining to learn ABAC rules from the event log.

The remainder of the paper is organized as follows. In Section II, we present an example that is used throughout the paper to illustrate our approach. Section III formalizes the concepts and definitions required for our approach. In Section IV, we detail our approach for extracting ABAC rules from event logs. Our implementation and experimental results are reported in Section V. In Section VI, we discuss related work before we conclude in Section VII.

II. RUNNING EXAMPLE

We consider a scenario that describes a process for handling a request for ticket compensation within an airline. To handle this ticket, the process should be secured so that each worker knows his task and whether or not he may access the request.
In general, this process is shared among different process users and is accessed according to specific rules. Defining these rules manually is complicated for a company. Information sharing should therefore be improved while maintaining control of that information. To ease the task of securing the process, the process provider decides to provide attribute-based access control rules that rely on the evaluation of the attributes of the subject, the attributes of the object, the environment attributes, and a formal relationship (access control rule or policy) that defines the allowable operations for subject-object attribute combinations. The attributes needed in an ABAC model are to be extracted from the event logs. For example, consider an event log in which every row corresponds to an event. These events have different properties. The first column contains the PID, which is the case ID. The second column contains the time-stamp at which the activity is executed by the resource. The third column contains the activity that is executed. Then there is a column for the resource, which is the person executing the corresponding activity. There can also be further columns with other data such as role, cost, department, location, and status. Using the event log shown in Figure 1, we can extract a rule stating that a resource with resource.role == "Manager" AND resource.status == "2" AND environmental.dateTime == "10/2/2105 13:20" can perform action == "decide" on an object with object.cost == "8" AND object.customerID == "1718". Under this rule, Ellen has access to the object while Sara does not. For this reason, the above event log should be analyzed in order to (i) classify the attributes according to the ABAC model and (ii) infer the set of ABAC rules. However, doing this manually is an incredibly tedious and error-prone task. Therefore, we propose in this work an automated approach for extracting ABAC rules from an event log.

Figure 1. Excerpt of the event log

III. PRELIMINARIES

This section presents the two main ingredients of our approach: event logs, which are used as input of our approach (detailed in Section III-A), and attribute-based access control models, which represent the output of our approach (detailed in Section III-B).

A. Event Logs

An event log contains the execution data of a business process and is recorded by the information system. Event logs are used by process mining techniques to discover process models, to check the conformance of a-priori process models, to detect execution errors or to observe social behaviors [1]. The structure of an event log is defined by the XES standard, which defines an event log as a set of traces [5].

Figure 2 shows the class diagram of an event log. A log consists of traces and a trace consists of events. The events within a case are ordered and they can also have attributes. Examples of typical attribute names are activity, Contextual Attribute (e.g. time), Object Data (e.g. costs), and resource. An event log accumulates the execution history of one process. A log case corresponds to one process instance execution. The most common attributes in event logs are:
• Case ID: the process instance id of the event.
• Activity: name of the action performed in the event.
• Event Type: the event state, such as started, paused, resumed, and completed.
• Time-stamp: date and time at which the event has been executed, establishing an order between the events.
• Resource: name of the resource that initiates the event.
• Data: data attribute related to the event.

Figure 2. Class diagram for an event log. (This diagram is a slightly modified version of the diagram presented in [1].)

Definition 1 (Trace, Event log, Event Attribute) [11]. Let E be the event universe, i.e., the set of all possible event identifiers [11]. An event log is a set of traces and each trace contains a number of events; a trace ⟨e1, e2, ..., en⟩ ∈ T is a sequence of events. Events may be characterized by various attributes, e.g., an event may have a time-stamp, refer to an activity, be performed by a specific person, have associated costs, etc. Let AN be a set of attribute names. For any event e ∈ E and name n ∈ AN, #n(e) is the value of attribute n in event e. If event e does not have an attribute named n, then #n(e) = ⊥ (null value) [11].

Each event e ∈ E has event attributes. These correspond to the activity #activity(e) = a ∈ A, the resource #resource(e) = r ∈ R, and the time-stamp #time(e) = t ∈ D of event e. For convenience we assume the following standard attributes:
• #activity(e) is the activity associated to event e.
• #time(e) is the time-stamp of event e.
• #resource(e) is the resource (user) associated to event e.
Events can also have other attributes, which can be classified as contextual attributes (time, location, ...) and object attributes (data type, cost, status, ...). Let CA be the set of contextual attributes, where CA = {D, L, ...}, and OA be the set of object attributes, where OA = {DT, S, CS}.

B. Attribute Based Access Control Model

An ABAC model grants or denies user requests according to arbitrary attributes of the user, attributes of the object, and environment conditions that are relevant to the policies at hand. The owners of the objects have the permission to establish a policy that states what operations may be performed upon those objects, by whom, and in what context those subjects may perform those operations.

Figure 3. Class diagram for attribute based access control model

The main elements of an ABAC model are the attributes, which can be about anything and anyone. These attributes fall into 4 different categories or functions (as in grammatical function). Figure 3 shows a class diagram for the ABAC model that contains the main categories. These categories are the following:
• Subject attributes: attributes that describe the user who requests access, e.g. age, status, role, job title...
• Operation attributes: attributes that describe the action or activity being performed, e.g. read, delete, view, approve...
• Object attributes: attributes that describe the object (or resource) being accessed, e.g. the object type, the department, the location, the cost...
• Contextual (environment) attributes: attributes that deal with time, location or dynamic aspects of the access control scenario.

ABAC is also concerned with policies and rules. These rules are based on the privileges of subjects and on how resources or objects are to be protected under which environmental conditions, in order to determine whether access is allowed or not.

Definition 2 (ABAC relation). We define the following relation: a subject-operation-object (SOPO) permission tuple is a tuple containing a subject s ∈ S, an operation op ∈ OP, and an object o ∈ O. This tuple means that subject s has permission to perform operation op on object o. For convenience we assume the following standard attributes:
• #SubjectAttribute(s) is a function that returns a set of pairs (attribute name, value) of the subject.
• #ObjectAttribute(o) is a function that returns a set of pairs (attribute name, value) of the object.
• #EnvironmentalAttribute() is a function that returns a set of pairs (attribute name, value) of the environmental attributes.

Definition 3 (ABAC rule). A rule ABAC-R is a tuple ⟨#SubjectAttribute(s), SOPO, #ObjectAttribute(o), #EnvironmentalAttribute()⟩. The ABAC rule is defined as follows: #SubjectAttribute(s) ∧ SOPO ∧ #EnvironmentalAttribute() ⇒ #ObjectAttribute(o).

IV. MINING ABAC RULES

A. Approach overview

In this section, we present an overview of our automated approach for extracting ABAC rules from event logs. The input is an event log that contains the attributes defined in Definition 1. The output of the algorithm is the set of ABAC rules needed for an ABAC model.

Algorithm 1 Building an attribute based access control guidance model
1) Input: E
2) Output: ABAC-R
3) for each e ∈ E
4)    E-attributes = get-Attributes (e)
5)    E-relations = get-Relations (e, E-attributes)
6)    ABAC-attributes = get-Attributes (e)
7)    ABAC-relations = get-Relations (e, ABAC-attributes)
8)    Mapping (E-attributes, E-relations, ABAC-attributes, ABAC-relations)
9)    E-saved = Save-To-ARFF (E-attributes)
10) end for
11) ABAC-R = Apriori (E-saved, minS, minC)

Table I
MAPPING BETWEEN EVENT LOGS' ATTRIBUTES AND ABAC MODELS' ATTRIBUTES

Event Log                                   Attribute Based Access Control
Resources                                   Subject Attribute
Object Data                                 Object Attribute
Contextual Attribute                        Environmental Attribute
Activity                                    Operation
Resource-Activity-Object Data (RAO)         Subject-Operation-Object (SOPO)

The algorithm proceeds in four main steps. In the first step, the event log is analyzed to extract the attributes and relations needed in our model, which are stored in E-attributes and E-relations respectively (lines 4, 5). In the second step, the attribute based access control model is analyzed to extract its main attributes and relations, which are stored in ABAC-attributes and ABAC-relations respectively (lines 6, 7). The third step maps the attributes and relations of the event log to the attributes and relations of attribute based access control (line 8).
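The attribute mapping step can be illustrated with a small Python sketch. It is not the authors' Java implementation: the dictionary `CATEGORY_OF`, the helpers `to_abac_record` and `is_usable`, the attribute names, and the case ID value are assumptions made for this example, with the remaining values borrowed from the rule of the running example.

```python
from typing import Dict

# Hypothetical mapping from event-log attribute names to ABAC categories,
# in the spirit of Table I (resource -> subject, object data -> object, ...).
CATEGORY_OF: Dict[str, str] = {
    "resource": "subject",
    "role": "subject",
    "status": "subject",
    "activity": "operation",
    "cost": "object",
    "customerID": "object",
    "timestamp": "environment",
}

CATEGORIES = ("subject", "operation", "object", "environment")


def to_abac_record(event: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Group the attributes of one event by ABAC category."""
    record: Dict[str, Dict[str, str]] = {c: {} for c in CATEGORIES}
    for name, value in event.items():
        category = CATEGORY_OF.get(name)
        if category is not None:  # unmapped attributes (e.g. the case ID) are skipped
            record[category][name] = value
    return record


def is_usable(record: Dict[str, Dict[str, str]]) -> bool:
    """True if the event carries at least one attribute in every category."""
    return all(record[c] for c in CATEGORIES)


# Illustrative event; the case ID is made up for the example.
event = {
    "caseID": "1",
    "timestamp": "10/2/2105 13:20",
    "activity": "decide",
    "resource": "Ellen",
    "role": "Manager",
    "status": "2",
    "cost": "8",
    "customerID": "1718",
}
record = to_abac_record(event)
```

Grouping the attributes by category first keeps the later rule filtering simple: a candidate rule only has to be checked against the category of each attribute, not against individual attribute names.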
The fourth and last step generates the set of attribute based access control rules (line 11).

B. Extracting attribute based access control rules from business process event logs

We present the approach in two main steps. In the first step, we establish a mapping between event logs and ABAC models by analyzing the event log and mapping the event log's attributes to the ABAC model's attributes. In the second step, we apply the Apriori algorithm to extract ABAC rules.

C. Mapping between event logs and ABAC models:

The mapping between an event log and an ABAC model is obtained in two parts. In the first part, we analyze the event log to extract the attributes needed. In the second part, we map the attributes extracted from the event log to ABAC model attributes.

1) Analyzing the event log: In this step, we analyze the event log given as input and show how it can be used to extract the attributes needed for creating an ABAC model. The class diagram in Figure 2 summarizes the analysis for any event log. We extended the original event log so that its attributes can be used in creating the ABAC model. An event log is composed of traces that contain a number of events. Each event has a set of attributes, some of which we modified as follows:
• Resource: name of the resource that initiates the event. It can also contain the role, age, and status of the resource.
• Contextual attributes: for example the time-stamp, i.e. the date and time at which the event has been executed, establishing an order between the events. They can also contain the location, the department, ...
• Object Data: data attributes related to the event. They can contain the cost, the type, ...

After analyzing the event log, we observe that any attribute can be classified into one of these categories. The main point is to extract attributes that match the above elements, so that every event has at least the following attributes: activity, object data, contextual attribute, and resource.

Moreover, we can infer a Resource-Activity-ObjectData relation from these attributes, which is defined as follows.

Definition 4 (Event relation). Resource-Activity-ObjectData (RAO) assignments: the relation RAO is defined as RAO = {(r, a, od) ∈ R × A × OD | ∃ e ∈ E, #resource(e) = r ∧ #activity(e) = a ∧ #objectData(e) = od}. The RAO relation holds if at least one event e ∈ E in which resource r ∈ R executes activity a ∈ A on object data od ∈ OD is recorded in the event log. We say that resource r is assigned to activity a, which accesses object data od.

2) Mapping event logs' attributes to ABAC models' attributes: Table I shows a mapping in which a resource is the subject attribute, an object data is the object attribute, the contextual attribute is the environmental attribute, and the activity is the operation.

D. Apriori-based approach for extracting ABAC rules:

The goal here is to find associations of items that occur together more often than one would expect from a random sampling of all possibilities. Apriori starts by selecting the frequent single attributes, then generates the candidate pairs of attributes from the frequent singles, and so on, until it has found all frequent sets of attributes over the events in the event log. It uses support, a well-known metric, to compute the frequency of a set of correlated attributes in the event log. The support of a set of correlated attributes is defined as the fraction of events in the event log in which these attributes appear together:

\[ S = \frac{|A_e|}{|E|} \qquad (1) \]

where |A_e| is the number of events that contain the correlated attributes and |E| is the number of events in the event log. The support is equal to 1 if all the events in the event log contain the correlated attributes. The support is equal to 0 if none of the events in the event log contain the corresponding attributes together. A set of correlated attributes is frequent if its support is above a given threshold minS.

In the second step of the algorithm, the set of relevant attribute based access control rules of the form LHS ⇒ RHS is derived from the frequent correlated attributes. In order to keep only relevant rules, the confidence metric is computed to evaluate the probability of occurrence of a rule. The confidence of an attribute based access control rule LHS ⇒ RHS is defined as the probability of occurrence of the attributes in the right-hand side RHS given that the attributes in the left-hand side LHS occur. In order to obtain ABAC rules as in Definition 3, we filter the derived rules by their attributes. The left-hand side must contain the subject attribute, the environmental attribute, and the operation that the user performs, whereas the right-hand side must contain the object attribute that describes the data that is accessed. Rules: #SubjectAttribute(s) ∧ SOPO ∧ #EnvironmentalAttribute() ⇒ #ObjectAttribute(o).

\[ C = \frac{Sup(LHS \cup RHS)}{Sup(LHS)} \qquad (2) \]

where Sup(LHS ∪ RHS) is the support of the attributes (Subject Attributes, Environmental Attribute, Object Attributes, and Operation) in the left-hand and right-hand sides of the rule, and Sup(LHS) is the support of the attributes in the left-hand side (Subject Attributes, Operation, and Environmental Attribute).

When applying the Apriori algorithm to these events, most rules turn out to be infrequent, since the time-stamps in these rules are numerical values and each event has a different time. We therefore discretize the time-stamp, i.e. we specify that a certain event needs to be executed at a specified date. Time may be absolute or relative, and the granularity needs to be considered.

Definition 5 (Absolute interval time-stamp). It is a time-stamp pattern constraint that defines a punctual temporal structure and refers to the start and finish times of an activity [8]:
1) Must Start On (MSO), Must Finish On (MFO): indicate the exact time at which an activity must be scheduled to begin or complete;
2) Start No Earlier Than (SNET), Finish No Earlier Than (FNET): indicate the earliest possible time at which an activity can begin or complete;
3) Start No Later Than (SNLT), Finish No Later Than (FNLT): indicate the latest possible time at which an activity is to begin or complete;
4) Start No Earlier Than (SNET), Finish No Later Than (FNLT): indicate the earliest possible time at which an activity is to begin and the latest possible time at which it is to complete.

In our approach we use constraint 4), where SNET indicates the minimum time and FNLT indicates the maximum time. Each event e ∈ E has the event attributes #timestamp(e), #resource(e), #activity(e), and #object(e). We compare #timestamp(e) over all events e that have the same #resource(e), #activity(e), and #object(e) to obtain the minimum time-stamp among these events. We then replace #timestamp(e) of all these events with this minimum. The maximum time-stamp is computed similarly.

Definition 6 (Relative time-stamp). It is a time-stamp pattern constraint that refers to a dependency between activities. This dependency is a relationship between two activities, a_q and a_l where q is not equal to l, in which one activity depends on the start or finish of another in order to begin or end [6].

In our approach, this relationship depends on the start event of each trace compared with the other events in the same trace. It is defined as follows: for each trace T in the event log and each event e ∈ T, we obtain the start event se ∈ T, which is the first event of the trace. The time-stamp is split into years, days, hours, minutes, and seconds. We compare #timestamp(se) - #timestamp(e) over the events e that have the same #resource(e), #activity(e), and #object(e) to obtain the minimum Years(e), Day(e), Hours(e), Minutes(e), Seconds(e) among these events. We then replace #timestamp(e) of all events that have the same #resource(e), #activity(e), and #object(e) with these minimum values.

V. EVALUATION AND VALIDATION

We implemented our approach as a Java application that generates ABAC rules. We integrated the ProM libraries (http://www.promtools.org/) in order to import and extract information from an event log. We also used "filter events" in ProM to reduce the impact of noise on the data. ProM is an open source framework for implementing process mining techniques. Moreover, we integrated the Weka software, which contains a collection of visualization tools and algorithms for data analysis and predictive modeling, to generate rules using the Apriori algorithm [9]. The user can choose different values of the support and confidence when applying the Apriori algorithm to generate the ABAC rules.

To evaluate our approach, we used a real-life event log provided by the BPI Challenge 2018 [12]. The data set consists of real business processes for EU direct payments for German farmers. The event log consists of 43809 traces over a period of three years. Tables II and III show statistics about the number of traces, events, and values of each attribute before and after filtering noise from the event log, respectively. The attributes used in the event log are Resource, Activity, Document Type, and Time-stamp. These are classified into ABAC attributes as follows:
• Subject Attribute: Resource
• Object Attribute: Document Type
• Environmental Attributes: Time-stamp
The Activity is the operation that the subject performs on the object according to these attributes. Moreover, Table III lists the attributes Absolute Time-stamp, Years, Minutes, Seconds, Days, and Hours, which are obtained after applying both the relative and the absolute time-stamp discretizations defined in our approach.

Table II
STATISTICS ON THE BPI CHALLENGE 2018 EVENT LOG

Traces                             43809
Events                             2514266
values in Activity                 41
values in Resource                 165
values in Document Type            8

Table III
STATISTICS ON THE BPI CHALLENGE 2018 EVENT LOG AFTER FILTERING

Traces                             108
Events                             6183
values in Activity                 34
values in Resource                 102
values in Document Type            8
values in Absolute Time-stamp      25
values in Years                    3
values in Minutes                  38
values in Seconds                  47
values in Days                     7
values in Hours                    21

In the first experiment, we evaluate the quality of the approach by calculating the complexity of the extracted rules. This is studied by counting the number of rules that are derived before and after filtering the attributes to derive ABAC rules, and by analyzing the impact of the Apriori support and confidence values (detailed in Section V-A). In the second experiment, we evaluate the quality of the approach by measuring the completeness of the extracted rules. This is studied by counting the average number of values extracted for the different attributes (detailed in Section V-B).

A. Complexity of the rules and parameter impact

In this experiment, we study the complexity of our approach, which is expressed in terms of the number of extracted rules. The higher the number of extracted rules, the higher the complexity. Our objective is to study the impact of different support and confidence values on the number of extracted rules. To do so, we compute (1) the number of extracted rules before filtering, which includes all the rules generated by applying the Apriori algorithm, and (2) the number of extracted rules after filtering, which includes only the rules that contain the selected attributes for the ABAC. The average number of extracted rules for different support and confidence values is shown in Figure 4.

Figure 4. Bar graph that shows the number of rules before and after filtering the attributes according to different support and confidence values (support S and confidence C each take the values 0.1, 0.5 and 0.8)

First, the results clearly show that the number of rules decreases when the support value increases. The same applies to the confidence value. However, the effect of the support value is much more noticeable than that of the confidence value. This means that frequency plays an important role in the context of access control. Highly frequent rules may refer to access control rules that are respected in an organization, while less frequent rules may indicate a violation of permissions and therefore need a closer inspection by experts. Second, the number of rules greatly decreases after filtering the attributes. This is because an ABAC rule is valid only if it includes the following mandatory attributes: the name attribute of the activity, at least one attribute of the subject (e.g. name, role, etc.), the timestamp attribute of the environmental attributes, and at least one attribute of the object. An ABAC rule should also follow the format (Subject attributes, Activity attributes, Environmental attributes ⇒ Object attributes). By imposing such constraints, we decrease the complexity of our approach by discarding the high number of rules that are generated by Apriori but are less interesting in the context of access control.

B. Completeness of the rules and parameter impacts

In the second experiment, we study the completeness of our approach, which is expressed in terms of the number of attribute values extracted in the rules compared with the total number of attribute values in the event log. On the one hand, a high number of attribute values per rule means that we are able to cover the association between the attributes of all ABAC elements from the event log. On the other hand, a high percentage of retrieved values per attribute means that our rules cover all possible elements in the event log and that we are able to find all possible ABAC rules. Our objective is to study the impact of different support and confidence values on the number of extracted attribute values. To do so, we study (1) the number of extracted attribute values per rule before filtering, which includes all the rules generated by applying the Apriori algorithm, and (2) the number of extracted attribute values per rule after filtering, which includes only the rules that contain the selected attributes for the ABAC. Moreover, we study the relation between the complexity and the completeness of the extracted attribute based access control rules. We study the average number of values of each attribute found in the extracted rules compared to the total number of attribute values in the event log, which can be seen in Table III. Figure 5 shows the average number of attribute values after filtering the attributes so that the rules contain the selected attributes for the ABAC.

Figure 5. Bar graph that shows the average number of values retrieved per attribute according to different support and confidence values (vertical axis: % of values retrieved for each attribute; horizontal axis: ABAC attributes extracted from the event log, namely Activity, Resource, Document Type, Absolute Time-stamp, Years, Days, Hours, Minutes, Seconds)

First, the results clearly show that the number of attribute values in the rules, both before and after filtering, increases when the support value decreases. The same applies to the confidence. However, the effect of the support value is much more noticeable than that of the confidence value. This can be explained by the fact that, at high support thresholds, less frequent rules are not retained. This leads to the conclusion that the complexity of the ABAC rules is positively correlated with their completeness (i.e. when the complexity increases the completeness increases), while both the completeness and the complexity are negatively correlated with the support threshold value (i.e. when the support decreases, the completeness and complexity increase). Therefore, one has to find a compromise between the complexity and the completeness of the results.

In conclusion, the experimental results showed that the extracted attribute based access control rules are of good quality in terms of complexity and completeness.

VI. RELATED WORK

Our work is related to two research areas: access control models (detailed in Section VI-A) and automated extraction of access control models from event logs (detailed in Section VI-B).

A. Access Control Model

This section presents some of the access control mechanisms that computer security architects and administrators have developed to protect their objects by mediating requests from subjects. Each mechanism has its own advantages and limitations, but it is important to note the evolution of these models to fully appreciate the flexibility and applicability of the ABAC model. Some of these models are the following:
• MAC: Mandatory Access Control gives only the owner and guardian the management of the access controls. This means the end user has no control over any settings that provide privileges to anyone [4].
• DAC: According to [13], Discretionary Access Control does not supply a high security assurance for two reasons. First, the granting of access is transitive. Second, DAC policies are vulnerable to Trojan Horse attacks. A Trojan Horse program is one that appears to be doing one thing on the surface but actually does something more underneath without the knowledge of the user.
• IBAC: Identity Based Access Control captures the identity of the subjects that want to access an object. This approach is considered hard to manage.
• RBAC: Role Based Access Control model defines roles

B. Extracting Access Control Models from Event Logs

Among previous studies related to our topic, a closely related approach extracts Role-Based Access Control (RBAC) models from event logs [10]. This approach consists of three main steps in which the authors aim to derive an RBAC model from an event log. In the first step, called analysis, a process mining technique is used to extract process-related data from an event log in XES format. In the next step, the extracted data can be transformed into an in-memory RBAC model. Before that, minor adjustments can be made to the extracted data, so that the data and relationships reflect the actual settings of the business process. In the final step, the RBAC model is exported to an XML-based format in order to support data exchange between different applications, e.g., information systems could implement the RBAC model or access policy management systems could be used to enhance the RBAC model. ABAC models can be seen as a generalization of RBAC models, i.e. the core element of RBAC is the "role", which becomes an attribute in ABAC.

VII. CONCLUSION AND FUTURE WORK

In this paper, we proposed an automated approach to extract ABAC models from event logs, which record the actual execution of business processes. Given an event log as input, we proposed to extract the attributes needed for an ABAC model and to learn the ABAC rules automatically. We used process mining and data mining techniques. The output of our approach is a set of ABAC rules of the form SubjectAttribute(s) ∧ SOPO ∧ EnvironmentalAttribute() ⇒ ObjectAttribute(o).

We validated our approach by implementing it as a Java application. We also performed experiments on a real-life event log from the BPI Challenge 2018. Experimental results showed that our approach is feasible and that frequency is an important factor in the context of access control. Highly frequent rules may indicate that the right permissions are respected in an organization. If our approach is used to create a new access control model, these rules help experts to set up their access control model. On the other hand, infrequent rules may indicate a violation of permissions and need to be closely inspected by experts.

We aim to extend our work in two different directions. First, we plan to perform more experiments with real-life event logs to validate and generalize our results. Second, there has recently been a growing interest in cloud-based process mining solutions. This has raised issues about privacy and the call for cryptography techniques to secure event logs.
Therefore, that carry specific set of privilege that a subject request we plan to extend our approach to extract ABAC rules from and by the access for object owner when determining the secured event logs in order to check if the intended security privilege associated with each role [3].The big issue with is actually respected. this access control model is that if one requires access to other files that he doesn’t has permission to, he has R EFERENCES to find another way to do it since the roles are only [1] Wil M. P. van der Aalst. Process Mining: Data Science associated with the position; otherwise, security managers in Action. 2nd ed. Heidelberg: Springer, 2016. ISBN: from other organizations could possibly get access to files 978-3-662-49850-7. DOI: 10.1007/978-3-662-49851-4. they are unauthorized for [2]. 44 [2] Mark Ciampa. Security+ guide to network security fundamentals. Cengage Learning, 2012. [3] DF Ferraiolo and DR Kuhn. “Natl Institute of Standards and Tech., Dept. of Commerce, Maryland, Role-Based Access Control”. In: Proceedings of 15th Natl Com- puter Security Conference. 1992. [4] Virginia Nunes Leal Franqueira. “Access control from an intrusion detection perspective”. In: (2006). [5] Christian W Günther and Eric Verbeek. “Standard defi- nition”. In: Fluxicon Process Laboratories, XES Version 1 (2012). [6] Rania Ben Halima et al. “Scheduling Business Process Activities for Time-Aware Cloud Resource Allocation”. In: OTM Confederated International Conferences” On the Move to Meaningful Internet Systems”. Springer. 2018, pp. 445–462. [7] Vincent C Hu et al. “Guide to attribute based access control (ABAC) definition and considerations (draft)”. In: NIST special publication 800.162 (2013). [8] Cosmina Cristina Niculae. “Time patterns in workflow management systems”. In: BPM Center Report BPM- 11-04, available online at http://bpmcenter. org/wp- content/uploads/reports/2011/BPM-l l-04. pdf (2011). [9] Weka Team. Use WEKA in your Java code. 2016. 
[10] Taivo Teder. “Extracting Role-Based Access Control Models from Business Process Event Logs”.
[11] Wil Van Der Aalst. Process Mining: Discovery, Conformance and Enhancement of Business Processes. Vol. 2. Springer, 2011.
[12] Van Dongen, B.F. (Boudewijn) and Borchert, F. (Florian). BPI Challenge 2018. 2018. DOI: 10.4121/UUID:3301445F-95E8-4FF0-98A4-901F1F204972.
[13] Younis A Younis, Kashif Kifayat, and Madjid Merabti. “An access control model for cloud computing”. In: Journal of Information Security and Applications 19.1 (2014), pp. 45–60.
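As an illustration of the rule-validity constraint stated in the experimental discussion (a rule must contain the activity name, at least one subject attribute, the timestamp, and at least one object attribute, and must have the form subject, activity, environment ⇒ object), the following is a minimal sketch in Python. It is not the authors' Java implementation. The attribute names (`resource`, `role`, `activity`, `timestamp`, `doctype`, `document_id`), the `Rule` class, and the example rules are hypothetical and chosen only for illustration; the additional check that the consequent holds object attributes only is our reading of the rule format, not something the paper states in code.

```python
from dataclasses import dataclass

# Hypothetical attribute categories (assumption: not taken from the paper's code).
SUBJECT_ATTRS = {"resource", "role"}
OBJECT_ATTRS = {"doctype", "document_id"}
ACTIVITY_NAME = "activity"
TIMESTAMP = "timestamp"


@dataclass
class Rule:
    """An association rule: antecedent => consequent, with its quality measures."""
    antecedent: dict
    consequent: dict
    support: float
    confidence: float


def is_valid_abac_rule(rule):
    """Check the mandatory-attribute constraint on a mined rule."""
    lhs = set(rule.antecedent)
    rhs = set(rule.consequent)
    return (
        ACTIVITY_NAME in lhs              # name attribute of the activity
        and TIMESTAMP in lhs              # timestamp (environmental attribute)
        and bool(lhs & SUBJECT_ATTRS)     # at least one subject attribute
        and bool(rhs)                     # at least one object attribute ...
        and rhs <= OBJECT_ATTRS           # ... and nothing else in the consequent
    )


def filter_rules(rules, min_support, min_confidence):
    """Keep rules that meet both thresholds and the ABAC validity constraint."""
    return [
        r for r in rules
        if r.support >= min_support
        and r.confidence >= min_confidence
        and is_valid_abac_rule(r)
    ]


# Illustrative rules only; the values are made up and are not experimental data.
EXAMPLE_RULES = [
    Rule({"activity": "submit", "resource": "alice", "timestamp": "2018"},
         {"doctype": "application"}, support=0.5, confidence=0.9),
    # Missing subject attribute: rejected by the validity check.
    Rule({"activity": "submit", "timestamp": "2018"},
         {"doctype": "application"}, support=0.6, confidence=0.9),
    # Valid shape but infrequent: rejected at a support threshold of 0.1.
    Rule({"activity": "review", "role": "clerk", "timestamp": "2018"},
         {"document_id": "d1"}, support=0.05, confidence=0.9),
    # Consequent is not an object attribute: rejected by the validity check.
    Rule({"activity": "submit", "role": "clerk", "timestamp": "2018"},
         {"resource": "alice"}, support=0.7, confidence=0.9),
]

if __name__ == "__main__":
    for rule in filter_rules(EXAMPLE_RULES, min_support=0.1, min_confidence=0.5):
        print(rule)
```

Lowering `min_support` in this sketch lets the infrequent third rule through, which mirrors the observation that lower support thresholds yield more rules and more attribute values at the cost of higher complexity.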