Legitimate Open-ended Dissemination of Personal Information

Srinath Srinivasa, Jayati Deshmukh
International Institute of Information Technology, Bangalore, India 560100

Abstract
Personal and sensitive information about individuals often needs to be legitimately exchanged among different stakeholders to provide services, maintain public health, law and order, and so on. While such exchanges are necessary, they also impose enormous privacy and security challenges. Data protection laws like GDPR specify conditions and the legal capacity in which personal information can be solicited and disseminated further. But there is a dearth of formalisms for specifying legal capacities and jurisdictional boundaries, so that open-ended exchange of sensitive data can be implemented. This paper proposes an extensible framework called Multiverse in which sensitive data can flow across a network through “role tunnels” established based on corresponding legal capacities.

Keywords
Privacy, Personal information, Legal capacity, Role Tunnel

International Semantic Intelligence Conference (ISIC), February 25-27, 2021, New Delhi, India
Email: sri@iiitb.ac.in (S. Srinivasa); jayati.deshmukh@iiitb.org (J. Deshmukh)
ORCID: 0000-0001-9588-6550 (S. Srinivasa); 0000-0002-1144-2635 (J. Deshmukh)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

A number of services require processing and exchange of private and personal information of individuals. For instance, medical records may need to be exchanged between specialists across hospitals. Similarly, personal information like academic credentials, driving history, credit rating, etc. is routinely exchanged across multiple institutions for providing services.

Technologies like blockchain provide distributed ledgers and audit trails that protect the integrity of the information exchange. However, there is still a need for formalisms for encoding and enforcing the policy governing the exchange of sensitive data.

Current-day inter-organizational information exchange is usually modelled in the form of web services that implement authentication and access-control mechanisms to regulate the exchange [1, 2]. Here, specific applications are granted access by having them register with the web service, and providing them with an application id and access keys. In such mechanisms, privileges are tightly connected to the identity of the client application and its owner. This makes it difficult to seamlessly extend access rights to other legitimate users. For instance, if a person who is handling sensitive data through their authorized application is incapacitated or deceased, their successor cannot seamlessly take on their role unless they have the identity-based access credentials.

Research literature has long advocated Role-Based Access Control (RBAC) mechanisms to greatly simplify the specification of access privilege policies [3, 4, 5]. A role represents a competency to do a particular operation, and it connects a set of people or applications with a set of privileges. Access privileges are associated with roles rather than with individuals, and the association of individuals with roles is dynamic. Authentication now involves not only proving one's identity, but also proving one's role.

RBAC models are typically implemented within an organizational context. This means that the RBAC mechanism is situated within a larger semantic framework that establishes associations of users with roles. RBAC frameworks have also been extended for inter-organizational workflows [6, 7, 8]. Several approaches are adopted for extending an RBAC framework across organizations. These include the creation of “virtual organizations” representing role granting authorities for inter-organizational interactions, and/or the mapping of roles across organizations.

More recently, there has been increased interest in open-ended data dissemination in the form of “open data” for the greater common good. With increasing numbers of governance and administrative workflows appearing online, there is also an increased need for exchanging data across several entities with little or no inter-organizational authority for managing the integrity of data exchange.

While open data improves transparency and accountability in public workflows, it also brings with it challenges of privacy and security, leading to several contradictory requirements [9, 10, 11, 12, 13]. Specifically, open-ended data exchange is characterized by three divergent concerns [13]: transparency, privacy, and security. Transparency requires relevant data to be shared publicly in order to uphold the integrity of a public action. Privacy, on the other hand, requires data to be withheld in order to protect the dignity and liberty of individuals. Security pertains to the collective good, where open-ended sharing of certain sensitive information can endanger a community or country.

In addition to the above concerns, there is also a need for legitimate open-ended sharing of private and/or sensitive information in times of crisis, to protect public health and order. For instance, public health management during the Covid crisis requires private and sensitive information about patients suffering from Covid to be shared with several stakeholders like doctors, administrators, volunteer organizations, etc.

In such cases, there is no overarching virtual organizational structure, or role granting and mapping authority. The number of disparate entities requiring the data may keep changing over time, and may not be known a priori. This makes it infeasible to apply existing approaches to inter-organizational privilege management.

In this paper, we propose a modular, extensible framework called “Multiverse” to manage legitimate exchanges of sensitive data in an open-ended fashion, without the need for an overarching organizational framework to enforce the integrity of data exchange. In Section 2 we discuss some of the existing models of access control systems. Details of the “Multiverse” framework are presented in Section 3. Section 4 discusses a variety of adversarial scenarios and how they can be handled by the “Multiverse” framework, and Section 5 presents a couple of case studies where it is useful. Conclusions and future directions are presented in Section 6.

2. Related Work

Access control systems act as a mediator between users and data or resources, to grant or deny access based on the underlying security policy [14]. Access control systems can be broadly classified into two categories: encryption based systems and proof based systems [15]. Encryption based systems encrypt the data and send it off to the individual. The individual needs to have the appropriate key in order to decrypt the data. Proof based systems, on the other hand, require the individual to produce all the necessary proofs required to authenticate their identity, and the data is shared with the individual after authentication. It is difficult to provide fine-grained access control using encryption based methods without increasing the number of keys, and it is computationally expensive to manage these systems, especially at a large scale. Proof based methods and their variants, on the other hand, can better handle fine-grained granularity and constraints.

Individuals can be granted or revoked access based on their identity, which is known as Identity-Based Access Control (IBAC) [16, 15]. In these designs, individuals need to prove their identity using authentication techniques like passwords, biometrics, or combinations of public and private keys. Once an individual's identity is authenticated, they can access the required data. However, in large organizations and teams spanning multiple organizations, it is difficult and cumbersome to manage access controls of all the stakeholders individually in this manner.

Role-based access control (RBAC) methods [3, 5] were designed so that permissions can be granted to users based on their roles rather than their identity. This access control design is more effective, as changing the roles of an individual automatically updates their privileges. Roles can be assigned to individuals by an authorized individual. An RBAC policy is designed using role-permission, user-role and role-role relationships. There are variants of RBAC models that can handle role hierarchies, constraints, triggers and temporal dependencies, teams within the organization, etc. [5, 3, 17].

Using open data has its own benefits as well as challenges [18]. However, it is difficult to make appropriate use of open data in the absence of open data management systems that can handle large volumes of diverse data such that it can be securely accessed by different types of users. There are a few open data management systems [19, 20], but most of these systems focus on data cleaning and pre-processing so that the data can be stored in a database, graph, etc. To the best of our knowledge, there does not exist a system that combines data pre-processing, data storage, data retrieval and data visualization with secure access control of data, specifically for open data systems.

Data containing personally identifiable information (PII) needs to be handled much more carefully than, say, population-level aggregate data, since it can reveal the identity of individuals and in turn put them at risk. For example, in a healthcare setting even anonymized data can be used to infer a patient's identity based on their diagnosis details, location, etc. [21, 22, 23]. There are encryption based and data masking techniques to manage access to personal and personally identifiable data. However, there is a dearth of computational models that can define access mechanisms for data aligned to the laws of the relevant region or country. This is especially crucial with data protection laws like the EU's General Data Protection Regulation (GDPR), Sweden's Data Act, the Philippines' Data Privacy Act, India's Personal Data Protection Bill, California's Consumer Privacy Act, etc. being defined around the world.

3. The Multiverse Framework

This section describes the proposed framework, called “Multiverse”, for creating an extensible, open-ended infrastructure for legitimate exchange of data. A Multiverse framework, also called a “Frame” $F$, is made up of the following building blocks:

$$F = (W, D, A, T) \tag{1}$$

Here $W$ is a set of containers called “worlds” that represent the semantic boundary or legal jurisdiction within which a data element is accessed and processed. $D$ represents the set of all data elements or “resources” that are being shared. The term $A$ represents “agents”, which could be users or application programs that produce and consume resources. The term $T$ represents a set of “templates”, where each template defines a set of access points through which data may be accessed, and a set of relationship types with which worlds can be related.

A world represents the basic unit within which data is accessed. Every agent $a \in A$ has its own corresponding world, named $\mathit{world}(a)$. In addition to representing agents, a world could represent any semantic entity or legal jurisdiction. Some examples of worlds include institutions, town municipalities, undivided families, resident welfare societies of communities, etc. Every data element is published within the boundaries of a world, and data is only exchanged between worlds. An agent $a$ only publishes its data into $\mathit{world}(a)$, and also reads data only from $\mathit{world}(a)$.

Figure 1: Multiverse framework

Figure 1 depicts a multiverse schematically. The multiverse is a network of worlds connected by one or more relations defined in templates. Data are published within worlds and exchanged between them based on a system of legal capacities explained later on. Agents, which include users and application programs, lie outside of the multiverse cloud, but have a representation for themselves in the form of a semantic world within the multiverse.

Worlds could be contained within one another. If world $w_2$ is contained within world $w_1$, this is represented as $w_2 \rhd w_1$. The containment of a world is called its “jurisdictional location” or simply “location”, and it represents a system of privilege inheritance explained later on.

Each world may implement one or more templates $t \in T$ that give it a semantic characterization in the form of a set of data access points and relationship types with other worlds. Any template $t \in T$ is made up of the following elements:

$$t = (\mathit{Dat}, \mathit{Rel}) \tag{2}$$
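As a rough illustration, Eq. 1 and Eq. 2 can be read as plain record types. The Python sketch below is not taken from the paper: the class and field names (`Template`, `World`, `Frame`, `uid`, `location`, `is_within`, `add_agent`, `publish`) are assumptions chosen for readability, the set $D$ is represented implicitly as the resources held by each world, and containment is assumed to be transitive.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Template:
    """t = (Dat, Rel); the two parts are left as opaque lists in this sketch."""
    uid: str                                       # hypothetical unique identifier
    dat: list = field(default_factory=list)        # data access points
    rel: list = field(default_factory=list)        # relationship specifications


@dataclass
class World:
    """A container for data; `location` is the enclosing world, if any."""
    name: str
    location: Optional["World"] = None
    templates: list = field(default_factory=list)
    resources: dict = field(default_factory=dict)  # this world's share of D

    def is_within(self, other: "World") -> bool:
        """True if this world is contained in `other` (assumed transitive)."""
        w = self.location
        while w is not None:
            if w is other:
                return True
            w = w.location
        return False


@dataclass
class Frame:
    """F = (W, D, A, T), with D spread over the worlds' `resources`."""
    worlds: dict = field(default_factory=dict)     # W, keyed by world name
    agents: dict = field(default_factory=dict)     # A: agent id -> name of world(a)
    templates: dict = field(default_factory=dict)  # T, keyed by template uid

    def add_agent(self, agent_id: str) -> World:
        """Every agent a gets its own world(a)."""
        w = World(name=agent_id)
        self.worlds[w.name] = w
        self.agents[agent_id] = w.name
        return w

    def publish(self, agent_id: str, key: str, data) -> None:
        """An agent publishes only into its own world."""
        self.worlds[self.agents[agent_id]].resources[key] = data
```

Keeping `location` as a single parent pointer is a design choice of this sketch; it makes a world's jurisdictional location a simple upward walk.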
In Eq. 2, $\mathit{Dat}$ represents a set of data access points, and $\mathit{Rel}$ represents a set of relationship specifications. A data access point represents a gated interface through which a given data element may be accessed. Any data access point $\mathit{dat} \in \mathit{Dat}$ comprises the following elements:

$$\mathit{dat} = (Q, C, P) \tag{3}$$

Here $Q$ is the query with which the data element is accessed. The terms $C$ and $P$ represent legal capacity and purpose code respectively, which are both explained later.

The relationship specifications $\mathit{Rel}$ specify the kinds of relationships that the world can establish with other worlds, as well as the kinds of relationships that the world can accept from other worlds.

A relationship specification may be of two kinds: an incoming relationship specification and an outgoing relationship specification. These are represented as $\mathit{Rel}_i$ and $\mathit{Rel}_o$ respectively. A relationship specification comprises the following elements:

$$\mathit{rel}_i = (\mathit{name}, \mathit{constraints}, \mathit{privileges}, \mathit{purposes}) \tag{4}$$

$$\mathit{rel}_o = (\mathit{name}, \mathit{constraints}, \mathit{roles}) \tag{5}$$

A relationship has a name at its incoming end and at its outgoing end. The incoming relationship name is also called a role. Any agent entering a world through a relationship whose incoming name is $r$ is said to be playing the role $r$ in that world.

Both outgoing and incoming relationships are subject to a set of constraints. Table 1 specifies the different kinds of constraints on a relationship. The template constraint $\mathit{implements}(t)$ in an outgoing relationship specification means that the target world with which the relationship is being established should be implementing template $t$. Similarly, such a constraint on an incoming relationship means that the relationship can be accepted only if the source world is implementing template $t$. Here, the reference to template $t$ is in the form of a universally unique ID like a URI.

For example, in a template called $\mathit{Person}$, we can specify that an outgoing relationship called $\mathit{Employee}$ can be established with a target world only if that world implements a template called $\mathit{Company}$. Similarly, a template called $\mathit{Company}$ can have an incoming relationship called $\mathit{Employee}$, with a constraint that the source world should have implemented a template called $\mathit{Person}$.

The $\mathit{relt}(\mathit{name}, t)$ constraint specifies that the target or source world should have a relationship named $\mathit{name}$ with a world that implements template $t$.

For example, in the template specification of a $\mathit{Person}$, we can specify a relationship called $\mathit{Patient}$ with another world $w$, only if $w$ has a relationship named $\mathit{Doctor}$ with a world that is implementing a template called $\mathit{Hospital}$. In other words, a person can be related to another person as a patient only if the other person is a doctor at some hospital.

In addition to template and relationship specifications, a constraint could also identify specific worlds by their unique identifiers, using the $\mathit{relid}(\mathit{name}, \mathit{id})$ specification.

Any given $\mathit{rel}_i$ or $\mathit{rel}_o$ specification may have multiple constraints specified. In such cases, all the specified constraints need to be satisfied for an instance of the relationship to be formed.

The $\mathit{privileges}$ part of the relationship specification $\mathit{rel}_i$ represents the privileges that any agent obtains when traversing a relationship edge.

| Constraint | Specification | Meaning |
|---|---|---|
| Template | implements(t) | The source or target world needs to be implementing template $t$ for the relationship to be valid. |
| Template Relationship | relt(name, t) | The source or target world needs to have a relationship named $\mathit{name}$ with a world implementing template $t$. |
| Identity Relationship | relid(name, id) | The source or target world needs to have a relationship named $\mathit{name}$ with the world identified by $\mathit{id}$. |

Table 1: Relationship constraints
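The three constraint kinds in Table 1 can be read as predicates over the world at the other end of a relationship. The following Python sketch is only an illustration under stated assumptions: the in-memory `World` record, its `wid`, `templates` and `relations` fields, the helper `satisfied`, and the world named `sita` are hypothetical and do not come from the paper. The extra `implements("Person")` check in the example is likewise an assumption, read off the phrase “another person” in the text.

```python
from dataclasses import dataclass, field


@dataclass
class World:
    wid: str                                       # unique world identifier
    templates: set = field(default_factory=set)    # IDs of implemented templates
    relations: list = field(default_factory=list)  # (name, target World) pairs


def implements(t):
    """implements(t): the other world must implement template t."""
    return lambda other: t in other.templates


def relt(name, t):
    """relt(name, t): the other world has a relationship `name`
    with some world implementing template t."""
    return lambda other: any(
        n == name and t in w.templates for n, w in other.relations
    )


def relid(name, wid):
    """relid(name, id): the other world has a relationship `name`
    with the world identified by id."""
    return lambda other: any(
        n == name and w.wid == wid for n, w in other.relations
    )


def satisfied(constraints, other):
    """All listed constraints must hold for a relationship to be formed."""
    return all(check(other) for check in constraints)


# The Patient example: a person may relate to world w as Patient only if
# w is a Doctor at some world implementing the Hospital template.
fortis = World("fortis", templates={"Hospital"})
ram = World("ram", templates={"Person"}, relations=[("Doctor", fortis)])
sita = World("sita", templates={"Person"})          # hypothetical non-doctor

patient_constraints = [implements("Person"), relt("Doctor", "Hospital")]
print(satisfied(patient_constraints, ram))   # True
print(satisfied(patient_constraints, sita))  # False
```

Representing each constraint as a closure keeps the conjunction rule (“all the specified constraints need to be satisfied”) down to a single `all(...)` call.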
An incoming agent who enters a world through a relationship gets the role specified in $\mathit{rel}_i$ and the privileges associated with it. Table 2 details a set of privilege classes that apply, respectively, to operations over resources (including templates) and to the world itself. For example, a role having the resource.read privilege enables the role player to read resources hosted by this world.

| Privilege class | Privilege | Interpretation |
|---|---|---|
| Resource | read | all forms of read queries on a resource |
| Resource | write | write or modify a resource |
| Resource | delete | delete a resource |
| Resource | template | access templates visible to this world |
| World | edit | modify privileges on the current world, including management of templates and deleting the world |
| World | relocate | move the world from its current location to another location |
| World | create | privilege to create worlds inside this world |

Table 2: Role privileges

The $\mathit{purposes}$ element of $\mathit{rel}_i$ specifies a set of legitimate reasons or “purpose codes” for which a particular activity needs to be performed. Annotating each data access with a purpose code helps in establishing official legitimacy for the access. The purpose code is represented as an enumerated list of values.

The $\mathit{roles}$ element of $\mathit{rel}_o$ in Eq. 5 specifies the roles within the source world that are entitled to traverse the given relationship edge.

Hence, in a given world $w$, an outgoing relationship specification of the form $(r_o, c, p)$ represents that any agent playing the role $p$ can traverse the relationship edge $r_o$ to act as a representative of the source world $w$ in the target world.

Every world also has a role called $\mathit{owner}$, which trivially has all privileges. The creator of a world is its default owner, but may add other owners and/or give up the owner role to other agents.

When an agent traverses a relationship to reach a new world, the legal capacity in which the agent performs any operation in the target world is a concatenation of all the roles played by the agent, beginning from the world representing the agent.
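The way a legal capacity grows with each traversal can be sketched in a few lines of Python. This is an illustrative reading, not the authors' implementation: the `Edge` record and the helpers `traverse`, `render` and `allowed` are hypothetical names, relationship constraints are left out, and the world names are borrowed from the Dr. Ram example that the paper discusses with Figure 2. The privilege granted to `Advisor` is an assumed value for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One relationship instance between two worlds (constraints omitted)."""
    target: str           # name of the target world
    incoming_name: str    # name at the incoming end: the role played in the target
    roles: frozenset      # rel_o roles entitled to traverse this edge


def traverse(capacity, edge):
    """Cross `edge` and return the extended legal capacity.

    `capacity` is a list of (role, world) pairs, most recent first, so the
    agent's current role is the first entry.
    """
    current_role, _ = capacity[0]
    if current_role not in edge.roles:
        raise PermissionError(
            f"role {current_role!r} may not traverse to {edge.target}"
        )
    return [(edge.incoming_name, edge.target)] + capacity


def render(capacity):
    """Format a capacity as a colon-separated string of role(world) terms."""
    return " : ".join(f"{role}({world})" for role, world in capacity)


def allowed(role, privilege, role_privileges):
    """The owner role holds every privilege; other roles hold what was granted."""
    return role == "Owner" or privilege in role_privileges.get(role, set())


capacity = [("Owner", "Ram")]  # the agent starts in its own world
capacity = traverse(capacity, Edge("Fortis", "Doctor", frozenset({"Owner"})))
capacity = traverse(capacity, Edge("Sharada", "Advisor", frozenset({"Doctor"})))
print(render(capacity))  # Advisor(Sharada) : Doctor(Fortis) : Owner(Ram)

sharada_privileges = {"Advisor": {"resource.read"}}  # assumed grant
print(allowed("Advisor", "resource.read", sharada_privileges))    # True
print(allowed("Advisor", "resource.delete", sharada_privileges))  # False
```

Prepending each new role keeps the most recent role at the head of the list, which matches the left-to-right order in which the paper writes legal capacities.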
Figure 2 shows an example. Here, an agent who is a user named Dr. Ram is accessing some data stored in a world called Sharada. Sharada has implemented a template called Clinic, and it is in a relationship with another world called Fortis, which has implemented a template called Hospital. The world for the user Ram is also in a relationship with Fortis, with the role of Doctor. The relationship between the Hospital and the Clinic enables a Doctor of the Hospital to appear as an Advisor in the Clinic, which gives them some privileges over the data.

Figure 2: Role Tunneling

Here, when Dr. Ram accesses some data element $d$ stored at Sharada, the data access point would look as follows:

$$\mathit{dat} = (\mathit{read}(d),\ \mathit{Advisor}(\mathit{Sharada}) : \mathit{Doctor}(\mathit{Fortis}) : \mathit{Owner}(\mathit{Ram}),\ \mathit{Diagnostics})$$

The last term, $\mathit{Diagnostics}$, represents the purpose code, indicating the official purpose for which the data is being accessed. The string $\mathit{Advisor}(\mathit{Sharada}) : \mathit{Doctor}(\mathit{Fortis}) : \mathit{Owner}(\mathit{Ram})$ represents the legal capacity in which the access is being made. It is a string of role and world specifications that leads from the agent to the data source.

The string representing the legal capacity is called a Role Tunnel, since it creates a legal pathway from the agent to the dataset based on legal arrangements between worlds.

A role tunnel is valid if each element in the role tunnel satisfies its corresponding $\mathit{rel}_i$ constraints, and the last element in the tunnel represents the $\mathit{Owner}(p)$ role, where $p$ is the id of the agent performing the access.

Each resource stored in a world also has stored along with it the legal capacity by which it was brought there. Formally, a resource $s$ in a world has the following fields:

$$s = (d, c, \mathit{ttl}) \tag{6}$$

Here, $d$ is the data element, and $c$ is the legal capacity by which the data element came to be stored in the world. If the data element is local to the world and was not imported from elsewhere, the $c$ field is null.

A data element with a string of multiple roles for its legal capacity represents a remote data element brought in from a remote source. All remote data elements also have a “Time To Live (TTL)” parameter, which indicates the length of time for which the element can be stored at the remote location. After the TTL expires, the data needs to be fetched again through a legal role tunnel.

Every access of a data element involves checking the validity of the legal capacity. A role tunnel of the form $r_n(w_n) : \cdots : r_2(w_2) : r_1(w_1) : \mathit{Owner}(w)$ requires $n + 1$ integrity checks before the data access can be made possible. If the legal capacity of the agent fails to hold when accessing a remote data element that is cached in its world, then the data element is removed from the world. Subsequent access to the data element requires the agent to approach the source world of the data element through a legal role tunnel, and fetch it once again.

Note that a legal capacity represents a logical tunnel. A role tunnel of the form $r_n(w_n) : \cdots : r_2(w_2) : r_1(w_1) : \mathit{Owner}(w)$ does not require the data to physically flow through all the intermediate worlds in the tunnel between $w_n$ and $w$. The interim worlds are required only for establishing the legitimacy of the data access. The interim worlds should be reachable and be able to validate the given role at the time of access.

Data and network level security in the form of encryption and secure communication will need to be implemented in addition to the mechanisms of the Multiverse. The Multiverse framework only provides a system for creating legally tractable privilege frameworks across independent institutional contexts.

Role inheritance: Containment of worlds has special semantics in terms of inheritance of roles. Suppose world $w_2$ is contained in $w_1$ and both implement a template $t$. In such cases, any agent playing a given role $r$ in the container world also gets the privileges of role $r$ in the contained world. This enables aggregation of similar worlds into a larger world, and defining privileges on the larger, container world rather than on each world individually.

For example, if a Hospital $H$ has several branches, each implementing a template of type Hospital, with each branch contained within the larger world $H$, then any agent playing a role (say, $\mathit{Doctor}$) in $H$ would also get to play the same role with the same privileges in all branches contained in $H$.

Template visibility: Templates are treated like any resource, and can be created within any world by agents who have write privileges on the world. Other worlds that have read privileges on a given world $w$ can access and implement the templates defined in world $w$. When a template $t$ that is defined in world $w$ is used in another world $w'$, it is treated as a remote resource in $w'$, and the role tunnel with which $t$ was accessed is stored along with $t$, in addition to the TTL parameter. Use of the template's data access points, or creation or deletion of relationship instances of the template, will require the legal capacity of the template to be satisfied.

For instance, let template $t$ in world $w$ be accessed through a role tunnel $r_2(w_2) : r_1(w_1) : \mathit{Owner}(w)$. The use of this template for accessing a data element and/or defining a relationship will require the above legal capacity to be valid. The template $t$ will also need to be retrieved once again after its $\mathit{ttl}$ has expired. An expired template will return false for all its relationships and data access points. At any point during the use of a template, if the template's role tunnel is not satisfied, the template is marked as expired and will be unusable until it is retrieved again from the source.

Templates can also be subclassed from other templates to form a conceptual subsumption tree. If template $t'$ is a subclass of template $t$, then $t'$ inherits all the data access points and relationship specifications from $t$. The subclass $t'$ can override definitions of data access points and/or relationship specifications to apply to the world implementing the subclass template.

Access risk: Suppose that a remote data element $d$ is cached in a world $w$ using the following role tunnel: $r_n(w_n) : \cdots : r_1(w_1) : \mathit{Owner}(w)$. Accessing this data element $d$ requires $n + 1$ integrity checks to be made. Now suppose that a given role $r_i(w_i)$ is implemented by world $w_i$ using a template $t_i$ that is itself fetched using yet another role tunnel $\mathit{rt}(t_i)$. Validating $r_i(w_i)$ will now require validating the role tunnel for the template that has defined $r_i$. This validation may in turn require further validations of further templates along the way.

In order to reduce and limit this unfolding of role tunnel integrity checks, data access is characterized by a notion of access risk, denoted by a parameter $\rho \in [0, 1]$. This acts as a decay parameter in a probability function that determines whether an integrity check is made at a given level.

For the initial level of data access (also called level 0), where the integrity check is done for the role tunnel from which a data element is retrieved, the integrity check is performed with probability $(1 - \rho)^0$. For the next level of integrity checks, where the templates defining the roles are themselves validated, the integrity check is initiated with probability $(1 - \rho)^1$. Similarly, the integrity check at level $k$ is initiated with probability $(1 - \rho)^k$. Hence, the higher the value of $\rho$, the fewer the levels to which integrity checks are performed, and the greater the access risk.

Access risk is a parameter that is set by the agent performing a read operation, balancing speed of access against the guarantee of legal authenticity of the access.
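To make the access-risk rule concrete: with $\rho = 0.5$, checks at levels 0, 1 and 2 are initiated with probabilities $1$, $0.5$ and $0.25$. The Python sketch below combines this decay with the $n + 1$ per-role checks of a role tunnel. It is a minimal sketch under stated assumptions, not the paper's algorithm: `holds` and `template_tunnel` are hypothetical callbacks standing in for whatever mechanism a deployment uses to validate a role (including the Owner check on the last element) and to look up the tunnel through which a template was fetched; the random draw is assumed to be made once per nested tunnel; and the chain of template dependencies is assumed to be finite.

```python
import random


def check_probability(rho: float, level: int) -> float:
    """Probability (1 - rho)**level that a check is initiated at `level`."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return (1.0 - rho) ** level


def tunnel_is_valid(tunnel, holds, template_tunnel, rho,
                    level=0, rng=random.random):
    """Validate a role tunnel, descending into template tunnels
    with a probability that decays with the nesting level.

    tunnel          -- list of (role, world) pairs,
                       e.g. [("Advisor", "Sharada"), ..., ("Owner", "Ram")]
    holds           -- hypothetical callback: one integrity check for (role, world)
    template_tunnel -- hypothetical callback: the tunnel through which the
                       template defining the role was fetched, or None if local
    """
    # Level 0 is always checked, because (1 - rho)**0 == 1 and rng() < 1.
    if rng() >= check_probability(rho, level):
        return True  # check skipped: accepted at the reader's chosen risk
    for role, world in tunnel:  # n + 1 checks for a tunnel of n + 1 elements
        if not holds(role, world):
            return False
        nested = template_tunnel(role, world)
        if nested and not tunnel_is_valid(
            nested, holds, template_tunnel, rho, level + 1, rng
        ):
            return False
    return True


tunnel = [("Advisor", "Sharada"), ("Doctor", "Fortis"), ("Owner", "Ram")]
always = lambda role, world: True
local_only = lambda role, world: None
print(tunnel_is_valid(tunnel, always, local_only, rho=0.5))  # True
```

Passing `rng` as a parameter is a choice made here so that the probabilistic behaviour can be replaced by a deterministic function when testing.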
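4. Adversarial Scenarios

One of the ways in which the proposed Multiverse framework differs from Role-Based Access Control (RBAC) is the absence of an overarching role-granting authority. Role specifications are defined in templates that are in turn defined within worlds and exchanged across them based on access privileges.

While this provides enormous flexibility and scalability for the access control framework, it also opens up questions about how easily the framework could be compromised.

In this section, we consider several adversarial scenarios that could potentially affect the integrity of data exchange, and see how the framework addresses such situations.

Scenario 1: False implementation of a template: One of the constraints for a world to form a relationship with another world is the $\mathit{implements}(t)$ constraint, which requires the source or target world to have implemented template $t$. Since any world can implement a given template, it is possible that the implementing world is a bogus world that appears like an instance of $t$.

For instance, suppose a role of type $\mathit{Doctor}$ can be established between a person and a world of type “Hospital” (that is, the world has implemented the “Hospital” template). Since any world can implement any template, it is possible that the world is not actually a hospital, but a bogus world implementing the template.

Such a scenario is possible only if the “Hospital” template is publicly available. To prevent fake representations, important templates should be defined in a world representing a certifying authority, with read access granted to worlds based on an offline verification of their authenticity.

Scenario 2: False implementation of a relationship: The $\mathit{relt}(\mathit{name}, t)$ constraint for a relationship requires the source or target world to have a relationship called $\mathit{name}$ with a world implementing template $t$.

In such a case, there are two levels at which information can be falsified: either template $t$ does not have a relationship named $\mathit{name}$, and/or the world implementing template $t$ is a bogus world.

In either case, the main security mechanism is to control the definition of $t$. If template $t$ is defined by a certified authority and is made accessible to worlds based on offline validation of their credentials (which is a one-time activity), both levels of falsification can be addressed.

Scenario 3: Unauthorized read of third-party data from a world: Suppose that Dr. Ram has accessed data about a patient from Sharada clinic, as in the example from Figure 2. When the resource is copied to the world of Dr. Ram, would it now be accessible to other agents who have a read privilege on this world?

To answer this, we need to note that the legal capacity with which the data element was accessed is also stored along with the data element. In this example, the legal capacity is $\mathit{Advisor}(\mathit{Sharada}) : \mathit{Doctor}(\mathit{Fortis}) : \mathit{Owner}(\mathit{Ram})$. This Role Tunnel representing the legal capacity is stored along with the resource in the $\mathit{Ram}$ world, and has to be valid at the time of accessing the data element. Hence, another agent who is trying to access this data element will be able to do so only if the agent satisfies all the roles in the Role Tunnel: $\mathit{Advisor}(\mathit{Sharada})$, $\mathit{Doctor}(\mathit{Fortis})$ and $\mathit{Owner}(\mathit{Ram})$. That is, the agent should not only be listed as a co-owner of the world $\mathit{Ram}$, but should also be listed as a $\mathit{Doctor}$ in the world $\mathit{Fortis}$ and as an $\mathit{Advisor}$ in the world $\mathit{Sharada}$.

Scenario 4: Malicious Representation: In the example from Figure 2, suppose that the hospital has implemented its $\mathit{Hospital}$ template by downloading it from a regulatory agency $R$ that recognizes hospitals and issues certificates and templates for their operations. Suppose that the hospital loses its recognition due to some malpractice, and is no longer eligible to use the $\mathit{Hospital}$ template. However, the hospital continues to implement the template, even though it was legally required to discontinue its use. What would be the repercussions of such a case?

There are two safeguards that address cases involving such malicious intermediaries. The first is the $\mathit{ttl}$ parameter for the template, which limits the duration for which the template remains illegally valid. The second safeguard is the access risk parameter $\rho$ set by the agent performing a read. If the data being accessed is very sensitive, the reader may set the access risk $\rho$ to a low value, which will force an integrity check for the template that defines a role.

5. Case Studies

In this section, we consider some case study applications where a Multiverse framework would be useful.

5.1. CET Score Verification

Many countries have some form of a Common Entrance Test (CET) for graduate admissions. Applicants who take the test use these scores to gain admission to universities. The number of universities that recognize CET scores may be large, and may vary over time. In addition, several other organizations may also consider CET scores for hiring employees.

These organizations will need to independently verify the scores of an applicant from the CET database. This process can be securely automated using the Multiverse framework as follows.

Applicant $A$ takes the $\mathit{CET}$, which is required for admission at $\mathit{XYZ}$ University. In this setup, as shown in Figure 3, there are three entities: student $A$, university $\mathit{XYZ}$ and the $\mathit{CET}$ administering organization, each of which has its own world.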
Applicant 𝐴 implements ever, the hospital continues to implement the the template of 𝑃𝑒𝑟𝑠𝑜𝑛, 𝐶𝐸𝑇 implements the 41 Figure 3: Common Entrance Exam Score Verification Application template of 𝐴𝑝𝑝𝑙𝑖𝑐𝑎𝑡𝑖𝑜𝑛 and University 𝑋 𝑌 𝑍 UID details from the UID application. Since implements the template of 𝑈 𝑛𝑖𝑣𝑒𝑟𝑠𝑖𝑡𝑦. Fol- the bank is a well known entity which has lowing relationships exist among the worlds: been pre-verified and pre-approved, it has read Applicant 𝐴 plays the role of prospective stu- rights on the UID world. Next, 𝐴 wants to dent in the world of 𝑋 𝑌 𝑍 University and a rent a house 𝐻 𝐼 𝐷. Before 𝐴 is accepted as a test applicant for 𝐶𝐸𝑇 application. Once 𝐴 tenant, house 𝐻 𝐼 𝐷 needs to verify the iden- has completed the 𝐶𝐸𝑇 , the scores are stored tify of 𝐴. Since house 𝐻 𝐼 𝐷 is not a central- in their database. 𝐴 also informs the 𝐶𝐸𝑇 ized entity, it does not have direct access to application regarding universities / organiza- the UID application. However, it is a well- tions that s/he is applying to. In turn, 𝑋 𝑌 𝑍 known fact that 𝐴′ 𝑠 identity is valid if s/he university requests to access the 𝐶𝐸𝑇 scores has an account in 𝐴𝐵𝐶 bank. And thus house of applicant 𝐴 and if the constraints such as 𝐻 𝐼 𝐷 accepts bank account details (like ac- applicant 𝐴 exists, has valid scores, and has count number and address) as valid identity applied to university 𝑋 𝑌 𝑍 are all satisfied, proof of 𝐴. The role tunnel is complete if 𝐴 then 𝐴′ 𝑠 scores are securely shared with 𝑋 𝑌 𝑍 requests 𝐴𝐵𝐶 bank to share the account de- university. tails with 𝐻 𝐼 𝐷. 5.2. Identity Validation 6. Conclusions Most countries have a unique ID (UID) of all its citizens. It is used to uniquely identify the Data utility needs to contend with three con- citizens and this UID is mandatory for a vari- flicting concerns– transparency, privacy and ety of purposes like opening a bank account, security. Most of the solutions to address these buying / renting a property etc. 
Even in this concerns have thus far required a larger in- scenario, Multiverse framework can be used stitutional framework, that regulates access. as follows: Extending the legitimacy of access control across As shown in Figure 4, let’s say person 𝐴 is organizational boundaries in an open-ended a citizen and his UID details are stored in the fashion had always been a challenge. UID application. When 𝐴 wants to open an The Multiverse framework proposed in this account in 𝐴𝐵𝐶 bank, the bank validates the paper addresses this problem, and uses role 42 Figure 4: UID Validation Application tunneling as the mechanism for extending inter- ware Engineering, Springer, 2008, pp. organizational access regulations in an open- 59–115. ended fashion. The Multiverse framework only [2] K.-D. Schewe, B. Thalheim, Concep- addresses legitimacy of access control. Pro- tual modelling of web information sys- tection of the data itself is a different issue tems, Data & knowledge engineering 54 that is addressed by encryption and secure (2005) 147–188. communication protocols. Similarly, protec- [3] E. Bertino, P. A. Bonatti, E. Ferrari, Tr- tion of the data after it has been accessed– for bac: A temporal role-based access con- example, by malicious agents taking a photo- trol model, ACM Transactions on Infor- graph of the data displayed on their screens– mation and System Security (TISSEC) 4 are also outside the scope of the framework. (2001) 191–233. The Multiverse framework is primarily meant [4] R. Sandhu, D. Ferraiolo, R. Kuhn, et al., to record and establish legal channels for han- The nist model for role-based access dling of sensitive data, and for establishing control: towards a unified standard, in: provenance and audit logging of data access ACM workshop on Role-based access in the form of role tunneling. control, volume 10, 2000. [5] R. S. Sandhu, E. J. Coyne, H. L. Feinstein, C. E. Youman, Role-based access control References models, Computer 29 (1996) 38–47. [6] B. 
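The Role Tunnel check described earlier for the 𝑅𝑎𝑚 world (an agent gains access only if it holds every role in the tunnel, in the corresponding world, at the time of access) can be illustrated with a minimal Python sketch. This is not the Multiverse implementation: the names `Role`, `World`, `satisfies_tunnel` and the agent identifiers are hypothetical, and a world is assumed to be a plain in-memory registry of role holders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class Role:
    """A role held in a particular world, e.g. Doctor(Fortis). Hypothetical type."""
    name: str
    world: str


@dataclass
class World:
    """Assumed model of a world: a registry mapping role names to agent ids."""
    name: str
    members: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, role: str, agent: str) -> None:
        self.members.setdefault(role, set()).add(agent)

    def revoke(self, role: str, agent: str) -> None:
        self.members.get(role, set()).discard(agent)

    def holds(self, role: str, agent: str) -> bool:
        return agent in self.members.get(role, set())


def satisfies_tunnel(agent: str, tunnel: List[Role], worlds: Dict[str, World]) -> bool:
    """Return True only if the agent currently holds every role in the tunnel."""
    return all(
        role.world in worlds and worlds[role.world].holds(role.name, agent)
        for role in tunnel
    )


# The tunnel stored with the resource in the Ram world.
worlds = {name: World(name) for name in ("Sharada", "Fortis", "Ram")}
tunnel = [Role("Advisor", "Sharada"), Role("Doctor", "Fortis"), Role("Owner", "Ram")]

# agent-1 holds all three roles; agent-2 holds only one of them.
for role in tunnel:
    worlds[role.world].grant(role.name, "agent-1")
worlds["Fortis"].grant("Doctor", "agent-2")

print(satisfies_tunnel("agent-1", tunnel, worlds))  # True
print(satisfies_tunnel("agent-2", tunnel, worlds))  # False

# The tunnel must be valid at access time: revoking one role blocks access.
worlds["Fortis"].revoke("Doctor", "agent-1")
print(satisfies_tunnel("agent-1", tunnel, worlds))  # False
```

Because the check is evaluated against the current role listings rather than against an agent's identity, a successor who is later listed under the same roles would pass it without any credentials being reissued, which is the property the role-based design is meant to provide.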