=Paper=
{{Paper
|id=Vol-3276/SSS-22_FinalPaper_38
|storemode=property
|title=Centralized Versus Decentralized Digital Identity Architectures: Simulation Models of Data Exchange
|pdfUrl=https://ceur-ws.org/Vol-3276/SSS-22_FinalPaper_38.pdf
|volume=Vol-3276
|authors=Yoshiaki Fukami,Takumi Shimizu,Teruaki Hayashi,Hiroki Sakaji,Hiroyasu Matsushima
|dblpUrl=https://dblp.org/rec/conf/aaaiss/FukamiSHSM22
}}
==Centralized Versus Decentralized Digital Identity Architectures: Simulation Models of Data Exchange==
Yoshiaki Fukami,1 Takumi Shimizu,2 Teruaki Hayashi,3 Hiroki Sakaji,4 Hiroyasu Matsushima5
Keio University,1,2 The University of Tokyo,3,4 Shiga University5
yofukami@sfc.keio.ac.jp,1 takumis@sfc.keio.ac.jp,2 hayashi@sys.t.u-tokyo.ac.jp,3 sakaji@sys.t.u-tokyo.ac.jp,4 hiroyasu-matsushima@biwako.shiga-u.ac.jp5
Abstract

In order to utilize big data generated from distributed cloud-based services, a digital ID is required to link data to its subjects. Decentralized Identifiers (DID) have been developed to manage data from various services with privacy protection. We analyzed two ID architectures, DID and centralized ID (CID), with simulation models to evaluate the efficiency of ID architectures. In a monopoly market where there is no competition between ID providers, there is no difference between DID and CID. However, if there are multiple ID providers without interoperability, service providers have access to more data in the DID architecture than in CID. This result, however, was affected by the design of the model, which does not include ID federation technologies. Currently, service providers can receive data from many third-party services with the ID federation standard. Also, the simulation result that DID is very efficient for data distribution should be interpreted carefully, considering the upcoming costs of implementation.

Background

In recent years, consumers have come to have a large number of user accounts linked to more and more cloud-based services. This has led to the accumulation of a wide variety of attribute data in the cloud, increasing the potential for the creation of new services, while at the same time calling for a means of sharing data that is fragmented between services in a way that is easy to use and protects the rights of consumers. Service providers can identify consumers with digital IDs provided by third-party companies and obtain attribute data stored by other services under consumer authentication.

Most of the data accumulated from multiple services is linked to IDs issued by a small number of specific companies, and such companies also provide authorization functions. This means that there is some risk that distributed data could be accumulated, analyzed, and utilized for unintended purposes with malicious intent. The risk of privacy infringement is increased by aggregating various attribute data. While ID federation enhances consumer convenience, it also increases the risk of privacy breaches.

DID is an architecture in which the entity that provides attribute information issues digital IDs in a distributed manner enabled by blockchain technologies. In contrast to DID, an architecture that uses existing ID federation technology is called a Centralized Identifier (CID). With DID, aggregated data can be utilized only with the consumer's authentication, and without linking to specific ID providers such as Google and Facebook.

From the service provider's point of view, it is advantageous to be able to obtain and utilize diverse data at low cost, and this will encourage the emergence of innovations in the form of new services. Both architectures, CID and DID, have their advantages and disadvantages, and it is difficult to say simply which is better. Therefore, we use a simulation approach in order to study many factors in an integrated manner.

In multi-agent simulation, people and objects can be represented as agents, and phenomena resulting from their interactions can be observed. For example, it is applied to fields such as traffic (Bazzan & Klügl, 2009), pedestrian flow (Yamashita et al., 2014), and market transactions (Hirano et al., 2020; Yagi et al., 2020). By confirming the simulation results, it is possible to support decision-making in planning and policy making related to them.

Models

This study employs simulation models to analyze the CID and DID structures and their impacts on data exchange. In the CID model, each user has some data which is managed by ID providers. Service providers have their needs (i.e., which data a service provider needs to create products) and try to obtain the data they need by accessing the IDs users have. Verifiers may or may not get the data depending on the ID that bridges transactions between users and verifiers. For instance, if a verifier asks a user to share the data “a” and the user uses the ID “A” for this transaction, the verifier can get the data “a”. If the user uses the ID “B” in this case, the verifier cannot get the data. In the DID model, there is no ID provider in the transaction. A verifier directly contacts a user and requests the data it needs. Each user decides whether to accept the request from a verifier. These models aim to uncover the efficient data exchange structure considering various parameters such as the number of users and CID providers and the cost of transactions. Figure 1 describes the model structures.
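The two models can be illustrated with a minimal Python sketch. It is not the authors' simulation code: the function names, the uniform assignment of users to CID providers, the choice that the service provider is connected to a single CID provider, and the independent per-request attrition draw are all assumptions made for illustration. The figure of 10,000 users and the attrition-rate parameter are taken from the Results section.

```python
import random


def simulate_cid(num_users, num_providers, rng):
    """Count users whose data is reachable through one CID provider."""
    # Assumption: each user's data is held by exactly one CID provider,
    # chosen uniformly at random (no interoperability, no federation).
    user_provider = [rng.randrange(num_providers) for _ in range(num_users)]
    # Assumption: the service provider is connected to provider 0 only.
    return sum(1 for provider in user_provider if provider == 0)


def simulate_did(num_users, attrition_rate, rng):
    """Count users who fulfil a direct data request from a verifier."""
    # Assumption: each request is dropped independently with probability
    # attrition_rate, reflecting the user's burden of managing it.
    return sum(1 for _ in range(num_users) if rng.random() >= attrition_rate)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    num_users = 10_000      # population size reported in the Results section

    for k in (1, 2, 5, 10):
        print(f"CID providers={k}: accessible={simulate_cid(num_users, k, rng)}")

    for r in (0.1, 0.3, 0.5):
        print(f"DID attrition={r}: accessible={simulate_did(num_users, r, rng)}")
```

Under these assumptions, the CID count falls roughly as the number of users divided by the number of CID providers, while the DID count is roughly the number of users multiplied by one minus the attrition rate.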
___________________________________
In T. Kido, K. Takadama (Eds.), Proceedings of the AAAI 2022 Spring Symposium
“How Fair is Fair? Achieving Wellbeing AI”, Stanford University, Palo Alto, California,
USA, March 21–23, 2022. Copyright © 2022 for this paper by its authors. Use permitted
under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Figure 1: The overview of the models

Results

We evaluate the models based on the amount of data that a service provider can access depending on the ID structures. In the CID models, the key parameter is the number of CID providers. If there is one CID provider, a service provider can access all the user data via this particular CID provider. Our simulation assumes 10,000 users in the model, so a service provider can access the data of 10,000 users in this case. As the number of CID providers increases, user data is dispersed across CID providers and a service provider can obtain only subsets of user data via a CID provider. In the DID models, the key parameter is the attrition rate of the service provider’s data requests. Since DID requires users to manage each transaction per data record by themselves, unlike CID, which allows CID providers to manage it, a service provider sometimes cannot obtain the data due to this burden of data management on users. Figure 2 shows the results of our simulation models considering various levels of key parameters. As the graph indicates, the amount of data that a service provider can access decreases dramatically as the number of CID providers increases. On the other hand, the amount of accessible data in the context of DID stays relatively large even in the case of a high attrition rate.

Figure 2: The number of accessible data in CID/DID

Discussion

The result is that service providers have access to more data in the DID architecture compared to CID. However, this result was affected by the design of the model, which only introduced the authentication / authorization function of independent third parties without ID federation technologies. Currently, service providers are able to receive data from many third-party services with an ID federation standard such as OpenID Connect.

On the other hand, the simulation results show that DID is very positive for data distribution. However, DID has not been diffused yet, and implementing DID technology imposes costs on both data providers and acquirers. The benefits of the DID architecture may be offset or negated by the costs of dissemination, which are not reflected in this model.

Future research needs more fine-grained models that reflect real-world ID operations and practices being developed at standards developing organizations, as well as the issues mentioned above, such as ID federation and the cost structures of ID architectures. This study opens up new research avenues for digital identity structure and data exchange by showing a basic understanding and implications of CID versus DID architectures.

Acknowledgments

This work was supported by JSPS KAKENHI 19K23235, 20H02384, and 20K13599.

References

Bazzan, A.; and Klügl, F. (Eds.). 2009. Multi-Agent Systems for Traffic and Transportation Engineering. IGI Global. doi.org/10.4018/978-1-60566-226-8

Hirano, M.; Izumi, K.; Matsushima, H.; and Sakaji, H. 2020. Comparing Actual and Simulated HFT Traders’ Behavior for Agent Design. Journal of Artificial Societies and Social Simulation, 23(3). doi.org/10.18564/jasss.4304

Yagi, I.; Masuda, Y.; and Mizuta, T. 2020. Analysis of the Impact of High-Frequency Trading on Artificial Market Liquidity. IEEE Transactions on Computational Social Systems, 7(6): 1324–1334. doi.org/10.1109/TCSS.2020.3019352

Yamashita, T.; Matsushima, H.; and Noda, I. 2014. Exhaustive analysis with a pedestrian simulation environment for assistant of evacuation planning. Transportation Research Procedia, 2: 264–272. doi.org/10.1016/j.trpro.2014.09.047