=Paper=
{{Paper
|id=Vol-1873/IWPE17_paper_21
|storemode=property
|title=Privacy Broker: Message Oriented Middleware to Implement Privacy Controls in Schibsted’s Ecosystem of Services
|pdfUrl=https://ceur-ws.org/Vol-1873/IWPE17_paper_21.pdf
|volume=Vol-1873
|authors=Narasimha Raghavan Veeraragavan,Karen Lees
|dblpUrl=https://dblp.org/rec/conf/sp/VeeraragavanL17
}}
==Privacy Broker: Message Oriented Middleware to Implement Privacy Controls in Schibsted’s Ecosystem of Services==
Privacy Broker: Message-Oriented Middleware to Implement Privacy Controls in Schibsted’s Ecosystem of Services (Industry Article)
Narasimha Raghavan Veeraragavan, Privacy Engineering, Schibsted Products and Technology, Oslo, Norway, narasimha.raghavan@schibsted.com
Karen Lees, Privacy Engineering, Schibsted Products and Technology, Oslo, Norway, karen.lees@schibsted.com
Abstract: Schibsted is a global media and classified ads conglomerate with more than 200 million unique users per month, operating mainly from Europe. The company is currently being transformed away from traditional paper media and siloed sites towards a unified global media giant. As part of this transformation, Schibsted needs to collect a wide variety of datasets such as profile, behavior, location, payment and communication messages about the user in order to provide personalized content and target advertisements to the end users.
With the new EU General Data Protection Regulation (GDPR) taking effect from May 25, 2018, each user of our products has a right to decide how his/her datasets should be governed or used in our products. To this end, we are building privacy controls for the end users. These privacy controls are realized by a message-oriented middleware.
In this paper, we present a case study of the design of a centralized topic-based pub/sub style of middleware for implementing the privacy controls in Schibsted’s ecosystem of services.
[Figure 1 has two panels. Payment Service Layer: End users; Clients (Schibsted Sites and Apps); Client facing services for payment platform; Core Payment Services; Payment Provider Adapter Services; Payment Service Providers; Financial Reporting Services. Targeted Advertising Service Layer: End users; Clients; Event Processing Services; User Modeling Services; User Profile Services; Segment Calculation and Serving Services; 3rd Party AD Server.]
Fig. 1. Payment and Targeted Advertisement Service Layerings of Schibsted Ecosystem
I. INTRODUCTION
Schibsted is a global media and classified ads conglomerate operating in 30 countries, with more than 200 million users per month and 20 billion page views per month, and it collects more than 700 million events per day.
Schibsted’s ecosystem consists of several hundred independent services that have evolved organically rather than through a centralized top-down design. Due to this organic nature, the services work together as many layers of dependencies rather than as the strict (hierarchical) tiers used in traditional large-scale system design.
Every service has an owner (a team) responsible for creating, operating, maintaining and deprecating the service. Every service owner is responsible for knowing their clients and also their dependencies on other services. Additionally, service owners justify the existence of their services through their usage and business value.
An emergent property of Schibsted’s ecosystem is neat service layering, wherein the services are organized in logical layers of functionality depending upon the scenario. Figure 1 shows two examples of the service layering pattern in the Schibsted ecosystem: a) Payment and b) Targeted Advertising.
Payment service layering is responsible for the payment transactions of end users. The entry point to the payment service layer contains a set of services that handle all the incoming traffic to the payment platform, perform authorization and authentication of end users, and delegate invocations to downstream services. The core payment service layer has a set of services that handle all payment-related operations such as authorize, cancel, capture, reverse, get and search. After executing all the necessary steps in the core layer, the call is passed to the next layer of pre-processing payment adapter services, wherein the appropriate pre-processing payment adapter is used for request/reply communication with the corresponding payment service provider. All these communications are passed to the financial reporting layer of services for financial reporting and tracking purposes.
Targeted advertising service layering is responsible for providing targeted advertisements to the end user based on their interests and demographics. When the users visit Schibsted’s sites, the behavioral and location events are generated and
sent to the event processing service layer, which processes these events and passes the behavioral events to the user modeling layer, which predicts the characteristics of users. These characteristics, along with the location events from the event processing layer, are used to generate profiles for anonymous users and to complete the profiles of identified users with the help of the profile service layer. The input from the profile service layer is used to calculate the appropriate ad segment for the user, and the calculated segment is served to the 3rd party ad provider.
For each privacy control to be implemented in the ecosystem of services, it is important to map the corresponding service layering that gets affected. Additionally, within the service layering, the appropriate layers and the services within those layers should also be mapped in order to send the privacy signals to these services.
For example, the opt-out or opt-in control for targeted advertising shall affect only the targeted advertising service layer and has nothing to do with the other service layers within the ecosystem. Furthermore, within the targeted advertising layer, the first three layers of services (event processing, user modeling and user profiling) are also a part of the personalized content serving layer (another service layer within the ecosystem), wherein personalized content is served to the end users.
The goal of the targeted advertising privacy control is to affect only the targeted advertising scenario and not the other scenarios. Accordingly, whenever a user chooses to opt out of targeted advertising based on a particular category, the changes include the following: services in the User Profile Service layer update their ACL permission model, which in turn does not allow the services in the Segment Calculation and Serving layers to access the attributes associated with the opted-out categories. Additionally, all the existing segments calculated and stored in the segment calculation and serving layers based on the opted-out categories of the user should be deleted.
A key observation from the above scenario is that several services are affected by a single privacy event (opt-out of targeted advertising based on a category). A similar pattern is observed for other privacy controls. This in turn motivates us to build a centralized platform (similar to the well-known communication style of publish/subscribe systems) to map the privacy events to the potential services and track these events to make sure that the users’ choices are reflected in the system.
The rest of the paper is organized as follows: Section II describes the types and constraints of privacy controls. Section III briefly explains the architecture of the privacy broker, a centralized publish/subscribe style middleware for enabling the privacy controls. Section IV discusses how the constraints mentioned in Section II are satisfied by the architecture in Section III. Section V presents the related work. Finally, Section VI concludes the paper with future work.
Additionally, the following topics are considered outside the scope of this paper: a) verifying whether the backend honors the users’ choices in a correct manner and b) security discussions. We believe these topics are complex and worth separate discussions in different papers in the future.
II. TYPES AND CONSTRAINTS FOR PRIVACY CONTROLS
We broadly classify the common privacy controls offered to the end users into two types: stateful and stateless.
Stateful controls represent the opt-out and opt-in type of controls, where the users’ choices need to be persisted and continuously respected by the corresponding services until the users change their choices. For example, if the user chooses to opt out of targeted advertising based on his/her demographics and interests, then this choice must be persisted. Furthermore, the services that feed demographic and interest data reflect this choice by continuously denying the targeted advertising services access to the opted-out users’ datasets, which in turn stops the targeted advertising services from generating any new targeted advertisements for the opted-out user.
Stateless controls represent the data deletion request type of controls, where the users’ requests are temporarily stored until the relevant services honor them. The services honor each request exactly once, unlike in the case of stateful controls. For example, if the user issues a data deletion request for all his/her personal data, then the relevant services that control the personal data need to execute their corresponding data deletion logic. After the successful completion of the execution, these services do not need to execute the deletion logic again until a new request comes in.
In order to adhere to the privacy by design principles and the guidelines offered by Schibsted’s privacy office, the design and implementation of the stateful controls should satisfy the following constraints:
A. Constraints for Stateful Controls
1) Stateful controls have two states: opt-in and opt-out. The user can choose one of these two states.
2) The front-end tool that offers the stateful controls to the end users should always display the latest states of the controls persisted in the system.
3) After the relevant services start honoring the latest state of the privacy controls, the services should continue honoring the last known state until they are aware of the next state.
4) If the states are not persisted due to failures, then the user should be asked to retry the operation later. Such failures should be kept to a minimum, as retrying is a bad user experience.
5) If multiple states are generated via the same stateful control within a short time span by the same user, then only the latest persisted state needs to be eventually honored.
III. ARCHITECTURE OF PRIVACY BROKER
In this section, we describe the technical architecture of the privacy broker that facilitates the interaction between the different services towards honoring the users’ privacy choices. The communication paradigm of the architecture is
loosely based on the popular topic-based pub/sub model, wherein each topic represents a privacy control. At one end of the architecture, the publishers are the front-end tools that generate the privacy signals (such as opt-out of targeted advertising), and at the other end are the subscribers, which are essentially the backend Schibsted services. In addition to routing the signals to the appropriate subscribers, it is important to track the status of the subscribers with respect to honoring the users’ choices.
[Figure 2 shows the Privacy Event Publishers (Front-end Tool 1 to n), the Privacy Broker Console, Privacy Broker API, Privacy Broker Engine, Broker DB, Profile DB, Privacy Compliance Monitor and User Notification Module, and the Privacy Event Subscribers (Backend services 1 to n, Simple Queue Service, Simple Notification Service), with the numbered data flows #1 to #8.]
Fig. 2. Privacy Broker Architecture
The high-level architecture consists of publishers, subscribers and five core components that are essential for enabling privacy controls in Schibsted’s ecosystem of services: a) Privacy Broker API, b) Privacy Broker Engine, c) Privacy Broker Console, d) Compliance Monitor and e) User Notification Module.
A. Publishers and Subscribers
Publishers are the various end-user-facing front-end tools that are available in Schibsted digital products (sites and apps) in several countries across the world. These front-end tools provide customized privacy controls based on the geographical region and the nature of the digital products.
For example, the privacy controls offered on the newspaper sites are different from those on the dating sites due to the inherent nature of the content of these two kinds of sites.
Furthermore, even though the GDPR will imply more similar privacy rules across Europe, there will still be room for some interpretation by national regulators that we have to take into account. For example, the general GDPR rule is that we cannot process data about individuals younger than 16 without parental consent. However, the regulation leaves room for the countries implementing the GDPR in their national laws to set the bar as low as 13 instead.
In addition to the age factor, other differences include the following areas:
• Deletion: When is data sufficiently deleted?
• Security Measures: What measures need to be in place in order for security to be at a sufficient level?
• Opt-out/opt-in: When do we need opt-in or opt-out as a default option?
Moreover, our sites outside Europe are subject to different regulations. For these reasons, we require customized privacy controls per geographical region.
Publishers generate the privacy events whenever users use the privacy controls. Each privacy event corresponds to a topic in the well-known pub/sub model. Publishers register new topics (after discussing with the Privacy Office) in the privacy broker console.
Subscribers are the various proprietary backend services of Schibsted that are part of one or more service layerings as described in Section I. A subscriber can be a service or a group of services. Subscribers subscribe to the available topics via the privacy broker console. After subscribers receive the appropriate events from the broker, they make appropriate changes to their services towards honoring the users’ choices and notify the broker of the completion of the changes (flow #5 in Figure 2).
B. Privacy Broker Console
The main design goal of the broker console is to be a self-service portal for the publishers and subscribers to configure the privacy broker towards enabling the privacy controls in the Schibsted ecosystem of services.
The self-service portal is accessible only to the developers within Schibsted. The common functionalities of the broker console include:
• Register new publishers and subscribers
• Register new privacy controls for a publisher
• Update the broker configurations for publishers and subscribers
• Delete publishers and subscribers
Any communication related to the configuration of publishers and subscribers happens via the broker console, as shown in data flow #1 and flow #2 in Figure 2, respectively. Additionally, all the configuration details given by the publishers and subscribers are validated in the broker engine and then persisted in the broker database. If the broker engine detects any incorrect configuration details, then appropriate error messages are shown to the corresponding publishers or subscribers.
The configuration parameters for publishers and subscribers, described in Table I, are stored in the broker database via the Broker API. Sample configurations of the publishers and the subscribers stored in the broker database are shown in Table II and Table III.
C. Privacy Broker API
The main design goal of the privacy broker API is to provide endpoints for publishers, subscribers and the console to interact with the core broker engine.
The direct communications of the publishers and subscribers with the privacy broker API endpoints are guarded with the help of SDKs. SDKs ensure that the communication from the broker clients (publishers, subscribers) happens in a consistent (all clients using the same protocols and messaging format), secure
(all clients are authenticated and authorized) and reliable way (a number of retries in case the broker cannot be reached).
• Publisher SDKs are responsible for sending synchronous privacy requests when users use stateful privacy controls and asynchronous privacy requests when users use stateless privacy controls from the publishers.
• Subscriber SDKs are responsible for sending synchronous status notifications that indicate the current status of progress towards honoring the users’ choices.
Both SDKs use the gRPC protocol [2] to communicate with the broker endpoints and use protocol buffers [3] as the messaging format.
TABLE I
IMPORTANT CONFIGURATION PARAMETERS FOR PUBLISHERS AND SUBSCRIBERS
Parameter | Definition
Publisher ID | ID that uniquely identifies the front-end tool that provides privacy controls within the Schibsted ecosystem.
Topic Type | Privacy control that will be used by the end users within that publisher ID.
Async User Notification Type | Mode of notification to the end users about the status of their privacy requests triggered via the privacy controls. Refer to Table VIII for the different types of notifications.
Failure Retry Count | Number of retries that need to be performed in case the privacy broker is not reachable from the publisher.
Retry Delay Gap | Time difference between two successive failure retries.
Subscriber ID | ID that uniquely identifies a service or a group of services that needs to make changes to its internal behaviors and states towards honoring the users’ choices.
Time to Honor | Total time taken for the backend services to acknowledge, make appropriate changes to their services, and send the completion signal to the broker. The maximum value of the time to honor for a service corresponding to a topic type is controlled by the legal team of Schibsted.
Contact | Emails for alert messages in the presence of failures, such as the broker engine not being able to reach the backend or the broker engine not receiving the completion signal within the mentioned Time to Honor.
Subscription Type | Mode of communication to the backend service. Four options are supported to address the wide variety of services owned by different teams: a) Synchronous call, b) Asynchronous call, c) Amazon Simple Queue Service [1] and d) Amazon Simple Notification Service [1].
URI | Endpoints offered by the backend services to the broker to send the privacy events.
D. Privacy Broker Engine
The key purposes of the privacy broker engine are the following:
• Process the incoming requests from broker clients and update the broker configuration and user profile databases.
• Match the incoming privacy events from publishers to appropriate subscribers.
• Disseminate the events from publishers to subscribers.
1) Processing Logic for the Requests Coming via Console: Developers from the publisher (front-end) and subscriber (backend services) teams provide configuration parameters to the broker via the console UI (flow #1 and flow #2 in Figure 2). The configuration parameters are validated with the help of the corresponding configuration schemas and with routine input validation. After validation, the broker configuration database is updated. Sample publisher and subscriber configurations in the broker database are shown in Table II and Table III.
2) Processing Logic for the Privacy Events Coming via Publisher: The privacy events coming via the publisher are classified into stateful and stateless privacy events as described in Section II. For both stateful and stateless privacy events, the broker engine creates privacy requests in the broker database (flows #3 and #3a in Figure 2) in order to notify the corresponding subscribers and also track the progress of the subscribers towards honoring the users’ choices.
Table IV shows a sample privacy requests table in the broker database. Various status options are used to track each privacy request in Table IV. These statuses are explained in Table V.
Furthermore, for stateless privacy events, the corresponding publishers are notified that their privacy event will eventually be honored as soon as the requests are written to the broker database (flow #3c in Figure 2). This in turn helps the publisher display appropriate information to the end users. If it is a stateful privacy event, the broker engine writes the state of the privacy event to the corresponding user’s profile in the profile database (flow #3b in Figure 2) and then notifies the corresponding publisher (flow #3c in Figure 2).
The reason for writing to the profile database is twofold: a) The profile database serves as a source of truth for users’ profile attributes and settings for all back-end services. Hence, it is relatively easy to implement filtering mechanisms on top of profile attributes based on opt-out preferences when these preferences are stored along with the profile. b) We prefer to avoid multi-master complications and keep only one master/source of truth for all opt-out preferences, to keep the design simple.
For example, if the stateful privacy event is opt-out of targeted advertising based on age and gender, then the corresponding privacy request is created in the broker database towards notifying the subscribers, and then the broker engine updates the corresponding user profile in the profile database with his/her opt-out preferences.
3) Matching the privacy events with subscribers: For each request available in Table IV, the target subscribers can
be found by a simple lookup of the request topic type in the subscriber configuration table (Table III).
With the target list of subscribers, the broker engine creates entries in a subscription notification progress table, as shown in Table VI, with the default status INIT. The various status options in the subscription notification progress table and the corresponding definitions are given in Table VII.
TABLE II
SAMPLE PUBLISHER CONFIGURATION IN THE BROKER DATABASE
Publisher ID | Topic Type | Async Notification Type | Retry Count | Retry Delay Gap
1234 | Payment Data Deletion | {User Email, User Mobile SMS} | 3 | 2 seconds
1234 | Behavioral Data Deletion | {User Email, User Mobile SMS} | 3 | 3 seconds
1234 | Opt-out of Targeted Advertising (based on age & gender) | {In-client with endpoint address} | 3 | 2 seconds
TABLE III
SAMPLE SUBSCRIBER CONFIGURATION IN THE BROKER DATABASE
Subscriber ID | Topic Type | Subscription Type | URI | Time to Honor | Failure Retry Count | Retry Delay Gap | Contact
12344 | Payment Data Deletion | {Async API call} | https://payment.domainname/UUserID/Delete | 1 day | 5 | 2 seconds | abc@schibsted.com
12346 | Behavioral Data Deletion | {Amazon SQS} | https://sqs.eu-west.amazonaws.com/queueID | 1 day | 4 | 3 seconds | def@schibsted.com
12348 | Opt-out of Targeted Advertising (age, gender) | {Amazon SNS} | https://sns.eu-west.amazonaws.com/snsID | 1 day | 3 | 5 seconds | jkl@schibsted.com
TABLE IV
SAMPLE PRIVACY EVENT REQUESTS IN THE BROKER DATABASE
Request ID | Publisher ID | Unique User ID | Request Topic Type | Request Status
123453 | 1234 | A8910 | Opt-out of Targeted Advertising (age) | INIT
123442 | 1234 | A1235 | Account Data Deletion | SOMEFAILED
123461 | 1234 | B1235 | Payment Data Deletion | COMPLETED
123430 | 1234 | A4567 | Opt-out of Targeted Advertising (gender) | INPROGRESS
TABLE V
STATUS FIELD OPTIONS IN THE PRIVACY EVENT REQUESTS TABLE IN BROKER DATABASE
Request Status Options | Definition
INIT | Progress details are written to the broker database, privacy event not yet sent to the subscriber
INPROGRESS | The request is in progress by the broker engine or by the subscribers
COMPLETED | User’s choice is honored by all the necessary services
SOMEFAILED | At least one of the necessary services has failed to either honor or send the completion notification
TABLE VI
SAMPLE SUBSCRIPTION NOTIFICATION PROGRESS TABLE IN THE BROKER DATABASE
Request ID | Subscriber ID | Time to Honor | Progress Status
123453 | 12348 | 1 day | INIT
123442 | 12344, 12346 | 1 day | FAILED
123461 | B1235 | 1 day | SENT
123430 | 12348 | 1 day | COMPLETED
TABLE VII
STATUS FIELD OPTIONS IN THE SUBSCRIPTION NOTIFICATION PROGRESS TABLE IN BROKER DATABASE
Status Options | Definition
INIT | Progress details are written to the DB, privacy event not yet sent to the subscriber
SENT | The request has been sent to the subscriber
ACKNOWLEDGED | Subscriber has acknowledged receipt of the request but not yet honored the request
COMPLETED | Subscriber has honored the user’s choice and completion notification received
SENDFAILED | Failure in sending the request to the subscriber. Refers to the last attempt to resend the request
FAILED | Alert has been sent to the team. Maximum number of retries used
4) Disseminating the privacy events to the appropriate subscribers: Using the pairing between Request ID and Subscriber ID in Table VI and the URI, Failure Retry Count and Retry Delay Gap information available in Table III, the broker engine sends the appropriate request to all subscribers listed in Table VI (flows #4, #4a, #4b, #4c and #4d in Figure 2) and updates the status for each subscriber to either SENT or SENDFAILED in Table VI and the request status to INPROGRESS or SOMEFAILED in Table IV. Furthermore, when the subscribers send the acknowledgement and completion signals (flow #5 in Figure 2) back to the broker, the engine updates the status to ACKNOWLEDGED and COMPLETED, respectively. Moreover, if the broker database has the COMPLETED status from all the subscribers, then the broker engine updates the status field of the privacy event requests table (Table IV) to COMPLETED.
E. Compliance Monitor
The Compliance Monitor monitors the status column in Table VI and, with the help of Table III, scans for the requests that have not been completed within the expected completion time. It sends retries and updates the expected completion time after each retry. After the maximum number of retries, the Compliance Monitor looks up the corresponding contact email and mobile information from the subscriber configuration table (Table III), sends alert messages to the contacts (flow #6a in Figure 2) and updates the status to FAILED in Table VI.
F. User Notification Module
The User Notification Module monitors the status column in Table IV. For each privacy request that has the status COMPLETED, the User Notification Module finds the corresponding publisher ID, its preferred notification type (refer to Table VIII) and the corresponding notification details for that request using Table II (flow #7).
If the preferred notification type is Email and/or SMS, the corresponding contact attributes are fetched from the profile database using the Unique User ID (flow #8). This contact information in turn helps the Notification Module deliver the messages to the appropriate end users. If the notification type is In-client, then the corresponding endpoints are fetched from Table II.
TABLE VIII
ASYNC NOTIFICATION TYPES AND DEFINITIONS
Notification Type | Definition
Email | Email address of the user who triggered the privacy event
SMS | Mobile phone number of the user who triggered the privacy event
In-client | Push to the vendor-specific mobile notification services for mobile devices and display in the privacy notification section of the front-end tool
IV. DISCUSSION ON SATISFYING THE PRIVACY CONTROL CONSTRAINTS
A. Discussion on Stateful Privacy Constraints
1) Constraint 1: It is straightforward for the front-end tool to provide these settings to the end users. The front-end tool should make sure that there are no intermediate states such as unknown.
2) Constraint 2: After the request is persisted in the profile database and the broker database, whenever the user logs in to the front-end tool and views the privacy settings/notification pages, the front-end can look up the status field in Table IV in the broker database to learn the status of the privacy request and display the UX message accordingly. If any failure happens before flow #3c in Figure 2, then the user is asked to retry the operation.
3) Constraint 3: The profile database always has the latest state from the Privacy Broker. The broker engine sends only notifications to the appropriate backends, indicating that there has been a change of state for the subscribed topic type, and the relevant backend services are required to fetch the latest state from the profile database and act according to only the latest state.
4) Constraint 4: This is straightforward in our design. If any failure happens before flow #3c in Figure 2, then the user is asked to retry the operation. Any failure that happens after flow #3c is resolved without the knowledge of the user.
5) Constraint 5: If multiple states are persisted within a short time span for the same stateful control, the broker sends notifications for all the persisted states to the corresponding services, and each service continues fetching the latest state until it stops receiving notifications from the broker. Thus, the latest state will eventually be honored.
V. RELATED WORK
Variants of publish/subscribe systems [4] have been studied and used in many applications for several decades. The closest variant to our proposed design is the centralized topic-based publish/subscribe system. In the past decades, several topic-based publish/subscribe systems have been proposed in academia and industry. Examples of academic topic-based publish/subscribe systems include Scribe [5], Bayeux [6], SpiderCast [7] and PolderCast [8]. Examples of industry topic-based publish/subscribe systems include JMS [9], Google Pub/Sub [10], Spotify Pub/Sub [11], and Apache Kafka [12].
However, supporting the privacy controls in a services ecosystem like Schibsted’s requires a specific set of requirements and constraints to be met. There is a need to distinguish between stateful and stateless privacy events, since stateful privacy events correspond to user settings, which in turn need to be continuously honored by the backend services until the next change of settings occurs. In the case of stateless events, after the relevant services honor the user’s request exactly once, the request can be removed or archived for legal reasons.
Furthermore, real-time performance is less critical (since from a legal perspective we are given some time to honor the privacy requests) than reliability (being able to deliver the messages at least once from the front-end and back-end services to the broker and from the broker to the back-end services) and consistency (the message displayed in the front-end tool is always consistent with what happens in the back-end).
Moreover, the communication modes required by the back-end services differ, and each service has different configuration parameters that need to be supported by the centralized publish/subscribe system.
Additionally, we need to keep track of all services involved per privacy event in order to ensure the completeness of the privacy operations associated with that event.
To the best of our knowledge, this work is the first to use the centralized publish/subscribe design pattern to enable the privacy controls expected by the GDPR in an ecosystem of services. In other words, we present another use case for publish/subscribe systems in the context of privacy engineering.
VI. CONCLUSION
In this paper, we present our customized design of a simple pub/sub style middleware that we are implementing towards enabling the privacy controls required for the GDPR. At the time of writing this paper, the design is implemented
and matured for integration testing, and will soon be in production.
It is possible that the integration testing will reveal new practical challenges with the proposed design, which may require us to adapt the design according to the new findings. For example, if the majority of the teams are not comfortable using SDKs due to legacy or other reasons, then we may provide secure REST APIs instead of SDKs.
In the future, we would also like to experiment with the various open source policy-based languages, such as the eXtensible Access Control Markup Language (XACML) [13] and the PrimeLife Policy Language [14], to figure out to what extent they are usable, scalable, and adoptable in large-scale service-oriented architectures such as Schibsted’s ecosystem.
Last but not least, we have written this paper in the hope that our work will be helpful to the academic community by giving context on the practical challenges of implementing privacy controls in a large-scale system.
ACKNOWLEDGMENT
We thank the following people for their input and support for this paper: Ingvild Naess (Group Privacy Officer, Schibsted ASA) and Sverre Sundsdal (Director of Engineering, Schibsted Products and Technology). Additionally, we are grateful to all the engineers, product managers and legal members of the privacy team who are working towards the success of the Privacy Broker project within Schibsted. Finally, we thank the reviewers for their feedback on the submitted version of the paper.
REFERENCES
[1] J. Varia and S. Mathew, “Overview of Amazon Web Services,” Amazon Web Services, 2014.
[2] Google, gRPC, http://www.grpc.io/.
[3] Protocol Buffers, https://developers.google.com/protocol-buffers/.
[4] P. T. Eugster, P. A. Felber, R. Guerraoui, and A.-M. Kermarrec, “The many faces of publish/subscribe,” ACM Computing Surveys (CSUR), vol. 35, no. 2, pp. 114–131, 2003.
[5] M. Castro, P. Druschel, A.-M. Kermarrec, and A. I. Rowstron, “Scribe: A large-scale and decentralized application-level multicast infrastructure,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 8, pp. 1489–1499, 2002.
[6] S. Q. Zhuang, B. Y. Zhao, A. D. Joseph, R. H. Katz, and J. D. Kubiatowicz, “Bayeux: An architecture for scalable and fault-tolerant wide-area data dissemination,” in Proceedings of the 11th international workshop on Network and operating systems support for digital audio and video. ACM, 2001, pp. 11–20.
[7] G. Chockler, R. Melamed, Y. Tock, and R. Vitenberg, “Spidercast: a scalable interest-aware overlay for topic-based pub/sub communication,” in Proceedings of the 2007 inaugural international conference on Distributed event-based systems. ACM, 2007, pp. 14–25.
[8] V. Setty, M. Van Steen, R. Vitenberg, and S. Voulgaris, “Poldercast: fast, robust, and scalable architecture for p2p topic-based pub/sub,” in Proceedings of the 13th International Middleware Conference. Springer-Verlag New York, Inc., 2012, pp. 271–291.
[9] M. Hapner, R. Burridge, R. Sharma, J. Fialli, and K. Stout, “Java message service,” Sun Microsystems Inc., Santa Clara, CA, p. 9, 2002.
[10] J. Reumann, “Goops: Pub/sub at google,” Lecture & Personal Communications at EuroSys & CANOE Summer School, 2009.
[11] V. Setty, G. Kreitz, R. Vitenberg, M. Van Steen, G. Urdaneta, and S. Gimåker, “The hidden pub/sub of spotify: (industry article),” in Proceedings of the 7th ACM international conference on Distributed event-based systems. ACM, 2013, pp. 231–240.
[12] J. Kreps, N. Narkhede, J. Rao et al., “Kafka: A distributed messaging system for log processing,” in Proceedings of the NetDB, 2011, pp. 1–7.
[13] S. Godik and T. Moses, “Oasis extensible access control markup language (xacml),” OASIS Committee Specification cs-xacml-specification-1.0, 2002.
[14] S. Trabelsi, J. Sendor, and S. Reinicke, “Ppl: Primelife privacy policy engine,” in 2011 IEEE International Symposium on Policies for Distributed Systems and Networks (POLICY), 2011, pp. 184–185.