<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Assessing Privacy in Social Media Aggregators</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gaurav Misra</string-name>
          <email>g.misra@lancaster.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lauren Gill</string-name>
          <email>l.gill2@lancaster.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computing and Communications, Lancaster University</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Security Lancaster, School of Computing and Communications, Lancaster University</institution>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Social Media Aggregator (SMA) applications present a platform enabling users to manage multiple Social Networking Sites (SNS) in one convenient application, which results in a unique concentration of data from several SNS accounts in addition to the user's mobile phone data available to them. We describe a three-step methodology to assess how privacy is considered in these applications: 1) we inspect the mobile data and social media data; 2) we study any privacy policies and their compliance with respect to distributors' vetting policies; and 3) we perform a qualitative assessment of traceability between privacy policies and the actual transparency and control mechanisms offered to users by the apps' interfaces. We then present the results we obtained for 13 popular SMAs from 3 app stores, showing a variation in data accessed by the individual applications, an absence of privacy policies for 5 of the SMAs evaluated, and a lack of traceability between privacy policies and transparency and control of interface operations. After this, we report our experiences using the methodology and the lessons learned, together with potential future work to improve the methodology and its potential to also assess privacy in other mobile applications that also connect with social media.</p>
        <p>Index Terms: Social Media Aggregators, Social Media Privacy</p>
        <p><sup>1</sup> http://marketingland.com/facebook-usage-accounts-1-5-minutes-spentmobile-171561</p>
        <p><sup>2</sup> http://www.pewinternet.org/2013/12/30/social-media-update-2013/</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>It is evident that our engagement with Social Networking
Sites (SNS) is becoming ever more ingrained in our daily lives.
This has been, in part, facilitated by the spectacular growth of
mobile social networking, which has a worldwide penetration
of 23% (1.7 billion). This proliferation of mobile devices has
enabled users to access social media accounts with more
ease and convenience. This is demonstrated by the huge surge
in usage of social applications on mobile platforms, to the extent
that an estimated 80% of time spent on social media is spent using
mobile applications<sup>1</sup>.</p>
      <p>This shift towards the mobile platform for social media
activity has led to the development of Social Media Aggregators
(SMAs), which enable users to access all of their social media
accounts from a single application. This is partly driven by the
fact that users often have accounts on multiple
Social Networking Sites (SNSs)<sup>2</sup>. Using a single SMA for all
social media accounts can be more attractive to users than
installing a separate application for each social media site.
A further attraction of replacing all social media applications
with a single SMA is better utilization of the often limited resources
(RAM, CPU power and battery) of the mobile phone itself.
Indeed, many SMAs clearly convey this to potential customers
as an advantage and a selling point<sup>3</sup>.</p>
      <p>While it is clear that SMAs can be beneficial for users, they
also potentially introduce severe privacy risks for users. Users
are meant to use SMAs to combine multiple social media
accounts and all the activity is routed through a single SMA.
This is different from using separate applications for different
social media accounts as a user’s Facebook application, for
example, cannot access their Twitter activity unless an explicit
link is made by the user. Such a link between various social
media profiles is implicit in the case of SMAs. Moreover,
this information about social media activity is augmented with
mobile device data such as GPS location, contact lists, camera,
etc. Given this potential threat to the privacy of social media
users, it is essential to take a closer look at the transparency
and control mechanisms offered by these applications. This
understanding will help further in-depth analysis of the gaps in
policy and technology that must be overcome in
order to safeguard user privacy and enable appropriate usage
of SMAs.</p>
      <p>In this paper, we describe a three-step methodology to assess
how privacy is considered in these applications. We begin by
looking at the Data Permissions requested by SMAs. This
includes both mobile data as well as social media data of
the user. We then check whether the SMAs have relevant
Privacy Policies or other related documentation which explain
the collection, usage and purpose of the user data being
collected by them. Then, we qualitatively analyze the privacy
policies and perform a Traceability Analysis, where we evaluate
whether the interfaces provided to users are congruent
with documented policies, in order to assess how transparent data
collection is and whether users have control over the amount
and nature of data being collected.</p>
      <p>We report the results we obtained for 13 popular SMAs
from 3 app stores, showing: a variety in the data accessed,
especially when it comes to mobile data; a partial lack of
privacy policies (5 out of the 13 SMAs do not have privacy
policies); and that a substantial proportion (45%) of SMAs
show Broken traceability between policy documentation and
interface operation whereas Complete traceability is observed
in about 19% of the cases. We also report our experiences
using the methodology, together with lessons learned and
potential future improvements to the methodology.</p>
      <p><sup>3</sup> https://play.google.com/store/apps/details?id=com.friends.socialnetworkingsites</p>
    </sec>
    <sec id="sec-2">
      <title>II. METHODOLOGY</title>
      <p>We begin by listing the various SMAs we have considered
in our research along with their sources. We have surveyed 13
popular SMAs for this research. We studied the 6 most popular
SMAs (in terms of reviews and installs) each from Google
Play Store and iTunes. Additionally, we included a Cydia
SMA to account for the variation between SMAs with different
levels of adoption as well as between different app stores that
have different vetting procedures or policies (e.g., Cydia only
works on rooted iOS devices and does not have a vetting
process in place). The SMAs are listed in Table I with their platform,
the number of times they have been rated and the number of times
they have been downloaded (wherever available)<sup>4</sup>.
Note that the number of reviews was not available for Social
Butter and Social hub, as there were not enough reviews for
iTunes to publish the number.</p>
      <sec id="sec-2-1">
        <title>A. Examining Data Permissions</title>
        <p>The first step of our analysis requires us to identify exactly
which SMAs request permissions to access personal data
from the user. All mobile applications are required to request
permission for the data they access on the user’s phone. We
compare the permissions requested by the 13 SMAs included
in our analysis. It is important to note that an application
requesting permission for a data item does not
mean that it actually accesses that data. However, it does mean that
the data is available to the application with the consent of the user
(demonstrated by granting the access permission while using
the application).</p>
        <p><sup>4</sup> These figures were found from the respective app stores and are accurate
as of 9th February, 2017. Please note that Apple does not publish official
statistics about number of downloads for individual iOS applications so this
information is absent from the table. Statistics for iSocial could not be found
either.</p>
        <p>Most applications have a “permissions screen” which is
shown to the user to communicate the list of data access
permissions requested by the application (refer to Fig. 1).
However, for the analysis, in addition to the permissions
screens, we also looked at the phone settings section for
the individual permissions the applications were using. Both
Android and iOS display the data access permissions for
each application installed on the mobile phone. We also
checked the permissions granted to individual SMAs by using the
“Permissions Manager” application on Android devices. We
examined separately the social network data (such as profile information,
communication, lists, etc.) that is accessed by the SMAs.
This helps us understand exactly what information
each SMA will try to access for each SNS that the user
associates with the SMA. To do this, we created social
media accounts and then authorized the individual SMAs. We
then checked the social media site to see what permissions the
SMA had been granted. The permissions can also be checked
by the user when the SMA is used to log in to a particular
social network account for the first time. Only permissions
which were specified explicitly in either the permission screen
or the phone settings (or seen using “Permissions Manager”
on Android SMAs) were included in our results.</p>
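        <p>The comparison of requested permissions described above can be sketched in a few lines of Python. This is only an illustrative sketch: the SMA names and permission labels are hypothetical placeholders rather than values taken from Table II, and the helper functions are assumptions introduced for this example.</p>
        <preformat><![CDATA[
```python
# Hypothetical permission sets; names and labels are placeholders,
# not values taken from Table II.
requested = {
    "sma_a": {"location", "photos/media", "network"},
    "sma_b": {"location", "photos/media", "network", "contacts", "camera"},
    "sma_c": {"network", "identity"},
}


def common_permissions(requested):
    """Return the permissions requested by every SMA."""
    sets = list(requested.values())
    return set.intersection(*sets) if sets else set()


def extra_permissions(requested, baseline):
    """Return, per SMA, the permissions requested beyond a baseline SMA."""
    base = requested[baseline]
    return {
        name: perms - base
        for name, perms in requested.items()
        if name != baseline
    }


common = common_permissions(requested)
extras = extra_permissions(requested, "sma_a")
```
]]></preformat>
        <p>Choosing a baseline SMA with a small permission set makes it easier to spot applications that may be requesting more than their functionality needs, although a larger set is not by itself evidence of over-collection.</p>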
      </sec>
      <sec id="sec-2-2">
        <title>B. Privacy Policies</title>
        <p>The next step in our analysis was to examine the privacy
policies of the individual SMAs. In some cases, the relevant
document was titled differently (such as “Terms of Service”)
but we refer to all privacy related documentation as privacy
policies for simplicity. The aim of this evaluation was to check
for compliance with distributor vetting policies.</p>
        <p>The 3 app stores included in our research are:
1) Cydia: It does not have an official vetting process for its
applications.
2) iTunes Store: It has a vetting process which reviews all
applications<sup>5</sup>. Personally identifiable information may not
be collected or used without the user’s consent. More
generally, privacy policies are required if an application
stores, shares or uses personal data.
3) Google Play Store: It has a vetting process which looks
at app permissions<sup>6</sup> and outlines the application provider
agreement to protect the privacy and legal rights of
users<sup>7</sup>. If an application accesses registration or personal
information, users must be made aware of this, and an
adequate privacy policy must be provided in compliance
with the law.</p>
        <p><sup>5</sup> https://developer.apple.com/app-store/review/guidelines/</p>
      </sec>
      <sec id="sec-2-3">
        <title>C. Mapping Traceability</title>
        <p>
          Finally, we performed a qualitative analysis of the privacy
related documentation to facilitate the traceability analysis
with transparency and control interface operations. Previous
research has identified a methodology for analysing software
requirements from privacy policies [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Concepts,
categorized as a commitment, privilege or right, are attained from
statements by identifying helping verbs, and used to produce
a set of software requirements. Similarly, we use content
analysis to identify action statements through verbs that we
then categorize into privacy implications, which are split into
categories by way of answering the following questions:
1) What information is collected by the application?
2) What is the purpose of collection?
3) Who can access this information?
4) How long is information retained?
        </p>
        <p>These privacy implications help us in contextualizing the
traceability analysis. In particular, we map the extent to which
application features and controls match expectations set out
to users as data actions in privacy policies or application
interfaces. By measuring the traceability of privacy policy
implications in application content, we can assess the extent
to which data transparency and control are delivered to the
user.</p>
        <p>
          For those applications with privacy policies, information
provided in these documents presents a means of gathering
expectations for this analysis. A method for traceability analysis
of SNS is presented by Anthonysamy et al. [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] where action
statements identified in privacy policies are mapped to those in
interface operations by way of assessing the extent to which
data actions are controllable by users. We applied a similar
methodology to SMAs and extended it to consider mobile
phone data and the transparency of interface operations. In
Anthonysamy’s methodology, privacy implications found in
policies are matched to corresponding operations available
through interfaces during installation and use of the
application. We have defined actions of privacy policies as privacy
implications, and define features and controls of an application
as its operations. Also, and extending upon Anthonysamy’s
methodology, our study aims to identify the traceability of data
privacy implications through interface awareness mechanisms.
Therefore we assess the transparency of data actions through
interface operations, as well as controls.
        </p>
        <p><sup>6</sup> https://support.google.com/googleplay/answer/6014972?hl=enGB&amp;ref topic=6046245</p>
        <p><sup>7</sup> https://play.google.com/about/developer-distribution-agreement.html</p>
        <p>For SMAs with privacy policies, transparency of data usage
is analyzed by mapping information provided in the privacy
policy to that presented through application operations.
Traceability between data actions and the extent to which users can control
each privacy implication is the second aspect for analysis. In
this way we map privacy implications to data transparency and
control operations for SMA applications with privacy policies,
by carrying out the following steps.</p>
        <p>For each privacy implication identified:
1) Identify a corresponding interface operation by matching
terminology of data actions.
2) Assess the transparency of data actions made visible to
the user through interface operations, contrasting data
actions in privacy policies.
3) Assess the extent of user control on data actions through
operations, mapping data visible in the previous step (2)
with control operations.</p>
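        <p>The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the Operation class, the word-overlap matching rule and the example labels (other than the ‘Send usage data’ setting mentioned later in the text) are hypothetical, and the analysis reported here was carried out manually rather than with this code.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class Operation:
    label: str  # text shown to the user in the interface
    kind: str   # "transparency" or "control"


def match_operations(implication_terms, operations):
    """Step 1: keep operations whose label shares a term with the implication."""
    terms = {t.lower() for t in implication_terms}
    return [op for op in operations if terms & set(op.label.lower().split())]


def split_by_kind(matched):
    """Steps 2 and 3: separate transparency operations from control operations."""
    transparency = [op for op in matched if op.kind == "transparency"]
    control = [op for op in matched if op.kind == "control"]
    return transparency, control


# Hypothetical interface operations for a single SMA.
ops = [
    Operation("Allow access to your location", "transparency"),
    Operation("Location services toggle", "control"),
    Operation("Send usage data", "control"),
]
matched = match_operations(["location"], ops)
transparency_ops, control_ops = split_by_kind(matched)
```
]]></preformat>
        <p>Simple word overlap is a crude stand-in for the terminology matching done by a human analyst; it would miss synonyms such as “position” for “location”.</p>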
        <p>
          We measure the extent to which privacy implications are
transparent and controllable through user interfaces against
three main categories (complete, partial and broken) in a
similar way to Anthonysamy et al. [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], but specifying
the categories both for transparency and control:
        </p>
        <p>Complete mappings signify complete transparency of
information presented to the user, through both transparency and
control operations. Information presented to users is
unambiguous; with unmistakable meaning and appropriate detail.
For transparency, complete traceability can be achieved by
providing accurate information to the user through the user
interface. An example is when a user is accurately informed
about all data being accessed by an app through the permission
screen. The control operation is mapped as complete when the
user can regulate this list and can choose to withhold certain
items of information.</p>
        <p>Partial mappings involve ambiguous information provided
in privacy documentation or data operations. For example,
vague terms like ‘personal information’, which are not
explicitly defined, make mapping data operations difficult. Access
permissions are partial data operations because they do not
inform users of all data collected. Hootsuite collects location
and traffic data, much like most other applications. Although
the user is prompted for permission regarding location access,
the application does not provide the user with any information
on the collection of traffic data. Control over a privacy implication
is found to be partial when some control is
provided but not all data collected have associated controls.
Taking Everypost as another example, we find that partial control
operations are evident for the traffic data collected. Everypost’s
privacy policy<sup>8</sup> states that cookies used by third parties may
be opted out of, as is apparent through interface operations.
However, collection of traffic data for internal usage such as
analytics does not match any control operations.</p>
        <p>Broken mappings occur when there is a disconnect between
privacy implication expectations and application operations.
Control operation mappings are broken when documented
expectations and/or data transparency operations do not have
a matching control. Detachment from policy expectations is
apparent among privacy implications such as advertising and
aggregation. These purposes for data collection are expressed
in privacy policies but no corresponding information is
provided through application data or control operations. Likewise,
implications of age restriction concerning data retention are
expressed in policies but are disconnected from interface operations.</p>
        <p><sup>8</sup> http://everypost.me/privacy-policy/</p>
        <p>There are many cases in which there is an absence of a
clear traceability mapping between privacy implications and
interface operations. We have classified these applications as
Unknown and represented them in our analysis.</p>
        <p>Apart from the above 4 classifications, there are some
cases where the privacy implication was not applicable to a
particular SMA. In such cases, we have represented this as
N/A in our analysis. The detailed results of our analysis are
presented in Section III-C.</p>
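        <p>One possible way to operationalize the five ratings is sketched below. The rule used here (compare the set of data items a policy documents with the set the interface discloses or lets the user control) is an assumption made for illustration; the ratings reported in this paper come from qualitative judgement, not from this function. The names Traceability and classify are hypothetical.</p>
        <preformat><![CDATA[
```python
from enum import Enum


class Traceability(Enum):
    COMPLETE = "complete"
    PARTIAL = "partial"
    BROKEN = "broken"
    UNKNOWN = "unknown"
    NOT_APPLICABLE = "n/a"


def classify(documented, surfaced, applicable=True):
    """Rate one privacy implication for one SMA.

    documented: data items the privacy policy mentions, or None/empty
                when no policy statement is available.
    surfaced:   data items the interface discloses (transparency) or
                lets the user regulate (control).
    """
    if not applicable:
        return Traceability.NOT_APPLICABLE
    if not documented:
        # Nothing to trace against, so the mapping cannot be determined.
        return Traceability.UNKNOWN
    covered = set(documented) & set(surfaced)
    if covered == set(documented):
        return Traceability.COMPLETE
    if covered:
        return Traceability.PARTIAL
    return Traceability.BROKEN
```
]]></preformat>
        <p>For instance, the Hootsuite case described above, where location access is prompted but traffic data collection is not disclosed, would be rated partial by calling classify({"location", "traffic data"}, {"location"}).</p>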
      </sec>
    </sec>
    <sec id="sec-3">
      <title>III. RESULTS</title>
      <sec id="sec-3-1">
        <title>A. Data Access Permissions</title>
      </sec>
      <sec id="sec-3-2">
        <title>1) Mobile Data Access Permissions</title>
        <p>
          As can be seen from the results in Table II, most applications require access to
photos/media, location, network access, and identity (which refers to any user
accounts on the phone accessed by the application).
In addition, many applications require access to
USB storage. These findings confirm that personal data
of the user is accessed by most of the applications that were
analyzed. An interesting observation is that permissions seem
consistent for the same SMA developers across app stores.
However, for different SMAs we observe a wide variety in
the mobile data being accessed. While this could be attributed
to different functionality being provided, it may also be a sign
of some SMAs asking for more permissions than required [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ],
as arguably one of the most mature and widely used SMAs (Hootsuite)
seems to use a relatively smaller set of permissions when
compared to other SMAs. An interesting case is that of Social
Media all in one, which seems to access everything except
Identity (which could be retrieved from the SNSs accessed
anyway).
        </p>
        <p>2) Social Media Data Access Permissions: SMAs are
different from other mobile applications as they can access a
user’s social media data as well. We have summarized the data
permissions requested by SMAs while a user logs into their
social media accounts in Table III. We have used general terms
such as “Activity” and “Lists” in this table to simply convey
the meaning as each social media site uses different names for
such features. For example, “posts” on Facebook and “tweets”
on Twitter as well as inbox messages are classified under
“Activity”. Similarly, “Lists” refers to groups or lists that the
user might have created (or used by default) to organize their
contacts on various social media sites.</p>
        <p>Table III shows that 5 SMAs, namely iSocial,
Social Networking All in One, Social Media all in one, Social
Media and Social Media Vault, are marked with a ‘ * ’
sign and are shown to access all social media data. This
highlights the fact that these applications do not disclose
what social media data they access in order to function, as they just
provide an interface either to the social media apps (such as
Facebook, Twitter) already installed on the user’s phone or
to the web link of the social network via the web browser.
As all the social media activity goes via these applications,
they have the potential to access all communication. Moreover,
these applications do not need to be authorized by the
user with their Facebook account, so the user cannot regulate
the permissions by logging into their Facebook account as
is possible with other Facebook applications. For the other
SMAs, we find that many of them access almost all social
media activity: posting on walls/tweeting, accessing the
friend or contact lists, updating the profile on the users’ behalf,
posting on their behalf, and accessing inbox messages or the email
ID which was used to create the account. All of
this information may be classified as personal and sensitive to
the user, and we find that most applications that disclose their
permissions access this information.</p>
      </sec>
      <sec id="sec-3-3">
        <title>B. Application Privacy Policies</title>
        <p>Applications that collect personally identifiable information
are required to produce a privacy policy in order to comply
with the previously discussed distributor vetting policies. Table
IV shows that 8 out of the 13 SMAs that we evaluated were
found to include this documentation. The lack of privacy
policies among the other 5 SMAs seems to suggest a
violation of the distributor vetting policies which mandate such
documentation for all applications which process personal data
from users. We did find in Table II that the SMAs without
a privacy policy do not access “Identity”, so technically
they may argue they do not access personally identifying
information. However, they are found to be able to access
most of the social media data, photos, location, etc., which
can be classified as personal information.</p>
      </sec>
      <sec id="sec-3-4">
        <title>C. Traceability for Transparency and Control</title>
        <p>Common data actions have been categorized to form 14
privacy implications, seen in the left column of Table V.
Privacy implications fall under further categories by way
of answering our privacy questions set out in Section II-C:
collection, purpose, access and retention of data. Operations
refer to features provided by SMA providers or distributors
which inform us of data collection and use, as well as
providing us with control over data actions. Each symbol in
the table provides a mapping to the degree of traceability
offered by transparency and control operations respectively.
Data operations refer to the extent to which transparency of
data actions is presented to the user through interfaces; these
include access permission prompts and other mechanisms
which detail privacy implications. Control operations refer to
features and mechanisms presented through interfaces which
enable control over some data action; these include device
settings, accept/decline button options, etc. If the same degree
is found for both transparency and control operations,
then only one symbol is shown. If
a different degree of traceability is found, the first symbol in
the particular cell of the table corresponds to transparency
operations and the second symbol corresponds to control
operations. In the resulting table, we refer to content as the
social media data collected shown in Table III. Other privacy
implications and results will be further explained and justified
in the following subsections.</p>
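        <p>The cell convention just described (one symbol when transparency and control share the same degree, two symbols otherwise) can be illustrated with a short sketch. The single-character symbols used here are placeholders chosen for this example and are not the symbols used in Table V.</p>
        <preformat><![CDATA[
```python
# Placeholder symbols; the actual symbols of Table V are not reproduced here.
SYMBOLS = {
    "complete": "C",
    "partial": "P",
    "broken": "B",
    "unknown": "?",
    "n/a": "-",
}


def render_cell(transparency, control):
    """Render one table cell from the two traceability degrees."""
    t, c = SYMBOLS[transparency], SYMBOLS[control]
    # Same degree: a single symbol. Otherwise: transparency first, control second.
    return t if t == c else f"{t} {c}"
```
]]></preformat>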
        <p>1) Complete: All SMAs provide control over some data
collection through access permissions. iSocial does not specify
any such method of informing the user of data collected
through the requirement to accept access permissions. iSocial’s
terms and conditions specify privacy implications: “Any site
registration information is used only by the website and is
not sold or given out to others”; likewise, users may provide
an email address for the service provider to provide support.
Complete transparency for collection can be found when an
SMA communicates the data it is going to access to the user
through the interface operations. Fig. 2a shows Hootsuite’s
permissions screen, which tells the user about the social media
data that will be accessed by it. Complete traceability mapping
for control operations occurs when a user can regulate the access
permissions through interface operations (such as Fig. 2b,
which shows Hootsuite for iOS).</p>
        <p>[Table: the SMA column lists iSocial; Hootsuite; Buffer; Social Networking all in one; Social Media all in one; Everypost; Social Media; Hootsuite; Buffer; Everypost; Social Media Vault; Social butter; Social hub. The remaining columns and the key were not recovered.]</p>
        <p>Users have control over content provided for use by
services, through accepting access permissions and the posting
of information. When users intentionally share information with an SNS,
they share it with these third parties, so the
transparency of third party access is completely apparent to the
user in this case. Some applications offer settings which give
the user a level of control over who accesses information
posted to SNS, and allow the restriction of data access to particular
accounts. Controls offered are as found on common SNS:
share with only friends or everyone. Asset transfer refers
to personally identifiable information being transferred as
businesses buy and sell assets.</p>
        <p>Fig. 2. (a) Notification of Social Media data access by Hootsuite. (b) iOS device settings which enable users to restrict access permissions.</p>
        <p>2) Partial: The transparency of privacy implications
through access permissions maps only partially to expectations
provided by SMA privacy policies. An example
is content collection, which is only partially made visible and controllable to
the user. SMAs with privacy policies commonly state their
right to collect all information provided to the site, including
information shared with associated SNS. Google Play’s Hootsuite provides
a ‘Send usage data’ setting; the user is informed that anonymous
data is collected and used to help improve Hootsuite.
Partial transparency and control over internal use is apparent,
with an ambiguous description of collection and purpose, along
with control over ‘anonymous data’ but no matching control
for all data collected as specified in the privacy policy, such
as content posted.</p>
        <p>3) Broken: Internal use of data includes analytics used
to improve or better understand services. It is common for
servers to automatically collect usage information: “Server
logs may include such information as a mobile device
identification number and device identifier, web requests, IP
address, browser type, browser language, referring/exit pages
and URLs, platform type, number of clicks, domain names,
search terms, landing pages ...”. This
type of information is referred to as the traffic data
privacy implication, and may be shared with third parties on
an aggregate basis for advertising and analytic purposes.
Both transparency and control for this example are
broken in most SMAs, leaving users unaware, in their normal
use of the interface, that this data is collected, and
without any way of controlling it.</p>
        <p>4) Unknown: Our traceability mapping of data use
as specified in privacy policies has shown us not to expect
applications to inform users about the passive collection of
non-identifiable information. We are aware that providers are
likely to use and share traffic or aggregate data with third
parties for the purposes of analytics and advertising. We are
unable to determine whether an application without a privacy
policy passively collects such non-identifiable information.
Therefore, for some SMAs, data disclosure to third parties by
the provider is shown as unknown.</p>
        <p>5) Summary: Table VI summarizes our results, presenting
rounded percentages of privacy implications found to be
complete, partial, broken, unknown or not applicable. We provide a
breakdown for each of the 3 app stores. The overall traceability
of transparency and control are also provided.</p>
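        <p>The rounded percentages in such a summary can be derived from the individual ratings as sketched below. The store names and rating lists are hypothetical example inputs and do not reproduce the values in Table VI; the function names are assumptions introduced for this sketch.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def summarize(ratings):
    """Return the rounded percentage of each rating category."""
    total = len(ratings)
    if total == 0:
        return {}
    counts = Counter(ratings)
    return {category: round(100 * n / total) for category, n in counts.items()}


def summarize_by_store(rows):
    """Group (store, rating) pairs by store and summarize each group."""
    by_store = {}
    for store, rating in rows:
        by_store.setdefault(store, []).append(rating)
    return {store: summarize(ratings) for store, ratings in by_store.items()}


# Hypothetical ratings, not the values behind Table VI.
example = ["broken"] * 4 + ["partial"] * 3 + ["complete"] * 2 + ["unknown"]
overall = summarize(example)
per_store = summarize_by_store([
    ("store_x", "broken"),
    ("store_x", "complete"),
    ("store_y", "partial"),
])
```
]]></preformat>
        <p>Because each percentage is rounded independently, the figures for one store need not sum to exactly 100.</p>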
        <p>We find a general lack of transparency across SMAs with
45 percent of SMAs revealing broken transparency mappings.
Privacy implications offering complete transparency of data
involve collection of personal information made visible to the
user in some way (e.g. showing the access
permissions required). For current guidelines for
user privacy to be considered adequate, there must be no mismatch between
users’ expectations and the reality of how SMAs treat their
information; users must be made aware, either through privacy
policies or through other awareness mechanisms, of any data
collected, how it will be used, whom it will be shared with,
and how long it will be retained.</p>
        <p>
          We also find that users lack control, as less
than a quarter of the results indicated complete control over
privacy implications. In order to give more control to users,
developers could work to increase application functionality
while restricting access to data. Settings should enable control
over all data collected, including information perceived as
non-identifiable. Research has shown that pragmatic approaches to
privacy-related intervention, where users are shown
the effect of exposures of their data, work well [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>IV. DISCUSSION AND LESSONS LEARNED</title>
      <p>In this paper, we inspected how SMAs handle privacy
and looked at it from three different angles. Evaluating the
permissions requested by the SMAs was fairly straightforward.
The SMAs communicate permissions to the user directly
and the user also has the opportunity to verify social media
permissions by checking their social media account and
authorizing the applications. While there are many tools that enable
the user to automatically check the mobile data permissions
requested by apps, checking of social media permissions
is slightly more complex. The process may potentially be
automated by simulating an authorization of the SMA to a
dummy social media account (like a “guest” account, possibly
built-in to the SMA), to reveal the permissions to the user,
before they use the SMA with their own social media account.
The larger problem here seems to be the lack of understanding
that users have about the permissions requested by mobile
applications. Greater awareness is desirable where users are
informed about the implications of the permissions they are
granting.</p>
      <p>
        While looking for privacy-related documentation, we found
a fair degree of ambiguity. Not only do different application
providers have different names for such documentation
(“privacy policy”, “terms of service”, etc.), but the content of
these documents is also inconsistent. This
inconsistency makes it difficult for users to form expectations
about what they should be looking for
in order to educate themselves about the privacy implications
of using a particular app. Moreover, we found 5 SMAs which
do not provide this documentation at all. This is, as pointed
out earlier, in clear disagreement with the vetting policies of
both the Google Play and iTunes app stores. A possible mitigation
may be found in automated solutions like “AutoPPG”, which is
an automatic privacy policy generator for Android applications
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. It identifies the important privacy issues arising
from the usage of the application by conducting a static
analysis of the application’s source code. Automated solutions
such as these may enable the development of a consistent structure
and terminology in such privacy policies, which would make
traceability analysis easier. Furthermore, such mechanisms
may also encourage SMA and other application developers
to include privacy policies without much additional effort.
      </p>
      <p>The qualitative analysis of privacy policies and of their
traceability to interface controls was a comparatively less
objective part of our methodology. Such analysis is harder due
to the inconsistencies in privacy-related documentation
across apps, as mentioned earlier. Moreover, the interfaces
of individual SMAs offer different operations, which
necessitates a case-by-case analysis. This is the most costly
part of the methodology in terms of time and effort. It is
possible to automate the traceability analysis if the privacy
documentation is standardized and the privacy implications are
clearly defined. This is an interesting direction for future
research, in which such an automatic traceability
analysis might be used to certify SMAs. Any such efforts can
rely on the analysis methodology shown by similar work in
the area of social media sites and on the work done in this
paper.</p>
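      <p>As a purely illustrative sketch of what an automated traceability check could look like once privacy implications are clearly defined, the Python fragment below labels each policy implication according to whether a matching interface operation exists. Everything in it is an assumption made for illustration rather than the procedure used in this paper: the (action, data) encoding of implications, the "transparent" and "controllable" flags, the function name, and the decision rule itself, including the intermediate "Partial" label.</p>
      <preformat>
```python
def classify_traceability(implications, operations):
    """Label each policy implication by how well the interface covers it.

    ASSUMPTION: the rule below is a hypothetical stand-in, not the
    paper's procedure. An implication with no matching operation is
    "Broken"; one whose operation is both visible to and controllable
    by the user is "Complete"; anything in between is "Partial".
    """
    labels = {}
    for implication in implications:
        operation = operations.get(implication)
        if operation is None:
            labels[implication] = "Broken"
        elif operation["transparent"] and operation["controllable"]:
            labels[implication] = "Complete"
        else:
            labels[implication] = "Partial"
    return labels


# Hypothetical inputs: implications extracted from a standardized policy
# and operations recorded while walking through an app's interface.
implications = [
    ("collect", "location"),
    ("share", "contacts"),
    ("retain", "posts"),
]
operations = {
    ("collect", "location"): {"transparent": True, "controllable": True},
    ("share", "contacts"): {"transparent": True, "controllable": False},
}

for implication, label in classify_traceability(implications, operations).items():
    print(implication, label)
```
      </preformat>
      <p>Keying both inputs on the same (action, data) pairs is what makes the lookup trivial; in practice, agreeing on that shared vocabulary is the standardization step on which such automation depends.</p>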
      <p>The methodology proposed in this paper may also be
extended to other apps which allow users
to link their social media accounts (such as gaming
apps). It would be interesting to see whether the problems
highlighted in this paper are specific to SMAs or whether
other similar apps, which let users post to multiple social
media accounts, exhibit similarly low traceability. Future
attempts at using this methodology may consider using multiple
researchers to conduct the traceability analysis and either adopt
a consensus-based approach or report inter-rater reliability
between the researchers. This would potentially enhance
the objectivity of the traceability analysis.</p>
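      <p>One common way to report inter-rater reliability between two researchers is Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below is a minimal Python illustration under stated assumptions: the choice of this statistic, the function name, and the example labels are ours and are not results from this study.</p>
      <preformat>
```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1.0 - expected)


# Hypothetical traceability labels from two researchers (placeholders only).
researcher_1 = ["Complete", "Broken", "Broken", "Partial"]
researcher_2 = ["Complete", "Broken", "Partial", "Partial"]
print(round(cohen_kappa(researcher_1, researcher_2), 3))
```
      </preformat>
      <p>A value near 1 indicates agreement well beyond chance, while a value near 0 indicates agreement no better than chance; with more than two researchers, a multi-rater statistic would be needed instead.</p>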
    </sec>
    <sec id="sec-5">
      <title>V. RELATED WORK</title>
      <sec id="sec-5-1">
        <title>A. Analysis of Mobile Data Access Permissions</title>
        <p>
          Mobile applications are generally explicit in disclosing to users the
data access permissions they require. A screen is typically
shown to the user at the time
of installation, listing the data that the
application will be allowed to access. The major issue is the
“all or nothing” nature of mobile applications [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. The user must
grant the requested permissions to the application
in order to use it. This is a problem because
mobile applications have been shown to introduce risk vectors by asking for
more permissions than they require [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Applications
are somewhat hamstrung in this regard, since they have
to request every permission that they envisage using at any time
during execution. Some solutions have been proposed
to detect, and possibly prevent, malicious mobile applications
by using anomaly detection to identify applications whose
behavior deviates from what is normally
expected [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. The idea is to use static analysis to create profiles
of applications’ expected behavior and to detect anomalies at
runtime in order to secure mobile applications. This is similar to the work
of Hussain et al., which looks at detecting malicious database
applications [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Another proposed approach, “PrivacyGuard”,
uses the VPN service of Android devices to intercept the network
traffic of mobile applications and detect information leakage
[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. It also provides mechanisms for tricking malicious
applications by manipulating the leaked information. We found
that most of the previous work in this area looks only at the
leakage of mobile data and not of the social media data which SMAs
can also access.
        </p>
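        <p>The two ideas discussed above, over-requested permissions and deviation from an expected behavior profile, both reduce to comparing what an application declares with what it actually does. The Python sketch below shows that comparison as a simple set difference. It is an assumption-laden illustration, not the mechanism of any cited system: the function names, the action labels, and the input lists are hypothetical.</p>
        <preformat>
```python
def surplus_permissions(requested, exercised):
    """Permissions an app requests but is never seen to exercise."""
    return sorted(set(requested) - set(exercised))


def unexpected_actions(expected_profile, observed_actions):
    """Runtime actions that fall outside the statically derived profile."""
    return sorted(set(observed_actions) - set(expected_profile))


# Illustrative inputs only; these are not measurements from any real app.
requested = ["ACCESS_FINE_LOCATION", "CAMERA", "INTERNET", "READ_CONTACTS"]
exercised = ["CAMERA", "INTERNET"]
print(surplus_permissions(requested, exercised))

# Hypothetical action labels standing in for a behavior profile.
expected_profile = ["open_camera", "send_https_request"]
observed_actions = ["open_camera", "read_contacts", "send_https_request"]
print(unexpected_actions(expected_profile, observed_actions))
```
        </preformat>
        <p>An empty result from either function means nothing was flagged for the given inputs; it does not show that an application is safe, since both checks are only as complete as the observations fed into them.</p>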
      </sec>
      <sec id="sec-5-2">
        <title>B. Analysis of Privacy Policy Traceability</title>
        <p>
          Previous work shows that control over
data disclosure can affect decisions made by users [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
Greater transparency about the data being shared often acts as a
mitigating factor against erroneous decisions. Our
work examines traceability for transparency and control by
looking at the interface operations and how closely they match
the privacy policies. Qualitative analysis of documented
policies and of their traceability to interface features is
an extensively researched topic in software engineering. More
recently, this technique has been used to analyze whether
the privacy policies outlined by SNSs are congruent with
the interface controls provided to the users. Anthonysamy et
al. demonstrated that SNSs themselves suffer from a lack of
traceability between data actions defined in privacy policies
and corresponding data operations apparent to users through
interfaces [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Our work extends this methodology
to SMAs by
analyzing the mobile phone data and social media data
accessed by the SMAs, in addition to a traceability mapping
which considers the transparency of interface operations and
the control provided to the user.
        </p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>VI. CONCLUSIONS</title>
      <p>In this paper, we described a three-step methodology to
examine the privacy issues posed by SMAs: examining the
data permissions (both mobile and social media) they request,
checking whether they provide the user with
privacy-related documentation, and analyzing traceability between
the privacy implications identified in the privacy policy and
the interface operations provided to the user. We used this
methodology to evaluate 13 popular Social Media Aggregators
(SMAs) from 3 app stores and found that the majority of
the SMAs we evaluated accessed users’ personal information,
including their social media activity. We also found
that 5 of the 13 SMAs did not provide any privacy-related
documentation, which is in clear conflict with the vetting
policies of the app stores. Our results show that 45% of SMAs
exhibit Broken traceability between privacy documentation and
interface operations, while Complete traceability is observed
in only 19% of the cases. These results highlight the need for
major improvements to ensure that the usage of SMAs does
not compromise user privacy. The methodology described in
this paper can be reused for further investigation of SMAs or
be extended, with certain improvements, to examine similar
applications which enable the user to link their social media
activity.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Young</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Antón</surname>
          </string-name>
          et al.,
          <article-title>“A method for identifying software requirements based on policy commitments</article-title>
          ,”
          <source>in Requirements Engineering Conference (RE), 2010 18th IEEE International. IEEE</source>
          ,
          <year>2010</year>
          , pp.
          <fpage>47</fpage>
          -
          <lpage>56</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Anthonysamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Greenwood</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Rashid</surname>
          </string-name>
          , “
          <article-title>Social networking privacy: Understanding the disconnect from policy to controls</article-title>
          ,”
          <source>Computer</source>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>60</fpage>
          -
          <lpage>67</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P. H.</given-names>
            <surname>Chia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yamamoto</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Asokan</surname>
          </string-name>
          , “
          <article-title>Is this app safe?: a large scale study on application permissions and risk signals</article-title>
          ,”
          <source>in Proceedings of the 21st international conference on World Wide Web. ACM</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>311</fpage>
          -
          <lpage>320</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kapadia</surname>
          </string-name>
          and
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Lee</surname>
          </string-name>
          , “
          <article-title>Improving privacy through exposure awareness and reactive mechanisms</article-title>
          ,”
          <source>in CHI 2016 Workshop on Bridging the Gap between Privacy by Design and Privacy in Practice. ACM</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Luo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Xue</surname>
          </string-name>
          , “
          <article-title>AutoPPG: Towards automatic generation of privacy policy for Android applications</article-title>
          ,”
          <source>in Proceedings of the 5th Annual ACM CCS Workshop on Security and Privacy in Smartphones and Mobile Devices. ACM</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>39</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Amini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. I.</given-names>
            <surname>Hong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lindqvist</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , “
          <article-title>Expectation and purpose: understanding users' mental models of mobile app privacy through crowdsourcing</article-title>
          ,”
          <source>in Proceedings of the 2012 ACM Conference on Ubiquitous Computing. ACM</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>501</fpage>
          -
          <lpage>510</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E.</given-names>
            <surname>Bertino</surname>
          </string-name>
          , “
          <article-title>Securing mobile applications</article-title>
          ,”
          <source>Computer</source>
          , vol.
          <volume>49</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>9</fpage>
          -
          <lpage>9</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Hussain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Sallam</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Bertino</surname>
          </string-name>
          , “
          <article-title>DetAnom: Detecting anomalous database transactions by insiders</article-title>
          ,”
          <source>in Proceedings of the 5th ACM Conference on Data and Application Security and Privacy. ACM</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>25</fpage>
          -
          <lpage>35</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Song</surname>
          </string-name>
          and
          <string-name>
            <given-names>U.</given-names>
            <surname>Hengartner</surname>
          </string-name>
          , “
          <article-title>PrivacyGuard: A VPN-based platform to detect information leakage on Android devices</article-title>
          ,”
          <source>in Proc. of the 5th Annual ACM CCS Workshop on Security and Privacy in Smartphones and Mobile Devices. ACM</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>15</fpage>
          -
          <lpage>26</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Patil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Schlegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kapadia</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Lee</surname>
          </string-name>
          , “
          <article-title>Reflection or action?: How feedback and control affect location sharing decisions</article-title>
          ,”
          <source>in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>101</fpage>
          -
          <lpage>110</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P.</given-names>
            <surname>Anthonysamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Greenwood</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Rashid</surname>
          </string-name>
          , “
          <article-title>A method for analysing traceability between privacy policies and privacy controls of online social networks</article-title>
          ,”
          <source>in Privacy Technologies and Policy. Springer</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>187</fpage>
          -
          <lpage>202</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>