Assessing Privacy in Social Media Aggregators

Gaurav Misra, Security Lancaster, School of Computing and Communications, Lancaster University, UK. Email: g.misra@lancaster.ac.uk
Jose M. Such, Department of Informatics, King's College London, UK. Email: jose.such@kcl.ac.uk
Lauren Gill, School of Computing and Communications, Lancaster University, UK. Email: l.gill2@lancaster.ac.uk

Abstract: Social Media Aggregator (SMA) applications present a platform enabling users to manage multiple Social Networking Sites (SNS) in one convenient application, which results in a unique concentration of data from several SNS accounts in addition to the user's mobile phone data available to them. We describe a three-step methodology to assess how privacy is considered in these applications: 1) we inspect the mobile data and social media data; 2) we study any privacy policies and their compliance with respect to distributor's vetting policies; and 3) we perform a qualitative assessment of traceability between privacy policies and the actual transparency and control mechanisms offered to users by the apps' interfaces. We then present the results we obtained for 13 popular SMAs from 3 app stores, showing a variation in data accessed by the individual applications, an absence of privacy policies for 5 of the SMAs evaluated, and a lack of traceability between privacy policies and transparency and control of interface operations. After this, we report our experiences using the methodology and the lessons learned, together with potential future work to improve the methodology and its potential to also assess privacy in other mobile applications that also connect with social media.

Index Terms: Social Media Aggregators, Social Media Privacy

I. INTRODUCTION

It is evident that our engagement with Social Networking Sites (SNS) is becoming ever more ingrained in our daily lives. This has been, in part, facilitated by the spectacular growth of mobile social networking, which has a worldwide penetration of 23% (1.7 billion). This proliferation of mobile devices has enabled users to access social media accounts with more ease and convenience. This is demonstrated by the huge surge in usage of social applications on mobile platforms, to the extent that an estimated 80% of time spent on social media is spent using mobile applications.¹

This shift towards the mobile platform for social media activity has led to the development of Social Media Aggregators (SMAs) which enable users to access all of their social media accounts from a single application. This is partly driven by the fact that users are often found to have accounts on multiple Social Networking Sites (SNSs).² It can be quite attractive to users to use SMAs, a single application for all social media accounts, compared to installing separate applications for all their social media sites. An additional attraction of installing a single SMA replacing all social media applications is also related to better utilization of the often limited resources (RAM, CPU power and battery) of the mobile phone itself. Indeed, many SMAs clearly convey this to potential customers as an advantage and a selling point.³

¹ http://marketingland.com/facebook-usage-accounts-1-5-minutes-spent-mobile-171561
² http://www.pewinternet.org/2013/12/30/social-media-update-2013/
³ https://play.google.com/store/apps/details?id=com.friends.socialnetworkingsites

While it is clear that SMAs can be beneficial for users, they also potentially introduce severe privacy risks for users. Users are meant to use SMAs to combine multiple social media accounts, and all the activity is routed through a single SMA. This is different from using separate applications for different social media accounts as a user's Facebook application, for example, cannot access their Twitter activity unless an explicit link is made by the user. Such a link between various social media profiles is implicit in the case of SMAs. Moreover, this information about social media activity is augmented with mobile device data such as GPS location, contact lists, camera, etc. Given this potential threat to the privacy of social media users, it is essential to take a closer look at the transparency and control mechanisms offered by these applications. This understanding will help further in-depth analysis of gaps in policy and technology which are required to be overcome in order to safeguard user privacy and enable appropriate usage of SMAs.

In this paper, we describe a three-step methodology to assess how privacy is considered in these applications. We begin by looking at the Data Permissions requested by SMAs. This includes both mobile data as well as social media data of the user. We then check whether the SMAs have relevant Privacy Policies or other related documentation which explain the collection, usage and purpose of the user data being collected by them. Then, we qualitatively analyze the privacy policies and perform a Traceability Analysis where we evaluate whether the interfaces provided to users are congruent with documented policies, to evaluate how transparent data collection is and whether users have control over the amount and nature of data being collected.

We report the results we obtained for 13 popular SMAs from 3 app stores, showing: a variety in the data accessed, especially when it comes to mobile data; a partial lack of privacy policies (5 out of the 13 SMAs do not have privacy policies); and that a substantial proportion (45%) of SMAs show Broken traceability between policy documentation and interface operation whereas Complete traceability is observed in about 19% of the cases. We also report our experiences using the methodology, together with lessons learned and potential future improvements to the methodology.

II. METHODOLOGY

We begin by listing the various SMAs we have considered in our research along with their sources. We have surveyed 13 popular SMAs for this research. We studied the 6 most popular SMAs (in terms of reviews and installs) each from Google Play Store and iTunes. Additionally, we included a Cydia SMA to account for the variation between SMAs with different levels of adoption as well as between different app stores that have different vetting procedures or policies (e.g., Cydia only works on rooted iOS devices and does not have a vetting process in place). The SMAs are listed with their platform, number of times they have been rated and the number of times they have been downloaded (wherever available)⁴ in Table I. Note that the number of reviews was not available for Social butter and Social hub as there were not enough reviews for iTunes to publish the number.

⁴ These figures were found from the respective app stores and are accurate as of 9th February, 2017. Please note that Apple does not publish official statistics about number of downloads for individual iOS applications so this information is absent from the table. Statistics for iSocial could not be found as well.

TABLE I: The 13 SMAs evaluated, the app stores they belong to, number of reviews and downloads when available.

SMA | Platform | No. of Reviews | Installs
iSocial | Cydia | - | -
Hootsuite | Google Play | 80760 | 1000k - 5000k
Buffer | Google Play | 24948 | 500k - 1000k
Social Networking all in one | Google Play | 18336 | 1000k - 5000k
Social Media all in one | Google Play | 11106 | 1000k - 5000k
Everypost | Google Play | 4502 | 100k - 500k
Social Media | Google Play | 1392 | 100k - 500k
Hootsuite | iTunes | 4865 | -
Buffer | iTunes | 1150 | -
Everypost | iTunes | 138 | -
Social Media Vault | iTunes | 12 | -
Social butter | iTunes | N/A | -
Social hub | iTunes | N/A | -

A. Examining Data Permissions

The first step of our analysis requires us to identify exactly which SMAs request permissions to access personal data from the user. All mobile applications are required to request permission for the data they access on the user's phone. We compare the permissions requested by the 13 SMAs included in our analysis. It is important to note here that an application asking for permission to access data does not mean it is actually accessing that data. However, it means that this data is available to it with the consent of the user (demonstrated by granting the access permission while using the application).

Most applications have a "permissions screen" which is shown to the user to communicate the list of data access permissions requested by the application (refer to Fig. 1). However, for the analysis, in addition to the permissions screens, we also looked at the phone settings section for the individual permissions the applications were using. Both Android and iOS display the data access permissions for each application installed on the mobile phone. We also checked the permissions granted to individual SMAs by using the "Permissions Manager" application on Android devices.

Fig. 1: Mobile data access permissions required by Hootsuite on an Android device

We examine the social network data (such as profile information, communication, lists, etc.) accessed by the SMAs separately. This helps us understand exactly what information each SMA will try to access for each SNS the user associates with the SMA. To look at this, we created social media accounts and then authorized the individual SMAs. We then checked the social media site to see what permissions the SMA had been granted. The permissions can also be checked by the user when the SMA is used to log in to a particular social network account for the first time. Only permissions which were specified explicitly in either the permission screen or the phone settings (or seen using "Permissions Manager" on Android SMAs) were included in our results.

B. Privacy Policies

The next step in our analysis was to examine the privacy policies of the individual SMAs. In some cases, the relevant document was titled differently (such as "Terms of Service") but we refer to all privacy related documentation as privacy policies for simplicity. The aim of this evaluation was to check compliance with distributor vetting policies.

The 3 app stores included in our research are:

1) Cydia: It does not have an official vetting process for its applications.
2) iTunes Store: It has a vetting process which reviews all applications.⁵ Personally identifiable information may not be collected or used without the user's consent. More generally, privacy policies are required if an application stores, shares or uses personal data.
3) Google Play Store: It has a vetting process which looks at app permissions⁶ and outlines the application provider agreement to protect the privacy and legal rights of users.⁷ If an application accesses registration or personal information, users must be made aware of this, and an adequate privacy policy must be provided in compliance with the law.

⁵ https://developer.apple.com/app-store/review/guidelines/
⁶ https://support.google.com/googleplay/answer/6014972?hl=en-GB&ref_topic=6046245
⁷ https://play.google.com/about/developer-distribution-agreement.html

C. Mapping Traceability

Finally, we performed a qualitative analysis of the privacy related documentation to facilitate the traceability analysis with transparency and control interface operations. Previous research has identified a methodology for analysing software requirements from privacy policies [1]. Concepts, categorized as a commitment, privilege or right, are attained from statements by identifying helping verbs, and used to produce a set of software requirements. Similarly, we use content analysis to identify action statements through verbs that we then categorize into privacy implications, which are split into categories by way of answering the following questions:

1) What information is collected by the application?
2) What is the purpose of collection?
3) Who can access this information?
4) How long is information retained?

These privacy implications help us in contextualizing the traceability analysis. In particular, we map the extent to which application features and controls match expectations set out to users as data actions in privacy policies or application interfaces. By measuring the traceability of privacy policy implications in application content, we can assess the extent to which data transparency and control are delivered to the user.

For those applications with privacy policies, information provided in these documents presents a means of gathering expectations for this analysis. A method for traceability analysis of SNS is presented by Anthonysamy et al. [2] where action statements identified in privacy policies are mapped to those in interface operations by way of assessing the extent to which data actions are controllable by users. We applied a similar methodology to SMAs and extended it to consider mobile phone data and the transparency of interface operations. In Anthonysamy's methodology, privacy implications found in policies are matched to corresponding operations available through interfaces during installation and use of the application. We have defined actions of privacy policies as privacy implications, and define features and controls of an application as its operations. Also, and extending upon Anthonysamy's methodology, our study aims to identify the traceability of data privacy implications through interface awareness mechanisms. Therefore we assess the transparency of data actions through interface operations, as well as controls.

For SMAs with privacy policies, transparency of data usage is analyzed, mapping information provided in the privacy policy to that presented through application operations. Traceability between data actions and the extent to which we control each privacy implication is the second aspect for analysis. In this way we map privacy implications to data transparency and control operations for SMA applications with privacy policies, by carrying out the following steps.

For each privacy implication identified:

1) Identify a corresponding interface operation by matching terminology of data actions.
2) Assess the transparency of data actions made visible to the user through interface operations, contrasting data actions in privacy policies.
3) Assess the extent of user control on data actions through operations, mapping data visible in the previous step (2) with control operations.

We measure the extent to which privacy implications are transparent and controllable through user interfaces against three main categories: complete, partial and broken, in a similar way as in Anthonysamy et al. [2], but specifying the categories both for transparency and control:

Complete mappings signify complete transparency of information presented to the user, through both transparency and control operations. Information presented to users is unambiguous, with unmistakable meaning and appropriate detail. For transparency, complete traceability can be achieved by providing accurate information to the user through the user interface. An example is when a user is accurately informed about all data being accessed by an app through the permission screen. The control operation is mapped as complete when the user can regulate this list and can choose to withhold certain items of information.

Partial mappings involve ambiguous information provided in privacy documentation or data operations. For example, vague terms like 'personal information', which are not explicitly defined, make mapping data operations difficult. Access permissions are partial data operations because they do not inform users of all data collected. Hootsuite collects location and traffic data, much like most other applications. Although we are prompted for permission regarding location access, the application does not provide the user with any information on traffic data collection. Control over a privacy implication is found to be partial when incomplete, with some control provided but not all data collected having associated controls. Taking Everypost as another example, we find partial control operations are evident for traffic data collected. Everypost's privacy policy⁸ states that cookies used by third parties may be opted out of, as is apparent through interface operations. However, collection of traffic data for internal usage such as analytics does not match any control operations.

⁸ http://everypost.me/privacy-policy/

Broken mappings occur when there is a disconnect between privacy implication expectations and application operations. Control operation mappings are broken when documented expectancies and/or data transparency operations do not have a matching control. Detachment from policy expectations is apparent among privacy implications such as advertising and aggregation. These purposes for data collection are expressed in privacy policies but no corresponding information is provided through application data or control operations. Likewise, implications of age restriction in concern to data retention are expressed in policies with disconnect to interface operations.

There are many cases in which there is an absence of a clear traceability mapping between privacy implications and interface operations. We have classified these applications as Unknown and represented them in our analysis.

Apart from the above 4 classifications, there are some cases where the privacy implication was not applicable to a particular SMA. In such cases, we have represented this as N/A in our analysis. The detailed results of our analysis are presented in Section III-C.

III. RESULTS

A. Data Access Permissions

1) Mobile Data Access Permissions: As can be seen from the results in Table II, most applications require access to photos/media, location, identity (which refers to any user accounts on the phone accessed by the application) and network access. In addition, many applications require access to the USB storage as well. These findings confirm that personal data of the user is accessed by most of the applications that were analyzed. An interesting observation is that permissions seem consistent for the same SMA developers across app stores. However, for different SMAs we observe a wide variety in the mobile data being accessed. While this could be attributed to different functionality being provided, it may also be a sign of some SMAs asking for more permissions than required [3], as arguably one of the most mature and used SMAs (Hootsuite) seems to use a relatively smaller set of permissions when compared to other SMAs. An interesting case is that of Social Media all in one, which seems to access everything except Identity (which could be retrieved from the SNSs accessed anyway).

TABLE II: Mobile data accessed by each SMA

SMA | Identity | Photos/Media | Location | Contacts | Wi-Fi | Camera | Mic | Device ID & Call info | SMS | Phone | Network Access | In App Purchases | USB Storage
iSocial | ✓ | - | - | - | - | - | - | - | - | - | ✓ | - | -
Hootsuite | ✓ | ✓ | ✓ | - | ✓ | - | - | - | - | - | ✓ | ✓ | ✓
Buffer | ✓ | ✓ | - | - | - | ✓ | - | - | - | - | ✓ | ✓ | ✓
Social Networking all in one | - | - | ✓ | - | ✓ | - | - | - | - | - | ✓ | - | -
Social Media all in one | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - | ✓
Everypost | ✓ | ✓ | ✓ | ✓ | ✓ | - | - | - | - | - | ✓ | ✓ | -
Social Media | - | - | ✓ | - | - | - | - | - | - | - | ✓ | - | -
Hootsuite | ✓ | ✓ | ✓ | - | ✓ | - | - | - | - | - | ✓ | ✓ | ✓
Buffer | ✓ | ✓ | - | - | - | ✓ | - | - | - | - | ✓ | ✓ | ✓
Everypost | ✓ | ✓ | ✓ | ✓ | ✓ | - | - | - | - | - | ✓ | ✓ | -
Social Media Vault | - | - | ✓ | - | - | ✓ | - | - | - | - | ✓ | - | -
Social butter | - | ✓ | ✓ | - | - | ✓ | - | - | - | - | ✓ | - | -
Social hub | - | ✓ | ✓ | ✓ | - | - | - | ✓ | - | ✓ | ✓ | - | -
Key: Yes: ✓ No: -

2) Social Media Data Access Permissions: SMAs are different from other mobile applications as they can access a user's social media data as well. We have summarized the data permissions requested by SMAs while a user logs into their social media accounts in Table III. We have used general terms such as "Activity" and "Lists" in this table to simply convey the meaning, as each social media site uses different names for such features. For example, "posts" on Facebook and "tweets" on Twitter as well as inbox messages are classified under "Activity". Similarly, "Lists" refers to groups or lists that the user might have created (or used by default) to organize their contacts on various social media sites.

TABLE III: Social media data accessed by each SMA

SMA | Activity | Lists | Update Profile | Post | Messages | Email ID
iSocial* | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
Hootsuite | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
Buffer | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
Social Networking all in one* | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
Social Media all in one* | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
Everypost | - | ✓ | - | ✓ | - | ✓
Social Media* | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
Hootsuite | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
Buffer | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
Everypost | - | ✓ | - | ✓ | - | ✓
Social Media Vault* | - | ✓ | - | - | - | ✓
Social butter | - | ✓ | - | - | - | ✓
Social hub | ✓ | - | - | - | - | ✓
Key: Yes: ✓ No: -

We can find in Table III that 5 SMAs, namely iSocial, Social Networking all in one, Social Media all in one, Social Media and Social Media Vault, are marked with a '*' sign and are shown to access all social media data. This is to highlight the fact that these applications do not disclose what social media data they access to function, as they just provide an interface for either the social media apps (such as Facebook, Twitter) already installed on the user's phone or to the web link of the social network via the web browser. As all the social media activity goes via these applications, they have the potential to access all communication. Moreover, these applications do not need to be authorized by the user with their Facebook account, so the user cannot regulate the permissions by logging into their Facebook account as is possible with other Facebook applications. For the other SMAs, we find that many of them access almost all social media activity, such as posting on walls/tweeting, accessing the friend or contact lists, updating the profile on the users' behalf, posting on their behalf, and accessing inbox messages or the email ID which was used to create the account. Needless to say, all this information may be classified as personal and sensitive to the user, and we find that most applications that disclose the permissions access this information.

B. Application Privacy Policies

Applications that collect personally identifiable information are required to produce a privacy policy in order to comply with the previously discussed distributor vetting policies. Table IV shows that 8 out of the 13 SMAs that we evaluated were found to include this documentation. The lack of privacy policies among the other 5 SMAs seems to suggest a violation of the distributor vetting policies which mandate such documentation for all applications which process personal data from users. We did find in Table II that the SMAs without a privacy policy do not access "Identity", so technically they may argue they do not access personally identifying information. However, they are found to be able to access most of the social media data, photos, location, etc., which can be classified as personal information.

TABLE IV: Whether privacy policies are provided by each SMA provider

SMA | Privacy Policies
iSocial | ✓
Hootsuite | ✓
Buffer | ✓
Social Networking all in one | -
Social Media all in one | -
Everypost | ✓
Social Media | -
Hootsuite | ✓
Buffer | ✓
Everypost | ✓
Social Media Vault | -
Social butter | -
Social hub | ✓
Key: Yes: ✓ No: -

C. Traceability for Transparency and Control

Common data actions have been categorized to form 14 privacy implications seen in the left column of Table V. Privacy implications fall under further categories by way of answering our privacy questions set out in Section II-C: collection, purpose, access and retention of data. Operations refer to features provided by SMA providers or distributors which inform us of data collection and use as well as providing us with control over data actions. Each symbol in the table provides a mapping to the degree of traceability offered by transparency and control operations respectively.

Data operations refer to the extent to which transparency of data actions is presented to the user through interfaces; these include access permission prompts and other mechanisms which detail privacy implications. Control operations refer to features and mechanisms presented through interfaces which enable control over some data action; these include device settings, accept/decline button options, etc. If the same degree is found for both transparency and control operations assessed, then only one symbol need be provided in representation. If a different degree of traceability is found, the first symbol in the particular cell of the table corresponds to transparency operations and the second symbol corresponds to control operations. In the resulting table, we refer to content as the social media data collected shown in Table III. Other privacy implications and results will be further explained and justified in the following subsections.

TABLE V: Traceability mappings represent transparency and control of privacy implications respectively, or collectively.

Columns (SMAs): Social hub, Hootsuite, Hootsuite, Buffer, Buffer, Everypost, Everypost, iSocial

Collection
  Mobile Data: ◐ ◐ ◐ ◐ ◐ ◐ ◐
  Social Media Data: ◐ ◐ ◐ - - ◐ ◐ ●
  Traffic Data: ✗ ✗ ◐ ✗ ✗ ◐ ◐ ✗
Purpose
  Services: ✗ ◐
  Internal use: ✗ ◐ ◐ ✗ ✗ ✗ ✗ ✗
  Asset transfer: ? ✗ ✗ ? ? ✗ ✗ ✗
  Advertising: ◐ ✗◐ - - - - ✗◐ ✗
  Aggregation: ✗ ✗ ✗ ✗ ✗ - - ?
Access
  Service Provider: ◐ ◐ ◐ ◐ ◐ ◐ ◐ ✗
  3rd party by user: ◐ ✗ ✗ ✗ ✗
  3rd party by provider: ✗◐ ✗ ✗ - - ? ? ✗
  Legality: ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗
Retention
  Age Restriction: ◐ ✗ ✗ ✗ ✗ ✗ ✗ ✗
  Information: ✗ ✗ ✗ - - ✗ ✗ -
Key: Complete: ● Partial: ◐ Broken: ✗ Unknown: ? N/A: -

1) Complete: All SMAs provide control over some data collection through access permissions. iSocial does not specify any such method of informing the user of data collected through the requirement to accept access permissions. iSocial's terms and conditions specify privacy implications: "Any site registration information is used only by the website and is not sold or given out to others"; likewise, users may provide an email address for the service provider to provide support.

Complete transparency for collection can be found when an SMA communicates the data it is going to access to the user through the interface operations. Fig. 2a shows Hootsuite's permissions screen which tells the user about the social media data that will be accessed by it. A complete traceability mapping for control operations occurs when a user can regulate the access permissions through interface operations (such as Fig. 2b, which shows Hootsuite for iOS).

Fig. 2: Transparency and control operations. (a) Notification of Social Media data access by Hootsuite. (b) iOS device settings which enable users to restrict access permissions.

Users have control over content provided for use by services, through accepting access permissions and the posting of information. Sharing information intentionally with SNS involves sharing this with these third parties by users; the transparency of third party access is completely apparent to the user in this case. Some applications offer settings which enable the user a level of control over who accesses information posted to SNS, and the restriction of data access to particular accounts. Controls offered are as found on common SNS: share with only friends or everyone. Asset transfer refers to personally identifiable information being transferred as businesses buy and sell assets.

2) Partial: The transparency of privacy implications through access permissions maps only partially to expectations provided by SMA privacy policies. An example of this is partial content collection made visible and controllable to the user. SMAs with privacy policies commonly state their rights to collect all information provided to the site, including that shared with associated SNS. Google Play's Hootsuite provides a 'Send usage data' setting; the user is informed that anonymous data is collected which is used to help improve Hootsuite. Partial transparency and control over internal use is apparent, with an ambiguous description of collection and purpose, along with control over 'anonymous data' but no matching control for all data collected as specified in the privacy policy, such as content posted.

3) Broken: Internal use of data includes analytics used to improve or better understand services. It is common for servers to automatically collect usage information; "Server logs may include such information as a mobile device identification number and device identifier, web requests, IP address, browser type, browser language, referring/exit pages and URLs, platform type, number of clicks, domain names, search terms, landing pages ...", the list goes on and on. This type of information collected is referred to as the traffic data privacy implication, and may be shared with third parties on an aggregate basis for advertising and analytic purposes. We can see that both transparency and control for this example are broken in most SMAs, leaving users unaware in their normal use through the interface of the collection of this data and without a way of controlling that in any shape or form.

4) Unknown: Analyzed traceability mapping of data use as specified in privacy policies has shown us not to expect applications to inform users about the passive collection of non-identifiable information. We are aware that providers are likely to use and share traffic or aggregate data with third parties, for the purpose of analytics and advertising. We are unable to determine whether an application without a privacy policy passively collects such non-identifiable information. Therefore, for some SMAs, data disclosure to 3rd parties by the provider is shown to be unknown.

5) Summary: Table VI summarizes our results, presenting rounded percentages of privacy implications found to be complete, partial, broken, unknown or not applicable. We provide a breakdown for each of the 3 app stores. The overall traceability of transparency and control is also provided.

TABLE VI: Summary of traceability mappings for transparency, control and overall traceability of all privacy implications analyzed. Figures rounded to the nearest whole number.

Store | Operation | Complete | Partial | Broken | Unknown | N/A
Cydia | Transp. | 29% | 0% | 57% | 7% | 7%
Cydia | Control | 29% | 0% | 57% | 7% | 7%
Cydia | Total | 29% | 0% | 57% | 7% | 7%
Android | Transp. | 17% | 24% | 43% | 2% | 14%
Android | Control | 17% | 19% | 45% | 5% | 14%
Android | Total | 17% | 21% | 44% | 4% | 14%
iOS | Transp. | 14% | 27% | 45% | 4% | 11%
iOS | Control | 22% | 17% | 45% | 5% | 11%
iOS | Total | 18% | 22% | 45% | 4% | 11%
Overall | Transp. | 17% | 23% | 45% | 4% | 12%
Overall | Control | 21% | 16% | 46% | 5% | 12%
Overall | Total | 19% | 19% | 45% | 4% | 12%

We find a general lack of transparency across SMAs with 45 percent of SMAs revealing broken transparency mappings. Privacy implications offering complete transparency of data involve collection of personal information made visible to the user in some way (e.g. showing the access permissions required). In order to consider current guidelines for user privacy as adequate, we must rule out mistrust between the user's expectations and reality of how SMAs treat their information by making them aware, either through privacy policies or through other awareness mechanisms, of any data collected, how it will be used, whom it will be shared with, and how long it will be retained.

We also find that users have a lack of control, as less than a quarter of the results indicated complete control over privacy implications. In order to give more control to users, developers could work to increase application functionality while restricting access to data. Settings should enable control over all data collected, including information perceived as non-identifiable. Research has shown that pragmatic approaches of providing privacy related intervention, where users are shown the effect of exposures of their data, work well [4].

IV. DISCUSSION AND LESSONS LEARNED

In this paper, we inspected how SMAs handle privacy and looked at it from three different angles. Evaluating the permissions requested by the SMAs was fairly straightforward. The SMAs communicate permissions to the user directly and the user also has the opportunity to verify social media permissions by checking their social media account and authorizing the applications. While there are many tools that enable the user to automatically check the mobile data permissions requested by apps, checking of social media permissions is slightly more complex. The process may potentially be automated by simulating an authorization of the SMA to a dummy social media account (like a "guest" account, possibly built-in to the SMA), to reveal the permissions to the user, before they use the SMA with their own social media account. The larger problem here seems to be the lack of understanding that users have about the permissions requested by mobile applications. Greater awareness is desirable where users are informed about the implications of the permissions they are granting.

While looking for privacy related documentation, we found a fair degree of ambiguity. Not only do different application providers have different names for such documentation ("privacy policy", "terms of service", etc.), there is an absence of consistency in the content of these documents as well. This inconsistency makes it difficult to construct any expectations from the users' perspective of what they should be looking for in order to educate themselves about the privacy implications of using a particular app. Moreover, we found 5 SMAs which do not provide this documentation at all. This is, as pointed out earlier, in clear disagreement with the vetting policies of both Google Play and iTunes app stores. A possible mitigation may be found in automated solutions like "AutoPPG", which is an automatic privacy policy generator for Android applications [5]. It simply identifies the important privacy issues emanating from the usage of the application by conducting a static analysis of the application's source code. Automated solutions such as these may enable development of a consistent structure and terminology in such privacy policies, which would enable easier traceability analysis. Furthermore, such mechanisms may also encourage SMA and other application developers to include privacy policies without putting in too much effort.

The qualitative analysis of privacy policies and analyzing traceability with interface controls was a comparatively less objective part of our methodology. Such analysis is harder due to the relative inconsistencies in privacy related documentation across apps as mentioned earlier. Moreover, the interfaces for individual SMAs have different operations which necessitate a case-by-case analysis. This is the most costly part of the methodology in terms of time and effort. It is possible to automate the traceability analysis if the privacy [...] researchers to conduct the traceability analysis and look for a consensus based approach or provide inter-rater reliability between multiple researchers. This would potentially enhance the objectivity of the traceability analysis.

V. RELATED WORK

A. Analysis of Mobile Data Access Permissions

Mobile applications generally are explicit in disclosing the data access permissions they require to the users. There is generally a screen which is shown to the user at the time of installation which tells them the data that the particular application will be allowed to access. The major issue is the "all or nothing" nature of mobile applications [6]. The user is required to grant the requested permissions to the application for them to use it. This is a problem as it has been shown that mobile applications often introduce risk vectors by asking for more permissions than required [3]. The problem is that the applications are somewhat hamstrung in this regard and have to request permissions that they envisage using at any time during execution. There have been some solutions put forth to detect and possibly prevent malicious mobile applications by using anomaly detection to detect applications behaving maliciously and in a deviant manner from normally expected behavior [7]. The idea is to use static analysis to create profiles of applications' expected behavior and detect anomalies at run-time to secure mobile applications. This is similar to the work of Hussain et al. which looks at detecting malicious database applications [8]. Another proposed approach, "PrivacyGuard", uses the VPN service of Android devices to intercept network traffic of mobile applications to detect information leakage [9]. It also provides mechanisms of tricking the malicious applications by manipulating the leaked information. We found that most of the previous work in this area only looks at leakage of mobile data and not social media data which SMAs have access to as well.

B. Analysis of Privacy Policy Traceability

There is previous work which shows that control over data disclosure can affect decisions made by users [10]. Greater transparency about data being shared often acts as a mitigating factor against erroneous decisions being made.
Our documentation is standardized and the privacy implications are work looks at the traceability for transparency and control by clearly defined. It is an interesting future direction in which looking at the interface operations and how closely they match research can progress where such an automatic tracaeability with privacy policies. Qualitative analysis of documented analysis might be used to certify SMAs. Any such efforts can policies and analyzing traceability with interface features is rely on the analysis methodology shown by similar work in an extensively researched topic in software engineering. More the area of social media sites and indeed the work done in this recently, this technique has been used to analyze whether paper. the privacy policies outlined by SNSs are congruent with The methodology proposed in this paper may also be the interface controls provided to the users. Anthonysamy et extended to other apps which provide users with the oppor- al. demonstrated that SNSs themselves suffer from a lack of tunity to link their social media accounts (such as gaming traceability between data actions defined in privacy policies apps). It would be interesting to see whether the problems and corresponding data operations apparent to users through highlighted in this paper are specific to SMAs or whether interfaces [2], [11]. Our work extends this methodology other similar apps, which let the users post to multiple social to perform a privacy analysis for SMAs by performing an media accounts, portray similarly low traceability. Future at- analysis of the mobile phone data and social media data tempts at using this methodology may consider using multiple accessed by the SMAs in addition to a traceability mapping which considers the transparency of interface operations and [11] P. Anthonysamy, P. Greenwood, and A. Rashid, “A method for analysing the control provided to the user. 
traceability between privacy policies and privacy controls of online social networks,” in Privacy Technologies and Policy. Springer, 2014, pp. 187–202. VI. C ONCLUSIONS In this paper, we described a three-step methodology to examine the privacy issues posed by SMAs by examining the data (both mobile and social media) permissions requested by them, checking whether they provide the user with pri- vacy related documentation and analyzing traceability between privacy implications identified in the privacy policy with the interface operations provided to the user. We used this methodology to evaluate 13 popular Social Media Aggregators (SMAs) from 3 app stores and found that the majority of the SMAs we evaluated accessed users’ personal information including their social media activity. However, we also found that 5 of the 13 SMAs did not provide any privacy related documentation which is in clear conflict with the vetting policies of the app stores. Our results show that 45% of SMAs show Broken traceability between privacy documentation and interface operations while Complete traceability is observed in only 19% of the cases. These results highlight the need for major improvements to ensure that the usage of SMAs does not compromise user privacy. The methodology described in this paper can be reused for further investigation of SMAs or be extended, with certain improvements, to examine similar applications which enable the user to link their social media activity. R EFERENCES [1] J. D. Young, A. Antón et al., “A method for identifying software re- quirements based on policy commitments,” in Requirements Engineering Conference (RE), 2010 18th IEEE International. IEEE, 2010, pp. 47– 56. [2] P. Anthonysamy, P. Greenwood, and A. Rashid, “Social networking pri- vacy: Understanding the disconnect from policy to controls,” Computer, no. 6, pp. 60–67, 2013. [3] P. H. Chia, Y. Yamamoto, and N. 
Asokan, “Is this app safe?: a large scale study on application permissions and risk signals,” in Proceedings of the 21st international conference on World Wide Web. ACM, 2012, pp. 311–320. [4] A. Kapadia and A. J. Lee, “Improving privacy through exposure aware- ness and reactive mechanisms,” in CHI 2016 Workshop on Bridging the Gap between Privacy by Design and Privacy in Practice. ACM, 2016. [5] L. Yu, T. Zhang, X. Luo, and L. Xue, “Autoppg: Towards automatic generation of privacy policy for android applications,” in Proceedings of the 5th Annual ACM CCS Workshop on Security and Privacy in Smartphones and Mobile Devices. ACM, 2015, pp. 39–50. [6] J. Lin, S. Amini, J. I. Hong, N. Sadeh, J. Lindqvist, and J. Zhang, “Ex- pectation and purpose: understanding users’ mental models of mobile app privacy through crowdsourcing,” in Proceedings of the 2012 ACM Conference on Ubiquitous Computing. ACM, 2012, pp. 501–510. [7] E. Bertino, “Securing mobile applications,” Computer, vol. 49, no. 2, pp. 9–9, 2016. [8] S. R. Hussain, A. M. Sallam, and E. Bertino, “Detanom: Detecting anomalous database transactions by insiders,” in Proceedings of the 5th ACM Conference on Data and Application Security and Privacy. ACM, 2015, pp. 25–35. [9] Y. Song and U. Hengartner, “Privacyguard: A vpn-based platform to detect information leakage on android devices,” in Proc. of the 5th Annual ACM CCS Workshop on Security and Privacy in Smartphones and Mobile Devices. ACM, 2015, pp. 15–26. [10] S. Patil, R. Schlegel, A. Kapadia, and A. J. Lee, “Reflection or action?: How feedback and control affect location sharing decisions,” in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2014, pp. 101–110.