Towards Automated Android App Collusion Detection

Irina Mariuca Asăvoae¹, Jorge Blasco², Thomas M. Chen², Harsha Kumara Kalutarage³, Igor Muttik⁴, Hoang Nga Nguyen⁵, Markus Roggenbach*,¹, and Siraj Ahmed Shaikh⁵

¹ Swansea University, UK
² City University London, UK
³ Queen's University of Belfast, UK
⁴ Intel Security, UK
⁵ Coventry University, UK

Abstract

Android OS supports multiple communication methods between apps. This opens the possibility to carry out threats in a collaborative fashion, cf. the Soundcomber example from 2011. In this paper we provide a concise definition of collusion and report on a number of automated detection approaches, developed in co-operation with Intel Security.

1 Introduction

The Android operating system (OS) is designed with a number of built-in security features such as app sandboxing and fairly granular access controls based on permissions. In real life, however, the isolation of apps is limited. In some respects even the opposite is true – there are many ways for apps to communicate across sandbox boundaries. The Android OS supports multiple communication methods which are fully documented (such as messaging via intents). The ability of apps with different security postures to communicate has a negative effect on security: an app (in a sandbox which has permissions to handle sensitive data) is allowed to let such data flow to another app (in another sandbox which has been denied permission to handle such data) and eventually leak out.

The Android ecosystem exacerbates the problem as market pressure leads many developers to embed advertisement libraries into their apps. As a result, such code may be present in thousands of apps (all bearing different permissions). Advertisers have a known tendency to disregard user privacy in favour of monetisation. So there exists a risk that ad-libraries may communicate between sandboxes, even without the knowledge of the app authors, to transmit sensitive data across, risking exposure and disregarding privacy.

Of course, any unscrupulous developer may also split functionality which they prefer to hide between multiple apps. Malicious behaviours similar to this are evident from known cases of apps exploiting insecure exposure of sensitive data by other apps [2].

Researchers have demonstrated that sets of apps may violate the permissions model, causing data leaks or carrying malware [23]. Such apps are called colluding sets of apps and the phenomenon is called app collusion. Unfortunately, there are no effective tools to detect app collusion. The search space posed by possible combinations of apps means that this is not straightforward. Effective methods are needed to narrow down the search to collusion candidates of interest.

In recent work [13], dating after submission of this paper, we have discovered that app collusion has actually been used in the field for quite a long time without being detected, in a large group of applications which use a popular Android SDK. This SDK is known to be included in more than 1000 applications. This discovery was a result of applying techniques described in this paper to a large set of the most popular apps.

* Corresponding author: M.Roggenbach@swan.ac.uk

Copyright © by the paper's authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. In: D. Aspinall, L. Cavallaro, M. N. Seghir, M. Volkamer (eds.): Proceedings of the Workshop on Innovations in Mobile Privacy and Security (IMPS) at ESSoS'16, London, UK, 06-April-2016, published at http://ceur-ws.org
This paper contributes towards a practical automated system for collusion detection. We give a definition of collusion in Section 2. This is followed by two potential approaches to filter down to potential candidates for collusion: a rule-based approach developed in Prolog in Section 3 and a statistical approach in Section 4. Section 5 presents a model-checking approach to detecting collusion in Android apps. Section 6 delves into the experimental outcomes and Section 7 discusses related work. Section 8 summarises our contributions and Section 9 concludes the paper with thoughts on future work.

2 Defining collusion

Our notion of collusion refers to the ability of a set of apps to carry out a threat in a collaborative fashion. This is in contrast to most existing work, where collusion is usually associated with inter-app communications and information leakage (see Section 7). We consider that colluding apps can carry out any threat such as the ones posed by single apps. The range of such threats includes [25]:

• Information theft: happens when one app accesses sensitive information and the other sends information outside the device boundaries.

• Money theft: happens when an app sends information to another app that is capable of using money-sensitive API calls (e.g. SMS).

• Service misuse: happens when one app is able to control a system service and receives information (commands) from another app to control those services.

A threat can be described by a set of actions executed in a certain order. We model this by a partially ordered set (T, ≤), where T is the set of actions and ≤ specifies the execution order. When (T, ≤) is carried out, actions from T are sequentially executed according to some total order ≤* such that ≤ ⊆ ≤*; in other words, (T, ≤*) is a total extension of (T, ≤). Let Ex((T, ≤)) denote the set of all possible total extensions of (T, ≤), i.e., all possible ways of carrying out threat (T, ≤). Similarly, we also define inter-app communication as a partially ordered set. In this paper we discuss overt communication channels only.

We define collusion as follows:

A1: Actions are operations provided by the Android API (such as record audio, access file, write file, send data, etc.). Let Act denote the set of all actions.

A2: Actions can be associated with a number of attributes (such as permissions, input parameters, etc.). Let B denote the set of all action attributes and pms : Act → ℘(B) specify the set of permissions required by Android to execute an action.

A3: A threat t = (T, ≤) is a partially ordered set. Let τ denote the set of all threats. In the scope of this paper, τ represents the set of all known threats caused by single applications.

A4: An inter-app communication c = (C, ≤) is a partially ordered set. Let com denote the set of all known inter-app communications.

Definition 1 (Collusion). A set S consisting of at least two apps is colluding if together they execute a sequence A ∈ Act* such that:

1. there exists a subsequence A' of A such that A' ∈ Ex(t) for some t ∈ τ; furthermore, A' is collectively executed involving all apps in S, i.e., each app in S executes at least one action in A'; and

2. there exists a subsequence C' of A such that C' ∈ Ex(c) for some c ∈ com.
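The set Ex((T, ≤)) used in Definition 1 is the set of total (linear) extensions of the threat poset. The following is a minimal Python sketch, for illustration only, of how such extensions can be enumerated for a small threat; the function name and the brute-force enumeration are our own and are not part of the paper's tooling:

    from itertools import permutations

    def total_extensions(actions, order):
        """Enumerate Ex((T, <=)): all total orders of `actions` that respect
        every pair (a, b) in `order`, i.e. a must precede b."""
        result = []
        for candidate in permutations(actions):
            position = {a: i for i, a in enumerate(candidate)}
            if all(position[a] < position[b] for (a, b) in order):
                result.append(candidate)
        return result

    # Information theft: sensitive data is accessed before it is sent outside.
    T = {"access_sensitive_info", "send_outside"}
    order = {("access_sensitive_info", "send_outside")}
    print(total_extensions(T, order))  # [('access_sensitive_info', 'send_outside')]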
It is a general challenge to distinguish between benign and malicious application behaviours. Thus, our definition makes the realistic assumption that there are a number of known threats; e.g., Intel Security regularly identifies apps to be malicious, realising known threats. Now, collusion can be seen as a camouflage mechanism: the individual apps appear harmless and none of them alone poses a threat, e.g., they would not be able to execute a threat just in terms of their permissions; in combination, however, they realise a threat.

To illustrate our definition we present an abstract example¹.

¹ Concrete examples are available on request.

[Figure 1: An example of colluding apps – the Contact_app (with the READ CONTACTS permission) passes data via Shared Preferences to the Weather_app (with the INTERNET permission).]

Example 1 (Stealing contact data). The two apps graphically represented in Figure 1 perform information theft: the Contact app reads the contacts database to pass the data to the Weather app, which sends the data to a remote server controlled by the adversary. The information is sent through shared preferences. Using the collusion definition we can describe the actions performed by both apps as: Act_Contact_app = {a_read_contacts}, Act_Weather_app = {a_send_file}, with pms(a_read_contacts) = {Permission_contacts} and pms(a_send_file) = {Permission_internet}. The information threat t is given by T = {a_read_contacts, a_send_file} and defining a_read_contacts ≤ a_send_file. The inter-app communication is defined as com_Contact_app = {send_shared_prefs}, com_Weather_app = {recv_shared_prefs} and send_shared_prefs ≤ recv_shared_prefs.
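To make Definition 1 concrete, the sketch below encodes Example 1 as data and checks the two conditions of the definition against an observed action sequence. This is our own illustrative Python sketch: the attribution of actions to apps and the subsequence test are assumptions about how one might mechanise the definition, not the paper's implementation.

    def is_subsequence(pattern, sequence):
        """True if `pattern` occurs in `sequence` in order (not necessarily contiguously)."""
        it = iter(sequence)
        return all(item in it for item in pattern)

    def colludes(observed, owner, threat_runs, comm_runs):
        """Definition 1: some threat run and some communication run are both
        subsequences of `observed`, and the threat run is executed by at least
        two distinct apps (all apps of the pair, in this two-app example)."""
        threat_ok = any(is_subsequence(run, observed) and
                        len({owner[a] for a in run}) >= 2
                        for run in threat_runs)
        comm_ok = any(is_subsequence(run, observed) for run in comm_runs)
        return threat_ok and comm_ok

    # Example 1: Contact_app reads contacts and passes them via shared
    # preferences; Weather_app sends them out.
    owner = {"read_contacts": "Contact_app", "send_file": "Weather_app",
             "send_shared_prefs": "Contact_app", "recv_shared_prefs": "Weather_app"}
    observed = ["read_contacts", "send_shared_prefs", "recv_shared_prefs", "send_file"]
    threat_runs = [("read_contacts", "send_file")]            # Ex(t), information theft
    comm_runs = [("send_shared_prefs", "recv_shared_prefs")]  # Ex(c), shared preferences
    print(colludes(observed, owner, threat_runs, comm_runs))  # True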
3 Detecting collusion threat

Our first approximation to detect app collusion utilises Logic Programming in Prolog. Its goal is to serve as a fast, computationally cheap filter that detects potentially colluding apps. For such a first filter it is enough to be based on permissions; together with a further sifting mechanism (not discussed here) on permissions, in practical work on real-world apps this filter turns out to be effective at detecting colluding apps in the wild.

Our filter (1) uses Androguard [9] to extract facts about the communication channels and permissions of all single apps in a given app set S, (2) which are then abstracted into an over-approximation of actions and communication channels that could be used by a single app. (3) Finally, the collusion rules are fired if the proper combinations of actions and communications are found in S.

3.1 Actions

We utilise an action set Act_prolog composed of four different high-level actions: accessing sensitive information, using an API that can directly cost money, controlling device services (e.g. camera, etc.), and sending information to other devices and the Internet. To find out which of these actions an app could carry out, we extract its set of permissions pms_prolog. Each permission is mapped to one or more of the four high-level actions. For example, an app that declares the INTERNET permission will be capable of sending information outside the device:

    uses(App, P_internet) → information_outside(App)

3.2 Communications

The communication channels established by an app are characterised by its API calls and the permissions declared in its manifest file. We cover communication actions (com_prolog) that can be created as follows:

• Intents are messages used to request tasks from other application components (activities, services or broadcast receivers). Activities, services and broadcast receivers declare the intents they can handle by declaring a set of intent filters.

• External Storage is a storage space shared between all the installed apps without restrictions. Apps accessing the external storage need to declare the READ_EXTERNAL_STORAGE permission. To enable writing, apps must declare WRITE_EXTERNAL_STORAGE.

• Shared Preferences are an OS feature to store key-value pairs of data. Although not intended for inter-app communication, apps can use key-value pairs to exchange information if proper permissions are defined (before Android 4.4).

We map apps to sending and receiving actions by inspecting their code and manifest files. When using intents and shared preferences we are able to specify the communication channel using the intent actions and the preference files and packages, respectively. If an application sends a broadcast intent with the action SEND_FILE we consider the following:

    send_broadcast(App, Intent_send_file) → send(App, Intent_send_file)

We consider that two apps communicate if one of them is able to send and the other to receive through the same channel. This allows us to detect communication paths composed of an arbitrary number of apps:

    send(App_a, channel) ∧ receive(App_b, channel) → communicate(App_a, App_b, channel)
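The following is a minimal Python sketch of this channel-matching step, assuming a fact base of send/receive capabilities has already been extracted (e.g. with Androguard). The data structures, channel labels and function name are ours, for illustration only:

    def communication_pairs(sends, receives):
        """Derive communicate(App_a, App_b, channel) facts: App_a can send and
        App_b can receive on the same channel (intent action, preference file, ...)."""
        pairs = set()
        for app_a, chan_a in sends:
            for app_b, chan_b in receives:
                if chan_a == chan_b and app_a != app_b:
                    pairs.add((app_a, app_b, chan_a))
        return pairs

    # Extracted (over-approximated) facts for two hypothetical apps.
    sends = {("Contact_app", "prefs:weather_settings")}
    receives = {("Weather_app", "prefs:weather_settings")}
    print(communication_pairs(sends, receives))
    # {('Contact_app', 'Weather_app', 'prefs:weather_settings')}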
3.3 Threats

Our threat set τ_prolog considers information theft, money theft and service misuse. As our definition states, each of the threats is characterised by a sequence of actions. For example, the information theft threat is codified as the following Prolog rule:

    sensitive_information(App_a) ∧ information_outside(App_b) ∧ communicate(App_a, App_b, channel) → collusion(App_a, App_b)

Currently, we do not take into account the order of action execution.

4 Assessing the collusion possibility

In this section, we apply machine learning to classify app sets into colluding and non-colluding ones. To this end, we first define a probabilistic model. Then we train the model, i.e., estimate the model parameters on a training data set. As a third step we validate the model using a validation data set. Additionally, in Section 6.3, we check the model with testing data.

4.1 Probabilistic Model

Estimating the collusion possibility within a set S of apps involves estimating two different likelihood components, Lτ and Lcom. Lτ denotes the likelihood of carrying out a threat. Lcom denotes the likelihood of performing some inter-app communication. Hence, the likelihood of colluding within S is given by Lτ × Lcom.

In order to estimate Lτ, we employ a so-called Naive Bayesian informative model [18]. We consider a multivariate random variable Y = (y_1, ..., y_k). Here, k is the total number of permissions in Android OS, and the y_j ∈ {0, 1} are independent Bernoulli random variables. Variable y_j is 1 if permission j is found in S, 0 otherwise. P(Y) stands for the probability of obtaining S with permissions as described by Y. Our probabilistic model is then given by the equation:

    P(Y) = ∏_{j=1}^{k} λ_j^{y_j} (1 − λ_j)^{1−y_j}

where λ_j ∈ [0, 1] is the Bernoulli parameter.

To compute Lτ we define Lτ = ln{(P(Y))^{−1}}. This reflects the likelihood of an app set "being malicious" increasing as the number of permissions requested increases. Therefore, any monotonically decreasing function can be chosen [18]. For comparisons we average out Lτ and scale it down to the range [0, 1].

To complete our modelling, we need to estimate values λ̂_j that replace λ_j in the computation of Lτ. To this end – to avoid over-fitting P(Y) – we estimate λ_j using informative beta prior distributions [14] and define the maximum posterior estimation

    λ̂_j = (∑ y_j + α_j) / (N + α_j + β_j)     (1)

where N is the number of apps in the training set and α_j, β_j are penalty effects. In this work we set α_j = 1. β_j depends on the critical level of permissions as given in [22, 18]. β_j can take either the value 2N (most critical), N (critical) or 1 (non-critical).

We consider Lcom to be a binary function, Lcom ∈ {0, 1}, which takes the value 1 if there is any inter-app communication within S, and 0 otherwise.
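A minimal numerical sketch of the equations above, in Python, assuming the per-permission counts from a training set are already available; the variable names, toy counts and criticality assignments are ours, for illustration only:

    import math

    def lambda_hat(count_j, N, alpha_j=1.0, beta_j=1.0):
        """Maximum posterior estimate (1): (sum of y_j + alpha_j) / (N + alpha_j + beta_j).
        beta_j is 2N, N or 1 depending on how critical permission j is."""
        return (count_j + alpha_j) / (N + alpha_j + beta_j)

    def l_tau(y, lambdas):
        """L_tau = ln(1 / P(Y)) for the Bernoulli product model."""
        log_p = sum(math.log(l if yj else 1.0 - l) for yj, l in zip(y, lambdas))
        return -log_p

    # Toy example with k = 3 permissions and N = 100 training apps.
    N = 100
    counts = [40, 5, 60]            # training apps holding each permission
    betas = [2 * N, N, 1]           # most critical, critical, non-critical
    lambdas = [lambda_hat(c, N, 1.0, b) for c, b in zip(counts, betas)]
    y = [1, 1, 0]                   # permissions requested within the app set S
    print(round(l_tau(y, lambdas), 3))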
4.2 Model Training

Up to our recent discovery [13], there were no known colluding examples in the wild. However, there are sets of apps available where the individual apps have been classified by industry experts. Thus, we have utilised one such set provided by Intel Security, which consists of 9k+ malicious and 9k+ clean apps. As mentioned in Section 2, there is no evidence suggesting differences between threats caused by single apps and colluding apps. Thus, we can estimate Lτ based on these two sets². As for Lcom, there is no need to estimate any constants.

² Though our attack model is multiple apps, Lτ focuses only on operations required to execute a threat in a single-app attack model. Additional communication operations required to execute the same threat in the colluding model are covered by Lcom.

4.3 Model Validation

We validate our model on a larger set of apps (a blind sample) that we created for the purpose of calibration of our collusion detection mechanisms. This validation set consists of 240 app pairs, half of which are colluding pairs while the other half are non-colluding.

For this analysis, we implemented an automated process using R [19] and Bash scripts, which also includes calls to a third-party research prototype [5] to find intent-based communications. Additionally, a simple permission-based rule set was defined to find communication using external storage. Overall, this process consists of the following steps: (1) extracting permissions of all single apps in a given app set S; (2) computing Lτ using the extracted permissions; (3) if Lτ is greater than a certain threshold, estimating Lcom as mentioned above; and finally (4) computing Lτ × Lcom.

[Figure 2: Distribution of Lτ over the evaluation sample (Lτ against app pair index, for colluding and non-colluding pairs). Blue dotted line: best possible linear discriminant line at Lτ = 0.55.]

Figure 2 presents Lτ values for the validation dataset, in which good visual separation can be seen between the two classes, with lower (= 0.50) and upper (= 0.69) bounds for a discriminant line. Table 1 presents the confusion matrix obtained by fitting the best possible linear discriminant line at Lτ = 0.55 in Figure 2.

Table 1: Confusion matrix for the evaluation sample (n = 240).

                                Actual Colluding    Actual Non-Colluding
    Predicted Colluding               114                     7
    Predicted Non-Colluding             6                   113

The performance measures precision (= 0.94) and F-measure (= 0.95) were computed using Table 1.
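For reference, both measures follow directly from the confusion matrix in Table 1. The short sketch below reproduces the reported values and is our own illustration (colluding is treated as the positive class):

    # Confusion matrix from Table 1.
    tp, fp = 114, 7      # predicted colluding: actually colluding / non-colluding
    fn, tn = 6, 113      # predicted non-colluding: actually colluding / non-colluding

    precision = tp / (tp + fp)                                  # ~0.94
    recall = tp / (tp + fn)                                     # ~0.95
    f_measure = 2 * precision * recall / (precision + recall)   # ~0.95
    print(round(precision, 2), round(f_measure, 2))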
Precision quantifies the notion of specificity while F-measure provides a way to judge the overall performance of the model for each class. The higher these values are, the better the performance is. Such values could be due to a bias of the validation sample towards the methodology. Further investigation of the bias of the evaluation dataset is left for future work.

5 Model-Checking for collusion

We demonstrate a first, effective attempt to detect collusion via model-checking. Figure 3 shows the basic work-flow employed: it starts with a set of apps in the Dalvik Executable format DEX [1]; this code is disassembled and (currently manually) translated into a semantically faithful representation in the framework K [20] – this step also triggers a data flow analysis of the app code; compilation translates the K representation into a rewrite theory in Maude [7]; using the Maude model-checker provides an answer as to whether the app set colludes or not: in case of collusion, the tool provides a readable counter-example trace (see Section 6.4).

[Figure 3: Work-flow]

The rewriting logic semantics framework K allows the user to define configurations, computations and rules [20]. Configurations organise the program state in units called cells. In the case of SMALI assembly programs, on the method level program states essentially consist of the current instruction, registers and parameters, and the class it belongs to. The operational semantics of, say, the assembly instruction

    const Register, Int

for loading an integer constant Int into a register Register is captured by a K rule which reads: provided the current instruction is to load an integer I into a register R, then the cell "regs" capturing the state of the registers is updated by binding the value I to register R. The SMALI language also includes procedure calls. These allow access to functionality provided by the Android operating system, e.g., to read protected resources such as the GPS location or to send/receive a broadcast intent. Thus, besides encoding SMALI instructions in K, we also model the infrastructure that Android provides to apps.

The executions of the above semantics, instruction after instruction, define the computations. However, the concrete semantics is far too detailed for effective model-checking. Therefore, we implement an abstraction in the form of a data flow analysis. Here we represent each method individually as a graph, where registers, types, and constants are the nodes, and there is an edge between two nodes n1 and n2 iff n2 is a parameter of the SMALI command that stores a value in n1. For example, the SMALI command const r1, 42 would lead to an edge from node "42" to node "r1". Analysing such graphs allows us to detect which commands influence the values sent, say, in a broadcast intent, or published via the internet; these commands can be grouped into blocks and – rather than computing with concrete values – their effect can be captured symbolically.
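The following is a minimal Python sketch of such a per-method graph built from a few SMALI-like instructions. The instruction format, the way calls are modelled and the reachability query are simplified assumptions for illustration, not the actual K/Maude implementation:

    def build_dataflow_graph(instructions):
        """Edges run from each parameter node to the node the command stores into,
        e.g. ('const', 'r1', '42') yields the edge ('42', 'r1')."""
        edges = set()
        for opcode, dest, *params in instructions:
            for p in params:
                edges.add((p, dest))
        return edges

    def influences(edges, source, target):
        """True if `source` can reach `target` along data-flow edges."""
        frontier, seen = {source}, set()
        while frontier:
            node = frontier.pop()
            seen.add(node)
            frontier |= {d for (s, d) in edges if s == node and d not in seen}
        return target in seen

    method = [
        ("const", "r1", "42"),
        ("invoke", "r2", "getGpsLocation"),          # r2 holds the GPS location
        ("move", "r3", "r2"),
        ("invoke", "r0", "sendBroadcast", "r3"),
    ]
    g = build_dataflow_graph(method)
    print(influences(g, "getGpsLocation", "r0"))     # True: the location flows into the broadcast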
For the detection of collusion, our K semantics carries a "trace" cell that records which operations provided by the Android API – see Definition 1 – have been executed in a specific run. Based on the "trace" cell we define collusion via a K rule: provided that the GPS location has been accessed, this value has been sent in a broadcast, this value has been received from a broadcast, and finally this value has been published, in that case we detect the collusion "information theft".

Practical experiments with small apps demonstrate that this approach is feasible. Using the Maude model-checker, the state space of the (abstraction of) two apps is small, with only 8 states for the given example, and the check takes less than a second. When modifying the apps in such a way that the information flow is broken – by, say, using a different name for the broadcast and thus disabling communication between the apps – model-checking shows that no collusion is happening.

We use symbolic analysis over the byte code of the set S in order to obtain an in-depth inspection of the communication patterns. In the byte code of each app in S we detect the flow of communication with another app and a safe over-approximation of the data being communicated. We use static analysis over the byte code of apps to extract communication flow between apps and data flow inside each app. The data-flow information is filtered based on the private data being read and communicated.

It is future work to complete the encoding to cover the full DEX language and to experiment with larger apps as well as with a larger number of apps to address the fundamental question of scalability. We further intend to prove the abstraction to be sound, i.e., to show that if model-checking the abstracted code detects collusion then there is collusion in the original code, and to have a more general definition of collusion.
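Conceptually, the collusion rule over the "trace" cell asks whether one and the same value passes through an ordered pattern of API operations. The Python sketch below is our own rendering of that idea with hypothetical operation names; the actual rule is expressed in K and checked inside the Maude rewrite theory:

    def detects_information_theft(trace):
        """Flag collusion "information theft": some value is read from the GPS,
        sent in a broadcast, received from a broadcast, and finally published."""
        pattern = ["access_gps", "send_broadcast", "receive_broadcast", "publish"]
        progress = {}                  # value -> index of the next expected step
        for op, value in trace:
            i = progress.get(value, 0)
            if i < len(pattern) and op == pattern[i]:
                progress[value] = i + 1
                if progress[value] == len(pattern):
                    return True
        return False

    # A run recorded in the "trace" cell: (API operation, value it handles).
    run = [("access_gps", "loc1"), ("send_broadcast", "loc1"),
           ("receive_broadcast", "loc1"), ("publish", "loc1")]
    print(detects_information_theft(run))    # True

    # Non-colluding variant: the receiver publishes something else.
    run2 = [("access_gps", "loc1"), ("send_broadcast", "loc1"),
            ("receive_broadcast", "loc1"), ("publish", "other")]
    print(detects_information_theft(run2))   # False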
6 First experimental results

In this section we compile first experimental results obtained by applying the three methods discussed above to an artificial app set.

6.1 An artificial colluding app set

In order to validate our different approaches against a known ground truth, we have created fourteen apps that cover all threats and use all communication channels described earlier in this paper. They are organised in four colluding sets. The Document Extractor set consists of one app (id 1) that looks for sensitive documents on the external storage; the other app (id 2) sends the information received (via SharedPreferences) to a remote server. The Botnet set consists of four apps. One app (id 3) acts as a relay that receives orders from the command and control center. The other colluding apps execute commands (delivered via BroadcastIntents) depending on their permissions: sending SMS messages (id 4), stealing the user's contacts (id 5) and starting and stopping tasks (id 6). The Contact Extractor set consists of three apps. The first (id 7) reads contacts from the address book, the second (id 8) forwards them via the external storage to the third one (id 9), which sends them to the Internet. The first and second app communicate via BroadcastIntents. The Location Stealing set consists of one app (id 12) that reads the user location and shares it with the second app (id 13), which sends the information to the Internet.

The three non-colluding apps are a document viewer (id 10), an information sharing app (id 11) and a location viewer (id 14). The first app is able to display different file types on the device screen and uses other apps (via broadcast intents) to share their uniform resource identifier (URI). The second app receives text fragments from other apps and sends them to a remote server. The third app receives a location from another app (with the same intent used by apps 12 and 13) and shows it to the user on the screen.

6.2 Detecting collusion threat with Prolog

Table 2 shows the results obtained with our rule-based approach from Section 3 on the apps from Section 6.1. The entry "dark red club" in row 1 and column 2 means: the program detects that app id 1 sends information to app id 2, and these two apps collude on an "information theft". As we take communication direction into consideration, the resulting matrix is non-symmetric, e.g., there is no entry in row 2 and column 1. The entry "light red club" in row 1 and column 10 means: the program flags collusion of type "information theft" although the set {id 1, id 10} is clean. The approach identifies all colluding app sets. It also flags eight false positives due to over-approximation. Note that there are no false negatives, due to the nature of our test set: it utilises only those communication methods that our Prolog approach is able to identify.

[Table 2: Collusion matrix of the Prolog program over app ids 1–14 (per-pair entries omitted). ♣ = information theft. $ = money theft. ♠ = service misuse. Lightly coloured ♣, $, ♠ = false positives.]
6.3 Computing collusion possibility

Table 3 shows results from our alternative approach from Section 4 on the same set of apps. Each cell denotes the Lτ value for the corresponding pair. To minimise false negatives, we use the lower bound (= 0.50) gained from the validation data set for the discriminant line as the threshold for Lτ. We report possible collusion if Lτ ≥ 0.5 and Lcom = 1, otherwise we report non-collusion. This yields symmetric data – for readability we leave the lower half of the matrix empty. In the original, colour-coded table, dark red shows true positives, light red shows false positives, dark green shows true negatives, and light green shows false negatives.

Table 3: Matrix of collusion possibility (Lτ per app pair; the colour coding of the original is not reproduced here).

    ID    2     3     4     5     6     7     8     9    10    11    12    13    14
     1  0.51  0.61  0.97  1     0.8   1     0.81  0.77  0.77  0.77  0.44  0.44  0.95
     2        0.48  0.62  0.55  0.49  0.55  0.58  0.51  0.51  0.58  0.31  0.31  0.49
     3              0.69  0.64  0.56  0.64  0.48  0.61  0.61  0.72  0.41  0.41  0.58
     4                    1     0.84  1     0.85  0.71  0.71  0.82  0.56  0.56  0.95
     5                          0.84  1     0.86  0.67  0.67  0.82  0.47  0.47  1
     6                                0.84  0.68  0.58  0.58  0.65  0.43  0.43  0.78
     7                                      0.86  0.67  0.67  0.82  0.47  0.47  1
     8                                            0.51  0.51  0.58  0.31  0.31  0.77
     9                                                  0.77  0.77  0.44  0.44  0.61
    10                                                        0.77  0.44  0.44  0.61
    11                                                              0.47  0.47  0.73
    12                                                                    0.47  0.41
    13                                                                          0.41

This approach finds two of the four colluding app sets and flags five false positives. It relies on a third-party tool to detect inter-app communication which ignores communication using shared preferences; thus the app set {id 1, id 2} is not detected. As we restrict ourselves to pairwise analysis only, the app set {id 7, id 9} can't be detected, as it communicates via id 8. Finally, app set {id 12, id 13} was not reported since its Lτ value is less than the chosen threshold. Choosing a lower threshold could avoid this false negative, but at the cost of lower class accuracy and performance.

A precise estimation of Lcom would be useful in order to reduce the number of false positives in our analysis. However, communication is only a necessary, not a sufficient, condition for collusion. A recent study [10] shows that 84.4% of market-available apps can communicate with other apps using either explicit (11.3%) or implicit (73.1%) intent calls. Therefore the threat element (i.e. Lτ) is far more informative in collusion estimation than the communication element (Lcom) in our model. Figure 2 supports this claim as it presents Lτ for both classes.
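A minimal sketch of the pairwise decision rule described above (threshold on Lτ combined with the binary Lcom). The threshold and the example values come from the validation data set and Table 3; the function name is ours, for illustration only:

    def report_collusion(l_tau, l_com, threshold=0.50):
        """Report possible collusion iff L_tau >= threshold and L_com = 1."""
        return l_tau >= threshold and l_com == 1

    # App pair {id 12, id 13} from Table 3: L_tau = 0.47, communication present.
    print(report_collusion(0.47, 1))   # False -> the false negative discussed above
    # App pair {id 7, id 8}: L_tau = 0.86, intent-based communication detected.
    print(report_collusion(0.86, 1))   # True -> reported as possible collusion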
Both approaches to detect the potential of collusion are constrained in terms of the type of inter-app communication channels they account for, due to reasons explained previously. This makes it difficult to provide a straightforward comparison. The rule-based approach is not limited to pairs of apps (a colluding set may involve more than two apps). It also allows us to assess the direction of the colluding behaviour (for cases of information flow). Defining rules requires expert knowledge, and for some cases explicit rules may not exist; this is overcome by providing rules that over-approximate potentially colluding behaviour.

A statistical approach, on the other hand, has the advantage that it can reflect varying degrees of possible collusion for a given set of apps. In this paper we have largely focused on static attributes, such as permissions; a number of other similar static attributes (such as developer ID, download metrics, and so on) may also be placed on a scale of suspicion for collusion, albeit the current implementation of the statistical approach in this paper is limited due to tool availability.

6.4 Software model-checking

We inspect two sets of applications, one colluding pair, {id 12, id 13}, and one non-colluding pair, {id 12, id 14}, with software model-checking as described in Section 5. The sender, id 12, and the receiver (id 13 or id 14) communicate via broadcast messages containing the details of the GPS location. In the colluding case the broadcast is successful, i.e., the sender reads the GPS location then broadcasts it, and the receiver gets the broadcast then publishes the GPS location on the internet. In the non-colluding case, the sender still broadcasts the private data, which reaches the receiver, but this data is never published. Instead, the receiver publishes something else. The data-flow analysis shows, for the non-colluding case, that the private data is received but is never published. This aspect is not obvious to our Prolog filter. Similarly, the data-flow analysis detects collusion in the first case and reports its path witnesses, such as:

    call(readSecret, p1)
    -> r1 := callRet(readSecret)
    -> call(getBroadcast,r1,r1,"locsteal",p1)
    -> call(sendBroadcast,"locsteal",r1)
    -> r2 := callRet(getBroadcast)
    -> call(publish,r2)

7 Related work

App collusion can be traced back to confused deputy attacks [12]. They happen in the form of permission re-delegation attacks [11, 8, 26] when a permission is carelessly exposed through a public component. Soundcomber [23] is an example where extracted information is transmitted using both overt and covert channels. Marforio et al. [17] define colluding applications as those applications that cooperate in a violation of some security property of the system. Another definition is given by [16], where multiple apps can come together to perform a certain task which is out of their permission capabilities. The malicious component of collusion is also acknowledged by Bagheri, Sadeghi et al. [21]. A more detailed definition is given by Elish [10], which defines it as the collaboration between malicious apps, likely written by the same adversary, to obtain a set of permissions to perform attacks.

ComDroid [6] is a static analysis tool that looks for confused deputies through Intents. XManDroid [3] and TrustDroid [4] extend the Android OS. Both allow for fine-grained policies that control inter-app information exchange; neither of them addresses covert channels. In [24] the authors analyse, using different risk metrics, several compartmentalisation strategies to minimise the risk of app collusion, showing that two or three app compartments drastically reduce the risk of collusion for a set of 20 to 50 apps.

8 Summary

We have presented a new and concise definition of collusion in the context of Android OS. Colluding apps may carry out information theft, money theft, or service misuse; to this end, malware is distributed over multiple apps. As demonstrated by Soundcomber (and our own app sets), collusion is a possibility – which we recently demonstrated to exist in the wild [13]. Together with our industrial partner Intel Security, we have developed a number of approaches towards effectively detecting collusion. Early experimental results on small app sets (of a size nearly the scale of the number of apps one has on a phone) look promising: a combination of the rule-based approach and the machine learning approach could serve as a filter, after which we employ model-checking as a decision procedure.

9 Future work

Like in [13], we will continue using our tools for analysing app sets in the wild. However, it should be noted that a frontal attack on detecting collusion by analysing pairs, triplets and larger sets is not practical given the search space. An effective collusion-discovery tool must include an effective set of methods to isolate potential sets which require further examination.

We have to emphasise that such tools are essential for many collusion-detection methods. Besides our model-checking approach, one could, for example, think of the approach of merging pairs of apps into a single app to inspect aggregated data flows [15]. Merging is slow, and therefore the app-combining approach is predicated on effective filtering of app pairs of interest.

Alternatively (or additionally) to the two filters described in our paper, imprecise heuristic methods to find "interesting" app sets may include: statistical code analysis of apps (e.g. to locate APIs potentially responsible for communication, accessing sensitive information, etc.); and taking into account apps' publication time and distribution channel (app market, direct installation, etc.).

Attackers are likely to release colluding apps in a relatively short time frame, and they are likely to engineer the distribution in such a way that a sufficient number of users would install the whole set (likely from the same app market). To discover such scenarios one can employ: analysis of security telemetry focused on users' devices to examine installation/removal of apps, lists of processes executing simultaneously, device-specific APK download/installation logs from app markets (like Google Play™) and meta-data about APKs in app markets (upload time by developers, developer ID, source IP, etc.). Such data would allow constructing a full view of existing app sets on user devices. Only naturally occurring sets (either installed on the same device or actually executing simultaneously) may be analysed for collusion, which should drastically reduce the number of sets that require deeper analysis.

Our goal is to build a fully automated and effective collusion detection system, and tool performance will be central to addressing scale. It is not clear yet where the bottleneck will be when we apply our approach to real-life apps. Further work will focus on identifying these bottlenecks to optimise the slowest elements of our tool-chain. Detecting covert channels would be a challenge as modelling them will not be trivial.

Acknowledgement

This work has been funded by EPSRC and we are excited to work on this challenging piece of research³. A special thanks goes to Erwin R. Catesbeiana (Jr) for excellent guidance through the Android ecosystem.

³ All code related to the project can be found on our GitHub page: https://www.github.com/acidrepo
References

[1] Android Open Source Project. Dalvik Bytecode. https://source.android.com/devices/tech/dalvik/dalvik-bytecode.html, 2016.

[2] S. Arzt, S. Rasthofer, C. Fritz, E. Bodden, A. Bartel, J. Klein, Y. Le Traon, D. Octeau, and P. McDaniel. FlowDroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps. In ACM SIGPLAN Notices – PLDI'14, volume 49, pages 259–269. ACM, 2014.

[3] S. Bugiel, L. Davi, A. Dmitrienko, T. Fischer, and A.-R. Sadeghi. XManDroid: A new Android evolution to mitigate privilege escalation attacks. TU Darmstadt, TR-2011-04, 2011.

[4] S. Bugiel, L. Davi, A. Dmitrienko, S. Heuser, A.-R. Sadeghi, and B. Shastry. Practical and lightweight domain isolation on Android. In SPSM'11, pages 51–62. ACM, 2011.

[5] J. Burket, L. Flynn, W. Klieber, J. Lim, W. Shen, and W. Snavely. Making DidFail succeed: Enhancing the CERT static taint analyzer for Android app sets. CMU/SEI-2015-TR-001, 2015.

[6] E. Chin, A. P. Felt, K. Greenwood, and D. Wagner. Analyzing inter-application communication in Android. In MobiSys'11, pages 239–252, 2011.

[7] M. Clavel, F. Durán, S. Eker, P. Lincoln, N. Martí-Oliet, J. Meseguer, and C. Talcott. All about Maude. LNCS, 4350, 2007.

[8] L. Davi, A. Dmitrienko, A.-R. Sadeghi, and M. Winandy. Privilege escalation attacks on Android. In Information Security, pages 346–360. 2010.

[9] A. Desnos. Androguard. https://github.com/androguard/androguard, 2016.

[10] K. O. Elish, D. Yao, and B. G. Ryder. On the need of precise inter-app ICC classification for detecting Android malware collusions. In MoST, 2015.

[11] A. P. Felt, H. J. Wang, A. Moshchuk, S. Hanna, and E. Chin. Permission re-delegation: Attacks and defenses. In USENIX Security, 2011.

[12] N. Hardy. The confused deputy (or why capabilities might have been invented). ACM SIGOPS Operating Systems Review, 22(4):36–38, 1988.

[13] J. Blasco, I. Muttik, and M. Roggenbach. Wild Android collusions, 2016. Submitted.

[14] K. Krishnamoorthy. Handbook of Statistical Distributions with Applications. CRC Press, 2015.

[15] L. Li, A. Bartel, T. F. Bissyandé, J. Klein, and Y. Le Traon. ApkCombiner: Combining multiple Android apps to support inter-app analysis. In SEC'15, pages 513–527. Springer, 2015.

[16] T. Mall and S. Gupta. Critical evaluation of security framework in Android applications: Android-level security and application-level security. Int. Res. J. of Comp. and Elec. Engg., 2, 2014.

[17] C. Marforio, A. Francillon, and S. Capkun. Application collusion attack on the permission-based security model and its implications for modern smartphone systems. Technical report, 2011.

[18] H. Peng, C. Gates, B. Sarma, N. Li, Y. Qi, R. Potharaju, C. Nita-Rotaru, and I. Molloy. Using probabilistic generative models for ranking risks of Android apps. In CCS'12, pages 241–252.

[19] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2013.

[20] G. Roşu. From rewriting logic, to programming language semantics, to program verification. In Logic, Rewriting, and Concurrency, pages 598–616. 2015.

[21] A. Sadeghi, H. Bagheri, and S. Malek. Analysis of Android inter-app security vulnerabilities using COVERT. In ICSE'15, pages 725–728, 2015.

[22] B. P. Sarma, N. Li, C. Gates, R. Potharaju, C. Nita-Rotaru, and I. Molloy. Android permissions: a perspective combining risks and benefits. In SACMAT'12, pages 13–22. ACM, 2012.

[23] R. Schlegel, K. Zhang, X.-y. Zhou, M. Intwala, A. Kapadia, and X. Wang. Soundcomber: A stealthy and context-aware sound trojan for smartphones. In NDSS'11, pages 17–33, 2011.

[24] G. Suarez-Tangil, J. E. Tapiador, and P. Peris-Lopez. Compartmentation policies for Android apps: A combinatorial optimization approach. In Network and System Security, pages 63–77. 2015.

[25] G. Suarez-Tangil, J. E. Tapiador, P. Peris-Lopez, and A. Ribagorda. Evolution, detection and analysis of malware for smart devices. IEEE Communications Surveys & Tutorials, 16(2):961–987, 2014.

[26] L. Wu, X. Du, and H. Zhang. An effective access control scheme for preventing permission leak in Android. In Computing, Networking and Communications, pages 57–61. IEEE, 2015.