<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Detecting privacy leaks in Android Apps</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Luxembourg - SnT</institution>
          ,
          <country country="LU">Luxembourg</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The number of Android apps has grown explosively in recent years, and so has the number of apps leaking private data. Apps should be checked for private data leaks before they are released to the app markets, which calls for a privacy leak detection tool. We propose a static taint analysis approach that leverages the control-flow graph (CFG) of apps to detect privacy leaks in Android apps. We tackle three problems, related to inter-component communication (ICC), the lifecycle of components and the callback mechanism, that make the CFG imprecise. To bridge this gap, we explicitly connect the discontinuities of the CFG to obtain a precise CFG. Based on the precise CFG, we aim to provide a taint analysis approach that detects intra-component, inter-component and inter-app privacy leaks.</p>
      </abstract>
      <kwd-group>
        <kwd>Static analysis</kwd>
        <kwd>Taint analysis</kwd>
        <kwd>Privacy Leaks</kwd>
        <kwd>ICC</kwd>
        <kwd>CFG</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Android has become the most popular mobile phone operating system over the
last three years. There are hundreds of thousands of applications, with more
emerging every day. As of May 2013, 48 billion apps had been installed from the
Google Play store, and as of September 3, 2013, 1 billion Android devices had been
activated<fn><label>1</label><p>http://en.wikipedia.org/wiki/Android_(operating_system)</p></fn>.
Meanwhile, the Android operating system has also become a worthwhile
target for security and privacy attacks. A major problem in Android is private
data leaks. Many data leaks have been reported this year, for example through
short messages, phone calls and HTTP connections.</p>
      <p>We use a static taint analysis technique based on the control-flow graph (CFG) of
the analyzed apps to detect privacy leaks in Android. Static taint analysis
is a kind of data-flow analysis that keeps track of values derived from
sensitive data. We first label the private data at what we call a source (for instance,
a method returning GPS coordinates) and then track the data by statically
analyzing the code. If the private data reaches a method that sends it outside
the application, called a sink method, we identify this as a private data leak
and tag the path from the source to the sink as a detected tainted path. In
the CFG, a tainted path means that the sink method is reachable from the source
method. Thus, privacy leak detection identifies paths between pre-defined
source and sink methods in the CFG of the analyzed apps.</p>
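      <p>The source, tracking and sink steps described above can be illustrated with a minimal sketch. It is not the tool's implementation: it assumes straight-line code made of simplified statements of the form target = method(argument), and the class TaintSketch, its Stmt type and the method name concat are hypothetical names introduced only for this illustration.</p>
      <preformat><![CDATA[
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy forward taint propagation over straight-line code (illustrative only). */
final class TaintSketch {

    /** Simplified statement "target = method(argument)"; target and argument may be null. */
    static final class Stmt {
        final String target;
        final String method;
        final String argument;

        Stmt(String target, String method, String argument) {
            this.target = target;
            this.method = method;
            this.argument = argument;
        }
    }

    /** Returns every sink call whose argument is derived from a source. */
    static List<Stmt> findLeaks(List<Stmt> program, Set<String> sources, Set<String> sinks) {
        Set<String> tainted = new HashSet<>();
        List<Stmt> leaks = new ArrayList<>();
        for (Stmt s : program) {
            boolean argTainted = s.argument != null && tainted.contains(s.argument);
            if (argTainted && sinks.contains(s.method)) {
                leaks.add(s); // tainted data leaves the application
            }
            if (s.target != null) {
                if (sources.contains(s.method) || argTainted) {
                    tainted.add(s.target);    // value derived from sensitive data
                } else {
                    tainted.remove(s.target); // overwritten with untainted data
                }
            }
        }
        return leaks;
    }

    public static void main(String[] args) {
        List<Stmt> program = Arrays.asList(
            new Stmt("id", "getDeviceId", null),   // source
            new Stmt("msg", "concat", "id"),       // taint propagates to msg
            new Stmt(null, "send", "msg"));        // sink receives tainted msg
        List<Stmt> leaks = findLeaks(program,
            Collections.singleton("getDeviceId"), Collections.singleton("send"));
        System.out.println("leaks found: " + leaks.size());
    }
}
```
]]></preformat>
      <p>A real analysis works on the CFG of the whole app rather than on one statement list, so branches, method calls and fields would also have to be handled.</p>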
      <p>[Fig. 1. The three problems (ICC methods, lifecycle methods and callback methods) that make the generated CFG imprecise. Labels recovered from the figure: startActivity, onCreate, Button Click, Android System, onCreate, onBind, onClick, ICC methods.]</p>
      <p>Since we detect privacy leaks by identifying paths from source methods to
sink methods in the generated CFG of the analyzed apps, a precise CFG is essential.
However, three problems make the generated CFG imprecise; they are shown in
Fig. 1. The first problem is related to inter-component communication (ICC)
methods in Android and is detailed in Section 2.1. The second problem is related
to Android's lifecycle methods and the third to callback methods; they are
detailed in Section 2.2 and Section 2.3, respectively.</p>
      <preformat>
 1 class Activity1 {
 2   void onCreate(Bundle state) {
 3     Button btn = (Button) findViewById(to2a);
 4     btn.setOnClickListener(new OnClickListener() {
 5       public void onClick(View view) {                          // callback run by the system
 6         String deviceid = telephonyManager.getDeviceId();       // source
 7         Intent intent = new Intent(Activity1.this, Activity2.class);
 8         intent.putExtra("deviceid", deviceid);                  // tainted value stored in the Intent
 9         Activity1.this.startActivity(intent);                   // ICC method
10 }});}}
11 class Activity2 {
12   void onResume() {
13     Intent intent = getIntent();
14     String deviceid = intent.getStringExtra("deviceid");        // tainted value read back
15     HttpClientHelper.send(deviceid);                            // sink
16 }}
      </preformat>
      <p>Listing 1.1. Example of a cross-component data leak.</p>
      <sec id="sec-1-1">
        <title>ICC methods</title>
        <p>A component is the basic unit for building Android apps. There are four types
of components: a) Activity, representing the user interface; b) Service,
executing tasks in the background; c) Broadcast Receiver, receiving messages from other
components or the system; and d) Content Provider, acting as the standard
interface to share structured data between applications. Some specific Android
system methods are used to trigger component communication. We call them
Inter-Component Communication (ICC) methods. The most used ICC method is
the startActivity method, which starts a new Activity. Components use Intents to
communicate with each other. All ICC methods take at least one Intent as a
parameter. Intents can also encapsulate data and thus transfer it between
components.</p>
        <p>Take Listing 1.1 as an example. Activity1 and Activity2 are two
components. Activity1 contains one ICC method, startActivity. Activity1 also
contains one source method, getDeviceId, which returns the unique device ID
(e.g., the IMEI for GSM and the MEID or ESN for CDMA phones); we consider
the device ID to be sensitive data. Activity2 contains one sink method, send,
which sends data outside the application. Taken in isolation, neither Activity1
nor Activity2 contains a taint path. In fact, however, there is a data leak from the
source method getDeviceId in Activity1 to the sink method send in Activity2.</p>
        <p>Because of the component communication mechanism of Android, we cannot
detect cross-component taint paths simply by tracking tainted data: there is
no real code connection between two components, only glue code
for inter-component communication. We therefore need to connect components
so that we can build a precise CFG and thereby track tainted
data across multiple components. To achieve this, we want to tackle the following
challenges.</p>
        <p>Getting Precise ICC Links among Components. Two types of ICC links exist in
Android: explicit ICC links and implicit ICC links. Identifying implicit ICC links is
more difficult than identifying explicit ones because matching two components is
more complicated: the Android system uses three conditions (Action, Category
and Data) to resolve implicit ICC (and also inter-app communication, IAC). To
obtain all implicit ICC links precisely, we need to handle all the conditions
related to implicit ICC.</p>
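        <p>The three conditions can be sketched as a matching function between an Intent and an intent filter. This is a deliberately reduced model and not the Android implementation: the data condition is limited to the URI scheme (MIME types, hosts and paths are ignored), and the classes IntentInfo and FilterInfo are hypothetical stand-ins for the information a static analysis would extract from the code and the manifest.</p>
        <preformat><![CDATA[
```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Simplified model of implicit Intent resolution (illustrative; not the Android implementation). */
final class ImplicitIccSketch {

    /** Hypothetical reduced view of an Intent. */
    static final class IntentInfo {
        final String action;           // may be null
        final Set<String> categories;
        final String scheme;           // may be null when the Intent carries no data URI

        IntentInfo(String action, Set<String> categories, String scheme) {
            this.action = action;
            this.categories = categories;
            this.scheme = scheme;
        }
    }

    /** Hypothetical reduced view of an intent filter declared by a component. */
    static final class FilterInfo {
        final Set<String> actions;
        final Set<String> categories;
        final Set<String> schemes;

        FilterInfo(Set<String> actions, Set<String> categories, Set<String> schemes) {
            this.actions = actions;
            this.categories = categories;
            this.schemes = schemes;
        }
    }

    static Set<String> setOf(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    /** True if the Intent passes the action, category and (reduced) data tests of the filter. */
    static boolean matches(IntentInfo intent, FilterInfo filter) {
        // Action: the filter must list at least one action; an Intent action, if set, must be listed.
        boolean actionOk = !filter.actions.isEmpty()
                && (intent.action == null || filter.actions.contains(intent.action));
        // Category: every category of the Intent must be declared by the filter.
        boolean categoryOk = filter.categories.containsAll(intent.categories);
        // Data (scheme only): no data matches only a filter without schemes.
        boolean dataOk = intent.scheme == null
                ? filter.schemes.isEmpty()
                : filter.schemes.contains(intent.scheme);
        return actionOk && categoryOk && dataOk;
    }

    public static void main(String[] args) {
        FilterInfo viewer = new FilterInfo(
                setOf("android.intent.action.VIEW"),
                setOf("android.intent.category.DEFAULT"),
                setOf("http"));
        IntentInfo view = new IntentInfo(
                "android.intent.action.VIEW", setOf("android.intent.category.DEFAULT"), "http");
        IntentInfo send = new IntentInfo("android.intent.action.SEND", setOf(), "http");
        System.out.println("VIEW matches: " + matches(view, viewer));
        System.out.println("SEND matches: " + matches(send, viewer));
    }
}
```
]]></preformat>
        <p>Each (ICC method, component) pair for which such a test succeeds would become one implicit ICC link in the CFG.</p>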
        <p>Distinguishing Intent Data. Intents are used to transfer data between
components. One Intent can contain many data items, but only some of them may be
tainted. We need to distinguish them to avoid false positives.</p>
        <p>Resolving Special ICC Methods. Some ICC methods, which we call special
ICC methods, have more complicated semantics than common ICC
methods, which only trigger one-way communication between components (e.g.,
the startActivity method). We need to handle them specifically. With the method
startActivityForResult, a result may be sent back from an activity when it
ends. For example, Component A uses the startActivityForResult method to
start Component B and waits until B ends. When B ends, Component A
retrieves the results returned from Component B and runs again.</p>
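        <p>The "Distinguishing Intent Data" challenge above can be illustrated by contrasting a key-sensitive model of Intent extras with a coarse one. The sketch below is an assumption-laden toy, not the tool's data structure: IntentTaintSketch and its methods are hypothetical names, and extras are reduced to a map from key to a taint flag.</p>
        <preformat><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the taint state of Intent extras (illustrative only). */
final class IntentTaintSketch {

    /** Maps each extra key to whether the value stored under it is tainted. */
    private final Map<String, Boolean> taintByKey = new HashMap<>();

    /** Models intent.putExtra(key, value) for a value whose taint state is known. */
    void putExtra(String key, boolean valueIsTainted) {
        taintByKey.put(key, valueIsTainted);
    }

    /** Key-sensitive read: only the value stored under this key decides. */
    boolean isTaintedPrecise(String key) {
        return taintByKey.getOrDefault(key, false);
    }

    /** Key-insensitive read: one tainted extra taints every read, whatever the key. */
    boolean isTaintedCoarse(String key) {
        return taintByKey.containsValue(true);
    }

    public static void main(String[] args) {
        IntentTaintSketch intent = new IntentTaintSketch();
        intent.putExtra("deviceid", true);  // value returned by a source
        intent.putExtra("title", false);    // harmless constant
        System.out.println("precise, title: " + intent.isTaintedPrecise("title"));
        System.out.println("coarse, title: " + intent.isTaintedCoarse("title"));
    }
}
```
]]></preformat>
        <p>With the coarse model, sending the harmless "title" extra to a sink would be reported as a leak, which is the kind of false positive the key-sensitive model avoids.</p>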
      </sec>
      <sec id="sec-1-2">
        <title>Lifecycle methods</title>
        <p>In the lifecycle management of Android components, there is no main
method as in a traditional Java application. Instead, the Android system switches
between the states of a component's lifecycle by calling callback methods such as
onCreate, onStart or onResume. The lifecycle of an Activity is shown in Fig. 2.
At least six lifecycle methods (e.g., onPause) are involved in an Activity's
lifetime. These methods are not directly connected in the app's code; instead,
they are executed by the Android system.</p>
        <p>[Fig. 2. Lifecycle of an Activity: START, onCreate, onStart, onResume, RUNNING, onPause, onStop, onDestroy, END, with onRestart leading back to onStart. Transition labels: (A) another activity comes into the foreground; (B) the activity is no longer visible; (C) the activity is finishing or being destroyed by the system; (D) the process of the app is killed; (E) the user navigates to the activity; (F) the user returns to the activity.]</p>
        <p>Callback methods. Because Android apps are user-centric, a user can interact
extensively with the apps (or the system) through the touch screen. User input is
mainly managed by handling specific callback methods such as the onClick method,
which is called when the user clicks on a button. For example, the method onClick
(line 5 in Listing 1.1) is a callback method that is executed when its
related button is clicked. However, no code in the application calls these
methods directly.</p>
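        <p>One way to picture how the missing connections could be made explicit is to enumerate the edges that the Android system implies between lifecycle methods and registered callbacks. The sketch below is only an illustration under stated assumptions: LifecycleEdgesSketch is a hypothetical name, callbacks are assumed to fire only while the Activity is running (between onResume and onPause), and the restart after a killed process is not modeled.</p>
        <preformat><![CDATA[
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: edges implied by the Activity lifecycle and by registered callbacks (illustrative only). */
final class LifecycleEdgesSketch {

    /** Textual form of a directed edge "from may be followed by to". */
    static String edge(String from, String to) {
        return from + " -> " + to;
    }

    /** Returns the implied edges for one Activity with the given callback methods. */
    static List<String> activityEdges(List<String> callbacks) {
        List<String> edges = new ArrayList<>();
        String[] order = {"onCreate", "onStart", "onResume", "onPause", "onStop", "onDestroy"};
        for (int i = 0; i + 1 < order.length; i++) {
            edges.add(edge(order[i], order[i + 1])); // normal forward order
        }
        edges.add(edge("onPause", "onResume"));      // user returns to the activity
        edges.add(edge("onStop", "onRestart"));      // user navigates back to the activity
        edges.add(edge("onRestart", "onStart"));
        for (String callback : callbacks) {
            edges.add(edge("onResume", callback));   // assumption: callbacks run while RUNNING
            edges.add(edge(callback, "onPause"));
        }
        return edges;
    }

    public static void main(String[] args) {
        for (String e : activityEdges(Arrays.asList("onClick"))) {
            System.out.println(e);
        }
    }
}
```
]]></preformat>
        <p>Adding such edges to the CFG gives the analysis a path through methods that the application code itself never calls.</p>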
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Aims and Goals</title>
      <p>Our main goal is to detect private data leaks in Android applications. To this end, we
define the specifications of, design and implement a static analysis tool that
detects sensitive taint paths between customized sources and sinks in Android
applications. The tool will detect not only intra-component sensitive paths, but
also inter-component and inter-application sensitive paths.</p>
      <p>The main expected contribution of this research to the field of Engineering
Secure Software and Systems is a taint analysis tool that is precise, sound and
efficient, and that produces fewer false positives for analysts who work in the field of
Android security. The main usage of the tool is to detect privacy leaks. The tool
can build call paths between two methods, one being a source and the other a
sink. We define sources and sinks for our goal of detecting private data leaks.
However, the tool could be used for other purposes. For example, it can be used to
detect that a resource is opened but never closed by defining the open-resource
method as the source and the close-resource method as the sink.</p>
    </sec>
    <sec id="sec-3">
      <title>Solution</title>
      <p>We plan to statically detect privacy leaks with the CFG of the analyzed apps.
However, the three problems detailed in Section 2 make the
CFG imprecise, and we cannot rely on an imprecise CFG to detect privacy leaks.
Thus, our approach first builds a precise CFG by connecting all the isolated CFGs
of Android apps. Then, based on the precise CFG, we check whether a source
method can reach a sink method. If a sink method is reachable from a
source method, a privacy leak is detected. Since the precise CFG
also models inter-component communication, we can detect ICC-based as
well as IAC-based privacy leaks.</p>
      <p>[Fig. 3. Precise CFG of the example in Listing 1.1. Activity1: onCreate(), onClick(), getDeviceId(), startActivity(intent). Activity2: onCreate(), getIntent(), onResume(), send(deviceid). The added connections are labeled ICC problem, Lifecycle problem and Callback problem.]</p>
      <p>The precise CFG of the example illustrated in Listing 1.1 is shown in Fig. 3.
In the CFG, we connect the startActivity and onCreate methods to resolve the ICC
problem, the onCreate and onResume methods to resolve the lifecycle
problem, and the onCreate and onClick methods to resolve the callback
problem. Based on the precise CFG, we can detect an ICC-based privacy leak from
getDeviceId in Activity1 to send in Activity2.</p>
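      <p>The effect of the three added connections can be checked with a small reachability sketch over the methods of Listing 1.1. It is a simplification and not the tool itself: nodes are method names, PreciseCfgSketch is a hypothetical class, and plain reachability stands in for the data-flow tracking that a taint analysis performs along the path.</p>
      <preformat><![CDATA[
```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch: source-to-sink reachability before and after connecting isolated CFGs. */
final class PreciseCfgSketch {

    private final Map<String, List<String>> successors = new HashMap<>();

    void addEdge(String from, String to) {
        successors.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    /** Depth-first search: true if sink can be reached from source. */
    boolean reaches(String source, String sink) {
        Deque<String> work = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        work.push(source);
        seen.add(source);
        while (!work.isEmpty()) {
            String node = work.pop();
            if (node.equals(sink)) {
                return true;
            }
            for (String next : successors.getOrDefault(node, Collections.<String>emptyList())) {
                if (seen.add(next)) {
                    work.push(next);
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PreciseCfgSketch cfg = new PreciseCfgSketch();
        // Edges visible in the code of each component taken in isolation.
        cfg.addEdge("Activity1.onClick", "getDeviceId");
        cfg.addEdge("getDeviceId", "startActivity");
        cfg.addEdge("Activity2.onResume", "getIntent");
        cfg.addEdge("getIntent", "send");
        System.out.println("isolated CFGs: " + cfg.reaches("getDeviceId", "send"));

        // Connections added to obtain the precise CFG.
        cfg.addEdge("Activity1.onCreate", "Activity1.onClick");  // callback problem
        cfg.addEdge("startActivity", "Activity2.onCreate");      // ICC problem
        cfg.addEdge("Activity2.onCreate", "Activity2.onResume"); // lifecycle problem
        System.out.println("precise CFG: " + cfg.reaches("getDeviceId", "send"));
    }
}
```
]]></preformat>
      <p>Without the added ICC and lifecycle edges the search stops at startActivity, so the leak in Activity2 stays invisible.</p>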
      <p>
        Current Release. To detect privacy leaks in Android apps, we have built
a prototype tool based on our previous work<sup>2</sup>: Epicc [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which generates links
between components of Android applications, and FlowDroid [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which performs
intra-component taint analysis. Both Epicc and FlowDroid use the Soot
framework [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which uses the Dexpler plugin [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to convert Android Dalvik bytecode
to Soot's internal representation, called Jimple. The current version essentially
solves the three problems detailed in Section 2. However, some special cases (e.g.,
distinguishing the data in an Intent) still need further work.
      </p>
      <p>Evaluation Plan. We aim to analyze private data leaks in both third-party
and preloaded apps. We intend to test intra-component, inter-component
and inter-app privacy leaks. We plan to first test our approach on sample
apps written by ourselves: since we know the expected output for these apps,
we can compare the precision of our tool with that of other existing tools. Then, we
will look for privacy leaks in real-world applications.</p>
    </sec>
    <sec id="sec-4">
      <title>Related Work</title>
      <p>
        Android privacy leaks have recently attracted considerable attention. AppIntent [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
analyzes user-intended sensitive data transmission in Android. Woodpecker [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
studies capability leaks by analyzing the reachability of a dangerous permission
from a public, unguarded interface. Zhou and Jiang [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] report passive content leaks,
which cause affected applications to passively disclose in-application data. Our
approach differs from these in that we provide a generic taint analysis tool
that can detect all of the above leaks with few false alarms. CHEX [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] uses
taint analysis to detect component hijacking vulnerabilities in Android
applications. However, it does not analyze calls into the Android framework
itself.
        <fn><label>2</label><p>With our colleagues of TU Darmstadt and Penn State University.</p></fn>
      </p>
      <p>
        We aim to provide a static taint analysis tool with more precise and sound
results than existing tools. To achieve this, we rely on
FlowDroid [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], a highly precise taint analysis tool for Android, and Epicc [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], a highly
effective ICC mapping (from ICC method to destination component) tool for
Android.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>We have described the problems of detecting privacy leaks in Android apps and
our solutions to these problems. Our motivation
is to build a precise CFG and thereby to detect ICC-based as well as IAC-based
privacy leaks in Android apps.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Damien</given-names>
            <surname>Octeau</surname>
          </string-name>
          et al.
          <article-title>Effective inter-component communication mapping in android with epicc: An essential step towards holistic security analysis</article-title>
          .
          <source>In: Proceedings of the 22nd USENIX Security Symposium</source>
          .
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Steven</given-names>
            <surname>Arzt</surname>
          </string-name>
          et al.
          <article-title>FlowDroid: Precise Context, Flow, Field, Object-sensitive and Lifecycle-aware Taint Analysis for Android Apps</article-title>
          .
          <source>In: Proceedings of the 35th Conference on Programming Language Design and Implementation</source>
          .
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Lam</surname>
          </string-name>
          et al.
          <article-title>The Soot framework for Java program analysis: a retrospective</article-title>
          .
          <source>In: Cetus Users and Compiler Infrastructure Workshop (CETUS 2011)</source>
          .
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Alexandre</given-names>
            <surname>Bartel</surname>
          </string-name>
          et al.
          <article-title>Dexpler: Converting Android Dalvik Bytecode to Jimple for Static Analysis with Soot</article-title>
          .
          <source>In: ACM Sigplan International Workshop on the State Of The Art in Java Program Analysis</source>
          . Beijing, China,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Zhemin</given-names>
            <surname>Yang</surname>
          </string-name>
          et al.
          <article-title>Appintent: Analyzing sensitive data transmission in android for privacy leakage detection</article-title>
          .
          <source>In: Proceedings of the 2013 ACM SIGSAC conference on Computer &amp; communications security. ACM</source>
          .
          <year>2013</year>
          , pp.
          <fpage>1043</fpage>
          –
          <lpage>1054</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Grace</surname>
          </string-name>
          et al.
          <article-title>Systematic detection of capability leaks in stock Android smartphones</article-title>
          .
          <source>In: Proceedings of the 19th Annual Symposium on Network and Distributed System Security</source>
          .
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Yajin</given-names>
            <surname>Zhou</surname>
          </string-name>
          and
          <string-name>
            <given-names>Xuxian</given-names>
            <surname>Jiang</surname>
          </string-name>
          .
          <article-title>Detecting passive content leaks and pollution in android applications</article-title>
          .
          <source>In: Proceedings of the 20th Annual Symposium on Network and Distributed System Security</source>
          .
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Long</given-names>
            <surname>Lu</surname>
          </string-name>
          et al.
          <article-title>Chex: statically vetting android apps for component hijacking vulnerabilities</article-title>
          .
          <source>In: Proceedings of the 2012 ACM conference on Computer and communications security. ACM</source>
          .
          <year>2012</year>
          , pp.
          <fpage>229</fpage>
          –
          <lpage>240</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>