=Paper= {{Paper |id=Vol-2691/paper10 |storemode=property |title=Energy Consumption Measurement Frameworks for Android OS: A Systematic Literature Review |pdfUrl=https://ceur-ws.org/Vol-2691/paper10.pdf |volume=Vol-2691 |authors=Vladislav Myasnikov,Stanislav Sartasov,Ilya Slesarev,Pavel Gessen }} ==Energy Consumption Measurement Frameworks for Android OS: A Systematic Literature Review== https://ceur-ws.org/Vol-2691/paper10.pdf
    Energy Consumption Measurement Frameworks for
      Android OS: A Systematic Literature Review
                        Vladislav Myasnikov∗ , Stanislav Sartasov† , Ilya Slesarev‡ and Pavel Gessen§
                                                    ∗ Saint Petersburg State University

                                                       vladislav.myasnikov@bk.ru
                                                    † Saint Petersburg State University

                                                        stanislav.sartasov@spbu.ru
                                                    ‡ Saint Petersburg State University

                                                          slesarev.pr@gmail.com
                                               § Saint Petersburg Polytechnical University

                                                           pashagess2@mail.ru


      Abstract—In the modern world smartphones have become com-         one should be able to compare application or module energy
   monly used electronic devices performing numerous day-to-            consumption before and after refactoring. Therefore a tool
   day tasks and much more. Quick battery discharge degrades            for conducting such experiments is required. Such a tool
   user experience, and computationally intensive or badly written
   programs are responsible for it. It is not always evident which      may come as one-time testbed for a specific project or a
   tool to use and how to set up an experiment to estimate energy       more generic and reusable framework or utility. It should be
   consumption of a specific application. For this we conducted a       noted that there’s little uniformity among such tools. They
   Systematic Literature Review (SLR) to list existing frameworks to    differ not only in metering methodologies but in measurement
   measure application power metering for Android OS, to classify       results as well, i.e. some frameworks measure battery charge
   the approaches used to create them and also to assess their
   accuracy and experimental methodology. Our findings indicate         percentage change, others calculate consumed power in watts,
   that although there is a considerable amount of studies in this      while another group operates in abstract units of measure.
   field with various approaches, there is still a vacant place for a      In order to help practitioners and engineers better under-
   readily available tool, and it is difficult to compare accuracy of   stand current state-of-the-art approaches to energy consump-
   different frameworks. However there is a solid set of practices      tion measurement and to select proper approach, technique or
   and techniques for experimental setup in application energy
   measurement.                                                         tool for a particular experiment, we conduct a systematic liter-
                                                                        ature review (SLR) on the energy consumption measurement
                          I. INTRODUCTION                               frameworks for mobile devices using Android OS. We selected
      It is impossible to imagine a modern world without smart-         this mobile platform as it is the most widely represented platform on
   phones and other mobile devices. Their compact size com-             the market compared to iOS and Windows Phone [3], and it
   bined with significant computational capabilities grant them a       is also open-sourced, meaning that some types of frameworks
   firm position as a day-to-day informational and recreational         (for example those that modify OS kernel) might be absent on
   tool. At the end of 2019 the number of smartphone users is           other platforms.
   expected to be 3.2 billion with Android OS as a leading mobile          This paper is organized as follows. In Section 2 our method-
   operating system [1].                                                ology for SLR is described and research questions (RQs) are
      As smartphones and tablets are mobile electronic devices,         formulated. Section 3 contains answers for RQs. Limitations
   their user experience is substantially defined by battery lifetime.  relating our proposed framework classification to a number of
   While hardware components are constantly improving with              relating our proposed framework classification to a number of
   power-saving electronics and more capacious batteries being          commercial profilers is addressed in Section 5. Conclusions
   available to the market, inefficiently written software causes       are drawn and future work is outlined in Section 6.
   degradation of user experience due to elevated charge drain.
                                                                                                 II. METHOD
      Different applications and even different versions of the
   same application consume energy differently. A school of                We followed SLR guidelines by Kitchenham and Charters
   thought called ”green software development” advocates a need         [4] with a number of differences:
   to consider energy consumption as well as performance met-              • Instead of a manual search process we used an online
   rics during application development [2]. A common practice is             search engine. While this decision certainly affects the
   energy-efficient refactoring — such a change in software code             selection of studies compared to manual search, we used
   that doesn’t change its end user functionality but decreases its          a large number of papers to make a preliminary list and
   energy footprint.                                                         introduced additional phases to study selection process,
      Was a particular energy refactoring effective? How much                so we think the impact of this change on SLR quality is
   energy did we save by applying it? To answer these questions              negligible.



Copyright© 2020 for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
  •   Quality assessment was done along with the data ex-             the experiment itself and its conclusions are the centerpiece
      traction process and in some sense — as a part of               of the corresponding article. If, however, this artifact is the
      it. However Kitchenham and Charters [4] allow such              main discussion point for the article, we consider it to be a
      methodology, noting that quality information can be used        framework.
      either to assist primary study selection by constructing           Usually a testbed is only vaguely described, while frame-
      detailed inclusion/exclusion criteria prior to the main data    works are documented in detail, up to the point of open
      collection activity or to assist data analysis and synthesis,   source repositories, so this separation worked well. There
      so it is collected along with the main data.                    was a handful of notable exceptions however where a study
                                                                      focused on a specific energy-efficiency question also contained
A. Research Questions                                                 a detailed description of its testing environment along with the
   The following list of research questions (RQs) was compiled        experimental methodology and software architecture. In such
during initial literature assessment:                                 cases we asked whether the testbed is described in such
   • RQ1: What approaches exist to estimate code fragments,           detail that it could be the subject of a framework-centered article.
     methods or application energy consumption under An-              If so, the study was added to the short-list.
     droid OS?                                                           Additionally we do not include articles regarding smart
   • RQ2: Are there open source code repositories or other            watches and similar devices as their functionality is rather lim-
     programming artifacts for corresponding frameworks?              ited compared to the smartphone, therefore only mainstream
   • RQ3: What devices are used in measurement experi-                Android OS is considered as a focus of this SLR. We also
     ments?                                                           exclude articles focused on Windows Phone and iOS.
   • RQ4: How many frameworks specify or suggest experi-                 To summarize, our final criteria for including a study in the
     mental methodology?                                              short-list were as follows:
   • RQ5: What is the measurement precision and what is the              • Studies describing technical artifacts for other operating
     base scenario to compare to?                                           systems than mainstream Android OS are excluded.
   However after preliminary data extraction we also added               • Studies focused on a framework as an end result of a

the following questions due to the frequently occurring gaps in             research are included.
methodology in the reviewed studies:                                     • Studies focused on other topics but containing enough
                                                                            information on a testbed or a method to make one for it
   • RQ6: What units of measurements are used in experi-
                                                                            to be considered separable as a standalone framework are
     ments with a particular framework and what is measured?
                                                                            included.
   • RQ7: How do frameworks deal with metering hardware
                                                                         • All other articles are excluded.
     frequency being considerably lower than CPU frequency?
                                                                         At the first stage Myasnikov, Sartasov and Gessen processed
   RQ6 is answered to classify all the different ways to report       every article independently. An article was assigned ”+” if
measurement results. RQ7 came to our attention due to the             a reviewer considered it to be worthy of inclusion, ”-” if a
possibility of a child thread being entirely executed between         review was negative and ”∼” if in doubt. Three pluses or two
multimeter synchronization impulses, so that its contribution         pluses and a tilde meant inclusion to short-list, while an article
to the energy consumption would not be properly recorded.             with three minuses was excluded from further reading. Other
B. Search Process                                                     review combinations signified review conflicts. About 45% of
                                                                      studies were marked conflicts at the end of this phase.
   A preliminary list of 931 studies was formed with a Google
                                                                         Secondly, studies with conflicting reviews were additionally
Scholar1 resource. The query was ”android ”energy-efficiency”
                                                                      reviewed by Slesarev, and each article was read in greater
framework”, therefore ”energy-efficiency” was required to
                                                                      detail and discussed until a consensus was formed among the
appear as a phrase in a study text.
                                                                      researchers. At the end of this phase 51 articles were added
C. Study Selection                                                    to short list.
   At first inclusion criteria were rather simple: include               Some studies in the result list deserve specific mention.
studies about energy measuring frameworks, exclude all others.        Although WattsOn framework [5] targets Windows Phone, it
However, it quickly became apparent they were not sufficient.         was included because authors claim their approach is portable
Indeed some studies were discussing a software and hardware           to Android OS. 3 additional articles were added to the short
complex or purely software tool for measuring energy expen-           list manually as they were not a part of original study list:
                                                                         • Yoon, Kim et al. [6]
diture which was generic enough to be called a framework.
                                                                         • Zhang, Tiwana et al. [7]
However many articles contained a description of a testbed —
                                                                         • Li and Gallagher [8]
a testing environment that measures specific program energy
consumption in one way or another. In a considerable amount              Those studies became known either due to past literature
of cases the distinction between a testbed and a framework was        reviews in similar topics or during review process when they
blurry. We consider a programming artifact to be a testbed if         were referenced in multiple other articles. They were also
                                                                      reviewed independently by three researchers and got 3 pluses
 1 https://scholar.google.com                                         for review.
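
The first-stage voting rule described above (three independent reviewers, marks "+", "-" and "∼") can be sketched in a few lines. This is only an illustration of the stated rule, not the authors' tooling: the function name `classify_votes` is hypothetical, and the ASCII character "~" stands in for the tilde mark.

```python
from collections import Counter

ALLOWED_MARKS = {"+", "-", "~"}


def classify_votes(votes):
    """Classify one article from three reviewer marks.

    Returns "include" for three pluses or two pluses and a tilde,
    "exclude" for three minuses, and "conflict" for anything else.
    """
    votes = list(votes)
    if len(votes) != 3 or any(v not in ALLOWED_MARKS for v in votes):
        raise ValueError("expected exactly three marks from '+', '-', '~'")
    counts = Counter(votes)
    if counts["+"] == 3 or (counts["+"] == 2 and counts["~"] == 1):
        return "include"
    if counts["-"] == 3:
        return "exclude"
    return "conflict"
```

Under this rule a combination such as two pluses and one minus is a conflict and goes to the second review stage.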
   Third stage of study selection was done along with data                4) How is raw metering data downloaded or otherwise
extraction. 6 studies in fact were written about a different                  obtained for processing?
subject while looking like a framework article at a top level,            5) What are the raw data transformations used to
so our short-list contained 48 articles.                                      present experiment outcome?
   However at this time we’ve found out that in some cases                6) Is there an open source code for a framework?
multiple studies were describing the same framework under            • Experimental methodology questions:
different angles or at different points of development cycles.            1) What are the preliminary actions to undertake on a
It is a common situation for an ongoing research project. As                  test device before starting measurements? Examples
the focus of our SLR is the frameworks and not the articles                   include turning off Wi-Fi, stopping applications,
per se, if multiple studies were written by the same or similar               charging up to 100%, controlling device temperature
collective of authors from the same institution and described                 etc.
similarly named frameworks based on similar principles,                   2) Is measurement accuracy estimated? If yes, what is
after discussion and consensus between researchers they were                  the base of comparison?
considered to be written about a single framework, and data               3) Does framework experimental methodology con-
extraction results were merged for these studies. In particular               sider frequency of metering tools? What is the
we grouped the articles concerning the following frameworks:                  frequency of the tools used by the authors?
  • JouleUnit by Wilke et al. [9], [10]                              If an article didn’t contain the information required to answer
  • Greendroid by Couto et al. [11], [12], [13]                   the question or if a question was not applicable to a specific
  • PETrA by Di Nucci, Palomba et al. [14], [15]                  article, it was also noted.
  • Aggarwal et al. [16] and Feghi [17]
                                                                     A pool of works was distributed among Myasnikov, Sle-
  • Li and Gallagher [18], [8]
                                                                  sarev and Sartasov, and data from each study was extracted
  • Ahmad et al. [19], [20]
                                                                  independently. Quality control was applied selectively if there
  In the end, out of 931 studies aggregated automatically and     was a misunderstanding between team members regarding
3 papers added manually our short-list contained 48 articles      particular data extraction result for a specific study. In this
summarized into 41 frameworks.                                    case data extraction was repeated collectively until consensus
                                                                  was formed.
D. Data Extraction                                                   Data extraction results were aggregated in an online docu-
  To answer the stated RQs we answered the following data         ment2 . We were able to extract relevant information for each
extraction questions for each included study or study group:      question for each framework in the list.
  • Conceptual questions:                                         E. Quality assessment
     1) What is the method of reading current battery charge         Generally SLRs are accompanied with a quality assessment
         level or energy expenditure? Examples include ex-        procedure concerning every included article. As we’re more
         ternal multimeter, internal sensor, Android OS API       interested in frameworks than in individual articles themselves,
         etc.                                                     we decided to modify this process by assessing the quality
     2) How large is the part of an application to be             of individual studies if they were not grouped with other
         profiled? For example, one may measure energy            publications, otherwise we assessed a group of articles as a
         consumption of a single line of code, a code block,      whole. Therefore if a framework is described in two articles,
         a method, a unit-test, an application or all currently   we assess the quality from both of them simultaneously.
         running applications.                                       To assess the quality we formulate the following questions:
     3) What is the end result of a measurement experi-
         ment?                                                       1) Is there an open-source code for the framework?
     4) What are the units of measurement utilized in a              2) Is framework measurement approach described?
         framework (i.e. watts, joules, relative units)?             3) Is experimental methodology described?
                                                                     4) Is framework accuracy assessed?
  • Technical questions:
                                                                     These questions intentionally overlap with our RQs, as
     1) Is the application code executed on a smartphone
                                                                  this information is in our opinion essential if a reader wants
         or special test device and what is its Android OS
                                                                  to understand a proposed framework. They were scored as
         version?
                                                                  follows:
     2) What kind of code instrumentation is used if any?
                                                                     • Question 1: yes (Y) if a repository can be found, partly
         In this context we define code instrumentation as
         addition of framework-specific subprograms (i.e. for           (P) if a program using a framework can be found, but not
         method start and end logging) to the target source             a source code, no (N) otherwise. Even if an article states
         code.                                                      2 https://docs.google.com/spreadsheets/d/
     3) How is energy consumption measurement started             17D1ArPavFQaFPGnU-OI3r8x1QoboWOoEa-rHR-9U98I/edit?
         from a technical standpoint?                             usp=sharing
     a number of lines of code for a described framework or               are generated — power readings and application execu-
     gives code fragments or algorithms, it is still a no.                tion trace as in Couto et al. [11]. In this case system
  • Question 2: yes (Y) if a framework approach is described              clocks on smartphone and multimeter or controlling PC
     in great detail, and entire thought process from principles          should be synchronized before the start of experiments.
     to implementation can be tracked, partly (P) if only a               Power readings can then be aligned with execution trace
     general description of framework principles is present,              and provide an insight which method or piece of code
     no (N) otherwise.                                                    is responsible for high energy usage. As instrumentation
  • Question 3: yes (Y) if a methodology is described in great            introduces additional code to be executed, its energy
     detail and limits and bounds are stated, partly (P) if only a        overhead should also be estimated and subtracted from
     broad description of measurement methodology is given,               final readings.
     no (N) otherwise.                                                  • Internal meter: This approach utilizes internal power
  • Question 4: yes (Y) if an experiment and a base case                  meters installed on the smartphone by manufacturer and
     are described, partly (P) if some consideration is given             Android OS API to access them. While generally such
     to accuracy, but not a thorough one, no (N) otherwise.               a tool can get a good power consumption estimate for
  The scoring procedure was Y = 1, P = 0.5, N = 0. Article                a smartphone as a whole, under specific experimental
quality was obtained as a sum of individual question scores.              conditions it can be trimmed down to a level of a
Results are given in Table I.                                             single application in question. Because smartphone and
                                                                          power sensor are using the same system clock, there are
        III. DISCUSSION OF RESEARCH QUESTIONS                             no issues with clock synchronization. Power readings
                                                                          requests may be integrated into instrumentation code.
   This section contains our findings in relation to the specific
                                                                          Energy consumption is estimated in the same way as
research questions as stated above.
                                                                          before.
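
As an illustration of Equation (1) and of aligning power readings with an execution trace, the sketch below integrates sampled voltage and current over time and attributes energy to traced methods. It rests on several assumptions that are not taken from the reviewed studies: timestamps are in seconds and already synchronized between the meter and the device, voltage is in volts and current in amperes, each sampling interval is weighted by its overlap with a method's time window, and instrumentation overhead is not subtracted. All function names are hypothetical.

```python
def energy_between(times, voltages, currents, start, end):
    """Energy in joules consumed inside the window [start, end].

    Each sampling interval contributes U_i * (I_{i+1} + I_i) / 2 times the
    part of (t_{i+1} - t_i) that overlaps the window, following Equation (1).
    """
    if not (len(times) == len(voltages) == len(currents)):
        raise ValueError("times, voltages and currents must have equal length")
    energy = 0.0
    for i in range(len(times) - 1):
        overlap = min(end, times[i + 1]) - max(start, times[i])
        if overlap <= 0:
            continue  # this sampling interval lies outside the window
        mean_current = (currents[i + 1] + currents[i]) / 2
        energy += voltages[i] * mean_current * overlap
    return energy


def total_energy(times, voltages, currents):
    """Equation (1) applied to the whole series of readings."""
    if len(times) < 2:
        return 0.0
    return energy_between(times, voltages, currents, times[0], times[-1])


def energy_per_method(times, voltages, currents, trace):
    """Sum energy per method name for (name, start, end) trace entries."""
    result = {}
    for name, start, end in trace:
        spent = energy_between(times, voltages, currents, start, end)
        result[name] = result.get(name, 0.0) + spent
    return result
```

With readings at a fixed meter frequency, the difference between consecutive timestamps is simply the inverse of that frequency, so `times` can be generated instead of recorded.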
A. RQ1: What approaches exist to estimate code fragments,               Indirect measurement approach (or model-based approach)
methods or application energy consumption under Android              means that profiling software is aggregating some information
OS?                                                                  regarding code execution and relates it to energy consumption
                                                                     using some mathematical model. Frameworks utilizing this
   We can classify different approaches for software energy
                                                                     approach operate in two phases: model calibration and energy
consumption estimation into two bins depending on whether power
                                                                     estimation. At the first stage model coefficients are determined
measurement is direct or indirect.
                                                                     or tuned with preliminary experiments or reference data. It is
   Direct measurement approach means that some metering
                                                                     an important step not only for different smartphone models,
agent is directly measuring voltage, current or even power.
                                                                     but for different devices of the same model as well [21]. After
Subclasses of this approach are as follows:
                                                                     a model is tailored for a device, actual energy metering
   • External meter: External digital multimeter is con-             experiments may be commenced.
      nected to battery contacts of smart device. Sometimes             Subclasses are formed based on the type of aggregated
      to compensate voltage drop in Li-ion batteries during          information used in the model:
      discharge and therefore normalize power readings an
                                                                        • Working time model: For this group information regard-
      external power generator working as a constant voltage
                                                                          ing working time of various smartphone systems is aggre-
      source is connected to testing device instead of default
                                                                          gated. Various devices may consume energy differently
      battery. Framework launches an application in question or
                                                                          under different modes of operation. For example, Wi-Fi
      a unit test alongside power measurement. In the end total
                                                                          module consumes different amounts of energy when idle,
      energy consumption is estimated by linear interpolation
                                                                          when looking for a network and when transferring data
                                                                          over a network. Therefore energy consumption value is
                                                                          calculated as
                                                                          $$E = \sum_{i=1}^{N_{dev}} \sum_{j=1}^{NP_i} P_{ij} \times t_{ij} \qquad (2)$$
      as
      $$E = \sum_{i=1}^{N_{read}-1} U_i \times \frac{I_{i+1} + I_i}{2} \times (t_{i+1} - t_i) \qquad (1)$$
     where E is total energy, Nread is number of power
     readings, Ui is ith voltage read, Ii is ith current read,
     ti is time of ith read. Some multimeters write the time             where E is total energy, Ndev is a number of tracked
     of a read directly, while for other ti+1 − ti is an inverse         devices on a smartphone, N Pi is a number of different
     of a multimeter frequency. Step interpolation is also used          power characteristics of ith device, Pij is a jth power
     in some works.                                                      characteristic of ith device, tij is active time for ith device
     If a more detailed report is needed, for example at a               operating under Pij power characteristic.
     level of methods or code blocks, then the source code               Calibration phase consists of experiments for determining
     is instrumented with additional logging instructions for            power profiles for individual components using one of the
     method or code beginning and end, and two data traces               direct measurement approaches: both external and inter-
                                                                TABLE I
                                                            STUDIES QUALITY

                                                           Q1          Q2           Q3           Q4          Quality
                                Studies                 Repository   Approach   Methodology   Experiment      score
                            Zhang et al. [7]               Y            Y           Y             Y             4
                           Hindle et al. [21]              Y            Y           Y             P            3,5
                  Di Nucci, Palomba et al. [14], [15]       P           Y           Y             Y            3,5
                   Tuysuz, Uçan and Trestian [22]          P           Y           Y             Y            3,5
                         Wilke et al. [9], [10]            Y            Y            P            P             3
                            Hao et al. [23]                N            Y           Y             Y             3
                     Couto et al. [11], [12], [13]         Y            Y           Y             N             3
                              Bareth [24]                  N            Y           Y             Y             3
                  Kamiyama, Inamura and Ohta [25]          N            Y           Y             Y             3
                            Saksonov [26]                  N            Y           Y             Y             3
                     Sahar, Bangash and Beg [27]           Y            Y            P            P             3
                        Ahmad et al. [19], [20]            N            Y           Y             Y             3
                        Pandiyan and Wu [28]               N            Y           Y             Y             3
                           Huang et al. [29]               N            Y           Y             Y             3
                           Oliveira et al. [30]            N            Y           Y             P            2,5
                   Aggarwal et al. [16], Feghi [17]        N            Y            P            Y            2,5
                     Westfield and Gopalan [31]             P           Y           N             Y            2,5
                       Dolezal and Becvar [32]             N            Y           Y             P            2,5
                             Hu et al. [33]                N            Y            P            Y            2,5
                         Yoon, Kim et al. [6]              N            Y            P            Y            2,5
                      Larsson and Stigelid [34]            N            Y            P            Y            2,5
                  Fischer, Brisolara and Mattos [35]       N            Y            P            Y            2,5
                       Lee, Yoon and Cha [36]              N            Y            P            P             2
                   Mittal, Kansal and Chandra [5]          N            Y           N             Y             2
                         Chen and Zong [37]                N            Y            P            P             2
                           Hung et al. [38]                N            Y           N             Y             2
                           Carette et al. [39]             N            Y           Y             N             2
                  Kapetanakis and Panagiotakis [40]        N            Y            P            P             2
                      Dong, Lan and Zhong [41]             N            Y            P            P             2
                            Lee et al. [42]                N            P            P            Y             2
                      Chung, Lin and King [43]             N            Y            P            N            1,5
                            Shin et al. [44]               N            P           N             Y            1,5
                       Jung, Kim and Cha [45]              N            Y            P            N            1,5
                            Tsao et al. [46]               N            Y           N             P            1,5
                               Metri [47]                  N            Y            P            N            1,5
                             Gao et al.[48]                N            P            P            P            1,5
                          Banerjee et al. [49]             N            Y           N             P            1,5
                      Li and Gallagher [18], [8]           N            Y            P            N            1,5
                             Li et al. [50]                N            Y            P            N            1,5
                         Walcott-Justice [51]              N            Y           N             N             1
                      Kim, Kyong and Lim[52]               N            Y           N             N             1



  nal meters are conceptually suitable. Linear regression is used to extract power
  coefficients from experimental data. As an alternative, some frameworks utilize Android
  power profile data [53].
  Energy is estimated by measuring active device time during experimental code execution.
  Different components have different ways of measuring their active time, e.g. the CPU
  stores information about its time in different power states in the proc folder, while
  Wi-Fi generates system events when it transitions from one power state to another.
  Additional estimations may also be incorporated into such a model, for example,
  corrections for battery discharge rate [44].
• Instruction energy model: This group of frameworks considers the energy consumption of
  various code instruction types: conditional statements, loop controls, method calls,
  floating-point operations etc. Energy consumption is calculated as

      E = \sum_{i=1}^{N_{instr}} P_i \times n_i        (3)

  where E is total energy, N_{instr} is the number of different instruction types in a
  model, P_i is the power consumption of a single instruction of the i-th type, and n_i is
  the number of i-th type instructions in the code.
  Model calibration is done by measuring the power consumption of each instruction type in
  synthetic tests using the direct approach.
  Total energy is calculated from instruction statistics of a specific code execution
  trace. It should be noted that it is not required to launch test code under Android OS if
  no specific Android API is invoked. Instruction statistics may be aggregated in any
  suitable environment.
• Method/API call energy model: This approach is similar to the previous one, but instead
  of a power profile for a single instruction, the energy consumption of a system call, an
  API call or a framework method is calculated. Models under this approach are created
  under the assumption that most of the time and energy is spent outside of application
  code, and therefore a good estimate of application energy consumption can be obtained by
  analyzing its API usage.
   We assign each of the listed frameworks to its approach in Table II.

                               TABLE II
              APPROACHES FOR ENERGY CONSUMPTION ESTIMATION

                           Direct measurement
  External meter                  [43], [9], [21], [46], [39], [50], [52], [34], [41]
  Internal meter                  [9], [45], [51], [37], [40], [34], [35]
                          Indirect measurement
  Working time model              [44], [11], [14], [5], [47], [36], [32], [48], [25], [24],
                                  [49], [38], [26], [6], [7], [27], [52], [22], [29], [42]
  Instruction energy model        [23], [18], [19]
  Method/API call energy model    [30], [16], [31], [33]

B. RQ2: Are there open source code repositories or other programming artifacts for
corresponding frameworks?
   This research question intentionally overlaps with Question 1 in the Quality Assessment
section. Energy consumption measurement frameworks are practical and generic tools by
definition, therefore it is reasonable to expect their source code to be available for
reuse. As an alternative, a metering application may also be sufficient.
   Among the 41 included frameworks the results are as follows:
   • 5 frameworks have their source code available in a repository: JouleUnit [9], [10],
     GreenMiner [21], GreenDroid [11], [12], [13], EnSights [27], PowerTutor [7].
   • 3 frameworks are openly presented as a ready application: PowerTutor [7], PETrA [14],
     [15], and the "PowerProfiler & Energy-aware Network Selection" application in Google
     Play by Uçan, Tuysuz and Trestian [22]. A web-site for another framework, Orka [31],
     is available in the Imperial College London intranet.
   • 3 studies include code samples or statistics in the text: Sema [35], Huang et al.
     [29], Kapetanakis and Panagiotakis [40].
   Note that PowerTutor [7] is marked as having both source code and an application
available. The studies not mentioned here do not refer to source code.
   It should be noted separately that none of the frameworks with available code is kept
up to date. PowerTutor and JouleUnit were abandoned over 5 years ago. At the time of
writing (January 2020) the most recent commits are found in GreenMiner (May 2018),
EnSights (August 2018) and GreenDroid (January 2019), so we may call them semi-abandoned.
In our experience a lack of recent commits usually indicates that a research project was
either finished or abandoned for some reason. Additionally, none of the projects can be
built out of the box, and build instructions are not provided in detail. As stated before,
PowerTutor was used in a number of other studies when it was supported, but it is not
surprising that a significant number of the more recent testbeds we have seen in studies
during the study selection phase are based on the industrial grade Monsoon power monitor
[54]³.

   ³ Monsoon power monitor is a high frequency multimeter to be connected between the smart
   device battery and electronics. It is capable of uploading measurement results to a PC.
   While it was designed with smart device energy measurements in mind, it is just that: a
   high-quality multimeter, which can be used in a variety of scenarios not limited to
   Android power profiling. Therefore we do not consider Monsoon to be a metering framework
   by itself, but it can lie in the foundation of one.

   We conclude that the studied frameworks rarely provide their code in open source
repositories. Those that do are not easy to launch. What is worse, not all of the
frameworks are available to practitioners even in the form of proprietary software.

C. RQ3: What devices are used in measurement experiments?
   The number of devices running Android OS is hard to count reliably: they range from the
very first HTC Dream model line to modern foldable smartphones as well as tablets. Android
OS also continues to be developed, with a major release by Google every year. For example,
the Android Q version released in 2019 supports foldable smartphones and 5G devices [55].
Additionally, the Android API is not static, although generally an application is forward
compatible with a newer Android OS version [56]. Regardless, an application that behaves
well on a specific version of the Android platform might glitch on another version due to
an API change. With such a variety of devices and versions it is important to understand
what range of devices and Android OS versions is covered by existing energy metering
frameworks.
   We obtained the following distribution of target devices from the listed frameworks:
   • A special testbed is used in ∼17% of studies. In this context a testbed is a special
     platform running Android OS which is not a smartphone or a tablet, although it might
     be functionally similar. For example, the Odroid-A platform has a set of functions
     similar to Samsung Galaxy S2 [44].
   • An emulator is used in ∼12% of studies. An emulator allows running applications
     without the need to procure a real device. However, an emulator is not a complete
     substitute for a smartphone or a tablet; in particular, application performance might
     be worse in emulated environments, thus negatively affecting measurement accuracy.
     Another limiting factor is the number of devices available for emulation, for example,
     only one model of the Nexus 7 tablet is emulated in Westfield and Gopalan [31]. Hence
     the range of available emulated devices is significantly limited compared to
     commercially available smartphone and tablet ranges, albeit this approach is
     significantly cheaper. WattsOn framework [5] is also included into this
     category as its authors claim it to be portable to Android emulators.
   • Real devices such as smartphones and tablets are used in the other studies.
     Smartphones are much more prevalent than tablets: the latter are experimented upon in
     a single study [51]. The number of devices used in experiments varies from 1 to 3.
   To evaluate a potential range of devices for a particular framework one can try
different methods. Firstly, this range may be estimated using the target Android OS
version used in the experiments described in a selected study. Then the article is
considered applicable for devices with the specified version and (with caution) higher due
to forward compatibility [56]. Such an assessment, however, is rather imprecise as there
is no guarantee that experiments were conducted on the minimal supported version.
Secondly, an estimate of supported Android versions can be based on the measurement tools
used in experiments if they are specified in a study. In general this assessment is more
precise than the first. Thirdly, it is worth taking into account technical restrictions
imposed by the framework itself if they are mentioned.

   With the listed frameworks we obtained the following results:
   • In the articles where it was possible to draw conclusions on the Android version,
     experiments are conducted on Android OS versions 2 to 5, so approximately 90% of all
     devices are supported [57].
   • Some restrictions are stated explicitly, for example, the device power profile must be
     uploaded to the framework server prior to experiments [19]. However, in some studies
     the restrictions are implied, for example, the Android Power Profiler tool used for
     energy metering is available only for Android 5.0+ devices [30], while Trepn Profiler
     is available for a limited range of devices for Android 4.0+ [51], [47], [33].
   • At least 12 frameworks require root access for the device, mainly to use additional
     API. It might be unacceptable for some users as rooting a device voids the
     manufacturer's warranty. Additionally, there is a risk of disrupting proper OS
     functioning under root privileges compared to a non-root user.
   • In 7 of those 12 frameworks it is not enough to just have root access, as one must
     also integrate a kernel module into the OS [45], [46], [36], [6], [42] or otherwise
     modify an existing Android framework [48], [39]. Such preparatory actions impose a
     considerably higher entry barrier than simply obtaining root access rights for the
     device.
   In the end the range of supported devices is wide despite some of the frameworks
targeting testbeds or emulators. While individual restrictions make some frameworks harder
to set up than others, we conclude that development tools versioning and framework
accessibility in general (see RQ2) are more limiting for their usage than device range.

D. RQ4: How many frameworks specify or suggest experimental methodology?
   It is not enough to write code, launch a framework to estimate its energy consumption
and get results. Experimental methodology is as important as the framework itself. By
methodology we mean a set of rules, procedures and techniques aimed to decrease, eliminate
or otherwise take control of external and internal influences on the experimental outcome
which are not independent variables of the experiment. In a way, methodology defines the
context of framework usage as its authors see it, and a non-stringent approach to
experimental setup and outcome interpretation leads to inaccurate or outright wrong
results. Ill-thought procedures or rules, or a lack thereof, in a study are an alarming
signal.
   In this regard studies in our list generally take experimental methodology into
consideration, with only 6 articles not mentioning it. We extracted this information from
the others and aggregated it into the following groups:
   • Testing environment setup.
   • Repeated launch of tests, benchmarks etc.
   • Overhead estimation.
   Table III contains the study distribution into each group.
   A smartphone or a tablet is a complex device with numerous settings. Using it as a
testing device requires paying attention to its configuration. To a lesser extent this is
also true for an emulator. The relevant setup measures include the following:
    1) Reducing activity of background processes (both system and non-system) [23], [51],
       [39], [27], [40], [28].
    2) Increasing scheduling priority of the subject application [23].
    3) Decreasing screen brightness or switching it off completely [21], [25], [39], [27],
       [28], [22], [29].
    4) Switching off all unnecessary modules (Wi-Fi, 3G, GPS, accelerometer, proximity
       etc.) [39], [26], [20], [40], [28].
    5) Charging the battery to the same level [24], [39], [22].
    6) Shutting the device down to let the battery cool [39].
    7) Warmup execution [30].
    8) Waiting before beginning a new test in order to let background processes calm down
       [32], [27].
    9) Application reinstalling [21].
   10) Application cache erasing [14], [27].
   11) Rebooting the device [26].
   12) Factory data reset [24], [26].
   If left unchecked these factors introduce additional energy drain and thus skew
measurement results and decrease repeatability, therefore a proper setup is required.
About 40% of the listed frameworks take it into consideration one way or another.
   A framework itself might introduce additional energy consumption. For example, code
instrumentation is required to obtain an execution trace, but as it is implemented in the
form of additional code, it inevitably introduces energy overhead. Aggregating system data
while executing a test suite is another reason behind elevated energy consumption during
measurement. To alleviate this problem one should estimate this overhead and subtract it
from measurement results. In particular, the overhead of code instrumentation is estimated
by running all the tracking code from a selected trace without the original test code and
measuring its energy footprint. Around 50% of all studies address this issue, with some
studies only stating measures to overcome it [37], [35], while others actually estimate
its effect [25], [6].
   A notable number of studies, about 37%, repeat their measurements multiple times to get
an average (mean) value of consumed energy. In this way authors reduce the impact of
random metering equipment jitter and OS scheduling on measurement results [9], [30], [17],
[29].
   Chung, Lin and King [43] and Jung, Kim and Cha [45] should be mentioned separately.
Those studies specify the amount of time the experiment should last to produce adequate
results. However, they do not specify the reasoning behind selecting particular values, so
we do not make a separate group for them.
   Overall, we conclude that collectively the studies in our list contain enough
guidelines for a practitioner to properly conduct the energy consumption estimation
process.

                          TABLE III
              EXPERIMENTAL METHODOLOGY GROUPS

              Group                   Number of papers
              Overhead estimation           20
              Environment setup             16
              Relaunching                   15
              None                           6

E. RQ5: What is the measurement precision and what is the base scenario to compare to?
   Measurement accuracy is one of the key characteristics of an energy profiling
framework. Thus a question of framework accuracy might be expected to be answered in a
corresponding study. Our review shows that accuracy estimation of some sort was carried
out in approximately 61% of works, which can be divided by the type of estimation into the
following groups:
   • Comparison with direct measurement.
   • Comparison with another tool (frameworks, power models).
   Study distribution into each group is shown in Table IV. Some studies fell into several
groups at the same time.
   If a study compares framework accuracy with a baseline in the form of external power or
current metering device data, we assign it to the comparison with direct measurement
group. Such a device can be an ordinary multimeter [20] or special equipment like, for
example, Monsoon [14], [5], [6]. Accuracy is estimated as a relative difference between
framework and equipment data. Note that baseline scenario organization for this group is
similar to the direct measurement approach in framework building (see RQ1). A significant
limitation of this technique is also similar: using an external meter comes at the price
of estimating only the total power consumption of a test device without breaking it down
by hardware components, and therefore intercomponent comparative analysis may be limited,
require a specific experimental setup, or even be impossible. This accuracy estimation
technique is used for both direct measurement and model-based frameworks.
   Another way to assess the accuracy of a newer framework is to compare it with an
already existing tool which is considered to be a baseline. Once again accuracy is a
relative difference between frameworks. Several studies match themselves against the
PowerTutor framework [31], [22], while one of the studies compares the proposed framework
with Appscope [42]. While such comparisons allow researchers to evaluate measurement
accuracy with respect to the few existing solutions, the lack of homogeneity in baseline
framework selection indicates that there is no universally accepted accuracy standard.
   As stated above, several studies fell into several groups at the same time. For
example, in some studies a comparison is made both with real measurements and with other
frameworks [19], [35], [42]. One interesting take on this approach is described in
Saksonov [26], where a derived energy profile is compared to the default Android Power
Profile⁴ and direct measurements.

   ⁴ An XML file detailing power characteristics of smartphone components provided by its
   manufacturer.

   Based on this we conclude that such a two-factor estimation process provides better
context of actual framework accuracy. A complete lack of accuracy assessment strongly
indicates a study of subpar quality.

                              TABLE IV
              MEASUREMENT ACCURACY EVALUATION GROUPS

              Comparison group                           Number of papers
              Real measurements                                18
              Another tool (framework, power model)            13
              None                                             17

F. RQ6: What units of measurements are used in experiments with a particular framework and
what is measured?
   The energy consumption of code measured using a framework can be expressed in different
terms. For example, recall equation 2. With the direct approach total energy is the most
obvious way to estimate consumed energy. However, when voltage is constant the amount of
consumed Ampere-hours is as informative as total energy. Likewise, if power readings are
regular, their number and distribution contain all necessary information. A similar idea
is also frequently used in the model-based approach for estimating CPU power usage [53].
Moreover, the same value can be presented with a varying degree of accuracy. For example,
under some experimental setups energy may be better measured in µJ than in mJ or J for
more conclusive results. Therefore the lack of homogeneity in result presentation among
the listed frameworks can be explained.
   Table V aggregates the characteristics measured in the listed studies and distributions
of the corresponding measurement units. Among absolute characteristics power and energy
were most often measured, with their values presented in mW and J respectively; current,
voltage and electric charge are estimated less often. In specific cases higher accuracy
measurement was
used with µW for power [47], and nJ for energy [28]. However,                     10 kHz. As ammeters average variable current between
some studies are not precise in terminology, in particular                        synchronization impulses, power spikes of higher frequencies
difference in battery charge is also labeled as consumed energy                   may be smoothed out to levels of measurement errors and pass
[44].                                                                             undetected. Therefore it is important to understand if studies
   It is not uncommon to estimate specific energy calculated                      dealing with direct measurement approach frameworks or di-
per unit of device operation. Examples include instructions                       rect measurement baseline comparison alter their methodology
[43], MHz of processor, screen inch or dB of speakers [32],                       to address this issue.
bits of transmitted traffic [29].                                                    We’ve identified 10 frameworks that in one way or another
   Relative quantities are also encountered in the listed studies,                acknowledge it:
although rarely. Banerjee et al. [49] introduce a measure                            • Mittal et al. [5], Dolezal and Becvar [32], Saksonov [26],
reflecting the energy efficiency of the application for a certain                      Pandiyan and Wu [28] turn off DVFS completely or set it
time period. Dong, Lan and Zhong [41] measure energy                                   to a more controlled mode of operation, so CPU current
consumption in percentages relative to the total energy con-                           draw is at least more predictable and consistent.
sumption of the system.                                                              • Hung et al. [38], Yoon et al. [6], Zhang et al. [7] include
   Jung, Kim and Cha [45] organize their measuring process                             CPU frequency data as an additional input for total energy
using both absolute and relative quantities. The percentage                            consumption estimation.
of screen brightness, the percentage of processor utilization                        • Couto et al. [11], Li and Gallagher [18] adjust test running
and its frequency, the strength of Wi-Fi and 3G signals in                             time to be comparable to metering hardware frequency.
dBm were measured along with many other characteristics of                           • Larsson and Stigelid [34] adjust metering frequency to
a device energy behavior.                                                              test running time. We include this study with caution
   Overall, about 65% of the listed studies report a single                            though, as authors access Android API and not an internal
quantity, in other studies two or more are reported.                                   meter itself, therefore frequency of API calls might be
                                                                                       unrelated to frequency of internal meter readings.
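
The concern behind RQ7 can be illustrated with a minimal sketch, assuming an idealized meter that reports the mean current over each sampling window. The current trace, the 1 MHz simulation resolution and the 10 kHz meter rate below are illustrative assumptions, not data from any of the reviewed studies.

```python
def average_per_window(samples, window):
    """Mimic an idealized averaging meter: one reading per `window` raw samples."""
    return [
        sum(samples[start:start + window]) / window
        for start in range(0, len(samples) - window + 1, window)
    ]


# Hypothetical current trace sampled at 1 MHz: 1 ms at a 100 mA baseline
# with a single 10-microsecond spike to 1000 mA (illustrative values only).
RAW_RATE_HZ = 1_000_000
METER_RATE_HZ = 10_000

trace_ma = [100.0] * 1000
for i in range(500, 510):
    trace_ma[i] = 1000.0

window = RAW_RATE_HZ // METER_RATE_HZ  # raw samples per meter reading
readings_ma = average_per_window(trace_ma, window)

true_peak_ma = max(trace_ma)
reported_peak_ma = max(readings_ma)
print(f"true peak: {true_peak_ma} mA, reported peak: {reported_peak_ma} mA")
```

Under these assumptions the averaged readings keep the mean current of the trace, but the reported peak is 190 mA instead of 1000 mA. This is the kind of smoothing that the measures listed above try to work around.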
                              TABLE V                                                We conclude that the issue is not widely acknowledged, but
                      MEASURED CHARACTERISTICS
                                                                                  it can be alleviated at least to an extent both by configuring
      Quantity        Number of papers         Accuracy distribution              the test device and by properly setting up experiment. We
       Power                21              mW — 14, W — 6, µW — 1                suggest that testing those measures for adequacy in a real-
      Energy                22                  J — 12, mJ — 7,                   life multicore scenario with threads created and destroyed
                                             kJ — 1, µJ — 1, nJ — 1
     Current                  3                 A — 2, mA — 1
                                                                                  regularly and at high frequency is an open challenge.
     Voltage                  2                 V — 1, mV — 1
                                                                                                    IV. STUDY LIMITATIONS
  Electric charge             3               mAh — 2, mAms — 1
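
The quantities in Table V can be related to each other under simple assumptions. The sketch below converts an electric charge in mAh to energy at a constant voltage, and sums regularly spaced power readings into energy using a rectangle rule. The function names, the 3.7 V voltage and the sample readings are hypothetical and serve only as an illustration.

```python
def charge_to_energy_j(charge_mah, voltage_v):
    """Energy in joules for a charge in mAh drawn at a constant voltage."""
    coulombs = charge_mah / 1000.0 * 3600.0  # mAh -> Ah -> ampere-seconds
    return coulombs * voltage_v              # E = Q * U


def power_samples_to_energy_j(power_mw, period_s):
    """Energy in joules from power readings in mW taken every `period_s` seconds."""
    return sum(power_mw) / 1000.0 * period_s


# Illustrative values only: 10 mAh at an assumed constant 3.7 V,
# and three power readings taken 0.5 s apart.
energy_from_charge_j = charge_to_energy_j(10.0, 3.7)
energy_from_power_j = power_samples_to_energy_j([500.0, 700.0, 600.0], 0.5)
print(energy_from_charge_j, energy_from_power_j)
```

The first conversion holds only while the voltage can be treated as constant, which is the condition under which a charge reading is as informative as total energy.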
                                                                                     One of the main issues when conducting SLR is to find all
                                                                                  relevant studies to include. We used automatic search system
G. RQ7: How do frameworks deal with metering hardware                             (Google Scholar) and therefore our search is as efficient and
frequency being considerably lower than CPU frequency?                            thorough as this service is. While the query term ”energy
   This RQ was included at later stages of data extraction                        efficiency” allowed us to find a good amount of relevant
process as a reaction to the following phenomena.                                 studies, we admit that in some of the additionally included
   Modern CPUs including systems on a chip like smartphones                       articles it was absent. There is a probability that we missed
or tablets operate at frequencies of several GHz. While the                       some other relevant studies not using this term in the text.
operating voltage for a CPU is more or less constant5 , the                          Additionally several months passed since our original search
current it draws can vary tremendously. When multiple cores                       and article publication, so there is also a possibility of some
are working, it is higher than in the case of a single core                       newer articles published after summer 2019 being missed in
working. Entire CPU or even each of its cores may potentially                     our study.
work at different frequencies with different corresponding cur-                      It is also possible that we missed to group some articles,
rent, changing dynamically — this process is called Dynamic                       although we don’t think there’s a high probability to it. We
Voltage Frequency Scaling (DVFS). Such changes in drawn                           assume our merge process was reliable enough as an idea to
current may appear at frequencies of at least MHz range.                          merge [16] and [17] into a single group came initially from
   On the contrary metering hardware in the listed works                          their common institution of origin despite completely different
operates at considerably lower frequencies. Maximum fre-                          sets of authors.
quency of 100 KHz is found in Wilke et al. [9], [10], but                            Another limitation was a selective quality control of a
the bulk of studies uses multimeters operating way below                          data extraction process. In some cases particular studies were
                                                                                  handled by a single researcher with extracted data being well-
   5 For Li-ion and Li-pol batteries commonly used in modern smartphones          written and not raising questions, so other researchers relied
voltage drops slightly between 90% and 20% of charge, but it is largely           on this data instead of original study. Such process could
insignificant drop. Electronic devices can in fact safely operate in a range of
voltages, so only a significant drop in voltage after 20% of charge results in    introduce errors in extracted data due to misunderstanding of
functional degrade.                                                               a study.
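   As a rough illustration of the frequency gap discussed under RQ7, the sketch
below counts how many CPU clock cycles pass between two consecutive meter readings
and how many readings are certain to fall inside a short burst of activity. The
2 GHz clock, the 10 Hz multimeter, the 1 ms burst and both function names are
assumptions chosen for illustration; only the 100 kHz figure comes from the
reviewed studies [9], [10].

```python
import math


def cycles_per_sample(cpu_hz: float, meter_hz: float) -> float:
    """CPU clock cycles that elapse between two consecutive meter readings."""
    if cpu_hz <= 0 or meter_hz <= 0:
        raise ValueError("frequencies must be positive")
    return cpu_hz / meter_hz


def min_readings_in_burst(burst_s: float, meter_hz: float) -> int:
    """Lower bound on the number of meter readings taken during a burst.

    A periodic meter places at least floor(duration * rate) readings inside
    any interval of the given duration, regardless of phase.
    """
    if burst_s < 0 or meter_hz <= 0:
        raise ValueError("duration must be non-negative and rate positive")
    return math.floor(burst_s * meter_hz)


if __name__ == "__main__":
    cpu_hz = 2e9          # assumed 2 GHz core ("several GHz" in the text)
    fast_meter_hz = 1e5   # 100 kHz, the maximum reported in [9], [10]
    slow_meter_hz = 10    # hypothetical low-rate multimeter
    burst_s = 0.001       # hypothetical 1 ms burst of CPU activity

    print(cycles_per_sample(cpu_hz, fast_meter_hz))
    print(min_readings_in_burst(burst_s, slow_meter_hz))
```

   Under these assumed numbers even the fastest reported meter takes one reading
per 20,000 clock cycles, and a low-rate multimeter is not guaranteed to take any
reading during a millisecond-long burst, which is why experiment design matters.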
        V. CLASSIFICATION OF SOME COMMERCIAL POWER PROFILERS
   As we prepared this article, a frequently asked question was how to relate the
results obtained from the SLR to a number of commercially available power
profilers used in practical Android development. While a comparative accuracy
overview is out of the scope of this article, we think it is useful to apply the
framework classification obtained in RQ1 to those profilers. Our goal here is to
help practitioners better understand their intended use cases. The list is
definitely not exhaustive, but these were the most frequently mentioned profilers.
   Trepn Power Profiler [58] is a tool developed by Qualcomm which is no longer
supported. It is a direct measurement framework using an internal meter, and
therefore its energy consumption data is only as accurate as the meter itself.
This issue was acknowledged by the developers, and a list of accurate devices was
compiled [59].
   BatteryHistorian [60] is a tool for aggregating background system information
in Android OS. It provides not only power drain information but also statistics
on runtime system events such as Wi-Fi scans or wakelocks. Classified as a direct
measurement framework with an internal meter, it displays energy consumption as a
percentage of battery charge, a value that is affected by battery degradation. A
noticeable drop in battery charge usually requires a significant amount of
computation and/or peripheral activity, so this value is not suitable for
short experiments, for example those that run only for several seconds.
   Android Studio Energy Profiler [61] shows energy consumption as a real-time
graph in relative units. Regretfully, there is virtually no documentation on the
model itself, but after analyzing information obtained from various sources (such
as StackOverflow [62]) and experimenting with the tool itself, we came to the
conclusion that it is a working time model-based solution. It aggregates various
system events, assigns them weights and gives a resulting relative power drain
metric. While it is granular enough to be used for estimating the efficiency of
energy refactorings, we have concerns about whether its “one-size-fits-all” model
is representative of the hardware energy consumption of real smart devices.

                 VI. CONCLUSIONS AND FUTURE WORK
   The results of our study indicate that there are various approaches to
measuring application energy consumption in Android OS, with some frameworks
utilizing external equipment to gather the necessary data and others using
previously calibrated models. While there are merits to all of those approaches,
it is difficult to compare the accuracy of frameworks with one another based on
the studies themselves, as there is no uniform accuracy estimation methodology
presented in the listed publications.
   Even if one is going to compare frameworks experimentally, only a handful of
studies have their source code or sample application openly available, which
considerably limits framework representation for such an analysis. What is worse,
code repositories look mostly abandoned, and framework start-up documentation is
scarce. We conclude that under these circumstances an accuracy comparison between
approaches is currently extremely hard to undertake.
   On the brighter side, these frameworks are reported to be used. To our
knowledge only the PowerTutor framework [7] was able to evolve further and become
a broadly used application, while some of the frameworks were documented to be
used internally as tools for higher-level research projects [21], [11].
   The number of devices supported by existing frameworks is large, so even if
practitioners write a framework themselves, the test device is not a limiting
factor. A set of methodological practices and techniques was established to
improve measurement accuracy, and even the relatively low frequency of metering
hardware compared to the measured device can be offset by smart experimental
design, although we think that there are still open questions in this area.
   We have also shown that our proposed classification is useful for analyzing
possible use cases for commercially available power profilers targeting Android
OS.
   The results of our study affected design decisions and helped to continue the
development of our own tool for energy consumption metering for Android OS, the
working time model-based Navitas framework (see footnote 6). From the beginning
it was conceived as an open-source framework which could be used in different
scenarios, and one of the possible applications we are developing is a plugin for
the Android Studio IDE. Its development, evaluation and application to energy
refactorings are the focus of future work.

   6 https://github.com/Stanislav-Sartasov/Navitas-Framework/

                          ACKNOWLEDGMENT
   The authors would like to thank the Lanit-Tercom company for hosting our
research project as part of its Summer School'19.

                            REFERENCES
 [1] M. Milijic, “29+ Smartphone Usage Statistics: Around the World in 2020,”
     https://leftronic.com/smartphone-usage-statistics/, 2019, [Online; accessed
     14-January-2020].
 [2] C. Sahin, F. Cayci, I. Manotas, J. Clause, F. Kiamilev, L. Pollock, and
     K. Winbladh, “Initial explorations on design pattern energy usage,” 2012 1st
     International Workshop on Green and Sustainable Software, GREENS 2012 -
     Proceedings, 06 2012.
 [3] S. O’Dea, “Share of global smartphone shipments by operating system from
     2014 to 2023,”
     https://www.statista.com/statistics/272307/market-share-forecast-for-smartphone-operating-systems/,
     2019, [Online; accessed 14-January-2020].
 [4] B. Kitchenham and S. Charters, “Guidelines for performing systematic
     literature reviews in software engineering,” vol. 2, 01 2007.
 [5] R. Mittal, A. Kansal, and R. Chandra, “Empowering developers to estimate app
     energy consumption,” in Proceedings of the 18th Annual International
     Conference on Mobile Computing and Networking, ser. Mobicom ’12. New York,
     NY, USA: Association for Computing Machinery, 2012, pp. 317–328. [Online].
     Available: https://doi.org/10.1145/2348543.2348583
 [6] C. Yoon, D. Kim, W. Jung, C. Kang, and H. Cha, “Appscope: Application energy
     metering framework for android smartphones using kernel activity
     monitoring,” in Proceedings of the 2012 USENIX Conference on Annual
     Technical Conference, ser. USENIX ATC’12. USA: USENIX Association, 2012,
     p. 36.
 [7] L. Zhang, B. Tiwana, R. P. Dick, Z. Qian, Z. M. Mao, Z. Wang, and L. Yang,
     “Accurate online power estimation and automatic battery behavior based power
     model generation for smartphones,” in 2010 IEEE/ACM/IFIP International
     Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS),
     Oct 2010, pp. 105–114.
 [8] X. Li and J. P. Gallagher, “Fine-grained energy modeling for the source code
     of a mobile application,” in Proceedings of the 13th International
     Conference on Mobile and Ubiquitous Systems: Computing, Networking and
     Services, ser. MOBIQUITOUS 2016. New York, NY, USA: Association for
     Computing Machinery, 2016, pp. 180–189. [Online]. Available:
     https://doi.org/10.1145/2994374.2994394
 [9] C. Wilke, S. Götz, and S. Richly, “Jouleunit: A generic framework for
     software energy profiling and testing,” in Proceedings of the 2013 Workshop
     on Green in/by Software Engineering, ser. GIBSE ’13. New York, NY, USA:
     Association for Computing Machinery, 2013, pp. 9–14. [Online]. Available:
     https://doi.org/10.1145/2451605.2451610
[10] C. Wilke, “Energy-aware development and labeling for mobile applications,”
     Ph.D. dissertation, 03 2014.
[11] M. Couto, J. Cunha, J. P. Fernandes, R. Pereira, and J. Saraiva,
     “Greendroid: A tool for analysing power consumption in the android
     ecosystem,” in 2015 IEEE 13th International Scientific Conference on
     Informatics, Nov 2015, pp. 73–78.
[12] M. Couto, “Monitoring energy consumption in android applications,” 2014.
[13] M. Couto, T. Carção, J. Cunha, J. Fernandes, and J. Saraiva, “Detecting
     anomalous energy consumption in android applications,” 10 2014, pp. 77–91.
[14] D. Di Nucci, F. Palomba, A. Prota, A. Panichella, A. Zaidman, and
     A. De Lucia, “Petra: A software-based tool for estimating the energy profile
     of android applications,” in 2017 IEEE/ACM 39th International Conference on
     Software Engineering Companion (ICSE-C), May 2017, pp. 3–6.
[15] D. Di Nucci, F. Palomba, A. Prota, A. Panichella, A. Zaidman, and
     A. De Lucia, “Software-based energy profiling of android apps: Simple,
     efficient and reliable?” in 2017 IEEE 24th International Conference on
     Software Analysis, Evolution and Reengineering (SANER), Feb 2017,
     pp. 103–114.
[16] M. Feghhi, “Multi-layer tracing of android applications for
     energy-consumption analysis,” 2017.
[17] K. Aggarwal, C. Zhang, J. C. Campbell, A. Hindle, and E. Stroulia, “The
     power of system call traces: Predicting the software energy consumption
     impact of changes,” in Proceedings of 24th Annual International Conference
     on Computer Science and Software Engineering, ser. CASCON ’14. USA: IBM
     Corp., 2014, pp. 219–233.
[18] X. Li and J. Gallagher, “An energy-aware programming approach for mobile
     application development guided by a fine-grained energy model,” 05 2016.
[19] R. W. Ahmad, A. Naveed, J. J. P. C. Rodrigues, A. Gani, S. A. Madani,
     J. Shuja, T. Maqsood, and S. Saeed, “Enhancement and assessment of a
     code-analysis-based energy estimation framework,” IEEE Systems Journal,
     vol. 13, no. 1, pp. 1052–1059, March 2019.
[20] R. Ahmad, A. Gani, S. h. Ab hamid, A. Naveed, K. Ko, and J. Rodrigues, “A
     case and framework for code analysis-based smartphone application energy
     estimation,” International Journal of Communication Systems, vol. 30,
     11 2016.
[21] A. Hindle, A. Wilson, K. Rasmussen, E. J. Barlow, J. C. Campbell, and
     S. Romansky, “Greenminer: A hardware based mining software repositories
     software energy consumption framework,” in Proceedings of the 11th Working
     Conference on Mining Software Repositories, ser. MSR 2014. New York, NY,
     USA: Association for Computing Machinery, 2014, pp. 12–21. [Online].
     Available: https://doi.org/10.1145/2597073.2597097
[22] M. Tuysuz, M. Uçan, and R. Trestian, “A real-time power monitoring and
     energy-efficient network/interface selection tool for android smartphones,”
     Journal of Network and Computer Applications, vol. 127, 11 2018.
[23] S. Hao, D. Li, W. G. J. Halfond, and R. Govindan, “Estimating android
     applications’ cpu energy usage via bytecode profiling,” in 2012 First
     International Workshop on Green and Sustainable Software (GREENS), June
     2012, pp. 1–7.
[24] U. Bareth, “Simulating power consumption of location tracking algorithms to
     improve energy-efficiency of smartphones,” in 2012 IEEE 36th Annual Computer
     Software and Applications Conference, July 2012, pp. 613–622.
[25] T. Kamiyama, H. Inamura, and K. Ohta, “A model-based energy profiler using
     online logging for android applications,” in 2014 Seventh International
     Conference on Mobile Computing and Ubiquitous Networking (ICMU), Jan 2014,
     pp. 7–13.
[26] A. Saksonov, “Method to derive energy profiles for android platform,” 2014.
[27] H. Sahar, A. Bangash, and M. Beg, “Towards energy aware object-oriented
     development of android applications,” Sustainable Computing: Informatics and
     Systems, vol. 21, 11 2018.
[28] D. Pandiyan and C. Wu, “Quantifying the energy cost of data movement for
     emerging smart phone workloads on mobile platforms,” in 2014 IEEE
     International Symposium on Workload Characterization (IISWC), Oct 2014,
     pp. 171–180.
[29] J. Huang, F. Qian, A. Gerber, Z. M. Mao, S. Sen, and O. Spatscheck, “A close
     examination of performance and power characteristics of 4g lte networks,” in
     Proceedings of the 10th International Conference on Mobile Systems,
     Applications, and Services, ser. MobiSys ’12. New York, NY, USA:
     Association for Computing Machinery, 2012, pp. 225–238. [Online]. Available:
     https://doi.org/10.1145/2307636.2307658
[30] W. Oliveira, R. Oliveira, F. Castor, B. Fernandes, and G. Pinto,
     “Recommending energy-efficient java collections,” in 2019 IEEE/ACM 16th
     International Conference on Mining Software Repositories (MSR), May 2019,
     pp. 160–170.
[31] B. Westfield and A. Gopalan, “Orka: A new technique to profile the energy
     usage of android applications,” in 2016 5th International Conference on
     Smart Cities and Green ICT Systems (SMARTGREENS), April 2016, pp. 1–12.
[32] J. Dolezal and Z. Becvar, “Methodology and tool for energy consumption
     modeling of mobile devices,” in 2014 IEEE Wireless Communications and
     Networking Conference Workshops (WCNCW), April 2014, pp. 34–39.
[33] Y. Hu, J. Yan, D. Yan, Q. Lu, and J. Yan, “Lightweight energy consumption
     analysis and prediction for android applications,” Science of Computer
     Programming, vol. 162, 05 2017.
[34] M. Larsson and M. Stigelid, “Energy efficient data synchronization in mobile
     applications: A comparison between different data synchronization
     techniques,” Ph.D. dissertation, 08 2015.
[35] L. M. Fischer, L. B. d. Brisolara, and J. C. B. d. Mattos, “Sema: An
     approach based on internal measurement to evaluate energy efficiency of
     android applications,” in 2015 Brazilian Symposium on Computing Systems
     Engineering (SBESC), Nov 2015, pp. 48–53.
[36] S. Lee, C. Yoon, and H. Cha, “User interaction-based profiling system for
     android application tuning,” in Proceedings of the 2014 ACM International
     Joint Conference on Pervasive and Ubiquitous Computing, ser. UbiComp ’14.
     New York, NY, USA: Association for Computing Machinery, 2014, pp. 289–299.
     [Online]. Available: https://doi.org/10.1145/2632048.2636091
[37] X. Chen and Z. Zong, “Android app energy efficiency: The impact of language,
     runtime, compiler, and implementation,” in 2016 IEEE International
     Conferences on Big Data and Cloud Computing (BDCloud), Social Computing and
     Networking (SocialCom), Sustainable Computing and Communications
     (SustainCom) (BDCloud-SocialCom-SustainCom), Oct 2016, pp. 485–492.
[38] S. Hung, F. Liang, C. Tu, and N. Chang, “Performance and power estimation
     for mobile-cloud applications on virtualized platforms,” in 2013 Seventh
     International Conference on Innovative Mobile and Internet Services in
     Ubiquitous Computing, July 2013, pp. 260–267.
[39] A. Carette, M. A. A. Younes, G. Hecht, N. Moha, and R. Rouvoy,
     “Investigating the energy impact of android smells,” in 2017 IEEE 24th
     International Conference on Software Analysis, Evolution and Reengineering
     (SANER), Feb 2017, pp. 115–126.
[40] K. Kapetanakis and S. Panagiotakis, “Efficient energy consumption’s
     measurement on android devices,” in 2012 16th Panhellenic Conference on
     Informatics, Oct 2012, pp. 351–356.
[41] M. Dong, T. Lan, and L. Zhong, “Rethink energy accounting with cooperative
     game theory,” in Proceedings of the 20th Annual International Conference on
     Mobile Computing and Networking, ser. MobiCom ’14. New York, NY, USA:
     Association for Computing Machinery, 2014, pp. 531–542. [Online]. Available:
     https://doi.org/10.1145/2639108.2639128
[42] S. Lee, W. Jung, Y. Chon, and H. Cha, “Entrack: a system facility for
     analyzing energy consumption of android system services,” 09 2015,
     pp. 191–202.
[43] Y. Chung, C. Lin, and C. King, “Aneprof: Energy profiling for android
     java virtual machine and applications,” in 2011 IEEE 17th International
     Conference on Parallel and Distributed Systems, Dec 2011, pp. 372–379.
[44] D. Shin, K. Kim, N. Chang, W. Lee, Y. Wang, Q. Xie, and M. Pedram, “Online
     estimation of the remaining energy capacity in mobile systems considering
     system-wide power consumption and battery characteristics,” in 2013 18th
     Asia and South Pacific Design Automation Conference (ASP-DAC), Jan 2013,
     pp. 59–64.
[45] W. Jung, K. Kim, and H. Cha, “Userscope: A fine-grained framework
     for collecting energy-related smartphone user contexts,” in 2013 Inter-
     national Conference on Parallel and Distributed Systems, Dec 2013, pp.
     158–165.
[46] S. Tsao, C. Kao, I. Suat, Y. Kuo, Y. Chang, and C. Yu, “Powermemo:
     A power profiling tool for mobile devices in an emulated wireless
     environment,” in 2012 International Symposium on System on Chip
     (SoC), Oct 2012, pp. 1–5.
[47] G. Metri, “Energy efficiency analysis and optimization for mobile
     platforms,” 2014.
[48] X. Gao, D. Liu, D. Liu, H. Wang, and A. Stavrou, “E-android: A new
     energy profiling tool for smartphones,” in 2017 IEEE 37th International
     Conference on Distributed Computing Systems (ICDCS), June 2017, pp.
     492–502.
[49] A. Banerjee, L. K. Chong, S. Chattopadhyay, and A. Roychoudhury,
     “Detecting energy bugs and hotspots in mobile apps,” in Proceedings
     of the 22nd ACM SIGSOFT International Symposium on Foundations
     of Software Engineering, ser. FSE 2014. New York, NY, USA:
     Association for Computing Machinery, 2014, pp. 588–598. [Online].
     Available: https://doi.org/10.1145/2635868.2635871
[50] D. Li, C. Sahin, J. Clause, and W. G. J. Halfond, “Energy-directed test
     suite optimization,” in 2013 2nd International Workshop on Green and
     Sustainable Software (GREENS), May 2013, pp. 62–69.
[51] K. Walcott-Justice, “Maue: A framework for detecting energy bugs from
     user interactions on mobile applications,” 2016.
[52] H.-J. Kim, J. Kyong, and S.-S. Lim, “A systematic power and perfor-
     mance analysis framework for heterogeneous multiprocessor system,”
     Journal of IEMEK, vol. 9, pp. 315–321, 12 2014.
[53] “Power Profiles for Android,”
     https://source.android.com/devices/tech/power, 2019, [Online; accessed
     14-January-2020].
[54] Monsoon Solutions, Inc., “High voltage power monitor,”
     https://www.msoon.com/online-store, 2019, [Online; accessed 18-August-2019].
[55] “Android 10 for Developers,”
     https://developer.android.com/about/versions/10/highlights, 2019, [Online;
     accessed 14-January-2020].
[56] “Application forward compatibility.” [Online]. Available:
     https://developer.android.com/guide/topics/manifest/uses-sdk-element.html#fc
[57] “Distribution dashboard.” [Online]. Available:
     https://developer.android.com/about/dashboards
[58] “Trepn Power Profiler,”
     https://developer.qualcomm.com/forums/software/trepn-power-profiler, 2019,
     [Online; accessed 14-January-2020].
[59] “Which mobile devices report accurate system power consumption?”
     https://developer.qualcomm.com/forum/qdn-forums/software/trepn-power-profiler/28349,
     2013, [Online; accessed 14-January-2020].
[60] “Analyze power use with Battery Historian,”
     https://developer.android.com/topic/performance/power/battery-historian,
     2020, [Online; accessed 14-January-2020].
[61] “Inspect energy use with Energy Profiler,”
     https://developer.android.com/studio/profile/energy-profiler, 2019, [Online;
     accessed 14-January-2020].
[62] “Energy consumption on Android Studio Profiler,”
     https://stackoverflow.com/questions/52647045/energy-consumption-on-android-studio-profiler,
     2018, [Online; accessed 14-January-2020].