=Paper= {{Paper |id=None |storemode=property |title=Combining Multiple Dimensions of Knowledge in API Migration |pdfUrl=https://ceur-ws.org/Vol-708/mdsm2011-bartolomei-et-al-11-apimig.pdf |volume=Vol-708 }} ==Combining Multiple Dimensions of Knowledge in API Migration== https://ceur-ws.org/Vol-708/mdsm2011-bartolomei-et-al-11-apimig.pdf
                 Combining Multiple Dimensions of Knowledge in API Migration
            Thiago Tonelli Bartolomei (1), Mahdi Derakhshanmanesh (2), Andreas Fuhr (2), Peter Koch (2),
                     Mathias Konrath (2), Ralf Lämmel (2), and Heiko Winnebeck (2)
                                  (1) University of Waterloo, Canada
                                  (2) University of Koblenz-Landau, Germany


   Abstract: We combine multiple dimensions of knowledge about APIs so that we can support API migration by wrapping or transformation in new ways. That is, we assess wrapper-based API re-implementations and provide guidance for migrating API methods. We demonstrate our approach with two major GUI APIs for the Java platform and two wrapper-based re-implementations for migrating between the GUI APIs.

   Keywords: Software migration, API migration, API analysis, Wrapping, Mining software repositories

                            I. INTRODUCTION

   API migration is a kind of software migration; it may be necessary to meet requirements for software modernization, application integration, and others. API migration is realized by wrapping or transformation. We refer to [1], [2], [3], [4], [5], [6], [7], [8] for recent work on the subject.

   For instance, consider the following re-engineering scenario. Two Java applications need to be integrated, but they use different GUI APIs, say Swing and SWT. Based on the exercised features and possibly other considerations, one of the two APIs is favored for the integrated application. The disfavored API (the "source API") can be re-implemented in terms of the favored API (the "target API") as a wrapper so that the migration requires little, if any, rewriting of the application's code. Incidentally, there are two advanced open-source wrappers that serve both directions of migration: SwingWT (http://swingwt.sourceforge.net/, which re-implements Swing in terms of SWT) and SWTSwing (http://swtswing.sourceforge.net/, which re-implements SWT in terms of Swing).

   In previous work [6], [8], we substantiated that migration between independently developed source and target APIs may be complex because of significantly different generalization hierarchies, contracts, and protocols.

   Contribution: In the present paper, we describe an approach for the combination of multiple dimensions of knowledge about APIs so that API migration can be supported in new ways. That is, we assess wrapper-based API re-implementations and provide guidance for migrating API methods. To this end, we leverage a model-based approach to the integration of knowledge about APIs into a repository for convenient use in declarative queries. Throughout the paper, we use the Swing/SWT APIs and the above-mentioned wrappers as subjects under study.

   Road-map: Sec. II describes the integrated repository. Sec. III and Sec. IV cover different forms of supporting API migration. Related work is discussed in Sec. V, and the paper is concluded in Sec. VI. The paper and accompanying material are available online (http://softlang.uni-koblenz.de/apirep/).

   Acknowledgement: We are grateful to Daniel Ratiu for providing us with data related to the programming ontology of [9], [10]. We are also grateful to four anonymous MDSM 2011 reviewers for their excellent advice.

                    II. THE INTEGRATED REPOSITORY

   We integrate three data sources with API knowledge into a repository. Let us describe those data sources, the metamodel of the integrated repository, and the repository technology as such.

A. Data sources

   • ApiModel (developed by the present authors): a model of API implementations (including Swing, SWT, SwingWT, SWTSwing) with an underlying metamodel that is a (very) limited Java metamodel for structural properties and calling relationships;
   • ApiUsage (developed by Lämmel et al. [11]): a fact base (say, database) with usage properties of 1476 open-source Java projects at SourceForge, in particular with facts for API method calls within the projects' code;
   • ApiLinks (developed by Ratiu et al. [9], [10]): an ontology for programming concepts that were extracted semi-automatically from APIs in different programming domains, complete with trace links between concepts and the API source-code elements from which they were derived.

   The ApiModel source contributes basic knowledge about types and methods of genuine API implementations, and their coverage by the typically incomplete wrapper-based re-implementations. The ApiUsage source helps to assess, for example, the relevance of genuine methods that are not implemented in a wrapper. The ApiLinks source helps to derive candidate classes and methods that could be used in a wrapper-based API re-implementation.

B. Metamodel of the repository

   Fig. 1 shows the metamodel (a UML class diagram) of our integrated repository where metaclasses are tagged by data sources ApiModel, ApiUsage, and ApiLinks. We must note that the metamodel does not cover all elements of the sources, but is streamlined to fit our objectives.

   The metaclass NamedElement represents package-qualified names of packages, classes, and methods. Because of the composition relationships in the metamodel, NamedElements are also qualified by the name of an API, in fact, by a particular implementation, which could be a genuine implementation or a wrapper-based re-implementation.
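   To make the qualification of names concrete, the following is a minimal sketch, not the repository's actual representation. The class and field names are hypothetical, and the dotted layout (implementation name, then package, then class, then method) is an assumption inferred from candidate names such as swing.javax.swing.ImageIcon.ImageIcon that appear in Table IV.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NamedElement:
    """Hypothetical stand-in for the NamedElement metaclass."""

    implementation: str                 # e.g. "swing" for the genuine Swing implementation
    package: str                        # e.g. "javax.swing"
    class_name: Optional[str] = None    # None when the element is a package
    method_name: Optional[str] = None   # None when the element is a class or package

    @property
    def qualified_name(self) -> str:
        # Assumed layout: <implementation>.<package>[.<class>[.<method>]]
        parts = [self.implementation, self.package]
        if self.class_name is not None:
            parts.append(self.class_name)
            if self.method_name is not None:
                parts.append(self.method_name)
        return ".".join(parts)


# The ImageIcon constructor of the genuine Swing implementation.
ctor = NamedElement("swing", "javax.swing", "ImageIcon", "ImageIcon")
print(ctor.qualified_name)
```

   Prefixing the implementation name keeps elements of a genuine API and of a wrapper apart even when their package-qualified names coincide.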
Figure 1.   Metamodel of the integrated repository with API knowledge

   The metaclasses Package, Class, and Method represent the package hierarchy with the Java classes and their methods, further with extension relationships between classes (see association Extends) and calling relationships between methods (see association Calls). As a means of prioritization, we leave out interfaces; they are trivially copied by wrappers.

   Classes of genuine API implementations are linked with the corresponding classes of wrappers (see association CorrespondsTo). Here we note that wrappers may use different package prefixes. Also, these links improve convenience for those queries that need to navigate between the different API implementations. The metaclass Concept models concepts in the sense of ApiLinks' ontology. Classes and methods can be linked with concepts; see associations IsClass and IsMethod. Hence, classes and methods of different APIs may be linked transitively.

   The metaclass MethodUsage represents the usage data that was integrated from ApiUsage. That is, for each API method, we maintain the number of calls to the method (if any) within the SourceForge projects covered by ApiUsage [11]. We translated this number also into a relative measure in the sense of the percentage of the calls to the given method relative to the number of all calls to methods of the API.

C. Repository technology

   The repository leverages the model-based TGraph approach [12]. The metamodel of Fig. 1 is represented as a TGraph schema; converters instantiate the schema from the different data sources. All analysis is performed by means of queries on TGraphs using the language GReQL (Graph Repository Query Language) [13]. For brevity, we describe all queries ("measurements") only informally in this paper, but here is a simple, illustrative GReQL example for retrieving all classes c of an API a that are not implemented by a wrapper:

    using a:
    from c: V{Class}
    with c.qualifiedName =~ a and count(c -->{CorrespondsTo}) = 0
    reportSet c
    end

   That is, a is an argument of the query for the name of the API; the query selects ("reports") all classes c such that the qualified name of c matches with a and there are no outgoing edges of the type CorrespondsTo (see -->{CorrespondsTo}) from c.

                       III. WRAPPER ASSESSMENT

   Consider again our introductory scenario for API migration. Which wrapper, SwingWT or SWTSwing, should we favor? Such decision making should take into account wrapper qualities, e.g., its completeness or compliance, both relative to the genuine API implementation. In case we want to improve a given wrapper, we should also track progress by simple metrics. Accordingly, we propose some concepts for wrapper assessment.

A. Coverage of source API

   We can trivially compare the ApiModel data between genuine API implementation and wrapper to get a basic sense of completeness in terms of (the percentage of) genuine packages, classes, and methods that are covered (say, re-implemented) by the wrapper. Table I collects such metrics for the Swing/SWT wrappers. The numbers show that the wrappers are highly incomplete.

                    SwingWT             SWTSwing
     Packages       25 (78.12 %)        16 (51.61 %)
     Classes        533 (18.61 %)       372 (56.97 %)
     Methods        4533 (26.60 %)      3426 (42.59 %)

                          Table I
                  COVERAGE OF SOURCE API

B. Wrapper compliance issues

   Some forms of non-compliance of a wrapper with the genuine API implementation can be determined by simple queries on our repository, e.g., differences regarding generalization hierarchies or the declaring classes for methods. Consider the following extension chain for Swing's AbstractButton:

    java.lang.Object
     |_ java.awt.Component
         |_ java.awt.Container
             |_ javax.swing.JComponent
                 |_ javax.swing.AbstractButton

   The chain itself is preserved by SwingWT. However, Swing declares the method addActionListener on the class AbstractButton whereas SwingWT declares the method already on the class Component.

                                     SwingWT         SWTSwing
     • Declarations on supertypes    516             161
     • Empty implementations         1006            230
     • Missing methods               12506           4618
       ◦ Class missing               9604            3698
       ◦ Class present               2902            920

                          Table II
                 WRAPPER COMPLIANCE ISSUES
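   The supertype-declaration issue from the addActionListener example can be sketched as a small check over two declaration maps. This is an illustrative sketch under simplifying assumptions, not the GReQL measurement used for the paper: the function names are hypothetical, methods are identified by simple name only (no signatures), wrapper classes are assumed to be already mapped to the names of their genuine counterparts (as the CorrespondsTo links would allow), and each genuine method is assumed to be listed on the class that first declares it.

```python
from typing import Dict, List, Optional, Set, Tuple

# Superclass links (child -> parent) for the extension chain of AbstractButton.
EXTENDS: Dict[str, Optional[str]] = {
    "java.lang.Object": None,
    "java.awt.Component": "java.lang.Object",
    "java.awt.Container": "java.awt.Component",
    "javax.swing.JComponent": "java.awt.Container",
    "javax.swing.AbstractButton": "javax.swing.JComponent",
}


def strict_supertypes(cls: str, extends: Dict[str, Optional[str]]) -> List[str]:
    """Return the superclasses of cls, nearest first, excluding cls itself."""
    result: List[str] = []
    parent = extends.get(cls)
    while parent is not None:
        result.append(parent)
        parent = extends.get(parent)
    return result


def declared_on_supertype(
    genuine: Dict[str, Set[str]],
    wrapper: Dict[str, Set[str]],
    extends: Dict[str, Optional[str]],
) -> List[Tuple[str, str, str]]:
    """Report (method, genuine declaring class, wrapper declaring class)
    for methods that the wrapper declares on a strict supertype instead."""
    issues: List[Tuple[str, str, str]] = []
    for cls in sorted(genuine):
        supers = strict_supertypes(cls, extends)
        for method in sorted(genuine[cls]):
            if method in wrapper.get(cls, set()):
                continue  # declared on the same class, so no issue
            for sup in supers:
                if method in wrapper.get(sup, set()):
                    issues.append((method, cls, sup))
                    break  # report only the nearest such supertype
    return issues


# Declarations as described in the text: Swing vs. SwingWT.
genuine = {"javax.swing.AbstractButton": {"addActionListener"}}
wrapper = {"java.awt.Component": {"addActionListener"}}
issues = declared_on_supertype(genuine, wrapper, EXTENDS)
print(issues)
```

   Counting the reported triples per wrapper would give a figure analogous to the "Declarations on supertypes" row of Table II, subject to the simplifications listed above.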
   Table II shows numbers for some metrics for (lack of) wrapper compliance. In reference to the above example of the method addActionListener, we measure the number of methods that are declared "earlier" on a supertype in the wrapper. Further, we measure methods with empty implementations, i.e., implementations without any outgoing method calls, while the corresponding genuine implementations had outgoing method calls. (The substantial number of empty implementations may be surprising, but these wrappers are nevertheless reportedly useful in practice.) Finally, we also subdivide missing methods into those that are implied by missing classes vs. those that are missing from existing classes.

C. Relevance in terms of usage

   Let us qualify wrapper (in-)completeness with ApiUsage data. If the developers of the wrappers made the right judgement calls in leaving out classes and methods, then the missing methods should be less relevant in practice than the implemented ones. Table III lists usage metrics for the Swing/SWT wrappers.

                                   SwingWT     SWTSwing
     Unimplemented methods
      • Any usage                  9.01 %      2.90 %
      • Cumulative usage           2.88 %      2.35 %
     Empty methods
      • Any usage                  42.53 %     25.71 %
      • Cumulative usage           11.41 %     1.49 %
     Non-empty methods
      • Any usage                  48.46 %     71.39 %
      • Cumulative usage           85.72 %     96.17 %

                          Table III
            USAGE OF API METHODS IN SOURCEFORGE

   In the table, we break down Swing's and SWT's methods into categories according to the wrappers as follows: unimplemented, empty, and non-empty implemented methods. For each category, we show the percentage of methods with "any usage" (say, any calls) in the SourceForge projects in the scope of the ApiUsage source. We also show "cumulative usage" for each category, i.e., the contribution of the category to all API method calls. These are contrasting numbers which show, for example, that the many unimplemented and empty methods (see again Table II) are exercised much less frequently than the fewer non-empty methods.

                     IV. GUIDANCE FOR MIGRATION

   A given wrapper may be effectively incomplete in that a missing method is actually exercised by the application under API migration. In this case, we seek guidance for migrating the API method in question. Such guidance is universally useful for API migration, even when transformation is used instead of wrapping. A practical approach to guidance would need to combine elements of API type matching, IDE support (such as autocompletion and stub generation), and others. We focus here on the aspect of proposing method candidates to be called in methods of wrapper-based API re-implementations.

A. Concept-based method candidates

   We can use ApiLinks' trace links between API methods and concepts to propose method candidates. The idea is that if methods of the source and target APIs are related to the same concept, then the latter may be useful in re-implementing the former. Further, let us sort all such candidates by their cumulative usage, say, by their relevance as far as ApiUsage is concerned.

     Qualified candidate name                             Cumulative usage (%)
     swing.javax.swing.ImageIcon.ImageIcon                0.4816
     swing.java.awt.image.BufferedImage.BufferedImage     0.1063
     swing.java.awt.Frame.getIconImage                    0.0059
     swing.java.awt.....MemoryImageSource                 0.0046
     swing.java.awt.Frame.setIconImage                    0.0042
     swing.javax.swing.text.html.ImageView.ImageView      0.0005
     swing.java.awt.....ImageGraphicAttribute             N/A

                          Table IV
       CANDIDATES FOR RE-IMPLEMENTING SWT'S Button.setImage

   Suppose you need to migrate SWT's Button.setImage to Swing. Table IV shows the method candidates that were automatically determined by a GReQL query. Consider the first line with the constructor of ImageIcon. We show the line in bold face to convey the fact that there is an existing wrapper, SWTSwing, whose method implementation of setImage readily involves the constructor of ImageIcon.

   Further inspection reveals that Swing's JButton, which is a counterpart to SWT's Button, does not provide an Image property and, hence, we cannot simply migrate SWT's Button.setImage to a corresponding setter of Swing. Extra state and a more complex idiom (indeed involving ImageIcon) are needed.

B. Assessment of the ontology

   The above example shows that ApiLinks may suggest reasonable candidates, in principle. We would like to assess ApiLinks' relevance more generally. In particular, we could compare ApiLinks-based links with actual calling relationships in existing wrapper implementations, as they are available through ApiModel's data. Table V lists corresponding metrics for the Swing/SWT wrappers.

                                          SwingWT        SWTSwing
     Unimplemented methods with links     10.83 %        0.35 %
     Implemented methods with links       28.06 %        24.98 %
     Correct links                        42.75 %        37.20 %

                          Table V
             API LINKS BETWEEN SWING AND SWT

   The coverage of API parts by ApiLinks' trace links is an artifact of the underlying semi-automatic ontology extraction approach [9], [10], which involves elements of name matching and thresholds for the inclusion of concepts. We cannot expect to retrieve links for arbitrary methods from ApiLinks.

   In the table, we break down Swing's and SWT's methods into the categories of unimplemented and implemented methods according to the wrappers. For both categories, we show the percentage of methods that are linked (transitively) with one or more methods of the
corresponding target API. The numbers show that implemented methods are much better linked than unimplemented ones.

   At the bottom of the table, we also list the percentage of correct ApiLinks trace links. We say that a link from the method m of the source API s to a method m' of the target API t is correct if a given wrapper-based re-implementation of s in terms of t implements m in such a way that it directly calls m'. When we specify the percentage, we consider as the baseline (100%) only those methods m that both have associated trace links to t and actually call some method of t. It turns out that ApiLinks predicts a correct link in more than 1/3 of the cases. We have to note though that ApiLinks typically proposes multiple candidates, with a median of 8.

                         V. RELATED WORK

   Work on API migration has previously focused on transformation and wrapper-generation techniques for API upgrades [2], [3], [4], [5] and, to a lesser extent, on migration between independently developed APIs [1], [6], [7], [8]. The present work is the first to integrate diverse data sources to assess wrappers and to guide their development. Typically, wrappers are assessed by testing (i.e., testing whether the application under migration continues to function, or recovers from any test failures that had to be addressed by improving a pre-existing wrapper) [6]. There is no previous work on guiding API-wrapper development for independently developed APIs.

   Most of the techniques that we integrate are inspired by program comprehension research. For instance, our comparison of different API implementations is a simple form of object-model matching [14]. Also, our exploitation of API-usage data is straightforward, when compared to other scenarios of exploiting such data in the context of API usability [15] and understanding API usage (patterns) [16], [17]. Our proposal for guided migration can be viewed as one specific approach to advanced ("intelligent") code completion systems [18], [19].

                      VI. CONCLUDING REMARKS

   The complexity of API migration requires many skills and techniques. Of course, one must understand the API's domain, and the application under migration. Basic software engineering skills such as testing, design by contract, and effective use of documentation are critical as well. Still, API migrations are largely unstructured today, and they come with unpredictable costs. We submit that techniques for assessment and guidance, such as those discussed in this short paper, are needed to tackle non-trivial API migrations in the future.

   Clearly, our work is at an early stage, and makes only a limited contribution to the larger API migration theme. There is a need for a comprehensive approach for guided API migration, which should combine diverse elements of assessment, mapping, matching, code completion, code generation, and testing.

                            REFERENCES

   [1] I. Balaban, F. Tip, and R. Fuhrer, "Refactoring support for class library migration," in Proc. of OOPSLA 2005. ACM, 2005, pp. 265–279.
   [2] J. Henkel and A. Diwan, "CatchUp!: capturing and replaying refactorings to support API evolution," in Proc. of ICSE 2005. ACM, 2005, pp. 274–283.
   [3] J. H. Perkins, "Automatically generating refactorings to support API evolution," in Proc. of the Workshop on Program Analysis for Software Tools and Engineering (PASTE). ACM, 2005, pp. 111–114.
   [4] I. Şavga, M. Rudolf, S. Götz, and U. Aßmann, "Practical refactoring-based framework upgrade," in Proc. of the Conference on Generative Programming and Component Engineering (GPCE). ACM, 2008, pp. 171–180.
   [5] D. Dig, S. Negara, V. Mohindra, and R. Johnson, "ReBA: refactoring-aware binary adaptation of evolving libraries," in Proc. of ICSE 2008. ACM, 2008, pp. 441–450.
   [6] T. T. Bartolomei, K. Czarnecki, R. Lämmel, and T. van der Storm, "Study of an API Migration for Two XML APIs," in Proc. of Conference on Software Language Engineering (SLE 2009), ser. LNCS, vol. 5969. Springer, 2010, pp. 42–61.
   [7] M. Nita and D. Notkin, "Using Twinning to Adapt Programs to Alternative APIs," in Proc. of ICSE 2010, 2010.
   [8] T. T. Bartolomei, K. Czarnecki, and R. Lämmel, "Swing to SWT and Back: Patterns for API Migration by Wrapping," in Proc. of ICSM 2010. IEEE, 2010, 10 pages.
   [9] D. Ratiu, M. Feilkas, and J. Jürjens, "Extracting Domain Ontologies from Domain Specific APIs," in 12th European Conference on Software Maintenance and Reengineering, CSMR 2008, Proceedings. IEEE, 2008, pp. 203–212.
  [10] D. Ratiu, M. Feilkas, F. Deissenboeck, J. Jürjens, and R. Marinescu, "Towards a Repository of Common Programming Technologies Knowledge," in Proc. of the Int. Workshop on Semantic Technologies in System Maintenance (STSM), 2008.
  [11] R. Lämmel, E. Pek, and J. Starek, "Large-scale, AST-based API-usage analysis of open-source Java projects," in SAC'11: ACM 2011 Symposium on Applied Computing, Technical Track on "Programming Languages", 2011, to appear.
  [12] J. Ebert, V. Riediger, and A. Winter, "Graph Technology in Reverse Engineering: The TGraph Approach," in WSR 2008, ser. GI-Edition Proceedings, vol. 126. Gesellschaft für Informatik, 2008, pp. 67–81.
  [13] D. Bildhauer and J. Ebert, "Querying Software Abstraction Graphs," in Query Technologies and Applications for Program Comprehension (QTAPC 2008), Workshop at ICPC 2008, 2008.
  [14] Z. Xing and E. Stroulia, "UMLDiff: an algorithm for object-oriented design differencing," in 20th IEEE/ACM International Conference on Automated Software Engineering (ASE 2005), Proceedings. ACM, 2005, pp. 54–65.
  [15] J. Stylos, B. A. Myers, and Z. Yang, "Jadeite: improving API documentation using usage information," in Proc. of the 27th Intern. Conf. on Human Factors in Computing Systems, CHI 2009. ACM, 2009, pp. 4429–4434.
  [16] J. Stylos and B. A. Myers, "Mica: A Web-Search Tool for Finding API Components and Examples," in 2006 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC 2006), Proceedings. IEEE, 2006, pp. 195–202.
  [17] T. Xie and J. Pei, "MAPO: mining API usages from open source repositories," in MSR '06: Proceedings of the 2006 international workshop on Mining software repositories. ACM, 2006, pp. 54–57.
  [18] D. Mandelin, L. Xu, R. Bodík, and D. Kimelman, "Jungloid mining: helping to navigate the API jungle," in Proc. of the 2005 ACM SIGPLAN conference on Programming language design and implementation (PLDI 2005). ACM, 2005, pp. 48–61.
  [19] M. Bruch, M. Monperrus, and M. Mezini, "Learning from examples to improve code completion systems," in Proceedings of ESEC/SIGSOFT FSE 2009. ACM, 2009, pp. 213–222.