=Paper=
{{Paper
|id=None
|storemode=property
|title=Combining Multiple Dimensions of Knowledge in API Migration
|pdfUrl=https://ceur-ws.org/Vol-708/mdsm2011-bartolomei-et-al-11-apimig.pdf
|volume=Vol-708
}}
==Combining Multiple Dimensions of Knowledge in API Migration==
Thiago Tonelli Bartolomei (1), Mahdi Derakhshanmanesh (2), Andreas Fuhr (2), Peter Koch (2), Mathias Konrath (2), Ralf Lämmel (2), and Heiko Winnebeck (2)
(1) University of Waterloo, Canada
(2) University of Koblenz-Landau, Germany

Abstract: We combine multiple dimensions of knowledge about APIs so that we can support API migration by wrapping or transformation in new ways. That is, we assess wrapper-based API re-implementations and provide guidance for migrating API methods. We demonstrate our approach with two major GUI APIs for the Java platform and two wrapper-based re-implementations for migrating between the GUI APIs.

Keywords: Software migration, API migration, API analysis, Wrapping, Mining software repositories

Acknowledgement: We are grateful to Daniel Ratiu for providing us with data related to the programming ontology of [9], [10]. We are also grateful to four anonymous MDSM 2011 reviewers for their excellent advice.

I. INTRODUCTION

API migration is a kind of software migration; it may be necessary to meet requirements for software modernization, application integration, and others. API migration is realized by wrapping or transformation. We refer to [1], [2], [3], [4], [5], [6], [7], [8] for recent work on the subject.

For instance, consider the following re-engineering scenario. Two Java applications need to be integrated, but they use different GUI APIs, say Swing and SWT. Based on the exercised features and possibly other considerations, one of the two APIs is favored for the integrated application. The disfavored API (the "source API") can be re-implemented in terms of the favored API (the "target API") as a wrapper so that the migration requires little, if any, rewriting of the application's code. Incidentally, there are two advanced open-source wrappers that serve both directions of migration: SwingWT (see footnote 1) and SWTSwing (see footnote 2).

In previous work [6], [8], we substantiated that migration between independently developed source and target APIs may be complex because of significantly different generalization hierarchies, contracts, and protocols.

Contribution: In the present paper, we describe an approach for the combination of multiple dimensions of knowledge about APIs so that API migration can be supported in new ways. That is, we assess wrapper-based API re-implementations and provide guidance for migrating API methods. To this end, we leverage a model-based approach to the integration of knowledge about APIs into a repository for convenient use in declarative queries. Throughout the paper, we use the Swing/SWT APIs and the above-mentioned wrappers as subjects under study.

Road-map: Sec. II describes the integrated repository. Sec. III and Sec. IV cover different forms of supporting API migration. Related work is discussed in Sec. V, and the paper is concluded in Sec. VI. The paper and accompanying material are available online (see footnote 3).

Footnotes:
1 http://swingwt.sourceforge.net/: re-implements Swing in terms of SWT
2 http://swtswing.sourceforge.net/: re-implements SWT in terms of Swing
3 http://softlang.uni-koblenz.de/apirep/

II. THE INTEGRATED REPOSITORY

We integrate three data sources with API knowledge into a repository. Let us describe those data sources, the metamodel of the integrated repository, and the repository technology as such.

A. Data sources

• ApiModel (developed by the present authors): a model of API implementations (including Swing, SWT, SwingWT, SWTSwing) with an underlying metamodel that is a (very) limited Java metamodel for structural properties and calling relationships;
• ApiUsage (developed by Lämmel et al. [11]): a fact base (say, database) with usage properties of 1476 open-source Java projects at SourceForge, in particular with facts for API method calls within the projects' code;
• ApiLinks (developed by Ratiu et al. [9], [10]): an ontology for programming concepts that were extracted semi-automatically from APIs in different programming domains, complete with trace links between concepts and the API source-code elements from which they were derived.

The ApiModel source contributes basic knowledge about types and methods of genuine API implementations, and their coverage by the typically incomplete wrapper-based re-implementations. The ApiUsage source helps to assess, for example, the relevance of genuine methods that are not implemented in a wrapper. The ApiLinks source helps to derive candidate classes and methods that could be used in a wrapper-based API re-implementation.

B. Metamodel of the repository

Fig. 1 shows the metamodel (a UML class diagram) of our integrated repository where metaclasses are tagged by data sources ApiModel, ApiUsage, and ApiLinks. We must note that the metamodel does not cover all elements of the sources, but is streamlined to fit our objectives.

The metaclass NamedElement represents package-qualified names of packages, classes, and methods. Because of the composition relationships in the metamodel, NamedElements are also qualified by the name of an API, in fact, by a particular implementation, which could be a genuine implementation or a wrapper-based re-implementation.
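
As a rough illustration of how composition can yield implementation-qualified names, the following Python sketch models a NamedElement with a link to its containing element. The class layout, the `parent` field, and the `qualified_name` property are assumptions made for this illustration only; they are not the actual TGraph schema of the repository. The example element names follow the naming style of the candidate names listed in Table IV.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NamedElement:
    """A package, class, or method; `parent` models the composition link (assumed)."""
    name: str
    parent: Optional["NamedElement"] = None

    @property
    def qualified_name(self) -> str:
        # Walk up the composition chain; the root stands for the API implementation.
        if self.parent is None:
            return self.name
        return f"{self.parent.qualified_name}.{self.name}"


# Hypothetical example data: implementation -> package -> class -> constructor.
swing = NamedElement("swing")
javax_swing = NamedElement("javax.swing", swing)
image_icon = NamedElement("ImageIcon", javax_swing)
constructor = NamedElement("ImageIcon", image_icon)

print(constructor.qualified_name)
```

In this sketch, the same package and class names under a different root (say, a wrapper) would produce a different qualified name, which is the effect the composition relationships are described as having.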

Figure 1. Metamodel of the integrated repository with API knowledge

The metaclasses Package, Class, and Method represent the package hierarchy with the Java classes and their methods, further with extension relationships between classes (see association Extends) and calling relationships between methods (see association Calls). As a means of prioritization, we leave out interfaces; they are trivially copied by wrappers.

Classes of genuine API implementations are linked with the corresponding classes of wrappers (see association CorrespondsTo). Here we note that wrappers may use different package prefixes. Also, these links improve convenience for those queries that need to navigate between the different API implementations. The metaclass Concept models concepts in the sense of ApiLinks' ontology. Classes and methods can be linked with concepts; see associations IsClass and IsMethod. Hence, classes and methods of different APIs may be linked transitively.

The metaclass MethodUsage represents the usage data that was integrated from ApiUsage. That is, for each API method, we maintain the number of calls to the method (if any) within the SourceForge projects covered by ApiUsage [11]. We translated this number also into a relative measure in the sense of the percentage of the calls to the given method relative to the number of all calls to methods of the API.

C. Repository technology

The repository leverages the model-based TGraph approach [12]. The metamodel of Fig. 1 is represented as a TGraph schema; converters instantiate the schema from the different data sources. All analysis is performed by means of queries on TGraphs using the language GReQL (Graph Repository Query Language) [13]. For brevity, we describe all queries ("measurements") only informally in this paper, but here is a simple, illustrative GReQL example for retrieving all classes c of an API a that are not implemented by a wrapper:

using a:
from c: V{Class}
with c.qualifiedName =~ a and count(c-->{CorrespondsTo}) = 0
reportSet c
end

That is, a is an argument of the query for the name of the API; the query selects ("reports") all classes c such that the qualified name of c matches with a and there are no outgoing edges of the type CorrespondsTo (see -->{CorrespondsTo}) from c.

III. WRAPPER ASSESSMENT

Consider again our introductory scenario for API migration. Which wrapper, SwingWT or SWTSwing, should we favor? Such decision making should take into account wrapper qualities, e.g., its completeness or compliance, both relative to the genuine API implementation. In case we want to improve a given wrapper, we should also track progress by simple metrics. Accordingly, we propose some concepts for wrapper assessment.

A. Coverage of source API

We can trivially compare the ApiModel data between genuine API implementation and wrapper to get a basic sense of completeness in terms of (the percentage of) genuine packages, classes, and methods that are covered (say, re-implemented) by the wrapper. Table I collects such metrics for the Swing/SWT wrappers. The numbers show that the wrappers are highly incomplete.

           SwingWT            SWTSwing
Packages   25 (78.12 %)       16 (51.61 %)
Classes    533 (18.61 %)      372 (56.97 %)
Methods    4533 (26.60 %)     3426 (42.59 %)

Table I. Coverage of source API

B. Wrapper compliance issues

Some forms of non-compliance of a wrapper with the genuine API implementation can be determined by simple queries on our repository, e.g., differences regarding generalization hierarchies or the declaring classes for methods. Consider the following extension chain for Swing's AbstractButton:

java.lang.Object
 |_ java.awt.Component
    |_ java.awt.Container
       |_ javax.swing.JComponent
          |_ javax.swing.AbstractButton

The chain itself is preserved by SwingWT. However, Swing declares the method addActionListener on the class AbstractButton whereas SwingWT declares the method already on the class Component.

                                SwingWT   SWTSwing
• Declarations on supertypes        516        161
• Empty implementations            1006        230
• Missing methods                 12506       4618
  ◦ Class missing                  9604       3698
  ◦ Class present                  2902        920

Table II. Wrapper compliance issues
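
The measurements of this section are expressed as GReQL queries over the TGraph repository, but their logic can be approximated in a few lines of Python. The sketch below is an illustration under stated assumptions, not the authors' implementation: `ApiClass`, `unimplemented_classes`, and `compliance_counts` are hypothetical names, the CorrespondsTo association is modeled as a plain dictionary from genuine class names to wrapper classes, and the toy data shortens the AbstractButton extension chain. It reports genuine classes without a wrapper counterpart and counts methods that the wrapper declares on a supertype, or does not declare at all.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Set


@dataclass
class ApiClass:
    name: str                                        # package-qualified class name
    superclass: Optional["ApiClass"] = None          # stands in for the Extends association
    methods: Set[str] = field(default_factory=set)   # names of declared methods

    def supertypes(self) -> Iterator["ApiClass"]:
        current = self.superclass
        while current is not None:
            yield current
            current = current.superclass


def unimplemented_classes(genuine: List[ApiClass],
                          corresponds_to: Dict[str, ApiClass]) -> List[str]:
    # Genuine classes without any CorrespondsTo link to a wrapper class.
    return [c.name for c in genuine if c.name not in corresponds_to]


def compliance_counts(genuine: List[ApiClass],
                      corresponds_to: Dict[str, ApiClass]) -> Dict[str, int]:
    counts = {"on_supertype": 0, "missing_class_present": 0, "missing_class_missing": 0}
    for g in genuine:
        w = corresponds_to.get(g.name)
        for m in g.methods:
            if w is None:
                counts["missing_class_missing"] += 1      # whole class is missing
            elif m in w.methods:
                continue                                  # declared on the same class
            elif any(m in s.methods for s in w.supertypes()):
                counts["on_supertype"] += 1               # declared "earlier" in the wrapper
            else:
                counts["missing_class_present"] += 1      # class exists, method does not
    return counts


# Hypothetical toy data; the chain Component -> AbstractButton is shortened.
component = ApiClass("java.awt.Component", None, {"setVisible"})
button = ApiClass("javax.swing.AbstractButton", component, {"addActionListener", "doClick"})
slider = ApiClass("javax.swing.JSlider", component, {"setValue"})
genuine = [component, button, slider]

w_component = ApiClass("java.awt.Component", None, {"setVisible", "addActionListener"})
w_button = ApiClass("javax.swing.AbstractButton", w_component, set())
corresponds_to = {component.name: w_component, button.name: w_button}

missing = unimplemented_classes(genuine, corresponds_to)
coverage = 100.0 * (len(genuine) - len(missing)) / len(genuine)
counts = compliance_counts(genuine, corresponds_to)
print(missing, round(coverage, 2), counts)
```

In the toy data, addActionListener is counted as a declaration on a supertype because the wrapper declares it on Component rather than on AbstractButton, mirroring the SwingWT example above. Whether the actual measurements treat such methods as implemented or as missing is not stated in the text, so the categorization here is an assumption.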

Table II shows numbers for some metrics for (lack of) wrapper compliance. In reference to the above example of the method addActionListener, we measure the number of methods that are declared "earlier" on a supertype in the wrapper. Further, we measure methods with empty implementations, i.e., implementations without any outgoing method calls, while the corresponding genuine implementations had outgoing method calls. (The substantial number of empty implementations may be surprising, but these wrappers are nevertheless reportedly useful in practice.) Finally, we also subdivide missing methods into those that are implied by missing classes vs. those that are missing from existing classes.

C. Relevance in terms of usage

Let us qualify wrapper (in-)completeness with ApiUsage data. If the developers of the wrappers applied the right judgement call for leaving out classes and methods, then the missing methods should be less relevant in practice than the implemented ones. Table III lists usage metrics for the Swing/SWT wrappers.

                        SwingWT    SWTSwing
Unimplemented methods
• Any usage              9.01 %      2.90 %
• Cumulative usage       2.88 %      2.35 %
Empty methods
• Any usage             42.53 %     25.71 %
• Cumulative usage      11.41 %      1.49 %
Non-empty methods
• Any usage             48.46 %     71.39 %
• Cumulative usage      85.72 %     96.17 %

Table III. Usage of API methods in SourceForge

In the table, we break down Swing's and SWT's methods into categories according to the wrappers as follows: unimplemented, empty, and non-empty implemented methods. For each category, we show the percentage of methods with "any usage" (say, any calls) in the SourceForge projects in the scope of the ApiUsage source. We also show "cumulative usage" for each category, i.e., the contribution of the category to all API method calls. These are contrasting numbers which show, for example, that the many unimplemented and empty methods (see again Table II) are exercised much less frequently than the fewer non-empty methods.

IV. GUIDANCE FOR MIGRATION

A given wrapper may be effectively incomplete in that a missing method is actually exercised by the application under API migration. In this case, we seek guidance for migrating the API method in question. Such guidance is universally useful for API migration, even when transformation is used instead of wrapping. A practical approach to guidance would need to combine elements of API type matching, IDE support (such as autocompletion and stub generation), and others. We focus here on the aspect of proposing method candidates to be called in methods of wrapper-based API re-implementations.

A. Concept-based method candidates

We can use ApiLinks' trace links between API methods and concepts to propose method candidates. The idea is that if methods of the source and target APIs are related to the same concept, then the latter may be useful in re-implementing the former. Further, let us sort all such candidates by their cumulative usage, say, by their relevance as far as ApiUsage is concerned.

Qualified candidate name                              Cumulative usage (%)
swing.javax.swing.ImageIcon.ImageIcon                 0.4816
swing.java.awt.image.BufferedImage.BufferedImage      0.1063
swing.java.awt.Frame.getIconImage                     0.0059
swing.java.awt.....MemoryImageSource                  0.0046
swing.java.awt.Frame.setIconImage                     0.0042
swing.javax.swing.text.html.ImageView.ImageView       0.0005
swing.java.awt.....ImageGraphicAttribute              N/A

Table IV. Candidates for re-implementing SWT's Button.setImage

Suppose you need to migrate SWT's Button.setImage to Swing. Table IV shows the method candidates that were automatically determined by a GReQL query. Consider the first line with the constructor of ImageIcon. We show the line in bold face to convey the fact that there is an existing wrapper, SWTSwing, whose method implementation of setImage readily involves the constructor of ImageIcon.

Further inspection reveals that Swing's JButton, which is a counterpart to SWT's Button, does not provide an Image property and, hence, we cannot simply migrate SWT's Button.setImage to a corresponding setter of Swing. Extra state and a more complex idiom (indeed involving ImageIcon) is needed.

B. Assessment of the ontology

The above example shows that ApiLinks may suggest reasonable candidates, in principle. We would like to assess ApiLinks' relevance more generally. In particular, we could compare ApiLinks-based links with actual calling relationships in existing wrapper implementations, as they are available through ApiModel's data. Table V lists corresponding metrics for the Swing/SWT wrappers.

                                     SwingWT    SWTSwing
Unimplemented methods with links     10.83 %      0.35 %
Implemented methods with links       28.06 %     24.98 %
Correct links                        42.75 %     37.20 %

Table V. API links between Swing and SWT

The coverage of API parts by ApiLinks' trace links is an artifact of the underlying semi-automatic ontology extraction approach [9], [10], which involves elements of name matching and thresholds for the inclusion of concepts. We cannot expect to retrieve links for arbitrary methods from ApiLinks.

In the table, we break down Swing's and SWT's methods into the categories of unimplemented and implemented methods according to the wrappers. For both categories, we show the percentage of methods that are linked (transitively) with one or more methods of the
corresponding target API. The numbers are such that implemented methods happen to be much better linked than unimplemented ones.

At the bottom of the table, we also list the percentage of correct ApiLinks' trace links. We say that a link from the method m of the source API s to a method m' of the target API t is correct, if a given wrapper-based re-implementation of s in terms of t implements m in a way that it directly calls m'. When we specify the percentage, we consider as the baseline (100%) only those methods m that both have associated trace links to t and actually call some method of t. It turns out that ApiLinks predicts a correct link in more than 1/3 of the cases. We have to note though that ApiLinks typically proposes multiple candidates, with a median of 8.

V. RELATED WORK

Work on API migration has previously focused on transformation and wrapper-generation techniques for API upgrades [2], [3], [4], [5] and, to a lesser extent, on migration between independently developed APIs [1], [6], [7], [8]. The present work is the first to integrate diverse data sources to assess wrappers and to guide their development. Typically, wrappers are assessed by testing (i.e., testing whether the application under migration continues to function, or recovers from any test failures that had to be addressed by improving a pre-existing wrapper) [6]. There is no previous work on guiding API-wrapper development for independently developed APIs.

Most of the techniques that we integrate are inspired by program comprehension research. For instance, our comparison of different API implementations is a simple form of object-model matching [14]. Also, our exploitation of API-usage data is straightforward, when compared to other scenarios of exploiting such data in the context of API usability [15] and understanding API usage (patterns) [16], [17]. Our proposal for guided migration can be viewed as one specific approach to advanced ("intelligent") code completion systems [18], [19].

VI. CONCLUDING REMARKS

The complexity of API migration requires many skills and techniques. Of course, one must understand the API's domain, and the application under migration. Basic software engineering skills such as testing, design by contract, and effective use of documentation are critical as well. Still, API migrations are largely unstructured today, and they come with unpredictable costs. We submit that techniques for assessment and guidance, such as those discussed in this short paper, are needed to tackle non-trivial API migrations in the future.

Clearly, our work is at an early stage, and makes only a limited contribution to the larger API migration theme. There is a need for a comprehensive approach for guided API migration, which should combine diverse elements of assessment, mapping, matching, code completion, code generation, and testing.

REFERENCES

[1] I. Balaban, F. Tip, and R. Fuhrer, "Refactoring support for class library migration," in Proc. of OOPSLA 2005. ACM, 2005, pp. 265–279.
[2] J. Henkel and A. Diwan, "CatchUp!: capturing and replaying refactorings to support API evolution," in Proc. of ICSE 2005. ACM, 2005, pp. 274–283.
[3] J. H. Perkins, "Automatically generating refactorings to support API evolution," in Proc. of the Workshop on Program Analysis for Software Tools and Engineering (PASTE). ACM, 2005, pp. 111–114.
[4] I. Şavga, M. Rudolf, S. Götz, and U. Aßmann, "Practical refactoring-based framework upgrade," in Proc. of the Conference on Generative Programming and Component Engineering (GPCE). ACM, 2008, pp. 171–180.
[5] D. Dig, S. Negara, V. Mohindra, and R. Johnson, "ReBA: refactoring-aware binary adaptation of evolving libraries," in Proc. of ICSE 2008. ACM, 2008, pp. 441–450.
[6] T. T. Bartolomei, K. Czarnecki, R. Lämmel, and T. van der Storm, "Study of an API Migration for Two XML APIs," in Proc. of Conference on Software Language Engineering (SLE 2009), ser. LNCS, vol. 5969. Springer, 2010, pp. 42–61.
[7] M. Nita and D. Notkin, "Using Twinning to Adapt Programs to Alternative APIs," in Proc. of ICSE 2010, 2010.
[8] T. T. Bartolomei, K. Czarnecki, and R. Lämmel, "Swing to SWT and Back: Patterns for API Migration by Wrapping," in Proc. of ICSM 2010. IEEE, 2010, 10 pages.
[9] D. Ratiu, M. Feilkas, and J. Jürjens, "Extracting Domain Ontologies from Domain Specific APIs," in 12th European Conference on Software Maintenance and Reengineering, CSMR 2008, Proceedings. IEEE, 2008, pp. 203–212.
[10] D. Ratiu, M. Feilkas, F. Deissenboeck, J. Jürjens, and R. Marinescu, "Towards a Repository of Common Programming Technologies Knowledge," in Proc. of the Int. Workshop on Semantic Technologies in System Maintenance (STSM), 2008.
[11] R. Lämmel, E. Pek, and J. Starek, "Large-scale, AST-based API-usage analysis of open-source Java projects," in SAC'11 - ACM 2011 Symposium on Applied Computing, Technical Track on "Programming Languages", 2011, to appear.
[12] J. Ebert, V. Riediger, and A. Winter, "Graph Technology in Reverse Engineering: The TGraph Approach," in WSR 2008, ser. GI-Edition Proceedings, vol. 126. Gesellschaft für Informatik, 2008, pp. 67–81.
[13] D. Bildhauer and J. Ebert, "Querying Software Abstraction Graphs," in Query Technologies and Applications for Program Comprehension (QTAPC 2008), Workshop at ICPC 2008, 2008.
[14] Z. Xing and E. Stroulia, "UMLDiff: an algorithm for object-oriented design differencing," in 20th IEEE/ACM International Conference on Automated Software Engineering (ASE 2005), Proceedings. ACM, 2005, pp. 54–65.
[15] J. Stylos, B. A. Myers, and Z. Yang, "Jadeite: improving API documentation using usage information," in Proc. of the 27th Intern. Conf. on Human Factors in Computing Systems, CHI 2009. ACM, 2009, pp. 4429–4434.
[16] J. Stylos and B. A. Myers, "Mica: A Web-Search Tool for Finding API Components and Examples," in 2006 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC 2006), Proceedings. IEEE, 2006, pp. 195–202.
[17] T. Xie and J. Pei, "MAPO: mining API usages from open source repositories," in MSR '06: Proceedings of the 2006 international workshop on Mining software repositories. ACM, 2006, pp. 54–57.
[18] D. Mandelin, L. Xu, R. Bodík, and D. Kimelman, "Jungloid mining: helping to navigate the API jungle," in Proc. of the 2005 ACM SIGPLAN conference on Programming language design and implementation (PLDI 2005). ACM, 2005, pp. 48–61.
[19] M. Bruch, M. Monperrus, and M. Mezini, "Learning from examples to improve code completion systems," in Proceedings of ESEC/SIGSOFT FSE 2009. ACM, 2009, pp. 213–222.