<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Structural type inference in Java-like languages - Extended abstract -</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Martin Plümicke</string-name>
          <email>pl@dhbw.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Baden-Wuerttemberg Cooperative State University Stuttgart/Horb Florianstraße 15</institution>
          ,
          <addr-line>D-72160 Horb</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2007</year>
      </pub-date>
      <volume>5437</volume>
      <fpage>4</fpage>
      <lpage>6</lpage>
      <abstract>
        <p>In the past we considered type inference for Java with generics and lambdas. Our type inference algorithm determines nominal types in subjection to a given environment. This is a hard restriction as separate compilation of Java classes without relying on type informations of other classes is impossible. In this paper we present a type inference algorithm for a Java-like language, that infers structural types without a given environment. This allows separate compilation of Java classes without relying on type informations of other classes.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Let us consider an example that shows the idea. In [Plü15] for the following program no type can be inferred,
as there is no type assumption for elementAt.</p>
      <p>class A { m (v) { return v.elementAt(0); } }
Only with an import declaration import java.util.Vector; a type can be inferred.</p>
      <p>In this paper we give an algorithm that infers for v a structural type , which has a method elementAt. Our
algorithm is a generalization of an idea, that is given in [ADDZ05]. In the introducing example from [ADDZ05]
the method E m(B x){ return x.f1.f2; } is given. The compilation algorithm generates the polymorphic
typed Java expressions E m(B x){ return [[x:B].f1: ].f2: ; }, where and are type variables. In this
system m is applicable to instances of the class B with the field f1 with the type , where must have a field
f2 with the type and the constraint E. In this approach B and E are still nominal types.
We generalize this approach, such that also untyped methods like m(x){ return x.f1.f2; } can be compiled,
that means the type of x and the return type are type variables, too.</p>
      <sec id="sec-1-1">
        <title>The idea</title>
        <sec id="sec-1-1-1">
          <title>Input syntax :</title>
          <p>L ::= class C extends (CT) ff; Mg
M ::= m(x)f return e; g
NCT ::= CT j C&lt;TVar = CT&gt;
CT ::= C&lt;CT&gt;
e ::= x j e:f j e:m(e) j new NCT(e) j (CT)e
CONS ::= T extends T</p>
        </sec>
        <sec id="sec-1-1-2">
          <title>Output syntax :</title>
          <p>Lt</p>
          <p>::= I CLt
T
Mt
et
I</p>
          <p>::= CT j TVar</p>
          <p>MH ::= T m(T x);</p>
          <p>CLt ::= class C&lt;TVar&gt;[CONS] extends (CT) fT f; Mtg
::= MH f return et; g
::= x : T j e:f : T j e:m(e) : T j new NCT(e) : CT j (CT)e : CT
::= interface I&lt;TVar&gt;fT f; MHg
There are some extensions in comparison to usual Java. The class declarations in the output syntax have the form
class C&lt;TVar&gt; [CONS]. TVar are the generics and [CONS] is a set of subtype constraints T extends T0, that
must fulfill all instances of the class. In any class there is an implicit constructor with all fields (including them
from the superclasses) as arguments. There is no differentiation between extends and implements declarations.
Both are declared by extends. Interfaces can have fields. Furthermore, the use of the new-statement is allowed
without assigning all generics. This is done by the syntax C&lt;TVar = CT&gt;. The not assigned generics are derived
by the type inference algorithm.
3</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>The algorithm</title>
      <p>The algorithm TI consists of three parts. First the function TYPE inserts types (usually type variables) to all
sub-terms and collects constraints. Second, the function construct generates the interfaces and completes the
constraints. Finally, the function solve unifies the type constraints and applies the unifier to the class.
In the following definition we give the different forms of constraints, that are collected in TYPE. This definition
is oriented at [ADDZ05]:</p>
      <sec id="sec-2-1">
        <title>Definition 1 (Type constraints)</title>
        <p>c l c0 means c is a subtype of c0.</p>
        <p>( c; f; c0 ) means c provides a field f with type c0.</p>
        <p>( c; m; c; (c0; c0) ) means c provides a method m applicable to arguments of type c, with return type c0 and
parameters of type c0.</p>
        <p>Note that ( c; m; c; (c0; c0) ) implicitly includes the constraints c
c0.</p>
        <p>Let &lt; be the extends relation defined by the Java declarations und
the corresponding subtyping relation.</p>
      </sec>
      <sec id="sec-2-2">
        <title>The type–inference algorithm</title>
        <p>Let TypeAssumptions be a set of assumptions, that can consists of assumptions for fields, methods and whole
classes with fields and methods. The functions fields and mtype extracts the typed fields respectively the typed
methods from a given class, as in [IPW01].</p>
        <p>In the type inference algorithm we use the following name conventions for type variables:
fA: Type variable for the field f in the class A.
mA;i; Am;i: Type variable for the i-th argument of the method m in the class A.</p>
        <p>Am; Am: is an abbreviation for the tuple Am;1; : : : ; Am;n respectively Am;1; : : : ; Am;n.
mA: Type variable for the return type of the method m in the class A.</p>
      </sec>
      <sec id="sec-2-3">
        <title>The main function TI</title>
        <p>The main function TI calls the three functions TYPE, construct, and solve. The input is a set of type
assumptions TypeAssumptions and an untyped Java class L. The result Lt is the typed Java class extended by
a set of interfaces.</p>
        <p>TI: TypeAssumptions</p>
        <p>L ! Lt
TI ( Ass; class A extends B f f; M g ) =
let
(clt; C) = Type( Ass; cl )
(I1 : : : Im clt) = construct( clt; C )
TYPE( Ass; class A extends B f f; M g ) = let
fass := f this:f : fA j f 2 f g [ f this:f : T j T f 2 f ields( B ) g
mass := f this:m : Am ! Am j m(x)f return e; g 2 M g [ f this:m : aty ! rty j mtype( m; B ) = aty ! rty g
AssAll = Ass [ fass [ mass [ f this : A g
For m(x)f return e; g 2 mM;jf
Ass = AssAll [ f xj : A j x = x1 : : : xni g
(et : rty; C0) = TYPEExpr( Ass; e )
C = C [ C0[ mA 7! rty]g</p>
        <p>Mt = f rty m( Am x)f return et; g j m(x)f return e; g 2 M g
in(class A extends B f A f; Mt g; C)
The function TYPEExpr inserts types into the expressions and collects the corresponding constraints. The
function TYPEExpr is given for all cases of expressions e. In the following we present TYPEExpr for the
both most important cases, the method-call and the new-statement.</p>
        <p>TYPEExpr for Method-call: First, the types of the receiver and the arguments are determined, recursively.
Then it is differed between methods with and without known receiver types. In the known case a subtype relation
is introduced. Otherwise a constraint is generated that demands a corresponding method in the type.</p>
        <p>in
TYPEExpr( Ass; e0:m(e) ) = let
(e0t : ty0; C0) = TYPEExpr( Ass; e0 )
(eit : tyi; Ci) = TYPEExpr( Ass; ei ); 81 6 i 6 n
if (ty0 is no type variable) &amp;&amp; (ty0 2 Ass) &amp;&amp; (mtype( m; ty0 ) = aty ! rty) then</p>
        <p>((e0t : ty0):m(e1t : ty1; : : : ; ent : tyn) : rty; (C0 [ Si Ci) [ f ty l aty g)
else</p>
        <p>((e0t : ty0):m(e1t : ty1; : : : ; ent : tyn) : mty0 ; (C0 [ Si Ci) [ f ( ty0; m; ty; ( mty0 ; tmy0 ) ) g)
TYPEExpr for the new-statement: The use of the new-statement is allowed without assigning all generics.
This is done by the syntax C&lt;TVar = CT&gt;. First, fresh type variables are introduced in the assumptions of
the corresponding class. Then the types of the arguments are determined. Finally, the assigned generics are
introduced and the subtype relations between the argument types and the fields of the class and its super classes
are added.</p>
        <p>TYPEExpr( Ass [ f class A&lt;T&gt;[CA] extends B fTA f; Mtg g;new A&lt;S&gt;( e ) ) =
where S = [T (1) = 1; : : : ; T (k) = k] with k n for jTj = n
let</p>
        <p>fresh type variables, that substitute T in class A
S0 = S[T 7! ]
in
(eit : tyi; Ci) = TYPEExpr( Ass; ei ); 81 6 i 6 m
(new A&lt;S0&gt;( e1t : ty1; : : : ; emt : tym ) : A&lt; [ 7!
where elds( B ) = TB g
j =
2 S0]&gt;; (Si Ci) [ CA[ 7! ] [ f ty l TB TA[ 7! ] g</p>
      </sec>
      <sec id="sec-2-4">
        <title>The function construct</title>
        <p>The function construct takes the result from TYPE, a typed class and a set of constraints. It generates for
any type ty1, ty2 occuring in constraints ( ty1; f; ) or ( ty2; m; ; ( ; ) ) corresponding interfaces with the
demanded fields respectively methods:
interface ty1&lt; ..., , ...&gt; {</p>
        <p>f;
},
interface ty2 &lt;..., ; , ...&gt; {</p>
        <p>m( x1);
introduces fresh type variables X1, X2 and constraints, that have to implement these interfaces: X1 l ty1&lt;...&gt;;
X2 l ty2&lt;...&gt;. Finally, the occuring type variables are introduced as generics of the class.</p>
      </sec>
      <sec id="sec-2-5">
        <title>The function solve</title>
        <p>The function solve takes the result of construct and solves the constraints of the class by the type unification
algorithm from [Plü09], such that the constraints contains only pairs with at least one type variable.
Now we give an example, that shows first a structural typing of a class independent from any environment. Then
a concrete implementation of this class is given.</p>
        <p>Example 2 In this example we print the input syntax (user written) in black and the output syntax
(automatically generated) in gray. Let the following class be given
class A {</p>
        <p>mt(x, y, z) { return x.sub(y).add(z); }
}
The result of construct( clt; C ) is (with fresh type variables):
The result of TYPE is: class A { adsdumbt;1 mt( Amt;1 x, Amt;2 y,</p>
        <p>A
et = [[[x : Amt;1]:sub([y : Amt;2]) : sumbt;1 ]:add(z : Amt;3) : add
sub ] and</p>
        <p>A mAt;1
C = f ( Amt;1; sub; Amt;2; ( sumAbt;1 ; sumAbt;1;1 ) ); ( sumAbt;1 ; add; Amt;3; ( adsdumAbt;1 ; adsdu;mAb1t;1 ) ) g</p>
        <p>mt;1 &lt;Gamma_m, Beta_m&gt; { Gamma_m sub(Beta_m x); }
interface A
interface sumbt;1 &lt;Gamma_n, Beta_n&gt; { Gamma_n add(Beta_n x); }</p>
        <p>A
6 mt( 1 x, 3 y, 4 z) { return x.sub(y).add(z); }
class A &lt; 1, 2, 3, 4, 5, 6, 7&gt;[ 3 extends 5, 4 extends 7, 1 extends Amt;1&lt; 2, 5&gt;, 2 extends sumbt;1 &lt; 6, 7&gt;]{
A
}
As the application of solve changes nothing, this is the result of TI’s application.</p>
        <p>In the following we show, as an instance of the type inferred class can be used.</p>
        <p>For this implementations of mAt;1 and sumbt;1 must be given:</p>
        <p>A
mt;3 z) { return et; } }, with</p>
        <p>A
class myInteger extends Amt;1&lt;myInteger, myInteger&gt;, sumbt;1 &lt;myInteger, myInteger&gt; {</p>
        <p>A
Integer i;
myInteger sub(myInteger x) { return new myInteger(i - x.i); }
myInteger add(myInteger x) { return new myInteger(i + x.i); } }
In the class Main an instance of A is used and the method mt is called.
class Main {
main() { return new A&lt; 1=myInteger, 2=myInteger&gt;()</p>
        <p>.mt(new myInteger(2), new myInteger(1), new myInteger(3)); } }
myInteger l 3; myInteger l 4; g
The mappings 1=myInteger, 2=myInteger means that 1 and 2 are instantiated and all other parameters
of A should be inferred by the type inference algorithm TI. We call TI for Main with the set of assumptions
consisting of the class A and the class myInteger.</p>
        <p>The constraint set of the result of TYPE is given as</p>
        <p>mt;1&lt;myInteger; 5&gt;; myInteger l sumbt;1 &lt; 6; 7&gt;;
Cmain = f 3 l 5; 4 l 7; myInteger l A
A
The functions construct adds no interfaces, as there is no call of abstract fields or methods.
In solve Cmain is unified. The result of the unification is:</p>
        <p>= f 5 7! myInteger; 6 7! myInteger; 7 7! myInteger; 3 7! myInteger; 4 7! myInteger g
The resulting Java class is given as:
class Main {
myInteger main() {
return new A&lt;myInteger, myInteger, myInteger, myInteger, myInteger, myInteger, myInteger&gt;()
.mt(new myInteger(2), new myInteger(1), new myInteger(3)); }
}
4</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Summary References</title>
      <p>We have presented a type inference algorithm for a Java-like language. The algorithm allows to declare type-less
Java classes independently from any environment. This allows separate compilation of Java classes without
relying on type informations of other classes. The algorithm infers structural types, that are given as generated
interfaces. The instances have to implement these interfaces.
[IPW01]
[Plü15]</p>
      <p>Atsushi Igarashi, Benjamin C Pierce, and Philip Wadler. Featherweight Java: a minimal core calculus
for Java and GJ. ACM Transactions on Programming Languages and Systems (TOPLAS), 23(3):396–
450, 2001.</p>
      <p>Martin Plümicke. More Type Inference in Java 8. In Andrei Voronkov and Irina Virbitskaite, editors,
Perspectives of System Informatics - 9th International Ershov Informatics Conference, PSI 2014,
St. Petersburg, Russia, June 24-27, 2014. Revised Selected Papers, volume 8974 of Lecture Notes in
Computer Science, pages 248–256. Springer, 2015.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [ADDZ05]
          <string-name>
            <given-names>Davide</given-names>
            <surname>Ancona</surname>
          </string-name>
          , Ferruccio Damiani, Sophia Drossopoulou, and
          <string-name>
            <given-names>Elena</given-names>
            <surname>Zucca</surname>
          </string-name>
          . Polymorphic Bytecode:
          <article-title>Compositional Compilation for Java-like Languages</article-title>
          .
          <source>In Proceedings of the 32nd ACM SIGPLANSIGACT Symposium on Principles of Programming Languages, POPL '05</source>
          , pages
          <fpage>26</fpage>
          -
          <lpage>37</lpage>
          , New York, NY, USA,
          <year>2005</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[Plü09]</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>