<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Adding overloading to Java type inference</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andreas Stadelmeier</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Plumicke</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Baden-Wurttemberg Cooperative State University Stuttgart Department of Computer Science Florianstra e 15</institution>
          ,
          <addr-line>D-72160 Horb</addr-line>
        </aff>
      </contrib-group>
      <fpage>127</fpage>
      <lpage>132</lpage>
      <abstract>
        <p>void m() f Function&lt;Function&lt;Integer, Integer&gt;, IntVector&gt; map1; map1 = f -&gt; this.stream().map(f) .collect(IntVector::new, IntVector::add, IntVector::addAll); Function&lt;Function&lt;Integer, Integer&gt;,Vector&lt;Integer&gt;&gt; map2; map2 = map1; //not correct(!) g g There are two local variables map1 and map2, which represent a function which takes an (Integer! Integer)-function and gives elements of di erent result types. The rst result type is IntVector and the second is its supertype Vector&lt;Integer&gt;. But the assignment map2 = map1; is not correct in Java, Copyright c by the paper's authors. Copying permitted for private and academic purposes. Submission to: 8. Arbeitstagung Programmiersprachen, Dresden, Germany, 18-Mar-2015, to appear at http://ceur-ws.org</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Zusammenfassung</title>
      <p>In this paper we extend our Java with type inference by adding
methods. Functions had been realized as lambda expressions
dened in elds until now, which led to the restrictions that no
overloading is available. Therefore the main challenge of adding
methods is to deal with overloading. We present the change of
the data-structures and the algorithm.</p>
    </sec>
    <sec id="sec-2">
      <title>1 Introduction</title>
      <p>In [Plu11, Plu14] we have proposed to introduce a type inference system for Java. The reason for
developing such a system are the often complex and sometimes confusing principal types of
Javaexpressions with functional interfaces and wildcards. Let us consider the following example de ning a
lazy map-function for a declared class.</p>
      <p>import java.util.*;
import java.util.function.*;
import java.util.stream.*;
class IntVector extends Vector&lt;Integer&gt; f
although in the view of type theory the type of map1 is a subtype of the type of map2. The reason is that
the function constructor is normally contravariant in the argument types and covariant in the result
types. In Java contra- and covariance have to be de ned by wildcards. This means for a declaration of
map2 with the principal type</p>
      <p>Function&lt;? super Function&lt;Integer,Integer&gt;,? extends Vector&lt;Integer&gt;&gt; map2;
the assignment map2 = map1; would be correct. Our type inference system allows to leave out the types
such that problems like this do not arise any longer. The principal types are inferred automatically.
The program looks then like
class IntVector extends Vector&lt;Integer&gt; f
void m() f
map1;
map1 = f -&gt; this.stream().map(f)</p>
      <p>.collect(IntVector::new, IntVector::add, IntVector::addAll);
Function&lt;Function&lt;Integer, Integer&gt;,Vector&lt;Integer&gt;&gt; map2;
map2 = map1; //correct
g</p>
      <p>g
Our type inference algorithm works on a subset of the Java 8 programming language. The subset of the
Java language supports mostly all features of Java 8 including generics and lambda expressions. Only
methods are simulated by lambda expressions of elds. This leads to the restriction that overloading is
impossible. Furthermore every type declaration, which is necessary in Java 8, is optional in the given
subset. The type inference algorithm is able to infer all possible types for a missing type declaration in
the source code.</p>
      <p>To publish a workable implementation of the algorithm it ships bundled with an Eclipse plugin [SP14].
The Eclipse plugin is able to parse the Java subset and inject missing types into the source to generate
valid Java 8 source code. Afterwards a Java 8 compiler can be used to compile the inferred source.
Due to this approach only one type solution is possible, because the standard Java compiler needs an
explicit typed Java source code as input. Every type solution is uni ed to a most general version.
At the moment our algorithm excludes methods and method invocation. We will consider the addition
of them in this paper. The main challenge is the consideration of overloading and overriding.
The paper is structured as follows: First we consider the basic structure of the type inference algorithm.
Then we consider the extensions using overloading. In the third section the consequences for the whole
algorithm are considered. After that we close with a conclusion and an outlook.
2</p>
    </sec>
    <sec id="sec-3">
      <title>Basic structure of the algorithm</title>
      <p>The algorithm is integrated into a compiler for the described subset of Java 8. The following modi
cations di er this application from a standard Java compiler. The semantic check of the compiler is
replaced by the type inference algorithm. Instead of Java Bytecode the compiler generates a typed
output of the parsed source code.</p>
      <p>The type inference algorithm itself consists of two functions TYPE and SOLVE and works on the
abstract syntax tree of a parsed input. In TYPE rst type placeholders (fresh type variables) are
added for every unknown type in the syntax tree. Subsequently the TYPE-function of the algorithm
generates constraints by traversing the given tree. After that in the function SOLVE the generated
constraints are uni ed [Plu09]. The actual type inference algorithm generates a single constraint set
and therefore only a single uni cation is needed.
Every statement implies a single set of constraints. A constraint describes a correlation between a type
and type placeholders.
2.1</p>
      <sec id="sec-3-1">
        <title>Type Assumptions</title>
        <p>Preceding to the TYPE-function the algorithm generates a set of type assumptions. This set contains
the following informations: A listing of the classes available in the scope of the inferred source as well
as the elds and methods of these classes. A class assumption additionally holds information about
their subclasses.</p>
        <p>A method assumption includes the name of the method, its parameter types and its return type.
This set of type assumptions is used by the TYPE-function as well as the UNIFY-method.
3</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Overloading</title>
      <p>In the case of the descriped type inference algorithm overloading has a slightly di erent meaning than
the overloading of methods known from the Java programming language. While in normal Java only
method names can be overloaded, in Java with type inference method invocations can be overloaded
too. In case of a method call the type inference algorithm has to generate constraints out of the given
information. In an untyped environment only the name of the method and its parameter count is
known. Neither the types of the parameters nor the Java class which de nes the called method.
Therefore the algorithm searches the type assumptions for methods with the same name and number
of parameters. Every matching method is considered as a possible option. For each of these options
the TYPE-function generates a constraint set. At the time of the TYPE-function it is impossible
to determine which of the generated constraints leads to a correct constraints set. Therefore every
constraint set generated for one of the possible methods gets added to the nal constraints set. These
are bundled to an OR-constraint set (see Section 3.1).</p>
      <p>For the following cases the type inference algorithm generates multiple constraints for a single
invocation.</p>
      <p>The used method for the invocation is overloaded with more than one method, which has the same
number of parameters.</p>
      <p>There is more than one known class which implements a method with the same signature and
parameter count as the invoked one.
3.1</p>
      <sec id="sec-4-1">
        <title>Constraint Set</title>
        <p>Out of usability concerns a special data structure is implemented for a constraint set. This data structure
consists of two di erent subgroups.</p>
        <p>OR-constraint set: This set represents a set of constraint sets which are connected by OR operator.
AND-constraint set: This set represents a set of constraint sets which are connected by AND
operator. The AND-constraint set can also contain single constraints. Only this set would be needed
to collect the constraints of the TYPE-function without overloading. To support overloading the
AND-constraint set is also capable of holding OR-constraint sets.</p>
        <p>Due to the OR connected constraints this structure allows the storage of multiple constraint
compositions in a compact manner.</p>
        <p>But the UNIFY-algorithm works on a set of single constraints. So preceding to this step the described
constraints set needs a conversion to a set consisting out of single constraints. This is done by a cartesian
product operation. Every composition of the given constraints is possibly a right solution.
3.2</p>
      </sec>
      <sec id="sec-4-2">
        <title>Overloading example</title>
        <p>Let us consider the following example:
class OL f
m(Integer x) f return x + x; g
m(Boolean x) f return x || x; g
First the algorithm gives a fresh type variable to every statement or expression, which has an unknown
type for now. In the given example the fresh type variables or type placeholders (TPH) are assigned
as the following. A type placeholder (TPH) named ol is assigned to the untyped Field ol. The type
placeholder TPH B is used as the type variable for the Parameter x and the TPH C for the return type
of the main method. The TPH A represents the type of the return statement ol.m(x).
Afterwards the TYPE algorithm creates a constraint set. The generated constraints for the class Main
in this example look like the following.
(OL &lt;. TPH ol) &amp; // generated by "ol = new OL();"
//"ol.m(x)" generates following OR constraints due to overloading:
[[ (java.lang.Integer &lt;. TPH A), (TPH B &lt;. java.lang.Integer), (TPH ol &lt;. OL) ] |
[ (java.lang.Boolean &lt;. TPH A), (TPH B &lt;. java.lang.Boolean), (TPH ol &lt;. OL) ]] &amp;
(TPH A &lt;. TPH C) // TPH C is the return type of the method "main"
4</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Consequences for the whole algorithm</title>
      <p>The most time consuming operation of the type inference algorithm is the UNIFY-part. For every set
of constraints generated by the cartesian product of the OR- and AND-constraints sets the
UNIFYoperation is applied. But the size of the cartesian product grows exponentially with every overloaded
method that comes into account. Therefore the type inference algorithm becomes very slow when
inferring large untyped source code.</p>
      <p>A solution to this problem is trying to lter out incorrect constraint sets before the cartesian product
gets applied.</p>
      <p>Figure 1 shows an exemplary situation. The type assumptions in this example contains multiple
assumptions for the intValue-method. The classes Double, Integer, Float, Long and the class Example
implement a method named intValue with zero parameters. The TYPE-function generates constraints
for each of these assumptions. This happens for both of the intValue-method invocation in the given
example. So two OR constraint sets are generated which contain ve constraint sets each. This equals
25 di erent constraints after the cartesian product was built.</p>
      <p>Only one of the constraint sets can be uni ed to a valid mapping of type placeholders to Java types.
The other constraint sets hold inconsistent constraints. These constraint sets can be eliminated before
building the cartesian product, such that only one uni cation is necessary.</p>
      <p>This happens in several steps. At rst all OR constraint sets are separated from the AND constraint
sets. Every constraint in the OR constraint sets needs to pass a test. In this test the constraint set from
the OR constraints gets uni ed together with all AND constraints. If the UNIFY-method succeeds
class Example f</p>
      <p>test()f
int intValue()f
return 1;
var1;
var2;
var1 = this;
var2 = var1.intValue();
return var1.intValue();</p>
      <p>Abbildung 1: Simple source code which generates a huge amount of constraint sets
on this subset of the constraints, the OR constraint remains in its constraints set. Otherwise the tested
constraint set gets removed out of its OR constraint set. Afterwards the cartesian product can be
applied on the remaining constraints.</p>
      <p>For the example in Figure 1 this means the following. The TYPE-method generates two OR constraint
sets with ve elements each. For 10 di erent constraint sets the test and therefore the UNIFY-method
gets applied. Only one constraint for each of the OR constraint sets passes this test. So the cartesian
product returns only a single constraint set, which has to be uni ed.</p>
      <p>By sorting out inconsistend constraints from the OR constraint sets the size of the cartesian product
can be reduced.</p>
      <p>This method works best when only one of the overloaded methods is applicable. Otherwise multiple
constraints remain in the OR constraint sets which leads to a exponential increase of the resulting
cartesian product with every overloaded method call.
5</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusions and outlook</title>
      <p>In this paper we have described the addition of methods to our Java subset set with type inference, the
main challenge in this context is overloading. We showed the data-structure for storing the contraint
sets and gave an optimization possibility to reduce the complexity of the induced cartesian product.
As there is the possibility that the cartesian product still grows enormously, we have the further idea to
split the set of constraings before building the cartesian product. Each part of constraints would then
consist only of constraints that type variables interdepend, which means that only a small number of
cartesian products would be necessary.</p>
    </sec>
    <sec id="sec-7">
      <title>Literatur</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [Plu09]
          <string-name>
            <given-names>Martin</given-names>
            <surname>Plu</surname>
          </string-name>
          <article-title>micke. Java type uni cation with wildcards</article-title>
          .
          <source>In Dietmar Seipel</source>
          ,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Hanus</surname>
          </string-name>
          , and Armin Wolf, editors,
          <source>17th International Conference, INAP 2007, and 21st Workshop on Logic Programming</source>
          ,
          <source>WLP</source>
          <year>2007</year>
          , Wurzburg, Germany, October 4-
          <issue>6</issue>
          ,
          <year>2007</year>
          , Revised Selected Papers, volume
          <volume>5437</volume>
          <source>of Lecture Notes in Arti cial Intelligence</source>
          , pages
          <fpage>223</fpage>
          {
          <fpage>240</fpage>
          . Springer-Verlag Heidelberg,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Plu11]
          <string-name>
            <given-names>Martin</given-names>
            <surname>Plu</surname>
          </string-name>
          <article-title>micke. Well-typings for Java</article-title>
          .
          <source>In Proceedings of the 9th International Conference on Principles and Practice of Programming in Java, PPPJ '11</source>
          , pages
          <fpage>91</fpage>
          {
          <fpage>100</fpage>
          , New York, NY, USA,
          <year>2011</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Plu14]
          <string-name>
            <given-names>Martin</given-names>
            <surname>Plu</surname>
          </string-name>
          <article-title>micke. More type inference in Java 8</article-title>
          . In Perspectives of Systems Informatics - 9th
          <source>International Andrei Ershov Memorial Conference, PSI</source>
          <year>2014</year>
          ,
          <article-title>St</article-title>
          . Petersburg, Russia, Lecture Notes in Computer Science. Springer,
          <year>2014</year>
          . (to appear).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [SP14]
          <article-title>Andreas Stadelmeier and Martin Plumicke. Java type inference as an Eclipse plugin</article-title>
          .
          <source>In Programmiersprachen und Rechenkonzepte: 31</source>
          . Workshop der GI-Fachgruppe \
          <article-title>Programmiersprachen und Rechenkonzepte"</article-title>
          ,
          <source>Bad Honnef, 28. { 30. April</source>
          <year>2014</year>
          , Technische Berichte der TU Wien,
          <year>2014</year>
          . (to appear).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>