<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Movie Database Case: An EMF-INCQUERY Solution</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gábor Szárnyas</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oszkár Semeráth</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benedek Izsó</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Csaba Debreceni</string-name>
          <email>debreceni@mit.bme.hu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ábel Hegedüs Zoltán Ujhelyi Gábor Bergmann Budapest University of Technology and Economics, Department of Measurement and Information Systems</institution>
          ,
          <addr-line>H-1117 Magyar tudósok krt. 2., Budapest</addr-line>
          ,
          <country country="HU">Hungary</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2014</year>
      </pub-date>
      <abstract>
        <p>This paper presents a solution for the Movie Database Case of the Transformation Tool Contest 2014, using EMF-INCQUERY and Xtend for implementing the model transformation. Automated model transformations are frequently integrated to modeling environments, requiring both high performance and a concise programming interface to support software engineers. The objective of the EMF-INCQUERY [2] framework is to provide a declarative way to define queries over EMF models. EMF-INCQUERY extended the pattern language of VIATRA with new features (including transitive closure, role navigation, match count) and tailored it to EMF models [1]. EMF-INCQUERY is developed with a focus on incremental query evaluation. The latest developments extend this concept by providing a preliminary rule execution engine to perform transformations. As the engine is under heavy development, the design of a dedicated rule language (instead of using the API of the engine) is currently subject to future work. Conceptually, the environment relies on graph transformation (GT) rules: conditions are specified as EMF-INCQUERY patterns, while model manipulation and environment configuration is managed using the Xtend language [3]. One case study of the 2014 Transformation Tool Contest describes a movie database transformation [4]. The main characteristics of the transformation related to the application of EMF-INCQUERY are that i) it only adds new elements to the input model (i.e. couple and group are does not modify the input model), and ii) it is non-incremental (i.e. creating a new group will not affect rule applicability). The rest of the paper is structured as follows: Section 2 gives an overview of the implementation, Section 3 describes the solution including measurement results, and Section 4 concludes our paper.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>The overview of the rule-based solution is illustrated in Figure 1a. The input of the transformation is a
movie model. The result is a transformed movie model including various groups (couples and n-cliques)
and their average rating [4]. The transformation runs in a Java application, that uses pattern matchers
provided by EMF-INCQUERY and model manipulation specified in Xtend. The pattern matcher monitors</p>
      <p>This work was partially supported by the MONDO (EU ICT-611125) and TÁMOP (4.2.2.B-10/1–2010-0009) projects.
This research was realized in the frames of TÁMOP 4.2.4. A/1-11-1-2012-0001 „National Excellence Program – Elaborating
and operating an inland student and researcher personal support system”. The project was subsidized by the European Union
and co-financed by the European Social Fund.
Movie model</p>
      <p>Java application</p>
      <p>Rule
condition</p>
      <p>Rule
consequence
Pattern
matcher
« notifies » Ruleeenxgeinceution</p>
      <p>EMF
« monitors » resources
« modifies »</p>
      <p>Transformed
movie model</p>
      <p>ContainedElement children 0.*</p>
      <p>Root
&lt;&lt;enumeration&gt;&gt;</p>
      <p>MovieType
MOVIE
TV
VVIIDDEEOOGAME
triattleinGg:MGGE:oGSvEtireDinoguble c0o.m*monMovies
yearG:GEInt
typeG:GMovieType
movies 0.*
personsPerso0n.* p1 0.1
nameG:GEString
p2 0.1
0.*
persons</p>
      <p>Actress
Actor</p>
      <p>Group
avgRatingG:GEDouble
EAttribute0
Couple</p>
      <p>Clique
(a) The specification and runtime
(b) Our extended metamodel
the resources to incrementally update match sets. The application initially reads the input movie database,
creates the output resources, then executes the transformation, and finally serializes the results into files.</p>
      <p>In the Ecore model of the specification, no containment hierarchy is used and all objects are held in
the contents list of the EMF resource. However, the performance of the transformation was affected by
the resource implementation used (since it will determine the implementation of the list operations). To
avoid this issue, we have extended the metamodel by a Root object (see Figure 1b). This object serves
as a container for all Group, Movie and Person objects. According to our experiments, this increases the
speed of the pattern matching by a factor of two. For compatibility with the specification, we ensured that
our solution works with the models provided and persists outputs in a format without this root element.</p>
    </sec>
    <sec id="sec-2">
      <title>3 Solution</title>
      <p>3.1</p>
      <sec id="sec-2-1">
        <title>Patterns and Transformations</title>
        <p>Task 1: Generating Test Data The synthetic test data is generated in Xtend (see Listing A.2.1). The
code tightly follows the specification defined in the case description [4].</p>
        <sec id="sec-2-1-1">
          <title>Task 2: Finding Couples</title>
          <p>Couples are listed with the following pattern:
1 pattern personsToCouple ( p1name , p2name ) {
2 find cast ( p1name , M1 ); find cast ( p2name , M1 );
3 find cast ( p1name , M2 ); find cast ( p2name , M2 );
4 find cast ( p1name , M3 ); find cast ( p2name , M3 );
5 M1 != M2 ; M2 != M3 ; M1 != M3 ;
6 check ( p1name &lt; p2name );
7 }
8 pattern cast ( name , M) { Movie . persons . name (M , name ); }
9 pattern personName (p , pName ) { Person . name (p , pName ); }</p>
          <p>Note that the cast pattern returns the names of persons that play in a given movie. This is important
since the names of the persons can be used to avoid symmetric matches in the personsToCouple pattern
by sorting. The Couple objects are created and configured in Xtend (see createCouples in line 45 of
Listing A.2.2). This includes setting the p1 and p2 references using a personName pattern and computing
the commonMovies by simple set intersection operators (retainAll).
Task 3: Computing Average Rankings The average rankings are computed in Xtend by calculating
the mean of the rating attributes of a couple’s common movies (see calculateAvgRatings in line 122 of
Listing A.2.2). The movies are enumerated with the following pattern:
1 pattern commonMoviesOfCouple (c , m) { Couple . commonMovies (c , m); }</p>
          <p>Extension Task 1: Compute Top-15 Couples This task is mostly implemented in Xtend (see
topGroupByRating in line 70 and topGroupByCommonMovies in line 84 of Listing A.2.2), however, it uses
the groupSize pattern in order to filter the groups with the particular number of members.
1 pattern groupSize ( group , S) {
2 Group ( group );
3 S == count find memberOfGroup (_ , group );
4 }</p>
          <p>This pattern uses the count find construct which computes the number of matches for a given pattern.
Additionally, specific comparators are used to sort and determine the top-15 lists by rating or number of
common movies (see Listing A.2.4).</p>
          <p>Extension Task 2: Finding Cliques The pattern for finding cliques is implemented similarly to the
personsToCouple pattern 3.1. The pattern for 3-cliques is defined as follows:
1 pattern personsTo3Clique (P1 , P2 , P3 ) {
2 find cast (P1 , M1 ); find cast (P2 , M1 ); find cast (P3 , M1 );
3 find cast (P1 , M2 ); find cast (P2 , M2 ); find cast (P3 , M2 );
4 find cast (P1 , M3 ); find cast (P2 , M3 ); find cast (P3 , M3 );
5 M1 != M2 ; M2 != M3 ; M1 != M3 ;
6 check ( P1 &lt; P2 ); check ( P2 &lt; P3 );
7 check ( P1 &lt; P3 );
8 }</p>
          <p>The creation of cliques is done similarly to couples (see createCliques in line 138 of Listing A.2.2).
However, this pattern has a redundant check constraint, as P1 &lt; P2 and P2 &lt; P3 already imply P1 &lt; P .
3
This works as a hint for the query engine and allows it to filter the permutation of the results (e.g.
(a2; a1; a3); (a1; a3; a2); : : :)) earlier.</p>
          <p>For performance considerations, additional patterns were defined manually for 4- and 5-cliques. For
larger cliques (n &gt; 5), patterns could be automatically generated using code generation techniques.
General solution for n-cliques. We also provide the outline for a more general solution (for arbitrary
n values). For the sake of clarity, we will refer to couples as 2-cliques. In this approach, the cliques are
built iteratively. Suppose we already have all k-cliques in the graph (e.g. we already added the 2-, 3-,
4and 5-cliques with the previous patterns). To get the (k + 1)-cliques, we look for a group g0 and a person
p0 that (i) have at least 3 movies in common, (ii) g = g0 [ fp0g is a group that is not a subset of any other
groups (see Figure 2).</p>
          <p>Formally, (ii) can be expressed as (6 9g0) : g g0. Using g = g0 [ fp0g, we derive the following
expression (6 9g0) : (g0 g0) ^ (p0 2 g0). The g0 g0 expression can be formulated as follows: (8p 2 g0) :
p 2 g0. As the EMF-INCQUERY Pattern Language does not have a universal quantifier, we rewrite this
using the existential quantifier: (6 9p 2 g0) : p 62 g0.</p>
          <p>The resulting expression for condition (ii) is the following: (6 9g0) : ((6 9p 2 g0) : p 62 g0) ^ (p0 2 g0).
We have implemented this general solution (see Listing A.2.3). Pattern subsetOfGroup implements
condition (ii), while nextClique pattern is capable of determining the (k + 1)-cliques given a model
m
m
m
m</p>
          <p>m
g0</p>
          <p>p0
g
containing all k-cliques. This approach is functionally correct, however, it only works for very small
input models and hence is omitted from our measurements.</p>
          <p>Extension Task 3: The average rankings are computed the same way as in task 3.
Extension Task 4: The top 15 average rankings are computed the same way as in extension task 2.
3.2</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>Optimizations</title>
        <p>
          To increase the performance of the transformations, we carried out some optimizations. (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) The
common movies of the two Person objects are computed from Xtend instead of EMF-INCQUERY. (
          <xref ref-type="bibr" rid="ref2">2</xref>
          ) The
patterns for 3-, 4- and 5-cliques are implemented manually. (
          <xref ref-type="bibr" rid="ref3">3</xref>
          ) Common subpatterns were identified and
extracted them into separate patterns, as the engine can reuse the pattern for each occurrence, and makes
the query definition file easier to maintain. For an example, see the cast pattern in A.1.
3.3
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Benchmark Results</title>
        <p>The implementation was benchmarked in the SHARE cloud, on an Ubuntu 12.04 64-bit operating system
running in a VirtualBox environment. The virtual machine used one core of an Intel Xeon E5-2650 CPU
and had 6 GB of RAM. The transformations were ran in a timeout window of 10 minutes.
3.4</p>
      </sec>
      <sec id="sec-2-4">
        <title>Synthetic model</title>
        <p>Results are displayed in Figure 3. The diagram shows the transformation times for creating couples and
cliques for synthetic models. The results show that the transformations run in near linear time.</p>
        <p>The dominating factor of the running time is the initialization of the query engine. However, after
initialization, creating groups can be carried out efficiently. Furthermore, our experiments showed that
the limiting factor for our solution is the memory consumption of the incremental query engine. Given
more memory, the solution is capable of transforming larger models as well.
3.5</p>
      </sec>
      <sec id="sec-2-5">
        <title>IMDb model</title>
        <p>In the given time range and memory constraints, the transformation of the IMDb model could only
generate the couples and 3-cliques for the smallest instance model. Finding the couples took 3 minutes,</p>
        <p>Runtime of Transformation Tasks
while finding 3-cliques took 6. However, in case of a live and evolving model, our solution is capable of
incrementally running the transformation which in practice results in near instantaneous response time.
Our solution was developed as Eclipse plug-ins, however, it is also available as a command line
application compiled using the Apache Maven.The transformation runs correctly for the provided test cases on
SHARE1, and the source code is also available in a Git repository2. The results of the transformations
were spot-checked for both synthetic and IMDb models.
4</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Conclusion</title>
      <p>In this paper we have presented our implementation of the Movie Database Case. The solution uses
EMF-INCQUERY as a model query engine: the transformation is specified using declarative graph
pattern queries over EMF models for rule preconditions, and Xtend code for model manipulations. The
main conclusion of the performance evaluation is that EMF-IncQuery’s incremental approach is not a
good fit for this case study as the transformation is very model manipulation dominant.</p>
      <p>1http://is.ieis.tue.nl/staff/pvgorp/share/?page=ConfigureNewSession&amp;vdi=Ubuntu12LTS_TTC14_
64bit_TTC14-EIQ-imdb.vdi</p>
      <p>2https://git.inf.mit.bme.hu/w?p=projects/viatra/ttc14-eiq.git (username: anonymous, no password).
A Appendix – Movie Database Case Transformation Code
A.1 EMF-INCQUERY Graph Patterns
check ( P1 &lt; P2 ); check ( P2 &lt; P3 ); check ( P3 &lt; P4 ); check ( P4 &lt; P5 );</p>
      <sec id="sec-3-1">
        <title>A.2 Xtend Code</title>
        <sec id="sec-3-1-1">
          <title>A.2.1 Generator Code</title>
          <p>// create a positive match for the test case
// initialize some movies and actors / actresses</p>
          <p>// create interconnections according to a logic that will yield a positive match
def createPositive ( int n) {
val movies = newArrayList ()
(0 .. 4) . forEach [ movies += createMovie (10 * n + it )]
// We define this extension to help with model element creation
extension MoviesFactory = MoviesFactory . eINSTANCE
// The EMF resource on which the transformation operates
public Resource r
// create a positive match for the test case
// initialize some movies and actors / actresses</p>
          <p>
            // create interconnections according to a logic that will yield a negative match
def createNegative ( int n) {
val movies = newArrayList ()
(5 .. 9) . forEach [ movies += createMovie (10 * n + it )]
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98 }
val actors = #[a , b , c , d , e]
movies . get (0) . persons += #[a , b]
movies . get (
            <xref ref-type="bibr" rid="ref1">1</xref>
            ) . persons += #[a , b , c]
movies . get (
            <xref ref-type="bibr" rid="ref2">2</xref>
            ) . persons += #[ b , c , d]
movies . get (
            <xref ref-type="bibr" rid="ref3">3</xref>
            ) . persons += #[ c , d , e]
movies . get (
            <xref ref-type="bibr" rid="ref4">4</xref>
            ) . persons += #[ d , e]
r. contents += actors
r. contents += movies
          </p>
          <p>A.2.2 Transformation Code
println ("# of couples = " + newCouples . size )
engine . dispose
addElementsToResource ( newCouples );
// calculate the top group by rating
def topGroupByRating ( int size ) {
println ("Top -15 by Average Rating ")
println (" ======================== ")
val n = 15;
* Helper function to add elements to the target resource .
* @param
*/
def addElementToResource ( ContainedElement containedElement ) {</p>
          <p>root . children . add ( containedElement )
val engine = IncQueryEngine .on(r)
val coupleWithRatingMatcher = engine . groupSize
val rankedCouples = coupleWithRatingMatcher . getAllValuesOfgroup ( size ). sort (
new GroupAVGComparator )
printCouples (n, rankedCouples )
// calculate the top group by common movies
def topGroupByCommonMovies ( int size ) {
println ("Top -15 by Number of Common Movies ")
println (" ================================= ")
val n = 15;
val engine = IncQueryEngine .on(r)
val coupleWithRatingMatcher = engine . groupSize
val rankedCouples = coupleWithRatingMatcher . getAllValuesOfgroup ( size ). sort (
new GroupSizeComparator
)
printCouples (n, rankedCouples )
// pretty - print couples
def printCouples ( int n, List &lt;Group &gt; rankedCouples ) {
(0 .. n - 1). forEach [
if(it &lt; rankedCouples . size ) {
val c = rankedCouples . get (it);
println (c. printGroup (it))
}
val n = commonMovies . size
group . avgRating = sumRating / n
// create cliques
public def createCliques ( int cliques ) {
val engine = AdvancedIncQueryEngine . createUnmanagedEngine (r)
val personMatcher = getPersonName ( engine )
var Collection &lt; Clique &gt; newCliques
if( cliques == 3) {
val clique3 = getPersonsTo3Clique ( engine )
newCliques = clique3 . allMatches . map [x| generateClique (
personMatcher . getOneArbitraryMatch (null ,x.p1).p,
personMatcher . getOneArbitraryMatch (null ,x.p2).p,
personMatcher . getOneArbitraryMatch (null ,x.p3).p)]. toList ;
}
else if( cliques == 4) {</p>
          <p>val clique4 = getPersonsTo4Clique ( engine )
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197 }
newCliques = clique4 . allMatches . map [x| generateClique (
personMatcher . getOneArbitraryMatch (null ,x.p1).p,
personMatcher . getOneArbitraryMatch (null ,x.p2).p,
personMatcher . getOneArbitraryMatch (null ,x.p3).p,
personMatcher . getOneArbitraryMatch (null ,x.p4).p)]. toList ;
}
else if( cliques == 5) {
val clique5 = getPersonsTo5Clique ( engine )
newCliques = clique5 . allMatches . map [x| generateClique (
personMatcher . getOneArbitraryMatch (null ,x.p1).p,
personMatcher . getOneArbitraryMatch (null ,x.p2).p,
personMatcher . getOneArbitraryMatch (null ,x.p3).p,
personMatcher . getOneArbitraryMatch (null ,x.p4).p,
personMatcher . getOneArbitraryMatch (null ,x.p5).p)]. toList ;
The resulting expression for condition (ii) is the following: (6 9g0) : ((6 9p0 2 g0) : p0 62 g0) ^ (p 2 g0).</p>
          <p>This is equivalent to the following EMF-INCQUERY pattern:
1 /** Group g0 is a subset of Group gx. */
2 pattern subsetOfGroup (g0 : Group , gx : Group ) {
3 neg find notSubsetOfGroup (p0 , g0 , gx);
4 }
5
6 /** This pattern returns is a helper for the subsetOfGroup pattern . */
7 pattern notSubsetOfGroup (p0 : Person , g0 : Group , gx : Group ) {
8 find memberOfGroup (p0 , g0);
9 neg find memberOfGroup (p0 , gx);
10 }
11
12 /** Person p is a member of Group g. A Group is either a Couple or a Clique . */
13 pattern memberOfGroup (p, g) {
14 Couple .p1(g, p);
15 } or {
16 Couple . p2 (g , p);
17 } or {
18 Clique . persons (g , p);
19 }</p>
          <p>Based on the subsetOfGroup pattern, we may implement the nextClique pattern like follows:
1 /* * the nextCliques pattern */
2 pattern nextCliques (g : Group , p : Person ) {
3 neg find alphabeticallyLaterMemberOfGroup (g , p);
4 n == count find commonMovieOfGroupAndPerson (g , p , m);
5 check (n &gt;= 3) ;
6 neg find union (g , p);
7 }
8
9 /* * p is a member of g for which another alphabetically previous member exists */
10 pattern alphabeticallyLaterMemberOfGroup (g : Group , p : Person ) {
11 find memberOfGroup (m , g);
12 Person . name (p , pName );
13 Person . name (m , mName );
14 check ( mName &gt;= pName );
15 }
16
17 /* * m is a common movie of g and p */
18 pattern commonMovieOfGroupAndPerson (g , p , m) {
19 find commonMoviesOfGroup (g , m);
20 Person . movies (p , m);
21 }
22
23 /* * m is a common movie of g */
24 pattern commonMoviesOfGroup (g , m) {
25 Group . commonMovies (g , m);
26 }
27
28 /* * p is in g0 */
29 pattern union (g0 , p) {
30 find memberOfGroup (p , gx );
31 find subsetOfGroup (g0 , gx );
32 }</p>
          <p>A.2.4 Comparator Code for Top-15
1 class GroupSizeComparator implements Comparator &lt; Group &gt;{
2
3
4
5
6
7 }
8 }
9
10 class GroupAVGComparator implements Comparator &lt; Group &gt;{
11
12
13
14
15
16 }
17 }</p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Gábor</given-names>
            <surname>Bergmann</surname>
          </string-name>
          , Zoltán Ujhelyi, István Ráth &amp; Dániel
          <string-name>
            <surname>Varró</surname>
          </string-name>
          (
          <year>2011</year>
          )
          <article-title>: A Graph Query Language for EMF models</article-title>
          .
          <source>In: Theory and Practice of Model Transformations, Fourth Int. Conf., LNCS 6707</source>
          , Springer.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Eclipse</surname>
          </string-name>
          .org (
          <year>2014</year>
          )
          <article-title>: EMF-IncQuery</article-title>
          . http://eclipse.org/incquery/.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Eclipse</surname>
          </string-name>
          .org (
          <year>2014</year>
          ): Xtend - Modernized
          <string-name>
            <surname>Java</surname>
          </string-name>
          . https://www.eclipse.org/xtend/.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Matthias</given-names>
            <surname>Tichy Tassilo Horn</surname>
          </string-name>
          , Christian
          <string-name>
            <surname>Krause</surname>
          </string-name>
          (
          <year>2014</year>
          ):
          <article-title>The TTC 2014 Movie Database Case</article-title>
          .
          <source>In: 7th Transformation Tool Contest (TTC</source>
          <year>2014</year>
          ), EPTCS.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>