=Paper= {{Paper |id=Vol-3089/ttc21_paper_ocl2sql_GarciaDominguez_solution |storemode=property |title=The Epsilon Solution to the OCL2PSQL Case |pdfUrl=https://ceur-ws.org/Vol-3089/ttc21_paper12_ocl2sql_GarciaDominguez_solution.pdf |volume=Vol-3089 |authors=Antonio García-Domíguez |dblpUrl=https://dblp.org/rec/conf/ttc/Garcia-Domiguez21 }} ==The Epsilon Solution to the OCL2PSQL Case== https://ceur-ws.org/Vol-3089/ttc21_paper12_ocl2sql_GarciaDominguez_solution.pdf
The Epsilon Solution to the OCL2PSQL Case
Antonio Garcia-Dominguez1
1
    Aston University, Birmingham, UK


                                             Abstract
                                             There have been several attempts to map Object Constraint Language queries to SQL: one of these is the OCL2PSQL mapping
                                             proposed by Nguyen and Clavel. In this paper, I describe an implementation of OCL2PSQL using two languages from
                                             the Eclipse Epsilon project: 744 lines of ETL code for the model-to-model transformation itself, and 97 lines of EGL code
                                             for a model-to-text transformation that produces more readable SQL than the reference version. The solution passes all
                                             correctness tests set out in the original framework: the transformation has a median time of 0.93s across all scenarios.

                                             Keywords
                                             OCL, SQL, model transformation, abstract syntax graphs



1. Introduction                                                                                                       model-to-text transformation (the Epsilon Generation
                                                                                                                      Language), model validation, pattern matching, model
The OMG Object Constraint Language is a well known,                                                                   migration, unit testing and other tasks. The solution
standardized language for specifying constraints and                                                                  passes all correctness checks, though some minor refine-
queries in models: while typically associated with UML,                                                               ments of the proposed mappings were required.
it has been widely used for MOF-based modelling lan-                                                                    The rest of the paper is structured as follows: Section 2
guages, and particularly those based on the Ecore imple-                                                              explains the overall structure of the solution. Section 3
mentation of EMOF. Given the increasingly large size of                                                               describes the key features of the implemented model-
the models used by Model-Driven Engineering practition-                                                               to-model transformation from OCL to SQL. Section 4
ers, one common solution is to persist them in databases.                                                             describes the alternative model-to-text transformation
This has motivated attempts to map the OCL queries                                                                    that has been developed from the SQL models to textual
(written in terms of the abstract syntax of the model-                                                                SQL queries. Finally, Section 5 presents the preliminary
ing language) to SQL queries that run directly on the                                                                 performance results obtained by the solution author.
underlying relational database used to store the models.
   Some of these attempts have used imperative features
such as loops and cursors to deal with iterators, possibly                                                            2. Overall structure
reducing their compatibility across vendors (due to the
limited standardization of these features). OCL2PSQL                                                                  The Epsilon solution to the OCL2PSQL case is a Java ap-
(“OCL to pure SQL”) is an approach that provides a map-                                                               plication, using Apache Maven for its dependency man-
ping of nested iterators while staying entirely within the                                                            agement. Epsilon is usable as a standalone Java library,
broadly standardized and declarative parts of SQL [1].                                                                with stable versions available through Maven Central and
The OCL2PSQL TTC case has selected a core subset of                                                                   snapshot versions available through the OSS Sonatype
this mapping (with some erratum added since the origi-                                                                repository. The solution should work in Epsilon 2.3.0, but
nal release of the mapping), and has invited tool authors                                                             uses the latest 2.4.0 interim versions to avoid a warning
to demonstrate the usability, conciseness and ease of                                                                 message when using the Epsilon Generation Language.
understanding of their model transformation languages                                                                    The solution (now merged into the TTC’21 OCL2PSQL
through alternative implementations of this subset.                                                                   solutions repository1 ) reuses the basic scaffolding of the
   This paper presents an outline of a solution based on                                                              reference solution, including the generated code for the
the Eclipse Epsilon family of model management lan-                                                                   OCL and SQL metamodels, and the classes responsible
guages. Since the original release in 2006 [2], Eclipse Ep-                                                           for interpreting the environment variables, communi-
silon has grown to include languages for model-to-model                                                               cating with the MySQL database, and performing the
transformation (the Epsilon Transformation Language),                                                                 correctness tests. The solution adds the following Java
                                                                                                                      classes:
TTC’21: Transformation Tool Contest, Part of the Software
Technologies: Applications and Foundations (STAF) federated                                                                • SampleLauncher, which transforms all OCL
conferences, Eds. A. Boronat, A. García-Domínguez, and G. Hinkel,                                                            queries without using the environment variables
25 June 2021, Bergen, Norway (online).                                                                                       of the benchmark framework. This is mostly for
" a.garcia-dominguez@aston.ac.uk (A. Garcia-Dominguez)                                                                       internal development.
 0000-0002-4744-9150 (A. Garcia-Dominguez)
                                       © 2021 Copyright for this paper by its authors. Use permitted under Creative        1
                                       Commons License Attribution 4.0 International (CC BY 4.0).                            https://github.com/TransformationToolContest/
    CEUR
    Workshop
    Proceedings
                  http://ceur-ws.org
                  ISSN 1613-0073
                                       CEUR Workshop Proceedings (CEUR-WS.org)                                        ttc2021-ocl2psql/tree/master/solutions/Epsilon
     • OCL2SQL, which encapsulates a model-to-model
                                                                            Listing 1: Main ETL script
       transformation written in the Epsilon Transfor-
       mation Language into an easy-to-use Java class.       import ’s0_literals.etl’;                                   1
       More information on the ETL transformation is         import ’s1_equals_and.etl’;                                 2
       available in Section 3.                               import ’s2_allInstances.etl’;                               3
     • SQL2Text, which replaces the model-to-text            import ’s3_size.etl’;                                       4
       transformation in the reference solution with one     import ’s4_collect_variable.etl’;                           5
       based on the Epsilon Generation Language. More        import ’s5_attributes.etl’;                                 6
       details on the EGL transformation are given in        import ’s6_associationEnds.etl’;                            7
       Section 4.                                            import ’s7_exists.etl’;                                     8
                                                             import ’s8_existsWithFree.etl’;                             9
   Finally, the solution changed the code of the Solution                                                                10
class in the reference solution to use the OCL2SQL and       post {                                                      11
SQL2Text classes from above.                                   var firstRootSelect = SQL!PlainSelect.all.selectOne(      12
                                                                     ps|ps.eContainer.isUndefined());
                                                                                                                         13
3. Model-to-model                                                var stmt = new SQL!SelectStatement;                     14
   transformation with ETL                                       stmt.selectBody = firstRootSelect;                      15
                                                             }                                                           16
The Epsilon project includes several languages for per-
forming model transformations:

     • The Epsilon Object Language (EOL) is an OCL-        subgraph produced from the OCL e subtree, allowing
       inspired imperative language which is well suited   the different SQL subtrees to be linked together. The
       for small in-place (endogenous) model transfor-     OCL2PSQL case did not require any manual rule schedul-
       mations, though it can be used for purely imper-    ing or the use of greedy/lazy rules: the automated rule
       ative exogenous transformations as well.            scheduling done by ETL based on source element types
     • The Epsilon Transformation Language (ETL)           and guards was sufficient.
       builds on top of EOL by adding the concept of a        ETL allows for breaking up the transformation across
       rule, which transforms a certain type of source     several modules. This made it relatively easy to itera-
       model element (possibly with some guards lim-       tively implement the various stages in OCL2PSQL and
       iting its applicability) to a target model element. test out how it behaved for the various challenges. List-
       The rule scheduling can be kept entirely declara-   ing 1 shows the main script of the ETL transformation:
       tive, or can be controlled to some extent through   it is composed of a number of import statements that
       the use of “greedy” or “lazy” rules. ETL is well-   bring in the rules needed for each stage, and a post rule
       suited for exogenous transformations, where a       which places the one PlainSelect without a container
       new model is produced from the source model.        into a SelectStatement element as mandated by the
     • Flock can be thought of as ETL with an automated    SQL metamodel.
       “copy-unless-otherwise-stated” default strategy.       Some of the OCL2PSQL mappings had to produce sig-
       It is well suited for model migration tasks, where  nificantly large SQL subtrees: to keep the code concise,
       a model has to undergo small changes from one       a library of EOL operations (utilities.eol) was cre-
       version of a metamodel to the next.                 ated and reused from the ETL rules. This library largely
     • The Epsilon Wizard Language (EWL) is a variant      contained a set of functions for simple creation of SQL
       of EOL which allows users to define “wizards”       model elements, an implementation of the OCL2PSQL
       that users can manually trigger on specific model   functions for listing the free variables in an OCL expres-
       elements, performing small in-place transforma-     sion 𝑒 (FVars(𝑒)) and for listing the source variables that
       tions (perhaps with some simple user interaction).  the value of a subexpression 𝑒′ of the OCL expression 𝑒
                                                           depends upon (SVars𝑒 (𝑒′ )), and several other miscella-
  Out of these languages, ETL was chosen since the neous functions. As a simple example, Listing 2 shows
original formulation of OCL2PSQL as a collection of re- the code needed to transform OCL integer literals to the
cursive mappings by source type was a close match to target SQL metamodel.
the rules used by ETL. For the most part, each of those       The ETL rules are for the most part a direct one-to-
mappings was translated into an ETL rule, and calls to one translation from the descriptions at the end of the
map𝑒 (𝑒) were translated into e.equivalent() calls in ETL. OCL2PSQL case, except for two changes.
The e.equivalent() built-in operation retrieves the SQL       The first change was considering one special case listed
                                     Listing 2: Excerpt of ETL for stage 0 (integer literals)
 1   import ’utilities.eol’;
 2
 3   /∗
 4    ∗ All these boil down to:
 5    ∗ mape(l) = SELECT l as res, 1 as val
 6    ∗/
 7
 8   rule IntLiteral
 9   transform e:OCL!IntegerLiteralExp
10   to ps:SQL!PlainSelect {
11     ps.selItems.add(longSelectItem(’res’, e.integerValue.asLong()));
12     ps.selItems.add(longSelectItem(’val’, 1l));
13   }



     in the original OCL2PSQL paper [1] but not in the case
                                                                          Listing 3: Fix for “subtree stealing” in ETL
     paper. The mapping of collect and exists in the original
     OCL2PSQL paper covered the case when 𝑣 ∈     / 𝐹 𝑉 𝑎𝑟𝑠(𝑏),   operation copyIfContained(value) {                            1
     but this mapping had been omitted from the OCL2PSQL            var emfTool = new Native(                                   2
     case description. It turned out that this special case was        "org.eclipse.epsilon.emc.emf.tools.EmfTool");            3
     needed for some of the queries, e.g. challenge 0 in stage      if (value.eContainer.isDefined()) {                         4
     4 (Car.allInstances()−>collect(c|5)).                             return emfTool.ecoreUtil.copy(value);                    5
        The second change was due to an unexpected inter-           }                                                           6
     action between the recursive approach used to define           return value;                                               7
     OCL2PSQL, the definition of SubSelect.selectBody as a        }                                                             8
     containment reference in the SQL metamodel, and how
     the ETL e.equivalent() operation works. ETL will only
     apply a certain rule once to each matching source model
     element, and from them on e.equivalent() will always           and if so it performed a deep cloning of the SQL subtree.
     return the same counterpart in the target model (e.g. the         A better fix (which unfortunately would have required
     exact same object). This resulted in some SQL queries          a rewriting of the input files for this case) would be to
     “losing” the body of their SubSelect objects to other sub-     change the SQL metamodel so that SubSelect.selectBody is
     trees of the SQL model, as they also needed the mapping        no longer a containment reference, and the same Plain-
     for that part of the OCL expression.                           Select can be reused from multiple SubSelect model
        For instance, consider the final challenge:                 elements.
                                                                       Even further, this suggests that the OCL2PSQL map-
     Car.allInstances()−>exists(c|                                  ping really produces a SQL expression graph (where some
        c.owners−>exists(p|                                         subexpressions are reused) rather than a SQL abstract
           p.name = ’Peter’))                                       syntax tree. Instead of running the same subquery from
                                                                    several places, it may be advisable to redefine OCL2PSQL
         In this query, the SQL mapping of c.owners−>exists(...) so it produces a sequence of SQL queries rather than a
     and the SQL mapping of p.name both require using the single large SQL query: it would run these reused sub-
     mapping of c.owners as a subquery. ETL successfully queries first, and then provide their results to the higher-
     maps c.owners to SQL, but EMF will not allow a single level queries. Otherwise, there may be a risk that the
     model element to be contained from more than one place. SQL query could grow exponentially if sufficiently large
     Since the mapping of c.owners−>exists(...) will complete subqueries have to be duplicated across several locations.
     last, it will effectively “steal” the subquery representing       Overall, the transformation required writing 14 rules
     c.owners from the mapping of p.name.                           across 531 lines of ETL code, with a support library of
         The fix for this issue turned out to be simple, as shown EOL operations that was 213 lines long. These line counts
     in Listing 3. All uses of the e.equivalent() operation were included whitespace and comments: if these are excluded,
     wrapped into a new EOL operation: the operation tested the line counts are reduced to 305 lines of ETL code and
     if this “stealing” was about to take place (i.e. if the Plain-
     Select was already contained in another SubSelect),
      Listing 4: Excerpt of EGL to generate SQL query text              Listing 5: SQL-specific LanguageFormatter used to in-
 1   [%= SelectStatement.all.first.generate() %][%                                 dent the SQL query text
 2                                                                      private static class SQLFormatter                               1
 3   @template                                                            extends LanguageFormatter                                     2
 4   operation SelectStatement generate() { %]                          {                                                               3
 5   [%=self.selectBody.generate()%];                                     private static final String increasePattern = "\\(\\s∗$";     4
 6   [% }                                                                 private static final String decreasePattern = "^\\)";         5
 7                                                                                                                                      6
 8   @template                                                              public SQLFormatter() {                                     7
 9   operation PlainSelect generate() { %]                                    super(Pattern.compile(increasePattern,                    8
10   SELECT                                                                                          Pattern.MULTILINE),                9
11   [% for (si in self.selItems) { %]                                              Pattern.compile(decreasePattern,                    10
12   [%=si.generate() + (hasMore ? "," : "")%]                                                       Pattern.MULTILINE));               11
13   [% }                                                                   }                                                           12
14       if (self.fromItem.isDefined()) { %]                            }                                                               13
15   FROM [%=self.fromItem.generate() %]
16   [% }
17       for (join in self.joins) {%]
18   [%=join.generate()%]                                               Listing 6: SQL query for challenge 0 in stage 1, as gener-
19   [% }                                                                          ated by EGL
20       if (self.whereExp.isDefined()) { %]                            SELECT                                                          1
21   WHERE [%=self.whereExp.generate() %]                               TEMP_left.res = TEMP_right.res AS res,                          2
22   [% }                                                               1 AS val                                                        3
23     if (self.groupBy.isDefined()) { %]                               FROM (                                                          4
24   [%= self.groupBy.generate() %]                                              SELECT                                                 5
25   [%                                                                          2 AS res,                                              6
26     }                                                                         1 AS val                                               7
                                                                        ) AS TEMP_left                                                  8
                                                                        JOIN (                                                          9
                                                                                 SELECT                                                 10
     154 lines of EOL code2 .                                                    3 AS res,                                              11
                                                                                 1 AS val                                               12
                                                                        ) AS TEMP_right;                                                13
     4. Model-to-text transformation
        to SQL with EGL
     The reference solution included a model-to-text transfor-          operations, which allow EGL templates in their body and
     mation that produced the SQL query to be run in MySQL              return strings which can be used within expressions (e.g.
     from the SQL model. During the development of this so-             for concatenating separators, as in line 12).
     lution, it was found that the generated SQL was difficult             The script also uses some of the built-in Epsilon vari-
     to read in the presence of multiple levels of subqueries,          ables: hasMore is a built-in Epsilon variable available in
     as it was entirely on one line.                                    loops which is true if and only if there are more values
        In order to improve the readability of the SQL queries          after the current one.
     and help with the debugging, an alternative implementa-               One useful feature in EGL is its ability to integrate for-
     tion was written in 96 lines of EGL. The EGL template tra-         matters that postprocess the generated text. In particular,
     verses the SQL model recursively from the root Select-             the EGL LanguageFormatter was customised for SQL
     Statement, breaking up SELECT statements, CASE ex-                 (as shown in Listing 5) to automatically indent the lines
     pressions, joins, and subqueries across multiple lines.            of the SQL script to improve readability, while keeping
        Listing 4 shows an excerpt of the EGL script: the first         the EGL script as simple as possible. This class only re-
     line is the entry point of the entire script, kicking off          quires the regular expressions that should increase and
     the recursive descent of the SQL model from the Select-            decrease the indentation level after a match. Using this
     Statement. The EGL script makes heavy use of template              script, queries are generated in the more readable form
         2                                                              shown in Listing 6.
             These   counts     were     obtained     using       the
     count-etl-lines.sh script included in the solution folder.
                                  1. Transform (JVM warmup)   2. Transform (avg over 10 runs)   3. Query test


                            1.2
                            1.1
                            1.0
                            0.9
                  Seconds   0.8
                            0.7
                            0.6
                            0.5
                            0.4
                            0.3
                            0.2
                            0.1
                            0.0
                                      0

                                      1

                                      2

                                      0

                                      1

                                      2

                                      0

                                      0

                                      1

                                      0

                                      1

                                      2

                                      0

                                      1

                                      0

                                      1

                                      0

                                      1

                                      2

                                      3

                                      0
                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C

                                    C
                                  S0

                                  S0

                                  S0

                                  S1

                                  S1

                                  S1

                                  S2

                                  S3

                                  S3

                                  S4

                                  S4

                                  S4

                                  S5

                                  S5

                                  S6

                                  S6

                                  S7

                                  S7

                                  S7

                                  S7

                                  S8
Figure 1: Execution times in seconds per stage and challenge, for the first warmup execution of the transformation, the
average of 10 additional runs of the transformation, and the test of the generated SQL statement.



5. Results                                               the transformation was then run 10 more times within the
                                                         same JVM, and the average execution time was recorded.
After implementing the ETL model-to-model transfor- This was followed by a single execution of the generated
mation and the EGL model-to-text transformation, Java SQL query, to test if the expected results were produced.
code to encapsulate these transformations and integrate     Whereas the execution times for the “warmup” runs
them with the TTC benchmark framework was added. are between 0.9s and 1.2s, the average execution times are
The transformations passed all correctness cases for all much smaller and comparable to other solutions of the
scenarios across all stages and challenges.              contest, ranging between 0.1s and 0.25s. Test execution
   In terms of execution time, the transformations were times ranged between 0.3s and 0.4s.
run on a Lenovo X1 laptop with an i7-6600U CPU run-
ning at 2.60GHz with 16GiB of physical RAM, running
Ubuntu Linux 20.04.2 LTS with Linux 5.4.0-74-generic References
and the Oracle JDK 11.0.8. The default Java memory
allocation settings were used (no -Xmx or other JVM op- [1] H. Nguyen Phuoc Bao, M. Clavel, OCL2PSQL: An
tions were given). The Docker image provided by the           OCL-to-SQL code-generator for model-driven engi-
OCL2PSQL case authors was used to run MySQL, using            neering, in: T. K. Dang, J. Küng, M. Takizawa, S. H.
Docker Engine 20.10.7.                                        Bui (Eds.), Future Data and Security Engineering,
   The transformation and test times are shown in Fig-        Springer International Publishing, Cham, 2019, pp.
ure 1: the transformation times include both the ETL          185–203. doi:10.1007/978-3-030-35653-8_13.
model-to-model transformation and the EGL model-to-      [2]  D. S. Kolovos,  R. F. Paige, F. Polack, The Epsilon
text transformation, in order to mimic the two trans-         Object Language (EOL), in: Model Driven Architec-
formations done by the reference implementation (the          ture - Foundations and Applications, Second Euro-
proper OCL2PSQL transformation, and a model-to-model          pean Conference, ECMDA-FA 2006, Bilbao, Spain,
transformation between a 3rd-party JSqlParser meta-           July 10-13, 2006, Proceedings, 2006, pp. 128–142.
model and the EMF-based metamodel). It was noted              doi:10.1007/11787044_11.
during the open peer review stage of the contest that the
first execution of the transformation was considerably
slower than later executions, due to Java class-loading
and just-in-time recompiler warmup times. In order to
obtain more representative results from a typical user
(who would have a long-running Java process running
the transformation repeatedly for different OCL queries),