=Paper= {{Paper |id=Vol-2245/ammore_paper_3 |storemode=property |title=Exploring Model Repositories by Means of Megamodel-aware Search Operators |pdfUrl=https://ceur-ws.org/Vol-2245/ammore_paper_3.pdf |volume=Vol-2245 |authors=Francesco Basciani,Juri Di Rocco,Davide Di Ruscio,Ludovico Iovino,Alfonso Pierantonio |dblpUrl=https://dblp.org/rec/conf/models/BascianiRRIP18 }} ==Exploring Model Repositories by Means of Megamodel-aware Search Operators== https://ceur-ws.org/Vol-2245/ammore_paper_3.pdf
                         Exploring model repositories by means of
                           megamodel-aware search operators
           Francesco Basciani                             Davide Di Ruscio                            Juri Di Rocco
          University of L’Aquila                        University of L’Aquila                     University of L’Aquila
              L’Aquila, Italy                               L’Aquila, Italy                             L’Aquila, Italy
       francesco.basciani@univaq.it                    davide.diruscio@univaq.it                   ljuri.dirocco@univaq.it

                                    Ludovico Iovino                          Alfonso Pierantonio
                              Gran Sasso Science Institute                     University of L’Aquila
                                    L’Aquila, Italy                                L’Aquila, Italy
                                ludovico.iovino@gssi.it                    alfonso.pierantonio@univaq.it
ABSTRACT
Great strides have been made in the development of tools and techniques for advanced model management over the last decade. Although the use of model repositories is gaining traction in industry, it is still hampered by the limited understanding of the underlying platform semantics. Consequently, the all-important goal of reusing artifacts has led to an enduring quest for ways to search and retrieve artifacts more efficiently and accurately. Arguably, a contributory factor limiting the use of current search engines is the poor alignment between the query languages and the lattice of relations among the different and heterogeneous artifacts in the repository.

In this paper, a novel approach to model search is presented. By capturing the repository structure in megamodels, well-formed search operators have been conceived in order to permit designers to reliably explore and browse model repositories. An experimental investigation has been conducted by implementing the approach in the MDEForge platform using the Lucene search library.

1   INTRODUCTION
The pervasiveness of modelling techniques in everyday software practice has escalated the importance of reliable model repositories [11]. Consequently, the availability of efficient and accurate ways to retrieve artifacts is becoming highly relevant. Thus, relying on sound and well-formed models for discovering and reusing existing artifacts is key to preserving productivity benefits related to model-based processes [18]. While such advantages are particularly attractive, substantial shortcomings have been identified as problems that are inherent to the way repositories themselves are modelled. Arguably, a contributory factor limiting the use of current search engines is the poor alignment between the query languages and the lattice of relations among the different and heterogeneous artifacts in the repository. In particular, the diversity and sheer number of models stored in a repository require query mechanisms based on a finer-grained level of understanding of the repository. For instance, in order to locate an artifact, it might be useful to be able to predicate over both repository-wide attributes, including artifact types, metamodels, domain types, and maturity levels, and metamodel elements, such as classes and structural features.

This article outlines a novel approach to model search that captures the repository structure in a megamodel. The approach provides designers with dedicated operators to explore the model repository without requiring knowledge of low-level details about the underlying platforms to formulate the queries. The approach has been implemented atop MDEForge [4] by employing Lucene [16] to provide efficient text search.

Structure of the paper. The paper is structured as follows. The next section presents motivating scenarios. Section 3 gives an overview of existing model search approaches. Section 4 introduces the approach, which is demonstrated in Sect. 5. Finally, Sect. 6 concludes the paper and discusses future work.

The research described in this paper has been partially supported by the CROSSMINER Project, EU Horizon 2020 Research and Innovation Programme, grant agreement No. 732223.

2   MOTIVATING SCENARIOS
In this section, we discuss explanatory scenarios that involve models, metamodels, and model transformations. The goals are:
   – highlighting the need for proper methods and tools supporting the exploration of model repositories managing different kinds of interrelated modelling artifacts;
   – showing that even the implementation of simple search scenarios can be error-prone and time-consuming if not adequately supported.
For each scenario, a corresponding query implemented in OCL (and executed in Java) is presented. For the sake of clarity, the queries assume the availability of modelling artifacts stored in a local user folder.

Scenario 1. In this scenario, the modeler is interested in metamodels that contain a specific metaclass defined in terms of its name and structural features. For instance, metamodels for graphs can be found by searching the terms nodes and edges. Listing 1 shows a query whose definition is given in OCL at line 4. Since it has to be evaluated on all the artifacts locally stored, at lines 5-12 all the metamodels in a specified
folder are retrieved. Then, for each corresponding package the checkConstraint method is executed. Line 6 checks that the artifact is a metamodel by looking at its extension (ecore). The method checkConstraint executes the OCL predicate by considering the parameters given as input, namely className, attrName and refName. Each artifact satisfying the OCL query is added to the results list, which will contain the metamodels satisfying the query.

Listing 1: Sample query supporting Scenario 1
 1 public List<File> search(String folderString, String className,
       String attrName, String refName) {
 2   File folder = new File(folderString);
 3   List<File> results = new ArrayList<File>();
 4   String query = "EClass.allInstances()->exists(e | e.name = '" + className
       + "') and EAttribute.allInstances()->exists(e | e.name = '" + attrName
       + "') and EReference.allInstances()->exists(e | e.name = '" + refName + "')";
 5   for (final File fileName : folder.listFiles()) {
 6     if (getFileExtension(fileName).equals("ecore")) {
 7       List<EPackage> epList = getEPackages(fileName);
 8       for (EPackage ePackage : epList)
 9         if (checkConstraint(ePackage, query))
10           results.add(fileName);
11     }
12   }
13   return results;
14 }

It is worth noting that even such a simple search requires writing a query (with three nested levels), which is a tedious and error-prone activity despite the simplicity of the requirement (e.g., no relation among artifacts is involved).

Scenario 2. In this scenario, the designer is interested in finding transformations able to generate specific elements out of source models of a given type. For instance, the developer is working on some model transformations able to generate Petri net models out of BPMN specifications. To this end, she would like to get inspired by existing transformations (if any) in order to understand how to develop the mappings between BPMN tasks and corresponding Petri net modules. Listing 2 contains an OCL query that at line 4 looks for transformations mapping the concepts expressed in the parameter inPatternName into one instance of outPatternName. The query is evaluated for each atl file, see checkConstraint at line 8. The outcome is stored in the list result, see line 9.

Listing 2: Sample query supporting Scenario 2
 1 List<File> search(String folderString, String outPatternName,
       String inPatternName) {
 2   File folder = new File(folderString);
 3   List<File> result = new ArrayList<File>();
 4   String query = "SimpleOutPatternElement.allInstances()->exists(e | e.type.name = '"
       + outPatternName + "') and SimpleInPatternElement.allInstances()->exists(e | e.type.name = '"
       + inPatternName + "')";
 5   for (final File fileName : folder.listFiles())
 6     if (getFileExtension(fileName).equals("atl")) {
 7       ATLModel atlModel = injectTrasformation(fileName.getName());
 8       if (checkConstraint(atlModel.getRoot(), query))
 9         result.add(fileName);
10     }
11   return result;
12 }

Scenario 3. In this scenario, the designer is interested in models conforming to a specific metamodel. For instance, the modeler would like to reuse and refine the architectural specification (given in some specific ADL) of an already implemented software system. Listing 3 contains a sample query implemented in Java that makes use of EMF¹ methods. In particular, the query returns all the models conforming to a metamodel MM denoted by its nsUri (see lines 5-11). Specifically, the getNsURiFromModels method (line 6) extracts package information by using the EMF .eClass() and .eResource() methods.

Listing 3: Sample query supporting Scenario 3
 1 public List<File> search(String folderString, String nsUri) {
 2   File folder = new File(folderString);
 3   List<File> results = new ArrayList<File>();
 4
 5   for (final File fileName : folder.listFiles()) {
 6     List<EPackage> epList = getNsURiFromModels(fileName);
 7     if (containsUri(epList, nsUri))
 8       results.add(fileName);
 9   }
10   return results;
11 }
12 // f.eClass().eResource().getURI()

Again, despite the simplicity of the requirements, the designer must face a certain accidental complexity due to the technological setting, which requires familiarity with the EMF framework and its corresponding APIs.

3   BACKGROUND
A number of existing approaches for model searching are summarized in Table 1. For each of them, the following characteristics are considered:
   – Supported modeling artifact: it refers to the kinds of artifacts that the considered approach is able to manage;
   – Query mechanism: it refers to how queries are specified; e.g., there are approaches that allow users to specify queries with query strings; others adopt more structured languages like OCL;
   – Megamodel-awareness: some approaches also consider relations among different kinds of artifacts, where relations give rise to joins that traverse the repository; for instance, searching for all metamodels supported by existing editors that are source metamodels of a transformation that generates models conforming to a given metamodel; considering the artifact relations requires the approach to be aware of the repository structure, typically represented by means of megamodels [9];
   – Indexing supported: in order to make searches more efficient, approaches may rely on indexing mechanisms.

The first entry in the table is an approach [10] for retrieving UML² design models using the combination of WordNet and Case-Based Reasoning. The approach also makes use of a similarity distance to 'approximate' the result. Moogle [14, 15] is a model search engine that relies on the use of metamodeling information for creating indexes allowing the execution of complex queries. It also delivers the results in a readable way by removing irrelevant strings from the actual model

¹ https://www.eclipse.org/modeling/emf/
² https://www.omg.org/spec/UML/
                                         Supported modeling artifact   Query mechanism       Megamodel-awareness     Indexing supported
         Gomes P. et al. [10]                  UML models              Query-by-example             No                      Yes
         Lucrédio D. et al. [14, 15]               Any                   Text search                No                      Yes
         Konstantinos B. et al. [2, 3]             Any                 OCL-like language          Partial                   Yes
         Kessentini et al. [12]                 Metamodel              Query-by-example             No                       No
         Bislimovska B. et al. [6]            WebML Models             Query-by-example             No                      Yes
         Bozzon A. et al. [7]                      Any                   Text search                No                      Yes
         Kling W. et al. [13]                      Any                 OCL-like language            Yes                      No
         Ángel. et al. [1]                 Model and Metamodel           Text search                No                      Yes
Table 1: A sample of approaches for model search. The list is inevitably non–exhaustive and merely intended to reflect the kinds
of approaches that are available
file. Moogle can search for models conforming to different languages, as long as there is a well-defined metamodel that can be provided to Moogle. The approach is limited to a single architectural layer and does not consider, for example, relationships among artifacts. For the indexing mechanism, Moogle relies on the existing Apache Solr³ search engine.

Hawk is a framework aiming at providing scalable techniques for large-scale model querying and transformations. In [2, 3] the authors compare the conventional and commonly used persistence mechanisms in MDE with novel approaches such as the use of graph-based NoSQL databases. Moreover, they present an extensible model indexing framework at the base of the developed tool. The proposed framework collects models stored in file-based version control systems and persists them in indexes, while not altering the original files. Nodes represent metamodel types and contain their names; they are linked by relationships to their models. This mechanism directly supports conformance navigation, but it is unclear whether it can be adapted to fully support megamodel-based relationship navigation. The default query mechanism is based on native APIs, e.g., those of Neo4J⁴ and OrientDB⁵.

In [12], the authors propose a search-based metamodel matching mechanism combining structural and syntactic metrics to generate correspondences between metamodels. The approach considers metamodel matching as an optimization problem. They adopt a global search (a genetic algorithm) to generate an initial solution and, subsequently, a local search, namely simulated annealing, to refine the initial solution generated by the genetic algorithm. The approach starts by randomly generating a set of possible matching solutions between the source and target metamodels. Then, these solutions are evaluated using a fitness function based on structural and syntactic measures.

In [6], the authors investigate the adoption of different techniques for indexing and searching model repositories, by focusing on WebML [8] models. Keyword-based and content-based techniques are employed to provide users with a query-by-example paradigm. In [7], the authors propose the adoption of information retrieval techniques. They identify relevant design dimensions and several options (project segmentation, index structure, query language and processing, and result presentation) and present an architecture for automatic model-driven project segmentation, indexing and search, without requiring any manual model annotation. The proposed approach is agnostic to the type of modelling artifact. In [13], the authors propose MoScript, an OCL-based scripting language that permits querying a model repository by relying on the metadata available in a dedicated megamodel.

The authors in [1] propose EXTREMO, a tool developed as an Eclipse plugin, able to gather heterogeneous information from different technological spaces (like ontologies, RDF, XML or EMF). EXTREMO represents them uniformly in a common data model that enables uniform querying by means of an extensible mechanism, which can make use of services, e.g., for synonym search and word sense analysis.

From a brief analysis of Table 1, it clearly emerges that only a few approaches leverage the relationships among the artifacts as first-class entities to be used when exploring an existing model repository. Most of the analyzed approaches are based on indexing techniques for the sake of better performance. Some approaches have recognized the importance of being generic and supporting the management of any kind of modeling artifact. Among the considered approaches, query-by-example and text search are the most common query mechanisms.

In the next section, the proposed approach is presented. It aims at being generic in order to manage any kind of modeling artifact. Moreover, it reduces the accidental complexity by allowing modelers to efficiently search repositories by means of dedicated search operators that spare the designer from having to become familiar with specific languages, systems, and technologies.

4   PROPOSED APPROACH
Conventional wisdom on managing large repositories suggests that the technical merit of the query model is key to success. In fact, as shown in Sect. 2, very simple requirements may lead to complex queries embedded in components that involve several languages and tools. We aim to address this shortcoming by proposing a technique that is based on search operators that abstract from the underlying machinery. For instance, Gmail provides users with a list of domain-specific operators⁶ that can be used when searching through mails, e.g., the operator has:attachment returns all messages with an attachment. Multiple operators can be combined in a complex search string, as for instance “to:david has:youtube” that

³ http://lucene.apache.org/solr/
⁴ https://neo4j.com
⁵ https://orientdb.com/
⁶ https://support.google.com/mail/answer/7190?hl=en
filters messages sent to a specific recipient with a YouTube video. The interesting idea about this is that it is based on a very simple syntax that scales well, in the sense that i) it is platform agnostic, as it does not require specific expertise to be used; and ii) the notation remains concise even though searches can get complex. Thus, the same mechanism has been adopted for searching through model repositories. For instance, if we are interested in retrieving all transformations that consume models conforming to a Family metamodel, the expression fromMM:Family returns them. Now, if we restrict our attention to only those transformations that return Person models, we can write fromMM:Family toMM:Person. An excerpt of the available operators is in Table 2 together with descriptions and typing requirements.

The approach has been implemented by integrating the Lucene [16] search engine in the MDEForge [5] platform. A detailed overview of the technologies, model search infrastructure, and operators is given in the following subsections.

4.1   Overview of the technology baseline
MDEForge is an extensible modeling framework that consists of services for storing, managing, and analysing any kind of modeling artifact. Extensions can be developed and integrated in the platform by starting from the services exposed by the platform. RESTful APIs allow implementors to design complex modelling life-cycles that are in turn used as software-as-a-service. The persistence layer provided by MDEForge is at the base of this work and will be used to retrieve the artifacts that can be searched by means of specific search operators. Interested readers can refer to [5] for more technical details about MDEForge and its megamodel-based architecture.

Lucene is a simple but powerful Java-based, open source search library. It is scalable, and its high performance enables its adoption to index and search virtually any kind of text. Lucene can be used in any application to add search capabilities to it. The library provides the core operations that are typically required by any search application. The main operations the engine provides can be summarized as: collecting the content, analyzing the artifacts, indexing the documents, providing a search interface, query building, query execution, and showing results.

4.2   Model search infrastructure
Figure 1 illustrates the architecture of the search infrastructure with the components underpinning the integration of MDEForge and Lucene. In particular, Lucene is used to create indexes related to the artifacts stored in the MDEForge repository. The indexing operation can be customized, for example by scheduling it every time a new artifact is uploaded. The content to be considered when creating the indexes is retrieved by the Content extractor component, which extracts specific information for each type of artifact. For instance, for metamodels the component extracts relevant metamodel characteristics, including package names, nsUri, metaclass names, attribute and reference names, enumerations, literals, and datatypes. Concerning model transformations, the content extractor retrieves for each transformation specific attributes like helper and rule names, etc. Interestingly, the content extractor implements some reflection as well, since it also retrieves information about the megamodel representing all the existing relationships of the ecosystem stored in the repository. For instance, the megamodel explicitly represents the conformsTo relation between a model and the corresponding metamodel. Such a conformance relation is used to index models with respect to the corresponding types. For instance, if we consider the Person metamodel⁷, the content extractor will retrieve information so as to make it possible to query models conforming to the Person metamodel by searching for persons with a particular name and so on. There are attributes that are common to all the types of artifacts and that are also retrieved by the extractor, e.g., author's name, update time, etc.

Figure 1: Architecture overview of the proposed model search infrastructure

Once all the artifacts stored in the repository have been extracted, they are analyzed by the MDE Artifacts Analyzer, which enriches the indexes created by the Index writer component. When users submit a search string (by means of the Web-based interface or via the available REST API), a corresponding query is built by the Query Builder component in order to retrieve artifacts from the MDEForge repository with respect to the available indexes.

Figure 2 shows a screenshot of the Web-based search page, consisting of three main parts: the search form, the list of available search operators that can be used for specifying the query, and the query results. In the shown example, the search string makes use of three operators, namely eClass:, eReference:, and eAttribute: in order to search for all the metamodels containing metaclasses named Family, which in turn contain a reference named members and an attribute named age. The execution of that query produced one result consisting of the Family.ecore metamodel.

Query results are ranked with respect to a matching score between the query and the found artifacts (in the example shown in Fig. 2 it is 755). The score is determined by the Lucene engine and depends on many factors. In particular,

⁷ http://www.eclipse.org/atl/atlTransformations/
                                    Figure 2: The proposed model search infrastructure at work
 Operator name     Artifact         Description
 name:             Any              It returns all the artifacts matching the name provided by the query tag value
 author:           Any              It returns all the artifacts provided by a specific author
 conformToMM:      Metamodel        It returns the models that conform to the metamodel named as the provided value
 eClass:           Metamodel        It returns the metamodels containing at least one metaclass named as the provided value
 eAttribute:       Metamodel        It returns the metamodels containing at least one metaclass having an attribute named as the provided value
 eReference:       Metamodel        It returns the metamodels containing at least one metaclass having a reference named as the provided value
 fromMM: / toMM:   Transformation   It returns all the transformations having as source metamodel the provided value for the fromMM tag and as destination the provided value for the toMM tag
 fromMC: / toMC:   Transformation   It returns all the transformations having a rule transforming the metaclass specified in the value for the first tag into the metaclass specified as value for the last tag
                                         Table 2: Excerpt of the available search operators
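
To make the operators of Table 2 more concrete, the following Python sketch parses a conjunctive query string and evaluates it against a small in-memory index. It is only an illustration under stated assumptions: the actual infrastructure delegates parsing and matching to Lucene, and the functions parse_query and search, the INDEX structure, and the Library.ecore entry are hypothetical names introduced here. The sketch also evaluates each operator independently, so it does not check that a matched attribute or reference belongs to the matched metaclass.

```python
import re

# Operator names are taken from Table 2; everything else is hypothetical.
OPERATORS = {"name", "author", "conformToMM", "eClass", "eAttribute",
             "eReference", "fromMM", "toMM", "fromMC", "toMC"}


def parse_query(query):
    """Split 'key:value AND key:value ...' into a list of (key, value) pairs."""
    pairs = []
    # Terms may be separated by an explicit AND or by plain whitespace.
    for term in re.split(r"\s+AND\s+|\s+", query.strip()):
        if not term:
            continue
        key, sep, value = term.partition(":")
        if not sep or key not in OPERATORS or not value:
            raise ValueError(f"malformed term: {term!r}")
        pairs.append((key, value))
    return pairs


def search(index, query):
    """Return the names of the artifacts that satisfy every key:value term."""
    pairs = parse_query(query)
    hits = []
    for doc in index:
        # Expose the artifact name as an ordinary searchable field.
        fields = {"name": {doc["name"]}, **doc["fields"]}
        if all(value in fields.get(key, set()) for key, value in pairs):
            hits.append(doc["name"])
    return hits


# Toy index: one entry per metamodel, mapping each operator to indexed values.
INDEX = [
    {"name": "Family.ecore",
     "fields": {"eClass": {"Family", "Member"},
                "eAttribute": {"age", "name"},
                "eReference": {"members"}}},
    {"name": "Library.ecore",
     "fields": {"eClass": {"Library", "Book"},
                "eAttribute": {"title"},
                "eReference": {"books"}}},
]

print(search(INDEX, "eClass:Family AND eReference:members AND eAttribute:age"))
```

With this toy index, the query from the example of Fig. 2 selects only the Family.ecore entry, since it is the only one whose indexed values contain all three terms.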

to determine that value, Lucene implements a variant of the TF-IDF [17] scoring model.

4.3    Model search operators
As mentioned above, Table 2 shows an excerpt of the operators provided by the proposed
approach that can be used to search throughout a repository as shown in Fig. 2. Typing
information is given in the second column, i.e., the operator name: is evaluated on any type
of artifact, whilst eClass: is executed only on metamodels.
   The operators can be used in the form key:value, where key is the operator specification
and value is the matching term. More complex search expressions can be obtained by a
conjunction of operators

                         {key:value}+

Furthermore, Lucene provides users with additional operators that can be used to compose
queries [16].

5     EXPERIMENTS
In this section, we discuss the application of the proposed approach to a dataset consisting
of 2,422 metamodels, 350 models, and 115 transformations8. In particular, the approach has
been used to specify and execute the queries discussed in Sect. 2. The main goal of the
experiments is to perform a preliminary assessment of the proposed approach with respect to
i) its suitability to support the specification of model queries involving different kinds of
interrelated artifacts; ii) its performance in terms of query execution time. The queries that
have been employed in the experiments are the following:
Q1: Get all the metamodels having a metaclass named className, which in turn contains an
      attribute named attrName and a reference named refName;
Q2: Get all the model transformations transforming metaclasses named metaclassName1 to
      generate target metaclassName2 elements;
Q3: Get all the models conforming to the metamodel named metamodelName.
The queries are written with the proposed approach and in OCL in order to provide a
comparison. Table 3 shows

8 These are publicly available artifacts; as often happens, it is not easy to retrieve models
from publicly accessible repositories, while metamodels are typically easier to find.
              Proposed approach                                                                      OCL-based approach
 Query   exec. time (ms)   query string                                                         exec. time (ms)   #loc
 Q1      39                eClass:className AND eAttribute:attrName AND eReference:refName      12641             ≈70
 Q2      59                fromMetaclass:metaclassName1 AND toMetaclass:metaclassName2          7701              ≈40
 Q3      24                conformToMM:metamodelName                                            8102              ≈60
                                                   Table 3: Performed experiments
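
The gap between the two execution-time columns of Table 3 can be expressed as a ratio per query. The following Python sketch performs this arithmetic on the values copied from the table; the names TIMES_MS and slowdown are introduced here for illustration only, and index creation time is not part of the reported figures.

```python
# Execution times in milliseconds, copied from Table 3.
TIMES_MS = {
    "Q1": {"proposed": 39, "ocl": 12641},
    "Q2": {"proposed": 59, "ocl": 7701},
    "Q3": {"proposed": 24, "ocl": 8102},
}


def slowdown(times):
    """Return, per query, the OCL execution time divided by the proposed one."""
    return {query: t["ocl"] / t["proposed"] for query, t in times.items()}


for query, ratio in slowdown(TIMES_MS).items():
    print(f"{query}: the OCL query takes about {ratio:.1f} times as long")
```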
relevant data related to the application of both versions of the queries. The specification of
the queries by means of the proposed operators is given in the third column of the same table,
whereas the number of lines of code (#loc) of the OCL-based solutions is shown in the last
column. The #loc values are obtained by considering the length of the code shown in Sect. 2
and the lines of the auxiliary methods that have been implemented for supporting the execution
of those queries. For both approaches, the query execution times are reported in milliseconds
(see the second and fourth columns of the table).
   From the experimental data it emerges that the execution time of the OCL-based queries is
always considerably higher than that of the proposed approach. In particular, for Q2 the OCL
query takes 7701 ms against 59 ms for the proposed approach (Table 3); this can be explained
by the cardinality of metamodels, which is much higher than that of the other artifacts. As to
verbosity, the proposed approach also outperforms the OCL-based queries. It is worth noting
that the execution times of the queries specified by means of the proposed approach do not
take into account the time needed to create the indexes, as discussed in the previous section.
However, indexes are typically created off-line by means of batch processes, which do not
interfere with the actual execution of queries. Moreover, whenever the indexes must be updated
because new artifacts are added, this is done incrementally.

6   CONCLUSION AND FUTURE WORK
Modern modelling tools are increasingly becoming distributed platforms where artifacts can be
persistently stored and coherently dealt with. As a consequence, being able to conveniently
search throughout the repository according to specific criteria is key to any reuse practice.
The approach presented in this paper proposes simple yet powerful operators to perform complex
repository searches. The corresponding search infrastructure permits modelers to explore model
repositories in an efficient way by abstracting from the specific platforms and tools used to
formulate the queries. The approach has been implemented by integrating the Lucene search
engine in the MDEForge platform. Preliminary experiments show that the approach is promising,
especially when compared with traditional techniques, even though more accurate comparison
criteria and metrics have yet to be properly defined. Moreover, the adoption of alternative
frameworks for model querying, like Hawk [2], will also be investigated.

REFERENCES
 [1] Ángel Mora Segura, Juan de Lara, Patrick Neubauer, and Manuel Wimmer. 2018. Automated
     modelling assistance by integrating heterogeneous information sources. Computer Languages,
     Systems & Structures 53 (2018), 90–120.
 [2] Konstantinos Barmpis and Dimitris Kolovos. 2013. Hawk: Towards a Scalable Model Indexing
     Architecture. In Proceedings of the Workshop on Scalability in Model Driven Engineering
     (BigMDE ’13). ACM, New York, NY, USA, Article 6, 9 pages.
 [3] Konstantinos Barmpis and Dimitrios S Kolovos. 2014. Towards scalable querying of
     large-scale models. In European Conference on Modelling Foundations and Applications.
     Springer, 35–50.
 [4] Francesco Basciani, Juri Di Rocco, Davide Di Ruscio, Amleto Di Salle, Ludovico Iovino, and
     Alfonso Pierantonio. 2014. MDEForge: an Extensible Web-Based Modeling Platform.
     CloudMDE@MoDELS (2014), 66–75.
 [5] Francesco Basciani, Davide Di Ruscio, Ludovico Iovino, and Alfonso Pierantonio. 2014.
     Automated Chaining of Model Transformations with Incompatible Metamodels. MoDELS 8767,
     Chapter 37 (2014), 602–618.
 [6] Bojana Bislimovska, Alessandro Bozzon, Marco Brambilla, and Piero Fraternali. 2014. Textual
     and content-based search in repositories of web application models. ACM Transactions on
     the Web (TWEB) 8, 2 (2014), 11.
 [7] Alessandro Bozzon, Marco Brambilla, and Piero Fraternali. 2010. Searching repositories of
     web application models. In International Conference on Web Engineering. Springer, 1–15.
 [8] Stefano Ceri, Piero Fraternali, and Aldo Bongio. 2000. Web Modeling Language (WebML): a
     modeling language for designing Web sites. Computer Networks 33, 1-6 (2000), 137–157.
 [9] Juri Di Rocco, Davide Di Ruscio, Johannes Härtel, Ludovico Iovino, Ralf Lämmel, and Alfonso
     Pierantonio. 2018. Systematic Recovery of MDE Technology Usage. In Theory and Practice of
     Model Transformation, Arend Rensink and Jesús Sánchez Cuadrado (Eds.). Springer
     International Publishing, Cham, 110–126.
[10] Paulo Gomes, Francisco C. Pereira, Paulo Paiva, Nuno Seco, Paulo Carreiro, José L. Ferreira,
     and Carlos Bento. 2004. Using WordNet for Case-based Retrieval of UML Models. AI Commun.
     17, 1 (Jan. 2004), 13–23. http://dl.acm.org/citation.cfm?id=992846.992849
[11] Brahim Hamid. 2017. A model-driven approach for developing a model repository: Methodology
     and tool support. Future Generation Computer Systems 68 (2017), 473–490.
[12] Marouane Kessentini, Ali Ouni, Philip Langer, Manuel Wimmer, and Slim Bechikh. 2014.
     Search-based metamodel matching with structural and syntactic measures. Journal of Systems
     and Software 97 (2014), 1–14.
[13] Wolfgang Kling, Frédéric Jouault, Dennis Wagelaar, Marco Brambilla, and Jordi Cabot. 2012.
     MoScript: A DSL for Querying and Manipulating Model Repositories. In Software Language
     Engineering, Anthony Sloane and Uwe Aßmann (Eds.). Springer Berlin Heidelberg, Berlin,
     Heidelberg, 180–200.
[14] Daniel Lucrédio, Renata P. de M. Fortes, and Jon Whittle. 2008. MOOGLE: A Model Search
     Engine. Springer Berlin Heidelberg, Berlin, Heidelberg, 296–310.
[15] Daniel Lucrédio, Renata P. de M. Fortes, and Jon Whittle. 2012. MOOGLE: a metamodel-based
     model search engine. Software & Systems Modeling 11, 2 (2012), 183–208.
[16] Michael McCandless, Erik Hatcher, and Otis Gospodnetic. 2010. Lucene in action: covers
     Apache Lucene 3.0. Manning Publications Co.
[17] Juan Ramos et al. 2003. Using tf-idf to determine word relevance in document queries. In
     Proceedings of the first instructional conference on machine learning, Vol. 242. 133–142.
[18] J. Di Rocco, D. Di Ruscio, L. Iovino, and A. Pierantonio. 2015. Collaborative Repositories
     in Model-Driven Engineering [Software Technology]. IEEE Software 32, 3 (May 2015), 28–34.