<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Mining SQL Execution Traces for Data Manipulation Behavior Recovery</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marco Mori</string-name>
          <email>marco.mori@unamur.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nesrine Noughi</string-name>
          <email>nesrine.noughi@unamur.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anthony Cleve</string-name>
          <email>anthony.cleve@unamur.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>PReCISE Research Center, University of Namur</institution>
        </aff>
      </contrib-group>
      <fpage>41</fpage>
      <lpage>48</lpage>
      <abstract>
        <p>Modern data-intensive software systems manipulate an increasing amount of heterogeneous data in order to support users in various execution contexts. Maintaining and evolving such systems relies on accurate documentation of their behavior, which is often missing or outdated. Unfortunately, standard program analysis techniques are not always suitable for extracting the behavior of data-intensive systems, which rely on increasingly dynamic data access mechanisms that mainly consist of run-time interactions with a database. This paper proposes a framework to extract behavioral models from data-intensive program executions. The framework makes use of dynamic analysis techniques to capture and analyze SQL execution traces. It applies clustering techniques to identify data manipulation functions from such traces. Process mining techniques are then used to synthesize behavioral models.</p>
      </abstract>
      <kwd-group>
        <kwd>data-manipulation behavior recovery</kwd>
        <kwd>data-oriented process mining</kwd>
        <kwd>data-manipulation functions</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Data-intensive systems typically consist of a set of applications performing
frequent and continuous interactions with a database. Maintaining and evolving
data-intensive systems is possible only after the system has been
sufficiently understood, in terms of structure and behavior. In particular, it is
necessary to recover missing documentation (models) about the data manipulation
behavior of the applications, by analyzing their interactions with the database.
In modern systems, such interactions usually rely on dynamic SQL, where
automatically generated SQL queries are sent to the database server.</p>
      <p>
        The literature includes various static and dynamic program analysis
techniques to extract behavioral models from traditional software systems.
Existing static analysis techniques [
        <xref ref-type="bibr" rid="ref18 ref19 ref20 ref22 ref7">19,18,22,7,20</xref>
        ], which analyze program source code,
typically fail to produce complete behavioral models in the presence of dynamic
SQL. They cannot capture the dynamic aspects of the program-database
interactions, which are influenced by context-dependent factors, user inputs and results of
preceding data accesses. Existing dynamic analysis techniques [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], which analyze
program executions, have been designed for purposes other than data
manipulation behavior extraction. Several authors have considered the analysis of SQL
execution traces in support of data reverse engineering, service identification or
performance monitoring [
        <xref ref-type="bibr" rid="ref11 ref12 ref23 ref8 ref9">8,9,12,11,23</xref>
        ]. Such techniques look very promising for
recovering an approximation of data-intensive application behavior.
      </p>
      <p>Author note: Marco Mori is a beneficiary of an FSR Incoming Post-doctoral Fellowship of the Académie
universitaire 'Louvain', co-funded by the Marie Curie Actions of the European Commission.</p>
      <p>In this paper, we propose a framework to recover the data manipulation
behavior of programs, starting from SQL execution traces. Our approach uses
clustering to group the SQL queries that implement the same high-level data
manipulation function, i.e., that are syntactically equal but with different input
or output values. We then adopt classical process mining techniques to recover
data manipulation processes. Our approach operates at the level of a feature, i.e.,
a software functionality as it can be perceived by the user. A feature corresponds
to a process enabling different instances, i.e., traces, each performing possibly
different interactions with a database.</p>
      <p>The remainder of this paper is organized as follows. Section 2 presents our approach along with a
tool-supported validation. Section 3 discusses related work, and Section
4 concludes the paper with possible future directions.</p>
      <p>Motivating Example. We consider an e-commerce web store selling
products worldwide. The system provides a set of features requiring
frequent and continuous interactions with the database by means of executing
SQL statements. For instance, the feature for retrieving products (view products)
accesses information about categories, manufacturers and detailed product
information. Which data are accessed at runtime depends on dynamic aspects
of the system. For example, given that a certain feature instance retrieves the
categories of products before accessing product information, we can derive that
it corresponds to a category-driven search. If a certain instance accesses
manufacturer information before product information, we analogously derive that
it corresponds to a manufacturer-driven search. By capturing and mining the
database interactions of multiple feature instances, it is possible to recover the
actual data manipulation behavior of the feature, e.g., a process model with a
variability point between two search criteria.</p>
    </sec>
    <sec id="sec-2">
      <title>Data Manipulation Behavior Recovery</title>
      <p>
        Our framework supports the extraction of the data manipulation behavior of
programs by exploiting several artifacts (see Fig. 1). We assume the existence
of a logical and possibly of a conceptual schema with a mapping between them.
The conceptual schema is a platform-independent specification of the application
domain concepts, their attributes and relationships. The logical schema contains
objects (tables, columns and foreign keys) implementing abstract concepts over
which queries are defined. The conceptual schema and the mapping to the logical
schema can be either available, or they can be obtained via database reverse
engineering techniques [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Queries defined over the logical schema materialize the
interactions occurring between multiple executions (traces) of a feature and the
underlying database. Once the source code related to a feature has been
identified [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], different techniques can capture SQL execution traces. Those
techniques, compared in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], range from using the DBMS log to sophisticated source
code transformation. Among others, the approaches presented in [
        <xref ref-type="bibr" rid="ref1 ref17">1,17</xref>
        ] recover
the link between SQL executions and source code locations through automated
program instrumentation, while [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] makes use of tracing aspects to capture SQL
execution traces without source code alteration. Once a sequence of queries is
captured, it is necessary to identify the different traces, each corresponding to a
feature instance. This problem has been tackled in the literature on specification
mining by analyzing value-based dependencies of method calls [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
Our approach is independent from the adopted trace capturing techniques. For
each feature, it requires as minimal input a set of execution traces, each trace
consisting of a sequence of SQL queries.
      </p>
      <p>Query parsing (1). We characterize SQL queries according to (1) the
information they recover or modify and (2) the related selection criteria. To this end,
for each query we record a set of data-oriented properties according to the query
type. For a select query we record a property with the select clause, while for
delete, update, replace or insert queries we record a property with the name of
the table. If the query is an update, replace or insert, we also record a
property with the set clause and all its attributes. Finally, for all query types except
insert, we add a property for each where condition along with its attributes. These
properties abstract from the actual values taken as input and
produced as output by each query. Figure 2 shows three SQL traces along with their
corresponding properties. For instance, query q1 is a select query over attribute
Password of table Customer (property p1) and it contains a where clause with
an equality condition over attribute Id (p2); query q3 is a select over attributes
Id and Price of Product (property p4), and it contains two where conditions, i.e., a
join between Product.Id and PCategory.Product_Id (p5) and an equality
condition over attribute PCategory.Category_Id (p6).</p>
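      <p>The property extraction described above can be illustrated with a small Python sketch. It is not the SQL parser used by the tool: it assumes single-statement select queries whose where clause is a conjunction of equality conditions, plus insert queries, and the function name extract_properties and the regular expressions are illustrative assumptions. The property strings follow the notation of Fig. 2.</p>
      <preformat><![CDATA[
```python
import re

# Assumption: a fully qualified column reference looks like Table.Column.
COLUMN = r"[A-Za-z_]\w*\.[A-Za-z_]\w*"


def extract_properties(query):
    """Return the value-independent properties of a simple SQL query."""
    q = query.strip().rstrip(";").strip()
    select = re.match(
        r"(?is)^SELECT\s+(.*?)\s+FROM\s+(.*?)(?:\s+WHERE\s+(.*))?$", q)
    insert = re.match(r"(?is)^INSERT\s+INTO\s+(\w+)", q)
    props = []
    if select:
        # One property for the select clause (the projected attributes).
        columns = [c.strip() for c in select.group(1).split(",")]
        props.append("SELECT " + " ".join(columns))
        where = select.group(3)
    elif insert:
        # For an insert, only the target table is recorded in this sketch.
        props.append("INSERT INTO " + insert.group(1))
        where = None
    else:
        raise ValueError("query type not covered by this sketch")
    if where:
        # One property per where condition; only '=' conditions are handled.
        for cond in re.split(r"(?i)\s+AND\s+", where):
            left, right = [side.strip() for side in cond.split("=", 1)]
            if re.fullmatch(COLUMN, right):
                props.append(f"{left}={right}")   # join between two columns
            else:
                props.append(f"{left}.EQ_VALUE")  # literal value is dropped
    return props


q1 = "SELECT Customer.Password FROM Customer WHERE Customer.Id = 'Mark27';"
q11 = "SELECT Customer.Password FROM Customer WHERE Customer.Id = 'JennyMa';"
print(extract_properties(q1))
# ['SELECT Customer.Password', 'Customer.Id.EQ_VALUE']
print(extract_properties(q1) == extract_properties(q11))  # True
```
]]></preformat>
      <p>Because literal values are dropped, q1 and q11 yield the same property list, which is what later allows them to be placed in the same cluster.</p>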
      <p>Query filtering (2). We remove from the input traces the queries that do not
express end-user concepts, i.e., the ones referring to database system tables or
log tables appearing only in the logical schema. In our example we remove q10
and q20, which access table Log, a table without a counterpart in the conceptual schema.</p>
      <p>Fig. 2: SQL traces and their data-oriented properties.</p>
      <preformat>Trace 1:
q1: SELECT Customer.Password FROM Customer WHERE Customer.Id = 'Mark27'; -&gt; [p1, p2]
q2: SELECT Category.Id, Category.Image FROM Category; -&gt; [p3]
q3: SELECT Product.Id, Product.Price FROM Product, PCategory WHERE Product.Id = PCategory.Product_Id AND PCategory.Category_Id = '1'; -&gt; [p4, p5, p6]
q4: SELECT PLang.Description FROM PLang, Language WHERE PLang.Language_Id = Language.Code AND PLang.Product_Id = '1A23' AND Language.Name = 'Italian'; -&gt; [p7, p8, p9, p10]
q5: SELECT SpecialProduct.NewPrice FROM SpecialProduct, Product WHERE SpecialProduct.Product_Id = Product.Id AND Product.Id = '1A23'; -&gt; [p11, p12, p13]
q6: SELECT Manufacturer.Name FROM Manufacturer, Product WHERE Manufacturer.Id = Product.Manufacturer_Id AND Product.Id = '1A23'; -&gt; [p14, p15, p13]
q7: SELECT PLang.Description FROM PLang, Language WHERE PLang.Language_Id = Language.Code AND PLang.Product_Id = '1F32' AND Language.Name = 'Italian'; -&gt; [p7, p8, p9, p10]
q8: SELECT SpecialProduct.NewPrice FROM SpecialProduct, Product WHERE SpecialProduct.Product_Id = Product.Id AND Product.Id = '1F32'; -&gt; [p11, p12, p13]
q9: SELECT Manufacturer.Name FROM Manufacturer, Product WHERE Manufacturer.Id = Product.Manufacturer_Id AND Product.Id = '1F32'; -&gt; [p14, p15, p13]
q10: INSERT INTO Log (IdEvent, Event, Date, Time) VALUES ('021', 'PrAcc1A23-1F32', '2013-02-22', '12:21:00'); -&gt; [p16]

Trace 2:
q11: SELECT Customer.Password FROM Customer WHERE Customer.Id = 'JennyMa'; -&gt; [p1, p2]
q12: SELECT Category.Id, Category.Image FROM Category; -&gt; [p3]
q13: SELECT Product.Id, Product.Price FROM Product, PCategory WHERE Product.Id = PCategory.Product_Id AND PCategory.Category_Id = '2'; -&gt; [p4, p5, p6]

Trace 3:
q14: SELECT Customer.Password FROM Customer WHERE Customer.Id = 'DanWer'; -&gt; [p1, p2]
q15: SELECT Manufacturer.Id, Manufacturer.Name FROM Manufacturer -&gt; [p17]
q16: SELECT Product.Id, Product.Price FROM Product WHERE Product.Manufacturer_Id = 'AppleNamur01' -&gt; [p4, p18]
q17: SELECT PLang.Description FROM PLang, Language WHERE PLang.Language_Id = Language.Code AND PLang.Product_Id = '2D11' AND Language.Name = 'Italian'; -&gt; [p7, p8, p9, p10]
q18: SELECT SpecialProduct.NewPrice FROM SpecialProduct, Product WHERE SpecialProduct.Product_Id = Product.Id AND Product.Id = '2D11'; -&gt; [p11, p12, p13]
q19: SELECT Manufacturer.Name FROM Manufacturer, Product WHERE Manufacturer.Id = Product.Manufacturer_Id AND Product.Id = '2D11'; -&gt; [p14, p15, p13]
q20: INSERT INTO Log (IdEvent, Event, Date, Time) VALUES ('022', 'PrAcc2D11', '2013-02-28', '14:00:03'); -&gt; [p16]

SQL-statement properties:
p1 = "SELECT Customer.Password"
p2 = "Customer.Id.EQ_VALUE"
p3 = "SELECT Category.Id Category.Image"
p4 = "SELECT Product.Id Product.Price"
p5 = "Product.Id=PCategory.Product_Id"
p6 = "PCategory.Category_Id.EQ_VALUE"
p7 = "SELECT PLang.Description"
p8 = "PLang.Language_Id=Language.Code"
p9 = "PLang.Product_Id.EQ_VALUE"
p10 = "Language.Name.EQ_VALUE"
p11 = "SELECT SpecialProduct.NewPrice"
p12 = "SpecialProduct.Product_Id=Product.Id"
p13 = "Product.Id.EQ_VALUE"
p14 = "SELECT Manufacturer.Name"
p15 = "Product.Manufacturer_Id=Manufacturer.Id"
p16 = "INSERT INTO Log"
p17 = "SELECT Manufacturer.Id Manufacturer.Name"
p18 = "Product.Manufacturer_Id.EQ_VALUE"</preformat>
      <p>Query clustering (3). We cluster queries having the same data-oriented
properties, thus producing disjoint partitions related to different database accesses.
Table 1 reports the clusters obtained from the queries in Fig. 2.</p>
      <p>
        Cluster labeling (4). We identify the data manipulation function implemented
by each cluster by analyzing the conceptual schema fragment corresponding to
the logical subschema accessed by the cluster queries. For determining the labels
we adopt the naming convention proposed in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] to associate conceptual-level
operations to SQL query code. In addition, we associate the label with
a set of input/output (I/O) parameters (see Table 2). Input parameters are
the attributes involved in equality or inequality conditions that appear in the
data-oriented properties of the queries, while output parameters are the
attributes appearing within the select query property.
      </p>
      <p>Process mining (5-6). We generate a process starting from a set of SQL
traces of a single feature. The trace abstraction phase replaces SQL traces with
the corresponding traces of data manipulation functions. The process extraction
phase exploits a process mining algorithm to extract the feature behavior as a
sequence of function executions with sequential, parallel and choice operators.
In the following we show how to recover the data manipulation behavior of the
view products web-store feature starting from the traces of data manipulation
functions in Table 3 (corresponding to the queries in Fig. 2).</p>
      <p>Trace 1 gets customer information (C1), then performs a category-driven search of
products by getting all the product categories (C2) and all the products
of a selected category (C3). For each retrieved product, three functions
are iterated: C4 retrieves the product description, C5 extracts special product
information and C6 extracts related manufacturer information. Trace 2 differs
from Trace 1 because after function C3 no products are retrieved and the process
ends. If we apply a mining algorithm to Traces 1 and 2, we obtain a process (Fig.
3(a)) which consecutively performs functions C1, C2 and C3 before entering
the loop iterating C4, C5 and C6. The process ends after zero, one or more
iterations of the loop. Let us now also include Trace 3,
which is equal to Trace 1 except that it searches products based on their
manufacturer (functions C7 and C8) instead of searching by category (C2 and C3).
If we mine the process model by considering all the traces as input (Fig. 3(b)),
we end up with a new alternative branch: the customer can now perform either
a manufacturer-driven search or a category-driven search.
      </p>
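      <p>The filtering, clustering and trace abstraction steps can be sketched in a few lines of Python using the property identifiers of Fig. 2. This is only an illustration under stated assumptions: the excluded property set stands in for the check against the conceptual schema, cluster names are assigned in order of first appearance (which happens to reproduce C1 to C8 as used in the text), and the directly-follows relation computed at the end is a common starting point of process discovery algorithms, not the miner applied through ProM. All function and constant names are hypothetical.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict

# Property sets of the queries in Fig. 2, one tuple per query.
LOGIN, CATS, BY_CAT = ("p1", "p2"), ("p3",), ("p4", "p5", "p6")
DESC, SPECIAL = ("p7", "p8", "p9", "p10"), ("p11", "p12", "p13")
MANUF, LOG = ("p14", "p15", "p13"), ("p16",)
MANUFS, BY_MANUF = ("p17",), ("p4", "p18")

TRACES = [
    [LOGIN, CATS, BY_CAT, DESC, SPECIAL, MANUF,
     DESC, SPECIAL, MANUF, LOG],                           # q1..q10
    [LOGIN, CATS, BY_CAT],                                 # q11..q13
    [LOGIN, MANUFS, BY_MANUF, DESC, SPECIAL, MANUF, LOG],  # q14..q20
]

# Assumption: p16 marks queries on table Log, which has no conceptual
# counterpart; a real filter would consult the schema mapping instead.
EXCLUDED = {"p16"}


def filter_trace(trace):
    """Step 2: drop queries that touch excluded (non-conceptual) tables."""
    return [props for props in trace if EXCLUDED.isdisjoint(props)]


def label_clusters(traces):
    """Step 3: queries with identical property sets share one cluster."""
    labels = {}
    for trace in traces:
        for props in trace:
            labels.setdefault(frozenset(props), f"C{len(labels) + 1}")
    return labels


def directly_follows(traces):
    """Map each function to the functions observed right after it."""
    follows = defaultdict(set)
    for trace in traces:
        for current, nxt in zip(trace, trace[1:]):
            follows[current].add(nxt)
    return follows


kept = [filter_trace(t) for t in TRACES]
labels = label_clusters(kept)
# Step 5: replace every query by the name of its cluster.
abstracted = [[labels[frozenset(p)] for p in t] for t in kept]
follows = directly_follows(abstracted)

for trace in abstracted:
    print(" ".join(trace))
# C1 C2 C3 C4 C5 C6 C4 C5 C6
# C1 C2 C3
# C1 C7 C8 C4 C5 C6
print(sorted(follows["C1"]))  # ['C2', 'C7']
print(sorted(follows["C6"]))  # ['C4']
```
]]></preformat>
      <p>The two successors of C1 correspond to the choice between category-driven and manufacturer-driven search, and the pair (C6, C4) corresponds to the loop over retrieved products.</p>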
      <p>
        [Fig. 3: (a) process mined from Traces 1 and 2; (b) process mined from all three traces.]
Tool support. The presented approach is implemented in an integrated tool
which takes as input a set of SQL traces (each representing an instance of the
same feature), the logical schema and optionally the conceptual schema and the
conceptual-to-logical schema mapping. A SQL parser extracts the data-oriented
properties while a clustering component exploits the colibri-Java Formal
Concept Analysis tool<sup>1</sup> to cluster queries according to those properties. A labeling
component generates data manipulation functions (i.e., cluster signatures) while
a trace abstraction component uses a Java library<sup>2</sup> to create standardized event
logs. Finally, we rely on the de facto standard process mining tool (ProM<sup>3</sup>)
to create a Petri net from standardized event logs. ProM supports different
process mining algorithms providing different trade-offs between completeness and
noise [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to be chosen according to specific application needs.
      </p>
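      <p>As an illustration of what a standardized event log of abstracted traces may look like, the following Python sketch serializes traces of data manipulation functions into a minimal XES-style XML document. It is a stand-in built on the Python standard library, not the Java library used by the tool. The function name traces_to_xes and the trace names are assumptions, and a complete XES log would normally also declare its extensions and usually carry timestamps.</p>
      <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET


def traces_to_xes(traces):
    """Serialize traces of function names as a minimal XES-style log."""
    log = ET.Element("log", {"xes.version": "1.0"})
    for number, trace in enumerate(traces, start=1):
        trace_el = ET.SubElement(log, "trace")
        # The trace name is an illustrative choice.
        ET.SubElement(trace_el, "string",
                      {"key": "concept:name", "value": f"trace{number}"})
        for function in trace:
            # One event per executed data manipulation function.
            event_el = ET.SubElement(trace_el, "event")
            ET.SubElement(event_el, "string",
                          {"key": "concept:name", "value": function})
    return ET.tostring(log, encoding="unicode")


xes = traces_to_xes([
    ["C1", "C2", "C3"],
    ["C1", "C7", "C8", "C4", "C5", "C6"],
])
print(xes)
```
]]></preformat>
      <p>Each trace element corresponds to one feature instance and each event to one data manipulation function, so the ordering of events within a trace is the only information a discovery algorithm receives in this sketch.</p>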
      <p>
        We applied our tool together with ProM and the ILP miner algorithm [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]
(complete models with low noise) to extract data-oriented processes of an
e-restaurant web application, and we conducted a set of preliminary experiments to
assess the sensitivity of our technique in producing correct processes depending
on the trace log coverage. The tool supported the identification of correct
feature processes in a semi-automatic manner, with the help of the designer.
A complete list of SQL statements grouped by feature with different traces,
extracted data manipulation functions and corresponding processes is publicly
available at the companion website<sup>4</sup>. The conceptual and logical schemas,
accessible through the DB-MAIN<sup>5</sup> CASE tool, are also provided.
      </p>
      <p>Footnotes: (1) http://code.google.com/p/colibri-java/; (2) http://www.xes-standard.org/openxes/start; (3) http://www.promtools.org/; (4) http://info.fundp.ac.be/~mmo/MiningSQLTraces; (5) DB-MAIN official website, http://www.db-main.be</p>
    </sec>
    <sec id="sec-3">
      <title>Related Work</title>
      <p>
        In the literature, different approaches use dynamic analysis of SQL queries with a
goal other than data manipulation behavior understanding. The approaches
presented in [
        <xref ref-type="bibr" rid="ref8 ref9">8,9</xref>
        ] analyze SQL statements in support of database reverse
engineering, e.g., detecting implicit schema constructs [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and implicit foreign keys
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The approach presented by Di Penta et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] identifies services from SQL
traces. The authors apply FCA techniques to name service I/O parameters, thus
supporting the migration towards Service Oriented Architecture. Debusmann and
Geihs [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] present a dynamic analysis method for system performance monitoring,
i.e., measuring the response time of queries sent to a remote database server.
Yang et al. [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] support the recovery of a feature model by means of
analyzing SQL traces. Although these approaches analyze (particular aspects of)
the data access behavior of running programs, none of them
[
        <xref ref-type="bibr" rid="ref11 ref12 ref23 ref8 ref9">8,9,12,11,23</xref>
        ] is able to produce process models expressing such a behavior at a
high abstraction level, as we do in this paper.
      </p>
      <p>
        Other approaches (e.g., [
        <xref ref-type="bibr" rid="ref15 ref16">16,15</xref>
        ]) extract business processes by
exploiting/combining static and dynamic analysis techniques, but they are not designed to deal
with dynamically generated SQL queries. The most related approach, by Alalfi
et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], extracts scenario diagrams and UML security models by considering
runtime database interactions and the state of the PHP program. These models
are used for verifying security properties, but they do not describe the generic
data manipulation behavior of the program; they only analyze web-interface
interactions. In addition, they do not consider different possible instances of a
given scenario, which we claim is necessary to extract a complete and meaningful
model. Understanding processes starting from a set of execution traces is at the
core of process mining. This paper does not make any additional contribution
as far as process mining is concerned, but it is the first to apply such techniques
to analyze program-database interactions.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion and future work</title>
      <p>This paper presented a tool-supported approach to recover the data
manipulation behavior of data-intensive systems. The approach makes use of clustering,
conceptualization and process mining techniques starting from SQL execution
traces captured at runtime. The approach is independent of the type of
systems considered, provided that a query interception phase is possible. It could,
for instance, be applied to legacy COBOL systems, Java systems with or without
Object-Relational Mapping technologies, or web applications written in PHP.
As future work, we plan to enrich the input traces with multiple sources of
information such as user input, source code and query results, with the aim of
identifying the conditions that characterize decision points within process models.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>M.</given-names>
            <surname>Alalfi</surname>
          </string-name>
          , J. Cordy, and
          <string-name>
            <given-names>T.</given-names>
            <surname>Dean</surname>
          </string-name>
          . WAFA:
          <article-title>Fine-grained dynamic analysis of web applications</article-title>
          .
          <source>In WSE 2009</source>
          , pages
          <fpage>41</fpage>
          –
          <fpage>50</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Alalfi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Cordy</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <article-title>Recovering role-based access control security models from dynamic web applications</article-title>
          .
          <source>In ICWE</source>
          , pages
          <volume>121</volume>
          –
          <fpage>136</fpage>
          .
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>G.</given-names>
            <surname>Ammons</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bodík</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Larus</surname>
          </string-name>
          .
          <article-title>Mining specifications</article-title>
          .
          <source>In ACM Sigplan Notices</source>
          , volume
          <volume>37</volume>
          , pages
          <fpage>4</fpage>
          –
          <fpage>16</fpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>J.</given-names>
            <surname>Buijs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Dongen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Aalst</surname>
          </string-name>
          .
          <article-title>On the role of fitness, precision, generalization and simplicity in process discovery</article-title>
          .
          <source>In OTM</source>
          , volume
          <volume>7565</volume>
          <source>of LNCS</source>
          , pages
          <volume>305</volume>
          –
          <fpage>322</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>A.</given-names>
            <surname>Cleve</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-F.</given-names>
            <surname>Brogneaux</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.-L.</given-names>
            <surname>Hainaut</surname>
          </string-name>
          .
          <article-title>A conceptual approach to database applications evolution</article-title>
          .
          <source>In ER</source>
          , pages
          <volume>132</volume>
          –
          <fpage>145</fpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>A.</given-names>
            <surname>Cleve</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.-L.</given-names>
            <surname>Hainaut</surname>
          </string-name>
          .
          <article-title>Dynamic analysis of SQL statements for data-intensive applications reverse engineering</article-title>
          .
          <source>In WCRE 2008</source>
          , pages
          <fpage>192</fpage>
          –
          <fpage>196</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>A.</given-names>
            <surname>Cleve</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Henrard</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.-L.</given-names>
            <surname>Hainaut</surname>
          </string-name>
          .
          <article-title>Data reverse engineering using system dependency graphs</article-title>
          .
          <source>In WCRE 2006</source>
          , pages
          <fpage>157</fpage>
          –
          <fpage>166</fpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>A.</given-names>
            <surname>Cleve</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-R.</given-names>
            <surname>Meurisse</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.-L.</given-names>
            <surname>Hainaut</surname>
          </string-name>
          .
          <article-title>Database semantics recovery through analysis of dynamic SQL statements</article-title>
          .
          <source>J. Data Semantics</source>
          ,
          <volume>15</volume>
          :
          <fpage>130</fpage>
          –
          <fpage>157</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>A.</given-names>
            <surname>Cleve</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Noughi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.-L.</given-names>
            <surname>Hainaut</surname>
          </string-name>
          .
          <article-title>Dynamic program analysis for database reverse engineering</article-title>
          .
          <source>In GTTSE</source>
          , pages
          <volume>297</volume>
          –
          <fpage>321</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>B.</given-names>
            <surname>Cornelissen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zaidman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>van Deursen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Moonen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Koschke</surname>
          </string-name>
          .
          <article-title>A systematic survey of program comprehension through dynamic analysis</article-title>
          .
          <source>IEEE Trans. Software Eng.</source>
          ,
          <volume>35</volume>
          (
          <issue>5</issue>
          ):
          <volume>684</volume>
          –
          <fpage>702</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>M.</given-names>
            <surname>Debusmann</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Geihs</surname>
          </string-name>
          .
          <article-title>Efficient and transparent instrumentation of application components using an aspect-oriented approach</article-title>
          .
          <source>In DSOM</source>
          <year>2003</year>
          , volume
          <volume>2867</volume>
          <source>of LNCS</source>
          , pages
          <volume>227</volume>
          –
          <fpage>240</fpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Grosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Penta</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I. G. R.</given-names>
            <surname>de Guzman</surname>
          </string-name>
          .
          <article-title>An approach for mining services in database oriented applications</article-title>
          .
          In
          <source>CSMR</source>
          , pages
          <fpage>287</fpage>
          –
          <lpage>296</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>J.-L.</given-names>
            <surname>Hainaut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Henrard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Englebert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roland</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Hick</surname>
          </string-name>
          .
          <article-title>Database reverse engineering</article-title>
          .
          In
          <source>Encyclopedia of Database Systems</source>
          , pages
          <fpage>723</fpage>
          –
          <lpage>728</lpage>
          .
          <publisher-name>Springer US</publisher-name>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>H.</given-names>
            <surname>Kazato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hayashi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kobayashi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Oshima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Okada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Miyata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hoshino</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Saeki</surname>
          </string-name>
          .
          <article-title>Incremental feature location and identification in source code</article-title>
          .
          In
          <source>CSMR</source>
          , pages
          <fpage>371</fpage>
          –
          <lpage>374</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Labiche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kolbah</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Mehrfard</surname>
          </string-name>
          .
          <article-title>Combining static and dynamic analyses to reverse-engineer scenario diagrams</article-title>
          .
          In
          <source>ICSM</source>
          , pages
          <fpage>130</fpage>
          –
          <lpage>139</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>H. R. M.</given-names>
            <surname>Nezhad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Saint-Paul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Casati</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Benatallah</surname>
          </string-name>
          .
          <article-title>Event correlation for process discovery from web service interaction logs</article-title>
          .
          <source>VLDB</source>
          ,
          <volume>20</volume>
          (
          <issue>3</issue>
          ):
          <fpage>417</fpage>
          –
          <lpage>444</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>M. N.</given-names>
            <surname>Ngo</surname>
          </string-name>
          and
          <string-name>
            <given-names>H. B. K.</given-names>
            <surname>Tan</surname>
          </string-name>
          .
          <article-title>Applying static analysis for automated extraction of database interactions in web applications</article-title>
          .
          <source>Information and Software Technology</source>
          ,
          <volume>50</volume>
          (
          <issue>3</issue>
          ):
          <fpage>160</fpage>
          –
          <lpage>175</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Petit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kouloumdjian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-F.</given-names>
            <surname>Boulicaut</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Toumani</surname>
          </string-name>
          .
          <article-title>Using queries to improve database reverse engineering</article-title>
          .
          In
          <source>ER</source>
          , pages
          <fpage>369</fpage>
          –
          <lpage>386</lpage>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Campos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Saraiva</surname>
          </string-name>
          .
          <article-title>GUI inspection from source code analysis</article-title>
          .
          <source>ECEASST</source>
          ,
          <volume>33</volume>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
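          20.
          <string-name>
            <given-names>H.</given-names>
            <surname>van den Brink</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>van der Leek</surname>
          </string-name>
          , and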
          <string-name>
            <given-names>J.</given-names>
            <surname>Visser</surname>
          </string-name>
          .
          <article-title>Quality assessment for embedded SQL</article-title>
          .
          In
          <source>SCAM</source>
          , pages
          <fpage>163</fpage>
          –
          <lpage>170</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>J. M. E.</given-names>
            <surname>van der Werf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. F.</given-names>
            <surname>van Dongen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Hurkens</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Serebrenik</surname>
          </string-name>
          .
          <article-title>Process discovery using integer linear programming</article-title>
          .
          <source>Fundamenta Informaticae</source>
          ,
          <volume>94</volume>
          (
          <issue>3</issue>
          ):
          <fpage>387</fpage>
          –
          <lpage>412</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>D.</given-names>
            <surname>Willmor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Embury</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Shao</surname>
          </string-name>
          .
          <article-title>Program slicing in the presence of a database state</article-title>
          .
          In
          <source>ICSM 2004</source>
          , pages
          <fpage>448</fpage>
          –
          <lpage>452</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Peng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <article-title>Domain feature model recovery from multiple applications using data access semantics and formal concept analysis</article-title>
          .
          In
          <source>WCRE</source>
          , pages
          <fpage>215</fpage>
          –
          <lpage>224</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>