<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Entity Resolution and Data Fusion: an integrated approach (DISCUSSION PAPER)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Domenico Beneventano</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sonia Bergamaschi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luca Gagliardelli</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Simonini</string-name>
          <email>giovanni@csail.mit.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>MIT CSAIL</institution>
          ,
          <addr-line>Cambridge, MA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Modena and Reggio Emilia</institution>
          ,
          <addr-line>Modena</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>1</fpage>
      <lpage>1</lpage>
      <abstract>
        <p>Entity Resolution and Data Fusion are fundamental tasks in a Data Integration process. Unfortunately, these tasks cannot be completely addressed by purely automated methods; a "human-in-the-loop" approach, i.e., interaction with the Integration Designer, therefore has to be considered. In fact, taking the application goal into account can reduce the complexity and the cost of the whole integration process. Moreover, the Entity Resolution and Data Fusion tasks are often considered consecutive and independent of each other: the output of the first step is used as input of the second one. In this paper, we show that these tasks should not be considered independent. In fact, the evaluation of data fusion results is fundamental for the Integration Designer to analyze, and possibly modify, the choices made during the Entity Resolution process. To show this, our highly scalable Entity Resolution tool, SparkER, will be extended with post-processing methods for high-quality matching. These methods will be integrated in the MOMIS Data Fusion system, extended as well with metrics for the evaluation of data fusion results.</p>
      </abstract>
      <kwd-group>
        <kwd>Data Integration</kwd>
        <kwd>Entity Resolution</kwd>
        <kwd>Data Fusion</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Data Integration is the problem of combining data residing at different
autonomous sources, and providing the user with a unified view of these data.
MOMIS (Mediator EnvirOnment for Multiple Information Sources) is an open
source Data Integration System [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], characterized by a classical wrapper/mediator
architecture, where the local data sources contain the real data, while a Global
Virtual Schema (GVS) provides a reconciled, integrated, and virtual view of the
underlying data sources. In particular, MOMIS performs Data Fusion, i.e., the
process of fusing multiple records representing the same real-world object into
a single, consistent, and clean representation; to perform data fusion, several
conflict handling strategies introduced in [7] are available in the MOMIS system
(see Section 2.2). As described in several papers, MOMIS adopts a semi-automatic
approach that retains the "human in the loop", where algorithms and tools are
used to assist the Integration Designer in performing the Data Fusion task [
        <xref ref-type="bibr" rid="ref4">4, 17</xref>
        ].
      </p>
      <p>Copyright © 2019 for the individual papers by the papers' authors. Copying
permitted for private and academic purposes. This volume is published and copyrighted by
its editors. SEBD 2019, June 16-19, 2019, Castiglione della Pescaia, Italy.</p>
      <p>
        In the current MOMIS version, the Data Fusion process assumes that Entity
Resolution (ER) has already been performed and thus a shared object
identifier (ID) exists among different sources; in other words, the current version of
MOMIS only implements an exact match. Multiple records with the same ID are
fused by means of the Full Join Merge operator [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        In 2016, we addressed the Entity Resolution problem by developing
a set of novel techniques [
        <xref ref-type="bibr" rid="ref1 ref5">1, 5, 12, 13, 15, 16</xref>
        ]. In particular, we proposed Blast
[13] (Blocking with Loosely-Aware Schema Techniques), an approach that reduces
the ER complexity with indexing techniques that group similar records in
blocks and limit the comparison to only those records appearing in the same
block. This approach was implemented in SparkER [9, 14], a highly scalable
Entity Resolution tool designed to be parallelizable on Apache Spark. As
highlighted in a very recent demo of SparkER [10], the Entity Resolution process can
be improved by including the "human-in-the-loop".
      </p>
      <p>On the other hand, in Data Integration systems the Entity Resolution and Data Fusion steps are often
considered consecutive and independent of each other: the output of the first step
is used as input of the second one. In this paper we
show that these tasks should not be considered independent. In fact, the
evaluation of data fusion results is fundamental for the Integration Designer to
analyze, and possibly modify, the choices made during the Entity Resolution
process.</p>
      <p>In detail, the main contributions of this paper are the following:
        <list list-type="bullet">
          <list-item><p>SparkER will be extended with post-processing methods to obtain
one-to-one matching; this extended SparkER will be integrated in the MOMIS
framework: its output will be used as input to the MOMIS Data Fusion
system;</p></list-item>
          <list-item><p>MomisDF, the module of the MOMIS system which performs Data Fusion, will
be extended as well with methods for the evaluation of data fusion results;</p></list-item>
          <list-item><p>We will show, by an example, how the evaluation of data fusion results can
be used by the Integration Designer.</p></list-item>
        </list>
      </p>
    </sec>
    <sec id="sec-9">
      <p>The complete SparkER-MomisDF workflow is shown in Figure 1. SparkER
is composed of two main modules: (i) the blocker takes the input records and
performs the blocking phase, providing as output the candidate pairs; (ii) the
entity matcher takes the candidate pairs generated by the blocker and labels
them as match or no match by comparing each pair's similarity with a threshold,
thus producing a Match Table of similar records with their similarity scores. The
1-1 Matching module takes this Match Table as input, which may contain
many-to-many matching, and returns a 1-1 Match Table with only one-to-one
matching. MomisDF performs data fusion of the input records on the basis of
the 1-1 Match Table.</p>
      <p>[Figure 1: The SparkER-MomisDF workflow. Labels in the figure: Input Records; Records loading; SparkER (Blocker, Entity matcher); 1-1 Matching; MOMIS Data Fusion; Data Fusion results; Data Fusion Evaluation; User interaction.]</p>
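      <p>The two SparkER modules described above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not SparkER's implementation: plain token blocking and Jaccard similarity stand in for Blast's blocking and for the matcher's similarity function, and the toy records, the function names and the 0.5 threshold are all hypothetical.</p>
      <preformat>
```python
from itertools import product


def tokens(record):
    """Lower-cased word tokens of all non-null attribute values."""
    return {tok
            for value in record.values() if value is not None
            for tok in str(value).lower().split()}


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a.union(b)
    return len(a.intersection(b)) / len(union) if union else 0.0


def blocker(left, right):
    """Token blocking (assumption): a pair is a candidate if it shares a token."""
    return [(l_id, r_id)
            for (l_id, l_rec), (r_id, r_rec) in product(left.items(), right.items())
            if tokens(l_rec).intersection(tokens(r_rec))]


def entity_matcher(left, right, candidates, threshold):
    """Label candidates as matches when their similarity reaches the threshold."""
    match_table = []
    for l_id, r_id in candidates:
        sim = jaccard(tokens(left[l_id]), tokens(right[r_id]))
        if sim >= threshold:
            match_table.append((l_id, r_id, round(sim, 2)))
    return match_table


# Hypothetical toy records, not taken from the paper.
LEFT = {"L1": {"name": "john smith", "city": "boston"},
        "L2": {"name": "mary jones", "city": "austin"}}
RIGHT = {"R1": {"name": "john smith", "city": None},
         "R2": {"name": "jon smith", "city": "boston"},
         "R3": {"name": "ann lee", "city": "denver"}}

candidates = blocker(LEFT, RIGHT)
match_table = entity_matcher(LEFT, RIGHT, candidates, threshold=0.5)
print(candidates)
print(match_table)
```
      </preformat>
      <p>In this toy run L1 is matched to both R1 and R2, which is exactly the kind of many-to-many Match Table that the 1-1 Matching module is meant to clean.</p>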
      <p>The structure of the paper is the following. Section 2 contains some
preliminaries from the literature which we use in our paper. Section 3 shows, by means of
an example covering the whole SparkER-MomisDF workflow, how the evaluation
of fusion results can be useful for the Integration Designer.</p>
      <sec id="sec-9-1">
        <title>Preliminaries</title>
        <p>This section contains some preliminaries from the literature which we use in our
paper. Section 2.1 introduces the post-processing methods for one-to-one matching
proposed in [8]; Section 2.2 introduces the methods for the evaluation of data
fusion results proposed in [7].</p>
        <p><bold>2.1 Post-processing Methods for One-to-one Matching</bold></p>
        <p>As stated in [8], assuming de-duplicated sources, each record can match
at most one record of another source; hence, the matching result should exclusively
contain one-to-one links, as otherwise precision deteriorates. In other words,
in the most common two-source case, it is often desirable for the final matching
to be one-to-one [18]. On the other hand, most ER methods, and in particular
the ones using threshold-based techniques, such as SparkER, often produce
multi-links, i.e., one record is matched to many records of another source. For this
reason, the authors of [8] proposed methods that can be executed after any
entity resolution process to clean multi-links, i.e., to transform the result such
that only one-to-one matching occurs in the final result. The following three
post-processing strategies are analyzed and implemented in [8]:
          <list list-type="bullet">
            <list-item><p>Symmetric Best Match (Max1-both): the basic idea is that for every record
only the best matching record of the other source is accepted, so a link is kept only if it is the best match for both of its records.</p></list-item>
            <list-item><p>Maximum Weight Matching (MWM): an MWM is a matching that has
maximum weight, i.e., that maximizes the sum of the overall similarities between
records in the final linkage result.</p></list-item>
            <list-item><p>Stable Marriage (SM): a matching is defined as stable if there are no two
records of the different local classes that both have a higher similarity to
each other than to their current matching record.</p></list-item>
          </list>
        </p>
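        <p>The three strategies above can be sketched in a few lines of Python. This is a hedged illustration, not the implementation of [8]: the function names are hypothetical, Stable Marriage is obtained here with a greedy pass over the pairs in descending similarity (one way to reach a stable matching when similarities are symmetric), and Maximum Weight Matching is found by exhaustive search, which is only viable for toy inputs. The similarities are those of the Match Table used in the example of Section 3.</p>
        <preformat>
```python
# Match Table of the running example: (left id, right id) -> similarity.
MATCH_TABLE = {("L1", "R1"): 0.91, ("L1", "R2"): 0.86,
               ("L2", "R1"): 0.81, ("L2", "R3"): 0.69}


def max1_both(table):
    """Symmetric Best Match: keep a pair only if it is the best for both records."""
    best_for_left, best_for_right = {}, {}
    for (l, r), sim in table.items():
        if sim > best_for_left.get(l, (None, -1.0))[1]:
            best_for_left[l] = (r, sim)
        if sim > best_for_right.get(r, (None, -1.0))[1]:
            best_for_right[r] = (l, sim)
    return {(l, r) for l, (r, _) in best_for_left.items()
            if best_for_right[r][0] == l}


def stable_marriage(table):
    """Greedy construction (assumption): accept pairs by descending similarity."""
    matched_left, matched_right, result = set(), set(), set()
    for (l, r), _ in sorted(table.items(), key=lambda item: -item[1]):
        if l not in matched_left and r not in matched_right:
            result.add((l, r))
            matched_left.add(l)
            matched_right.add(r)
    return result


def maximum_weight_matching(table):
    """Exhaustive search over all one-to-one subsets (toy inputs only)."""
    pairs = list(table)
    best, best_weight = set(), 0.0

    def extend(index, chosen, weight):
        nonlocal best, best_weight
        if weight > best_weight:
            best, best_weight = set(chosen), weight
        for i in range(index, len(pairs)):
            l, r = pairs[i]
            if all(l != cl and r != cr for cl, cr in chosen):
                extend(i + 1, chosen + [pairs[i]], weight + table[pairs[i]])

    extend(0, [], 0.0)
    return best


print(max1_both(MATCH_TABLE))
print(stable_marriage(MATCH_TABLE))
print(maximum_weight_matching(MATCH_TABLE))
```
        </preformat>
        <p>On this Match Table the sketch keeps only (L1, R1) for Max1-both, (L1, R1) and (L2, R3) for SM, and (L1, R2) and (L2, R1) for MWM, whose total similarity 1.67 exceeds the 1.60 of the SM result.</p>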
        <p>In [8] a complete evaluation of the different post-processing methods is performed
by using both synthetic and real datasets. The linkage quality is assessed by
recall and precision: recall measures the proportion of true matches that have
been correctly classified as matches after the linkage process; precision is defined
as the fraction of classified matches that are true matches. The aim of
post-processing is to optimize precision while recall is ideally preserved. The result
of the evaluation performed in [8] was that both Max1-both and SM are able to
significantly improve the linkage quality; in general, Max1-both can achieve the
best linkage quality in terms of precision; for applications favoring recall over
precision, SM should be applied.</p>
        <p><bold>2.2 Data Fusion Evaluation</bold></p>
        <p>To perform data fusion, several conflict handling strategies defined in [7] are
available in the MOMIS system; in particular, the following strategies:
          <list list-type="simple">
            <list-item><label>S1</label><p>Take the information: prefers values over null values;</p></list-item>
            <list-item><label>S2</label><p>Consider all possibilities: creates all possible value combinations.</p></list-item>
          </list>
These strategies are applied by default, i.e., before the intervention of the
Designer, who can also apply Resolution Strategies by choosing among the Conflict
Resolution Functions implemented in MomisDF, such as taking an average value
or taking the most recent value. The authors of [7] also propose methods for
the evaluation of data fusion based on measures of the quality of source data, such
as completeness and consistency; for example, the (extensional) completeness is
          <disp-formula><tex-math>\mathrm{Completeness} = \frac{|\text{unique objects in dataset}|}{|\text{unique objects in universe}|}</tex-math></disp-formula>
        </p>
        <p>Instead of the quality of data sources, we want to evaluate the quality of fused data;
we therefore reformulate such measures in terms of the Global Class (GC) resulting
from the data fusion process (see the next section for an example), by considering
the following Data Centric Evaluation Measures<fn id="fn3"><label>3</label><p>Such measures are also introduced in [6]. In our evaluation, the number of objects in
the universe is identified with the number of unique entities in the fused data source,
and then with the records of the Global Class GC.</p></fn>:
          <list list-type="bullet">
            <list-item><p>Density: measures the fraction of non-NULL values.
The density of a Global Attribute GA in GC is defined as
              <disp-formula><tex-math>\mathrm{Density}_{GA} = \frac{|\text{non-NULL values in } GA|}{|\text{records in } GC|}</tex-math></disp-formula>
The density of the whole global class GC is defined as
              <disp-formula><tex-math>\mathrm{Density}_{GC} = \frac{|\text{non-NULL values in } GC|}{|\text{attributes in } GC| \cdot |\text{records in } GC|}</tex-math></disp-formula>
            </p></list-item>
            <list-item><p>Consistency: a data set is consistent if it is free of conflicting information.
The consistency of a Global Attribute GA in GC is defined as
              <disp-formula><tex-math>\mathrm{Consistency}_{GA} = \frac{|\text{non-conflicting values in } GA|}{|\text{records in } GC|}</tex-math></disp-formula>
The consistency of the whole global class GC is defined as
              <disp-formula><tex-math>\mathrm{Consistency}_{GC} = \frac{|\text{non-conflicting values in } GC|}{|\text{attributes in } GC| \cdot |\text{records in } GC|}</tex-math></disp-formula>
            </p></list-item>
          </list>
        </p>
        <p>In a similar way, in [11] the concept of F-quality is introduced as a measure
of the quality of fused data rather than source data, and two novel algorithms for
Linked Data fusion with provenance tracking and quality assessment of fused
data are proposed.</p>
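        <p>As a worked illustration of the density and consistency measures, the following Python sketch evaluates them on the Max1-both Global Class of the example in Section 3 (Figure 2(c)). The representation is an assumption made for this sketch: NULL is modelled as None, a conflicting value left by strategy S2 is modelled as a Python set with more than one element, and NULL values are counted as non-conflicting, which is what the figures reported in Figure 2(d) imply. All function names are hypothetical.</p>
        <preformat>
```python
from fractions import Fraction


def is_conflict(value):
    """A value is conflicting if S2 left more than one alternative (assumption)."""
    return isinstance(value, (set, frozenset)) and len(value) > 1


def column_density(rows, attr):
    """Fraction of records whose value for attr is not NULL."""
    return Fraction(sum(row[attr] is not None for row in rows), len(rows))


def column_consistency(rows, attr):
    """Fraction of records whose value for attr is not conflicting."""
    return Fraction(sum(not is_conflict(row[attr]) for row in rows), len(rows))


def table_density(rows, attrs):
    """Non-NULL values over (number of attributes times number of records)."""
    filled = sum(row[a] is not None for row in rows for a in attrs)
    return Fraction(filled, len(attrs) * len(rows))


def table_consistency(rows, attrs):
    """Non-conflicting values over (number of attributes times number of records)."""
    clean = sum(not is_conflict(row[a]) for row in rows for a in attrs)
    return Fraction(clean, len(attrs) * len(rows))


# Max1-both Global Class of the running example (Figure 2(c)).
ATTRS = ["CITY", "AGE", "SALARY"]
MAX1_BOTH_GC = [
    {"PROV": "L1-R1", "CITY": "Canyon", "AGE": {29, 28}, "SALARY": None},
    {"PROV": "L2", "CITY": "Canton", "AGE": 27, "SALARY": 155},
    {"PROV": "R2", "CITY": None, "AGE": 44, "SALARY": 138},
    {"PROV": "R3", "CITY": "Canton", "AGE": None, "SALARY": 158},
]

print(table_density(MAX1_BOTH_GC, ATTRS))
print(table_consistency(MAX1_BOTH_GC, ATTRS))
```
        </preformat>
        <p>Fraction reduces its result, so the table density 9/12 of Figure 2(d) is printed as 3/4, while the table consistency stays 11/12.</p>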
        <p>Our goal is different: we want to discuss how the evaluation of data fusion
results can be used by the Integration Designer to analyze, and possibly
modify, the choices made during the Entity Resolution process, to improve the final
result. For these reasons, we only consider simple and common conflict handling
strategies (i.e., S1 and S2) and the straightforward evaluation defined above.</p>
      </sec>
      <sec id="sec-9-2">
        <title>Example and Discussion</title>
        <p>This section shows, by means of an example covering the whole
SparkER-MOMIS workflow, how the evaluation of fusion results can be useful for the
Integration Designer; in particular, this example will show how the evaluation of
data fusion results changes by varying the post-processing 1-1 matching method.</p>
        <p>We consider two local classes L and R and a global class GC with the same
attributes: Surname (S), Name (N), City, Age and Salary (i.e., we assume that
the schema matching and global schema generation phases have already been
carried out with the MOMIS framework); moreover, L and R each have a local ID.
An instance of L and R is shown in Figure 2(a).</p>
        <p>Let us first consider the simplest case: all the attributes are used to perform
Entity Resolution. The ER process performed with SparkER (see [10] for a
detailed description of the tool) produces the Match Table shown in Figure 2(b1).
Note that the Match Table provides the pair of local identifiers of each matching pair of records
and their similarity, and that the obtained matching is many-to-many (e.g., L1
matches with R1 and R2). Now it is possible to apply the three post-processing
methods discussed in Section 2.1 to obtain a one-to-one matching: each of these
methods produces a 1-1 Match Table as shown in Figure 2(b2) (a ✔ denotes
that the pair is in the 1-1 Match Table).</p>
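        <p>Each 1-1 Match Table is then used to perform Data Fusion; intuitively, two
sequential operations are performed:
          <list list-type="order">
            <list-item><p>the natural outer join is performed: Local Class L ⋈ 1-1 Match Table ⋈ Local Class R;</p></list-item>
            <list-item><p>the S1 and S2 strategies (see Section 2.2) are applied to all the conflicting
attributes involved in the Data Fusion process.</p></list-item>
          </list>
        </p>
        <table-wrap id="fig2a-l">
          <caption><p>Figure 2(a): Local Sources, Local Class L</p></caption>
          <table>
            <thead>
              <tr><th>L_ID</th><th>LS</th><th>LN</th><th>L_CITY</th><th>L_AGE</th><th>L_SALARY</th></tr>
            </thead>
            <tbody>
              <tr><td>L1</td><td>William</td><td>Charlie</td><td>NULL</td><td>29</td><td>NULL</td></tr>
              <tr><td>L2</td><td>William</td><td>Ciarlye</td><td>Canton</td><td>27</td><td>155</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <table-wrap id="fig2a-r">
          <caption><p>Figure 2(a): Local Sources, Local Class R</p></caption>
          <table>
            <thead>
              <tr><th>R_ID</th><th>RS</th><th>RN</th><th>R_CITY</th><th>R_AGE</th><th>R_SALARY</th></tr>
            </thead>
            <tbody>
              <tr><td>R1</td><td>Wiliam</td><td>Charli</td><td>Canyon</td><td>28</td><td>NULL</td></tr>
              <tr><td>R2</td><td>Wiliam</td><td>Charli</td><td>NULL</td><td>44</td><td>138</td></tr>
              <tr><td>R3</td><td>Vylian</td><td>Carl</td><td>Canton</td><td>NULL</td><td>158</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <table-wrap id="fig2b">
          <caption><p>Figure 2(b1) and 2(b2): SparkER Match Table and the 1-1 Match Tables (✔ marks a pair kept by the method)</p></caption>
          <table>
            <thead>
              <tr><th>L_ID</th><th>R_ID</th><th>sim</th><th>Max1-both</th><th>SM</th><th>MWM</th></tr>
            </thead>
            <tbody>
              <tr><td>L1</td><td>R1</td><td>0.91</td><td>✔</td><td>✔</td><td></td></tr>
              <tr><td>L1</td><td>R2</td><td>0.86</td><td></td><td></td><td>✔</td></tr>
              <tr><td>L2</td><td>R1</td><td>0.81</td><td></td><td></td><td>✔</td></tr>
              <tr><td>L2</td><td>R3</td><td>0.69</td><td></td><td>✔</td><td></td></tr>
            </tbody>
          </table>
        </table-wrap>
        <table-wrap id="fig2c">
          <caption><p>Figure 2(c): Data Fusion result (Global Class Instance) for each 1-1 matching method; values in braces are conflicting</p></caption>
          <table>
            <thead>
              <tr><th>Method</th><th>PROV</th><th>CITY</th><th>AGE</th><th>SALARY</th></tr>
            </thead>
            <tbody>
              <tr><td>Max1-both</td><td>L1-R1</td><td>Canyon</td><td>{29,28}</td><td>NULL</td></tr>
              <tr><td>Max1-both</td><td>L2</td><td>Canton</td><td>27</td><td>155</td></tr>
              <tr><td>Max1-both</td><td>R2</td><td>NULL</td><td>44</td><td>138</td></tr>
              <tr><td>Max1-both</td><td>R3</td><td>Canton</td><td>NULL</td><td>158</td></tr>
              <tr><td>SM</td><td>L1-R1</td><td>Canyon</td><td>{29,28}</td><td>NULL</td></tr>
              <tr><td>SM</td><td>L2-R3</td><td>Canton</td><td>27</td><td>{155,158}</td></tr>
              <tr><td>SM</td><td>R2</td><td>NULL</td><td>44</td><td>138</td></tr>
              <tr><td>MWM</td><td>L1-R2</td><td>NULL</td><td>{29,44}</td><td>138</td></tr>
              <tr><td>MWM</td><td>L2-R1</td><td>{Canton, Canyon}</td><td>{27,28}</td><td>155</td></tr>
              <tr><td>MWM</td><td>R3</td><td>Canton</td><td>NULL</td><td>158</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <table-wrap id="fig2d">
          <caption><p>Figure 2(d): Data Fusion metrics</p></caption>
          <table>
            <thead>
              <tr><th></th><th colspan="3">Max1-both</th><th colspan="3">SM</th><th colspan="3">MWM</th></tr>
              <tr><th></th><th>CITY</th><th>AGE</th><th>SALARY</th><th>CITY</th><th>AGE</th><th>SALARY</th><th>CITY</th><th>AGE</th><th>SALARY</th></tr>
            </thead>
            <tbody>
              <tr><td>Column density</td><td>3/4</td><td>3/4</td><td>3/4</td><td>2/3</td><td>3/3</td><td>2/3</td><td>2/3</td><td>2/3</td><td>3/3</td></tr>
              <tr><td>Column consistency</td><td>4/4</td><td>3/4</td><td>4/4</td><td>3/3</td><td>2/3</td><td>2/3</td><td>2/3</td><td>1/3</td><td>3/3</td></tr>
              <tr><td>Table density</td><td colspan="3">9/12</td><td colspan="3">7/9</td><td colspan="3">7/9</td></tr>
              <tr><td>Table consistency</td><td colspan="3">11/12</td><td colspan="3">7/9</td><td colspan="3">6/9</td></tr>
            </tbody>
          </table>
        </table-wrap>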
        <p>
          For example, the Max1-both Match Table contains only the pair (L1, R1),
and only these two local records are fused together, thus obtaining the first record
in the Max1-both Global Class shown in Figure 2(c), where the PROV attribute
represents the data provenance [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], i.e., intuitively, the input local records that
contributed to the output global record. Conflicting values obtained by the S2
strategy appear as sets in braces (highlighted in yellow in Figure 2(c)). With the other 1-1 Match Tables, different
global classes are obtained, as shown in Figure 2(c). Finally, the Integration
Designer chooses which attributes to use in the evaluation phase; in our example,
suppose they are City, Age and Salary. The Data Fusion evaluation, with the
density and consistency measures, is shown in Figure 2(d).
        </p>
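        <p>The two fusion operations described above (outer join on the 1-1 Match Table, then strategies S1 and S2) can be sketched as follows. This is a minimal Python illustration and not the MomisDF implementation: the function names and the dictionary representation are assumptions, NULL is modelled as None, an S2 conflict is modelled as a set of the alternative values, and only the three evaluation attributes of the example are fused.</p>
        <preformat>
```python
def resolve(left_value, right_value):
    """S1: prefer a value over NULL. S2: keep both values when they differ."""
    if left_value is None:
        return right_value
    if right_value is None or left_value == right_value:
        return left_value
    return {left_value, right_value}


def fuse(left, right, one_to_one, attrs):
    """Outer join of two local classes on a 1-1 Match Table, then S1 and S2."""
    rows, used_left, used_right = [], set(), set()
    for l_id, r_id in sorted(one_to_one):
        row = {"PROV": f"{l_id}-{r_id}"}
        for attr in attrs:
            row[attr] = resolve(left[l_id][attr], right[r_id][attr])
        rows.append(row)
        used_left.add(l_id)
        used_right.add(r_id)
    # Outer join: unmatched records of both sources are kept as they are.
    for source, used in ((left, used_left), (right, used_right)):
        for rec_id, record in source.items():
            if rec_id not in used:
                rows.append({"PROV": rec_id, **{a: record[a] for a in attrs}})
    return rows


# Local classes of the running example (Figure 2(a)), evaluation attributes only.
ATTRS = ["CITY", "AGE", "SALARY"]
LEFT = {"L1": {"CITY": None, "AGE": 29, "SALARY": None},
        "L2": {"CITY": "Canton", "AGE": 27, "SALARY": 155}}
RIGHT = {"R1": {"CITY": "Canyon", "AGE": 28, "SALARY": None},
         "R2": {"CITY": None, "AGE": 44, "SALARY": 138},
         "R3": {"CITY": "Canton", "AGE": None, "SALARY": 158}}

sm_global_class = fuse(LEFT, RIGHT, {("L1", "R1"), ("L2", "R3")}, ATTRS)
for row in sm_global_class:
    print(row)
```
        </preformat>
        <p>With the SM 1-1 Match Table the sketch yields the three records L1-R1, L2-R3 and R2 of Figure 2(c); passing a different 1-1 Match Table to the hypothetical fuse function gives the other Global Classes.</p>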
        <p>In this preliminary work, we have not yet obtained results on significant
real cases. However, some considerations can already be made about the (toy)
example. First of all, Max1-both achieves the best match quality in terms of
column and table consistency (table consistency 11/12, against 7/9 for SM and 6/9 for MWM). This is consistent with the conclusion reached in
[8] that, in general, Max1-both can achieve the best linkage quality in terms of
precision. On the other hand, it is easy to verify that to increase
table density (from 9/12 to 7/9), SM or MWM should be applied, with the obvious consequence
that these methods deteriorate column and table consistency. For these reasons,
we believe it is important to give the designer the opportunity to analyze and
choose the best method. For example, in a context where the Salary attribute
has greater importance, the best choice is the MWM method, which maximizes
both density and consistency for this attribute (3/3 in both cases).</p>
        <p>In this example we only discussed how the evaluation of data fusion results changes
as the post-processing 1-1 matching method varies. On the other hand, it will
also be interesting to evaluate how the data fusion results change with the
SparkER configuration (and therefore with the different Match
Tables produced). In fact, as discussed in [10], the SparkER tool can work both in
a completely unsupervised mode and in a supervised one. In the first case, the
Integration Designer can use a default configuration and run the process
on her/his data without taking care of parameter tuning. In the second case,
she/he can supervise the entire process in order to determine the best
parameters for her/his data, thus producing a custom configuration.</p>
      </sec>
      <sec id="sec-9-3">
        <title>Conclusions and future work</title>
        <p>We discussed some preliminary ideas about an integrated approach for entity
resolution and data fusion. We showed, by an example, how the evaluation of
data fusion results can be used by the Integration Designer to analyze, and
possibly modify, the choices made during the Entity Resolution process.</p>
        <p>As future work, we will perform a complete evaluation of data fusion results
with respect to the different post-processing methods, both using real datasets
and with other evaluation measures. A further direction is to extend the data
fusion evaluation to Conflict Resolution Functions, by considering Ground Truth
Based Evaluation measures, such as Accuracy, in order to evaluate the
fraction of correct values selected by the conflict resolution functions chosen by the
Integration Designer.</p>
        <ref-list>
          <ref id="ref6"><mixed-citation>6. Bizer, C.: Data quality assessment and data fusion. University Lecture (2018)</mixed-citation></ref>
          <ref id="ref7"><mixed-citation>7. Bleiholder, J., Naumann, F.: Data fusion. ACM Comput. Surv. 41(1), 1:1–1:41 (Jan 2009). https://doi.org/10.1145/1456650.1456651</mixed-citation></ref>
          <ref id="ref8"><mixed-citation>8. Franke, M., Sehili, Z., Gladbach, M., Rahm, E.: Post-processing methods for high quality privacy-preserving record linkage. In: Data Privacy Management, Cryptocurrencies and Blockchain Technology - ESORICS 2018 International Workshops, Barcelona, Spain, September 6-7, 2018, Proceedings. pp. 263–278 (2018)</mixed-citation></ref>
          <ref id="ref9"><mixed-citation>9. Gagliardelli, L., Zhu, S., Simonini, G., Bergamaschi, S.: Bigdedup: a big data integration toolkit for duplicate detection in industrial scenarios. In: Transdisciplinary Engineering. vol. 7, pp. 1015–1023 (2018)</mixed-citation></ref>
          <ref id="ref10"><mixed-citation>10. Gagliardelli, L., Simonini, G., Beneventano, D., Bergamaschi, S.: Sparker: Scaling entity resolution in spark. In: Proceedings of the Workshops of the EDBT/ICDT 2019 Joint Conference (EDBT/ICDT), Lisbon, Portugal (2019)</mixed-citation></ref>
          <ref id="ref11"><mixed-citation>11. Michelfeit, J., Knap, T., Necasky, M.: Linked data integration with conflicts. CoRR abs/1410.7990 (2014), http://arxiv.org/abs/1410.7990</mixed-citation></ref>
          <ref id="ref12"><mixed-citation>12. Simonini, G., Bergamaschi, S.: Enhancing entity resolution efficiency with loosely schema-aware techniques. In: 24th Italian Symposium on Advanced Database Systems, SEBD 2016, Ugento, Lecce, Italy, June 19-22, 2016. pp. 270–277 (2016)</mixed-citation></ref>
          <ref id="ref13"><mixed-citation>13. Simonini, G., Bergamaschi, S., Jagadish, H.V.: BLAST: a loosely schema-aware meta-blocking approach for entity resolution. PVLDB 9(12), 1173–1184 (2016), http://www.vldb.org/pvldb/vol9/p1173-simonini.pdf</mixed-citation></ref>
          <ref id="ref14"><mixed-citation>14. Simonini, G., Gagliardelli, L., Bergamaschi, S., Jagadish, H.V.: Scaling entity resolution: A loosely schema-aware approach. Inf. Syst. 83, 145–165 (2019). https://doi.org/10.1016/j.is.2019.03.006</mixed-citation></ref>
          <ref id="ref15"><mixed-citation>15. Simonini, G., Papadakis, G., Palpanas, T., Bergamaschi, S.: Schema-agnostic progressive entity resolution. In: 34th IEEE International Conference on Data Engineering, ICDE 2018, Paris, France, April 16-19, 2018. pp. 53–64. IEEE Computer Society (2018). https://doi.org/10.1109/ICDE.2018.00015</mixed-citation></ref>
          <ref id="ref16"><mixed-citation>16. Simonini, G., Papadakis, G., Palpanas, T., Bergamaschi, S.: Schema-agnostic progressive entity resolution. IEEE Trans. Knowl. Data Eng. 31(6), 1208–1221 (2019). https://doi.org/10.1109/TKDE.2018.2852763</mixed-citation></ref>
          <ref id="ref17"><mixed-citation>17. Vincini, M., Beneventano, D., Bergamaschi, S.: Semantic integration of heterogeneous data sources in the MOMIS data transformation system. J. UCS 19(13), 1986–2012 (2013). https://doi.org/10.3217/jucs-019-13-1986</mixed-citation></ref>
          <ref id="ref18"><mixed-citation>18. Zhang, D., Rubinstein, B.I.P., Gemmell, J.: Principled graph matching algorithms for integrating multiple data sources. IEEE Trans. on Knowl. and Data Eng. 27(10), 2784–2796 (Oct 2015). https://doi.org/10.1109/TKDE.2015.2426714</mixed-citation></ref>
        </ref-list>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Benedetti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Beneventano</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bergamaschi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simonini</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Computing inter-document similarity with context semantic analysis</article-title>
          .
          <source>Information Systems</source>
          <volume>80</volume>
          ,
          <fpage>136</fpage>
          –
          <lpage>147</lpage>
          (
          <year>2019</year>
          ). https://doi.org/10.1016/j.is.2018.02.009
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Beneventano</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bergamaschi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Provenance-aware semantic search engines based on data integration systems</article-title>
          .
          <source>Inter. J. of Organizational and Collective Intelligence (IJOCI)</source>
          <volume>4</volume>
          (
          <issue>2</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>30</lpage>
          (Apr
          <year>2014</year>
          ). https://doi.org/10.4018/ijoci.2014040101
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bergamaschi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Beneventano</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corni</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kazazi</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Orsini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Po</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sorrentino</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>The open source release of the MOMIS data integration system</article-title>
          .
          <source>In: Nineteenth Italian Symposium on Advanced Database Systems (SEBD)</source>
          . pp.
          <fpage>175</fpage>
          –
          <lpage>186</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bergamaschi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Beneventano</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guerra</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Orsini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Data integration</article-title>
          .
          <source>In: Handbook of Conceptual Modeling - Theory, Practice, and Research Challenges</source>
          , pp.
          <fpage>441</fpage>
          –
          <lpage>476</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bergamaschi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferrari</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guerra</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simonini</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Velegrakis</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Providing insight into data source topics</article-title>
          .
          <source>J. Data Semantics</source>
          <volume>5</volume>
          (
          <issue>4</issue>
          ),
          <fpage>211</fpage>
          –
          <lpage>228</lpage>
          (
          <year>2016</year>
          ). https://doi.org/10.1007/s13740-016-0063-6
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>