=Paper= {{Paper |id=Vol-2652/paper06 |storemode=property |title=(Artificial) Mind over Matter: Humans In and Humans Out in Matching |pdfUrl=https://ceur-ws.org/Vol-2652/paper06.pdf |volume=Vol-2652 |authors=Roee Shraga |dblpUrl=https://dblp.org/rec/conf/vldb/Shraga20 }} ==(Artificial) Mind over Matter: Humans In and Humans Out in Matching== https://ceur-ws.org/Vol-2652/paper06.pdf
                                   (Artificial) Mind over Matter
                                Humans In and Humans Out in Matching

                                                                  Roee Shraga
                                   Supervised by Prof. Avigdor Gal and Prof. Rakefet Ackerman
                                      Technion – Israel Institute of Technology, Haifa, Israel
                                                  shraga89@campus.technion.ac.il


ABSTRACT
The matching task is at the heart of data integration, in charge of aligning elements of data sources. Historically, matching problems were considered semi-automated tasks in which correspondences are generated by matching algorithms and subsequently validated by human expert(s). This research is devoted to the changing role of humans in matching, which is divided into two main approaches, namely Humans Out and Humans In. With the increase in amount and size of matching tasks, the role of humans as validators seems to diminish; thus Humans Out questions the inherent need for humans in the matching loop. On the other hand, Humans In focuses on overcoming human cognitive biases via algorithmic assistance. Above all, we observe that matching requires unconventional thinking demonstrated by advanced machine learning methods to complement (and possibly take over) the role of humans in matching.

1.    INTRODUCTION
   Modern industrial and business processes require intensive use of large-scale data alignment and integration techniques to combine data from multiple heterogeneous data sources into meaningful and valuable information. Data alignment and integration has been recently challenged by the need to handle large volumes of data, arriving at high velocity from a variety of sources, which demonstrate varying levels of veracity. This challenging setting, often referred to as big data, renders many of the existing techniques, especially those that are human-intensive, obsolete.
   At the heart of the data integration realm lies the matching task [2], in charge of aligning elements of data sources. In particular, whenever data sources are represented as schemata, the task of schema matching aligns attributes that convey similar semantic content. At the data level, entity resolution (also known as record deduplication) aims at “cleaning” a database by identifying tuples representing the same entity. Initial heuristic attempts (e.g., COMA [4]) were followed by theoretical grounding (e.g., see [2, 6]), algorithmic solutions for efficient and effective integration, and a body of systems, benchmarks and competitions that allow comparative empirical analysis of integration solutions.
   Matching problems have been historically defined as a semi-automated task in which correspondences are generated by matching algorithms and outcomes are subsequently validated by one or more human experts. The reason for that is the inherent assumption that humans “do it better.” The traditional roles of humans and machines in matching are subject to change due to the availability of data and advances in machine learning. Therefore, in the proposed research we question this assumption and aim at developing a machine learning framework for matching.
   Given the availability of data and the improvement of machine learning techniques, this line of research is devoted to the investigation of the respective roles of humans and machines in achieving cognitive tasks in matching, aiming to determine whether traditional roles of humans and machines are subject to change [15, 16]. Such investigation, we believe, will pave the way to better utilize both human and machine resources in new and innovative manners. We consider two possible modes of change, namely Humans Out and Humans In. Humans Out aims at exploring out-of-the-box latent matching reasoning using machine learning algorithms when attempting to overpower human matcher performance. Pursuing out-of-the-box thinking, we investigate the best way to include machine and deep learning in matching. Humans In explores how to better involve humans in the matching loop by assigning human matchers a role symmetric to that of algorithmic matchers in the matching process.
   In the following sections we describe each of the two modes of change. Section 2 describes how and where we envision replacing humans in the matching loop. In Section 3, we detail our approach to better involve humans in matching by understanding their strengths and weaknesses. Finally, we summarize and discuss future directions in Section 4.




Proceedings of the VLDB 2020 PhD Workshop, August 31st, 2020. Tokyo,
Japan. Copyright (C) 2020 for this paper by its authors. Copying permitted
for private and academic purposes.
2.    HUMANS OUT
   The Humans Out approach seeks matching subtasks, traditionally considered to require cognitive effort, from which humans can be excluded. A good place to start is the basic task of identifying correspondences. We note that many contemporary matching algorithms use heuristics, where each heuristic relies on some semantic cue to justify an alignment between elements. For example, string-based matchers use string similarity as a cue for item alignment. We observe that such heuristics, in essence, encode human intuition about matching. Our earlier work [17] showed that human matching choices can be reasonably predicted by classifying them into types, where a type corresponds to an existing heuristic. Moreover, in our experiments, decision making of most human matchers can be predicted well using a combination of two algorithmic matchers. Therefore, we can argue that the cognitive effort of many human matchers can be easily replaced with such heuristics.
   Next, we describe two works aiming to enhance the automation of matching, focusing on the task of schema matching. The main component of these works is a similarity matrix, a conceptual model representing a matching result.

2.1    Learning to Rerank Schema Matches
   In [7, 9] we suggested a learning algorithm, termed LRSM (illustrated in Figure 1), for re-ranking top-K matches so that the best match is ranked at the top. The proposed algorithm has shown good results when tested on real-world as well as synthetic datasets, offering an alternative to humans in selecting the best match, a task traditionally reserved for human verifiers.

Figure 1: Learn-to-Rerank Schema Matches (LRSM) algorithm illustrated

   The novelty of LRSM is in the use of similarity matrices as a basis for learning features, creating feature-rich datasets that fit learning and provide us with a feature aggregation that is needed to enrich algorithmic matching beyond that of human matching. To create a reranking framework, we adopt a learning-to-rank approach [3], utilizing matching predictors [8, 13] as features. In addition to the state-of-the-art predictors, which mostly emphasize positive characteristics of a match, we propose a novel set of matching predictors that capture complementary negative aspects.
   We show a bound on the size of K, given a desired level of confidence in finding the best match, justified theoretically and validated empirically. This bound is useful for top-K algorithms [11] and, as psychological literature suggests, also applicable when introducing a list of options (as in the traditional top-K setting) to humans [14].
   Finally, using large-scale experiments with real-world benchmark ontology and schema sets, as well as synthetic data, we show the effectiveness of the proposed algorithmic solution. Specifically, we show that the size of a top-K match list is geometrically distributed with a parameter that can be estimated as the number of times the original best match was the one with the highest F1 value. Additionally, we show empirical evidence for the theoretical choice of K, demonstrate that the newly suggested predictors correlate well with evaluation measures, validate the use of NDCG as an optimization function, and above all show that LRSM performs better than state-of-the-art methods, providing improved (and robust) matching results.

2.2    Cross-Domain Schema Matching using Deep Similarity Matrix Adjustment and Evaluation
   In a recent paper [18], we show that deep learning can also be applied to “small” matching problems such as schema matching, making extensive use of similarity matrices. We offer a novel post-processing step to schema matching that improves the final matching outcome without human intervention. We present a new mechanism, similarity matrix adjustment, to calibrate a matching result and propose an algorithm (dubbed ADnEV) that uses deep neural networks to manipulate similarity matrices created by state-of-the-art algorithmic matchers.
   ADnEV uses deep neural networks, providing a data-driven approach for extracting hidden representative features for an automatic schema matching process, removing the requirement for manual feature engineering. ADnEV learns two conjoint neural network models for adjusting and evaluating a similarity matrix. The ADnEV algorithm applies these models to iteratively adjust and evaluate new similarity matrices (illustrated in Figure 2), created by state-of-the-art matchers. With such a tool at hand, we enhance the ability to introduce new data sources to existing systems without the need to rely on either domain experts (knowledgeable of the domain but less so of the best matchers to use) or data integration specialists (who lack sufficient domain knowledge). Having a trained ADnEV model also supports systems where human final judgement is needed by regulation, e.g., healthcare, by offering an improved matching recommendation.

[Figure 2 diagram: starting from $M_0 = M$, an $n \times m$ similarity matrix, the adjuster (AD) is applied repeatedly, $M_1 = AD(M_0), \ldots, M_t = AD(M_{t-1})$, while the evaluator (EV) scores each matrix ($\hat{E}_0, \hat{E}_1, \ldots$). Repeat until $EV(M_t) < EV(M_{t-1})$; the output is labeled $M^{out} = AD(M_{t-1})$.]
Figure 2: ADnEV algorithm illustrated

   We empirically demonstrate the effectiveness of ADnEV for improving matching results, using real-world benchmark ontology and schema sets. We show that ADnEV can generalize into new domains without the need to learn the domain terminology, thus allowing cross-domain learning. We also show ADnEV to be a powerful tool in handling schemata for which matching is particularly challenging. Finally, we show the benefit of using ADnEV in a related integration task of ontology alignment.
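
   The adjust-and-evaluate iteration of Figure 2 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the ADnEV implementation: in [18] the adjuster and the evaluator are trained neural networks, whereas `toy_adjust` and `toy_evaluate` below are hypothetical stand-ins (the toy evaluator even peeks at a reference match, which a real evaluator would not have). The names `adjust_until_no_gain` and `max_iters`, and the choice to return the last matrix before the score drops, are likewise assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Matrix = List[List[float]]

# Hypothetical reference match, used only by the toy evaluator below.
REFERENCE: Matrix = [[1.0, 0.0], [0.0, 1.0]]


def toy_adjust(matrix: Matrix) -> Matrix:
    """Stand-in for the adjuster: amplify every entry, capped at 1.0."""
    return [[min(1.0, 2.0 * value) for value in row] for row in matrix]


def toy_evaluate(matrix: Matrix) -> float:
    """Stand-in for the evaluator: negative absolute error to REFERENCE."""
    return -sum(
        abs(value - target)
        for row, ref_row in zip(matrix, REFERENCE)
        for value, target in zip(row, ref_row)
    )


def adjust_until_no_gain(
    matrix: Matrix,
    adjust: Callable[[Matrix], Matrix],
    evaluate: Callable[[Matrix], float],
    max_iters: int = 50,
) -> Tuple[Matrix, int]:
    """Apply M_t = adjust(M_{t-1}) until evaluate(M_t) < evaluate(M_{t-1}).

    Returns the last matrix before the score dropped and the number of
    adjustments that were accepted.
    """
    current = matrix
    current_score = evaluate(current)
    for accepted in range(max_iters):
        candidate = adjust(current)
        candidate_score = evaluate(candidate)
        if candidate_score < current_score:
            # Stopping rule from Figure 2: the evaluation got worse.
            return current, accepted
        current, current_score = candidate, candidate_score
    return current, max_iters


if __name__ == "__main__":
    start = [[0.5, 0.125], [0.25, 0.5]]
    result, steps = adjust_until_no_gain(start, toy_adjust, toy_evaluate)
    print(result, steps)
```

   On this toy input the loop accepts one adjustment and stops when a second one would lower the score. The `max_iters` cap is a safeguard added here for the case where the score never drops.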

3.    HUMANS IN
   The Humans In approach aims at investigating whether the current role humans take in the matching process is effective and whether an alternative role can improve the overall performance of the matching process.

[Figure 3 plot: correctness (average precision, 0.0 to 1.0) against confidence (0.2 to 1.0).]
Figure 3: Correctness by confidence, partitioned into buckets of 0.1

   By way of motivation, we provide an illustration (Figure 3) of the relationship between human confidence in matching and correctness (in terms of precision) based on our experiments [1, 17]. It is clear that human subjective confidence cannot serve as a good predictor of matching correctness. Next, we describe a work that shows how human biases affect confidence levels via consistency dimensions.

3.1    A Cognitive Model of Human Matching Bias
   A recent study [1], aided by metacognitive models, analyzes the consistency of human matchers. We explore three main consistency dimensions as potential cognitive biases, taking into account the time it takes to reach a matching decision, the extent of agreement among human matchers and the assistance of algorithmic matchers. In particular, we showed that when an algorithmic suggestion is available, humans tend to accept it as true, in sharp contradiction to the conventional validation role of human matchers.
   Interestingly enough, all dimensions were found predictive of both confidence and accuracy of human matchers. This indicates that 1) humans have cognitive biases affecting their ability to provide consistent matching decisions, and 2) such biases have predictive value in determining to what extent a human matcher’s alignment decision is accurate. Our empirical evaluation serves as a proof-of-concept that validates the important role of humans as participants in the matching process, and less so as validators. As an example, Figure 4 compares confidence with correctness, by showing the proportion of correctly identified correspondences, out of all correspondences (i.e., precision), partitioned according to elapsed time (red) and mean of confidence across all human matchers, again partitioned according to elapsed time (blue). For each measure we also include a linear trend-line and error bars (standard deviation) for each time bucket.

[Figure 4 plot: proportion (0.0 to 1.0) against time spent, in buckets from 0-5 through 56-60 in steps of five; series: Mean of Correct, Mean of Confidence, Trend Correct, Trend Confidence; annotation: MC = 0.69.]
Figure 4: Temporal Dimension: Confidence (Blue) and correctness (Red) by elapsed time

   As time passes, fewer decisions made by humans are correct and there is a decline in human confidence.

3.2    InCognitoMatch: Cognitive-aware Matching via Crowdsourcing
   Acknowledging cognitive awareness in human matching, we recently proposed InCognitoMatch [19], the first cognitive-aware crowdsourcing application for matching tasks. InCognitoMatch provides a handy tool to validate, annotate, and correct correspondences using the crowd whilst accounting for human matching biases. In addition, InCognitoMatch enables system administrators to control the context information visible to workers and analyze their performance accordingly. For crowd workers, InCognitoMatch is an easy-to-use application that may be accessed from multiple crowdsourcing platforms. In addition, workers completing a task are offered suggestions for follow-up sessions according to their performance in the current session. We foresee that such a tool will become handy in matching schemata in big data settings, where schemata may be poorly documented and human expertise becomes a scarce resource.

4.    ONGOING AND FUTURE RESEARCH
   In this paper we presented our approach for human involvement in the matching loop, introducing tasks where humans can be replaced and emphasizing our vision for understanding human behavior to allow better engagement. An additional overarching goal is to propose a common matching framework that would allow treating matching as a unified problem whether we match schema attributes, ontology elements, process activities, entity tuples, etc. Next, we describe some concrete ongoing and future research directions.
Cognition-aware Matching Collaboration: Match consistency was introduced in [1] as a measure of human matching variability along potential bias dimensions. As a direct future direction, we design a collaboration matcher that combines human and algorithmic opinions to improve the matching outcome by compensating for human biases along consistency dimensions as defined in [1], namely temporal, consensuality, and control. We validated the proposed matcher
using an empirical study with human and algorithmic matchers over a well-known benchmark, showing it provides better matching performance than human or algorithmic matching performed separately.
Expert Identification: In [1] we show that humans have cognitive biases decreasing their ability to perform matching tasks effectively (see Section 3.1). Expert identification aims to predict a human's qualification to serve as an “expert” for a matching task. We intend to explore predictive behaviors that capture the process of human matching by transforming physical aspects (such as time, screen scrolls, mouse tracking, and eye movement) into features that can be used for examining the role of humans in the matching process. This, in turn, would enable matching systems to carefully select a matching expert that fits the task.
Learning from Matchers: Using machine learning for data integration raises the issue of a shortage of labeled data for supervised learning [5, 9, 10, 12, 18]. Hence, pursuing less-than-supervised (e.g., unsupervised, weakly supervised) methods would be a natural next step to follow. In a nutshell, we will propose a framework that uses pre-trained embeddings to represent data elements, processes a candidate pair to be matched with a bidirectional LSTM, and is trained using state-of-the-art heuristic matchers. Once trained, the framework will be independent of both human input and human-designed heuristics. Initial empirical evaluation shows that the proposed framework performs better than multiple baselines and provides insights on future technique choices.
Matching Relevance: The vision we put forward is for the creation of a probabilistic matching relevance framework that will allow matching tasks to consider matching intent when creating a match. An intent reflects user preferences that may relate to granularity level, system requirement, match context, or simply individual inclination. We will present a probabilistic model of a match, showing that intent, either implicitly or explicitly specified, enables more accurate matching by better separating the relevant from the irrelevant. The proposed probabilistic notation will describe uncertainty in general existing matching problems and, accompanied by an intent, will enable assessment of the relevance of a match to a system rather than its correctness.

Acknowledgments
I would like to thank Dr. Haggai Roitman, Dr. Tomer Sagi, Dr. Ofra Amir, and Coral Scharf for their involvement in this research.

5.   REFERENCES
 [1] R. Ackerman, A. Gal, T. Sagi, and R. Shraga. A cognitive model of human bias in matching. In PRICAI: Trends in Artificial Intelligence, 2019.
 [2] Z. Bellahsene, A. Bonifati, and E. Rahm, editors. Schema Matching and Mapping. Data-Centric Systems and Applications. Springer, 2011.
 [3] C. J. Burges. From RankNet to LambdaRank to LambdaMART: An overview. Learning, 11:23–581, 2010.
 [4] H. H. Do and E. Rahm. COMA: a system for flexible combination of schema matching approaches. In Proceedings of VLDB, pages 610–621. VLDB Endowment, 2002.
 [5] X. L. Dong and T. Rekatsinas. Data integration and machine learning: A natural synergy. In SIGMOD, pages 1645–1650. ACM, 2018.
 [6] A. Gal. Uncertain Schema Matching. Synthesis Lectures on Data Management. Morgan & Claypool Publishers, 2011.
 [7] A. Gal, H. Roitman, and R. Shraga. Heterogeneous data integration by learning to rerank schema matches. In IEEE International Conference on Data Mining, ICDM. IEEE Computer Society, 2018.
 [8] A. Gal, H. Roitman, and T. Sagi. From diversity-based prediction to better ontology & schema matching. In Proceedings of the 25th International Conference on World Wide Web, pages 1145–1155. International World Wide Web Conferences Steering Committee, 2016.
 [9] A. Gal, H. Roitman, and R. Shraga. Learning to rerank schema matches. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2019.
[10] F. Jabeen, H. Leopold, and H. A. Reijers. How to make process model matching work better? An analysis of current similarity measures. In International Conference on Business Information Systems, pages 181–193. Springer, 2017.
[11] C. Macdonald, R. L. T. Santos, and I. Ounis. The whens and hows of learning to rank for web search. Information Retrieval, 16(5):584–628, Oct 2013.
[12] S. Mudgal, H. Li, T. Rekatsinas, A. Doan, Y. Park, G. Krishnan, R. Deep, E. Arcaute, and V. Raghavendra. Deep learning for entity matching: A design space exploration. In Proceedings of the 2018 International Conference on Management of Data, pages 19–34. ACM, 2018.
[13] T. Sagi and A. Gal. Schema matching prediction with applications to data source discovery and dynamic ensembling. The VLDB Journal, 22(5):689–710, 2013.
[14] B. Schwartz. The paradox of choice: Why more is less. Ecco New York, 2004.
[15] R. Shraga. (Artificial) mind over matter: Integrating humans and algorithms in solving matching problems. In Proceedings of the 2018 International Conference on Management of Data (SIGMOD). ACM, 2018.
[16] R. Shraga and A. Gal. The changing roles of humans and algorithms in (process) matching. In International Conference on Business Process Management, pages 106–109. Springer, 2019.
[17] R. Shraga, A. Gal, and H. Roitman. What type of a matcher are you?: Coordination of human and algorithmic matchers. In Proceedings of the Workshop on Human-In-the-Loop Data Analytics, HILDA@SIGMOD 2018, Houston, TX, USA, June 10, 2018, pages 12:1–12:7, 2018.
[18] R. Shraga, A. Gal, and H. Roitman. ADnEV: Cross-domain schema matching using deep similarity matrix adjustment and evaluation. PVLDB, 13(9):1401–1415, 2020.
[19] R. Shraga, C. Scharf, R. Ackerman, and A. Gal. InCognitoMatch: Cognitive-aware matching via crowdsourcing. In Proceedings of the 2020 International Conference on Management of Data, SIGMOD. ACM, 2020.