(Artificial) Mind over Matter
Humans In and Humans Out in Matching

Roee Shraga
Supervised by Prof. Avigdor Gal and Prof. Rakefet Ackerman
Technion – Israel Institute of Technology, Haifa, Israel
shraga89@campus.technion.ac.il

Proceedings of the VLDB 2020 PhD Workshop, August 31st, 2020. Tokyo, Japan. Copyright (C) 2020 for this paper by its authors. Copying permitted for private and academic purposes.

ABSTRACT

The matching task is at the heart of data integration, in charge of aligning elements of data sources. Historically, matching problems were considered semi-automated tasks in which correspondences are generated by matching algorithms and subsequently validated by human expert(s). This research is devoted to the changing role of humans in matching, which is divided into two main approaches, namely Humans Out and Humans In. With the increase in amount and size of matching tasks, the role of humans as validators seems to diminish; thus Humans Out questions the inherent need for humans in the matching loop. On the other hand, Humans In focuses on overcoming human cognitive biases via algorithmic assistance. Above all, we observe that matching requires unconventional thinking, demonstrated by advanced machine learning methods, to complement (and possibly take over) the role of humans in matching.

1. INTRODUCTION

Modern industrial and business processes require intensive use of large-scale data alignment and integration techniques to combine data from multiple heterogeneous data sources into meaningful and valuable information. Data alignment and integration have recently been challenged by the need to handle large volumes of data, arriving at high velocity from a variety of sources, which demonstrate varying levels of veracity. This challenging setting, often referred to as big data, renders many of the existing techniques, especially those that are human-intensive, obsolete.

At the heart of the data integration realm lies the matching task [2], in charge of aligning elements of data sources. In particular, whenever data sources are represented as schemata, the task of schema matching aligns attributes that convey similar semantic content. At the data level, entity resolution (also known as record deduplication) aims at “cleaning” a database by identifying tuples representing the same entity. Initial heuristic attempts (e.g., COMA [4]) were followed by theoretical grounding (e.g., see [2, 6]), algorithmic solutions for efficient and effective integration, and a body of systems, benchmarks and competitions that allow comparative empirical analysis of integration solutions.

Matching problems have been historically defined as a semi-automated task in which correspondences are generated by matching algorithms and outcomes are subsequently validated by one or more human experts. The reason for that is the inherent assumption that humans “do it better.” The traditional roles of humans and machines in matching are subject to change due to the availability of data and advances in machine learning. Therefore, in the proposed research we question this assumption and aim at developing a machine learning framework for matching.

Given the availability of data and the improvement of machine learning techniques, this line of research is devoted to the investigation of the respective roles of humans and machines in achieving cognitive tasks in matching, aiming to determine whether the traditional roles of humans and machines are subject to change [15, 16]. Such investigation, we believe, will pave a way to better utilize both human and machine resources in new and innovative manners. We consider two possible modes of change, namely Humans Out and Humans In. Humans Out aims at exploring out-of-the-box latent matching reasoning using machine learning algorithms, in an attempt to surpass human matcher performance. Pursuing out-of-the-box thinking, we investigate the best way to include machine and deep learning in matching. Humans In explores how to better involve humans in the matching loop by assigning human matchers a role symmetric to that of algorithmic matchers in the matching process.

In the following sections we describe each of the two modes of change. Section 2 describes how and where we envision replacing humans in the matching loop. In Section 3, we detail our approach to better involve humans in matching by understanding their strengths and weaknesses. Finally, we summarize and discuss future directions in Section 4.

2. HUMANS OUT

The Humans Out approach seeks matching subtasks, traditionally considered to require cognitive effort, from which humans can be excluded. A good place to start is the basic task of identifying correspondences. We note that many contemporary matching algorithms use heuristics, where each heuristic uses some semantic cue to justify an alignment between elements. For example, string-based matchers use string similarity as a cue for item alignment. We observe that such heuristics, in essence, encode human intuition about matching. Our earlier work [17] showed that human matching choices can be reasonably predicted by classifying them into types, where a type corresponds to an existing heuristic. Moreover, in our experiments, the decision making of most human matchers can be predicted well using a combination of two algorithmic matchers. Therefore, we can argue that the cognitive effort of many human matchers can be easily replaced with such heuristics.

Next, we describe two works aiming to enhance the automation of matching, focusing on the task of schema matching. The main component of these works is a similarity matrix, a conceptual model representing a matching result.

2.1 Learning to Rerank Schema Matches

In [7, 9] we suggested a learning algorithm, termed LRSM, for re-ranking top-K matches so that the best match is ranked at the top (illustrated in Figure 1). The proposed algorithm has shown good results when tested on real-world as well as synthetic datasets, offering an alternative to humans in selecting the best match, a task traditionally reserved for human verifiers.

Figure 1: Learn-to-Rerank Schema Matches (LRSM) algorithm illustrated

The novelty of LRSM is in the use of similarity matrices as a basis for learning features, creating feature-rich datasets that fit learning and provide us with the feature aggregation needed to enrich algorithmic matching beyond that of human matching. To create a reranking framework, we adopt a learning-to-rank approach [3], utilizing matching predictors [8, 13] as features. In addition to the state-of-the-art predictors, which mostly emphasize positive characteristics of a match, we propose a novel set of matching predictors that capture complementary negative aspects.

We show a bound on the size of K, given a desired level of confidence in finding the best match, justified theoretically and validated empirically. This bound is useful for top-K algorithms [11] and, as psychological literature suggests, is also applicable when introducing a list of options (as in the traditional top-K setting) to humans [14].

Finally, using large-scale experiments with real-world benchmark ontology and schema sets, as well as synthetic data, we show the effectiveness of the proposed algorithmic solution. Specifically, we show that the size of a top-K match list is geometrically distributed with a parameter that can be estimated from how often the original best match was the one with the highest F1 value. Additionally, we show empirical evidence for the theoretical choice of K, demonstrate that the newly suggested predictors correlate well with evaluation measures, validate the use of NDCG as an optimization function, and above all show that LRSM performs better than state-of-the-art methods, providing improved (and robust) matching results.

2.2 Cross-Domain Schema Matching using Deep Similarity Matrix Adjustment and Evaluation

In a recent paper [18], we show that deep learning can also be applied to “small” matching problems such as schema matching, making extensive use of similarity matrices. We offer a novel post-processing step to schema matching that improves the final matching outcome without human intervention. We present a new mechanism, similarity matrix adjustment, to calibrate a matching result and propose an algorithm (dubbed ADnEV) that uses deep neural networks to manipulate similarity matrices created by state-of-the-art algorithmic matchers.

ADnEV uses deep neural networks, providing a data-driven approach for extracting hidden representative features for an automatic schema matching process and removing the requirement for manual feature engineering. ADnEV learns two conjoint neural network models for adjusting and evaluating a similarity matrix. The ADnEV algorithm applies these models to iteratively adjust and evaluate new similarity matrices (illustrated in Figure 2), created by state-of-the-art matchers. With such a tool at hand, we enhance the ability to introduce new data sources to existing systems without the need to rely on either domain experts (knowledgeable of the domain but less so of the best matchers to use) or data integration specialists (who lack sufficient domain knowledge). Having a trained ADnEV model also supports systems where human final judgement is needed by regulation, e.g., healthcare, by offering an improved matching recommendation.

[Figure 2 diagram: starting from M^0 = M, the adjuster AD produces M^1 = AD(M^0), ..., M^t = AD(M^(t-1)), each an n-by-m similarity matrix; the evaluator EV scores each matrix, and the process repeats until EV(M^t) < EV(M^(t-1)).]

Figure 2: ADnEV algorithm illustrated

We empirically demonstrate the effectiveness of ADnEV for improving matching results, using real-world benchmark ontology and schema sets. We show that ADnEV can generalize into new domains without the need to learn the domain terminology, thus allowing cross-domain learning. We also show ADnEV to be a powerful tool in handling schemata for which matching is particularly challenging. Finally, we show the benefit of using ADnEV in a related integration task of ontology alignment.

3. HUMANS IN

The Humans In approach aims at investigating whether the current role humans take in the matching process is effective and whether alternative roles can improve the overall performance of the matching process.

[Figure 3 plot: x-axis Confidence (0.0 to 1.0), y-axis Correctness (Average Precision, 0.0 to 1.0); plot annotation: MC = 0.69.]

Figure 3: Correctness by confidence, partitioned into buckets of 0.1

By way of motivation, we provide an illustration (Figure 3) of the relationship between human confidence in matching and correctness (in terms of precision) based on our experiments [1, 17]. It is clear that human subjective confidence cannot serve as a good predictor of matching correctness. Next, we describe a work that shows how human biases affect confidence levels via consistency dimensions.

3.1 A Cognitive Model of Human Matching Bias

A recent study [1], aided by metacognitive models, analyzes the consistency of human matchers. We explore three main consistency dimensions as potential cognitive biases, taking into account the time it takes to reach a matching decision, the extent of agreement among human matchers, and the assistance of algorithmic matchers. In particular, we showed that when an algorithmic suggestion is available, humans tend to accept it as true, in sharp contradiction to the conventional validation role of human matchers.

Interestingly enough, all dimensions were found to be predictive of both the confidence and the accuracy of human matchers. This indicates that 1) humans have cognitive biases affecting their ability to provide consistent matching decisions, and 2) such biases have predictive value in determining to what extent a human matcher’s alignment decision is accurate. Our empirical evaluation serves as a proof-of-concept that validates the important role of humans as participants in the matching process, and less so as validators. As an example, Figure 4 compares confidence with correctness by showing the proportion of correctly identified correspondences out of all correspondences (i.e., precision), partitioned according to elapsed time (red), and the mean confidence across all human matchers, again partitioned according to elapsed time (blue). For each measure we also include a linear trend-line and error bars (standard deviation) for each time bucket.

[Figure 4 plot: x-axis Time Spent, in buckets 0-5, 6-10, ..., 56-60; y-axis Proportion (0.0 to 1.0); legend: Trend Correct, Mean of Correct, Trend Confidence, Mean of Confidence.]

Figure 4: Temporal Dimension: Confidence (Blue) and correctness (Red) by elapsed time

As time passes, fewer of the decisions made by humans are correct and human confidence declines.

3.2 InCognitoMatch: Cognitive-aware Matching via Crowdsourcing

Acknowledging cognitive awareness in human matching, we recently proposed InCognitoMatch [19], the first cognitive-aware crowdsourcing application for matching tasks. InCognitoMatch provides a handy tool to validate, annotate, and correct correspondences using the crowd whilst accounting for human matching biases. In addition, InCognitoMatch enables system administrators to control the context information visible to workers and analyze their performance accordingly. For crowd workers, InCognitoMatch is an easy-to-use application that may be accessed from multiple crowdsourcing platforms. In addition, workers completing a task are offered suggestions for follow-up sessions according to their performance in the current session. We foresee that such a tool will become handy in matching schemata in big data settings, where schema descriptions may be poorly documented and human expertise becomes a scarce resource.

4. ONGOING AND FUTURE RESEARCH

In this paper we presented our approach for human involvement in the matching loop, introducing tasks where humans can be replaced and emphasizing our vision for understanding human behavior to allow better engagement. An additional overarching goal is to propose a common matching framework that would allow treating matching as a unified problem, whether we match schema attributes, ontology elements, process activities, entity tuples, etc. Next, we describe some concrete ongoing and future research directions.

Cognition-aware Matching Collaboration: Match consistency was introduced in [1] as a measure of human matching variability along potential bias dimensions. As a direct future direction, we design a collaboration matcher that combines human and algorithmic opinions to improve the matching outcome by compensating for human biases along the consistency dimensions defined in [1], namely temporal, consensuality, and control. We validated the proposed matcher using an empirical study with human and algorithmic matchers over a well-known benchmark, showing that it provides better matching performance than human or algorithmic matching performed separately.

Expert Identification: In [1] we show that humans have cognitive biases decreasing their ability to perform matching tasks effectively (see Section 3.1). Expert identification aims to predict a human's qualification to serve as an “expert” for a matching task. We intend to explore predictive behaviors that capture the process of human matching by transforming physical aspects (such as time, screen scrolls, mouse tracking, and eye movement) into features that can be used for examining the role of humans in the matching process. This, in turn, would enable matching systems to carefully select a matching expert that fits the task.

Learning from Matchers: Using machine learning for data integration raises the issue of a shortage of labeled data for supervised learning [5, 9, 10, 12, 18]. Hence, pursuing less-than-supervised (e.g., unsupervised, weakly supervised) methods would be a natural next step to follow. In a nutshell, we will propose a framework that uses pre-trained embeddings to represent data elements, processes a candidate pair to be matched with a bidirectional LSTM, and is trained using state-of-the-art heuristic matchers. Once trained, the framework will be independent of both human input and human-designed heuristics. Initial empirical evaluation shows that the proposed framework performs better than multiple baselines and provides insights on future technique choices.

Matching Relevance: The vision we put forward is for the creation of a probabilistic matching relevance framework that will allow matching tasks to consider matching intent when creating a match. An intent reflects user preferences that may relate to granularity level, system requirement, match context, or simply individual inclination. We will present a probabilistic model of a match, showing that intent, either implicitly or explicitly specified, enables more accurate matching by better separating the relevant from the irrelevant. The proposed probabilistic notation will describe uncertainty in existing matching problems in general and, accompanied by an intent, will enable assessment of the relevance of a match to a system rather than its correctness.

Acknowledgments

I would like to thank Dr. Haggai Roitman, Dr. Tomer Sagi, Dr. Ofra Amir, and Coral Scharf, for their involvement in this research.

5. REFERENCES

[1] R. Ackerman, A. Gal, T. Sagi, and R. Shraga. A cognitive model of human bias in matching. In PRICAI: Trends in Artificial Intelligence, 2019.
[2] Z. Bellahsene, A. Bonifati, and E. Rahm, editors. Schema Matching and Mapping. Data-Centric Systems and Applications. Springer, 2011.
[3] C. J. Burges. From ranknet to lambdarank to lambdamart: An overview. Learning, 11:23–581, 2010.
[4] H. H. Do and E. Rahm. Coma: a system for flexible combination of schema matching approaches. In Proceedings of VLDB, pages 610–621. VLDB Endowment, 2002.
[5] X. L. Dong and T. Rekatsinas. Data integration and machine learning: A natural synergy. In SIGMOD, pages 1645–1650. ACM, 2018.
[6] A. Gal. Uncertain Schema Matching. Synthesis Lectures on Data Management. Morgan & Claypool Publishers, 2011.
[7] A. Gal, H. Roitman, and S. Roee. Heterogeneous data integration by learning to rerank schema matches. In IEEE International Conference on Data Mining, ICDM. IEEE Computer Society, 2018.
[8] A. Gal, H. Roitman, and T. Sagi. From diversity-based prediction to better ontology & schema matching. In Proceedings of the 25th International Conference on World Wide Web, pages 1145–1155. International World Wide Web Conferences Steering Committee, 2016.
[9] A. Gal, H. Roitman, and R. Shraga. Learning to rerank schema matches. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2019.
[10] F. Jabeen, H. Leopold, and H. A. Reijers. How to make process model matching work better? an analysis of current similarity measures. In International Conference on Business Information Systems, pages 181–193. Springer, 2017.
[11] C. Macdonald, R. L. T. Santos, and I. Ounis. The whens and hows of learning to rank for web search. Information Retrieval, 16(5):584–628, Oct 2013.
[12] S. Mudgal, H. Li, T. Rekatsinas, A. Doan, Y. Park, G. Krishnan, R. Deep, E. Arcaute, and V. Raghavendra. Deep learning for entity matching: A design space exploration. In Proceedings of the 2018 International Conference on Management of Data, pages 19–34. ACM, 2018.
[13] T. Sagi and A. Gal. Schema matching prediction with applications to data source discovery and dynamic ensembling. The VLDB Journal, 22(5):689–710, 2013.
[14] B. Schwartz. The paradox of choice: Why more is less. Ecco New York, 2004.
[15] R. Shraga. (artificial) mind over matter: Integrating humans and algorithms in solving matching problems. In Proceedings of the 2018 International Conference on Management of Data (SIGMOD). ACM, 2018.
[16] R. Shraga and A. Gal. The changing roles of humans and algorithms in (process) matching. In International Conference on Business Process Management, pages 106–109. Springer, 2019.
[17] R. Shraga, A. Gal, and H. Roitman. What type of a matcher are you?: Coordination of human and algorithmic matchers. In Proceedings of the Workshop on Human-In-the-Loop Data Analytics, HILDA@SIGMOD 2018, Houston, TX, USA, June 10, 2018, pages 12:1–12:7, 2018.
[18] R. Shraga, A. Gal, and H. Roitman. Adnev: Cross-domain schema matching using deep similarity matrix adjustment and evaluation. PVLDB, 13(9):1401–1415, 2020.
[19] R. Shraga, C. Scharf, R. Ackerman, and A. Gal. Incognitomatch: Cognitive-aware matching via crowdsourcing. In Proceedings of the 2020 International Conference on Management of Data, SIGMOD. ACM, 2020.
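
As a supplementary illustration of the iterative adjust-and-evaluate idea described for ADnEV in Section 2.2, the following is a minimal sketch and not the authors' implementation. The function names (`adjust_and_evaluate`, `toy_adjust`, `toy_evaluate`), the `max_iters` cap, and the choice to return the last matrix whose score did not drop are all assumptions made for illustration; in [18] the adjuster and evaluator are learned neural network models, which are replaced here by simple hand-written stand-ins.

```python
from typing import Callable, List

Matrix = List[List[float]]


def adjust_and_evaluate(
    matrix: Matrix,
    adjust: Callable[[Matrix], Matrix],
    evaluate: Callable[[Matrix], float],
    max_iters: int = 10,
) -> Matrix:
    """Repeatedly adjust a similarity matrix until the evaluator's score drops.

    Assumption of this sketch: the matrix returned is the last one whose
    score did not fall below the score of its predecessor.
    """
    best, best_score = matrix, evaluate(matrix)
    for _ in range(max_iters):
        candidate = adjust(best)
        score = evaluate(candidate)
        if score < best_score:  # stop rule: EV(M^t) < EV(M^(t-1))
            break
        best, best_score = candidate, score
    return best


# Hypothetical stand-ins for the two learned models (NOT the networks of [18]).
def toy_adjust(matrix: Matrix) -> Matrix:
    """Push every similarity value away from 0.5, toward 0 or 1."""
    return [[x * x / (x * x + (1 - x) * (1 - x)) for x in row] for row in matrix]


def toy_evaluate(matrix: Matrix) -> float:
    """Score a matrix by how decisive its entries are (mean of |2x - 1|)."""
    cells = [x for row in matrix for x in row]
    return sum(abs(2 * x - 1) for x in cells) / len(cells)


if __name__ == "__main__":
    # Rows and columns index the attributes of two schemata (toy values).
    m0 = [[0.9, 0.2], [0.4, 0.7]]
    m_out = adjust_and_evaluate(m0, toy_adjust, toy_evaluate, max_iters=3)
    for row in m_out:
        print([round(x, 3) for x in row])
```

Passing the adjuster and evaluator as plain callables keeps the loop independent of how the two models are built, so trained models could be substituted for the toy functions without changing the control flow.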