Efficient Machine Learning Methods over Pairwise Space (keynote)

Hung Son Nguyen
University of Warsaw

Keywords
Rough sets, Support Vector Machine, Factorization Machine, Distance Metric Learning, Context-Aware Recommendation

Extended Abstract

In recent years many machine learning concepts and methods have been developed on the set of pairs of objects. In this paper, the set of all pairs of objects is called the pairwise space. Note that if the set of objects $\mathcal{X} = \{x_1, x_2, \ldots, x_n\}$ consists of $n$ instances, then the pairwise space contains $O(n^2)$ pairs. This is why straightforward implementations of those methods are not applicable to big data sets with millions of objects.

The main concepts of rough set theory (RS), such as reducts, lower and upper approximations, decision rules or discretizations, have been defined in terms of the discernibility matrix, which is a form of the pairwise space [1]. For example, in the minimal decision reduct problem we look for a minimal subset of features that preserves the discernibility between objects from different decision classes [2].

Support Vector Machine (SVM) is also a classification method described as an optimization problem over the pairwise space [3]. The initial idea of looking for the linear classifier with the maximal margin is transformed into the problem of finding a set of coefficients $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$, one per object, that maximizes the objective function

$$ W(\alpha) = \sum_{i} \alpha_i - \frac{1}{2} \sum_{i,j} y_i y_j \alpha_i \alpha_j K(x_i, x_j), $$

defined on the set of dot products of all pairs of objects. In the above formula $y_i$ denotes the decision class of the object $x_i$ and $K$ is a kernel function chosen by the user.

Distance Metric Learning (DML) [4] is a machine learning discipline that looks for the best distance function (also divergence or similarity) from certain available information about similarities between pairs or triplets of data. These similarities are determined by the sets

$$ S = \{(x_i, x_j) \in \mathcal{X} \times \mathcal{X} : x_i \text{ and } x_j \text{ are similar}\}, $$
$$ D = \{(x_i, x_j) \in \mathcal{X} \times \mathcal{X} : x_i \text{ and } x_j \text{ are not similar}\}, $$
$$ R = \{(x_i, x_j, x_l) \in \mathcal{X} \times \mathcal{X} \times \mathcal{X} : x_i \text{ is more similar to } x_j \text{ than to } x_l\}. $$

With these data and similarity constraints, the problem is to look for those distance functions (belonging to a predefined family of distances $\mathcal{D}$) that minimize a certain loss function $\ell$ determined on the basis of the sets $S$, $D$ and $R$. In other words, the objective of DML is to solve the optimization problem

$$ \min_{d \in \mathcal{D}} \ell(d, S, D, R). $$

In recent years, many efficient implementations of the disciplines mentioned above have been proposed and developed. Most of them are based either on the approximation idea [5], [6] or on deep learning [7].

Context-aware recommendation is another machine learning concept defined on the pairwise space: it was in fact formulated as a regression problem over the set of transaction pairs [8].

In this talk we compare different techniques for the machine learning concepts mentioned above and pay particular attention to the application of factorization machines (FM). This method has been successfully applied to context-aware recommendation systems [9]. Its main idea is to transform an optimization problem established on the pairwise space into an equivalent problem in which the time complexity of each iteration is reduced from quadratic to linear [10]. We will show that factorization machines can also be applied to some problems in rough set theory, SVM and distance metric learning.
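To make the pairwise-space cost concrete, the short sketch below (our illustration, not part of the original abstract) evaluates the SVM dual objective $W(\alpha)$ directly from its definition. It needs the full $n \times n$ matrix of kernel values $K(x_i, x_j)$, so both memory and time grow quadratically with the number of objects; the RBF kernel and the toy data are arbitrary choices made only for the example.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=1.0):
    """Gram matrix of all O(n^2) pairwise kernel values K(x_i, x_j)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * sq_dists)

def dual_objective(alpha, y, K):
    """W(alpha) = sum_i alpha_i - 1/2 sum_{i,j} y_i y_j alpha_i alpha_j K(x_i, x_j)."""
    u = alpha * y
    return alpha.sum() - 0.5 * (u @ K @ u)

# toy data: n objects with decision classes in {-1, +1}
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.where(X[:, 0] > 0.0, 1.0, -1.0)
alpha = rng.uniform(0.0, 1.0, size=len(X))

K = rbf_kernel_matrix(X)   # n x n matrix: the part that does not scale to millions of objects
print(dual_objective(alpha, y, K))
```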
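Similarly, one simple (and by no means the only) way to instantiate the loss $\ell(d, S, D, R)$ is sketched below for a Mahalanobis-type distance $d_M(x, y) = \sqrt{(x-y)^\top M (x-y)}$; the particular penalty terms and the margin parameter are our own assumptions, chosen only to illustrate how the constraint sets $S$, $D$ and $R$ enter the objective.

```python
import numpy as np

def mahalanobis(x, y, M):
    """d_M(x, y) = sqrt((x - y)^T M (x - y)) for a positive semi-definite matrix M."""
    diff = x - y
    return float(np.sqrt(diff @ M @ diff))

def dml_loss(M, S, D, R, margin=1.0):
    """One possible loss l(d_M, S, D, R): pull similar pairs together, push
    dissimilar pairs at least `margin` apart, and enforce the relative
    ordering encoded by the triplets in R."""
    loss = sum(mahalanobis(a, b, M) ** 2 for a, b in S)
    loss += sum(max(0.0, margin - mahalanobis(a, b, M)) ** 2 for a, b in D)
    loss += sum(max(0.0, margin + mahalanobis(a, p, M) - mahalanobis(a, n, M)) for a, p, n in R)
    return loss

# tiny usage example with hypothetical constraint sets
rng = np.random.default_rng(0)
pts = rng.normal(size=(6, 3))
S = [(pts[0], pts[1])]
D = [(pts[2], pts[3])]
R = [(pts[0], pts[1], pts[4])]
M = np.eye(3)              # start from the plain Euclidean metric
print(dml_loss(M, S, D, R))
```

Minimizing such a function over a parameterized family of positive semi-definite matrices $M$ is then an instance of $\min_{d \in \mathcal{D}} \ell(d, S, D, R)$; note that $S$, $D$ and $R$ are subsets of the pairwise (or triplet) space, so their size is again the bottleneck for large $n$.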
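Finally, the quadratic-to-linear reformulation used by factorization machines [10] rests on the identity

$$ \sum_{i<j} \langle v_i, v_j \rangle x_i x_j = \frac{1}{2} \sum_{f=1}^{k} \Big[ \big( \sum_i v_{i,f} x_i \big)^2 - \sum_i v_{i,f}^2 x_i^2 \Big], $$

which replaces an explicit sum over all pairs by sums over single indices. The sketch below is our own minimal illustration of this identity (variable names and the random rank-$k$ factor matrix V are arbitrary); it checks that the naive O(p^2 k) pairwise computation and the O(p k) reformulation agree.

```python
import numpy as np

def fm_pairwise_naive(x, V):
    """Direct sum over all pairs of features: O(p^2 * k) time."""
    p = len(x)
    total = 0.0
    for i in range(p):
        for j in range(i + 1, p):
            total += (V[i] @ V[j]) * x[i] * x[j]
    return total

def fm_pairwise_linear(x, V):
    """Reformulated sum: 1/2 * sum_f [(sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2],
    computable in O(p * k) time."""
    s = V.T @ x                # length-k vector of weighted sums
    s_sq = (V ** 2).T @ (x ** 2)   # length-k vector of squared-term sums
    return 0.5 * float(np.sum(s ** 2 - s_sq))

rng = np.random.default_rng(1)
p, k = 300, 8
x = rng.normal(size=p)
V = rng.normal(scale=0.1, size=(p, k))
assert np.isclose(fm_pairwise_naive(x, V), fm_pairwise_linear(x, V))
print(fm_pairwise_linear(x, V))
```

This is the reduction of the per-iteration cost from quadratic to linear referred to above.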
References

[1] Z. Pawlak, Rough sets, International Journal of Computer and Information Sciences 11 (1982) 341–356.
[2] Z. Pawlak, A. Skowron, Rudiments of rough sets, Information Sciences 177 (2007) 3–27.
[3] C. Cortes, V. Vapnik, Support-vector networks, Machine Learning 20 (1995) 273–297.
[4] J.-L. Suárez, S. García, F. Herrera, A tutorial on distance metric learning: Mathematical foundations, algorithms, experimental analysis, prospects and challenges, Neurocomputing 425 (2021) 300–322.
[5] H. S. Nguyen, Approximate Boolean Reasoning: Foundations and Applications in Data Mining, Springer-Verlag, Berlin, Heidelberg, 2006, pp. 334–506.
[6] T. Joachims, Learning to Classify Text Using Support Vector Machines – Methods, Theory, and Algorithms, Kluwer/Springer, 2002.
[7] P. H. Barros, F. Queiroz, F. Figueredo, J. A. dos Santos, H. S. Ramos, A new similarity space tailored for supervised deep metric learning, CoRR abs/2011.08325 (2020). URL: https://arxiv.org/abs/2011.08325. arXiv:2011.08325.
[8] S. Rendle, Z. Gantner, C. Freudenthaler, L. Schmidt-Thieme, Fast context-aware recommendations with factorization machines, in: Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '11, Association for Computing Machinery, New York, NY, USA, 2011, pp. 635–644. URL: https://doi.org/10.1145/2009916.2010002. doi:10.1145/2009916.2010002.
[9] X. Xin, B. Chen, X. He, D. Wang, Y. Ding, J. Jose, CFM: Convolutional factorization machines for context-aware recommendation, in: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19, International Joint Conferences on Artificial Intelligence Organization, 2019, pp. 3926–3932. URL: https://doi.org/10.24963/ijcai.2019/545. doi:10.24963/ijcai.2019/545.
[10] S. Rendle, Factorization machines with libFM, ACM Trans. Intell. Syst. Technol. 3 (2012) 57:1–57:22. URL: http://doi.acm.org/10.1145/2168752.2168771. doi:10.1145/2168752.2168771.