=Paper=
{{Paper
|id=Vol-2951/keynote2
|storemode=property
|title=Efficient Machine Learning Methods over Pairwise Space (keynote)
|pdfUrl=https://ceur-ws.org/Vol-2951/keynote2.pdf
|volume=Vol-2951
|authors=Hung Son Nguyen
|dblpUrl=https://dblp.org/rec/conf/csp/Nguyen21
}}
==Efficient Machine Learning Methods over Pairwise Space (keynote)==
<pdf width="1500px">https://ceur-ws.org/Vol-2951/keynote2.pdf</pdf>
<pre>
Efficient Machine Learning Methods
over Pairwise Space (keynote)
Hung Son Nguyen
University of Warsaw


                                      Keywords
                                      Rough sets, Support Vector Machine, Factorization Machine, Distance Metric Learning, Context-Aware
                                      Recommendation


Extended Abstract
In recent years many machine learning concepts and methods were developed on the set of
pairs of objects. In this paper, the set of all pairs of objects is called the pairwise space. Let us
notice that if the set of objects 𝒳 = {x1 , x2 , . . . , x𝑛 } consists of 𝑛 instances, then the pairwise
space contains 𝑂(𝑛2 ) pairs. Thus why the straightforward implementations of those methods
are not applicable for big data sets with millions of objects.
   The main concepts in rough set theory (RS) such as reducts, lower and upper approximations,
decision rules or discretizations have been defined in term of the discernibility matrix, which
is a form of the pairwise space [1]. For example, in minimal decision reduct problem, we are
looking for the minimal subset of features that preserves the discernibility between objects
from different decision classes [2].
   Support Vector Machine (SVM) is also a classification method described as an optimization
problem over the pairwise space [3]. The initial idea of looking for the linear classifier with
the maximal margin were transformed into the problem of looking for a set of coefficients
𝛼 = (𝛼1 , 𝛼2 , · · · , 𝛼𝑛 ) related to objects that maximizes an objective function
                                                                                  ∑︁               1 ∑︁
                                                                 𝑊 (𝛼) =                 𝛼𝑖 −           𝑦𝑖 𝑦𝑗 𝛼𝑖 𝛼𝑗 𝐾(𝑥𝑖 , 𝑥𝑗 ).
                                                                                                   2
                                                                                     𝑖                  𝑖,𝑗

defined on the set of dot products of all pairs of objects. In the above formula 𝑦𝑖 denotes the
decision class of the object x𝑖 and 𝐾 is a kernel function chosen by the user.
   Distance Metric Learning (DML) [4] is a machine learning discipline that looks for the best
distance function (also divergence or similarity ) from certain available information about
similarity measures between different pairs or triplets of data. These similarities are determined

29th International Workshop on Concurrency, Specification and Programming (CS&P’21)
$ son@mimuw.edu.pl (H. S. Nguyen)
 0000-0002-3236-5456 (H. S. Nguyen)
                                    © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
 CEUR
 Workshop
 Proceedings
               http://ceur-ws.org
               ISSN 1613-0073       CEUR Workshop Proceedings (CEUR-WS.org)
by the sets

              𝑆 = {(x𝑖 , x𝑗 ) ∈ 𝒳 × 𝒳 : x𝑖 and x𝑗 are similar.}
              𝐷 = {(x𝑖 , x𝑗 ) ∈ 𝒳 × 𝒳 : x𝑖 and x𝑗 are not similar.}
              𝑅 = {(x𝑖 , x𝑗 , x𝑙 ) ∈ 𝒳 × 𝒳 × 𝒳 : x𝑖 is more similar to x𝑗 than to x𝑙 .}

With these data and similarity constraints, the problem to is to look for those distance functions
(belonging to a predefined family of distances 𝒟) that minimize a certain loss function ℓ
determined on the base of the sets 𝑆, 𝐷 and 𝑅. In other words, the objective of DML is to solve
the optimization problem

                                         min ℓ(𝑑, 𝑆, 𝐷, 𝑅)
                                         𝑑∈𝒟

   In recent years, many efficient implementations for the mentioned above disciplines have
been proposed and developed. Most of them are based either on the approximation idea [5], [6]
or deep learning [7].
   Context aware recommendation systems is another machine learning concept that were
defined on the pairwise space which was in fact defined as a regression problem over the set of
transaction pairs [8].
   In this talk we compare different techniques for the mentioned above machine learning
concepts and we will pay an attention on application of factorization machine (FM). This
method has been successfully applied for context aware recommendation systems [9]. The
main idea is to transform the optimization problem established on the pairwise space into
an equivalent problem where the time complexity for each iteration has been reduced from
quadratic time into linear time [10]. We will show that factorization machine can be also applied
for some problems in rough set theory, SVM or distance metric learning.


References
 [1] Z. Pawlak, Rough sets, International Journal of Information and Computer Sciences 11
     (1982) 341–356.
 [2] Z. Pawlak, A. Skowron, Rudiments of rough sets, Information Sciences 177 (2007) 3–27.
 [3] C. Cortes, V. Vapnik, Support vector networks, Machine Learning 20 (1995) 273–297.
 [4] J.-L. Suárez, S. García, F. Herrera, A tutorial on distance metric learning: Mathematical
     foundations, algorithms, experimental analysis, prospects and challenges, Neurocomputing
     425 (2021) 300–322.
 [5] H. S. Nguyen, Approximate Boolean Reasoning: Foundations and Applications in Data
     Mining, Springer-Verlag, Berlin, Heidelberg, 2006, p. 334–506.
 [6] T. Joachims, Learning to Classify Text Using Support Vector Machines – Methods, Theory,
     and Algorithms, Kluwer/Springer, 2002.
 [7] P. H. Barros, F. Queiroz, F. Figueredo, J. A. dos Santos, H. S. Ramos, A new similarity
     space tailored for supervised deep metric learning, CoRR abs/2011.08325 (2020). URL:
     https://arxiv.org/abs/2011.08325. arXiv:2011.08325.
 [8] S. Rendle, Z. Gantner, C. Freudenthaler, L. Schmidt-Thieme, Fast context-aware rec-
     ommendations with factorization machines, in: Proceedings of the 34th International
     ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR
     ’11, Association for Computing Machinery, New York, NY, USA, 2011, p. 635–644. URL:
     https://doi.org/10.1145/2009916.2010002. doi:10.1145/2009916.2010002.
 [9] X. Xin, B. Chen, X. He, D. Wang, Y. Ding, J. Jose, Cfm: Convolutional factorization
     machines for context-aware recommendation, in: Proceedings of the Twenty-Eighth
     International Joint Conference on Artificial Intelligence, IJCAI-19, International Joint
     Conferences on Artificial Intelligence Organization, 2019, pp. 3926–3932. URL: https:
     //doi.org/10.24963/ijcai.2019/545. doi:10.24963/ijcai.2019/545.
[10] S. Rendle, Factorization machines with libfm, ACM Trans. Intell. Syst. Technol. 3 (2012)
     57:1–57:22. URL: http://doi.acm.org/10.1145/2168752.2168771. doi:10.1145/2168752.
     2168771.

</pre>