Efficient Machine Learning Methods
over Pairwise Space (keynote)
Hung Son Nguyen
University of Warsaw
Keywords
Rough sets, Support Vector Machine, Factorization Machine, Distance Metric Learning, Context-Aware
Recommendation
Extended Abstract
In recent years, many machine learning concepts and methods have been developed on the set of
pairs of objects. In this paper, the set of all pairs of objects is called the pairwise space. Note
that if the set of objects $X = \{x_1, x_2, \ldots, x_n\}$ consists of $n$ instances, then the pairwise
space contains $O(n^2)$ pairs. This is why straightforward implementations of those methods
are not applicable to big data sets with millions of objects.
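To make the quadratic growth concrete, the following Python sketch enumerates the pairwise space of a tiny object set and computes its size for a larger one. The function names are illustrative, and treating pairs as unordered pairs of distinct objects is an assumption; the text above only states the $O(n^2)$ bound.

```python
from itertools import combinations


def pairwise_space(objects):
    """Return all unordered pairs of distinct objects (assumed convention)."""
    return list(combinations(objects, 2))


def pairwise_space_size(n):
    """Number of unordered pairs of n objects: n(n-1)/2, i.e. O(n^2)."""
    return n * (n - 1) // 2


pairs = pairwise_space(["x1", "x2", "x3", "x4"])
print(len(pairs))                      # 6
print(pairwise_space_size(1_000_000))  # 499999500000
```

With a million objects the pairwise space already holds roughly half a trillion pairs, so materializing it as a list, as `pairwise_space` does, is only feasible for small inputs.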
The main concepts in rough set theory (RS), such as reducts, lower and upper approximations,
decision rules, or discretizations, have been defined in terms of the discernibility matrix, which
is a form of the pairwise space [1]. For example, in the minimal decision reduct problem, we are
looking for the minimal subset of features that preserves the discernibility between objects
from different decision classes [2].
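The reduct example can be sketched on a toy decision table. The code below is a minimal illustration, not the method of [1] or [2]: it builds a discernibility matrix over pairs of objects from different decision classes and then searches feature subsets by brute force. The table, the function names, and the exhaustive search are all assumptions made for illustration.

```python
from itertools import combinations


def discernibility_matrix(rows, decisions):
    """For each pair of objects with different decisions, collect the
    indices of the features on which the two objects differ."""
    matrix = {}
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] != decisions[j]:
            matrix[(i, j)] = {a for a in range(len(rows[i]))
                              if rows[i][a] != rows[j][a]}
    return matrix


def preserves_discernibility(features, matrix):
    """True if every non-empty matrix entry shares a feature with `features`."""
    return all(entry & features for entry in matrix.values() if entry)


def minimal_reducts(rows, decisions):
    """Brute-force search for the smallest feature subsets that preserve
    discernibility (exponential in the number of features)."""
    matrix = discernibility_matrix(rows, decisions)
    n_features = len(rows[0])
    for size in range(0, n_features + 1):
        found = [set(c) for c in combinations(range(n_features), size)
                 if preserves_discernibility(set(c), matrix)]
        if found:
            return found
    return []


# Hypothetical decision table: 4 objects, 3 binary features.
rows = [(1, 0, 0), (1, 1, 0), (0, 0, 1), (0, 1, 1)]
decisions = [0, 1, 0, 1]
print(minimal_reducts(rows, decisions))  # [{1}]
```

Here feature 1 alone separates every pair of objects with different decisions, so it is the only minimal reduct. Both the matrix construction and the subset search walk over all object pairs, which is the cost that becomes prohibitive on large data sets.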
Support Vector Machine (SVM) is also a classification method described as an optimization
problem over the pairwise space [3]. The initial idea of looking for the linear classifier with
the maximal margin was transformed into the problem of looking for a set of coefficients
$\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$ related to objects that maximizes the objective function
$$W(\alpha) = \sum_{i} \alpha_i - \frac{1}{2} \sum_{i,j} y_i y_j \alpha_i \alpha_j K(x_i, x_j),$$
defined on the set of dot products of all pairs of objects. In the above formula, $y_i$ denotes the
decision class of the object $x_i$ and $K$ is a kernel function chosen by the user.
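As a rough illustration of why this objective lives on the pairwise space, the sketch below evaluates $W(\alpha)$ directly from the formula for a given coefficient vector. It only evaluates the objective; it does not maximize it, and it does not check the usual constraints of the dual problem. The names `linear_kernel` and `dual_objective` and the two-point data set are assumptions for illustration.

```python
def linear_kernel(u, v):
    """Plain dot product; any other kernel K could be substituted."""
    return sum(a * b for a, b in zip(u, v))


def dual_objective(alpha, xs, ys, kernel=linear_kernel):
    """W(alpha) = sum_i alpha_i
                  - 1/2 * sum_{i,j} y_i y_j alpha_i alpha_j K(x_i, x_j)."""
    n = len(xs)
    linear_term = sum(alpha)
    quadratic_term = sum(
        ys[i] * ys[j] * alpha[i] * alpha[j] * kernel(xs[i], xs[j])
        for i in range(n)
        for j in range(n)  # n^2 kernel evaluations: one per ordered pair
    )
    return linear_term - 0.5 * quadratic_term


# Hypothetical data: two objects from opposite decision classes.
xs = [(1.0, 0.0), (-1.0, 0.0)]
ys = [1, -1]
print(dual_objective([0.5, 0.5], xs, ys))  # 0.5
```

A single evaluation already requires $n^2$ kernel computations, which is the quadratic cost that makes a direct implementation impractical for millions of objects.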
Distance Metric Learning (DML) [4] is a machine learning discipline that looks for the best
distance function (or divergence or similarity) from available information about
similarity measures between different pairs or triplets of data. These similarities are determined
29th International Workshop on Concurrency, Specification and Programming (CS&P’21)
son@mimuw.edu.pl (H. S. Nguyen)
0000-0002-3236-5456 (H. S. Nguyen)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073, http://ceur-ws.org