<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Finding Concepts with Unexpected Multi-Label Ob jects Based on Shared Subspace Method</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>104, Department of Computer Science, Palacky University Olomouc</institution>
          ,
          <addr-line>2018. Copying permitted only for private and academic purposes</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Graduate School of Information Science and Technology Hokkaido University N-14 W-9</institution>
          ,
          <addr-line>Sapporo 060-0814</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>c paper author(s), 2018. Proceedings volume published and copyrighted by its editors. Paper published in Dmitry I. Ignatov</institution>
          ,
          <addr-line>Lhouari Nourine (Eds.): CLA 2018, pp. 93</addr-line>
        </aff>
      </contrib-group>
      <fpage>93</fpage>
      <lpage>104</lpage>
      <abstract>
        <p>We discuss a method of retrieving unexpected objects for a given query, where each data object is represented as a feature vector and assigned a multi-label as well. Given an object-feature matrix X1 and an object-label matrix X2, we try to simultaneously factorize X1 and X2 as X1 ≈ BV and X2 ≈ SW by means of Nonnegative Shared Subspace Method, where the basis S is a part (subspace) of the basis B. With the help of the shared subspace, thus, we can predict a multi-label for a query feature-vector with unknown labels. Our unexpected object for the query is defined as an object which is similar to the query in the feature space, but is dissimilar in the label space. In order to obtain unexpected objects from several viewpoints of similarity, we formalize our retrieval task as a problem of finding formal concepts satisfying a constraint w.r.t. the unexpectedness. We present an efficient depth-first branch-and-bound algorithm for extracting our target concepts.</p>
      </abstract>
      <kwd-group>
        <kwd>formal concept</kwd>
        <kwd>shared subspace method</kwd>
        <kwd>nonnegative matrix factorization</kwd>
        <kwd>unexpectedness of objects</kwd>
        <kwd>multi-labels</kwd>
        <kwd>recommendation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        Information Retrieval (IR) is a fundamental task in our daily life. In popular
keyword-based IR, since it is not easy to get desirable data objects by
providing query keywords just once, we iteratively input queries until we can meet
satisfiable results. Particularly, in Associative Search [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], at each step we
repeatedly input a query, our query is shifted to its sibling concept [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. As the
results, we often find an interesting search result which is surprising or
unexpected for us but still keeps a certain degree of relevance to our initial query.
The authors consider that such an aspect of associative search is strongly
desirable especially for recommendation-oriented IR systems. This paper discusses
a recommendation-oriented method for finding interesting objects for a given
query, especially taking an unexpectedness of objects for the query into account.
      </p>
      <p>
        A notion of unexpectedness in recommender systems has been discussed
in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. In the framework, the unexpectedness of an item (object) is evaluated
based on a distance between the item and those a user already knows in some
sense. Another related notions, novelty and serendipity, have also been
investigated in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. An object is said to be novel for a user if he/she does not know it.
For example, it can be evaluated based on user’s history of recommendations.
On the other hand, since the notion of serendipity is emotional and difficult to be
defined, it has been discussed in terms of diversity which is based on dissimilarity
among objects [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. It is noted here that those notions previously proposed are
subjectively defined because we need some kind of user-dependent information.
      </p>
      <p>In contrast with them, we propose an objective unexpectedness of objects. A
data object is usually represented as a vector in a primary feature space. Then,
the notions of novelty and serendipity have been formalized with the help of
additional information specific to particular users. Nowadays, however, several
kinds of additional information are also commonly available. In case of movie
objects, for example, each movie would be primarily represented as a vector
of feature terms extracted from their plots. In addition, each movie is often
assigned some genre labels by commercial companies or many SNS users. Since
those secondary features provide us valuable information about movies, they
would make our IR systems more flexible and useful for a wide range of users.</p>
      <p>In our framework, as such commonly-available additional features, we assume
each object is assigned some labels (as a multi-label) beforehand. That is, our
data objects are given by two data matrices, X1 and X2, each of which represents
an object-feature relation and object-label relation, respectively. Then, we propose
our notion of unexpectedness with respect to label-information of objects.</p>
      <p>
        More concretely speaking, in our recommendation-oriented IR, a query q
is given as a feature vector and supposed to have no label. As a reasonable
guess, we often consider that if an object x is similar to q in the feature space,
q would also have a multi-label similar to that of x. Conversely, if we observe
(by any means) their multi-labels are far from each other, we would find some
unexpectedness of x for q because they have distant multi-labels even though
their features are similar. Based on the idea of unexpectedness, we formalize our
IR task as a problem of detecting formal concepts [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] each of which contains
some unexpected objects in the extent. By finding those formal concepts, we can
obtain our search results from various viewpoints of similarity among the query
and objects.
      </p>
      <p>
        The point in our IR is to predict a multi-label of the query represented as
a feature-vector. For the task, our object-feature matrix X1 and object-label
matrix X2 are simultaneously factorized as X1 ≈ BV and X2 ≈ SW by
Nonnegative Shared Subspace Method [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], where the basis S is a part (subspace) of
the basis B. In a word, such a shared subspace associates the label-information
with the feature-information of the original matrices. With the shared subspace,
we can predict a multi-label for the query feature-vector with unknown labels.
      </p>
      <p>
        To predict a multi-label of a given object, a method of multi-label
classification has already been proposed in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. In the method, we need to obtain a
subspace and its orthogonal basis for the original feature space which
approximately reflects similarity among multi-labels assigned to the objects by solving
an eigenvalue problem. However, such an orthogonal basis yields negative
components in the subspace which complicate interpretation of our search results.
Moreover, orthogonality is not necessarily required in prediction purpose. As
its simplified and efficient version, a prediction method has also been discussed
in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], where a subspace is just defined as a real line. However, the prediction is
not so reliable as we will see later. This is the reason why we prefer nonnegative
factorization in our framework.
      </p>
      <p>
        In our recommendation-oriented IR, for a given query, we try to find formal
concepts whose extents contains some unexpected objects for the query. We
first create a formal context consisting of only objects similar (relevant) to the
query in the feature space with a standard Nonnegative Matrix Factorization [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Then, we try to extract concepts with unexpected ones in the context. Since we
often have a huge number of concepts, we evaluate a concept with its extent E
by the average distance between each object in E and the query in the
labelsubspace, and try to extract concepts with the top-N largest evaluation values.
We present a depth-first algorithm for finding those top-N concepts, where a
simple branch-and-bound pruning based on the evaluation function is available.
Our experimental result for a movie dataset shows our system can actually detect
an interesting concept of movies whose plots are similar to a given query but
some of them have genre-labels which are far from predicted genres of the query.
      </p>
      <p>
        From the viewpoint of Formal Concept Analysis (FCA) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], our study in this
paper is closely related to several interesting topics. In order to reduce the size
of formal context preserving important information, methods with non-negative
matrix factorizations have been investigated in [
        <xref ref-type="bibr" rid="ref3 ref5">3, 5</xref>
        ]. Although our method is
also based on such a factorization technique, the main purpose is not only to
reduce the context but also to associate label information with the reduced
context.
      </p>
      <p>
        A smaller lattice with a reduced number of concepts is desirable as a practical
requirement in FCA. Computing a subset of possible concepts, e.g., in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], is
a useful approach for that purpose. Interestingness measures of concepts can
meaningfully restrict our targets to be extracted and several representatives are
surveyed in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Although our method also proposes a kind of interestingness
based on unexpectedness of objects, we emphasize that it is a query-specific one.
2
      </p>
      <p>
        Approximating Data Matrices Based on Nonnegative
Shared Subspace Method
In this section, we discuss how to simultaneously approximate a pair of data
matrices representing different informations of the same (common) objects. The
approximation is based on Nonnegative Shared Subspace Method [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
2.1
      </p>
      <p>Approximating Object-Feature Matrix Reflecting Multi-Label
Information
Let X1 = f1 · · · fN be an M ×N object-feature matrix and X2 = `1 · · · `L
an M × L object-label matrix, where each object is represented as a row-vector.
As will be discussed later, in our experimentation, movies are regarded as objects
and terms (word) in their plots as features. In addition, each movie is assigned
a set of genre-labels as its multi-label.</p>
      <p>
        Since each object is often represented as a high-dimensional feature vector, it
would be required to compress the matrix X1. Moreover, although the number
of possible labels would be less than that of features, the matrix X2 also tends to
be sparse because each object has only a few labels in general. We, therefore, try
to compress both X1 and X2 by means of Nonnegative Matrix Factorization [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>More formally speaking, X1 and X2 are approximated as follows:
X1 ≈
X2 ≈
˜ ˜
f1 · · · fK</p>
      <p>V,</p>
      <p>where V = (vij )ij , fj ≈
˜ ˜
`1 · · · `KL</p>
      <p>WL,
where WL = (wiLj )ij , `j ≈</p>
      <p>K
X vij f˜i
i=1</p>
      <p>KL
X wiLj `˜i.
i=1
It is noted here that f˜1 · · · f˜K is a compressed representation of X1 and
`˜1 · · · `˜KL that of X2. As has been stated previously, we especially try to use
the latter matrix `˜1 · · · `˜KL as a part of the former f˜1 · · · f˜K in order to
associate the label-information with the feature-information in our
approximation process. That is, assuming a (KF = K − KL) × N coefficient matrix VF and
a KL × N coefficient matrix VL, we try to perform approximations such that
X1 ≈
X2 ≈
˜ ˜ ˜ ˜
f1 · · · fKF `1 · · · `KL
˜ ˜
`1 · · · `KL</p>
      <p>WL.</p>
      <p>VF
VL
= f˜1 · · · f˜KF</p>
      <p>˜ ˜
VF + `1 · · · `KL</p>
      <p>VL,
Note that the original column-vector fi of X1 is approximated by a linear
combination of basis vectors f˜j in F = (f˜1 · · · fKF ) and `˜j in L = (`˜1 · · · `˜KL )
˜
which are respectively unaffected and affected by label-compression.</p>
      <p>
        In order to obtain a certain degree of quality in the approximation process,
we have to care a balance between feature and label-compressions. Following [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
we take into account Frobenius Norm of the original matrices X1 and X2 and
try to solve the following optimization (minimization) problem:
min
      </p>
      <p>X1 −</p>
      <p>F |L
kX1k2F</p>
      <p>VF
VL
2
F + kX2 − LWLk2F ,
kX2k2F
where kX1kF and kX2kF can be treated as constants.</p>
      <p>
        Based on a similar discussion to the standard formulation of NMF [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], we
can obtain a set of multiplicative update rules for the optimization problem [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
With element-wise expressions and λ = kX1k2F /kX2k2F , we have
(L)ij ← (L)ij × (S)ij ,
(1)
where (S)ij is given by
      </p>
      <p>1
(S)ij
=
(LVLVLT + F VF VLT )ij + λ
(X1VLT + λX2WLT )ij</p>
      <p>(LWLWLT )ij
(X1VLT + λX2WLT )ij
for L defining the shared subspace. And, for V =
, WL and F , we have
VF</p>
      <p>VL
(V )ij ← (V )ij
( F |L</p>
      <p>T
( F |L</p>
      <p>T X1)ij
F |L V )ij</p>
      <p>,
(LT X2)ij
(WL)ij ← (WL)ij (LT LWL)ij</p>
      <p>and
(X1VFT )ij
(F )ij ← (F )ij (LVLVFT + F VF VFT )ij</p>
      <p>.
2.2</p>
    </sec>
    <sec id="sec-2">
      <title>Predicting Unknown Labels of Query</title>
      <p>Based on the matrix factorization discussed above, we can predict a multi-label
of a given query. We assume our query q with unknown-label is given as just an
N -dimensional (feature) vector, that is, q = (qi)1≤i≤N . A prediction about labels
of q can be performed by computing a coefficient vector for the basis vectors `˜j
reflecting label-information of objects in the factorization process. More precisely
speaking, the query can be represented as
Then we have
q =</p>
      <p>N
X qifi,
i=1
where fi ≈</p>
      <p>KF
X vjFif˜j +
j=1</p>
      <p>KL
X vjLi`˜j .</p>
      <p>j=1</p>
      <p>KF
q ≈ X
j=1</p>
      <p>N
X qivjFi
i=1
!</p>
      <p>KL
f˜j + X
j=1</p>
      <p>N !
X qivjLi `˜j .
i=1
Thus, the (KL-dimensional) coefficient vector for `˜j can be given by VLq.</p>
      <p>After the approximation, each object x is represented by its corresponding
row-vector vxT in the compressed matrix L = `˜1 · · · `˜KL reflecting their
original label-information. Therefore, a distance between vx and VLq provides us a
hint about which labels the query would have. If the vectors are close enough,
q seems to have labels similar to those of x. In other words, we can evaluate
a farness/closeness between labels of x and q by defining an adequate distance
function for those vectors. In the following discussion, we assume a distance
function, distL, based on cosine similarity between vectors. That is, for an object x
vxT VLq
and a query q, it is defined as distL(x, q) = 1 − ||vx|| ||VLq|| .
(2)
(3)
(4)
(5)</p>
      <p>Evaluating Similarity among Features of Objects
As is similar to the case of labels, we can evaluate similarity among features
of objects based on the compressed matrix (F |L). However, since the matrix is
affected by not only the feature compression but also the label compression, it
would not be adequate for evaluating only similarity of features. For our
evaluation of feature similarity, therefore, we try to approximate the original matrix X1
into an M × KT matrix HX1 with the standard NMF such that X1 ≈ HX1 WF ,
where each object x after the compression is given as its corresponding
rowvector hxT in HX1 . Therefore, we can evaluate similarity between features of x
and q by computing a distance between hx and WF q, denoted by distF (x, q).
3</p>
      <p>Extracting Formal Concepts with Unexpected</p>
      <p>Multi-Label Objects for Query
Towards a recommendation-oriented information retrieval, we present in this
section our method for finding formal concepts whose extents include some
unexpected objects for a given query. The reason why we try to detect formal
concepts is that the extent of a concept can be explicitly interpreted by its
intent. That is, the intent provides us a clear explanation why those objects
are grouped together. By extracting various concepts, we can therefore obtain
interesting object clusters from multiple viewpoints.
3.1</p>
      <p>Unexpected Objects Based on Predicted Multi-Label of Query
We first present our notion of unexpectedness of objects for a given query.
Especially, we propose here an objective definition for the notion.</p>
      <p>As has been discussed, we can implicitly predict a multi-label of a given query
q with unknown-label. More precisely, we can measure a farness/closeness
between labels of an object x and the query. In addition, we can evaluate similarity
of features between x and q. Suppose here that we find both x and q have similar
or relevant features. In such a case, it would be plausible that we expectedly
consider they also have similar/relevant multi-labels. However, if we observe their
labels are far from each other, we seem to find some unexpectedness of x for q
because they have distant multi-labels even though their features are similar.</p>
      <p>With the distance functions, distF in the feature-subspace and distL in the
label-subspace, we can formalize this kind of unexpectedness of objects for a
given query q with unknown-label.</p>
    </sec>
    <sec id="sec-3">
      <title>Definition 1. (Unexpected Object for Query)</title>
      <p>For an object x, if x satisfies the following two constraints, then x is said to be
unexpected for the query q:
Relevance/Similarity of Features : x is relevant/similar to q in the
featuresubspace, that is, distF (x, q) ≤ δF , and
Farness of Multi-Labels : x and q are distant from each other in the
labelsubspace, that is, distL(x, q) ≥ δL,
where δF (&gt; 0) and δL (&gt; 0) are user-defined parameters.</p>
      <p>Extracting Formal Concepts with Unexpected Objects
Constructing Formal Context for Query : Suppose the original
objectfeature matrix X1 is approximated as X1 ≈ HX1 WF with the standard NMF,
where HX1 is regarded as a compressed representation of X1. It is recalled that
the i-th object xi in X1 is given as the i-th row-vector viT = (vi1 · · · viKT ) in
HX1 , where the j-th element vij is the value of the feature fj for xi.</p>
      <p>In order to extract formal concepts with unexpected objects for a given query
q, we first define a formal context Cq consisting of the objects relevant to q.
Formally speaking, with the parameter δF , the set of relevant objects is defined
as Oq = {xi | xi ∈ O, distF (xi, q) ≤ δF }. The set of features (attributes), Fq, is
simply defined as Fq = {f1, . . . , fKT }. Moreover, introducing a parameter θ as
a threshold for feature values, we define a binary relation Rq ⊆ Oq × Fq as</p>
      <p>Rq = {(xi, fj ) | xi ∈ Oq, viT = (vi1 . . . viKT ) in HX1 and vij ≥ θ}.
Thus, our formal context is given by Cq = hOq, Fq, Rqi.</p>
      <p>Evaluating Formal Concepts : It is obvious that for each formal concept
in the context Cq, its extent consists of only objects relevant to q. The extent,
however, does not always have unexpected ones. For a purpose of
recommendation, since it would be desirable to involve some unexpected objects, we need to
evaluate formal concepts in Cq taking the farness of multi-labels into account.</p>
      <p>Let us consider a concept in Cq with its extent E. We evaluate the
concept by the average distance between each object in E and the query in the
label-subspace. That is, our evaluation function, eval, is defined as eval(E) =
Px∈E distL(x,q) .</p>
      <p>|E|</p>
      <p>Although our formal context consists of only objects relevant to the query, we
could have many concepts in some cases. We, therefore, try to obtain concepts
with the top-N largest evaluation values.</p>
      <p>Interpreting Intents of Formal Concepts : Our formal context is
constructed from the matrix HX1 which is a compressed representation of the
original object-feature matrix X1 approximated as X1 ≈ HX1 WF with the
standard NMF. That is, the intent of a concept in the context is given as a set of
compressed features interpretable in terms of original features. The relationship
among compressed and original features is represented as the matrix WF . Each
compressed feature is given as a row-vector in WF in which larger components
can be regarded as relevant original features. When we, therefore, interpret the
intent of a concept, it is preferable to show relevant original features as well as
each compressed one in the intent. As a simple way of dealing with that,
assuming a (small) integer k, we can present original features corresponding to the k
largest components for each compressed feature.
3.3</p>
      <p>Algorithm for Extracting Top-N Formal Concepts
We present our algorithm for extracting target formal concepts.
[Input] Cq = (Oq, Fq, Rq) : a formal context obtained for a query q</p>
      <p>N : a positive integer for top-N
[Output] FC : the set of formal concepts with the top-N largest evaluation values
procedure Main(Cq, N) :</p>
      <p>FC = ∅ ;
α = 0.0 ; // the N-th (tentative) evaluation value of concepts in FC
Fix a total order ≺ on Oq such that for any xi, xj ∈ Oq, xi ≺ xj if distL(xi, q) ≤ distL(xj, q) ;
while Oq 6= ∅ do
begin
x = head(Oq) ;
Oq = (Oq \ {x}) ; // removing x from Oq</p>
      <p>FCFind({x}, ∅, Oq) ; // Oq as candidate objects
end
return FC ;
procedure FCFind(P , P revExt, Cand) :</p>
      <p>F C = (Ext = P 00, P 0) ; // computing FC
if ∃x ∈ (Ext \ P revExt) such that x ≺ tail(P ) then</p>
      <p>return; // found to be duplicate formal concept
endif
Update FC adequately so that it keeps concepts with top-N largest evaluation values found so far ;
α = the N-th (tentative) evaluation value of concepts in FC;
while Cand 6= ∅ do
begin
x = head(Cand) ;
Cand = (Cand \ {x}) ; // removing x from Cand
NewCand = (Cand \ Ext) ; // new candidate objects
if NewCand = ∅ then continue ;
if eval(P ∪ {x} ∪ NewCand) &lt; α then continue ; // branch-and-bound pruning
FCFind(P ∪ {x}, Ext, NewCand) ;
end</p>
      <p>Let Cq = hOq, Fq, Rqi be a formal context constructed for a given query
q. As the basic strategy, generating extents of concepts, we explore object sets
along the set enumeration tree, rooted by ∅, based on a total ordering ≺ on Oq.
More concretely speaking, we expand a set of objects P into P x = P ∪ {x}
by adding an object x succeeding to tail(P ) and then compute ((P x)00, (P x)0)
to obtain a formal concept, where we refer to the last (tail) element of P as
tail(P ). Such an object x to be added is called a candidate and is selected from
cand(P ) = {x | x ∈ (Oq \ P 00), tail(P ) ≺ x}. Initializing P with ∅, we recursively
iterate this process in depth-first manner until no P can be expanded.</p>
      <p>Our target concepts must have the top-N largest evaluation values. Let us
assume the objects xi in Oq are sorted in ascending order of distL(xi, q). Based
on the ordering, along our depth-first expansion process of concepts, the
evaluation values of obtained concepts increase monotonically. It should be noted
here that for a set of objects P ⊆ Oq, the extent E of each concept obtained by
expanding P is a subset of P 00 ∪ cand(P ). Due to the monotonicity of evaluation
values, therefore, eval(P 00 ∪ cand(P )) gives an upper bound we can observe in
our expansion process from P . This means that if we find eval(P 00 ∪ cand(P ))
is less than the tentative N -th largest evaluation value of concepts found so far,
there is no need to expand P because we never meet any target concept by the
expansion. Thus, a branch-and-bound pruning is available in our search process.</p>
      <p>A pseudo-code of the algorithm is presented in Figure 1. The head element of
a set S is referred to as head(S). Although we skip details due to space limitation,
the code incorporates a mechanism for avoiding duplicate generations of the same
concept, as if statement at the beginning of procedure FCFind.
4</p>
      <sec id="sec-3-1">
        <title>Experimental Result</title>
        <p>In this section, we present our experimental result with our system.</p>
        <p>
          We have prepared a movie dataset consisting of 17, 000 movies with their
plots and genres. Our dataset has been created from CMU Movie Summary
Corpus [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] 1. After preprocessing, the plot of each movie is represented as
a boolean vector of 6, 251 feature terms with medium frequency. That is, our
movie-plot matrix XP has the dimension of 17, 000 × 6, 251. Moreover, each
movie is assigned some of 364 genre-labels as its multi-label. Then, our
movielabel matrix XL is given as a boolean matrix with the dimension of 17, 000×364.
Applying Nonnegative Shared Subspace Method, we have compressed XP into
a 17, 000 × 500 matrix (F |L), where dimensions of F and L are 17, 000 × 450
and 17, 000 × 50, respectively, and L is also a compressed matrix of XL.
        </p>
        <p>In addition, as candidates of our queries, we have also prepared 783 movies
with only their plots, hiding their genres. Thus, our system is given a 6,
251dimensional boolean vector as a query obtained from those candidates.
Example of Extracted Formal Concept for “Shark Bait (2006)”
For a query vector obtained from the plot of a candidate movie “Shark Bait”,
we present here a formal concept actually detected by our system.</p>
        <p>“Shark Bait” is a computer animated family movie released in 2006. The
story is about Pi, the main fish character, his relatives and friends while fighting
against a mean tiger shark terrorizing Pi’s reef community.</p>
        <p>For the query vector (plot), an example of formal concept our system detected
is shown in Figure 2.</p>
        <p>Similarity of Movie Plots : The extent of the concept consists of 5 movies
all of which are concerned with marine animals. For example, “Jaws” is a very
famous movie about a great white shark wrecking havoc in a beach resort. “Open
Water” is a suspense movie based on a real story about a couple of scuba divers
left behind in the shark-infested sea due to an accident. Moreover, “Free Willy”,
a family-oriented adventure movie, is about the friendship between a street child
and a killer whale in a local amusement park. As the intent shows, all of them
are commonly associated with 5 features compressed with N M F each of which
is related to several relevant feature terms used in the original plots. Actually,
we can observe a certain degree of similarity among the plots of the movies in
the extent. It is, furthermore, easy to see that they are also similar to the query
plot.
1 http://www.cs.cmu.edu/˜ark/personas/
Farness of Multi-Labels : As is shown in the figure, the movies in the extent
have various multi-labels. More precisely speaking, the movies are listed in
descending order of distance between the (implicitly) predicted multi-label of the
query and that of each movie. That is, upper movies are expected to have
multilabels further from that of the query as unexpected ones. On the other hand,
multi-labels of lower movies would be similar to that of the query. According
to our problem setting, although the correct multi-label of the query was
intensionally hidden, its actual genre labels are “Family Film” and “Animation”.
It is easy to see that the lowest movie, “Help! I’m a Fish”, is categorized into
genres very similar to the actual genres of the query. In other words, the
multilabel of the query can reasonably be predicted with the help of Nonnegative
Shared Subspace Method. As the result, based on our evaluation function for
formal concepts, we can find some movies in the extent with unexpected
(further) genre labels, like “Jaws” and “Open Water”. Inspired by such a concept,
users could newly try and enjoy some movies with those unexpected genres.
Thus, our method has an ability to stimulate us to try unexperienced movies
based on unexpectedness of multi-labels.</p>
        <p>Quality of Multi-Label Prediction : Quality of multi-label prediction would
be important in our method because our unexpectedness is based on the
prediction. We here observe quality of prediction by comparing two object rankings,
one is based on predicted multi-labels of the query and the other based on its
correct multi-label (hidden in our retrieval process). More concretely speaking,
for the former, each movie in the dataset is ranked in ascending order of
similarity between its multi-label and the predicted label of the query. The obtained
ranking is referred to as Rpred. Similarly, we also rank each movie in ascending
order of similarity between its multi-label and the correct label of the query.
We refer to the ranking as Rcorrect. Then we can compute the Spearman’s rank
correlation coefficient between Rpred and Rcorrect. If we can observe a certain
degree of correlation, we can expect that our prediction would be reasonable.</p>
        <p>
          For the above example, we actually have the coefficient of 0.35 showing a weak
positive correlation. As has been mentioned previously, a prediction method has
been discussed in [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], where prediction is performed based on a subspace defined
as a real line. For comparison, the correlation coefficient between the ranking
according to the previous method and Rcorrect is 0.03 showing little correlation.
Although the value of 0.35 seems to be a little bit small, it can be increased
to 0.84 when we focus on Top-10% ranked (1, 700) objects in the ranking. It is
noted here that our main purpose of prediction is just to identify objects whose
multi-labels are far from that of the query. In this sense, precise lower ranks in
Rpred are not matter. Therefore, we consider that the prediction of multi-labels
in our method can work well for our purpose.
5
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Concluding Remark</title>
        <p>We discussed our method of finding interesting formal concepts with unexpected
objects for a given query with no label. We defined our unexpected object for
the query as object which is similar to the query in the feature space, but is
dissimilar in the label space. In order to predict a multi-label of the query, the
original object-feature and object-label matrices are simultaneously factorized by
means of Nonnegative Shared Subspace Method, where the obtained subspace
associates the label-information with the feature-information of the original
matrices. Our retrieval task was formalized as a problem of enumerating every
formal concept with Top-N evaluation values w.r.t. unexpectedness.</p>
        <p>
          At the moment, we still leave quantitative evaluation of our method. As
the unexpectedness in [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] has been quantitatively evaluated for another movie
dataset according to [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], we can attempt a similar comparison. In addition, it
would be worth verifying actual usefulness of our system through user trials.
        </p>
        <p>
          We also need to investigate a prediction method of multi-labels. In our current
framework, although the basis of label space is assumed to be a part of the basis
of feature space. In order to further improve quality of multi-label prediction, it
would be required to carefully distinguish two classes of labels, ones which can
be well explained in terms of features and the others. By a correlation analysis
for labels and features, we can define an adequate regularization term in the
objective function for our matrix factorization. This kind of analysis is also very
important in the framework of feature association which identifies reasonable
similarities among features in different data matrices [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>
          Moreover, a recommendation system based on query-specific FCA-based
biclustering has been proposed in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. We need to clarify relationship between the
method and ours.
        </p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Gupta</surname>
            ,
            <given-names>S.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Phung</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adams</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tran</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Venkatesh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Nonnegative Shared Subspace Learning and Its Application to Social Media Retrieval</article-title>
          ,
          <source>Proc. of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD'10</source>
          , pp.
          <fpage>1169</fpage>
          -
          <lpage>1178</lpage>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>D. D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Seung</surname>
            ,
            <given-names>H. S.</given-names>
          </string-name>
          :
          <article-title>Algorithms for Non-negative Matrix Factorization</article-title>
          ,
          <source>Proc. of NIPS</source>
          <year>2000</year>
          , pp.
          <fpage>556</fpage>
          -
          <lpage>562</lpage>
          (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>C. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dias</surname>
            ,
            <given-names>S. M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Vieira</surname>
            ,
            <given-names>N. J.</given-names>
          </string-name>
          :
          <article-title>Knowledge Reduction in Formal Contexts Using Non-negative Matrix Factorization</article-title>
          , Mathematics and Computers in Simulation, 109(C), pp.
          <fpage>46</fpage>
          -
          <lpage>63</lpage>
          ,
          <string-name>
            <surname>Elsevier</surname>
          </string-name>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S. O.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Makhalova</surname>
          </string-name>
          ,
          <source>T: On Interestingness Measures of Formal Concepts</source>
          ,
          <source>Information Sciences, 442(C)</source>
          , pp.
          <fpage>202</fpage>
          -
          <lpage>219</lpage>
          ,
          <string-name>
            <surname>Elsevier</surname>
          </string-name>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Akhmatnurov</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ignatov</surname>
            ,
            <given-names>D. I.</given-names>
          </string-name>
          :
          <article-title>Context-Aware Recommender System Based on Boolean Matrix Fatorisation</article-title>
          ,
          <source>Proc. of the 12th International Conference on Concept Lattices and Their Applications - CLA'15</source>
          , pp.
          <fpage>99</fpage>
          -
          <lpage>110</lpage>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Stumme</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taouil</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bastide</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pasquier</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lakhal</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Computing Iceberg Concept Lattices with TITANIC</article-title>
          ,
          <source>Data &amp; Knowledge Engineering</source>
          ,
          <volume>42</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>189</fpage>
          -
          <lpage>222</lpage>
          ,
          <string-name>
            <surname>Elsevier</surname>
          </string-name>
          (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Alqadah</surname>
            ,
            <given-names>F</given-names>
          </string-name>
          , Reddy,
          <string-name>
            <given-names>C. K.</given-names>
            ,
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            and
            <surname>Alqadah</surname>
          </string-name>
          ,
          <string-name>
            <surname>H. F.</surname>
          </string-name>
          :
          <article-title>Biclustering NeighborhoodBased Collaborative Filtering Method for Top-N Recommender Systems</article-title>
          ,
          <source>Knowledge Information Systems</source>
          ,
          <volume>44</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>475</fpage>
          -
          <lpage>491</lpage>
          , Springer (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ji</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ye</surname>
          </string-name>
          , J.:
          <article-title>Hypergraph Spectral Learning for Multi-Label Classification</article-title>
          ,
          <source>Proc. of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD'08</source>
          , pp.
          <fpage>668</fpage>
          -
          <lpage>676</lpage>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>A.</given-names>
            <surname>Takano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Niwa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nishioka</surname>
          </string-name>
          ,
          <string-name>
            <surname>M</surname>
          </string-name>
          , Iwayama,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hisamitsu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Imaichi</surname>
          </string-name>
          , and
          <string-name>
            <surname>H</surname>
          </string-name>
          ,
          <source>Sakurai: Information Access Based on Associative Calculation, Proc. of SOFSEM 2000: Theory and Practice of Informatics, LNCS</source>
          <year>1963</year>
          , pp.
          <fpage>187</fpage>
          -
          <lpage>201</lpage>
          (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. H.
          <string-name>
            <surname>Zhai</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Haraguchi</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Okubo</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Hashimoto</surname>
            and
            <given-names>S.</given-names>
          </string-name>
          <article-title>Hirokawa: Shifting Concepts to Their Associative Concepts via Bridges</article-title>
          ,
          <source>Proc. of the 9th International Conference on Machine Learning and Data Mining in Pattern Recognition, LNAI</source>
          <volume>7988</volume>
          ,
          <fpage>586</fpage>
          -
          <lpage>600</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhai</surname>
          </string-name>
          and
          <string-name>
            <surname>M.</surname>
          </string-name>
          <article-title>Haraguchi: A Linear Algebraic Inference for Feature Association</article-title>
          ,
          <source>Proc. of the 12th Int'l Conf. on Knowledge, Information and Creativity Support Systems, IEEE Conference Publishing Services</source>
          , pp.
          <fpage>102</fpage>
          -
          <lpage>107</lpage>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Wille</surname>
          </string-name>
          , R.:
          <source>Formal Concept Analysis - Mathematical Foundations</source>
          ,
          <volume>284</volume>
          pages, Springer (
          <year>1999</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Bamman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>O</given-names>
            <surname>'Connor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            and
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <surname>N. A.</surname>
          </string-name>
          :
          <article-title>Learning Latent Personas of Film Characters</article-title>
          ,
          <source>Proc. of the 51st Annual Meeting of the Association for Computational Linguistics</source>
          , pp.
          <fpage>352</fpage>
          -
          <lpage>361</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Adamopoulos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Tuzhilin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>On Unexpectedness in Recommender Systems: Or How to Better Expect the Unexpected</article-title>
          ,
          <source>ACM Transactions on Intelligent Systems and Technology</source>
          ,
          <volume>5</volume>
          (
          <issue>4</issue>
          ), Article No.
          <volume>54</volume>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Murakami</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mori</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Orihara</surname>
            ,
            <given-names>R</given-names>
          </string-name>
          :
          <article-title>Metrics for Evaluating the Serendipity of Recommendation Lists, New Frontiers in Artificial Intelligence</article-title>
          ,
          <source>JSAI 2007 Conference and Workshops, Revised Selected Papers, LNAI-4914</source>
          , pp.
          <fpage>40</fpage>
          -
          <lpage>46</lpage>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Herlocker</surname>
            ,
            <given-names>J. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Konstan</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Terveen</surname>
            ,
            <given-names>L. G.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Riedl</surname>
          </string-name>
          , J. T.:
          <article-title>Evaluating Collaborative Filtering Recommender Systems</article-title>
          ,
          <source>ACM Transactions on Information Systems</source>
          ,
          <volume>22</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>5</fpage>
          -
          <lpage>53</lpage>
          (
          <year>2004</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Okubo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haraguchi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Liu</surname>
          </string-name>
          , H.:
          <article-title>Finding Interesting Formal Concepts with Unexpected Objects with respect to Multi-Labels</article-title>
          ,
          <source>Proc. of the 6th Asian Conference on Information Systems - ACIS</source>
          <year>2017</year>
          , pp.
          <fpage>98</fpage>
          -
          <lpage>105</lpage>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>