<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>E. Freuder. In pursuit of the holy grail. Con-
straints</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Recommender Systems for Configuration Knowledge Engineering</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>A. Felfernig</string-name>
          <email>felfernig@ist.tugraz.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S. Reiterer</string-name>
          <email>reiterer@ist.tugraz.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>M. Stettinger</string-name>
          <email>stettinger@ist.tugraz.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>F. Reinfrank</string-name>
          <email>reinfrank@ist.tugraz.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>M. Jeran</string-name>
          <email>jeran@ist.tugraz.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>G. Ninaus</string-name>
          <email>ninaus@ist.tugraz.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Graz University of Technology Inffeldgasse 16b</institution>
          ,
          <addr-line>A-8010 Graz</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2013</year>
      </pub-date>
      <volume>2</volume>
      <issue>1</issue>
      <fpage>29</fpage>
      <lpage>30</lpage>
      <abstract>
        <p>The knowledge engineering bottleneck is still a major challenge in configurator projects. In this paper we show how recommender systems can support knowledge base development and maintenance processes. We discuss a couple of scenarios for the application of recommender systems in knowledge engineering and report the results of empirical studies which show the importance of user-centered configuration knowledge organization.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        ∗
Product knowledge changes frequently [Soloway, 1987].
Therefore, it must be possible to conduct knowledge base
development and maintenance operations efficiently. Since
the early developments of configurator applications in the
late 1970’s and early 1980’s [McDermott, 1982], knowledge
representations have been improved in terms of (1)
modelbased approaches which allow a clear separation of
domain knowledge and problem solving algorithms, (2)
higherlevel knowledge representations which allow a
componentoriented representation of configuration knowledge (see, e.g.,
[Stumptner et al., 1998]), and (3) graphical knowledge
representations
        <xref ref-type="bibr" rid="ref3">(e.g., [Felfernig et al., 2000; 2001])</xref>
        which
allow a compact representation. In addition to new
knowledge representations, intelligent diagnosis approaches have
been developed which help a knowledge engineer to identify
and repair erroneous configuration knowledge [Junker, 2004;
Felfernig et al., 2004; 2009; 2013].
      </p>
      <p>Due to diversification strategies of companies, product
and service assortments are becoming increasingly large and
complex [Huffman and Kahn, 1998]. The complexity of
the underlying knowledge bases increases to the same extent
which requires additional concepts that help a knowledge
engineer to conduct knowledge base development and
maintenance operations in an efficient fashion. Furthermore,
knowledge bases are often developed by a group of persons with
different knowledge, goals, and focuses with regard to
development and maintenance operations. This situation requires
adaptive user interfaces to be integrated into configuration
knowledge engineering environments. Adaptive user
interfaces for knowledge engineering have the potential to
effectively support engineers and domain experts in activities such
as learning (knowledge base understanding), finding (the
relevant items in the knowledge base), and testing &amp; debugging
(removing the source of faulty behavior).</p>
      <p>In order to offer more adaptivity in configurator
development environments, we propose the application of
different types of recommendation technologies [Jannach et al.,
2010] which proactively support domain experts and
engineers when creating and adapting configuration knowledge.
Such technologies should dispose of a basic understanding of
cognitive processes when persons develop and maintain
configuration knowledge bases. They should support
functionalities such as recommending relevant items (variables,
component types, constraints, diagnoses, etc.) and simultaneously
omitting specific items that are not relevant. Recommender
systems have the potential to provide such a support (see, e.g.,
[Robillard et al., 2010]).</p>
      <p>There are three basic recommendation approaches. First,
collaborative filtering [Konstan et al., 1997] determines
recommendations based on the preferences of nearest
neighbors (users with similar preferences compared to the current
user). In this context, items are recommended to the
current user which have received a positive rating by the
nearest neighbors but are not known to the current user. Second,
content-based filtering [Pazzani and Billsus, 1997]
recommends items that are not known to the current user and are
similar to items that have already been purchased by her/him.
Similarity between items can be determined, for example,
on the basis of the similarity of keywords used to describe
the item. Third, knowledge-based recommenders recommend
items by using constraints or similarity metrics [Burke, 2000;
Felfernig and Burke, 2008].</p>
      <p>This paper is organized as follows. In Section 2 we
introduce example scenarios for the application of recommender
technologies in knowledge engineering. Thereafter, we report
results of related empirical studies (see Section 3). In Section
4 we provide a discussion of related work. Conclusions and a
discussion of future research issues are given in Section 5.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Recommenders for Knowledge Engineering</title>
      <sec id="sec-2-1">
        <title>Collaborative Recommendation of Constraints. Collabo</title>
        <p>rative filtering (CF) recommender systems have shown to be
one of the best choices to achieve serendipity effects, i.e., to
be surprised (in a positive sense) by item recommendations
one did not expect when starting the recommendation
process. In situations were knowledge engineers do not know the
configuration knowledge base very well, collaborative
recommendations can be exploited to support a more focused
analysis of the knowledge base. The availability of navigation
data from other knowledge engineers is the major
precondition for determining recommendations with collaborative
filtering. Table 1 shows an example of navigation data that
describes in which order knowledge engineers (users) accessed
the constraints of a knowledge base. For simplicity we
assume that each of the users accessed each constraint (but in
different order). Similar applications of collaborative
filtering can be imagined for the recommendation of variables (or
component types) and instances of a component catalog.</p>
        <p>Table 1 stores the information in which order the
constraints have been visited by knowledge engineers (users),
for example, user 1 analyzed the constraints in the order
[c5, c2, c3, c1, c4, c6]. Let us assume that the current user has
already visited the constraints c5 and (then) c2. The nearest
neighbors of the current user (users with a similar navigation
behavior) are the users 1, 2, and 4. The majority of these
users analyzed constraint c1 in the third step – this one will
be recommended to the current user. Note that this
recommendation approach is currently under evaluation, therefore
no related empirical results will be reported in Section 3.
user
1
2
3
4
current</p>
        <p>
          Content-based Clustering of Constraints. Another
possibility to support knowledge engineers is to cluster
constraints with the goal to improve the overall clarity of the
knowledge base. We will exemplify this on the basis of
kmeans clustering [Witten and Frank, 2005]. Following this
approach, we have to generate k initial centroids which act
as (first) representatives of future clusters. In the following,
each object (in our case: constraint) is assigned to the group
(cluster) with the closest (most similar) centroid. Thereafter,
centroids are recalculated. In our case, a centroid is defined as
the object with the highest overall similarity to the other
objects in the cluster. The algorithm terminates if the centroids
are stable (do not change). k-means clustering is guaranteed
to terminate but is not necessarily optimal since the outcome
depends on the initial centroids
          <xref ref-type="bibr" rid="ref6">([Witten and Frank, 2005])</xref>
          .
        </p>
        <p>For demonstration purposes we introduce the following
simple configuration problem which is represented as a
basic constraint satisfaction problem (CSP = (V, D, C)) where
V represents a set of variables {v1, v2, ..., v5}, D represents
the set of corresponding domains (dom(vi) = {1..5}), and C
represents the following set of constraints.</p>
        <p>{c1 : v1 = 3 → v2 &gt; 1, c2 : v1 = 3 ∧ v3 = 1, c3 : v2 =
2 → v3 = 1, c4 : v3 = 1 → v1 6= 1, c5 : v3 = 1 → (v4 =
2 ∧ v1 &gt; v5), c6 : v4 ≥ 1 → v5 ≤ 4, c7 : v5 = 1 → v3 =
2 ∨ v3 = 3}.</p>
        <p>On the basis of this simple knowledge base, we can
calculate the similarities between the individual constraints
(ca, cb) by using Formula 1. In this formula, V =
variables(ca) ∪ variables(cb), co−occurrence(v, ca, cb)
= 1 if v is contained in both constraints on the same
position, co−occurrence(v, ca, cb) = 0.5 if v is
contained in both constraints but on a different position, and
co−occurrence(v, ca, cb) = 0 of no co-occurrence exists.
Note that this is one possible approach to similarity
determination. We also compared this approach with operator-based
similarity and a random assignment of constraints to clusters.
sim(ca, cb) =</p>
        <p>Pv∈V co–occurrence (v, ca, cb)
|V |
(1)</p>
        <p>The similarities between the pairs of individual constraints
are depicted in Table 2.</p>
        <p>ci ∈ C
c1
c2
c3
c4
c5
c6
c7
c1
1.0
0.33
0.16
0.16
0.1
0.0
0.0</p>
        <p>On the basis of these individual similarities we are able to
determine a set of corresponding clusters (k = 2). The
determination of such clusters is exemplified in Table 3. First,
we (randomly) select two constraints as initial cluster
centers (centroids): c1 and c5 (denoted by cs). In iteration 2 the
center of cluster 1 changes to c2 and we have to re-calculate
the cluster assignment. After this iteration, the assignment is
stable, i.e., the cluster centers (c2 and c5) remain the same.
iteration
1
2</p>
        <p>c1
1(cs)
1
c2
1
1(cs)</p>
        <p>For the visualization of the constraints {c1, c2, ..., c7} this
means that the knowledge base would be presented in terms
of two constraint groups: {c1, c2, c3, c4, c7} and {c5, c6}.</p>
        <p>Knowledge-based Refactoring Recommendations. The
way in which semantics is expressed has an impact on the
understandability of the knowledge base. For example, users
need less time to understand the semantics of a knowledge
base if implications are expressed in terms of A → B
compared to the alternative representation of ¬A ∨ B. Explicit
knowledge about the cognitive complexity of constraint
representations can be exploited to recommend structural and
semantics-preserving adaptations of knowledge structures.
Such recommendations are knowledge-based, since they are
explicitly encoded in refactoring rules.
3</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Empirical Evaluation</title>
      <p>For the content-based clustering of constraints and
knowledge-based refactoring recommendations we now
present the results of two empirical studies. In the first study,
we compared the applicability of three different clustering
strategies with regard to knowledge engineering tasks (find a
solution, find a minimal conflict ) (see, e.g., [Junker, 2004]).</p>
      <p>Study A: Clustering of Constraints. For two different
configuration knowledge bases ( kba1, kba2) we conducted a
study based on an within-subjects design (N=40). Each study
participant (students of computer science who visited a
related course on knowledge engineering) had the task of (1)
finding a solution (in kba1) and (2) finding a minimal conflict
(in kba2).1 There were no time limits regarding task
completion. Each student was assigned to one type of
clustering (one out of variable-based similarity, operator-based
similarity, and random clustering), i.e., we did not vary the type
of clustering per student. The knowledge bases (kba1, kba2)
were defined as CSPs in a domain-independent fashion in
order to avoid an additional cognitive complexity related to the
understanding of a product domain. The basic properties of
the used knowledge bases are summarized in Table 4.</p>
      <p>Knowledge base
kba1
kba2
#(vi ∈ V )
5
10
vi domain size
5
3
#(ci ∈ C)
15
10</p>
      <p>The outcome of this experiment is shown in Table 5.</p>
      <p>From the three compared approaches to the clustering of
constraints in a configuration knowledge base, variable
similarity based clustering clearly outperforms operator-based
clustering and random clustering of constraints.</p>
      <sec id="sec-3-1">
        <title>Study B: Cognitive Complexities. There are different</title>
        <p>possibilities to represent equivalent semantics on the basis of
a constraint, for example, the requires relationship X → Y
can be represented in terms of ¬X ∨ Y . The
incompatibility relationship ¬(X ∧ Y ) can be represented as X → ¬Y .
Table 6 depicts vfie different possibilities to express requires
and incompatibility relationships.</p>
        <p>1We used these tasks to measure knowledge understanding.
Further more differentiated tasks are within the scope of future work.</p>
        <sec id="sec-3-1-1">
          <title>Requires</title>
          <p>X → Y
¬X ∨ Y
¬Y → ¬X
¬(X ∧ ¬Y )
Y ← X</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>Incompatibility</title>
          <p>X → ¬Y
¬X ∨ ¬Y
Y → ¬X
¬(X ∧ Y )
¬Y ← X</p>
          <p>Study B is based on an within-subjects design (N=66) with
two configuration knowledge bases. Knowledge base kbb1
consisted of a set of requires constraints and kbb2 consisted
of a set of incompatibility constraints. Each study
participant (again, computer science students who visited a related
knowledge engineering course) had the task of finding a
solution for the given CSP. Each participant was confronted with
one version of kbb1 and one version of kbb2 conform the
schema depicted in Table 6. For example, if a student
received the X → Y version of kbb1 then she/he also received
the X → ¬Y version of kbb2. The knowledge bases kbb1
and kbb2 were (again) defined in a domain-independent
fashion (see Study A). The basic properties of the used knowledge
bases are summarized in Table 7.</p>
          <p>Knowledge base
kbb1
kbb2
#(vi ∈ V )
5
3
vi domain size
5
3
#(ci ∈ C)
7
5</p>
          <p>The outcome of this experiment is shown in Table 8.
kbb1: SOL</p>
          <p>X → Y
¬X ∨ Y
¬Y → ¬X
¬(X ∧ ¬Y )</p>
          <p>Y ← X
errors
21.43%
50.0%
96.43%
73.08%
25.0%
kbb2: SOL
X → ¬Y
¬X ∨ ¬Y
Y → ¬X
¬(X ∧ Y )
¬Y ← X
errors
14.29%
34.62%
50.0%
42.31%
16.67%</p>
          <p>A result of the study is that basic implications (→) should
be preferred to other representations in order to maximize
understandability. The only type of knowledge representation
with a similar performance is the reverse implication,
however, when comparing both alternatives, the standard
implication seems to be the better choice.
4</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Related Work</title>
      <p>There is a long history of research on the improvement of
knowledge engineering processes. Early research focused on
model-based knowledge representations that allowed a
separation of domain and problem solving knowledge. An
example of such a representation are constraint technologies which
became extremely popular as a technological basis for
industrial applications [Freuder, 1997]. In a next step, graphical
knowledge representations [Felfernig et al., 2000] and
intelligent techniques for knowledge base testing and debugging
have been developed [Felfernig et al., 2004]. The need of an
intuitive access to a corpus of software artifacts is also one of
the major requirements for software comprehension [Storey,
2006]. In this context, recommender systems [Jannach et
al., 2010] have already been identified as a valuable means
to provide intelligent support for the navigation in large and
complex software spaces (see, e.g., [Robillard et al., 2010]).
The application of recommendation technologies for
supporting knowledge engineering processes is a new research area.
Research contributions in this field have the potential to
significantly improve the overall quality of knowledge
engineering processes. In [Felfernig et al., 2010] basic knowledge
representations are compared, for example, the use of → to
represent an implication vs. the use of ¬ and ∨. This work
is an important step towards a discipline of empirical
knowledge engineering with a clear focus on usability aspects and
cognitive efforts needed to complete knowledge engineering
tasks. The work presented in this paper is a continuation of
the work of [Felfernig et al., 2010]. It takes a more detailed
look at different alternative representations of requires and
incompatibility relationships and introduces a new concepts
related to the content-based clustering of constraints.
5</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>In this paper we showed how recommenders can be exploited
to support knowledge engineering tasks. Examples are
collaborative filtering of constraint sets, clustering of constraints,
and knowledge-based recommendation of refactoring
operations. Future work will include the development of
further recommendation algorithms, for example, the inclusion
of content-based filtering and further clustering algorithms
as well as further empirical studies with more differentiated
maintenance tasks. Finally, we will focus on an in-depth
analysis of existing research in the area of cognition psychology
which can further advance the state of the art in
(configuration) knowledge engineering.
[Storey, 2006] M. Storey. Theories, tools and research
methods in program comprehension: past, present and future.</p>
      <p>Software Quality Journal, 14:187–208, 2006.
[Stumptner et al., 1998] M. Stumptner, G. Friedrich, and
A. Haselbo¨ck. Generative Constraint-based Configuration
of Large Technical Systems. AI EDAM, 12(4):307–320,
1998.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <source>[Burke</source>
          ,
          <year>2000</year>
          ]
          <string-name>
            <given-names>R.</given-names>
            <surname>Burke</surname>
          </string-name>
          .
          <article-title>Knowledge-based recommender systems</article-title>
          .
          <source>Library and Inf. Systems</source>
          ,
          <volume>69</volume>
          (
          <issue>32</issue>
          ):
          <fpage>180</fpage>
          -
          <lpage>200</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <source>[Felfernig and Burke</source>
          , 2008]
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Burke</surname>
          </string-name>
          .
          <article-title>Constraint-based recommender systems: Technologies and research issues</article-title>
          .
          <source>In ACM International Conference on Electronic Commerce (ICEC08)</source>
          , pages
          <fpage>17</fpage>
          -
          <lpage>26</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Felfernig et al.,
          <year>2000</year>
          ]
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. E.</given-names>
            <surname>Friedrich</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          .
          <article-title>UML as Domain Specific Language for the Construction of Knowledge-based Configuration Systems</article-title>
          . IJSEKE,
          <volume>10</volume>
          (
          <issue>4</issue>
          ):
          <fpage>449</fpage>
          -
          <lpage>469</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Felfernig et al.,
          <year>2001</year>
          ]
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          , G. Friedrich, and
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          .
          <article-title>Conceptual modeling for configuration of mass-customizable products</article-title>
          .
          <source>Artificial Intelligence in Engineering</source>
          ,
          <volume>15</volume>
          (
          <issue>2</issue>
          ):
          <fpage>165</fpage>
          -
          <lpage>176</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [Felfernig et al.,
          <year>2004</year>
          ]
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Friedrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Stumptner</surname>
          </string-name>
          .
          <article-title>Consistency-based diagnosis of configuration knowledge bases</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <source>[Witten and Frank</source>
          , 2005]
          <string-name>
            <given-names>I.</given-names>
            <surname>Witten</surname>
          </string-name>
          and
          <string-name>
            <given-names>E. Frank. Data</given-names>
            <surname>Mining</surname>
          </string-name>
          . Morgan Kaufman,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>