On Developing a Distributed CBR Framework
         through Semantic Web Services *

                 Belén Díaz-Agudo, Pedro A. González-Calero,
                Pedro P. Gómez-Martín, Marco A. Gómez-Martín

                     Dep. Sistemas Informáticos y Programación
                     Universidad Complutense de Madrid, Spain
                  email: {belend, pedro, pedrop, marcoa}@sip.ucm.es


        Abstract. jCOLIBRI is an object-oriented framework in Java that pro-
        motes software reuse for building CBR systems, integrating the appli-
        cation of well proven Software Engineering techniques with a knowledge
        level description that separates the problem solving methods, which
        define the reasoning process, from the domain model. In this paper we
        envision the evolution of this framework into an open distributed frame-
        work where contributions to the framework are published, searched and
        integrated through Semantic Web Services.


1     Introduction

Case-Based Reasoning (CBR) is one of the most successful applied AI technologies
of recent years. Commercial and industrial applications can be developed rapidly
and existing corporate databases can be used as knowledge sources. CBR is based
on the intuition that new problems are often similar to previously encountered
problems, and therefore, that past solutions may be reused (directly or through
adaptation) in the current situation. CBR systems typically apply retrieval and
matching algorithms to a case base of past problem-solution pairs.
    Developing a CBR system is a complex task where many decisions have to be
taken. The system designer has to choose how the cases will be represented, the
case organization structure, which methods will solve the CBR tasks and which
knowledge, besides cases, will be used by these methods. This process would
greatly benefit from the reuse of previously developed CBR systems.
    Software reuse is a goal that the Software Engineering community has pursued
from its very beginning. A number of technologies have appeared that
directly or indirectly promote software reuse. Unfortunately, AI systems have
remained for too long in the prototype arena and, in general, AI researchers do
not worry too much about software engineering concerns. The most significant
and long term effort within the AI community to attain effective software reuse is
the KADS methodology [12] and its descendants. The KADS approach for building
knowledge based systems proposes the reuse of abstract models consisting
of reusable components, containing artificial Problem Solving Methods (PSMs),
and ontologies of domain models.

* Supported by the Spanish Committee of Science & Technology (TIC2002-01961).

    During the last few years we have developed jCOLIBRI¹, a framework for
developing CBR systems [4, 6–8]. jCOLIBRI promotes software reuse for building
CBR systems, and tries to integrate the best of both worlds: the application
of well proven Software Engineering techniques with the KADS key idea of sep-
arating the reasoning process (using PSMs) from the domain model.
    In this paper we envision the evolution of this framework into an open dis-
tributed framework where contributions to the framework are published, searched
and integrated through Semantic Web Services; using component technologies, the
PSMs that were conceived as internal methods of the framework become external
components. Section 2 describes the main ideas lying behind jCOLIBRI and
its current architecture pointing out some limitations we have encountered. We
propose a solution to these problems based on Semantic Web Services in Section
3. Finally, Section 4 concludes.


2     jCOLIBRI

At the knowledge level jCOLIBRI is built around a task/method ontology that
guides the framework design, determines possible extensions and supports the
framework instantiation process. Tasks and methods are described in terms of
domain-independent CBR terminology.
    Every CBR system makes use of CBR terminology, the type of entities that
the CBR processes manage. A CBR ontology elaborates and organizes the ter-
minology found in, ideally, any CBR system to provide a domain independent
basis for new CBR systems. In this way, CBROnto [8] elaborates an extensive
ontology over CBR terminology. The idea behind this ontology is to have a common
language to define the elements that compose a CBR system and to be able
to build generic CBR methods independent of the knowledge domain.
    Within a knowledge level description, PSMs capture and describe problem-
solving behavior in an implementation and domain-independent manner. PSMs
are used to accomplish tasks by applying domain knowledge. Although various
authors have applied knowledge level analysis to CBR systems, the most relevant
work is the CBR task structure developed in [2]. At the highest level of generality,
they describe the general CBR cycle in terms of four tasks (4 Rs): Retrieve the
most similar case/s, Reuse its/their knowledge to solve the problem, Revise the
proposed solution and Retain the experience. Each one of the four CBR tasks
involves a number of more specific sub-tasks. There are methods to solve tasks
either by decomposing a task into subtasks or by solving it directly.
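
    The four top-level tasks can be illustrated with a minimal sketch. It is not
jCOLIBRI code: the class FourRCycleSketch, its numeric case representation, the
squared Euclidean distance used for retrieval and the copy-only reuse step are
all assumptions chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal 4R cycle over numeric problem descriptions (illustrative only). */
final class FourRCycleSketch {

    /** A past problem-solution pair. */
    static final class Case {
        final double[] problem;
        final String solution;

        Case(double[] problem, String solution) {
            this.problem = problem;
            this.solution = solution;
        }
    }

    private final List<Case> caseBase = new ArrayList<>();

    void add(Case c) { caseBase.add(c); }

    int size() { return caseBase.size(); }

    /** Retrieve: nearest case by squared Euclidean distance (assumed similarity). */
    Case retrieve(double[] query) {
        Case best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Case c : caseBase) {
            double distance = 0.0;
            for (int i = 0; i < query.length; i++) {
                double diff = query[i] - c.problem[i];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = c;
            }
        }
        return best;
    }

    /** Reuse: copy the retrieved solution unchanged (no adaptation). */
    String reuse(Case retrieved) { return retrieved.solution; }

    /** Revise: keep the suggestion unless an external correction is supplied. */
    String revise(String suggested, String correction) {
        return correction != null ? correction : suggested;
    }

    /** Retain: store the confirmed problem-solution pair as a new case. */
    void retain(double[] query, String confirmed) {
        caseBase.add(new Case(query, confirmed));
    }

    /** Runs one full cycle and returns the confirmed solution. */
    String solve(double[] query, String correction) {
        if (caseBase.isEmpty()) {
            throw new IllegalStateException("empty case base");
        }
        Case retrieved = retrieve(query);
        String suggested = reuse(retrieved);
        String confirmed = revise(suggested, correction);
        retain(query, confirmed);
        return confirmed;
    }
}
```

Each method here plays the role of a resolution method for one task; a
decomposition method would instead delegate to methods for its subtasks.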
    Figure 1 depicts the task decomposition structure we use in our framework.
The task structure indexes a number of alternative methods for solving each
task, and each one of the methods sets up different subtasks in turn. This
kind of task-method-subtask analysis is carried out down to a level of detail where
the tasks are primitive with respect to the available knowledge (i.e. there are
resolution methods). Besides this task structure, jCOLIBRI includes a library
of PSMs to solve these tasks. It describes CBR PSMs by relating them to
CBROnto concepts representing the tasks and domain characteristics. PSMs in
our library are organized around the tasks they resolve. We also need to represent
the method knowledge requirements (preconditions), and the input and
output “types”. These characteristics are described by using vocabulary (i.e.
concepts) from the CBROnto ontology.

[Figure 1: the CBR cycle (Retrieve, Reuse, Revise, Retain) around the previous
cases and background knowledge, next to the CBR_TASK decomposition, with
subtasks including ObtainCases, AssessSim, Select, Copy_Solution,
Adapt_Solution, Evaluate, Repair, Retain_Case and Retain_Knowledge.]

           Fig. 1. CBR execution cycle [2] and CBROnto Task Structure

¹ jcolibri-cbr.sourceforge.net

2.1   Framework Architecture
The jCOLIBRI framework is organized around the following elements, which are
integrated through the architecture of Figure 2:
Tasks and Methods: XML files describe the tasks supported by the framework
   along with the methods for solving those tasks.
Problem solving methods: The actual code that supports the methods included
   in the framework.
Case Base: Different connectors (XML, JDBC, RACER, ...) are defined to
   support several types of case persistence, from the file system to a database
   [4].
Cases: The framework includes a number of interfaces and classes to provide an
   abstract representation of cases.
    Tasks are a key element of the system since they drive the CBR process
execution and represent the method goals. Tasks can be added to the framework
at any time, although including a new task is useless unless an associated method
exists.

[Figure 2: layered diagram. From top to bottom: Clients (lightweight Java
client, browser client, Web Service client); Interfaces (API, Web Service,
EJB); JCOLIBRI CORE, with CBROnto, General and Domain knowledge, the Case
Base, the Task Structure and the PSMs (PSM1 ... PSMn); Data / Knowledge
Sources (RACER, DB, File System, XML, JESS).]

                          Fig. 2. jCOLIBRI architecture


    Regarding methods, most approaches consider that a PSM consists of three
related parts. The competence is a declarative description of what can be achieved.
The operational specification describes the reasoning process. The requirements
describe the knowledge needed by the PSM to achieve its competence [9].
    Some approaches like CommonKADS [10] specify much of how the PSM
achieves its goals, i.e. the reasoning steps, the data flows between them, and the
control that guides their execution. As we focus on PSM applicability assessment
we consider what the method does, i.e. the task it solves, and its knowledge re-
quirements, and leave control-flow issues to informal documentation and method
implementation code. This allows us to use a black box type of method reuse.
    Our approach to the specification of PSM competence and requirements
makes use of ontologies and provides two main advantages. First, it allows formal
specifications that add a precise meaning and enable reasoning support. Second,
it provides us with important benefits regarding reuse because task and method
ontologies can be shared by different systems.
    Method descriptions follow an XML schema. This elaborate description has
a counterpart description in a Description Logic (DL) syntax (we use OWL with
RACER as the inference engine). It includes the following elements:

Name: The fully qualified name of the class that implements the method. This
   class must implement the CBRMethod interface.
Description: A textual description of the method.
ContextInputPrecondition: A formal description of the applicability requirements
   for the method, including input requirements.
Type: jCOLIBRI manages two types of methods: execution (or resolution) and
   decomposition. Execution methods solve the task to which they have been
   assigned, while decomposition ones divide the task into other tasks.
Parameters: Method configuration parameters (inputs and outputs). These
   parameters are the variable hooks of the method implementation. For example,
   a retrieval method may be parameterized with the similarity function to apply.
   They are described by concepts that belong to the CBROnto ontology.
Competencies: The list of tasks this method is able to solve.
Subtasks: In decomposition methods this element provides the list of tasks that
   result from dividing the original task.
ContextOutputPostcondition: Output data information obtained from the
   execution of this method. This information is used to check which methods can
   take as input the output of this one.
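
   The elements above can be mirrored in memory as a simple value class. The
sketch below is only an illustration under our own assumptions: the class
MethodDescriptionSketch, the consistency rule in isConsistent, the example
class name and the plain-string conditions are hypothetical and do not
reproduce the actual XML schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** In-memory mirror of the method description elements (illustrative only). */
final class MethodDescriptionSketch {

    enum Type { EXECUTION, DECOMPOSITION }

    final String name;                 // fully qualified implementing class
    final String description;
    final Type type;
    final List<String> parameters;     // CBROnto concept names
    final List<String> competencies;   // tasks the method is able to solve
    final List<String> subtasks;       // used by decomposition methods only
    final String contextInputPrecondition;
    final String contextOutputPostcondition;

    MethodDescriptionSketch(String name, String description, Type type,
                            List<String> parameters, List<String> competencies,
                            List<String> subtasks,
                            String contextInputPrecondition,
                            String contextOutputPostcondition) {
        this.name = name;
        this.description = description;
        this.type = type;
        this.parameters = Collections.unmodifiableList(new ArrayList<>(parameters));
        this.competencies = Collections.unmodifiableList(new ArrayList<>(competencies));
        this.subtasks = Collections.unmodifiableList(new ArrayList<>(subtasks));
        this.contextInputPrecondition = contextInputPrecondition;
        this.contextOutputPostcondition = contextOutputPostcondition;
    }

    /** Assumed rule: subtasks are listed if and only if the method decomposes. */
    boolean isConsistent() {
        return !competencies.isEmpty()
                && (type == Type.DECOMPOSITION) == !subtasks.isEmpty();
    }

    /** Hypothetical example: a decomposition method for the Retrieve task. */
    static MethodDescriptionSketch exampleRetrieveDecomposition() {
        return new MethodDescriptionSketch(
                "example.methods.RetrieveDecomposition",
                "Splits retrieval into obtaining, assessing and selecting cases.",
                Type.DECOMPOSITION,
                Collections.<String>emptyList(),
                Arrays.asList("Retrieve"),
                Arrays.asList("ObtainCases", "AssessSim", "Select"),
                "CaseBase",
                "RetrievedCases");
    }
}
```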
   Building a CBR system consists in instantiating the jCOLIBRI framework.
It is a configuration process where the system developer selects the tasks
the system must fulfill and, for every task, assigns the method that will do the
job. The execution of the resulting CBR system can be seen as a sequence of
method applications where a method takes as input the output of the previous
one. jCOLIBRI provides a user interface that allows the developer to choose
the methods to be applied to perform every task.
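
   This view of configuration and execution can be sketched as follows. The
class ConfiguredSystemSketch and the use of a key-value map as the data passed
between methods are assumptions made for illustration; they are not the
framework's API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** A configured system: one method per task, applied in assignment order. */
final class ConfiguredSystemSketch {

    // LinkedHashMap keeps insertion order, which is used as execution order.
    private final Map<String, UnaryOperator<Map<String, Object>>> methodForTask =
            new LinkedHashMap<>();

    /** Configuration step: assign the method that will solve the given task. */
    ConfiguredSystemSketch assign(String task,
                                  UnaryOperator<Map<String, Object>> method) {
        methodForTask.put(task, method);
        return this;
    }

    /** Execution: each method takes as input the output of the previous one. */
    Map<String, Object> run(Map<String, Object> query) {
        Map<String, Object> data = query;
        for (UnaryOperator<Map<String, Object>> method : methodForTask.values()) {
            data = method.apply(data);
        }
        return data;
    }
}
```

A developer would call assign once per selected task and then run the system
on a query; reassigning a task replaces its method without changing its
position in the sequence.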
   Ideally, the system designer would find every task and method needed for
the system at hand, so that she would program just the representation for cases.
However, in a more realistic situation a number of new methods may be needed
and, less likely, some new task. Since jCOLIBRI is designed as an extensible
framework, new elements will smoothly integrate with the available infrastruc-
ture as long as they follow the framework design.
   Obviously, not every method designed to solve a certain task can be applied
once the method that solves the previous task has been fixed. For example, it
makes no sense to apply a voting mechanism to obtain the result in the reuse
process if the retrieval one returns just one case.
   Apart from input/output constraints, method applicability can also be determined
by more general constraints such as the requirement of a particular
organization for the Case Base or the availability of a given type of similarity
function defined on cases. These requirements are expressed as descriptions in
a DL and correspond to the conditions to be satisfied by the context of the
CBR system. The element ContextInputPrecondition in a method description
describes the requirements that the application of the method imposes on the
input context, while the element ContextOutputPostcondition describes how the
context is affected by the execution of this method. Method applicability
checking is performed using description logics and CBROnto.
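
   As a rough illustration of this check, the sketch below replaces DL
reasoning with plain set containment: the context is a set of fact names, a
precondition is the set of facts a method requires, and a postcondition is the
set of facts it adds. This is a deliberate simplification and not the
RACER-based mechanism; the class ApplicabilitySketch and all fact names are
hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Applicability checking with set containment standing in for DL reasoning. */
final class ApplicabilitySketch {

    static final class Method {
        final String name;
        final Set<String> precondition;   // facts required of the input context
        final Set<String> postcondition;  // facts added to the output context

        Method(String name, Set<String> precondition, Set<String> postcondition) {
            this.name = name;
            this.precondition = precondition;
            this.postcondition = postcondition;
        }
    }

    /** Convenience helper to build an immutable set of fact names. */
    static Set<String> facts(String... names) {
        return Collections.unmodifiableSet(new HashSet<>(Arrays.asList(names)));
    }

    /** A method is applicable when the context holds every required fact. */
    static boolean isApplicable(Method method, Set<String> context) {
        return context.containsAll(method.precondition);
    }

    /** Applying a method yields a new context extended with its postcondition. */
    static Set<String> apply(Method method, Set<String> context) {
        if (!isApplicable(method, context)) {
            throw new IllegalStateException(method.name + " is not applicable");
        }
        Set<String> next = new HashSet<>(context);
        next.addAll(method.postcondition);
        return next;
    }
}
```

With a retrieval method whose postcondition is a single retrieved case, a
voting reuse method that requires several retrieved cases is rejected, while a
method that copies the solution of one case is accepted.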

2.2   Limitations that guide jCOLIBRI towards a distributed
      architecture
Currently, jCOLIBRI is managed using SourceForge, a software development
website that provides a version control system (CVS). Users check out the source
code of the framework and use its library of PSMs, extending existing methods or
creating new ones. A programmer who wants to share new methods with the whole
community has to commit the files to the central distribution.
    Our goal with jCOLIBRI has been to provide a reference framework
for CBR development that would grow with contributions from the community.
Even so, we have found several difficulties with contributions due to
the current monolithic architecture, namely:
1. Before being added to the framework, every contribution has to
   be processed, which is sometimes too time consuming for the development team.
2. It is not always easy to decide against incorporating some new method,
   because many of them, though useful for some other users, are too
   specific to certain kinds of systems.
3. Contributions added to the framework do not reach the local
   copy of other users as long as they stay with the same version. Since, in the
   process of framework instantiation, the system designer searches for every task
   and method needed for the system using the local copy, the designer
   may miss the opportunity to reuse other new methods.
4. CBR system designers usually find it tedious (and a waste of time) to
   contribute to the method library of jCOLIBRI.
    We have found that these difficulties can be tackled using a distributed
architecture. This new approach is used both in the development of a new CBR
system and in its execution, using remote method calls and OWL-S [11] for the
description of those methods.


3     Distributed Architecture
The distributed model is not meant to replace the main core of the framework
but to ease the publication of new methods without having to contact the
jCOLIBRI development team.
     Users still check out the latest release of jCOLIBRI with the library of core
PSMs, and they continue using the GUI in order to create new CBR systems. The
difference arises when a jCOLIBRI user (CBR designer) has created (or modified)
a method and finds it interesting enough for the rest of the community.
Instead of sending it to be incorporated in the next release (making the core
of the framework grow more and more), the user publishes the method and allows
other (external) systems to use it remotely.
     With this approach, the jCOLIBRI GUI should be able to find remote methods,
i.e., it does not search only in the local copy of the framework but uses the
same techniques to search in the complete set of available remote PSMs.

3.1   Our proposal
Our proposal is to use the jCOLIBRI GUI as a service discovery tool. Using
the same techniques that we are using now (locally over the library of CBR
methods) [8], we aim to widen the scope of the search space to the semantic
web, in particular to the set of CBR services (previously called methods, the
PSMs) publicly available from the CBR community.
    Our CBR ontology (CBROnto) defines CBR related terminology to describe
CBR methods [6, 8]. So, the first step we are taking is to export the methods in
our library. OWL-S has some “hooks” where different kinds of information can
be added, even information outside OWL-S or outside OWL. These extensions
could require some kind of specialized reasoner. We are integrating
CBROnto in OWL-S using the hooks, using OWL itself as the language to relate
them.
    The OWL-S Service Profile class does not dictate any representation of ser-
vices: using OWL subclassing anyone can create specialized representations for
them to be used as service profiles. We could design a new Service Profile subclass
containing the information we consider important for our CBR methods.
    Nevertheless, OWL-S provides the class Profile as a possible representation.
An OWL-S Profile contains the functional description of the service specifying
the inputs and preconditions required, the outputs generated, and the expected
effects (postconditions). These four attributes are stored using OWL properties
in the Profile class. It also contains more general information such as the service
name, a general description, a service category, etc. We have planned a
mapping between the information currently contained in the XML Schema of
our jCOLIBRI methods and the properties of OWL-S Profile.

 1. Name: it is mapped to a string by means of the profile:serviceName property.
 2. Description: it is also mapped to a string using the profile:textDescription
    property.
 3. Parameters: they contain both inputs and outputs. OWL-S has a property
    called “hasParameter” which is specialized into “hasInput” and “hasOutput”.
    We will organize our previous parameter information so that it is correctly
    categorized as inputs or outputs using these properties. We discuss the
    range of hasParameter shortly.
 4. Competencies: they are the list of tasks the method is able to solve. We are
    mapping them into the Profile serviceCategory property, using the
    ServiceCategory class (the range). In this way, ServiceCategory is used to
    describe categories of services on the basis of some classification that is
    outside OWL-S, but understood by our CBROnto specialized reasoner.
 5. ContextInputPrecondition: we will use the hasPrecondition property to
    store this information. We discuss its range (expr:Condition) below.
 6. ContextOutputPostcondition: the hasResult property will be used.
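
   The planned mapping can be summarised as a function from the elements of a
method description to property-value pairs. The sketch below only builds a
plain Java map keyed by property names spelled as in the OWL-S 1.1 Profile
ontology; it does not produce OWL, and the class OwlsProfileMappingSketch and
its argument list are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Maps method description elements to OWL-S Profile property names. */
final class OwlsProfileMappingSketch {

    static Map<String, List<String>> toProfile(String name,
                                               String description,
                                               List<String> inputs,
                                               List<String> outputs,
                                               List<String> competencies,
                                               String precondition,
                                               String postcondition) {
        Map<String, List<String>> profile = new LinkedHashMap<>();
        // 1-2. Name and Description become string-valued properties.
        profile.put("profile:serviceName", Collections.singletonList(name));
        profile.put("profile:textDescription", Collections.singletonList(description));
        // 3. Parameters are split into inputs and outputs.
        profile.put("profile:hasInput", inputs);
        profile.put("profile:hasOutput", outputs);
        // 4. Competencies (solved tasks) are recorded as service categories.
        profile.put("profile:serviceCategory", competencies);
        // 5-6. Context conditions are kept as opaque expression strings.
        profile.put("profile:hasPrecondition", Collections.singletonList(precondition));
        profile.put("profile:hasResult", Collections.singletonList(postcondition));
        return profile;
    }
}
```

The method type and the subtasks have no entry in the resulting map.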

   There are two elements of the method descriptions that are not mapped
in OWL-S: the method type (execution or decomposition methods) and the
subtasks. Both elements relate to the task/method ontology. Currently we
are limiting ourselves to using Semantic Web Services with execution methods, so it is
not important to incorporate information about decomposition into OWL-S.
    Both preconditions and effects need information about the inputs and outputs
of the methods respectively. These are expressed using the OWL-S Input and Output
classes, which are subclasses of Parameter. They must specify the parameter
types and, in some cases, their values. The type is stored as a URI, and the value as
plain text. A specialized reasoner using OWL-S as a “container” should give
meaning to them when discovery takes place.
    In our case, types are specified using the names of the CBROnto classes
that model them. Our reasoner will test whether each class is a descendant of
CBRTerm.
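
    A minimal sketch of such a test is given below. It walks a hand-declared
single-parent hierarchy instead of querying a DL reasoner, and it treats
CBRTerm itself as passing the test; the class CbrTermCheckSketch and any class
names declared through it are hypothetical, not actual CBROnto content.

```java
import java.util.HashMap;
import java.util.Map;

/** Checks whether a class name descends from CBRTerm in a toy hierarchy. */
final class CbrTermCheckSketch {

    private static final String ROOT = "CBRTerm";

    // child class name -> parent class name (single inheritance assumed)
    private final Map<String, String> parentOf = new HashMap<>();

    void declare(String cls, String parent) {
        parentOf.put(cls, parent);
    }

    boolean isCbrTerm(String cls) {
        String current = cls;
        // A chain without cycles visits at most parentOf.size() + 1 classes,
        // so the counter stops the walk if a cycle was declared by mistake.
        int remaining = parentOf.size() + 1;
        while (current != null && remaining-- > 0) {
            if (current.equals(ROOT)) {
                return true;
            }
            current = parentOf.get(current);
        }
        return false;
    }
}
```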
    OWL-S uses logical formulas to represent preconditions and effects. Instead
of integrating them into RDF, OWL-S treats the formulas (expressions and con-
ditions) as string literals or XML literals, which reference the inputs and outputs
defined somewhere else. External reasoners are supposed to be able to analyse
and understand these strings.
    OWL-S has two basic classes concerning expressions. The Expression class
contains the string with the logical formula. It has the property expressionLanguage
related to the LogicLanguage class. OWL-S includes three instances of this
concept, referring to some concrete languages: SWRL, KIF and DRS. We have added
the OWL language to describe description logic formulas that can be used to
express the kind of conditions and effects that we need to model. The reasoner
we use for service discovery employs the OWL expressions and the RACER inference
engine.
    As said before, service profiles are intended to store information about
“what the service does”. OWL-S services also keep “how the service works” in
the so-called service models. Concretely, OWL-S includes a Service Model class
that, like Service Profile, is mostly empty, but is made concrete in the Process subclass.
Its information is especially useful for composite services, which store some kind
of state between interactions.
    The name “composite services” can suggest some kind of relationship with
our decomposition methods. However, our decomposition methods cannot be
modeled using the composite services supported by OWL-S because they have
different semantics. Our decomposition methods follow a kind of “divide and conquer”
philosophy, in which the method itself is in charge of invoking the submethods. The OWL-S
idea of composite services refers to the client calling the different subservices.
In other words, a composite process is not a behaviour a service will do, but a
behaviour (or set of behaviours) the client can perform by sending and receiving
a series of messages [11]. Consequently, our methods will always be atomic
processes from the OWL-S point of view, and we are not currently interested in
making advanced use of the Process subclass.


CBR services discovery
 Converting the jCOLIBRI framework to a distributed component-based system
using OWL-S implies a change in the way the jCOLIBRI development GUI
works. We need some kind of central registry where third-party components (also
called methods or services) are published using OWL-S + CBROnto, and which the
development GUI queries for concrete methods, depending on the user
requirements.
    This central registry extracts the information relevant to the query from the
“OWL-S container”, and obtains the specification on top of CBROnto in order to
look for some existing method using our techniques based on description logics.
Concretely, verifying whether or not a service satisfies the set
of restrictions is accomplished using RACER as the inference engine.
    As said before, currently we are not interested in method composition at
this level. Tasks are decomposed using the task/method ontology in the local
development tool, and all queries concern the “leaf” methods once
all the decompositions have been decided.
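
    The role of the registry can be sketched with an in-memory stand-in. The
class ServiceRegistrySketch, its publish and discover operations, and the
matching rule (same task and all required facts present in the query context)
are assumptions for illustration; the registry we describe would delegate
matching to the DL reasoner instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** In-memory stand-in for a central registry of published CBR services. */
final class ServiceRegistrySketch {

    static final class PublishedService {
        final String uri;            // where the remote method can be invoked
        final String task;           // leaf task the service solves
        final Set<String> requires;  // facts required of the caller's context

        PublishedService(String uri, String task, Set<String> requires) {
            this.uri = uri;
            this.task = task;
            this.requires = requires;
        }
    }

    private final List<PublishedService> services = new ArrayList<>();

    /** Called by a third-party developer to make a method available. */
    void publish(String uri, String task, Set<String> requires) {
        services.add(new PublishedService(uri, task, requires));
    }

    /** Called by the development tool for one leaf task of the configuration. */
    List<String> discover(String task, Set<String> context) {
        List<String> matches = new ArrayList<>();
        for (PublishedService service : services) {
            if (service.task.equals(task) && context.containsAll(service.requires)) {
                matches.add(service.uri);
            }
        }
        return matches;
    }
}
```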


4   Conclusions

We have presented jCOLIBRI, an object-oriented framework in Java to build
CBR systems. This framework is built around a task/method ontology that
facilitates the understanding of an intrinsically sophisticated software artifact.
The current implementation of jCOLIBRI has recently been released as an open
source effort to serve as a development tool and to profit from the input of the CBR
community.
     In the current framework-centered architecture of jCOLIBRI new CBR sys-
tems are developed through framework instantiation. In this process, users may
extend available classes, developing new methods as needed. In our role as devel-
opers of the main core of the system, we expect those programmers to contribute
to our method collection with the most relevant ones. We intend to process and
filter all these third-party methods and add to the next release those potentially
interesting to jCOLIBRI users.
     In this paper we have proposed a new distributed architecture for jCOLIBRI,
profiting from the similarities between PSMs and component-based reuse.
jCOLIBRI provides a battery of PSMs, and the programmer searches for the
most useful ones for the purpose at hand. Separating PSMs from the main core by using Web
Services gets us closer to the concept of software component technology, bringing
its possibilities to jCOLIBRI. In that sense, both services and main core can
evolve independently, so the versioning problem is reduced. Each method developer
is responsible for controlling the versions of each method and for guaranteeing
backward compatibility, perhaps using techniques from other component
technologies such as DCOM or Enterprise JavaBeans.
     Web Services and remote invocation let implementers choose a programming
language different from that used in the jCOLIBRI implementation (Java).
Any developer who respects the rules of the Semantic Web and creates correct
descriptions in OWL-S of his services using CBROnto will be creating methods
that will be available for the rest of the community.
     The main drawback of a distributed architecture is performance, due to
the cost of remote calls, especially when methods are fine-grained and many
invocations are needed. The implementer of a method should design it with this
problem in mind, trying to provide suitable interfaces while minimizing the
number of invocations. Another solution is to let the user of the method obtain
a copy of the source code and install the PSM as a local component, reducing
the call overhead.
    Our (ambitious) goal is to provide a reference framework for CBR develop-
ment that would grow with contributions from the community. This reference
would serve for pedagogical purposes and as a baseline implementation for
prototyping CBR systems and comparing different CBR approaches to a given
problem. This idea is so mature in the community that several efforts are pursuing
it at the time of writing: CAT-CBR [3], a component-based platform for developing
CBR systems; JavaCREEK, the Java implementation of the CREEK architecture
for knowledge-intensive CBR systems [1]; and IUCBRF [5], a Java framework
developed at Indiana University, to mention just a few. The main contribution of the
work presented in this paper is along the line of proposing a distributed architec-
ture where different approaches to CBR system development would collaborate
instead of compete.

References
 1. A. Aamodt. Knowledge-intensive case-based reasoning in CREEK. In Procs. of
    ECCBR 2004, pages 1–15. Springer-Verlag, 2004.
 2. A. Aamodt and E. Plaza. Case-based reasoning: Foundational issues, methodological
    variations, and system approaches. AI Communications, 7(1), 1994.
 3. C. Abásolo, E. Plaza, and J.-L. Arcos. Components for case-based reasoning sys-
    tems. Lecture Notes in Computer Science, 2504, 2002.
 4. J. J. Bello-Tomás, P. A. González-Calero, and B. Díaz-Agudo. JColibri: An
    object-oriented framework for building CBR systems. In Procs. of
    ECCBR 2004, pages 32–46. Springer-Verlag, 2004.
 5. S. Bogaerts and D. Leake. IUCBRF: A Framework For Rapid And Modular
    Case-Based Reasoning System Development.
    http://www.cs.indiana.edu/~sbogaert/CBR/IUCBRF.pdf.
 6. B. Díaz-Agudo and P. A. González-Calero. An architecture for knowledge intensive
    CBR systems. In E. Blanzieri and L. Portinale, editors, Advances in Case-Based
    Reasoning – (EWCBR’00). Springer-Verlag, Berlin Heidelberg New York, 2000.
 7. B. Díaz-Agudo and P. A. González-Calero. Classification based retrieval using
    formal concept analysis. In Procs. of ICCBR 2001. Springer-Verlag, 2001.
 8. B. Díaz-Agudo and P. A. González-Calero. CBROnto: a task/method ontology
    for CBR. In S. Haller and G. Simmons, editors, Procs. of the 15th International
    FLAIRS’02 Conference (Special Track on CBR), pages 101–106. AAAI Press, 2002.
 9. A. Gómez and R. Benjamins. Overview of knowledge sharing and reuse compo-
    nents: Ontologies and problem-solving methods. In IJCAI99 workshop on Ontolo-
    gies and Problem-Solving Methods. Sweden, 1999.
10. T. Schreiber, B. J. Wielinga, J. M. Akkermans, W. V. de Velde, and R. de Hoog.
    CommonKADS: A comprehensive methodology for KBS development. IEEE Ex-
    pert, 9(6), 1994.
11. The OWL Services Coalition. OWL-S: Semantic Markup for Web Services.
    http://www.daml.org/services/owl-s/1.1/overview/.
12. B. Wielinga, A. Schreiber, and J. Breuker. KADS: A modelling approach to
    knowledge engineering. Knowledge Acquisition, 4(1), 1992.