=Paper= {{Paper |id=Vol-34/paper-4 |storemode=property |title=Case Based Reasoning for Knowledge Management in KDD Projects |pdfUrl=https://ceur-ws.org/Vol-34/bartlmae_riemenschneider.pdf |volume=Vol-34 |dblpUrl=https://dblp.org/rec/conf/pakm/BartlmaeR00 }} ==Case Based Reasoning for Knowledge Management in KDD Projects== https://ceur-ws.org/Vol-34/bartlmae_riemenschneider.pdf
Case Based Reasoning for Knowledge Management in KDD-Projects
Concepts, Organizational Setting, Categorization into KM and Application in the case of Knowledge Discovery in Databases

Kai Bartlmae, DaimlerChrysler AG, Research and Technology 3, D-89013 Ulm, Germany, kai.bartlmae@daimlerchrysler.com
Michael Riemenschneider, DaimlerChrysler AG, Research and Technology 3, D-89013 Ulm, Germany



Abstract

In this paper we introduce our department's organizational and technical infrastructure for knowledge-intensive and weakly structured processes: a framework for Knowledge Management for projects in Knowledge Discovery in Databases (KDD). It is based on the experience factory approach and the method of case based reasoning. We introduce both approaches in the context of knowledge management, derive application areas and present our realization for projects in knowledge discovery in databases.

1 Introduction

Many knowledge-intensive activities take place in project organizations, where project teams form a temporary organization that is disbanded after the projects are completed. This holds especially true for the work our department FT3/AD is involved in, Knowledge Discovery in Databases. Here we analyze customer databases of different DaimlerChrysler branches, e.g. for marketing purposes or for assessing credit risk. Because we work in these temporary teams, it is in our interest that the experience gained in these projects is not only kept as the team members' personal knowledge, but is also kept within our business organization in order to be reused.

This is not a new problem. Heisig [HEI98b], for example, proposes that before a new project is started, a plan for collecting know-how and experiences should be prepared, considering the following questions:

•   Who is responsible for experience collection?
•   Where can know-how be gained?
•   Who gained certain experiences?
•   In what form should the experience be documented?
•   How are the experiences collected and saved?
•   How are the experiences disseminated?

But experience documentation faces many barriers: it is a time-intensive task, and the person documenting an experience will in many cases not be its user and is therefore reluctant to share [KPMG98]. Further, the project teams are under time pressure, so the motivation for documenting experiences is initially low. A goal of any approach must be to give the team members help and time when documenting their own project experiences, as well as to give them the information they need as easily and quickly as possible, releasing them from administrative work. Further, project management has to make the team aware of the need for knowledge management, define processes for it, train the teams and, last but not least, introduce a technical infrastructure to collect, disseminate and reuse experiences.

Here we try to approach these problems with the concept of case based reasoning together with the experience factory concept by Basili et al. [BCR94], which together form the basis of the Experience Factory in Knowledge Discovery in Databases at FT3/AD (see also [BAR99]). It covers the necessary aspects of knowledge management in project work mentioned above. The approach has its basis in the domain of software engineering and has been successfully applied by Althoff et al. [ABT97].

The copyright of this paper belongs to the paper’s authors. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage.

Proc. of the Third Int. Conf. on Practical Aspects of Knowledge Management (PAKM2000)
Basel, Switzerland, 30-31 Oct. 2000, (U. Reimer, ed.)
http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-34/




K. Bartlmae, M. Riemenschneider                                                                                                                   2-1
In this paper we introduce both concepts, show where they can complement each other and how they cover the different building blocks of knowledge management. The paper concludes with a description of the KDD experience factory, describing selected experience package types used.

2 Organizational view: The Experience Factory approach

The experience factory approach was introduced by Basili et al. as an evolutionary, experience-based approach for the improvement of software products and software development processes. They were motivated by the realization that collected experiences can improve development processes [HOU99]. Based on the Quality Improvement Paradigm (QIP), the experience factory was introduced as an organization that supports the project teams in the different steps of the QIP.

One of its basic characteristics is its organizational separation from the project teams, which compensates for the different goals that project teams and experience management have [ABT97]: while project teams try to reach their project goals quickly and within a cost frame, experience management aims at avoiding mistakes and establishing good practices using collected experiences. But this process of experience collection is time-consuming and costly, meaning additional effort for the project members. This is why an organizational separation of the collection and the creation of experiences might prove useful. The organization for collecting, structuring, saving and disseminating experiences is called an Experience Factory by Basili. Experience packages (EP) are its form for representing experiences of different structure and types, from data to process definitions. These are saved in an experience base, which can be compared to a safeguarded organizational memory.

The experience factory approach is in its basic form very abstract and conceptual [HOU99]. In order to apply it, it is necessary to define its specific goals and the tasks and processes of the involved agents, and to install a (technological) platform. The experience factory approach has been applied in different applications; here we modified the model in order to apply it in the domain of Knowledge Discovery in Databases (Figure 1).

The approach proposed by Basili has been tailored according to [ANT97]. We also distinguish between the project teams, which conduct the different KDD projects, and the experience factory organization, which according to [HOU99] comprises the roles of the Experience Engineer, the Experience Factory Manager and the EF-supporting agents. See also [BT98] for a similar differentiation.
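As a rough illustration of the experience packages and the experience base described above, the following Python sketch stores packages of different types in a simple in-memory base. All names (ExperiencePackage, ExperienceBase, the fields and the two example entries) are assumptions made for illustration only; they do not reproduce the structure of the authors' actual experience base or Basili's definition.

```python
from dataclasses import dataclass, field


@dataclass
class ExperiencePackage:
    """One documented experience; all field names are hypothetical."""
    package_id: str
    kind: str          # e.g. "lesson learned", "process definition", "data"
    project: str       # project the experience came from
    content: str       # the documented experience itself
    keywords: set[str] = field(default_factory=set)


class ExperienceBase:
    """In-memory stand-in for an organizational memory of experience packages."""

    def __init__(self) -> None:
        self._packages: dict[str, ExperiencePackage] = {}

    def save(self, package: ExperiencePackage) -> None:
        # Saving under an existing id overwrites the older version.
        self._packages[package.package_id] = package

    def by_kind(self, kind: str) -> list[ExperiencePackage]:
        return [p for p in self._packages.values() if p.kind == kind]

    def by_keyword(self, keyword: str) -> list[ExperiencePackage]:
        return [p for p in self._packages.values() if keyword in p.keywords]


# Two made-up example packages.
base = ExperienceBase()
base.save(ExperiencePackage("EP1", "lesson learned", "DM-Project 1",
                            "Check data availability before modeling.",
                            {"data preparation"}))
base.save(ExperiencePackage("EP2", "process definition", "DM-Project 2",
                            "Steps used for the credit risk analysis.",
                            {"credit risk", "modeling"}))
lessons = base.by_kind("lesson learned")
```

Keeping the package type as a plain field reflects the statement that packages may range from data to process definitions; a real experience base would need persistence and access control, which this sketch leaves out.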

[Figure 1 (diagram). Company/Department with Management and Project organisation: DM-Project 1, DM-Project 2, ..., DM-Project n, each running through the phases Planning/Mapping, Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation and Deployment. Experience Factory FT3/AD with Supporter, Manager and Experience Engineer, and an Experience Base holding EP1, EP2, ..., EPn. Labelled flows: "Lessons Learned, KDD-Knowledge: Process, Products, Persons"; "Queries"; "Lessons Learned, Reports, Deliverables, Feedback, Workshops, Meetings"; "Reporting"; "Experience existing in the world at large". Legend: Information Flow, Influence, Optional Information Flow.]

Figure 1: Experience Factory in Knowledge Discovery in Databases (see also [ANT97]).




3 The building blocks for Knowledge Management within the Experience Factory

The building blocks according to Probst [PRR99] form a general framework for Knowledge Management that is based on a two-layered learning cycle. The outer cycle consists of the elements goals, realization and valuation and describes a traditional management control cycle. The inner cycle is represented by the blocks of knowledge identification, knowledge acquisition, knowledge development, knowledge distribution, knowledge use and knowledge preservation. The blocks represent an overall approach for the management of knowledge in a business organization. Since the conceptual approach by Probst et al. incorporates the whole organization, the different concrete knowledge management initiatives have to be fitted into this general approach. Here the Experience Factory should represent a concretization of these blocks. We therefore mapped the approach onto the blocks in order to investigate how the EF fulfils them. Further, an approach should be integrable, problem-oriented, understandable and action-oriented, and should provide instruments. Here we want to show how the experience factory and the CBR approach we use realize these requirements.

In the inner cycle, the blocks of knowledge use and knowledge development are in the scope of the project organizations and teams. Knowledge identification, on the other hand, is one major task of the experience engineers, but depends on the help of the project teams. The EF-supporting agents are responsible for assisting the project teams and for knowledge distribution. This can happen by joining the project teams, through seminars in our so-called KDD-Shop or, last but not least, through our experience base called Core-DM (Case Oriented Reuse of Experiences in Data Mining). Here our department FT3/AD plays an experience-factory-like role for the different departments conducting knowledge discovery in databases in cooperation with us. We represent a competence center in KDD, helping project partners to conduct KDD. As a research department within DaimlerChrysler, we are further interested in the development and application of new KDD technologies. We participate in KDD research and present the results at leading academic conferences. On the other hand, we conduct knowledge acquisition through the purchase and evaluation of products, through cooperation with universities and through the hiring of new personnel. At FT3/AD we installed the KDD-Shop, where we evaluate new tools and train our own teams and those of the project members in order to keep up with the state of the art in our domain.

Our experience engineers are responsible for the documentation of experience packages and artifacts of projects. This is done in cooperation with the project team and according to the operative knowledge goals formulated by the EF management, which state what types should be collected and how the infrastructure and processes should look. They are therefore the persons responsible for knowledge preservation. Further, the EF-supporting agents take part in the collection of information and experiences within the project teams.

In the outer cycle, the responsibility for setting knowledge goals can be found on different levels. Probst et al. differentiate between levels for formulating knowledge goals; of interest here are mainly the operative ones. Realistic goals have to be formulated and, further, measures to value these have to be defined and evaluated, closing the loop with the formulation of optimized knowledge goals.

It can be seen that the basic roles and responsibilities of the Experience Factory can be assigned to the KM building block approach in the context of our department FT3/AD and Knowledge Discovery in Databases (see also figure 2). Although the experience factory approach focuses on collecting and reusing experiences in project work, its roles also cover the basic blocks of a general knowledge management approach. While the Experience Factory is more of an organizational approach, giving roles to the different persons, we now want to introduce a more technical approach to complement it.

4 Cognitive Sciences View: Case Based Reasoning

Case based reasoning (CBR) and knowledge management share the same goal: the use and development of knowledge.
While knowledge management can be understood as a general and large area, incorporating different methods

[Figure 2 (matrix). Columns, inner cycle: Knowledge Identification, Knowledge Acquisition, Knowledge Development, Knowledge Distribution, Knowledge Use, Knowledge Preservation; outer cycle: Knowledge Goals, Knowledge Valuation. Rows: EF Organization (EF Support Agent, Experience Engineer, EF Manager, Exp-base/OM), Management, Project Organization.]

Figure 2: Assigning the responsibilities of the EF organization to the building blocks of Knowledge Management of Probst et al. (grey = important role, black = less important role).




and techniques, e.g. organizational and technical ones, case based reasoning represents a very concrete approach to these goals. As we did in the last section with the experience factory, we will now introduce the case based reasoning approach and show how it can be used within the general Probst framework and how the building blocks are covered by CBR.

The basic idea of case based reasoning is that, to solve a new problem, the solution of a concrete, similar and already solved problem is tailored to the new context and reused [WES96]. It is



[Figure 3 (swimlane diagram). Lanes: Experience Factory Organization (EF Support Agent, Experience Engineer, EF Manager, CBR System) and Project Organization. Phases, top to bottom: Retrieve, Reuse, Revise, Retain. Task labels, top to bottom: Support during documentation and selection; Problem description, query to the system; Presentation of cases, experience; Adaptation and reuse of cases; New solution; Verify solution and further adaptation; Confirmed solution; Feedback to the EF about used cases; Evaluation and adaptation of cases, experiences; Evaluation and adaptation of the system; Save changed domain model; Allow for changes of system; Save changed similarity measures; New case; Final evaluation of cases; Delete case; Save case; Save case in system.]

Figure 3: General tasks of the KDD experience factory along the CBR-process.




[Figure 4 (matrix). Columns, inner cycle: Knowledge Identification, Knowledge Acquisition, Knowledge Development, Knowledge Distribution, Knowledge Use, Knowledge Preservation; outer cycle: Knowledge Goals, Knowledge Valuation. Rows, CBR cycle: run-time phases Retrieve, Reuse, Revise, Retain; CBR-System Design.]

Figure 4: Mapping the CBR cycle onto the KM building blocks (grey = many similarities, black = some similarities).



based on a learning cycle comprising the phases retrieve, reuse, revise and retain of cases and experiences [AP94]. It is grounded in cognitive psychology, which states that experts tend to reuse concrete experiences rather than solve new problems from the ground up. Case based reasoning tries to realize this idea by describing a problem and its solution by a set of attributes and saving them as a case in a case base or experience base. Besides this knowledge in the experience base in the form of cases, it is necessary to formulate general knowledge on how to select, interpret and transform cases, e.g. to formulate similarity measures or rules for transforming the old solution into the new one.

5 The Experience Factory as organizational framework for realizing a case based reasoning system

Through the Experience Factory, a case based reasoning system can be given an organizational framework [ABT97]. With this framework, it is possible to compensate for the organizational deficits of the CBR approach and to assign responsibilities within the CBR learning cycle:

First of all, the EF-supporting agents, together with the project teams, are responsible for collecting cases that are candidates for being saved in the experience base. They pass these to the experience engineer for further documentation. On the other hand, they are responsible for supporting the project teams by formulating queries to the experience base and retrieving old cases. They form the interface between the project teams and the EF organization.

The experience engineer is responsible for the final structure and form of the cases. He safeguards that the quality of the cases is adequate. Further, he has to evaluate and perform maintenance operations on the experience base and its cases. If necessary, he alters the similarity measures for improved retrieval performance or changes the case structure. The EF management, in turn, together with the rest of the EF team, sets the necessary knowledge goals: what is to be achieved with this approach and how its success can be measured.

The most important part of a CBR system is the experience base, where the cases are saved in the form of experience packages. The experience packages are accessed during the retrieve phase using a similarity-based measure. In most cases a technological platform exists to do this easily and quickly. Figure 3 shows the lifecycle of a case along the CBR phases, together with the responsibilities according to the experience factory organization.

In figure 4 we mapped the CBR cycle onto the KM building blocks. It can be seen that the CBR cycle by [AP94] corresponds to the realization of the inner knowledge management cycle according to Probst et al. But the design, evaluation and maintenance of a CBR system are also important topics that need to be covered by an overall approach. Here, we see the EF manager and the experience engineer as responsible for the development of the system, i.e. of the domain model, the structure, the similarity measures and its technical implementation.

In figure 5 we assigned the different CBR phases to each of the EF roles and added the missing building blocks. This combined framework of experience factory and case based reasoning now covers all necessary steps of a major knowledge management framework, making it a concretization of the introduced KM building blocks.

[Figure 5 (matrix). Columns, inner KM cycle: CBR-System run-time phases Retrieve, Reuse, Revise, Retain, and CBR System Design; outer KM cycle: Knowledge Goals, Knowledge Valuation. Rows: EF Organization (EF Support Agent, Experience Engineer, EF Manager, Exp-base/OM), Management, Project Organization.]

Figure 5: The experience factory roles and case based reasoning for the realization of knowledge management (grey = important role, black = less important role).

6 Representation of KDD experience in a case based reasoning system

In a case based reasoning system, knowledge is saved in so-called knowledge containers: the case base, the structure/vocabulary, the similarity measures and the transformation knowledge [RIC98].




K. Bartlmae, M. Riemenschneider                                                                                                   2-5
The development of a CBR system starts with the structural description of the application
domain. This includes the kind of cases one wants to describe, their structure and the
definition of their attributes. Further, a similarity measure has to be defined for retrieval
from the experience base. As a last step, knowledge on how to transform an old solution to the
current situation can be included through transaction rules, but in our case the transaction
has to be performed by the user of the system without technological help. The whole structural
description of a domain is called the domain model and is based on the following primitives
[WES96]:

    •    Attributes and types describe features of a domain (e.g. Text, Reals, Integer)
    •    Concepts (objects) describe concrete entities of the domain
    •    Relations describe the relationship between objects
    •    Rules describe rule-based relations between objects

Based on the structural description of the domain, a similarity measure is defined in order to
retrieve similar cases from the case base. For each attribute of a given case and a given
query, a similarity can then be calculated. These attribute similarities are aggregated to an
overall similarity score between a case and a query. The most similar cases can then be
presented to the user of the system. Using a similarity measure makes it possible to find not
only completely fitting packages, but also "near-matches", which is in the spirit of CBR.

We further used keywords to describe the packages' textual components. The keyword concept
allows the introduction of additional context description and helps the user to identify
useful packages. Rather than relying on the experience engineer to find good keywords, we
combine our structural CBR approach with a textual CBR technique (tCBR) for the representation
of the knowledge of the textual parts (see also [BL00]). Here we rely on the structured form
of the cases and use the textual components to extract Information Entities (IEs) about the
packages. The knowledge for identifying the IEs of the packages is given by a set of term
indices, thesauri, a product/name-index and a term-generalization-index. The content of the
dictionaries is collected by our domain experts or automatically by parsing KDD related
documents. For retrieving cases, we distinguish the attribute part, where we can make use of
the structured domain model's predefined attributes and their possible values, and the textual
part, which makes use of domain-dependent and common knowledge stored in the index-vocabulary,
thesauri and term-generalizations. For the textual parts, a query to the experience base
should return packages that contain similar expressions in the form of IEs. The resulting
overall similarity is then calculated as a weighted sum of the similarities of all attributes.
Before the experience base can be queried, the packages' IEs have to be pre-calculated. This is
done in an off-line process.
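The retrieval step described in this section can be illustrated with a minimal sketch. It is not the CBR-Works implementation: the names `Package`, `extract_ies`, `attribute_similarity`, `text_similarity`, `overall_similarity` and `retrieve` are hypothetical, exact matching is assumed as the local similarity of symbolic attributes, Jaccard overlap of IE sets is assumed for the textual part, and the weights are arbitrary example values. Only the aggregation as a weighted sum and the off-line pre-calculation of IEs are taken from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """Experience package: structured attributes plus pre-computed IEs."""
    name: str
    attributes: dict
    ies: set = field(default_factory=set)


def extract_ies(text, term_index, thesaurus):
    """Off-line step: spot indexed terms in a text, map synonyms to one IE.

    Plain substring matching is a simplification made for this sketch.
    """
    lowered = text.lower()
    found = {term for term in term_index if term in lowered}
    return {thesaurus.get(term, term) for term in found}


def attribute_similarity(query_value, case_value):
    """Local similarity of one symbolic attribute (assumed: exact match)."""
    return 1.0 if query_value == case_value else 0.0


def text_similarity(query_ies, case_ies):
    """Local similarity of the textual part (assumed: Jaccard overlap)."""
    union = query_ies | case_ies
    return len(query_ies & case_ies) / len(union) if union else 0.0


def overall_similarity(query_attrs, query_ies, package, weights, text_weight=1.0):
    """Weighted sum of local similarities, normalised by the total weight."""
    total, weight_sum = 0.0, 0.0
    for attr, query_value in query_attrs.items():
        weight = weights.get(attr, 1.0)
        total += weight * attribute_similarity(query_value, package.attributes.get(attr))
        weight_sum += weight
    if query_ies:
        total += text_weight * text_similarity(query_ies, package.ies)
        weight_sum += text_weight
    return total / weight_sum if weight_sum else 0.0


def retrieve(query_attrs, query_ies, case_base, weights, k=3):
    """Return the k most similar packages as (score, package) pairs."""
    scored = [(overall_similarity(query_attrs, query_ies, p, weights), p) for p in case_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Example data (invented for illustration only).
term_index = {"neural net", "overfitting", "clustering"}
case_base = [
    Package("p1", {"problem_type": "classification", "goal": "prediction"},
            extract_ies("The neural net suffered from overfitting.", term_index, {})),
    Package("p2", {"problem_type": "segmentation", "goal": "description"},
            extract_ies("Clustering of customers.", term_index, {})),
]
query = {"problem_type": "classification", "goal": "description"}
results = retrieve(query, {"overfitting"}, case_base, {"problem_type": 2.0, "goal": 1.0})
```

In this example neither package fits the query completely, yet `p1` is ranked first as a near-match because it agrees on the more heavily weighted attribute and shares an IE with the query.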




[Figure 6 (diagram, not reproduced). Components shown: WWW-Client (e.g. Netscape); Servlet
running in a Java Virtual Machine; CBR-Works CQL-Server with Similarity Measures and Case Base;
CBR-Works Case Navigator; CBR-Online; IE-Generation (PERL); HTML-Template; Webserver; Artefacts.
Connection labels: Intranet, http, CQL-Query, CQL-Results, Fileaccess.]

 Figure 6: Architecture of Core-DM.




7 Technical View: Realization of a CBR based Experience Factory in the case of Knowledge Discovery in Databases

At DaimlerChrysler, KDD is applied by different KDD teams in projects ranging from credit
scoring to Customer Relationship Management. We see KDD as a knowledge-intensive and
weak-structured process, where the agents have to choose at each step from a variety of options
based on their background knowledge of KDD. Further, we observed that because of the repetitive
application of a standard process model in KDD, CRISP-DM¹, experience can be used in successive
projects. This makes organizational team support an important topic in the case of KDD.
Systematic knowledge creation, capture, organization and use provides a way to support the KDD
process model CRISP-DM. We therefore identified types of knowledge that can improve KDD
processes and ways in which experience can be integrated using a CBR-based experience factory
for KDD, Core-DM.

On the organizational side, we implemented the proposed CBR based experience factory approach
with its processes. The technical architecture of the Core-DM system can be seen in Figure 6
and is based on the commercial tool CBR-Works from TecInno. We implemented an intranet interface
using Java servlet technology, which communicates with the CBR-Works-Server using the CQL case
query language. Further, the EF-teams use the CBR-Works Case-Navigator to author the experience
base. Since the user can access different artifacts like KDD reports, presentations or streams
of our Clementine Stream Library², we further installed a simple web server.

We derived nine types of experience packages to be stored in the experience base and
disseminated through the factory (Table 1). We use an object-oriented package model including
generalizations so that common attributes are shared by different package types. In the next
sections we will introduce three of the nine package types in more detail. These package types
represent the different classes of packages used in the experience base, from very structured
information packages (e.g. artifacts) to semi-structured packages using large textual
components (e.g. lessons learned) in order to represent the knowledge. So far we have collected
over 350 packages by evaluating different KDD projects and our KDD documents like guidelines
and handbooks.

7.1 Lessons Learned-Packages

Experience packages of this type describe solutions experienced in a concrete setting of a
project (see figure 7). The packages are structured into a part for classification, a main part
with a solution description, and a part giving reasons for this solution (see also [HOU99]).
For a first classification of the package, attributes describing a project³ and the KDD step
where it occurred are used. Of special interest is the step in CRISP-DM where the package has
been used or has been created. This is modelled by a taxonomy of all possible process phases
and steps and indicates where the package can be reused in new projects.
A further context specification is saved in addition to each package. The KDD- and
application-context description attributes help to characterize the context of the packages.
These features include information about the overall goal of the KDD project (e.g. prediction
or description of data), the KDD problem type (e.g. regression, classification or segmentation
of data), and information about the application context. In this case, we applied KDD in the
area of marketing and credit-risk management and specify the concrete application within a
taxonomy of these areas. This context also includes features about the objects being described
by the data (e.g. private customer information or small commercial

      Experience Package Type                       Contains Experience about
      Documents                                     Documentation, code, reports required by CRISP-DM
      Process                                       Process-model steps definitions used in a project
      Data                                          Attributes and data-transformation used in former KDD-projects
      Product                                       Product-description of KDD-tools
      Solutions, Lessons Learned: KDD, Management   Problem/solution pairs, success-factors, mistakes, best practices
      Experts                                       Persons involved in KDD projects and skill-database
      Methods, Techniques                           KDD-method and technique descriptions, e.g. neural nets
      Project                                       Project-characterization, KDD-problem type, goals, persons involved
      Formula                                       Error measures, quality measures
    Table 1: The experience package types used in the KDD-experience factory
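The object-oriented package model with generalizations can be pictured with a small sketch. The class and attribute names below are hypothetical and chosen only for illustration; the actual Core-DM domain model is defined in CBR-Works and is richer than this. The sketch shows only the idea that a general package type carries the common attributes, which the specialized package types inherit.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ExperiencePackage:
    """Hypothetical general type holding attributes common to all packages."""
    title: str
    crisp_dm_step: str
    kdd_goal: str
    keywords: list = field(default_factory=list)


@dataclass
class LessonsLearnedPackage(ExperiencePackage):
    """Semi-structured package: the experience is mostly held in free text."""
    problem: str = ""
    solution: str = ""
    rationale: str = ""
    outcome: str = ""


@dataclass
class ArtifactPackage(ExperiencePackage):
    """Highly structured package pointing to a stored document or stream."""
    artifact_type: str = ""
    reference: str = ""


def shared_attributes(*package_types):
    """Names of the attributes that every given package type defines."""
    names = [{f.name for f in fields(t)} for t in package_types]
    return set.intersection(*names)
```

Calling `shared_attributes(LessonsLearnedPackage, ArtifactPackage)` returns exactly the attributes declared on the general type, which are the ones a query can use across package types.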
1 Cross Industry Process for Data Mining, see www.crisp-dm.org
2 Clementine is a KDD tool by SPSS Inc. used by FT3/AD. Clementine programs are called streams.
3 Projects are described by their own package type, which is not further described here.




             Domain model part   Description                                                Attributes

             Context             This part of the package describes the                     Application(Taxonomy of Domains)
             Applied Methods     context, in which a problem occurred. This                 Data Mining Problem Type(Set)
             Involved Object     includes the selection of predefined                       Keywords (String)
                                 dimensions, i.e. KDD-processes task, the                   Objects involved in Experience (Set)
                                 KDD-problem type, the used tools, the level                Problem class(Set)
                                 of specialization, the methods applied, the                Project characterization (Subconcept
                                 specific domain etc. Further, keywords from                with context-attributes: Team size,
                                 the keyword-list help to identify the                      Duration, Region, Tools used, Data sets
                                 concrete problem. Here tools, methods and                  used)
                                 objects are described which were applied                   KDD-Goals(Set)
                                 or involved during the step. They are                      Data Mining step in CRISP-DM
                                 mostly predefined through the domain                       (Taxonomy)
                                 model.                                                     Lessons learned type(Set)
             Abstract             Here tools, methods and objects are described which       String
                                  were applied or involved during the step. They are
                                  mostly predefined through the domain model.
             Problem/Topic        This section of an experience package describes the       String
                                  problem/topic that had to be solved during the
                                  execution of a KDD-step.

             Solution            Here a case/solution or experience description is          String
                                 presented, that can give help in the given context.
                                 Further, a justification or rationale, why it has been
                                 chosen, can be described.

             Rationale           If it is possible, it describes reasons that made it       String
                                 necessary to perform this step.

             Outcome             In this section, the outcome and result of applying the    String
                                 solution to the problem is described. Further, it is
                                 assessed, if the solution is a success for this problem.
                                 Note, that also negative outcomes add to the
                                 knowledge about a problem.
             References          Since experience packages are only compact                 String
                                 documents, links to other information sources or
                                 persons can be given.

             Admin               Here administrative information for experience             Author (Reference to Person Experience
                                 controlling is being given, i.e. number of accesses and    Package),
                                 ratings.                                                   Comment (String)
                                                                                            Controlling Concept (3 Attributes)
                                                                                            Knowledge view concept (Review
                                                                                            form(Set)
                                                                                            Specialization of Experience(Set)
                                                                                            Lifecycle of experience)



Figure 7: Structure of the Lessons Learned packages.

customers) and the regional setting of the application. The content of the packages is further
specified by two attributes. A set of involved objects (e.g. Person, Time, Data or Product) and
the class of problem (taken from the areas of management, technical problems, KDD-related
problems) help to differentiate the cases. From a knowledge perspective, three features
represent the origin of the package with respect to KDD and its processes (general about KDD,
from a KDD project, review of a project), the specialization of the experience (general,
special and cookbook), the lifecycle (theory, observation and practice) and the view onto the
experience (e.g. Application Developer, Business Analyst, KDD Engineer or End User).
An important characterization is the type of the case. Here we distinguish between Best
Practice, How To, Mistake/Critique and Success Factors.

The experience is described in the main part of the packages. This is done in two text fields,
named topic/problem and solution. The case information therefore has to be processed in order
to fit into these fields. Further, if possible, the rationale for applying the solution and the
outcome after application can be collected in two further text fields. The introduced
information entities (IEs) are calculated over these four fields in order to use textual CBR
techniques. Figures 8 and 9 show how the experience base can be queried within our department's
intranet.

Figure 8: Query for the Lessons Learned Packages of Core-DM. The structural CBR approach allows
for the specification of attribute values, the textual approach allows for keyword search of
the packages' textual components.

7.2 Artifact-Packages

In these packages artifacts of different KDD processes and projects are collected for reuse.
These artifacts can be of different types, e.g. presentations, project reports or code
fragments. The user of the system can therefore specify the type of artifact he wants to
retrieve. This information is represented by an artifact type in the experience base. Further,
the KDD- and application-context specification is saved as before in addition to each artifact
package, as is the CRISP-DM process step.

The artifacts are further described by a short abstract, a detailed description and the project
they were created in. The content of the artifact is characterized by an attribute using a
taxonomy of content types. Here one distinguishes broadly between the results of KDD projects,
like deliverables and reports, and process-supporting documentation (user guides or reference
models) for a certain application area. Finally, a reference to the concrete artifact is
stored, so that it can be downloaded from our web server.

7.3 Person-Packages

With this concept the information and especially the skills of persons involved in our KDD
projects can be described. The description can be separated into two parts. First, information
about the person is saved as in any person register, from names to addresses and phone numbers.

In the second part, the skills and roles of a person are described, making it possible to find
persons according to their knowledge and expertise who are willing to share these with others.
Rather than using free-text fields to describe these, the package's domain model provides
predefined attributes in order to characterize the person. Here the package distinguishes
between the KDD application (e.g. credit scoring for new customers) a person is involved in,
its regional setting, and the KDD methods and techniques (e.g. regression techniques) he is
expert in. Of further interest in the context of KDD are programming language or product
skills, given by a fixed taxonomy.

In order to substitute our department's personnel register we also collect traditional
individual and person information in free-text fields.

8 Conclusion

In this paper we introduced our approach for managing experiences in KDD projects. It is based
on the experience factory organization and the approach of case based reasoning. We
investigated how the CBR based experience factory approach covers the different aspects of
knowledge management, represented by the approach of Probst et al. It showed that case based
reasoning and the experience factory approach complement each other on the technical and
organizational level for our needs. We then introduced our realization of the approach in the
domain of knowledge discovery in databases. We described our solution for the experience base,
called Core-DM, which is based on a combination of structural and textual CBR techniques.

In the next steps we plan to evaluate the system Core-DM. We will derive quantitative and
qualitative measures in order to assess aspects like quality of the experience base, economic
utility, usability and technical performance. These can then be aggregated to measure the
overall success of our knowledge management initiative. A further topic of interest is tightly
coupled to this evaluation step: the maintenance of the experience base and its packages has to
be investigated.

References

[ABT97] Althoff, K.-D./ Birk, A./ Tautz, C.: The Experience Factory Approach: Realizing
  Learning from Experience in Software Development Organizations. Proceedings of the Tenth
  German Workshop on Machine Learning, University of Karlsruhe, 6-8 August, 1997. IESE-Report
  No. 013.97/E. 1997.
  http://www.iese.fhg.de/pdf_files/iese-013_97.pdf

[ANT98] Althoff, K.-D./ Nick, M./ Tautz, C.: CBR-PEB: Implementing Reuse Concepts of the
  Experience Factory for the Transfer of CBR System Know-How. Proceedings of the 7th Workshop
  on Case-Based Reasoning. IESE-Report No. 058.98/E. 1998.
  http://www.iese.fhg.de/pdf_files/iese-058_98.pdf

[AP94] Aamodt, A./ Plaza, E.: Case-based reasoning: Foundational issues, methodological
  variations, and system approaches. AI-Communications, 7(1), 39-59, 1994.

[BT98] Birk, A./ Tautz, C.: Knowledge Management of Software Engineering Lessons Learned.
  Technical Report. IESE-Report No. 002.98/E. 1998.
  http://www.iese.fhg.de/pdf_files/iese-002_98.pdf

[BCR94] Basili, V.R./ Caldiera, G./ Rombach, H.D.: Experience Factory. In: J. Marciniak,
  editor, Encyclopedia of Software Engineering, vol. 1, John Wiley and Sons, 1994.

[BAR99] Bartlmae, K.: A CBR based Experience Factory for Data Mining, in: Proceedings of the
  International Computer Science Conference: Internet Applications (ICSC'99), Lecture Notes in
  Computer Science, Springer-Verlag, New York, 1999.

[BL00] Bartlmae, K./ Lanquillon, C.: A KDD Experience Factory: Using Textual CBR for Reusing
  Lessons Learned, in: Proceedings of the DEXA2000 (Database and Expertsystems Application
  Conference), London, Lecture Notes in Computer Science, Springer-Verlag, New York, 2000.

[HEI98b] Heisig, P.: Projektmanagement und Wissensmanagement. Wissenstransfer noch kein Thema.
  In: IT Management 7/1998. In German.

[HOU99] Houdek, F.: Empirisch basierte Qualitätsverbesserung. Systematischer Einsatz externer
  Experimente im Software Engineering. Dissertation. Logos-Verlag. Berlin. 1999. In German.

[KPMG98] KPMG Management Consulting, Parlby, D.: Knowledge Management. Research Report 1998.
  http://www.kpmg.co.uk/kpmg/uk/services/manage/research/knowmgmt/knowmgmt.pdf

[PRR99] Probst, G./ Raub, S./ Romhardt, K.: Wissen Managen. Wie Unternehmen ihre wertvollste
  Ressource optimal nutzen. Gabler. Wiesbaden. 1999. In German.

[REI98] Reinartz, T. et al.: The Current CRISP-DM Process Model for Data Mining. In:
  Maschinelles Lernen. pp. 1-9. 1998. In German.

[RIC98] Richter, M.: Introduction (to CBR). In: Lenz, M./ Bartsch-Spörl, B./ Burkhard, H.-D./
  Wess, S.: Case Based Reasoning Technology. From Foundations to Applications. Lecture Notes in
  Artificial Intelligence. Springer Verlag, New York. 1998.

[WES96] Wess, S.: Fallbasiertes Problemlösen in wissensbasierten Systemen zur
  Entscheidungsunterstützung und Diagnostik: Grundlagen, Systeme und Anwendungen. Dissertation.
  Dissertationen zur künstlichen Intelligenz. Bd. 126. Infix Verlag. Sankt Augustin. 1995.
  In German.



