Mapping MOF-based requirements
 representations to ontologies for software reuse

 Katharina Wolter¹, Michal Śmialek², Lothar Hotz¹, Stephanie Knab¹, Jacek
                     Bojarski², and Wiktor Nowakowski²
                 ¹ HITeC e.V., University of Hamburg, Germany
                (kwolter,hotz,knab)@informatik.uni-hamburg.de
                 ² Warsaw University of Technology, Poland
                  (smialek,bojarsj1,nowakoww)@iem.pw.edu.pl


      Abstract. In this paper, we combine MOF-based software representa-
      tions and description logic-based mechanisms for facilitating software
      reuse. All software representations (requirements specifications, design
      models, code) arising from one project are combined in a software case
      and stored in a repository for later retrieval. For reuse purposes, we use
      requirements as search indexes. We map metamodel-based requirements
      specifications to an ontology and use a Description Logic reasoner for
      classification. This makes implicit taxonomical relations explicit. The in-
      ferred taxonomical hierarchy is then used to compute the taxonomical
      similarity between the current (initial) requirements and those in the
      repository. Doing so, we retrieve software cases with a high reuse poten-
      tial. This approach has already been validated in an industrial context.


1   Introduction
The underlying goal of our work is to enable reuse of model-based software devel-
opment artifacts. For this task, we combine all artifacts that arise in the course
of a software development project in a “software case”. A software case is a set
of interlinked models and code written in textual and graphical languages that
have their syntax defined formally (normally with a metamodel or a context-free
grammar). The contents of a software case are manifested through the contained
requirements model, which is specified using a precise and semantic-oriented
requirements language. In this language, the meaning of words is defined by linking
them to WordNet [1]. The requirements model is mapped to other models (us-
ing model-based transformations) so that all syntactic elements of these models
have their source in at least one element of the requirements model. All software
cases that have been developed are stored in a repository for later reuse.
    The retrieval of similar software cases from the repository is a key prerequi-
site for this reuse approach. In order to retrieve such similar software cases we
compare requirements models. This is based on the assumption that cases with
similar requirements models can have similar design models, too. The retrieval
uses a combination of different similarity measures [2], [3]. In this paper, we
focus on the ontology-based similarity measure for software cases.
2       Wolter K., et. al

    Such an ontology-based similarity measure requires the representation of all
software cases (more precisely their requirements models) in an ontology. One
resulting advantage of representing them in an ontology is the availability of
reasoning techniques provided by Description Logic (DL) reasoners like Pellet or
Racer3 . One of these reasoning techniques (namely classification) supports our
similarity measure. The classification makes implicit relations between elements
in the ontology explicit by changing the taxonomical hierarchy. These newly
inferred relations can then be used by our ontology-based similarity measure.
    The rest of this paper is structured as follows: Section 2 sketches the facilities
Description Logics supply and elaborates on main questions that have to be
answered when Description Logics are applied in a certain domain. In Section 3,
we introduce the general reuse approach in which our ontology-based similarity
measure is embedded and we describe the MOF-based requirements specification
language (RSL). In order to realize the similarity measure we had to construct
an ontology containing all requirements models, the related metamodel, and the
linked elements from WordNet (see Section 4 for details). The use of classification
as a reasoning technique for supporting software retrieval is explained in Section
5. Our ontology-based similarity measure uses this inferred taxonomic hierarchy
(Section 6). Section 7 describes the validation experiments as well as first results.
Finally, the approach is discussed (Section 8) and a summary concludes the
paper.

2     DL Modeling and Reasoning Facilities
For our current task (ontology-based similarity measure), we need to select from
a diverse range of modeling and reasoning facilities of Description Logic. It
provides facilities to define concepts and roles (sometimes called properties)
between two concepts. A concept (given with a unique name) indicates the set
of individuals that belongs to it, and a role indicates a relationship between
concepts. Concepts and roles are combined via constructors to so-called descrip-
tions. Typical constructors are those of ALCHIF, i.e. Attributive Language
with universal restrictions, existential qualification, concept intersection (ALC),
role hierarchy (H), inverse roles (I), and functional properties (F). With these
constructors, one can define a specialization relation between concepts, which
provides superconcepts and subconcepts in a taxonomy, i.e. general or specific
concepts. Furthermore, a Description Logic allows one to distinguish between
primitive (i.e. necessary) and defined (i.e. necessary and sufficient) concepts,
indicated with implies and equivalent, respectively. If a combination of roles
is sufficient for a concept, an individual that fulfills these roles is a member
of the set the concept describes. Conversely, if an individual belongs to a
concept, it fulfills the necessary roles of that concept.
    Given such modeling facilities, DL reasoners provide reasoning techniques.
Examples are (see www.w3.org/Submission/owl11-tractable/ or [4]):
  – Concept Subsumption (classification): A concept A subsumes B, when the
     description of A is more general than the description of B.
3
    clarkparsia.com/pellet, www.racer-systems.com
                           Mapping MOF-based requirements to ontologies         3

 – Instance Checking: Given an individual a and a concept A, verify whether a
   is an instance of A.
 – Ontology Consistency: Check whether a given ontology is not contradictory.
 – Concept Satisfiability: Given a concept A, verify whether the description of
   A yields a non-empty set of individuals.
 – Conjunctive Query Answering: Given an ontology O and a conjunctive query
   q, return the answers of the query with respect to O.
    Generally, when ontologies are to be applied effectively to a task, such as
the ontology-based similarity computation here, four design decisions have to be made:
1. What reasoning techniques should be used for supporting the solution of the
   task? In our case, classification is used (see Section 5 for details).
2. What entities should be mapped to the ontology? In our case, what elements
   of a software case are relevant for realizing the reasoning technique? Section
   3 introduces the main elements of a software case and Section 4 explains the
   mapping to the ontology.
3. Which concepts should be modeled as defined concepts and which as primitive
   (necessary only) concepts? Defined concepts in particular enable reasoning
   tasks (Section 4).
4. Which logical language should be used for solving the task (see Section 4 for
   details)? The answer follows from the previous decisions, which affect the
   expressiveness of the language and the computational efficiency.


3   Model-driven Software Cases
Figure 1 shows an example of a software case. Elements of the requirements
model within the software case (R) are interlinked with appropriate elements of
the design model (D), which, in turn, are reflected in code (C). Design and code
form a solution to the software problem manifested by the requirements. The
initial requirements for a new problem (Q) can be compared with the require-
ments models of already existing software cases. Thus, the requirement models
serve as indexes to the design models which facilitates searching and retrieval
of past solutions that can be applied to the current problem. For the indexing
mechanism to work properly we have designed a special-purpose Requirements
Specification Language (RSL). Its syntax contains detailed constructs which al-
low for automatic comparison and transformation to the design models.
    RSL enables two scenarios of software development: constructing and storing
a software case, and retrieving a software case. When starting a new software
project, initially we create its requirements model (see R21 in Figure 1). The
model contains the description of the application logic with a separate descrip-
tion of the problem domain. Such a complete requirements model written in
controlled language, is automatically transformed into the design model (see
D21 ). This model can be manually updated to conform to certain non-functional
constraints. Then, the code (see C21 ) is generated, which includes also some of
the contents of the methods (the application logic). Finally, all these artifacts,
together with the mappings generated between them are stored in a software
case repository.


    [Figure 1: a query Q1 and a software case, both linked to the WordNet
    terminology. The software case consists of the requirements model R21, the
    design model D21 (UML interfaces such as
    CourseManagement::IShowExtendedCourseList, CourseManagement::IAddNewCourse,
    UserInterface::UI, CoursesServices::ICourse and CoursesServices::ICourseList,
    together with components and a sequence diagram) and the code C21 (the
    interfaces ICourse in package coursesServices and IAddNewCourse in package
    courseManagement).]

    Fig. 1. Example of an elementary software case (R21 ,D21 ,C21 ) with a query (Q1 )

    After some time, when we start a new project, we write an initial
requirements model, which is used to build a query (see Q1) to search for past solutions
that can be reused. The query engine compares the query with the requirements
models of past software cases. The most similar software cases or their parts are
picked by the team and copied to the current workspace for merging with the
current project. After merging, the newly created software case can be stored in
the repository for further reuse.
    Figure 1 also illustrates some details of RSL. It shows three use cases (see boxes with
tabs within Q1 and R21 ) which are the basic units of functionality in RSL. Use
cases have their detailed representations in the form of textual scenarios (graph-
ical versions also can be used). Every such scenario is a sequence of numbered


    [Figure 2: class diagram in three frames. Requirements: Requirement, UseCase,
    RequirementRepresentation, ConstrainedLanguageScenario (+scenarios, 0..*),
    SVOSentence (+scenarioSteps, 1..*). Project specific vocabulary: the
    PhraseHyperlinks Subject (+subject) and Predicate (+verbWithObjects), Phrase,
    VerbPhrase, SimpleVerbPhrase, ComplexVerbPhrase. Terminology: TermHyperlinks
    (+linkedTerm, 0..* to 1) connecting phrases to the Terms Determiner, Noun,
    Modifier, Preposition and Verb.]

                                                          Fig. 2. Fragment of the RSL metamodel

sentences in the SVO grammar – consisting of a subject (S), a verb (V) and one
or two objects (O), e.g. System shows course list. Actually, scenario sentences are
not pure text but they are built of hyperlinks pointing to notions in a domain
vocabulary (e.g. course list). These notions can have their definitions specific
to the system’s problem domain. Notions can also contain domain statements
which are verb phrases associated with the given notion (e.g. show course list).
An example domain vocabulary is shown in Figure 1 to the left of the scenarios.
In order to make requirements models comparable for reuse purposes, we have
to additionally link all elements of the domain vocabulary to one global
terminology, such as WordNet [1], which contains the meanings of words in a general
context. This makes it possible to calculate a semantic-oriented similarity of
different requirements models, as will be shown in the following sections. In
this context it can
be noted that RSL models can both form parts of software cases (R21 ) and serve
as queries (Q1 ) for software case retrieval.
    To facilitate automatic transformations, the RSL syntax is defined through
a metamodel in MOF. This was a natural choice, since it facilitates constructing
transformations to design models written in the MOF-based UML. It also al-
lows for mapping RSL to an ontology. For transformations we use the MOLA
transformation language (see [5]).
    A relevant fragment of the RSL metamodel expressed in MOF is presented in
Figure 2. The metamodel is composed of three parts, divided with metaclasses
constituting hyperlink relations. The first part is the requirements part (left frame).
In RSL, requirements can be modeled as general requirements written in natural
language (not included in Figure 2) or as use cases with scenarios defined in
constrained language. Scenarios (ConstrainedLanguageScenario4 metaclass) are
composed of SVOSentences. Every such sentence is composed of a Subject and
a Predicate which are in fact hyperlinks to definitions of the proper phrases in a
4
    With this font we will denote elements of the RSL metamodel.

vocabulary (center frame). The Subject points to a Phrase containing one Noun
and optionally one Modifier and one Determiner. The Predicate points to one of
the two VerbPhrase subclasses. It contains the VO(O) part of the sentence. If the
sentence contains only the direct object (SVO sentence), the predicate points to a
SimpleVerbPhrase. This subtype of VerbPhrase contains a Noun (inherited from
the Phrase) and an additional Verb added before the Noun (e.g. “add user”).
Phrases containing both the direct and indirect object (VOO) are stored in
a ComplexVerbPhrase. This subtype of the VerbPhrase links an instance of a
SimpleVerbPhrase (the VO part of the phrase) followed by a Preposition and an
indirect object (the Noun inherited from the Phrase, e.g. “add user to user list”).
In RSL, the vocabulary is project specific and contains definitions of notions
and phrases that are characteristic of the current problem domain. All the words used
in the project-specific phrases are summarized in a terminology (see the right
frame). The TermHyperlinks connect the Phrases and the Terms. Each Term links to a
WordNet element (by referring to its ID) in order to define the term’s meaning
in a general context. WordNet is a semantic lexicon which currently consists
of more than 200,000 synonyms grouped in more than 100,000 synonym sets
(called synsets, see [4]). Semantic relations (like is-a and has-parts) connect
these synsets.
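
   The structure described above can be illustrated with a small data model. The
following Python fragment is only a sketch under simplifying assumptions: the
hyperlink metaclasses are left out, composition replaces the inheritance from
Phrase to the verb phrases, and the synset identifiers are hypothetical
placeholders rather than real WordNet IDs. It shows how every SVO sentence can
be traced down to the terms (and thus synsets) it uses.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Term:
    """A word of the terminology, linked to WordNet by a synset id."""
    word: str
    synset_id: str  # hypothetical id format, for illustration only


@dataclass(frozen=True)
class Phrase:
    """Vocabulary phrase: one noun, optional modifier and determiner."""
    noun: Term
    modifier: Optional[Term] = None
    determiner: Optional[Term] = None


@dataclass(frozen=True)
class SimpleVerbPhrase:
    """VO part of a sentence: a verb placed before the noun phrase."""
    verb: Term
    phrase: Phrase


@dataclass(frozen=True)
class ComplexVerbPhrase:
    """VOO part: a simple verb phrase, a preposition, an indirect object."""
    simple: SimpleVerbPhrase
    preposition: Term
    indirect_object: Phrase


@dataclass(frozen=True)
class SVOSentence:
    subject: Phrase
    predicate: Union[SimpleVerbPhrase, ComplexVerbPhrase]


def phrase_terms(phrase: Phrase) -> List[Term]:
    """Terms of a phrase in reading order, skipping absent optional parts."""
    parts = (phrase.determiner, phrase.modifier, phrase.noun)
    return [t for t in parts if t is not None]


def sentence_terms(sentence: SVOSentence) -> List[Term]:
    """All terms reachable from a sentence, i.e. its links into WordNet."""
    terms = phrase_terms(sentence.subject)
    pred = sentence.predicate
    if isinstance(pred, ComplexVerbPhrase):
        terms += [pred.simple.verb] + phrase_terms(pred.simple.phrase)
        terms += [pred.preposition] + phrase_terms(pred.indirect_object)
    else:
        terms += [pred.verb] + phrase_terms(pred.phrase)
    return terms


# "Guest chooses equipment", with made-up synset ids.
sentence = SVOSentence(
    subject=Phrase(Term("guest", "guest-n-1")),
    predicate=SimpleVerbPhrase(
        Term("choose", "choose-v-1"),
        Phrase(Term("equipment", "equipment-n-1")),
    ),
)
```

Calling sentence_terms(sentence) yields the three terms of the sentence in
reading order; their synset ids are the entry points into the global terminology.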

   This three-level language infrastructure enables clear separation between the
behavior specification (use case scenarios’ steps) and the description of the do-
main (vocabulary). The third level – the terminology – defines the meaning of
words in a global context and thus gives the possibility to perform meaningful
comparison of different requirements models.

   For expressing design models within software cases, we use a subset of UML
elements defined through a profile. A design model is generated automatically
as a result of a transformation that takes a requirements model in RSL as its
source. The transformation procedure adds to the information contained in re-
quirements all the details of the technical solution: the architectural style, design
patterns, etc. In the example we present the results of a transformation from re-
quirements R21 to design D21 . The example uses a 3-tier architectural model
as the transformation target. Use cases are transformed into interfaces of the
application logic layer and domain elements are transformed into interfaces of
the business logic layer and also into data access objects. Sequences of scenario
sentences can be transformed into sequence diagrams on the architectural level.
For more details on the transformation rules see [6]. During the transformation,
the mapping links between elements of the requirements model and the design
model are created. By following these links, the design elements and the resulting
code (see C21 ) can be retrieved for reuse.

    Validation efforts conducted in an industrial context showed that, despite
its very detailed metamodel, the language is readable and understandable for
its users (see [7]). The full specification of the RSL can be found in [8].

4     Mapping the MOF-based Models to an Ontology
In order to use the classification facilities provided by DL reasoners we had to
represent the RSL metamodel, the requirements models and the linked Word-
Net elements in an ontology using ontology languages like OWL [9] or KRSS
[10]. The goal was to represent the requirements models in a way such that
similar requirements have short taxonomical distances after classification. This
representation is explained in the following, starting with the RSL metamodel.

4.1   Mapping the RSL Metamodel
The RSL metamodel consists of 127 classes, sub- and superclasses, and many
associations between them. These classes and associations are represented in
the ontology as follows: classes of the metamodel are represented with concepts,
generalization relations are represented with specialization relations, and asso-
ciations are represented with roles. Furthermore, the associations of a class in
the metamodel define the class, i.e. if an object with such relations exists, then
it belongs to that class and if an object belongs to a class it has the relations
of that class (see Figure 3). The classes of the RSL metamodel are furthermore
disjoint, because every element of a requirements model is an instance of exactly
one class of the metamodel.
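
    A minimal sketch of these mapping rules is given below in Python. It is an
illustration only, not the implementation used in this work: the MetaClass
record and the to_axiom function are hypothetical names, the metamodel excerpt
is reduced to a few classes, the role names are taken from Fig. 6, and the
output merely imitates the KRSS-like notation of that figure. Disjointness
axioms are left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MetaClass:
    """Hypothetical, simplified description of one metamodel class."""
    name: str
    superclass: Optional[str] = None
    # Each association is a (role name, target class) pair.
    associations: Tuple[Tuple[str, str], ...] = ()


def to_axiom(mc: MetaClass) -> Optional[str]:
    """Map a metamodel class to one axiom in KRSS-like notation.

    Generalization becomes a specialization (implies); associations become
    existential role restrictions that define the concept (equivalent).
    """
    restrictions = [f"(some {role} {target})" for role, target in mc.associations]
    if not restrictions:
        if mc.superclass is None:
            return None  # a root class without associations needs no axiom
        return f"(implies {mc.name} {mc.superclass})"
    conjuncts = ([mc.superclass] if mc.superclass else []) + restrictions
    body = conjuncts[0] if len(conjuncts) == 1 else "(and " + " ".join(conjuncts) + ")"
    return f"(equivalent {mc.name} {body})"


# Illustrative excerpt only; the real metamodel has 127 classes.
excerpt = [
    MetaClass("SVOSentence", None, (("subject", "Subject"), ("predicate", "Predicate"))),
    MetaClass("VerbPhrase", "Phrase"),
    MetaClass("SimpleVerbPhrase", "VerbPhrase", (("verb", "PhraseVerbLink"),)),
]

for metaclass in excerpt:
    print(to_axiom(metaclass))
```

Only the direction from a class to its association targets is emitted, so the
sketch never needs inverse roles.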
    Although navigation across associations is bidirectional in the RSL meta-
model, we do not use inverse roles in the ontology since the similarity com-
putation only requires roles directing down from the concept representing the
requirements model to the synsets of WordNet. Likewise, we avoided a role hier-
archy because only a relatively flat hierarchy of associations is given in the RSL
metamodel. Finally, functional properties are not needed for defining concepts
because they relate a class to a primitive type, not to another concept. Thus, we
can use the DL language ALC instead of the more complex language ALCHIF.

4.2   Mapping Requirements Models
The requirements models of specific software cases are represented through in-
stances of classes of the RSL metamodel. This is illustrated in Figure 4. The
RSL metamodel can be found on M2 while the specific requirements models
are located on M1 (in the context of the MDA metamodeling layers). Thus, the
SVO sentences defined in a particular requirements model are instances of the
class SVOSentence in the RSL metamodel. Please note that we represent these
instances with further concepts in the ontology. They are mapped to subcon-
cepts of the concepts representing the classes of the RSL metamodel. This is


Fig. 3. Definition of an upper model concept representing the RSL class SVOSentence.


    [Figure 4: metamodeling layers M3 (EMOF), M2 (SDSL metamodel, RSL metamodel)
    and M1 (Requirements Models C1 to Cn), together with WordNet; a marked region
    labeled "Represented in ontology".]


    Fig. 4. Requirements models and their metamodel are represented in the ontology.


illustrated in Figure 5 which shows a part of an ontology, which we created
manually for illustration purposes. Concepts for SVO sentences are coded by a
software case id starting with C and a readable string reflecting the text of the
sentence (e.g. SVOSentenceC2GuestChoosesEquipment5 ). The concept SVOSentence
is the superconcept of four SVO sentences contained in distinct requirements
models.
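
    The naming scheme can be sketched in a few lines of Python. The function
name concept_name and the exact normalization (dropping punctuation and
capitalizing each word) are assumptions made for illustration; only the
resulting pattern, metaclass name followed by C, the case id and the sentence
text, is taken from the example above.

```python
import re


def concept_name(metaclass: str, case_id: int, text: str) -> str:
    """Build a readable concept name for a requirements model element."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    # Capitalize the first letter of each word and keep the rest unchanged.
    readable = "".join(word[:1].upper() + word[1:] for word in words)
    return f"{metaclass}C{case_id}{readable}"


print(concept_name("SVOSentence", 2, "Guest chooses equipment."))
```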
    As a result, the concepts representing the classes of the RSL metamodel form
the upper part of our ontology. Thus, the part of the ontology representing the
model from the M2 level is called the upper model of the ontology. The concepts
representing the elements of specific requirements models from the M1 level are
modeled as specializations of these concepts and form the lower part of our
ontology.
    We do not use individuals for representing the elements of particular specifi-
cations because taxonomical relationships cannot be computed between individ-
uals. In our approach this is the key to infer taxonomical relationships between
the elements of requirements models, because the similarity is based on taxo-
nomical relations.
    Thus, each class of a specific requirements model is mapped to a concept
with a unique id as its name and the corresponding upper model concept as its
superconcept. The concepts representing the elements of requirements models
5
     With this font we will denote elements defined in the ontology.


    [Figure 5: fragment of the ontology; annotations mark the element from the M2
    level and the elements from the M1 level.]


Fig. 5. Concepts in the ontology representing classes from the M2 and the M1 level.

      (equivalent SVOSentenceC2GuestChoosesEquipment
         (and SVOSentence
              (some subject SubjectC2Guest)
              (some predicate PredicateC2ChoosesEquipment)))
      (equivalent PredicateC2ChoosesEquipment
         (and Predicate
              (some verbPhrase SimpleVerbPhraseC2Choose)))
      (equivalent SimpleVerbPhraseC2Choose
         (and SimpleVerbPhrase
              (some verb PhraseVerbLinkC2Choose)
              (some object NounPhraseC2Equipment)))
      (equivalent PhraseVerbLinkC2Choose
         (and PhraseVerbLink
              (some linkedVerb VerbC2Choose)))
      (equivalent VerbC2Choose
         (and Verb
              (some termLinksToWordnetEntry ChooseSelectSynset)))
      (equivalent NounPhraseC2Equipment
         (and NounPhrase
              (some noun NounLinkC2Equipment)))
      (equivalent NounLinkC2Equipment
         (and NounLink
              (some linkedNoun NounC2Equipment)))
      (equivalent NounC2Equipment
         (and Noun
              (some termLinksToWordnetEntry EquipmentSynset)))

  Fig. 6. Parts of the definition of the SVO sentence “Guest chooses equipment”.


are related through roles defined in the upper model. Furthermore, these roles
define the requirements model concepts (see Fig. 6). Thus, not only necessary
relations (i.e. subclass) but necessary and sufficient relations are used for relating
software cases. This modeling enables the reasoning techniques described in
Section 5.
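
The construction of such a defined concept can be sketched in a few lines. This is only an illustration in Python, not the Java converter mentioned in Section 7; the function name `define_concept` and the shape of its input are assumptions, while the concept and role names are taken from Fig. 6.

```python
def define_concept(name, upper_concept, role_fillers):
    """Build a KRSS-style definition: `name` is equivalent to the upper
    model concept plus one existential restriction per (role, filler)."""
    lines = [f"(equivalent {name}", f"   (and {upper_concept}"]
    lines += [f"        (some {role} {filler})" for role, filler in role_fillers]
    lines[-1] += "))"  # close the `and` and the `equivalent` form
    return "\n".join(lines)


# The SVO sentence from Fig. 6, defined through its two roles.
sentence = define_concept(
    "SVOSentenceC2GuestChoosesEquipment",
    "SVOSentence",
    [("subject", "SubjectC2Guest"),
     ("predicate", "PredicateC2ChoosesEquipment")],
)
print(sentence)
```

Because `equivalent` is used instead of `implies`, the role restrictions are necessary and sufficient conditions, which is what allows a reasoner to place the concept in the hierarchy later on.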
    This general mapping can be used to convert all elements of a requirements
model to our ontology. However, this is not necessary, since RSL requirements
models contain some elements that are not relevant for our ontology-based similarity
measure, such as all TermHyperlinks in Figure 2 (see [11] for details). Mapping these
elements would cost a significant amount of computing time and would not improve
the similarity measure. Thus, we only map RSL elements that are relevant
for our similarity calculation.

4.3     Mapping WordNet Elements
The WordNet metamodel consists of several classes like Synset, Wordform, Gloss
etc. NounSynsets are arranged in a taxonomy via the relations hyponym and hypernym.
Classification only requires this taxonomical relation between synsets and
the fact that synsets are distinct. Thus, we only map synsets from WordNet.
They are represented as concepts, and the synset taxonomy is represented
by the subconcept relation.
    The role termLinksToWordNetEntry relates each term of a requirements model
to a synset (see Fig. 6). However, it is not necessary to map all synsets of WordNet
(i.e. almost 117,000) into the ontology. For our purposes, it is only necessary
to map the synsets that are linked by a term and all the predecessors of these
synsets in the synset taxonomy. Synsets have only taxonomical relations between

        (implies ChooseSelectSynset VerbSynset)
        (implies FixSynset VerbSynset)
        (implies PersonSynset NounSynset)
        (implies ComputerUserSynset PersonSynset)
        (implies System_AdministratorSynset ComputerUserSynset)
        (implies UserSynset PersonSynset)
        (implies ConsumerSynset UserSynset)
        (implies CustomerClientSynset ConsumerSynset)
        (implies GuestSynset CustomerClientSynset)
        (disjoint ComputerUserSynset UserSynset)
        (disjoint ChooseSelectSynset FixSynset)

                                              Fig. 7. Mapping synsets into concepts.


themselves (i.e. no further roles). Thus, the taxonomical relations are necessary
conditions, i.e. implies is used for describing them. Furthermore, synsets that
are siblings are disjoint from each other (see Fig. 7).
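
The selection of synsets can be illustrated with a small Python sketch. It assumes, for simplicity, that every synset has at most one direct hypernym (WordNet itself allows several); the function names, the dictionary layout and the entry `UnlinkedSynset` are hypothetical, while the remaining synset names come from Fig. 7.

```python
def ancestors(synset, hypernym_of):
    """All predecessors of a synset, following single hypernym links."""
    chain = []
    while synset in hypernym_of:
        synset = hypernym_of[synset]
        chain.append(synset)
    return chain


def map_synsets(linked, hypernym_of):
    """Emit axioms only for linked synsets and their predecessors."""
    needed = set(linked)
    for synset in linked:
        needed.update(ancestors(synset, hypernym_of))
    ordered = sorted(s for s in needed if s in hypernym_of)
    axioms = [f"(implies {s} {hypernym_of[s]})" for s in ordered]
    # Siblings, i.e. synsets with the same direct hypernym, are disjoint.
    children = {}
    for s in ordered:
        children.setdefault(hypernym_of[s], []).append(s)
    for siblings in children.values():
        for i, first in enumerate(siblings):
            for second in siblings[i + 1:]:
                axioms.append(f"(disjoint {first} {second})")
    return axioms


hypernym_of = {
    "GuestSynset": "CustomerClientSynset",
    "CustomerClientSynset": "ConsumerSynset",
    "ConsumerSynset": "UserSynset",
    "UserSynset": "PersonSynset",
    "System_AdministratorSynset": "ComputerUserSynset",
    "ComputerUserSynset": "PersonSynset",
    "PersonSynset": "NounSynset",
    "UnlinkedSynset": "PersonSynset",  # hypothetical: never linked by a term
}
axioms = map_synsets(["GuestSynset", "System_AdministratorSynset"], hypernym_of)
print("\n".join(axioms))
```

Only the two linked synsets and their predecessors produce axioms; the unlinked synset is skipped, which is the reduction described above.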

5        Classifying the Ontology
In this section, we show what can be gained by classifying the constructed on-
tology. Using the mapping described in the previous section we can represent
requirements models with concepts and roles in an ontology. However, the re-
quirements models do not contain information that directly relates the diverse
concepts of distinct requirements models. As a result, all concepts are direct subconcepts
of upper model concepts. In Figure 5, for example, all SVO sentences are direct
subconcepts of the upper model concept SVOSentence. In this initial ontology, only
the synset taxonomy provides a hierarchical structure (see Fig. 8, left side).
    Figure 8 shows the SVO sentences from Figure 5 plus some additional concepts but
after classification. The synset taxonomy provided by WordNet and the fact that the


[Box in Fig. 8. Pre-classification: Distance (c1, c2) = 2, Tax-Sim (c1, c2) = 0.644.
Post-classification: Distance (c1, c2) = 1, Tax-Sim (c1, c2) = 0.761.]


Fig. 8. Left side: synset taxonomy mapped from WordNet. Right side: cutout of in-
ferred hierarchy: classification results are emphasized in gray and blue font.

concepts representing a requirements model are defined concepts, and thus strongly
related to each other, allow for the classification described in the following.
     Our mapping of the RSL metamodel and requirements models enables a DL reasoner
to use the synset taxonomy in order to classify the concepts of the requirements
models, i.e. to compute which concepts are equivalent (Fig. 8, upper part in gray,
marked by ≡) and which concepts can be taxonomically structured (Fig. 8, lower part
in blue). Relations of the synset taxonomy are propagated one by one to the subjects
of the requirements models of C1, C2, C3, and C4. Namely, SubjectC3Client and
SubjectC4System_Administrator are subconcepts of the concept SubjectC1Person
because CustomerClientSynset and System_AdministratorSynset, respectively, are
subconcepts of PersonSynset. Thus, after classifying the ontology, the
taxonomy defined in WordNet is also reflected in the concepts representing the specific
requirements models. Even more interesting is the impact on structural concepts like
SVOSentence. In Figure 8, for example, the SVO sentences of cases C2 and C3 are classified
as subconcepts of the SVO sentence of case C1. This is because in
both sentences a specific type of person (either a guest or a client) chooses a
type of equipment. Since the SVO sentence of C4 is about fixing a printer instead of
selecting one, it is not a subconcept of the SVO sentence of case C1. Thus,
classifying the ontology makes these taxonomical relations explicit. Please note that
Figure 8 shows only parts of this information.
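
How the synset taxonomy propagates to defined concepts can be illustrated with a toy structural subsumption check in Python. This is a deliberately simplified sketch, not the algorithm implemented by DL reasoners such as Pellet or Racer: it assumes acyclic definitions of the form "upper concept plus one existential restriction per role", it shortens the role chain of Fig. 6 so that subjects and predicates link to synsets directly, and the names `SVOSentenceC1`, `PredicateC1` etc. are hypothetical shorthand.

```python
# Primitive taxonomy (child -> parent), following Fig. 7.
PARENT = {
    "GuestSynset": "CustomerClientSynset",
    "CustomerClientSynset": "ConsumerSynset",
    "ConsumerSynset": "UserSynset",
    "UserSynset": "PersonSynset",
    "System_AdministratorSynset": "ComputerUserSynset",
    "ComputerUserSynset": "PersonSynset",
    "ChooseSelectSynset": "VerbSynset",
    "FixSynset": "VerbSynset",
}
LINK = "termLinksToWordnetEntry"
# Defined concepts: name -> (upper model concept, {role: filler}).
DEFS = {
    "SubjectC1Person": ("Subject", {LINK: "PersonSynset"}),
    "SubjectC2Guest": ("Subject", {LINK: "GuestSynset"}),
    "SubjectC4System_Administrator": ("Subject", {LINK: "System_AdministratorSynset"}),
    "PredicateC1": ("Predicate", {LINK: "ChooseSelectSynset"}),
    "PredicateC2": ("Predicate", {LINK: "ChooseSelectSynset"}),
    "PredicateC4": ("Predicate", {LINK: "FixSynset"}),
    "SVOSentenceC1": ("SVOSentence", {"subject": "SubjectC1Person", "predicate": "PredicateC1"}),
    "SVOSentenceC2": ("SVOSentence", {"subject": "SubjectC2Guest", "predicate": "PredicateC2"}),
    "SVOSentenceC4": ("SVOSentence", {"subject": "SubjectC4System_Administrator", "predicate": "PredicateC4"}),
}


def subsumes(general, specific):
    """True if `specific` is equal to or a subconcept of `general`."""
    if general == specific:
        return True
    if general in DEFS:
        if specific not in DEFS:
            return False  # a primitive concept has no sufficient conditions
        g_base, g_roles = DEFS[general]
        s_base, s_roles = DEFS[specific]
        # Every restriction of the general concept must be met by the specific one.
        return subsumes(g_base, s_base) and all(
            role in s_roles and subsumes(filler, s_roles[role])
            for role, filler in g_roles.items()
        )
    if specific in DEFS:  # general is primitive: compare with the upper concept
        return subsumes(general, DEFS[specific][0])
    parent = PARENT.get(specific)
    return parent is not None and subsumes(general, parent)


print(subsumes("SVOSentenceC1", "SVOSentenceC2"))  # guest chooses vs. person chooses
print(subsumes("SVOSentenceC1", "SVOSentenceC4"))  # fixing is not choosing
```

In this toy setting `PredicateC1` and `PredicateC2` subsume each other, which corresponds to the equivalences a reasoner reports, while the sentence of C4 stays a direct subconcept of SVOSentence only.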
     This kind of classification takes the structure, i.e. the roles between the concepts,
into account. Through this classification, a different taxonomical distance between the
concepts of requirements models is introduced and will lead to different similarities.
Before classifying the ontology, the taxonomical distance between all SVO sentences
was equal (see Fig. 5). Classification reduces the distance between SVO sentences of
C2 and C1 while it increases the distance between sentences of C2 and C4 (cf.
Fig. 5 and Fig. 8). This effect of classification on distances is sketched in the box in
Figure 8. By using the WordNet link and the provided mapping, our similarity measure
considers the meaning defined in the requirements models.


6    Similarity Computation
Once the ontology has been constructed and classified we use the inferred taxonomical
hierarchy for computing similarity values. For this purpose, a query is compared with
each requirements model contained in the repository where the query is a (potentially
partial) requirements model. The goal of similarity computation is to get a value between
0 and 1 that indicates the similarity between a query and a software case, with 0
denoting no similarity and 1 denoting that the query is completely contained in the
software case.
    Please note that our measure differs from most similarity measures. Typically,
similarity measures positively correlate with the amount of commonality between query
and case and at the same time negatively correlate with the amount of difference (see
[12]). The latter, however, is not wanted for our similarity measure. A query element
that is not found in a software case should of course result in a lower similarity value,
but if a software case contains elements that are not part of the query, this should
not have a negative impact on the similarity value. This is due to the assumption
that additional elements in the software case are candidates for reuse and thus wanted.
Supported by a tool, the user can browse these additional elements and select those
that are valuable in the current software development project while others are omitted.
In order to allow for this, our measure is asymmetric.
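
The asymmetry can be made concrete with a small Python sketch. The aggregation used here (averaging, over all query elements, the best match found in the case) is an assumption made for illustration and is not prescribed by our approach; `exact_match` is a hypothetical stand-in for the concept similarity of Section 6.

```python
def query_to_case_similarity(query, case, concept_sim):
    """Average, over the query elements, of the best match in the case.
    Elements that occur only in the case never lower the result."""
    if not query:
        return 1.0  # an empty query is trivially contained in any case
    if not case:
        return 0.0
    best_matches = [max(concept_sim(q, c) for c in case) for q in query]
    return sum(best_matches) / len(best_matches)


def exact_match(a, b):
    """Stand-in concept similarity: 1 for identical elements, else 0."""
    return 1.0 if a == b else 0.0


small = ["login", "choose equipment"]
large = ["login", "choose equipment", "pay", "print receipt"]

print(query_to_case_similarity(small, large, exact_match))  # query fully contained
print(query_to_case_similarity(large, small, exact_match))  # half of the query missing
```

Swapping the roles of query and case changes the value, which is the intended behavior: surplus elements in the case are candidates for reuse, whereas missing query elements reduce the similarity.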

     The assumption made here is that the taxonomical distance between two concepts
can be the basis of such a similarity measure. Our similarity measure compares pairs of
concepts in the classified taxonomy. This comparison includes both the concept place-
ment in the taxonomical hierarchy (distance-based similarity) and the roles and role
fillers that the concepts define (role-based similarity) (see [13] for a similar approach).
     For computing the distance-based similarity between two concepts c1 and c2, the
distance of both concepts to the least common subsumer (LCS) is computed, both
values are added and a value describing the distance-based similarity is returned:

 1. lcs = leastCommonSubsumer(c1, c2)
 2. distance1 = pathLength(c1, lcs)
 3. distance2 = pathLength(c2, lcs)
 4. distance = distance1 + distance2
 5. return 1 / ln(distance + e)

The algorithm has the following properties: The division ensures values between 0
and 1, because ln(distance + e) is never smaller than 1. The further away the concepts
are from each other, the smaller the similarity value gets. The logarithm lets the
similarity decay more slowly with growing distance than, for example, 1/distance would
(for a distance of 2, about 0.64 instead of 0.5). Finally, the addition of e ensures a similarity value
of 1 for distance 0, i.e. if c1 and c2 are in fact the same concept. The box in Figure
8 illustrates the effect of classification on the distance-based similarity measure. By
classification the distance between c1 and c2 is reduced from 2 to 1. The associated
similarity values of the distance-based measure are also given in the figure.
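
The five steps can be written down directly. The following Python sketch assumes that the classified hierarchy is given as a tree in which every concept has one parent (a classified ontology may in general be a graph with several parents); the helper names and the dictionaries `before` and `after` are hypothetical and mirror the box in Figure 8.

```python
import math


def path_to_root(concept, parent):
    """The concept followed by all of its subsumers, bottom-up."""
    path = [concept]
    while concept in parent:
        concept = parent[concept]
        path.append(concept)
    return path


def distance(c1, c2, parent):
    """Edges from c1 to the least common subsumer plus edges from c2 to it."""
    up1 = path_to_root(c1, parent)
    up2 = path_to_root(c2, parent)
    for steps1, node in enumerate(up1):
        if node in up2:  # first shared subsumer on the way up is the LCS
            return steps1 + up2.index(node)
    raise ValueError("concepts have no common subsumer")


def distance_similarity(c1, c2, parent):
    return 1.0 / math.log(distance(c1, c2, parent) + math.e)


# Before classification both sentences hang directly below SVOSentence;
# afterwards c2 has become a subconcept of c1.
before = {"c1": "SVOSentence", "c2": "SVOSentence"}
after = {"c1": "SVOSentence", "c2": "c1"}

print(distance("c1", "c2", before), distance_similarity("c1", "c2", before))
print(distance("c1", "c2", after), distance_similarity("c1", "c2", after))
```

For the two hierarchies the sketch yields distances of 2 and 1 with similarities of about 0.64 and 0.76, in line with the values given in the figure.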
    The basic idea of role-based similarity is that the similarity of two concepts depends
on the similarity of their subgraphs. When comparing roles, their most important as-
pects are their fillers: concepts or concrete domains that specify the value of a role.
When comparing two concepts, the function of role-based similarity is recursively ap-
plied. The recursion terminates when two concepts without roles are compared; their
similarity is given by the distance-based similarity function. The roles of both concepts
are recursively compared and similarities are summed up for every concept. A value
describing role-based similarity is returned: 1 means both concepts have the same roles,
and the smaller the value, the less their roles have in common.
    Both aspects can be computed independently from each other. Therefore, we de-
fined two algorithms, one computing the distance-based similarity between two con-
cepts and another one comparing the common roles of concepts. The similarity between
two concept definitions is the sum of the distance-based similarity and the role-based
similarity divided by two.
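
One possible reading of this combination is sketched below in Python. Several details are assumptions made for illustration only: role fillers are compared with the combined measure, the summed filler similarities are divided by the number of roles occurring in either concept, and taxonomical distances are looked up in a hypothetical table instead of being computed from a classified ontology.

```python
import math


def combined_similarity(c1, c2, roles, dist):
    """Mean of distance-based and role-based similarity of two concepts."""
    dist_sim = 1.0 / math.log(dist(c1, c2) + math.e)
    r1, r2 = roles.get(c1, {}), roles.get(c2, {})
    if not r1 and not r2:
        return dist_sim  # recursion ends: concepts without roles
    common = set(r1) & set(r2)
    all_roles = set(r1) | set(r2)
    # Compare the fillers of common roles recursively and normalize, so that
    # identical role structures yield a role-based similarity of 1.
    role_sim = sum(
        combined_similarity(r1[r], r2[r], roles, dist) for r in common
    ) / len(all_roles)
    return (dist_sim + role_sim) / 2


# Hypothetical toy data: two sentences that differ only in their subject.
roles = {
    "S1": {"subject": "Person", "predicate": "Choose"},
    "S2": {"subject": "Guest", "predicate": "Choose"},
}
table = {frozenset(("S1", "S2")): 1, frozenset(("Person", "Guest")): 4}


def dist(a, b):
    """Taxonomical distance taken from the toy table; 0 for identical concepts."""
    return 0 if a == b else table[frozenset((a, b))]


value = combined_similarity("S1", "S2", roles, dist)
print(round(value, 3))
```

With these toy distances the two sentences reach a combined similarity of roughly 0.76, and comparing a concept with itself yields 1.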


7    Experiments
During the development of our approach we used small examples for testing and illus-
tration purposes (see previous sections). In order to evaluate the presented similarity
measures we conducted two types of experiments. On the one hand, we produced
several small software cases by varying specific elements, e.g. of a sentence, in order
to analyze the impact of these modifications. On the other hand, we applied the approach
to software cases produced by software development teams in the industrial context
(see Acknowledgements).

Technical Setting. The RSL metamodel was implemented as part of a comprehensive
tool suite [14]. Internally, the tool uses a graph representation of the metamodel (see

[15]). This representation provides the input for the presented ontology construction.
It also serves as the basis for the transformation into design and for storing software cases.
    A Java-based converter maps the RSL metamodel and the requirements models
to an OWL ontology by using the mapping described in this paper. For debugging the
OWL ontology we use Protégé. Protégé also provides interfaces to the DL reasoners
Pellet and Racer. For debugging inference behavior we apply Pellint.6 The similarity
computation is a further Java component developed by us. It traverses the classified
ontology for computing similarity values as presented in Section 6. This similarity measure
is combined with other measures and integrated in the above-mentioned software
development tool.

Experiments. Firstly, we created almost 50 small, artificial requirements models. These
models vary only in particular elements, e.g. one word is added, one word is replaced by a
more specific term, one sentence is added, etc. Comparing the similarity values for these
requirements models we could analyze the impact of particular modifications in detail.
Furthermore, we could show that the ontology-based similarity measure computes the
same similarity ranking for these test cases as considered plausible by us beforehand.
    For applying the approach on a broader scale, 16 industrial software cases have
been created by four software development organizations using the above-mentioned
tool. The application domains are internet banking, investment funds
management, emergency systems, forestry systems, financial contract systems, and
funding systems. The requirements models contained 261 requirements written in RSL
in total, which mapped to more than 30,000 defined concepts. We were able to reduce
the number of defined concepts to about 8,000 by converting only concepts that are
relevant for the ontology-based similarity computation (see Section 4).
    In addition, 8 more partial requirements models were created. These models were
used to query the repository of the above-mentioned cases. The development teams
were asked to judge how well the computed measures agreed with their own understanding
of similarity. This was subject to a UTAUT analysis [16], which included appropriate
questionnaires asking about the similarity measures. The results of the analysis showed
a generally positive attitude and acceptance among the developers. The analysis also yielded
several remarks which allow for additional improvements in the classification approach.
Namely, further work will extend the number of tractable concepts (which is limited
by the capabilities of the DL reasoners used). The currently gathered repository
of software cases provides a good basis for further experiments in this direction.


8     Discussion and Related Work
As can be seen in Figure 4 the constructed ontology crosses the border between the
levels M1 and M2. This is due to the fact that the instance-of relationship between an
element on Level M1 (e.g. a specific SVO sentence) and its class (like the RSL class
SVOSentence) on Level M2 is mapped to a specialization relation between the corresponding
concepts in the ontology. In other words, as introduced in Section 4, the metamodel of RSL is
represented as an upper model in the ontology. By doing so, definitions of the metamodel
can be used within the ontology, i.e. by reasoning techniques. This mapping is slightly
different from those described in [17] where a level-wise mapping is proposed. With
our mapping the detailed metamodel of RSL is used in the ontology for preclassifying
the elements of M1, i.e. the elements of a requirements model. Thus, the M1-elements
6
    protege.stanford.edu, pellet.owldl.com/pellint

are not subconcepts of the root Thing but of specific concepts like SVOSentence or
VerbPhrase. The preclassification has the effect that the taxonomical distance between
different elements is higher than it would have been without preclassification, where all
elements would be subconcepts of Thing.
    Other reasoning techniques like Instance Checking could be used for verifying whether a
given requirements model complies with the RSL metamodel (i.e. model conformance).
However, this is not needed in our case, since the software development tool men-
tioned above already ensures that each requirements model is compliant with the RSL
metamodel.
    Some approaches allow the user to use natural language (e.g. [18]). This results
in the need to parse the text, which can lead to more than one possible model due to
language ambiguities. In contrast, formal languages like Z, which avoid such ambiguities,
are not easy to understand for all stakeholders. RSL combines the advantages
of both alternatives. It is understandable for its users [7] and provides unambiguous
requirements models that can directly be mapped onto ontology elements.
    ORE [19] uses a similar approach to define the meaning of words in a specification;
they are related to concepts of an ontology. However, ORE's support is based on a
domain-specific ontology containing domain-dependent inference rules that need to
be developed and maintained. The same holds for the ontology-driven requirements
analysis described in [20]. In contrast, our approach is independent of domain-specific
inference rules. We only require the mature semantic lexicon WordNet as input for our
approach.

9    Summary
In this paper, we presented our new combination of MOF-based requirements models
and reasoning techniques from Description Logic for supporting the reuse of similar
software cases. We showed how we represent the MOF-based models in an ontology.
A distinctive feature of our approach is that we combine elements from different levels (i.e. M2 and
M1) within one taxonomy in our ontology. Doing so, we are able to infer taxonomical
relations implicitly contained in requirements models. These inferred taxonomical re-
lations are used by our ontology-based similarity measure. Current experiments using
industrial software cases show promising validation results.
Acknowledgments. This work is partially funded by the EU: Requirements-Driven
Software Development System (ReDSeeDS) (contract no. IST-2006-33596 under 6FP).
The project is coordinated by Infovide, Poland with technical lead of Warsaw University
of Technology and with University of Koblenz-Landau, Vienna University of Technol-
ogy, Fraunhofer IESE, University of Latvia, HITeC e.V. c/o University of Hamburg,
Heriot-Watt University, PRO DV, Cybersoft and Algoritmu Sistemos.

References
 1. Fellbaum, C., ed.: WordNet: An Electronic Lexical Database. MIT Press (1998)
 2. Wolter, K., Krebs, T., Hotz, L.: A combined similarity measure for determining
    similarity of model-based and descriptive requirements. In: Proceeding of the Ar-
    tificial Intelligence Techniques in Software Engineering Workshop (AISEW) at the
    ECAI 2008. (2008) 11–15
 3. Bildhauer, D., Horn, T., Ebert, J.: Similarity-driven software reuse. In: Interna-
    tional Workshop on Comparison and Versioning of Software Models (CVSM 2009).
    (2009)

 4. Baader, F., Calvanese, D., McGuinness, D., Nardi, D., Patel-Schneider, P.: The
    Description Logic Handbook. Cambridge University Press (2003)
 5. Kalnins, A., Barzdins, J., Celms, E.: Model transformation language MOLA. Lec-
    ture Notes in Computer Science 3599 (2004) 14–28
 6. Kalnins, A., Kalnina, E., Celms, E., Sostaks, A., Schwarz, H., Ambroziewicz, A.,
    Bojarski, J., Nowakowski, W., Straszak, T., Kavaldjian, S., Falb, J.: Reusable case
    transformation rule specification. Project Deliverable D3.3, ReDSeeDS Project
    (2007) www.redseeds.eu.
 7. Mukasa, K.S., Jedlitschka, A., Graf, C., Klöckner, K., Eisenbarth, M., Steinbach-
    Nordmann, S.: Requirements specification language validation report. Project
    Deliverable D2.5.1, ReDSeeDS Project (2007)
 8. Kaindl, H., Śmialek, M., Svetinovic, D., Ambroziewicz, A., Bojarski, J.,
    Nowakowski, W., Straszak, T., Schwarz, H., Bildhauer, D., Brogan, J.P., Mukasa,
    K.S., Wolter, K., Krebs, T.: Requirements specification language definition. Project
    Deliverable D2.4.1, ReDSeeDS Project (2007) www.redseeds.eu.
 9. W3C: OWL Web Ontology Language Overview. (2004)
    http://www.w3.org/TR/owl-features.
10. Patel-Schneider, P.F.: Description-logic knowledge representation system specifi-
    cation from the KRSS group of the ARPA knowledge sharing effort. Technical
    report, DARPA Knowledge Representation System Specification (KRSS) Group
    of the Knowledge Sharing Initiative (1993)
11. Hotz, L., Wolter, K., Knab, S., Solth, A.: Ontology-based Similarity of Software
    Cases. In: International Conference on Knowledge Engineering and
    Ontology Development. (2009) to appear
12. Borgida, A., Walsh, T.J., Hirsh, H.: Towards measuring similarity in description
    logics. In: International Workshop on Description Logics, Edinburgh,
    Scotland, DL2005 (2005)
13. González-Calero, P.A., Gómez-Albarran, M., Dı́az-Agudo, B.: Applying DLs for
    retrieval in case-based reasoning. In: Proc. of the 1999 International Workshop on
    Description Logics (DL’99). (1999)
14. Śmialek, M., Ambroziewicz, A., Bojarski, J., Nowakowski, W., Straszak, T.: ReD-
    SeeDS: Requirements Driven Software Development System. to be presented at
    ICSR 09 (2009) tool demo.
15. Dahm, P., Widmann, F.: GraLab - Das Graphenlabor. Projektbericht 4.3.0, Uni-
    versity of Koblenz-Landau, Institute for Software Technology (2003)
16. Venkatesh, V., Morris, M.G., Davis, G.B., Davis, F.D.: User acceptance
    of information technology: Toward a unified view. MIS Quarterly 27 (2003)
    425–478
17. Zhao, Y., Pan, J.Z., Parreiras, F.S., Walter, T., Gröner, G., Ren, Y., Jekjantuk,
    N.: Successful Stories and Potential Patterns on Applying Ontologies in Software
    Engineering. Project Deliverable WP3-D6, MOST Project (2008)
18. Kleiner, M., Albert, P., Bezivin, J.: Configuring Models for (Controlled) Languages.
    In: Proceedings of the IJCAI-09 Workshop on Configuration. (2009)
19. Kaiya, H., Saeki, M.: Using domain ontology as domain knowledge for require-
    ments elicitation. In: Proc. of 14th IEEE International Requirements Engineering
    Conference (RE’06), IEEE CS (2006) 189–198
20. Liu, W., He, K.Q., Wang, J., Peng, R.: Heavyweight semantic inducement for
    requirement elicitation and analysis. In: SKG ’07: Proceedings of the Third Inter-
    national Conference on Semantics, Knowledge and Grid, Washington, DC, USA,
    IEEE Computer Society (2007) 206–211