=Paper=
{{Paper
|id=Vol-516/paper-11
|storemode=property
|title=Pattern for Re-engineering a Classification Scheme, which Follows the Path Enumeration Data Model, to a Taxonomy
|pdfUrl=https://ceur-ws.org/Vol-516/pat04.pdf
|volume=Vol-516
|dblpUrl=https://dblp.org/rec/conf/semweb/Villazon-TerrazasSG09a
}}
==Pattern for Re-engineering a Classification Scheme, which Follows the Path Enumeration Data Model, to a Taxonomy==
http://ontologydesignpatterns.org/wiki/Submissions:Classification scheme - path enumeration model - to Taxonomy
Boris Villazón-Terrazas, Mari Carmen Suárez-Figueroa, and Asunción Gómez-Pérez
Ontology Engineering Group, Departamento de Inteligencia Artificial, Facultad de
Informática, Universidad Politécnica de Madrid, Spain
{bvillazon,mcsuarez,asun}@fi.upm.es,
WWW home page: http://www.oeg-upm.net/
1 Introduction
This pattern for re-engineering non-ontological resources (PR-NOR) fits in the
Schema Re-engineering Category proposed by [3]. The pattern defines a procedure
that transforms the classification scheme components into ontology
representational primitives. It comes from the experience of ontology engineers
in developing ontologies using classification schemes in several projects
(SEEMP, NeOn, and Knowledge Web; see footnotes 1 to 3). The pattern is included
in a pool of patterns, which is a key element of our method for re-engineering
non-ontological resources into ontologies [2]. The patterns generate the
ontologies at a conceptualization level, independent of the ontology
implementation language.
2 Pattern
Problem
Re-engineering a classification scheme, which follows the path enumeration model, to design a
taxonomy.
Non-Ontological Resource
A non-ontological resource holds a classification scheme which follows the path
enumeration model. A classification scheme is a rooted tree of concepts, in
which each concept groups entities by some particular degree of similarity.
The semantics of the hierarchical relation between parent and child concepts
may vary depending on the context. The path enumeration data model [1] for
classification schemes takes advantage of the fact that there is one and only
one path from the root to every item in the classification. The path
enumeration model stores that path as a string by concatenating either the
edges or the keys of the classification scheme items in the path.
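As a minimal illustration of the path enumeration model, the parent of an item can be recovered from its stored path by dropping the last segment. This is only a sketch: the "/" separator, the sample keys, and the helper name parent_path are assumptions made for the example and are not taken from any particular resource.

```python
from typing import Optional

SEPARATOR = "/"  # assumed delimiter between the keys in a stored path


def parent_path(path: str, sep: str = SEPARATOR) -> Optional[str]:
    """Return the path of the parent item, or None for an item without a parent."""
    head, found, _ = path.rpartition(sep)
    # When the separator does not occur, the item sits directly under the root.
    return head if found else None


# Hypothetical items, each storing the full path from the root to itself.
paths = ["A", "A/B", "A/B/C", "A/D"]
parents = {p: parent_path(p) for p in paths}
```

Because every item carries its whole path, no join or recursive lookup is needed to find an ancestor; a string operation is enough.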
1 http://www.seemp.org
2 http://www.neon-project.org
3 http://knowledgeweb.semanticweb.org
Applicability
The semantics of the relation between parent and child items is subClassOf.
There is neither multiple inheritance nor cyclic relations.
Ontology Generated
The ontology generated will be based on the taxonomy architectural pattern
(AP-TX-01) [4]. Each category in the classification scheme is mapped to a
class, and the semantics of the relationship between child and parent
categories is mapped to subClassOf relations.
Process - Solution
1. Identify the classification scheme items whose path enumeration values have
   the shortest length, i.e. classification scheme items without parents.
2. For each of the classification scheme items ce_i identified above:
   2.1. Create the corresponding ontology class, C_i.
   2.2. Identify the classification scheme items ce_j that are children of
        ce_i, by using the path enumeration values.
   2.3. For each of the classification scheme items ce_j identified above:
        2.3.1. Create the corresponding ontology class, C_j.
        2.3.2. Set up the subClassOf relation between C_j and C_i.
        2.3.3. Repeat from step 2.2 for ce_j as a new ce_i.
3. If there is more than one classification scheme item ce_i without a parent:
   3.1. Create an ad-hoc class as the root class of the ontology.
   3.2. Set up the subClassOf relation between each C_i class and the root
        class.
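The steps above can be sketched in a few lines of Python. This is an illustrative sketch rather than the wrapper used by the authors. It assumes that items are given as a mapping from path to label, that "/" separates the keys in a path, that labels are unique, and that the parent path of every non-top-level item is present. The names build_taxonomy, root_name, and example_items are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple


def build_taxonomy(
    items: Dict[str, str], sep: str = "/", root_name: str = "Root"
) -> Tuple[Set[str], List[Tuple[str, str]]]:
    """Turn {path: label} items into classes and (subclass, superclass) pairs."""
    # Group item paths by the path of their parent (None for top-level items).
    children: Dict[Optional[str], List[str]] = defaultdict(list)
    for path in sorted(items):
        head, found, _ = path.rpartition(sep)
        children[head if found else None].append(path)

    classes: Set[str] = set()
    sub_class_of: List[Tuple[str, str]] = []

    def visit(path: str, parent: Optional[str]) -> None:
        classes.add(items[path])  # steps 2.1 and 2.3.1: create the class
        if parent is not None:
            # step 2.3.2: relate the child class to its parent class
            sub_class_of.append((items[path], items[parent]))
        for child in children.get(path, []):  # step 2.2: find the children
            visit(child, path)  # step 2.3.3: repeat for the child

    top_level = children.get(None, [])  # step 1: items without parents
    for path in top_level:  # step 2
        visit(path, None)

    if len(top_level) > 1:  # step 3: several parentless items need a common root
        classes.add(root_name)
        for path in top_level:
            sub_class_of.append((items[path], root_name))

    return classes, sub_class_of


# Hypothetical scheme: two top-level items, one of which has two children.
example_items = {"a": "A", "a/b": "B", "a/c": "C", "d": "D"}
classes, sub_class_of = build_taxonomy(example_items)
```

The result is kept as plain sets and pairs, in line with the pattern producing an ontology at the conceptualization level; serializing it to a concrete ontology language is a separate step.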
Example
Suppose that someone wants to build an ontology based on the International Standard
Classification of Occupations (for European Union purposes) ISCO-88 (COM). This classification
scheme follows the path enumeration data model.
Non-Ontological Resource
The International Standard Classification of Occupations (for European Union
purposes), 1988 version: ISCO-88 (COM) published by Eurostat is modelled with
the path enumeration data model. This classification scheme is available at
http://ec.europa.eu/eurostat/ramon/
Ontology Generated
The ontology generated will be based on the taxonomy architectural pattern
(AP-TX-01) [4]. Each category in the classification scheme is mapped to a
class, and the semantics of the relationship between child and parent
categories is mapped to subClassOf relations.
Process - Solution
1. Create the LEGISLATORS, SENIOR OFFICIALS AND MANAGERS class.
   1.1. Create the Legislators and senior officials class, and set up the
        subClassOf relation between the Legislators and senior officials class
        and the LEGISLATORS, SENIOR OFFICIALS AND MANAGERS class.
   1.2. Create the Corporate managers class, and set up the subClassOf relation
        between the Corporate managers class and the LEGISLATORS, SENIOR
        OFFICIALS AND MANAGERS class.
2. Create the PROFESSIONALS class.
3. Create the Occupation class.
4. Set up the subClassOf relation between the LEGISLATORS, SENIOR OFFICIALS AND
   MANAGERS class and the Occupation class.
5. Set up the subClassOf relation between the PROFESSIONALS class and the
   Occupation class.
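The example steps above can be reproduced with a short sketch. The category labels are those named in the example. The numeric codes are an assumption added for illustration, following the ISCO-88 convention in which each extra digit adds one level, so the path is the code itself and no separator is needed. The variable names are hypothetical.

```python
# Labels come from the example; the codes are assumed for illustration and
# act as path enumeration values in which each digit is one key of the path.
isco = {
    "1": "LEGISLATORS, SENIOR OFFICIALS AND MANAGERS",
    "11": "Legislators and senior officials",
    "12": "Corporate managers",
    "2": "PROFESSIONALS",
}
ROOT = "Occupation"  # ad-hoc root class, needed because two items have no parent

sub_class_of = []
for code, label in sorted(isco.items()):
    parent_code = code[:-1]  # dropping the last digit yields the parent's path
    parent_label = isco[parent_code] if parent_code else ROOT
    sub_class_of.append((label, parent_label))
```

Each pair reads as (subclass, superclass). The sketch always attaches parentless items to the root, which matches this example but would need the check from step 3 of the general process if a scheme had a single top-level item.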
Related Resources
This pattern is related to the architectural pattern AP-TX-01 [4] for modelling a taxonomy.
3 Pattern Usage
This pattern was applied to re-engineer ISCO-88(COM), the International
Standard Classification of Occupations (for European Union purposes), available
at http://ec.europa.eu/eurostat/ramon/, into an Occupation Ontology, available
at http://droz.dia.fi.upm.es/hrmontology/, within the context of the SEEMP
project. This standard is a classification scheme which consists of 520
occupations. ISCO-88(COM) is modelled following the path enumeration data
model. Because of the number of occupations in the ISCO-88(COM) standard, it
was not practical to create the ontology manually. Therefore, we created an
ad-hoc wrapper, implemented in Java, that reads the data from the resource
implementation and automatically creates the corresponding elements of the new
ontology following the suggestions given by the pattern.
4 Summary and Future Work
We have presented a pattern for transforming a classification scheme, which
is modelled following the path enumeration data model, into a taxonomy. The
pattern is included in a pool of patterns, which is a key element of our method
for re-engineering non-ontological resources into ontologies [2].
We plan to develop software libraries within a framework that implement the
transformation process suggested by the pattern. Moreover, we will include
external resources to improve the quality of the resultant ontologies. Finally,
we need to calculate how much effort we save by re-engineering classification
schemes using patterns compared with re-engineering classification schemes
without them.
Acknowledgments. This work has been partially supported by the European
Commission projects NeOn (FP6-027595) and SEEMP (FP6-027347), as well as by
an R+D grant from the UPM.
References
1. D. Brandon. Recursive database structures. Journal of Computing Sciences in
   Colleges, 2005.
2. A. García, A. Gómez-Pérez, M. C. Suárez-Figueroa, and B. Villazón-Terrazas. A
   Pattern Based Approach for Re-engineering Non-Ontological Resources into
   Ontologies. In Proceedings of the 3rd Asian Semantic Web Conference
   (ASWC2008). Springer-Verlag, 2008.
3. V. Presutti, A. Gangemi, S. David, G. Aguado de Cea, M. C. Suárez-Figueroa,
   E. Montiel-Ponsoda, and M. Poveda. NeOn Deliverable D2.5.1. A Library of
   Ontology Design Patterns: reusable solutions for collaborative design of
   networked ontologies. In NeOn Project. http://www.neon-project.org, 2008.
4. M. C. Suárez-Figueroa, S. Brockmans, A. Gangemi, A. Gómez-Pérez, J. Lehmann,
   H. Lewen, V. Presutti, and M. Sabou. NeOn modelling components. Technical
   report, NeOn project deliverable D5.1.1, 2007.