<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CROSI Mapping System (CMS) Results of the 2005 Ontology Alignment Contest</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yannis Kalfoglou</string-name>
          <email>y.kalfoglou@ecs.soton.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bo Hu</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Advanced Knowledge Technologies (AKT), School of Electronics and Computer Science, University of Southampton</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Electronics and Computer Science, University of Southampton</institution>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <fpage>77</fpage>
      <lpage>84</lpage>
      <abstract>
        <p>In this results report we summarize our experiences from running the CROSI Mapping System (CMS) over three test cases for this year's OAEI contest: bibliography, Web directories and medical ontologies alignment case studies. CMS successfully parsed and aligned all input ontologies in all three case studies. We also elaborate on the insights gained and potential research directions towards building more robust alignment systems to cope with the increasing diversity of alignment requirements.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Although WordNet-based approaches equip themselves
with the lexical synonymy of the names of classes, they
do not have the right measure to capture the
structural information that is conveyed in most taxonomies.
Structural information is exploited in different ways.
Heuristic rules are the most common way to take
structures into account, e.g. identifying the similarity of two
entities based on the status of their parents and siblings.
The modular architecture depicted in figure 1 employs
a multi-strategy system comprising four modules,
namely Feature Generation, Feature Selection and
Processing, Aggregator and Evaluator. In this system,
different features of the input data are generated and
selected to fire off different sorts of feature matchers. The
resultant similarity values are compiled by multiple
similarity aggregators running in parallel or in consecutive
order. The overall similarity is then evaluated to initiate
iterations that backtrack to different stages.</p>
      <p>CMS is an instantiation of such a system. We include
a screenshot of the Web-based interface of CMS in
figure 2. The system is still under development and we
only used the first two components, Feature Generation
and Feature Selection and Processing, for aligning the
ontologies in the three case studies of the OAEI
contest. The alignment algorithms and techniques used
are described in later sections, but first we elaborate, in
the next section, on the purpose of CMS and highlight
some of its key characteristics, such as the robust feature
extraction module.</p>
    </sec>
    <sec id="sec-2">
      <title>1.1 State, purpose, general statement</title>
      <p>The process of ontology mapping (or alignment) can be
summarised as follows: given two ontologies, a system measures
the similarity of the source ontological entities against
the target ones and produces a list of correspondences,
i.e. <inline-formula><tex-math>\mathit{mapping}: O_s, O_t \rightarrow C_s \times C_t \cup P_s \times P_t \cup I_s \times I_t</tex-math></inline-formula>, where
O<sub>i</sub> are the input ontologies with i ∈ {s, t}, subscript
s indicating the source and t indicating the target, C<sub>i</sub>
the set of classes, P<sub>i</sub> the set of properties and I<sub>i</sub> the
set of instances. Hence, the first step when deploying
CMS was to extract characteristics that can be used to
identify similar entities from different ontologies. We
summarize the characteristics we extracted in table 1.
There are several points that need further explanation.
First, in many cases, identifying corresponding instances
is considered to be an easier task than identifying
corresponding classes. This is because instances are
expected to have more grounded variables.
Corresponding instances provide a ground on which the number of
candidate mapping classes can be narrowed down to a
few (as we discovered in our past work with the IF-Map
instance-based system [?]). Second, in the case of
complement classes, let c<sub>s</sub> be a class from the source ontology
and c<sub>t</sub> a class from the target ontology. If
<inline-formula><tex-math>\mathit{sim}(c_s, c_t) = a</tex-math></inline-formula> and
<inline-formula><tex-math>d = \neg c_t</tex-math></inline-formula>, we can safely conclude that
<inline-formula><tex-math>\mathit{sim}(d, c_s) = 1 - a</tex-math></inline-formula>,
where sim/2 is the similarity function and a, a real
number, gives the confidence value.</p>
    </sec>
    <sec id="sec-3">
      <title>1.2 Specific techniques used</title>
      <p>To fit the requirements of different applications, we
developed and implemented a series of mapping
techniques, which are regarded as independent components
that make up CMS.</p>
      <sec id="sec-3-1">
        <title>Name matchers</title>
        <p>Ranging from purely syntactic approaches to more
semantically enriched ones, name matchers are categorised
as: String (tokenised) distance, Thesaurus, and
WordNet hierarchical distance.</p>
        <p>Levenshtein distance is the simplest implementation of
string distance. More sophisticated ones include the
Monge-Elkan distance, which optimises edit-distance functions with
well-tuned editing costs, and the Jaro metric and its
variants, which compute an accumulated similarity of s and t from
the order and number of common characters between s
and t, to name just a few. In our system the thesaurus
comes into play in two forms: WordNet (http://wordnet.princeton.edu/) and a
predefined corpus, implemented as WNNameMatcher
and CorpusNameMatcher respectively. To facilitate the
use of WordNet, we assume that the local names of
classes are either nouns or noun phrases, while the
local names of properties are phrases starting with verbs
followed by either nouns or adjectives. Elements in
the retrieved synsets are then compared against each
other using either exact string matching or one of the
string-distance based algorithms discussed above.
WordNet arranges its entries in hierarchical
structures. Hence, the similarity between names can
be computed as follows: let w<sub>i</sub> and w<sub>j</sub> be the
corresponding WordNet entries of name<sub>i</sub> and name<sub>j</sub>, w
be the least common hypernym of w<sub>i</sub> and w<sub>j</sub>, r be the
root of the underlying WordNet hierarchy, and h<sub>i</sub>, h<sub>j</sub>,
h be the distances between w<sub>i</sub> and r, w<sub>j</sub> and r, w and
r, respectively. The similarity between w<sub>i</sub> and w<sub>j</sub> is then
approximated as <inline-formula><tex-math>2 \times h / (h_i + h_j)</tex-math></inline-formula>.</p>
        <table-wrap id="tab-1">
          <label>Table 1</label>
          <table>
            <tbody>
              <tr><th colspan="2">Local features</th></tr>
              <tr><td>class labels and URIs</td><td></td></tr>
              <tr><td>equivalent classes</td><td></td></tr>
              <tr><td>related property names</td><td></td></tr>
              <tr><td>complement classes</td><td></td></tr>
              <tr><td>property labels and URIs</td><td></td></tr>
              <tr><td>property domain and range</td><td></td></tr>
              <tr><td>inverse (transitive) property</td><td></td></tr>
              <tr><td>functional property</td><td></td></tr>
              <tr><td>instance labels and URIs</td><td></td></tr>
              <tr><td>instantiated classes</td><td></td></tr>
              <tr><td>comments</td><td></td></tr>
              <tr><th colspan="2">Global features</th></tr>
              <tr><td>super and sub classes</td><td>subsumption relationships help to identify the location of a class in the taxonomy and thus capture the structural semantics.</td></tr>
              <tr><td>sibling classes</td><td>sibling classes provide a hint of how the parent class is defined.</td></tr>
              <tr><td>super and sub properties</td><td>the properties' hierarchy is useful in matching both properties and classes.</td></tr>
              <tr><td>disjoint classes</td><td>disjoint cover should be treated as a special case.</td></tr>
              <tr><td>comments</td><td>comments are sometimes also given at the global level.</td></tr>
              <tr><td>version information</td><td>the record of modifications and authentication provides alternatives.</td></tr>
            </tbody>
          </table>
        </table-wrap>
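        <p>As an illustration only, the WordNet hierarchical similarity 2h/(h<sub>i</sub> + h<sub>j</sub>) can be sketched in Python as follows. The sketch does not query WordNet: the dictionary PARENT is a hypothetical toy hypernym tree with a single root, and all function names are ours rather than part of CMS.</p>
        <preformat preformat-type="code"><![CDATA[
```python
# Toy hypernym tree (child -> parent). A stand-in for WordNet, purely
# illustrative; it is assumed to have a single root ("entity").
PARENT = {
    "entity": None,
    "organism": "entity",
    "animal": "organism",
    "dog": "animal",
    "cat": "animal",
    "plant": "organism",
}


def path_to_root(word):
    """Return the nodes from `word` up to the root, both inclusive."""
    path = []
    while word is not None:
        path.append(word)
        word = PARENT[word]
    return path


def depth(word):
    """Distance (number of edges) between `word` and the root."""
    return len(path_to_root(word)) - 1


def least_common_hypernym(w_i, w_j):
    """Deepest node that is an ancestor of (or equal to) both words."""
    ancestors_i = set(path_to_root(w_i))
    for node in path_to_root(w_j):  # walks upward, so the first hit is deepest
        if node in ancestors_i:
            return node
    return None


def hierarchy_similarity(w_i, w_j):
    """2 * h / (h_i + h_j), with h the depth of the least common hypernym."""
    h_i, h_j = depth(w_i), depth(w_j)
    if h_i + h_j == 0:  # both words are the root itself
        return 1.0
    h = depth(least_common_hypernym(w_i, w_j))
    return 2.0 * h / (h_i + h_j)
```
]]></preformat>
        <p>In this toy tree, "dog" and "cat" share the hypernym "animal" at depth 2 and both sit at depth 3, so their similarity is 4/6; "dog" and "plant" only meet at "organism" and score lower.</p>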
      </sec>
      <sec id="sec-3-2">
        <title>Semantic matchers</title>
        <p>In CMS, the flavour of semantics is added in two different
ways, namely by structure-aware matchers and by
intension-aware matchers.</p>
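        <p>Structure-awareness refers to the capability of
traversing class hierarchies and accumulating similarities along
the sub-class (sub-property) relationships. Let c and
d be two classes from the source and target ontologies, and let c<sub>i</sub>
and d<sub>i</sub> be their direct parents in the respective ontologies.
The similarity between c and d is recursively defined as
<inline-formula><tex-math>\mathit{sim}(c, d) = \alpha\, \mathit{sim}_{local}(c, d) + \beta\, \mathit{sim}(c_i, d_i)</tex-math></inline-formula>, where α and β
are arbitrary weights and sim<sub>local</sub>/2 gives the local
similarity of c and d, which can be computed
using one or a combination of the techniques discussed above.</p>
        <p>Intension-awareness takes into account the definitions of
classes. A class c is regarded as a tuple ⟨S, P⟩ where
S is a set of classes of which c is a subclass and P is
a set of properties having c as the domain and other
classes or concrete data types as the range. Hence,
finding the semantic similarity between c = ⟨S<sub>c</sub>, P<sub>c</sub>⟩
and d = ⟨S<sub>d</sub>, P<sub>d</sub>⟩ amounts to finding the similarity
between S<sub>c</sub> and S<sub>d</sub> as well as P<sub>c</sub> and P<sub>d</sub>, i.e.
<inline-formula><tex-math>\mathit{sim}(c, d) = \alpha\, \mathit{sim}(S_c, S_d) + \beta\, \mathit{sim}_{property}(P_c, P_d)</tex-math></inline-formula>, where α and β are
arbitrary weights and sim<sub>property</sub>/2 computes the
property similarity. More specifically, we differentiate the
following situations, in which L, Δ and Φ stand for the names, domains and ranges of the properties, respectively:</p>
        <list list-type="bullet">
          <list-item><p>classes with matching property names, property
domains and property ranges:
<inline-formula><tex-math>L_{p_c} = L_{p_d}</tex-math></inline-formula>,
<inline-formula><tex-math>\mathit{sim}_{set}(\Delta_{p_c}, \Delta_{p_d}) \geq v</tex-math></inline-formula> and
<inline-formula><tex-math>\mathit{sim}_{set}(\Phi_{p_c}, \Phi_{p_d}) \geq v</tex-math></inline-formula>,
where sim<sub>set</sub>/2 computes the similarity of two sets
of entities and v is a predefined threshold;</p></list-item>
          <list-item><p>classes with matching property names and
property domains but different property ranges:
<inline-formula><tex-math>L_{p_c} = L_{p_d}</tex-math></inline-formula>,
<inline-formula><tex-math>\mathit{sim}_{set}(\Delta_{p_c}, \Delta_{p_d}) \geq v</tex-math></inline-formula> and
<inline-formula><tex-math>\mathit{sim}_{set}(\Phi_{p_c}, \Phi_{p_d}) &lt; v</tex-math></inline-formula>; and</p></list-item>
          <list-item><p>classes with matching property names but
different property domains as well as ranges:
<inline-formula><tex-math>L_{p_c} = L_{p_d}</tex-math></inline-formula>,
<inline-formula><tex-math>\mathit{sim}_{set}(\Delta_{p_c}, \Delta_{p_d}) &lt; v</tex-math></inline-formula> and
<inline-formula><tex-math>\mathit{sim}_{set}(\Phi_{p_c}, \Phi_{p_d}) &lt; v</tex-math></inline-formula>.</p></list-item>
        </list>
        <p>The first situation contributes the most to the
similarity of c and d. We regard classes with matching
names and exactly matching properties, i.e., properties
with the same name, domain and range, as semantically
equivalent classes.</p>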
        <p>In many cases, matching between Δ<sub>Pc</sub> and Δ<sub>Pd</sub> (Φ<sub>Pc</sub>
and Φ<sub>Pd</sub>, respectively) can only be concluded after
traversing several levels up or down the class
hierarchy. Although not as strong as exact matching of
property domains and ranges, matching classes of Δ<sub>Pc</sub>
(Φ<sub>Pc</sub>) to remote ancestors or descendants of classes of
Δ<sub>Pd</sub> (Φ<sub>Pd</sub>) provides a hint on how close the different
properties are, and thus how similar the two concepts c
and d are. This idea is implemented in our system
as the ClassDefPlusMatcher method.</p>
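        <p>The recursive, structure-aware definition given earlier in this section can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the CMS implementation: the hierarchies are passed as hypothetical child-to-parent dictionaries, they are assumed to be acyclic, the local matcher is a simple exact-name stand-in, the weights 0.7 and 0.3 are arbitrary, and the recursion is assumed to bottom out at the local similarity when either class has no parent.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def exact_name_similarity(c, d):
    """Stand-in local matcher: 1.0 for identical names (case-insensitive)."""
    return 1.0 if c.lower() == d.lower() else 0.0


def structure_aware_similarity(c, d, parent_s, parent_t,
                               alpha=0.7, beta=0.3,
                               sim_local=exact_name_similarity):
    """sim(c, d) = alpha * sim_local(c, d) + beta * sim(parent(c), parent(d)).

    parent_s / parent_t map each class of the source / target ontology to
    its direct parent (None at the root). Assumption: when either class has
    no parent, only the local similarity is returned.
    """
    local = sim_local(c, d)
    c_parent, d_parent = parent_s.get(c), parent_t.get(d)
    if c_parent is None or d_parent is None:
        return local
    inherited = structure_aware_similarity(
        c_parent, d_parent, parent_s, parent_t, alpha, beta, sim_local)
    return alpha * local + beta * inherited


# Hypothetical example hierarchies (child -> parent).
source_parents = {"Book": "Publication", "Publication": "Thing", "Thing": None}
target_parents = {"Book": "Document", "Document": "Thing", "Thing": None}

score = structure_aware_similarity("Book", "Book",
                                   source_parents, target_parents)
```
]]></preformat>
        <p>Here the two "Book" classes match locally, but their parents ("Publication" and "Document") do not, so the parent term contributes only what is inherited from the shared root and the overall score stays below 1.</p>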
      </sec>
    </sec>
    <sec id="sec-4">
      <title>1.3 Adaptations made for the contest</title>
      <p>We made no major adaptations to CMS in order to
align the OAEI contest ontologies. We made only minor,
routine programmatic adjustments, for example
running CMS from the command line
in batch mode to parse and align the hundreds of
ontologies in the Web directories case, or including specific
Java heap size flags in order to run the
system over the vast FMA ontology. Other than that, the
system ran as normal.</p>
    </sec>
    <sec id="sec-5">
      <title>2. RESULTS</title>
      <p>CMS benefits from the plug-and-play nature of modular
matchers. In this contest, four different matchers were used,
namely ClassDef for examining the domain and range
of properties associated with classes, CanoName for
accumulating similarities among class hierarchies, WNDisSim
for computing the distance between two class names
based on WordNet structures, and HierarchyDisSim
for distributing similarity among class hierarchies. The
four matchers were invoked both in parallel and
sequentially. When invoked in parallel, their results were
aggregated as a weighted average. When invoked in
sequence, CanoName and WNDisSim gave
a list of corresponding classes whose similarities were
then refined by ClassDef and HierarchyDisSim. CMS
ran each test case with different configurations
(combination and sequencing) of these four
mapping modules, and precision and recall values were
calculated for each run. In this report, we include
the configurations with the highest precision and recall
values.</p>
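      <p>The parallel mode described above, in which matcher results are aggregated as a weighted average, could look roughly like the following Python sketch. The function names, the data layout, the scores and the equal weights are illustrative assumptions and are not taken from CMS; only the matcher names come from the text. A pair that a matcher did not score is assumed to count as 0 for that matcher.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def aggregate_parallel(scores_by_matcher, weights):
    """Weighted average of per-matcher scores for each candidate pair.

    scores_by_matcher: {matcher_name: {(source_class, target_class): score}}
    weights:           {matcher_name: weight}
    """
    total_weight = sum(weights[name] for name in scores_by_matcher)
    pairs = set()
    for scores in scores_by_matcher.values():
        pairs.update(scores)
    return {
        pair: sum(weights[name] * scores.get(pair, 0.0)
                  for name, scores in scores_by_matcher.items()) / total_weight
        for pair in pairs
    }


def select(alignment, threshold):
    """Keep only candidate pairs whose aggregated score reaches the threshold."""
    return {pair: s for pair, s in alignment.items() if s >= threshold}


# Made-up scores for two of the matchers named in the text.
scores = {
    "WNDisSim": {("Book", "Book"): 1.0, ("Book", "Volume"): 0.8},
    "ClassDef": {("Book", "Book"): 0.8},
}
equal_weights = {"WNDisSim": 1.0, "ClassDef": 1.0}

aggregated = aggregate_parallel(scores, equal_weights)
accepted = select(aggregated, threshold=0.8)
```
]]></preformat>
      <p>With equal weights, the pair scored by both matchers averages to 0.9 and is kept, whereas the pair proposed by only one matcher drops to 0.4 and is filtered out. How the sequential mode refines scores is not specified in enough detail to sketch here.</p>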
    </sec>
    <sec id="sec-6">
      <title>2.1 Case 1: benchmark/BibTex ontologies</title>
      <p>For all the ontologies in this case we used a threshold
of 0.8.</p>
      <p>Ontology 202: CMS fails to produce any mapping
candidates with a high similarity score in test case 202 due to
the naming convention. We consider class names as the
foundation on which other techniques can be applied
(although not the sole and dominant clue for finding
mapping candidates). Similarly, cases 248 to 266 also
fall into this category: no candidates with a high
similarity value were found.</p>
      <p>Ontology 205: CMS does not achieve a high recall
rate for benchmark test case 205 due to the restrictions
of WordNet. In case 205, class names are replaced by
randomly selected synonyms. CMS relies heavily on
external resources, e.g. WordNet, to provide lexical
alternatives for class and property names and thus fails
to respond well to synonyms that are not recognised
by WordNet. A customised corpus might alleviate the
problem and improve performance, at the cost of significant
effort and domain expertise.</p>
      <p>Ontology 301: In test case 301, smaller similarity scores
were assigned to mapping candidates. This is due to
the fact that, although classes have similar names, they
are defined with different properties which have
different names, domains and/or ranges. It is our contention
that classes restricted with different properties
should either not be considered as equivalent classes or
have their similarity value reduced to reflect such
differences.</p>
    </sec>
    <sec id="sec-7">
      <title>2.2 Case 2: Web directories ontologies</title>
      <p>We do not have any specific comments for Case 2. All
2265 source/target pairs were parsed successfully by CMS and fetched for
alignment. However, 29 of them did not produce
any alignment results due to circular definitions in the
original source.owl and target.owl files. So, a total
of 2236 pairs of source.owl/target.owl were aligned.
The system parsed them from the command line in
batch mode, and the results were produced after 2 hours and
53 minutes. Each cycle involved reading and parsing the
source and target ontologies, finding alignments (if any)
and writing the results to a file in the common alignment
format. This was repeated 2265 times.</p>
    </sec>
    <sec id="sec-8">
      <title>2.3 Case 3: Medical ontologies</title>
      <p>This case was the most interesting. The sheer size of
the input ontologies (especially that of FMA), the
modelling style of OWL, the conventions used, and the
complexity of the paradigm made it an interesting
adventure from the research point of view. We report in more
detail about our experiences in section 3.3.</p>
    </sec>
    <sec id="sec-9">
      <title>3. GENERAL COMMENTS</title>
      <p>Performance tuning and hardware settings: As
we were facing some really large ontologies (in particular the 72k-class
FMA ontology), we had to optimize
the code and the computer settings in order
to obtain alignment results in acceptable time. We ran
the tests on a stand-alone PC running the Microsoft
Windows XP operating system, service pack II, 2003
version. The PC had 1GB of memory installed
(DDR400 SDRAM), an 80GB Serial ATA hard disk, and a
Pentium 4 3.0GHz processor. We used the Java VM (version
1.5.0_04) and had to adjust
its heap size. The standard
Java heap size is 64MB, which was not enough for
the Web directory and medical ontologies cases. In fact,
for the medical ontologies case, the sheer size of the
input ontologies (especially that of FMA) forced us to use
a 768MB heap size. Settings lower than this threshold
caused the system to run out of memory.</p>
      <p>Parsing and extracting experiences: FMA.owl is a
31MB .owl file comprising 72545 declarations of owl
classes and 100 relations (object and data type
properties). These numbers were obtained using the
Jena 2.2 API and probably deviate slightly from those of other
parsers. Parsing and extracting features from the FMA
ontology took 9 minutes and 17 seconds with the Java heap
size adjusted to 512MB. However, in order to run
CMS and find alignments with OpenGALEN we had
to use a 768MB heap size setting. While parsing, the Jena
API complained about the syntax idioms used.
For example, we had a lot of warnings from Jena's RDF
syntax handler of the form "bad URI in qname XXX:
no scheme found". We elaborate on the reasons behind
these parsing warnings in section 3.3.</p>
      <p>OpenGALEN.owl is a 4MB .owl file comprising 24
declarations of owl classes and 30 relations (as
before, object and data type properties; these
numbers were also obtained from the Jena 2.2 API). Parsing and
extracting features from OpenGALEN took just a few
seconds. There was no need to adjust the Java heap
size.</p>
    </sec>
    <sec id="sec-10">
      <title>3.1 Comments on the results</title>
      <p>Different combinations of CMS plug-in matchers
perform significantly differently due to the nature of the
benchmark test cases. Table 3.1 lists the choice of matchers
for each test case, while Table 3.2 shows the
performance values of different matchers<sup>4</sup> for
the alignment of ontology 303 in case 1, in terms of
precision and recall.</p>
    </sec>
    <sec id="sec-11">
      <title>3.2 Discussions on the way to improve the proposed system</title>
      <p>CMS is expected to be improved in the following
respects: a more sophisticated aggregation mechanism, a
unified alignment representation formalism, and
parameterised algorithms for class hierarchy distance.</p>
      <p>Firstly, as discussed in previous sections, results from
multiple matchers are aggregated as a weighted average with
arbitrary weights to start with. Thus far, the weights
are fine-tuned manually, relying on knowledge of the
domain of discourse and the underlying algorithms of
CMS. A more sophisticated approach would employ
machine learning techniques to work out the most
appropriate weights for different matchers aiming
to solve different sorts of mappings. Furthermore,
results from different matchers can be sorted locally first,
which would reduce the accumulation of results from different
matchers to rank aggregation [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].</p>
      <p>Secondly, matchers are heterogeneous in nature: some external
matchers produce pairwise
equivalences with numeric values stating the similarity score,
while others output high-level relationships, e.g. same
entity as, more specific than, more general than and
disjoint with, expressed in high-level languages such as
OWL and RDF. This suggests that the output of different
matchers has to be lifted to the same syntactic and
semantic level. A unified representation formalism equipped
with both numeric and abstract expressivity can
facilitate the aggregation of heterogeneous matchers.</p>
      <p>Thirdly, CMS takes into account the exact position of
classes in the class hierarchy. We would like to develop
algorithms that penalise mapping candidates that are
found to be far apart from each other, and then
propagate their similarity values upwards and downwards
in the hierarchy to their descendants and/or ancestors.
There could also be predefined parameters so that, as we
go up or down the hierarchy, we change the
similarity values of their descendants and/or ancestors
accordingly. We expect that this could reduce the number of
false positive results.</p>
      <p>[Table 3.1, choice of matchers per test case: only the entries "Test Case #", "A", "A, B" and "A, C, D" survive.]</p>
      <p><sup>4</sup> Results are obtained with equal weights for matchers.</p>
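      <p>To make the rank aggregation idea from the first point more concrete, the following Python sketch combines locally sorted matcher outputs with a simple Borda count. This is our illustrative choice and not necessarily the method of [2] or anything implemented in CMS; the function name and the example rankings are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Merge several best-first rankings of candidates with a Borda count.

    A ranking of length k awards k points to its first candidate, k - 1 to
    the second, and so on; a candidate missing from a ranking gets no points
    from it. Ties are broken by the candidate's own ordering so that the
    result is deterministic.
    """
    points = defaultdict(int)
    for ranking in rankings:
        for position, candidate in enumerate(ranking):
            points[candidate] += len(ranking) - position
    return sorted(points, key=lambda cand: (-points[cand], cand))


# Hypothetical rankings of three mapping candidates by three matchers.
rankings = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["b", "c", "a"],
]
merged = borda_aggregate(rankings)
```
]]></preformat>
      <p>Because only positions are compared, matchers whose raw scores live on different scales can still be combined, which is the appeal of rank aggregation over a weighted average of scores.</p>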
    </sec>
    <sec id="sec-12">
      <title>3.3 Comments on the test cases</title>
      <p>We do not have any specific comments for the test cases on
BibTex and Web directories alignments. However, we
found the last test case, that of medical
ontologies alignment, interesting, and we summarize our experiences
below.</p>
      <p>
        FMA.owl was a different case altogether. The ontology
describes the domain of human anatomy and aims to
provide "a reference ontology in biomedical
informatics for correlating different views of anatomy, aligning
existing and emerging ontologies in bioinformatics" [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
However, there are two notable facts regarding the
syntactic and modelling idioms of FMA and existing
results from previous efforts to align FMA and
GALEN. As far as the former is concerned, the OWL
version we had to work with was the result of a translation
from Protege. Previous work has shown that this result
is not always a faithful representation of the original
FMA Protege model. For instance, it has been reported
that FMA DL constructs are often ill-defined and that they
lead to inconsistencies when a reasoner parses the
ontology [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Consistency checking for FMA is an
acknowledged problem though, even by its authors: "[...]
feedback from these investigators revealed an aggregate of
a few hundred errors, many of which related to spelling
and only a few to cycles in the class subsumption and
partonomy hierarchies." [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>Leaving aside this fact of life (as it is natural for an
ontology that big and so close to human practice to be
inconsistent), we point to a couple of syntactic idioms that
we found interesting when parsing the ontology with our
Jena-based CMS system. Firstly, there is the rather unusual
use of unique frame IDs for class names (&lt;owl:Class
rdf:ID&gt; constructs), with the textual description of a
class held in an rdfs:label construct. We also noticed some
unusual uses of references to frame IDs. For instance,
the declaration of "arterial supply" as an object
property, &lt;owl:ObjectProperty rdf:ID="arterial_supply"
rdfs:label="arterial supply"&gt;, is used in other parts
of the ontology where it refers to an rdf:resource which
points to a different resource:
&lt;arterial_supply rdf:resource="#frame_14586"/&gt;.
Tracing that frame ID leads us to the definition of a
"Tissue" class, and not to "arterial supply": &lt;owl:Class
rdf:ID="frame_14586" rdfs:label="Tissue"&gt;. The
definition of an instance (with frame ID 14586) of an
object property ("arterial supply") that is a class
("Tissue") could lead to modelling misunderstandings and
confusion (although, syntactically speaking, it is allowed
in some versions of OWL).</p>
      <p>
        Going back to our argument for the notable facts, we
found that previous efforts to align FMA to GALEN
reported rather controversial results. For example, in
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], the authors employed two different alignment
methods to map FMA to GALEN. Despite the subtle
differences between OpenGALEN and GALEN, their work is
highly similar to the 3rd case study of the OAEI contest,
but some of their findings are questionable from
the semantics point of view: for example, it was
reported that "Pancreas" in FMA matches "Pancreas" in
OpenGALEN with a similarity value of 1.0, which "indicates
a perfect match" [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. When we looked carefully at the
definitions of "Pancreas" in both ontologies we saw that
"Pancreas" is defined as a class in FMA (&lt;owl:Class
rdf:ID="frame_12280" rdfs:label="Pancreas"&gt;),
whereas in GALEN (OpenGALEN) it is defined as an instance of the
class "Body Cavity Anatomy":
&lt;owl:Class rdf:ID="Body_Cavity_Anatomy"&gt;
&lt;rdfs:subClassOf
rdf:resource="#OpenGALEN_Anatomy_Metaclass"/&gt;
&lt;Body_Cavity_Anatomy rdf:ID="Pancreas"&gt;
Even if OWL semantics allow an individual to be mapped to
a class (when dealing with OWL Full), such an
alignment is misleading, especially when we consider the high
level of abstraction of the "Pancreas" class in
OpenGALEN. It seems that the "lexical phase" parsing used
in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] was the main contributor to this high similarity
value, as relatively little structural information was
taken into account. As a final comment on the case, we
also point the reader to observations made by the FMA
authors when trying to validate mapping results and
differences in terminologies between these two ontologies:
"[...] the reasons for the differences have not yet been
explored, but at least some of them may be the different
contexts of modelling. GALEN represents anatomy in
the context of surgical procedures, whereas FMA has a
strictly structural orientation." [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
    </sec>
    <sec id="sec-13">
      <title>3.4 Comments on the measures</title>
      <p>The proposed measures of precision and recall have been
studied and practiced in the NLP community for years
and they are a de facto standard metric for commercial
applications, like search engines. However, we believe
that their adaptation for measuring the performance of
an ontology mapping system is somewhat questionable.
We cannot elaborate fully on our reservations
regarding the use of such a metric in this short paper, but we
highlight the main points of our objections: (a)
precision is regarded as hard to implement and reveals the
usefulness of a retrieved document (or hit in a hitlist)
for a search engine. We can't judge the usefulness of
a found alignment by comparing it with the reference
alignment; (b) neither precision nor recall take into
account the possible applications of the alignments found.
In all the past EON (and this year's OAEI) contests, a
set of predefined alignments was used as a standard
against which all found alignments were compared. This
does not say anything about the usefulness of the found
alignments, or even whether they are complete, as the
predefined ones can be erroneous. Further to these
comments, we would also like to add that the assignment
of numerical values in the range 0.0 to 1.0 does not
reveal their semantic relevance, but purely a brute-force
algorithmic way of comparing performance. We also
observed a variety of interpretations of precision and
recall metrics by the ontology alignment community.</p>
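      <p>For reference, one common set-based reading of precision and recall against a reference alignment is sketched below in Python. This is a minimal illustration under our own assumptions (alignments as sets of source/target pairs, exact pair matching, similarity scores ignored) and is not the contest's evaluation code; the function name and example pairs are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def precision_recall(found, reference):
    """Precision and recall of a found alignment against a reference one.

    Both arguments are iterables of (source_entity, target_entity) pairs.
    A pair counts as correct only if it appears verbatim in the reference.
    """
    found, reference = set(found), set(reference)
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical alignments: two of the three found pairs are in the reference.
found_pairs = {("a", "x"), ("b", "y"), ("c", "z")}
reference_pairs = {("a", "x"), ("b", "y"), ("d", "w"), ("e", "v")}

precision, recall = precision_recall(found_pairs, reference_pairs)
```
]]></preformat>
      <p>The computation only measures overlap with the reference set. It says nothing about whether the pair ("c", "z") is useful in an application or whether the reference itself is correct, which is the reservation expressed above.</p>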
    </sec>
    <sec id="sec-14">
      <title>3.5 Proposed new measures</title>
      <p>Devising new measures for assessing the found
alignments between two ontologies in a universally agreed
manner is a difficult task. We do not see a quick
solution to this problem, but as ontology engineers we can
apply knowledge engineering technologies that
encompass as much semantic information as possible. For
example, we were surprised that the semantically rich
definitions of OWL for declaring class or property equality
(and inequality), and the universal construct for
declaring similarity, are hardly used by the community.
We would also like to see ways of introducing
"application-driven" alignment metrics, where an example
application (e.g., a Semantic Web service information lookup
engine) needs to access two different ontologies and
the alignments found need to be used in the
application in a specific way. With an application-driven
alignment metric, we can experiment with the notion of
usefulness of an alignment in a real world scenario, rather
than doing meaningless number crunching with regard
to found and predefined alignments. After all,
alignment needs to be done in the first place because there
is a real world need for it.</p>
    </sec>
    <sec id="sec-15">
      <title>4. CONCLUSION</title>
      <p>The 2005 OAEI ontology alignment contest was the
first one that introduced sizeable ontologies and posed
some interesting and challenging problems with respect
to performance, scaling and domain exploration. We
found it a rewarding experience and we look forward to
continuing the fruitful exploration of this key field in the
emergent Semantic Web.</p>
    </sec>
    <sec id="sec-16">
      <title>6. RAW RESULTS</title>
      <p>All of our results are included in a tabular format in
table 6.3. These results are the best obtained across the CMS
combinations of different matchers. We report on
those in section 3.1. So, for example, alignments for case
#103 were produced using CMS Matcher A, whereas
alignments for case #225 were produced using CMS
Matchers A+B+C. A list of all these combinations can be found
in table 3.2.</p>
    </sec>
    <sec id="sec-17">
      <title>6.1 Link to the system and parameters file</title>
      <p>Access to the Web-based interface of the CMS system
is provided via www.aktors.org/crosi/cms. We note
that the system is not available in the community for
free distribution yet, due to the legalities of the IPR for
the CROSI project.</p>
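      <sec id="sec-17-1">
        <title>Name</title>
        <list list-type="simple">
          <list-item><p>Reference alignment</p></list-item>
          <list-item><p>Irrelevant ontology</p></list-item>
          <list-item><p>Language generalization</p></list-item>
          <list-item><p>Language restriction</p></list-item>
          <list-item><p>No names</p></list-item>
          <list-item><p>No names, no comments</p></list-item>
          <list-item><p>No comments</p></list-item>
          <list-item><p>Naming conventions</p></list-item>
          <list-item><p>Synonyms</p></list-item>
          <list-item><p>Translation</p></list-item>
          <list-item><p>No specialisation</p></list-item>
          <list-item><p>Flattened hierarchy</p></list-item>
          <list-item><p>Expanded hierarchy</p></list-item>
          <list-item><p>No instance</p></list-item>
          <list-item><p>No restrictions</p></list-item>
          <list-item><p>No properties</p></list-item>
          <list-item><p>Flattened classes</p></list-item>
          <list-item><p>Expanded classes</p></list-item>
          <list-item><p>Real: BibTeX/MIT</p></list-item>
          <list-item><p>Real: BibTeX/UMBC</p></list-item>
          <list-item><p>Real: Karlsruhe</p></list-item>
          <list-item><p>Real: INRIA</p></list-item>
        </list>
      </sec>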
    </sec>
    <sec id="sec-18">
      <title>Acknowledgements</title>
      <p>This work is supported under the Capturing,
Representing, and Operationalising Semantic Integration (CROSI)
project which is sponsored by Hewlett Packard
Laboratories at Bristol, UK. The first author is also supported
by the Advanced Knowledge Technologies (AKT)
Interdisciplinary Research Collaboration (IRC) project which
is sponsored by the UK EPSRC under Grant number
GR/N15764/01.</p>
    </sec>
    <sec id="sec-19">
      <title>6.2 Link to the set of provided alignments (in align format)</title>
      <p>The results of all three cases (BibTex, Web directories,
Medical) are available for download from the CROSI
web site at www.aktors.org/crosi/eon05contest/results.</p>
    </sec>
    <sec id="sec-20">
      <title>6.3 Matrix of results</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>W.W.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ravikumar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.E.</given-names>
            <surname>Fienberg</surname>
          </string-name>
          .
          <article-title>A comparison of string distance metrics for name-matching tasks</article-title>
          .
          <source>In IJCAI 2003 IIWeb Workshop</source>
          , pages
          <fpage>73</fpage>
          –
          <lpage>78</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Sivakumar</surname>
          </string-name>
          .
          <article-title>Efficient similarity search and classification via rank aggregation</article-title>
          .
          <source>In Proceedings of the ACM SIGMOD International Conference on Management of Data</source>
          , pages
          <fpage>301</fpage>
          –
          <lpage>312</lpage>
          . ACM Press,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Fellbaum</surname>
          </string-name>
          .
          <source>WordNet: An Electronic Lexical Database</source>
          . The MIT Press,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ehrig</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          .
          <article-title>QOM - Quick Ontology Mapping</article-title>
          .
          <source>In Proceedings of the 3rd International Semantic Web Conference (ISWC'04)</source>
          , LNCS 3298, Hiroshima, Japan, pages
          <fpage>683</fpage>
          –
          <lpage>697</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Golbreich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>Bodenreider</surname>
          </string-name>
          .
          <article-title>Migrating the FMA from Protege to OWL</article-title>
          .
          <source>Technical report, jul 2005. In notes of the 8th International Protege Conference</source>
          , Madrid, Spain.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Rosse</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.L.</given-names>
            <surname>Mejino</surname>
          </string-name>
          .
          <article-title>A Reference Ontology for Bioinformatics: The Foundational Model of Anatomy</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          ,
          <volume>36</volume>
          :
          <fpage>478</fpage>
          –
          <lpage>500</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mork</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>Bodenreider</surname>
          </string-name>
          .
          <article-title>Lessons learned from aligning two representations of anatomy</article-title>
          .
          <source>In Proceedings of the KR 2004 Workshop on Formal Biomedical Knowledge Representation</source>
          , Whistler, BC, Canada, pages
          <fpage>102</fpage>
          –
          <lpage>108</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>