<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Lily Results for OAEI 2015</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Wenyu Wang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Peng Wang</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Chien-Shiung Wu College, Southeast University</institution>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Computer Science and Engineering, Southeast University</institution>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents the results of Lily in the ontology alignment contest OAEI 2015. As a comprehensive ontology matching system, Lily is intended to participate in four tracks of the contest: benchmark, conference, anatomy, and instance matching. The specific techniques used by Lily will be introduced briefly. The strengths and weaknesses of Lily will also be discussed.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Presentation of the system</title>
    </sec>
    <sec id="sec-2">
      <title>State, purpose, general statement</title>
      <p>The core principle of Lily's matching strategies is to use the useful information
correctly and effectively. Lily combines several effective and efficient matching
techniques to facilitate alignments. There are five main matching strategies: (1)
Generic Ontology Matching (GOM) is used for common matching tasks with
normal-size ontologies. (2) Large scale Ontology Matching (LOM) is used for
matching tasks with large ontologies. (3) Instance Ontology Matching
(IOM) is used for instance matching tasks. (4) Ontology mapping debugging is
used to verify and improve the alignment results. (5) Ontology matching tuning
is used to enhance overall performance.</p>
      <p>The matching process mainly contains three steps: (1) Pre-processing, in which
Lily parses the ontologies and prepares the information needed by subsequent
steps. The ontologies are also analyzed in general terms, and their
characteristics, along with previously studied datasets, are used to determine parameters and
strategies. (2) Similarity computing, in which Lily uses special methods to calculate
the similarities between elements from different ontologies. (3) Post-processing,
in which alignments are extracted and refined by mapping debugging.</p>
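      <p>The post-processing step can be illustrated with a minimal sketch. It assumes, purely for illustration, that similarity computing yields a table of scores for entity pairs and that correspondences are extracted greedily under a 1:1 constraint. The function name extract_alignment, the threshold of 0.5 and the entity names are hypothetical and are not taken from Lily.</p>
      <preformatted>
```python
def extract_alignment(similarities, threshold=0.5):
    """Greedily pick 1:1 correspondences from a similarity table.

    similarities maps (source_entity, target_entity) to a score in [0, 1].
    The threshold of 0.5 is an illustrative assumption, not a Lily parameter.
    """
    used_sources, used_targets, alignment = set(), set(), []
    # Visit candidate pairs from the most to the least similar.
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    for (source, target), score in ranked:
        if threshold > score:
            break  # every remaining pair is below the threshold
        if source in used_sources or target in used_targets:
            continue  # keep the alignment 1:1
        alignment.append((source, target, score))
        used_sources.add(source)
        used_targets.add(target)
    return alignment


# Hypothetical similarity scores between entities of two ontologies.
scores = {
    ("a:Paper", "b:Article"): 0.92,
    ("a:Paper", "b:Review"): 0.61,
    ("a:Author", "b:Writer"): 0.85,
    ("a:Topic", "b:Review"): 0.30,
}
result = extract_alignment(scores)
```
      </preformatted>
      <p>In this example the pair (a:Paper, b:Review) is dropped because a:Paper is already matched, and (a:Topic, b:Review) is dropped because its score falls below the threshold.</p>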
      <p>This year, some algorithms and matching strategies of Lily have been
modified for higher efficiency and adjusted for brand-new matching tasks such as
Author Recognition and Author Disambiguation in the Instance Matching track.</p>
    </sec>
    <sec id="sec-3">
      <title>Specific techniques used</title>
      <p>Lily aims to provide high-quality 1:1 concept pair or property pair alignments.
The main specific techniques used by Lily are as follows.</p>
      <p>
        Semantic subgraph. An element may have heterogeneous semantic
interpretations in different ontologies. Understanding the real local meanings
of elements is therefore very useful for similarity computation, which is the foundation
of many applications including ontology matching. For this reason, before similarity
computation, Lily first describes the meaning of each entity accurately. However,
since different ontologies have different preferences in describing their elements,
obtaining the semantic context of an element is an open problem. The semantic
subgraph was proposed to capture the real meanings of ontology elements [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
To extract the semantic subgraphs, a hybrid ontology graph is used to
represent the semantic relations between elements. An extraction algorithm based on
an electrical circuit model is then applied, with new conductivity calculation rules,
to improve the quality of the semantic subgraphs. It has been shown that the
semantic subgraphs can properly capture the local meanings of elements [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>Based on the extracted semantic subgraphs, more credible matching clues can
be discovered, which helps reduce the negative effects of matching uncertainty.</p>
      <p>Generic ontology matching method. The similarity computation is based
on the semantic subgraphs, which means that all the information used in the
similarity computation comes from the semantic subgraphs. Lily combines
text matching and structure matching techniques.</p>
      <p>The Semantic Description Document (SDD) matcher measures the literal
similarity between ontologies. The semantic description document of a concept contains
information about class hierarchies, related properties and instances. The
semantic description document of a property contains information about
hierarchies, domains, ranges, restrictions and related instances. For the descriptions
of different entities, the similarities of the corresponding parts are
calculated. Finally, all the separate similarities are combined using empirically
chosen weights.</p>
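      <p>The combination of per-part similarities can be sketched as follows. This is a minimal illustration under stated assumptions: the token-overlap (Jaccard) measure, the part names, the weights and the sample documents are all hypothetical, since the measures and weights actually used by Lily are not given here.</p>
      <preformatted>
```python
def jaccard(left, right):
    """Token-set overlap of two strings, in [0, 1]."""
    a, b = set(left.lower().split()), set(right.lower().split())
    if not a and not b:
        return 0.0  # two empty parts give no evidence of similarity
    return len(a.intersection(b)) / len(a.union(b))


def sdd_similarity(doc_a, doc_b, weights):
    """Combine per-part similarities of two description documents.

    Each document maps a part name (for example "hierarchy") to its text.
    The weights are illustrative assumptions and are normalised to sum to 1.
    """
    total = sum(weights.values())
    return sum(
        weight / total * jaccard(doc_a.get(part, ""), doc_b.get(part, ""))
        for part, weight in weights.items()
    )


# Hypothetical description documents of two concepts.
paper = {"label": "conference paper", "hierarchy": "document",
         "properties": "has author has title"}
article = {"label": "paper", "hierarchy": "document",
           "properties": "has author"}
weights = {"label": 0.5, "hierarchy": 0.3, "properties": 0.2}
score = sdd_similarity(paper, article, weights)
```
      </preformatted>
      <p>Normalising the weights keeps the combined score in the range from 0 to 1 regardless of how the individual weights are chosen.</p>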
      <p>
        Matching weak informative ontologies. Most existing ontology matching
methods are based on linguistic information. However, some ontologies may
lack regular linguistic information such as natural words and comments, in which case
linguistic-based methods will not work. Structure-based methods
are more practical in such situations. Similarity propagation is a feasible way
to realize structure-based matching, but traditional propagation strategies
do not take ontology features into consideration and face
effectiveness and performance problems. Having analyzed the classical similarity
propagation algorithm, Similarity Flooding, we proposed a new structure-based
ontology matching method [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This method has two features: (1) It has stricter
but reasonable propagation conditions, which lead to more efficient matching
processes and better alignments. (2) A series of propagation strategies is used to
improve the matching quality. We have demonstrated that this method performs
well on the OAEI benchmark dataset [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>However, similarity propagation is not always beneficial. As it discovers more
alignments, it can also introduce more incorrect ones.
Lily therefore uses a strategy to determine when to apply
similarity propagation.</p>
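      <p>The idea of propagating similarity under a restrictive condition can be sketched as follows. This is a simplified fixpoint iteration in the spirit of Similarity Flooding, not the method of Lily: the two conditions used here (matching relation names and a minimum score min_score for a pair to pass similarity on) are assumptions chosen for illustration, as are the function name propagate and the sample triples.</p>
      <preformatted>
```python
def propagate(edges_a, edges_b, initial, rounds=3, min_score=0.5):
    """Spread similarity between neighbouring entity pairs.

    edges_a and edges_b are lists of (subject, relation, object) triples.
    initial maps (entity_a, entity_b) to a seed similarity.
    A pair passes similarity on only if its own score reaches min_score and
    both edges carry the same relation; both conditions are assumptions.
    """
    scores = dict(initial)
    for _ in range(rounds):
        updated = dict(scores)
        for subj_a, rel_a, obj_a in edges_a:
            for subj_b, rel_b, obj_b in edges_b:
                if rel_a != rel_b:
                    continue
                # Propagate in both directions along the matched edge pair.
                for source, target in (((subj_a, subj_b), (obj_a, obj_b)),
                                       ((obj_a, obj_b), (subj_a, subj_b))):
                    gain = scores.get(source, 0.0)
                    if gain >= min_score:
                        updated[target] = updated.get(target, 0.0) + gain
        # Normalise by the largest value so scores stay in [0, 1].
        top = max(updated.values(), default=1.0) or 1.0
        scores = {pair: value / top for pair, value in updated.items()}
    return scores


# Hypothetical ontologies: only the subClassOf edges correspond.
edges_a = [("A1", "subClassOf", "A0"), ("A2", "partOf", "A0")]
edges_b = [("B1", "subClassOf", "B0")]
seeds = {("A0", "B0"): 1.0}
propagated = propagate(edges_a, edges_b, seeds)
```
      </preformatted>
      <p>Here the seed pair (A0, B0) passes similarity to (A1, B1) through the shared subClassOf edge, while A2 receives nothing because its partOf edge has no counterpart in the second ontology.</p>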
      <p>
        Large scale ontology matching. Matching large ontologies is a challenge because
of its significant time complexity. We proposed a new matching method for large
ontologies based on reduction anchors [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This method has a distinct advantage
over divide-and-conquer methods because it does not need to partition large
ontologies. In particular, two kinds of reduction anchors, positive and negative
reduction anchors, are proposed to reduce the time complexity of matching.
Positive reduction anchors use the concept hierarchy to predict the ignorable
similarity calculations. Negative reduction anchors use the locality of matching
to predict the ignorable similarity calculations. Our experimental results on
real-world datasets show that the proposed methods are efficient in matching
large ontologies [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
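      <p>How a positive reduction anchor might prune work can be sketched as follows. The rule used here is one plausible reading and is an assumption of this sketch: once a concept a is matched to a concept b, sub-concepts of a are not compared with super-concepts of b, and super-concepts of a are not compared with sub-concepts of b. The single-parent hierarchies and all names are hypothetical.</p>
      <preformatted>
```python
def ancestors(concept, parent_of):
    """All super-concepts of a concept in a single-parent, acyclic hierarchy."""
    found = []
    while concept in parent_of:
        concept = parent_of[concept]
        found.append(concept)
    return found


def descendants(concept, parent_of):
    """All sub-concepts of a concept."""
    return [c for c in parent_of if concept in ancestors(c, parent_of)]


def skippable_pairs(anchor, parents_a, parents_b):
    """Pairs that need no similarity calculation once anchor = (a, b) is matched.

    Assumed rule: sub-concepts of a are not compared with super-concepts of b,
    and super-concepts of a are not compared with sub-concepts of b.
    """
    a, b = anchor
    skipped = set()
    for sub_a in descendants(a, parents_a):
        skipped.update((sub_a, sup_b) for sup_b in ancestors(b, parents_b))
    for sup_a in ancestors(a, parents_a):
        skipped.update((sup_a, sub_b) for sub_b in descendants(b, parents_b))
    return skipped


# Hypothetical hierarchies, given as child-to-parent maps.
parents_a = {"a:Valve": "a:Heart", "a:Heart": "a:Organ"}
parents_b = {"b:Valvula": "b:Cor", "b:Cor": "b:Organum"}
pruned = skippable_pairs(("a:Heart", "b:Cor"), parents_a, parents_b)
```
      </preformatted>
      <p>With deep hierarchies, a single anchor can remove many candidate pairs at once, which is where the reduction in time complexity would come from under this reading.</p>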
      <p>
        Ontology mapping debugging. Lily uses a technique named ontology
mapping debugging to improve the alignment results [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Unlike existing
methods, which focus on finding efficient and effective solutions to the ontology mapping
problem, mapping debugging emphasizes analyzing the mapping results to
detect or diagnose mapping defects. During debugging, some types of
mapping errors, such as redundant and inconsistent mappings, can be detected. Some
warnings, including imprecise or abnormal mappings, are also located
by analyzing the features of the mapping result. More importantly, some errors and
warnings can be repaired automatically or presented to users with revision
suggestions.
      </p>
      <p>
        Ontology matching tuning. Lily adopted ontology matching tuning this year.
By performing parameter optimization on training datasets [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], Lily is able to
determine the best parameters for similar tasks. These parameters are stored. When
it comes to real matching tasks, Lily performs statistical calculations on the
new ontologies to acquire features that help it find the most suitable
configurations, based on the previous training data. In this way, the overall performance
can be improved.
      </p>
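      <p>Selecting a stored configuration for a new task can be sketched as a nearest-neighbour lookup. This is only an illustration under assumptions: the feature vectors, the Euclidean distance, the configuration keys and the function name nearest_configuration are hypothetical, and the statistics that Lily computes are not specified here.</p>
      <preformatted>
```python
import math


def nearest_configuration(features, training_records):
    """Return the stored configuration whose task features are closest.

    features is a tuple of statistics computed on the new ontologies.
    training_records is a list of (features, configuration) pairs saved
    after parameter optimisation on training datasets.
    """
    best = min(training_records,
               key=lambda record: math.dist(features, record[0]))
    return best[1]


# Hypothetical records: two feature values per task, for example the share
# of entities with labels and a normalised hierarchy depth.
records = [
    ((0.9, 0.1), {"threshold": 0.6, "use_propagation": False}),
    ((0.2, 0.8), {"threshold": 0.4, "use_propagation": True}),
]
chosen = nearest_configuration((0.25, 0.7), records)
```
      </preformatted>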
      <p>Currently, ontology matching tuning is not fully automatic. It is difficult
to find typical statistical parameters that distinguish each task from the
others. Meanwhile, learning from test datasets can be really time-consuming. Our
experiment is just a beginning.</p>
    </sec>
    <sec id="sec-4">
      <title>Adaptations made for the evaluation</title>
      <p>For the benchmark, anatomy and conference tasks, Lily is fully automatic, which
means that it can be invoked directly from the SEALS client. It also determines
which strategy to use and the corresponding parameters. For specific instance
matching tasks, Lily needs to be configured and started up manually, so only
matching results were submitted.</p>
    </sec>
    <sec id="sec-5">
      <title>Link to the system and parameters file</title>
      <p>The SEALS wrapped version of Lily for OAEI 2015 is available at
https://drive.google.com/file/d/0B4fqkE38d3QrS1Zta0pPSFpqXzA/view?usp=sharing.</p>
    </sec>
    <sec id="sec-6">
      <title>Link to the set of provided alignments</title>
      <p>The set of provided alignments, as well as overall performance, is available for
each track on the OAEI 2015 official website, http://oaei.ontologymatching.org/2015/.</p>
      <sec id="sec-6-1">
        <title>Results</title>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Benchmark track</title>
      <p>There are two datasets of different sizes: Biblio and energy. The former,
which is matched using Generic Ontology Matching, is generally small,
while the latter is so much larger that it has to be matched by Large scale Ontology
Matching.</p>
      <p>There are five groups of test suites in each dataset. Each test suite has 94
matching tasks. The overall results of one test suite are represented by the
mean values of Precision, Recall and F-Measure. The test suites were generated from
the same seed ontologies, so they are treated as equal. Thus, the harmonic
mean values over all test suites are used to evaluate how well Lily worked.</p>
      <p>The detailed results are shown in Table 1.</p>
      <p>As Table 1 shows, Lily handles the Benchmark datasets well at both small
and large scales, although the results for the energy dataset are slightly worse,
in exchange for better performance. According to the Benchmark results of
OAEI 2015<sup>1</sup>, Lily has the highest overall F-Measure among the 11 matching systems
that generated alignments for the Biblio dataset. However, the public results
show that Lily failed to produce alignments for the energy dataset. This is because
the energy dataset replaced the former IFC dataset, and the substitution
also changed the format of the ontology description files. Consequently,
Lily and some other systems were not able to parse the ontologies correctly. After
the issue was fixed, we evaluated Lily on the energy dataset alone with the SEALS client
and obtained the results.</p>
    </sec>
    <sec id="sec-8">
      <title>Anatomy track</title>
      <p>The anatomy matching task consists of two real large-scale biological ontologies.
Table 2 shows the performance of Lily in the Anatomy track on a server with
one 3.46 GHz, 6-core CPU and 8 GB of RAM allocated. Times are given in seconds
(s).</p>
      <p>
        Compared with the result in OAEI 2011 [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], there is a small improvement in
Precision, Recall and F-Measure, from 0.80, 0.72 and 0.76 to 0.87, 0.79 and 0.83,
respectively. One main reason for the improvement is that we found the names
of classes not to be semantically useful; they confused Lily when the similarity
matrix was calculated. After the names were excluded, better alignments were
generated. In addition, there is a significant reduction in time consumption, from
563 s to 266 s. This is not only the result of a faster CPU, but also because more
optimizations, such as parallelization, were applied to the algorithms in Lily.
      </p>
      <p><sup>1</sup> http://oaei.ontologymatching.org/2015/results/benchmarks/index.html</p>
      <p>However, as can be seen in the overall result, Lily ranks in the middle
of the field, which indicates that further progress is still possible.
Additionally, some key algorithms have not yet been successfully parallelized. Once that
is done, the time consumption is expected to be further reduced.</p>
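      <p>The Precision, Recall and F-Measure figures quoted for the Anatomy track can be checked with a short calculation. The sketch assumes that F-Measure denotes the harmonic mean of Precision and Recall (the F1 score), which is consistent with the reported values; the function name f_measure is illustrative.</p>
      <preformatted>
```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values quoted in the text for the Anatomy track.
anatomy_2011 = round(f_measure(0.80, 0.72), 2)
anatomy_2015 = round(f_measure(0.87, 0.79), 2)
```
      </preformatted>
      <p>Rounded to two decimals, the two results are 0.76 and 0.83, matching the F-Measure values reported for OAEI 2011 and OAEI 2015.</p>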
    </sec>
    <sec id="sec-9">
      <title>Conference track</title>
      <p>In this track, there are 7 independent ontologies that can be matched with one
another. The 21 subtasks are based on given reference alignments. Because the
ontologies are heterogeneous in character, it is a challenge to generate high-quality alignments
for all ontology pairs in this track.</p>
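      <p>The count of 21 subtasks follows from pairing each of the 7 ontologies with every other one, since 7 * 6 / 2 = 21. A minimal sketch, using placeholder names because the ontologies are not listed here:</p>
      <preformatted>
```python
from itertools import combinations

# Placeholder names; the real ontology names are not listed in this paper.
ontologies = [f"O{i}" for i in range(1, 8)]

# Every unordered pair of distinct ontologies is one matching subtask.
subtasks = list(combinations(ontologies, 2))
pair_count = len(subtasks)
```
      </preformatted>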
      <p>Lily adopted ontology matching tuning for the Conference track this year.
Table 3 shows its latest performance.</p>
      <p>
        Compared with the result in OAEI 2011 [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], there is a significant improvement
in mean Precision, Recall and F-Measure, from 0.36, 0.47 and 0.41 to 0.59, 0.53
and 0.56, respectively. Moreover, all the tasks share the same configuration, so it should be
possible to generate better alignments by assigning the most suitable parameters
to each task. We will continue to enhance this feature.
      </p>
      <p>
We submitted alignments for two tasks in the Instance Matching (IM) track of OAEI 2015: the Author
Disambiguation Task and the Author Recognition Task. For the other three tasks,
there is currently no specific strategy available, so Lily does not produce
alignments for them.
      </p>
      <p>For each task, there are two matching subtasks at different scales. The
sandbox scale is around 1,000 instances and was provided as the test dataset.
The mainbox scale is around 10,000 instances. The results are analyzed for
each task.</p>
      <p>Author Disambiguation Task. Lily used a different strategy for this task,
because we found several features of the dataset: an author's name in ontology A
usually contains the corresponding name in ontology B, and a slight difference
in one property may distinguish publications in the two ontologies. The result is
shown in Table 4.</p>
      <p>As can be seen in Table 4, the strategy is practical. Most correct matches
can be found with high precision in both the sandbox and mainbox subtasks.
According to the overall results, Lily scores highest in this task. However, there are still
some missing matches. After analyzing the reference alignments and the matched
ontologies, we found that some matched authors actually had no publications in
common, which accounts for many of the matches missed by Lily.</p>
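      <p>The name-containment observation used for the Author Disambiguation Task can be sketched as follows. This is a hedged illustration only: the token-based containment test, the function names and the sample authors are hypothetical, and the comparison of publication properties that the strategy also relies on is not shown.</p>
      <preformatted>
```python
def normalise(name):
    """Lower-case a name and split it into tokens without punctuation."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return cleaned.split()


def name_contains(longer, shorter):
    """True if every token of the shorter name occurs in the longer name."""
    longer_tokens = set(normalise(longer))
    shorter_tokens = normalise(shorter)
    return bool(shorter_tokens) and all(t in longer_tokens for t in shorter_tokens)


def candidate_matches(authors_a, authors_b):
    """Pair each author in A with the authors in B whose name it contains."""
    return {
        id_a: [id_b for id_b, name_b in authors_b.items()
               if name_contains(name_a, name_b)]
        for id_a, name_a in authors_a.items()
    }


# Hypothetical author records from two ontologies.
authors_a = {"a1": "Maria Elena Rossi", "a2": "Jun Li"}
authors_b = {"b1": "Rossi, Maria", "b2": "Li Wei"}
candidates = candidate_matches(authors_a, authors_b)
```
      </preformatted>
      <p>Containment alone only yields candidates; a second signal, such as shared publications, would still be needed to decide between authors with similar names.</p>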
      <p>Author Recognition Task. Quite different from the previous task, this task
requires computations over the source ontology, whose results are then matched
with the target ontology. Lily first follows the requirement to generate an
intermediate, statistical ontology from the source ontology. Then, the string
properties and numeric properties of that ontology and of the target ontology are
compared using different methods. Finally, all the similarities are combined. The
result is shown in Table 5.</p>
      <p>This year, many modifications were made to Lily for both effectiveness and
efficiency. The performance has improved as we expected, and the
strategies for new tasks have proved useful.</p>
      <p>On the whole, Lily is a comprehensive ontology matching system with the
ability to handle multiple types of ontology matching tasks, and its results
are generally competitive. However, Lily still lacks strategies for some newly
developed matching tasks. Its relatively high time and memory consumption
also prevents Lily from finishing some challenging tasks.</p>
      <sec id="sec-9-1">
        <title>Conclusion</title>
        <p>In this paper, we briefly introduced our ontology matching system Lily. The
matching process and the special techniques used by Lily were presented, and
the alignment results were carefully analyzed.</p>
        <p>There is still much to do to make further progress. Lily needs more
optimization to handle large ontologies within limited time and memory, so
techniques such as parallelization will be applied more widely. Also, we have only just begun to try out
ontology matching tuning. With further research on it, Lily will not only
produce better alignments for the tracks it was intended for, but also be able to
participate in the interactive track.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Peng</given-names>
            <surname>Wang</surname>
          </string-name>
          , Baowen Xu:
          <article-title>Lily: ontology alignment results for OAEI 2009</article-title>
          . In The 4th International Workshop on Ontology Matching, Washington DC, USA
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Peng</given-names>
            <surname>Wang</surname>
          </string-name>
          , Baowen Xu:
          <article-title>Lily: Ontology Alignment Results for OAEI 2008</article-title>
          . In The Third International Workshop on Ontology Matching, Karlsruhe, Germany (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Peng</given-names>
            <surname>Wang</surname>
          </string-name>
          , Baowen Xu:
          <article-title>LILY: the results for the ontology alignment contest OAEI 2007</article-title>
          . In The Second International Workshop on Ontology Matching (OM2007), Busan, Korea (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Peng</given-names>
            <surname>Wang</surname>
          </string-name>
          , Baowen Xu, Yuming Zhou:
          <article-title>Extracting Semantic Subgraphs to Capture the Real Meanings of Ontology Elements</article-title>
          .
          <source>Journal of Tsinghua Science and Technology</source>
          , vol.
          <volume>15</volume>
          (
          <issue>6</issue>
          ), pp.
          <fpage>724</fpage>
          -
          <lpage>733</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Peng</given-names>
            <surname>Wang</surname>
          </string-name>
          , Baowen Xu:
          <article-title>An Effective Similarity Propagation Model for Matching Ontologies without Sufficient or Regular Linguistic Information</article-title>
          ,
          <source>In The 4th Asian Semantic Web Conference (ASWC2009)</source>
          , Shanghai, China (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Peng</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <surname>Yuming Zhou</surname>
          </string-name>
          ,
          Baowen Xu:
          <article-title>Matching Large Ontologies Based on Reduction Anchors</article-title>
          . In The Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI 2011), Barcelona, Catalonia, Spain (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Peng</given-names>
            <surname>Wang</surname>
          </string-name>
          , Baowen Xu:
          <article-title>Debugging Ontology Mapping: A Static Approach</article-title>
          .
          <source>Computing and Informatics</source>
          , vol.
          <volume>27</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>21</fpage>
          -
          <lpage>36</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Peng</given-names>
            <surname>Wang</surname>
          </string-name>
          :
          <article-title>Lily results on SEALS platform for OAEI 2011</article-title>
          .
          <source>Proc. of 6th OM Workshop</source>
          , pp.
          <fpage>156</fpage>
          -
          <lpage>162</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Pan</given-names>
            <surname>Yang</surname>
          </string-name>
          , Peng Wang, Li Ji, Xingyu Chen, Kai Huang, Bin Yu:
          <article-title>Ontology Matching Tuning Based on Particle Swarm Optimization: Preliminary Results</article-title>
          . In
          <source>The Semantic Web and Web Science</source>
          , pp.
          <fpage>146</fpage>
          -
          <lpage>155</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>