<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Building up Ontologies with Property Axioms from Wikipedia</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tokio Kawakami</string-name>
          <email>kawakami0412@keio.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Takeshi Morita</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Takahira Yamaguchi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Keio University</institution>
          ,
          <addr-line>3-14-1 Hiyoshi, Kohoku-ku, Yokohama, Kanagawa 223-8522</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Wikipedia has recently been drawing attention as a semi-structured information resource for the automatic construction of ontologies. We previously proposed a method of constructing a general-purpose ontology by automatically extracting is-a relations (rdfs:subClassOf), class-instance relations (rdf:type), and property relations and types. This paper discusses methods for automatically constructing ontologies with numerous property relations and types from Wikipedia. The ontologies include is-a relations, class-instance relations, and property relations and types. The property relations are triples, property domains (rdfs:domain), property ranges (rdfs:range), and property hypernymy-hyponymy relations (rdfs:subPropertyOf). The property types are object (owl:ObjectProperty), data (owl:DatatypeProperty), symmetric (owl:SymmetricProperty), transitive (owl:TransitiveProperty), functional (owl:FunctionalProperty), and inverse functional (owl:InverseFunctionalProperty).</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology Learning</kwd>
        <kwd>Property Axioms</kwd>
        <kwd>Wikipedia</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Large-scale ontologies are effective for information search and
data integration. Popular ontologies include WordNet [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and Cyc [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
However, constructing these ontologies manually is expensive. Moreover, manual
ontology engineering introduces numerous errors, and maintaining and updating
the result is challenging. Therefore, ontology learning, that is, the automatic
or semi-automatic construction of ontologies, has been attracting increasing attention.
      </p>
      <p>Wikipedia, the web-based open encyclopedia, is increasingly popular as
a new information resource. Because Wikipedia has a rich vocabulary, is reasonably
easy to update, and is semi-structured, the gap between Wikipedia and an ontology
is smaller than the gap between free text and an ontology. Thus, ontology learning
from Wikipedia is becoming popular.</p>
      <p>
        We previously proposed a large-scale, general-purpose ontology learning method
that extracts is-a relations (rdfs:subClassOf), class-instance relations (rdf:type), and
property relations and types by using Wikipedia as the resource [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. However,
certain challenges remain. For example, we did not define property domains,
property ranges, or property types. Therefore, in this paper we extract properties of
the following types (object, data, symmetric, transitive, functional, and
inverse functional) and property relations (triples, property domains, and property
ranges) by adding several techniques to the existing method, and we construct a
large-scale, general-purpose ontology that includes numerous properties.
      </p>
      <p>This paper is structured as follows. Section 2 introduces related work on deriving
ontologies from Wikipedia. Section 3 explains the definition of
properties as described in our previous research and details the extraction
techniques applied to Wikipedia. Section 4 presents the results of the experiment
in which we applied the extraction techniques to Wikipedia. Finally, Section 5 presents
our conclusions and future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Auer et al.'s DBpedia [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] constructed a large information database by extracting
RDF from the semi-structured information resources of Wikipedia. They used
information resources such as Infoboxes, external links, and the categories to which
an article belongs. However, the properties and classes are constructed manually, and there are 170
classes and 720 properties. We use not only Infoboxes and Wikipedia categories
but also text information such as list structures, list articles, and definition texts,
and we extract relations automatically. Our work also differs in that it addresses the
automatic extraction of property axioms.
      </p>
      <p>
        Fabian et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] proposed YAGO, which enhanced WordNet by using
Conceptual Categories. A Conceptual Category is a category of the English Wikipedia
whose head is a plural noun, such as "American singers of
German origin". They defined each Conceptual Category as a class and defined the
articles that belong to the Conceptual Category as instances of the class. This
method permits every article that has categories to be set as an instance. However,
because the information in the main text is not used, the method cannot extract
instances when the article is absent or when the information appears in the
main text but is not reflected in the categories.
      </p>
      <p>
        YAGO2 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] aimed to expand the YAGO knowledge base further,
both through the existing linkage between WordNet and
Wikipedia categories and by extracting spatiotemporal information from
Wikipedia and GeoNames. Moreover, in the extended version, YAGO3 [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
Wikipedia in languages other than English is used as well as the English Wikipedia
to permit multilingual expansion. YAGO2 focused on nonhierarchical relations and constructed
an advanced ontology that incorporates spatiotemporal relations in addition to
hierarchical relations. However, it does not utilize Wikipedia's unique
structural aspects, such as information in the main text, definition statements, and
Wikipedia lists.
      </p>
      <p>
        Sanchez [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] presented an approach that exploits the Web to extract
evidence supporting a particular property of a given relationship. Specifically,
the axioms studied in that work are symmetric, reflexive, functional, transitive,
inverse, and inverse functional properties. The approach does not extract property domains
and ranges, and it does not use the individual property types to refine one another.
      </p>
      <p>
        Fleischhacker et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] proposed a method of creating an RDF dataset with
property axioms. They extracted property subsumption, disjointness,
transitivity, domain, range, symmetry, asymmetry, inverse, functionality, inverse
functionality, reflexivity, and irreflexivity, and they used DBpedia to evaluate their
approach. Their method does not use the individual property types to refine one another either.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Property Definition for Our Wikipedia Ontology</title>
      <sec id="sec-3-1">
        <title>Extraction of Property Names</title>
        <sec id="sec-3-1-1">
          <p>We extract properties by using the following two methods.</p>
          <p>
            Extraction by Scraping Infobox. This method was proposed in our previous
work [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ]. We extract each article-item-value triple in an Infobox as an
"instance-property-property value" triple. When the triples are extracted directly
from the dump data, the meaning of the properties is often difficult to
understand at a glance, and equivalent properties may not be integrated.
Therefore, we convert the Infobox properties into HTML by using the Java Wikipedia API
(Bliki engine)<sup>1</sup>. This makes the meaning of the properties understandable at a
glance and allows the properties to be integrated.
          </p>
          <p>Extraction by Scraping List Structure. A large number of Wikipedia
articles have list structures. This method extracts triples from list structures as
"article name - heading name - each value". In doing so, we examine the
categories to which each article belongs and count how often each heading name
appears in each category. This enables us to extract the category to which the
article belongs as a property domain. The procedure consists of the following four steps.
            <list list-type="order">
              <list-item><p>Extract the categories and heading names from each article of the Wikipedia dump data.</p></list-item>
              <list-item><p>Compute the occurrence rate of each heading name within each category by using the result of step 1.</p></list-item>
              <list-item><p>Remove the heading names that have a low occurrence rate according to step 2.</p></list-item>
              <list-item><p>Extract each remaining heading name as a property and each value of the list structure as a value of that property, for each article.</p></list-item>
            </list>
          </p>
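          <p>The filtering in steps 2 and 3 can be sketched as follows. This is a minimal Python illustration, not the implementation used in this work: the function name frequent_headings, the reading of "occurrence rate" as the fraction of a category's articles that contain a given heading, and the threshold min_rate are all assumptions made for the example.</p>
          <preformat preformat-type="code">
```python
from collections import Counter, defaultdict


def frequent_headings(articles, min_rate=0.1):
    """Keep, per category, the heading names whose occurrence rate is high enough.

    articles: iterable of (categories, headings) pairs, one pair per article.
    min_rate: hypothetical threshold; the paper does not state the value it used.
    """
    articles_per_category = Counter()
    heading_counts = defaultdict(Counter)
    for categories, headings in articles:
        for category in set(categories):
            articles_per_category[category] += 1
            # Count each heading at most once per article.
            for heading in set(headings):
                heading_counts[category][heading] += 1

    kept = {}
    for category, counts in heading_counts.items():
        total = articles_per_category[category]
        kept[category] = {h for h, c in counts.items() if c / total >= min_rate}
    return kept
```
          </preformat>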
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Extraction of Property Domain</title>
        <p>
          The subject of each triple discussed in Section 3.1 is an article name, which serves as an instance.
Therefore, examining the categories to which the subject article belongs enables
us to define the domain of a property. First, the Infobox template name is
extracted as a domain of each property in the Infobox. Furthermore, we extract the
Wikipedia categories to which the article using the Infobox template belongs as
domains. Moreover, for articles for which no Infobox is defined, the hypernym
that is extracted by "Extracting hypernymy-hyponymy from textual definition"
in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] is extracted as a domain. We then collate the domains extracted so far
with the class hierarchy extracted in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and keep only the domains that appear as
classes in the class hierarchy. As a result, we can eliminate domains
that are not appropriate as classes. Finally, we remove the domains with a low occurrence
count (five or fewer) and extract the remaining ones as property domains.
          <fn id="fn1"><label>1</label><p>http://code.google.com/p/gwtwiki</p></fn>
        </p>
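        <p>The last two steps (keeping only domains that are classes in the hierarchy, then dropping rarely observed domains) can be illustrated with the short Python sketch below. The function name property_domains, its inputs, and the way candidate domains are counted per triple are assumptions for illustration only; the cut-off follows the "five or fewer" rule stated above.</p>
        <preformat preformat-type="code">
```python
from collections import Counter, defaultdict


def property_domains(triples, classes_of, known_classes, min_count=6):
    """Return {property: set of domain classes}.

    triples:       iterable of (subject, property, value).
    classes_of:    hypothetical map from a subject article to its candidate
                   domains (Infobox template name, categories, hypernym).
    known_classes: classes present in the extracted class hierarchy.
    min_count:     domains seen five times or fewer are removed.
    """
    counts = defaultdict(Counter)
    for subject, prop, _value in triples:
        for cls in classes_of.get(subject, ()):
            if cls in known_classes:  # drop candidates that are not classes
                counts[prop][cls] += 1
    return {
        prop: {cls for cls, n in counter.items() if n >= min_count}
        for prop, counter in counts.items()
    }
```
        </preformat>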
      </sec>
      <sec id="sec-3-3">
        <title>Extraction of Property Range</title>
        <p>Defining a property domain is comparatively easy because the
instance name that is the subject of a property triple corresponds to an article name,
and the Infobox template name of that article can be considered
to be a property domain. However, with regard to property ranges, the instance
name that is the object of a property does not necessarily correspond to an article name; therefore, it
is challenging to define the property ranges of all properties in the same way as the property
domains. Thus, for property ranges, we use the following two methods:</p>
        <sec id="sec-3-3-1">
          <list list-type="order">
            <list-item><p>Extraction using class-instance relations</p></list-item>
            <list-item><p>Extraction using is-a relations</p></list-item>
          </list>
          <p>If a word already has an article in Wikipedia, it is generally linked,
and the article name corresponds to an instance name. Therefore, we match the
object of a triple (an instance) against the instances of the class-instance relations. Then, we
extract the classes to which the matched instance belongs as property ranges.</p>
          <p>
            Next, in order to extract property ranges that cannot be extracted by the
previous technique, we match the categories of the article whose name is
identical to the object of a triple against the classes of the is-a relations
in [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ]. Then, we extract the matched classes as property ranges.
          </p>
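          <p>The two range-extraction steps can be sketched in Python as follows. This is an illustrative reading of the procedure rather than the original implementation: the function name property_ranges and the dictionary inputs are hypothetical, and the fallback to categories is applied only when the class-instance lookup yields nothing.</p>
          <preformat preformat-type="code">
```python
def property_ranges(triples, instance_classes, article_categories, isa_classes):
    """Return {property: set of range classes}.

    instance_classes:   hypothetical map from an instance name to its classes
                        (the class-instance relations).
    article_categories: hypothetical map from an article name to its categories.
    isa_classes:        classes that occur in the is-a relations.
    """
    ranges = {}
    for _subject, prop, obj in triples:
        # Method 1: the object is a known instance, so use its classes.
        found = set(instance_classes.get(obj, ()))
        if not found:
            # Method 2: use the categories of the article with the same name,
            # keeping only those that are classes in the is-a relations.
            found = {c for c in article_categories.get(obj, ()) if c in isa_classes}
        ranges.setdefault(prop, set()).update(found)
    return ranges
```
          </preformat>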
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>Extraction of Property Hypernymy-Hyponymy Relations</title>
        <p>We also extract property hypernymy-hyponymy relations (rdfs:subPropertyOf) by using the
following two methods.</p>
        <p>
          Extracting by String Matching. We extract property hypernymy-hyponymy
relations by using the "backward string matching" technique in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. The procedure consists of the following four steps.
          <list list-type="order">
            <list-item><p>When a property A has the form "any string + property B", we extract the pair as the candidate "property A rdfs:subPropertyOf property B".</p></list-item>
            <list-item><p>When property A has the form "any string + preposition + any string + property B", we eliminate the candidate.</p></list-item>
            <list-item><p>The domains of properties A and B, and likewise their ranges, must be identical or close to each other in the class hierarchy. We filter the candidates by using this condition.</p></list-item>
            <list-item><p>We extract the remaining candidates as property hypernymy-hyponymy relations.</p></list-item>
          </list>
        </p>
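        <p>Steps 1 and 2 of the string-matching procedure can be illustrated with the following Python sketch. It reads the rule as "the longer name is the subproperty of the name that forms its suffix", as in the "Domestic partner" / "Partner" example. The function name, the case-insensitive matching on whole words, and the small preposition list are assumptions for illustration; the domain and range filter of step 3 is not included.</p>
        <preformat preformat-type="code">
```python
# Hypothetical preposition list; the paper does not enumerate the words it uses.
PREPOSITIONS = {"of", "in", "by", "for", "to", "with", "on", "at", "from"}


def subproperty_candidates(property_names):
    """Return a set of (sub_property, super_property) name pairs."""
    by_key = {name.lower(): name for name in property_names}
    pairs = set()
    for name in property_names:
        words = name.lower().split()
        # Try every proper suffix of the name, word by word.
        for i in range(1, len(words)):
            prefix, suffix = words[:i], " ".join(words[i:])
            if suffix not in by_key:
                continue
            # Step 2: discard "any string + preposition + any string + B".
            if any(word in PREPOSITIONS for word in prefix):
                continue
            pairs.add((name, by_key[suffix]))
    return pairs
```
        </preformat>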
        <p>Figure 1 shows an example. In the figure, the properties "Partner" and
"Domestic partner" are matched, and their domains and ranges are identical.
Therefore, we extract them as a property hypernymy-hyponymy relation.</p>
        <p>Extraction Using Triples from List Structures. For each subject (instance) of the
triples, we match the values of the properties extracted by scraping the Infobox
against the values of the properties extracted by scraping the list structures
in Wikipedia articles. When at
least one value matches, we extract the list-structure property and the
Infobox property as a candidate pair. Next, we compare
the number of instances of the two properties; the property with more instances is extracted as the
hypernym, whereas the one with fewer instances is extracted as the hyponym. Finally,
we match the domains and ranges of the candidate properties. When
at least one domain and one range match, we extract the pair as a property
hypernymy-hyponymy relation.</p>
        <p>Figure 2 shows an example. In this case, the subject "Nevada
County, California" has the property "Communities" (extracted by
scraping a list structure) with the value "Nevada City" and the property "Largest
city" (extracted by scraping the Infobox) with the value "Nevada City".
Thus, we obtain "Communities" as the upper property candidate and "Largest
city" as the lower property candidate. When we then match the domains
and ranges of these properties, we find that they have the same
domain and the same range. Thus, we extract "Communities - Largest city" as a
property hypernymy-hyponymy relation.</p>
      </sec>
      <sec id="sec-3-5">
        <title>Estimation of Property Types</title>
        <p>The properties extracted by scraping the Infobox have already been classified as object type or
data type. In this method, in addition to the object and data types, we
estimate the symmetric, transitive, functional, and inverse functional property types
by using the triples whose extraction is described in Section 3.1.</p>
        <p>[Figure 2: example of extraction using triples from list structures. Labels in the figure: Nevada County, California; Communities; Largest city; Nevada City; Instance; Municipalities; City; rdfs:subPropertyOf.]</p>
        <p>Extraction of Symmetric Property. First, we estimate the
symmetric properties. For a subject X and a value Y of a property P,
if the triple "Y P X" also exists, we extract the property P as a symmetric
property candidate. In addition, for each candidate we compute the ratio of
the triples extracted as symmetric relation candidates to all the triples of P.</p>
        <p>Extraction of Transitive Property. Next, we estimate the transitive
properties. For a subject X, a value Y of X, and a value Z of the subject Y under
the same property P, if the triple "X P Z" also exists, we extract the property P as
a transitive property candidate. In addition, we compute the ratio of
the triples extracted as transitive relation candidates to all the triples of P.</p>
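        <p>The symmetric and transitive checks, together with the ratio of supporting triples, can be sketched in Python as below. The function names and the exact counting of "supporting" triples are assumptions made for illustration, not the implementation used in this work.</p>
        <preformat preformat-type="code">
```python
from collections import defaultdict


def _pairs_by_property(triples):
    """Group (subject, value) pairs by property name."""
    grouped = defaultdict(set)
    for subject, prop, value in triples:
        grouped[prop].add((subject, value))
    return grouped


def symmetric_ratio(triples):
    """Return {P: share of P's triples 'X P Y' for which 'Y P X' also exists}."""
    result = {}
    for prop, pairs in _pairs_by_property(triples).items():
        hits = sum(1 for x, y in pairs if x != y and (y, x) in pairs)
        if hits:
            result[prop] = hits / len(pairs)
    return result


def transitive_ratio(triples):
    """Return {P: share of P's triples that take part in an X-Y-Z chain
    for which the triple 'X P Z' also exists}."""
    result = {}
    for prop, pairs in _pairs_by_property(triples).items():
        successors = defaultdict(set)
        for x, y in pairs:
            successors[x].add(y)
        supporting = set()
        for x, y in pairs:
            for z in successors.get(y, ()):
                # Require three distinct instances to avoid trivial loops.
                if len({x, y, z}) == 3 and (x, z) in pairs:
                    supporting.update([(x, y), (y, z), (x, z)])
        if supporting:
            result[prop] = len(supporting) / len(pairs)
    return result
```
        </preformat>
        <p>A threshold on the returned ratio can then be used to select candidates.</p>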
        <p>Extraction of Functional Property. Next, we estimate the
functional properties. For a subject X and a value Y of a property P, if P
has only one value Y for each instance X, we extract the property P as a functional
property.</p>
      </sec>
      <sec id="sec-3-6">
        <title>Extraction of Inverse Functional Property</title>
        <p>Moreover, we estimate the inverse functional properties. For a subject X and a value Y of
a property P, if P has only one instance X for each value Y, we extract the
property P as an inverse functional property.</p>
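        <p>The functional and inverse functional checks are mirror images of each other, which the following Python sketch makes explicit. The function names are hypothetical, and the sketch applies the conditions literally: a property is functional if every subject has exactly one value, and inverse functional if every value has exactly one subject.</p>
        <preformat preformat-type="code">
```python
from collections import defaultdict


def functional_properties(triples):
    """Properties for which every subject has exactly one distinct value."""
    values = defaultdict(lambda: defaultdict(set))
    for subject, prop, value in triples:
        values[prop][subject].add(value)
    return {
        prop
        for prop, by_subject in values.items()
        if all(len(vals) == 1 for vals in by_subject.values())
    }


def inverse_functional_properties(triples):
    """Properties for which every value has exactly one distinct subject.

    Implemented by swapping subject and value and reusing the functional test.
    """
    return functional_properties((value, prop, subject) for subject, prop, value in triples)
```
        </preformat>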
      </sec>
      <sec id="sec-3-7">
        <title>Extraction Using Property Hypernymy-Hyponymy Relations</title>
        <p>In addition to extraction by the above four methods, we estimate property types by
using the property hypernymy-hyponymy relations: when a property has been assigned
types, we assign the same types to its subproperties.</p>
      </sec>
      <sec id="sec-3-8">
        <title>Refining Extracted Relations</title>
        <p>In this section, we propose methods for refining the extracted property domains,
property ranges, symmetric properties, transitive properties, functional properties, and
inverse functional properties. Here, we attempt to improve the accuracy by mutually
using the relations extracted by each method.</p>
        <p>Refinement of Property Domain and Range. With the methods of Sections
3.2 and 3.3, multiple domains and ranges are linked to a property. For example,
the property "BornPlace" is linked to the property domains "People" and "Japanese
people". "Japanese people" is a subclass of "People" and can be regarded as
a redundant domain. Therefore, we remove such redundant domains
and ranges. Using the is-a relations already extracted, if the superclass
of an extracted domain is also extracted as a domain and the superclass has a high
inclusion rate, the subclass is removed.</p>
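        <p>A minimal Python sketch of this pruning rule is given below. It assumes a single-parent class hierarchy (parent_of), a precomputed inclusion rate per domain class, and a hypothetical threshold min_rate; none of these names or values come from the original implementation.</p>
        <preformat preformat-type="code">
```python
def refine_domains(domains, parent_of, inclusion_rate, min_rate=0.5):
    """Drop a domain when one of its superclasses is also a domain
    and that superclass has a high inclusion rate.

    domains:        set of classes extracted as domains of one property.
    parent_of:      hypothetical map from a class to its direct superclass.
    inclusion_rate: hypothetical map from a class to its inclusion rate.
    """
    kept = set()
    for cls in domains:
        redundant = False
        seen = set()
        ancestor = parent_of.get(cls)
        # Walk up the hierarchy; 'seen' guards against cycles in noisy data.
        while ancestor is not None and ancestor not in seen:
            if ancestor in domains and inclusion_rate.get(ancestor, 0.0) >= min_rate:
                redundant = True
                break
            seen.add(ancestor)
            ancestor = parent_of.get(ancestor)
        if not redundant:
            kept.add(cls)
    return kept
```
        </preformat>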
      </sec>
      <sec id="sec-3-9">
        <title>Refinement of Symmetric Property and Transitive Property</title>
        <p>We refine the symmetric and transitive properties extracted earlier. For a symmetric or
transitive property, the domain and range of the property must match.
Therefore, we can filter the candidates by using the property domains and property ranges extracted
earlier.</p>
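        <p>As an illustration, the filter can be written in a few lines of Python. Here "match" is read as "the property has at least one class in common between its domains and its ranges", which is an assumption; the function and parameter names are likewise hypothetical.</p>
        <preformat preformat-type="code">
```python
def filter_by_domain_range(candidates, domains, ranges):
    """Keep candidate properties whose domains and ranges share a class.

    domains, ranges: maps from a property name to a set of classes.
    """
    return {
        prop
        for prop in candidates
        if not domains.get(prop, set()).isdisjoint(ranges.get(prop, set()))
    }
```
        </preformat>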
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental Results and Observations</title>
      <p>We now discuss the results obtained from the Wikipedia dump
data as of January 2017.</p>
      <sec id="sec-4-1">
        <title>Results and Observations of Property Names</title>
      </sec>
      <sec id="sec-4-2">
        <title>Results and Observations by Scraping Infobox</title>
        <p>From the Wikipedia dump
data, we extracted 8,311,427 infoboxes, 12,309 infobox templates, and 22,767,071
triples. The number of property names was 12,088.</p>
        <p>According to the property type estimation, approximately 60% of the
properties were classified as either owl:ObjectProperty or owl:DatatypeProperty. The
remaining 40% were classified as Unknown because the classification
rules were insufficient. In the future, we need to take into consideration
that property values such as "small" can be literals. We extracted 1,000 samples
from all the triples to estimate the precision. The precision was 93.1 ± 1.57%.</p>
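        <p>Here, we used expression (1) to compute the 95% confidence
interval. In formula (1), n represents the number of samples, N represents
the population size, and p̂ represents the estimate, which is the number of correct
samples divided by the total number of samples.</p>
        <p>
          <disp-formula id="eq-1">
            <label>(1)</label>
            <tex-math><![CDATA[\left[\hat{p} - 1.96\sqrt{\left(1-\frac{n}{N}\right)\frac{\hat{p}(1-\hat{p})}{n-1}},\; \hat{p} + 1.96\sqrt{\left(1-\frac{n}{N}\right)\frac{\hat{p}(1-\hat{p})}{n-1}}\right]]]></tex-math>
          </disp-formula>
        </p>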
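        <p>The margins reported in this section can be reproduced approximately with the short Python sketch below. It assumes the usual normal-approximation interval with a finite population correction, using n, N, and p̂ as defined above; the function name and the exact form of the correction are assumptions, so the sketch should be read as a plausibility check rather than as the authors' code.</p>
        <preformat preformat-type="code">
```python
import math


def confidence_half_width(p_hat, n, N, z=1.96):
    """Half-width of the confidence interval around p_hat.

    p_hat: estimated precision (correct samples / all samples).
    n:     number of samples.
    N:     population size.
    z:     1.96 corresponds to a 95% interval.
    """
    finite_population = 1 - n / N
    return z * math.sqrt(finite_population * p_hat * (1 - p_hat) / (n - 1))


# Infobox triples: 1,000 samples out of 22,767,071 triples, precision 93.1%.
margin = confidence_half_width(0.931, 1000, 22767071)
print(round(100 * margin, 2))
```
        </preformat>
        <p>With these inputs the sketch yields a margin of about 1.57 percentage points, in line with the value reported for the Infobox triples.</p>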
      </sec>
      <sec id="sec-4-3">
        <title>Results and Observations by Scraping List Structure</title>
        <p>From the Wikipedia
dump data, we extracted 7,410,819 triples by scraping the list structures. The
number of property names was 2,394. This method can extract properties that cannot be
extracted by scraping the Infobox, and these properties have numerous triples.</p>
        <p>We extracted 1,000 samples from all the triples to estimate the precision. The
precision was 81.6 ± 2.40%. A number of the errors were scraping errors, which
occurred when a large quantity of information was described in each
line of a list structure. For example, the property "Track listings" can be
observed in numerous Wikipedia articles on albums or singles; however,
the list structures in these articles generally describe information such
as Writers, Released Date, and Length in addition to the track listings. Thus,
Writers, Released Date, and Length were extracted as values of the property
"Track listings". We consider it feasible to remove these errors by adding more
detailed rules for list structures according to the category to which the articles belong.</p>
      </sec>
      <sec id="sec-4-4">
        <title>Results and Observations of Property Domain</title>
        <p>We were able to extract 500,967 property domains for the 12,088 properties
extracted by scraping the Infobox. These properties have the Infobox template
as a domain; thus, we could define the property domain of all of them. In
addition, we defined the property domains of 100 properties extracted by scraping the list
structures in the Wikipedia articles. In total, the property domains of 12,188 properties
were defined. Thus, 84.2% of the 14,482 extracted properties (12,088 from Infoboxes and 2,394 from list structures) have property domains.</p>
        <p>We used the property domains defined in the DBpedia Ontology for the
evaluation. The DBpedia Ontology defines properties, classes, and property domains
manually; therefore, it contains numerous upper classes. In contrast, our ontology uses
Wikipedia categories and similar resources; therefore, it contains numerous lower classes.
Consequently, because the two do not match in a straightforward comparison, we regarded
an extracted property domain as correct if it is included in
the property domain of the DBpedia Ontology. In addition, the notation of our
properties differs from that of DBpedia's properties in certain cases. To achieve
consistency, we used the Infobox templates to compare our properties with the Infobox items
before they are mapped to the DBpedia Ontology.</p>
        <p>The precision was 94.9 ± 1.93%. Table 1 presents examples of property
domains. When a property has multiple property domains, we should integrate them into a
common upper class. For example, the property "Staff" has "TV
program" and "TV drama" as property domains, and it has a number of other classes as
property domains (e.g., "Radio program", "Baseball team").
Therefore, we attempt to integrate such domains into a common upper class in Section
3.6.</p>
      </sec>
      <sec id="sec-4-5">
        <title>Results and Observations of Property Range</title>
        <p>We were able to extract 78,117 property ranges for the 12,088 extracted properties. As with
the property domains, we evaluated the property ranges by using the DBpedia Ontology.
The precision was 76.9 ± 3.68%. Table 2 presents examples of property ranges.</p>
        <p>As an example of an error, "Native dress" was extracted as a property range of the property
"nationality". This error is caused by an incorrect class-instance
relation, and numerous errors of this method have the same cause.
Therefore, we consider that the precision of this method will increase if the
precision of the class-instance relations is improved. Furthermore, "Article on
mathematics" was extracted as a range of the properties "victory Frequency" and
"numberOfAccommodations". This error is caused by an incorrect classification between
owl:ObjectProperty and owl:DatatypeProperty. The type of such properties should
generally be owl:DatatypeProperty, and thus the range should be a literal (rdfs:Literal).</p>
      </sec>
      <sec id="sec-4-6">
        <title>Results and Observations of Property Hypernymy-Hyponymy Relations</title>
      </sec>
      <sec id="sec-4-7">
        <p>Results and Observations by String Matching. We extracted 1,297 property
hypernymy-hyponymy relations. We extracted 500 samples from all the relations
to estimate the precision. The precision was 89.0 ± 2.36%. As examples of
errors, "Movement - Tradition or movement" and "Period - Time period" were
extracted. We can prevent the former error, "Movement - Tradition or movement",
by adding rules for removing unnecessary relations after string matching. The
latter is an example in which properties with identical meanings are extracted
as hypernym and hyponym. In this case, it is feasible to assess whether they
have identical meanings by using external resources.</p>
        <p>Results and Observations by Using Triples from List Structures. We
were able to extract 242 property hypernymy-hyponymy relations by using triples from
the list structures. We manually judged whether each of the 242 relations was correct.
The precision was 19.4%. Among the errors, numerous upper
properties and lower properties had identical meanings. Furthermore,
numerous upper properties and lower properties were reversed. Because the number of triples
extracted from the Infobox is exceedingly large, the property extracted from the
Infobox tends to become the upper property and the property extracted from the heading
the lower property, even though we compared the properties by their total number of property values.</p>
      </sec>
      <sec id="sec-4-8">
        <title>Results and Observations of Property Types</title>
        <p>We estimated the property types by using the methods described in Section 3.5 with the
12,088 extracted properties and 22,767,071 extracted triples.</p>
        <p>First, we estimated the symmetric properties. We were able to extract 13,371
symmetric relation triples and 89 symmetric property candidates. We manually judged
whether each of the 89 properties was correct. The precision was
51.7%. Moreover, when we apply a threshold on the inclusion rate, the
precision is 92.3% (threshold 3%).</p>
        <p>Many of the properties extracted by this method include words such as
"Related" (e.g., "Related case", "Related index"). However, this method could
also extract properties such as "Former partner" and "Sister station".</p>
        <p>Secondly, we estimated the transitive properties. We were able
to extract 48,035 transitive relation triples and 165 transitive property candidates.
Here, we excluded the candidates that can be regarded as symmetric,
because they are errors. As a result, we obtained 141
transitive property candidates excluding symmetric properties. We manually judged
whether each of the 141 properties was correct. The precision
was 19.6%.</p>
        <p>We consider the coverage problem in Wikipedia to be a reason for such
low accuracy. This method requires at
least three triples that together form a transitive relation. Thus, the information
must be covered by Infoboxes or list structures under an identical property
name in the Wikipedia articles; however, such covered information is scarce.
To extract transitive properties with high accuracy, we have to refine the
property names, integrate identical ones, and devise another extraction method
that uses part of the unstructured information in articles.</p>
        <p>Thirdly, we estimated the functional properties. We were able to extract
313,935 functional relation triples and 2,026 functional property candidates. We
extracted 500 samples from all the triples to estimate the precision. The precision
was 64.0 ± 3.59%.</p>
        <p>The most common errors occur when the number of triples is small and properties
that happen to have uniquely determined values are extracted. Moreover,
because the information in the Infobox is incomplete, some properties were erroneously
extracted as functional properties. We consider that errors owing to insufficient information
can be prevented by using information such as the Wikipedia text.</p>
        <p>Fourthly, we estimated the inverse functional properties. We were able
to extract 10,187 inverse functional relation triples and 526 inverse functional property
candidates. We extracted 200 samples from all the triples to estimate the
precision. The precision was 24.0 ± 4.03%.</p>
        <p>There are problems of notational variants and synonyms among property names, such
as "Main work" and "Notable work". Therefore, we have to refine
the property names and integrate identical ones to extract inverse functional
properties with high precision. Moreover, there were numerous errors owing to
mistakes in scraping the Infobox. Therefore, it is necessary to refine the
triple extraction method to improve accuracy.</p>
        <p>Finally, we attempted extraction by using the property hypernymy-hyponymy
relations. We were able to extract 4 symmetric properties, 129 transitive properties, 195
functional properties, and 108 inverse functional properties. We manually judged
whether each property was correct. The precision was 75%, 7.8%,
68.7%, and 21.3%, respectively. To increase the accuracy and extract
more relations, it is necessary to reconsider the estimation method for property
types and the extraction method for subproperties.</p>
      </sec>
      <sec id="sec-4-9">
        <title>Results and Observations of Refining Extracted Relations</title>
      </sec>
      <sec id="sec-4-10">
        <title>Results and Observations of Refinement of Property Domain and Range</title>
        <p>We refined the extracted domains and ranges. As a result, the average and variance
of the number of domains and ranges per property changed, as shown in Table 3. Even
after refinement, the number of domains and ranges per property is large and
varies depending on the property. This is because the
class hierarchy is incomplete, and it is necessary to refine the
class hierarchy in the future.</p>
      </sec>
      <sec id="sec-4-11">
        <title>Results and Observations of Refinement of Symmetric Property and Transitive Property</title>
        <p>We refined the extracted symmetric properties and
transitive properties. As a result, the number of extracted relations and the accuracy
changed, as shown in Table 4. Although the precision increased marginally
for the transitive properties, it decreased marginally
for the symmetric properties. The reasons for this are
insufficient refinement and insufficient extraction of domains and ranges. In the
future, it is necessary to unify classes and improve the refinement methods for
domains and ranges.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>In this paper, we proposed and evaluated a method of constructing a
large-scale, general-purpose ontology with property axioms by using Wikipedia as
the resource. Through this study, we demonstrated that Wikipedia is a valuable
resource for ontology learning of property axioms. It is feasible to extract
property domains, property ranges, property hypernymy-hyponymy relations, and so on
from Wikipedia. A few of the proposed methods can also be applied to
DBpedia. These methods can be considered effective for constructing large-scale
ontologies at low cost in ontology learning.</p>
      <p>
        In the future, we plan to improve the precision of each method. Moreover,
we plan to integrate the properties and unify the classes extracted in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] by using a upper
ontology such as WordNet. Also, we intend to provide our Wikipedia Ontology.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          :
          <article-title>WordNet: A Lexical Database for English</article-title>
          .
          <source>Communications of the ACM</source>
          Vol.
          <volume>38</volume>
          , No.
          <issue>11</issue>
          :
          <fpage>39</fpage>
          -
          <lpage>41</lpage>
          . (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Lenat</surname>
            ,
            <given-names>D.B.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Guha</surname>
            ,
            <given-names>R.V.</given-names>
          </string-name>
          :
          <article-title>Building Large Knowledge-Based Systems</article-title>
          .
          <source>Addison-Wesley</source>
          (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Kawakami</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morita</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yamaguchi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Building Wikipedia Ontology with More Semi-structured Information Resources</article-title>
          . In: JIST 2017. LNCS, vol.
          <volume>10675</volume>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>18</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kobilarov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cyganiak</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ives</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>DBpedia: A Nucleus for a Web of Open Data</article-title>
          .
          <source>Lecture Notes in Computer Science</source>
          , Springer Berlin / Heidelberg, pp.
          <fpage>722</fpage>
          -
          <lpage>735</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Suchanek</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kasneci</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weikum</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>YAGO: A Large Ontology from Wikipedia and WordNet</article-title>
          .
          <source>Elsevier Journal of Web Semantics</source>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Hoffart</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suchanek</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berberich</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weikum</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia</article-title>
          .
          <source>Research Report MPI-I-2010-5-007, Max-Planck-Institut für Informatik</source>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Mahdisoltani</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Biega</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suchanek</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          :
          <article-title>YAGO3: A Knowledge Base from Multilingual Wikipedias</article-title>
          . In: CIDR (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Sanchez</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moreno</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Del Vasto-Terrientes</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Learning relation axioms from text: An automatic web-based approach</article-title>
          .
          <source>Expert Systems with Applications</source>
          <volume>39</volume>
          ,
          <fpage>5792</fpage>
          -
          <lpage>5805</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Fleischhacker</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , Volker, J.,
          <string-name>
            <surname>Stuckenschmidt</surname>
          </string-name>
          , H.:
          <article-title>Mining RDF Data for Property Axioms</article-title>
          . In:
          <string-name>
            <surname>Meersman</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Panetto</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dillon</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rinderle-Ma</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dadam</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pearson</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferscha</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bergamaschi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cruz</surname>
            ,
            <given-names>I.F.</given-names>
          </string-name>
          (eds.)
          <source>OTM 2012, Part II</source>
          . LNCS, vol.
          <volume>7566</volume>
          , pp.
          <fpage>718</fpage>
          -
          <lpage>735</lpage>
          . Springer, Heidelberg (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>