=Paper=
{{Paper
|id=None
|storemode=property
|title=Predicting the Quality of Semantic Relations by Applying Machine Learning Classifiers
|pdfUrl=https://ceur-ws.org/Vol-674/Paper92.pdf
|volume=Vol-674
|dblpUrl=https://dblp.org/rec/conf/ekaw/FernandezSKM10
}}
==Predicting the Quality of Semantic Relations by Applying Machine Learning Classifiers==
Predicting the Quality of Semantic Relations by Applying Machine Learning Classifiers to the Semantic Web
Miriam Fernandez, Marta Sabou, Petr Knoth, Enrico Motta
Knowledge Media Institute (KMi)
The Open University
Walton Hall, Milton Keynes, MK7 6AA, United Kingdom
{m.fernandez, r.m.sabou, p.knoth, e.motta}@open.ac.uk
ABSTRACT

In this paper, we propose the application of Machine Learning (ML) methods to the Semantic Web (SW) as a mechanism to predict the correctness of semantic relations. For this purpose, we have acquired a learning dataset from the SW and we have performed an extensive experimental evaluation covering more than 1,800 relations of various types. We have obtained encouraging results, reaching a maximum of 74.2% of correctly classified semantic relations for classifiers able to validate the correctness of multiple types of semantic relations (generic classifiers), and up to 98% for classifiers focused on evaluating the correctness of one particular type of semantic relation (specialized classifiers).

Categories and Subject Descriptors

I.5.2 [Pattern Recognition]: Design Methodology – Classifier design and evaluation, Feature evaluation and selection, Pattern analysis.

General Terms

Algorithms, Measurement, Design, Experimentation.

Keywords

Semantic Web, Semantic Relations, Machine Learning.

1. INTRODUCTION

The problem of extracting the relation that holds between two terms is a well-known research problem traditionally addressed by the Natural Language Processing (NLP) community. The approaches found in the literature follow several different trends, such as the exploitation of lexical patterns to extract relations from textual corpora [3], the generation of statistical measures that detect correlations between words based on their frequency within documents [2], or the exploitation of structured knowledge resources like WordNet (http://wordnet.princeton.edu/) to detect or refine relations [1].

With the evolution of the SW notion of knowledge reuse, from an ontology-centered view to a more fine-grained perspective where individual knowledge statements (i.e., semantic relations) are reused rather than entire ontologies, a parallel problem arises: estimating the correctness of a given relation between two terms. As an illustrative example, consider the following two relations: Book – containsChapter – Chapter and Chapter ⊆ Book. While the relation Book – containsChapter – Chapter can be considered correct independently of an interpretation context, in the case of Chapter ⊆ Book, subsumption has been used incorrectly to model a meronymy relation.

One of the first attempts to address this problem is the work of Sabou et al. [4]. In this work the authors investigate the use of the Semantic Web (SW) as a source of evidence for predicting the correctness of a semantic relation. They show that the SW is not just a motivation to investigate the problem, but a large collection of knowledge-rich resources that can be exploited to address it. Following this idea, the work presented in this paper makes use of the SW as a source of evidence for predicting the correctness of semantic relations. However, as opposed to [4], which introduces several evaluation measures based on the adaptation of existing Natural Language Processing techniques to SW data, this work aims to approach the problem using Machine Learning (ML) techniques. For this purpose, we have worked on: a) acquiring a medium-scale learning dataset from the SW, and b) performing an experimental evaluation covering more than 1,800 relations of various types. We have obtained encouraging results, reaching a maximum of 74.2% of correctly classified semantic relations for classifiers able to validate the correctness of multiple types of semantic relations (generic classifiers), and up to 98% for classifiers focused on evaluating the correctness of one particular type of semantic relation (specialized classifiers).
2. ACQUIRING A LEARNING DATASET

The problem addressed in this work can be formalized as a classification task. In this type of Machine Learning problem, the learning method is presented with a set of classified examples from which it is expected to learn how to predict the classification of unseen examples. The collection of classified examples, or learning dataset, is obtained in three phases. In the first phase, a set of manually evaluated semantic relations is acquired. These relations can be seen as a quadruple <s, t, R, e>, where s is the source term, t is the target term, R is the relation to be evaluated, and e ∈ {T, F} is a manual Boolean evaluation provided by users, where T denotes a true or correct relation and F denotes a false or incorrect relation; e.g., <Chapter, Book, ⊆, F>. This experimental data is obtained from the datasets of the Ontology Alignment Evaluation Initiative (OAEI, http://oaei.ontologymatching.org/) and includes the AGROVOC/NALT and the OAEI'08 datasets. These datasets comprise a total of 1,805 semantic relations of different types: ⊆, ⊇, ⊥ and named. Among them, 1,129 are evaluated as true (T), correct relations, and 676 are evaluated as false (F), incorrect relations.
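For illustration, such a quadruple can be encoded as a small record. The following Python sketch is ours, not code from the paper; the class and field names are hypothetical.

<pre>
from typing import NamedTuple

class EvaluatedRelation(NamedTuple):
    """A manually evaluated semantic relation, i.e., a quadruple <s, t, R, e>."""
    s: str   # source term
    t: str   # target term
    R: str   # relation to evaluate: subclass, superclass, disjoint, or a named relation
    e: bool  # manual evaluation: True (correct) or False (incorrect)

# The incorrect subsumption used as an example in the introduction: Chapter subClassOf Book.
example = EvaluatedRelation(s="Chapter", t="Book", R="subclass", e=False)
</pre>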
In the second phase, a set of SW mappings (occurrences of relations containing the same or equivalent source, s, and target, t, terms in the publicly available SW data) is obtained for each particular semantic relation. These mappings are extracted using the services of the Watson SW gateway. Specific details about the SW mapping extraction algorithm can be found in [4]. In the third phase, these mappings are formalized and represented in terms of the values of their features (or attributes). The selected attributes to represent each classified example are:
• e, the relation correctness {T, F}. This is the class attribute, i.e., the one that will be predicted for future examples.
• Type(R), the type of relation to be evaluated: ⊆, ⊇, ⊥ and named relations.
• |M|, the number of mappings.
• |M⊆|, the number of subclass mappings.
• |M⊇|, the number of superclass mappings.
• |M⊥|, the number of disjoint mappings.
• |M R|, the number of named related mappings.
• |M S|, the number of sibling mappings.
• For each particular mapping Mi we consider:
  - Type(Ri), the relation type of the mapping: ⊆, ⊇, ⊥, named and sibling.
  - Pl(Mi), the path length of the mapping Mi.
  - Np(Mi), the number of paths that lead to the mapping Mi. Note that for sibling and named mappings the connection can be derived from two different paths connected by a common node.
  - |Mi⊆|, the number of subclass relations in Mi.
  - |Mi⊇|, the number of superclass relations in Mi.
  - |Mi⊥|, the number of disjoint relations in Mi.
  - |Mi R|, the number of named relations in Mi.
A schematic rendering of one such classified example is sketched below.
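The following Python sketch (again ours, with hypothetical names) shows how a classified example and its per-mapping attributes could be laid out as plain records mirroring the list above:

<pre>
from dataclasses import dataclass, field
from typing import List

@dataclass
class MappingFeatures:
    """Per-mapping attributes for one SW mapping Mi."""
    rel_type: str     # Type(Ri): "subclass", "superclass", "disjoint", "named" or "sibling"
    path_length: int  # Pl(Mi)
    num_paths: int    # Np(Mi); can be 2 for sibling and named mappings
    n_subclass: int   # |Mi subclass|
    n_superclass: int # |Mi superclass|
    n_disjoint: int   # |Mi disjoint|
    n_named: int      # |Mi named|

@dataclass
class ClassifiedExample:
    """One learning example for a semantic relation <s, t, R, e>."""
    e: bool           # class attribute: relation correctness
    rel_type: str     # Type(R)
    n_mappings: int   # |M|
    n_subclass: int   # |M subclass|
    n_superclass: int # |M superclass|
    n_disjoint: int   # |M disjoint|
    n_named: int      # |M named|
    n_sibling: int    # |M sibling|
    mappings: List[MappingFeatures] = field(default_factory=list)
</pre>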
3. EXPERIMENTS AND RESULTS

This study addressed four different classification problems: predicting the correctness of a semantic relation of any type (generic classifiers) and predicting the correctness of a given type of semantic relation: ⊆, ⊇ or named (specialized classifiers). Note that the ⊥ relation has been discarded from our experiments due to the lack of negative examples. To address each of these problems, three different classifiers were used: the J48 decision tree, the NaiveBayes classifier, and the LibSVM classifier, all of them provided by Weka [5]. Each classifier was applied using either the whole set of attributes (Section 2) or a filtered set of attributes (af) obtained using a combination of the CfsSubsetEval and BestFirst algorithms [5].
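The experiments themselves were run in Weka; as a purely illustrative analogue (not the authors' setup), the same trio of classifiers has rough counterparts in scikit-learn, with a CART decision tree standing in for J48 and SVC wrapping LibSVM:

<pre>
from sklearn.tree import DecisionTreeClassifier  # rough analogue of Weka's J48 (C4.5)
from sklearn.naive_bayes import GaussianNB       # analogue of Weka's NaiveBayes
from sklearn.svm import SVC                      # LibSVM-backed SVM

classifiers = {
    "J48-like": DecisionTreeClassifier(),
    "NaiveBayes": GaussianNB(),
    "LibSVM": SVC(probability=True),
}

def train_all(X_train, y_train):
    """Fit each classifier on the training portion of the dataset."""
    return {name: clf.fit(X_train, y_train) for name, clf in classifiers.items()}
</pre>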
To train and test the classifiers, each dataset was divided in the following way: approximately 70% of the data was used for training and 30% was used for testing. This division was done manually to avoid the appearance of mappings coming from the same semantic relation in both the training and the test sets. Note that SW mappings coming from the same semantic relation share at least the first eight attributes; therefore, it is important to keep them together in the same set (either the training or the test set) for a fair evaluation.
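The paper performs this division manually; as an illustrative analogue (not the authors' procedure), a group-aware splitter such as scikit-learn's GroupShuffleSplit enforces the same constraint when each semantic relation is treated as a group:

<pre>
from sklearn.model_selection import GroupShuffleSplit

# X: feature vectors of SW mappings; y: correctness labels {T, F};
# groups[i]: identifier of the semantic relation that mapping i comes from.
def grouped_split(X, y, groups, test_size=0.3, seed=0):
    """Split so that all mappings of one semantic relation land in the same set."""
    splitter = GroupShuffleSplit(n_splits=1, test_size=test_size, random_state=seed)
    train_idx, test_idx = next(splitter.split(X, y, groups=groups))
    return train_idx, test_idx
</pre>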
To evaluate the classifiers and compare them against each other, the following measures were selected: the percentage of correctly classified instances, the percentage of incorrectly classified instances, and the weighted average over the positive and negative class of the following measures: True Positive rate (TP), False Positive rate (FP), Precision, Recall, F-Measure (F-Mea) and ROC area value. More details about these measures can be found in [5]. The results obtained by the best classifier for each classification problem can be seen in Table 1.
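These are standard metrics; as a hedged illustration (not part of the paper), a weighted summary of this kind could be computed along these lines in Python:

<pre>
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

def evaluate(y_true, y_pred, y_score):
    """Weighted-average measures akin to Weka's per-class evaluation summary."""
    return {
        "Correct (%)": 100 * accuracy_score(y_true, y_pred),
        "Precision": precision_score(y_true, y_pred, average="weighted"),
        "Recall": recall_score(y_true, y_pred, average="weighted"),
        "F-Mea": f1_score(y_true, y_pred, average="weighted"),
        "ROC": roc_auc_score(y_true, y_score),  # y_score: probability of positive class
    }
</pre>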
Table 1. Best results obtained for each dataset

             Generic    ⊆          ⊇           named
Classifier   J48        J48af      NaiveBayes  J48
Correct      74.2044%   85.2077%   98.0122%    76.1555%
Incorrect    25.7956%   14.7923%   1.9878%     23.8445%
TP Rate      0.742      0.852      0.98        0.762
FP Rate      0.254      0.122      0.06        0.209
Precision    0.76       0.889      0.984       0.79
Recall       0.742      0.852      0.98        0.762
F-Mea        0.747      0.851      0.981       0.766
ROC          0.749      0.875      0.995       0.767

(J48af denotes the J48 classifier trained on the filtered attribute set af.)

4. CONCLUSIONS AND FUTURE WORK

In this paper, we investigate the problem of predicting the correctness of semantic relations. Our hypothesis is that ML methods can be adapted to exploit the SW as a source of knowledge to perform this task. The results of our experiments are promising, reaching a maximum of 74.2% of correctly classified semantic relations for classifiers able to validate the correctness of multiple types of semantic relations (generic classifiers), and up to 98% for classifiers focused on evaluating the correctness of one particular type of semantic relation (specialized classifiers).

Despite the success of the classifiers in the prediction process, it is important to highlight that only 60% of the relations contained in these datasets were covered by the SW. This limits our approach to domains where semantic information is available, which constitutes an open problem for future research work.

5. REFERENCES

[1] Budanitsky, A. and Hirst, G. 2006. Evaluating WordNet-based measures of semantic distance. Computational Linguistics, 32(1):13-47.
[2] Cilibrasi, R.L. and Vitanyi, P.M. 2007. The Google Similarity Distance. IEEE Transactions on Knowledge and Data Engineering, 19(3):370-383.
[3] Cimiano, P. 2006. Ontology Learning and Population from Text: Algorithms, Evaluation and Applications. Springer-Verlag New York, Inc.
[4] Sabou, M. and Gracia, J. 2008. Spider: Bringing Non-Equivalence Mappings to OAEI. In Proc. of the Third International Workshop on Ontology Matching.
[5] Witten, I.H. and Frank, E. 2000. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann Series in Data Management Systems.