<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>These authors contributed equally.
dvantonov@edu.hse.ru (D. Antonov); buscaldi@lipn.fr (D. Buscaldi)
https://github.com/chejuro/ (D. Antonov); https://lipn.fr/~buscaldi (D. Buscaldi)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Assessing the impact of Word Embeddings for Relation Classification: An Empirical Study</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dzhal Antonov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Davide Buscaldi</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ecole Polytechnique</institution>
          ,
          <addr-line>91120 Palaiseau</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Higher School of Economics</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>LIPN, Université Sorbonne Paris Nord</institution>
          ,
          <addr-line>CNRS UMR 7030, 93430 Villetaneuse</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>This paper presents an empirical study on the value of concept embeddings for relation classification, with a particular focus on hypernymy detection. Knowledge Graphs have become popular due to their ability to represent knowledge in a standardized form that enables reasoning, but they often suffer from incompleteness. Relation Classification is a task that can help alleviate this issue. Recent advances in Deep Learning applied to Natural Language Processing have provided researchers with powerful tools for detecting relations between concepts: word embeddings. We investigate the effectiveness of different types and combinations of embeddings for the automatic relation classification task. We conduct experiments on two datasets based on WordNet and the AI-KG Knowledge Base. Our results confirm previous findings that it is challenging to deduce semantic relations from embeddings alone. We observe that hypernymy cannot be captured solely by a sub-space of the embedding space, despite specific dimensions carrying more information about this relation than others. Additionally, we show that it is difficult to apply a model learned on a general ontology to other domains, and that imbalance problems are aggravated in large knowledge bases where one relation dominates all the others.</p>
      </abstract>
      <kwd-group>
        <kwd>Knowledge Graphs</kwd>
        <kwd>Word Embeddings</kwd>
        <kwd>Relation Classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Knowledge Graphs (KGs) have become very popular in various tasks, thanks to their versatility
and their ability to represent knowledge in a standardized form that enables reasoning, both in
open and closed-domain applications. Examples of public, cross-domain knowledge graphs that
encode common knowledge include DBpedia [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and YAGO [2]. However, KGs often suffer
from incompleteness problems [3]. Relation Classification is a task that can help alleviate the
problem of incompleteness in KGs; it consists of determining whether a named relation holds
between two semantic entities.
      </p>
      <p>Recently, the outstanding achievements of Deep Learning applied to Natural Language
Processing (NLP) have provided researchers with powerful tools for detecting relations between
concepts. In particular, word embeddings play a central role in present-day NLP techniques.
Besides their usefulness in encoding textual
data in neural network models, they are crucial due to their ability to capture a substantial
amount of linguistic and semantic information. Word2Vec [4] showed that it is possible to
encode some kind of relational information within the embedding space, in particular
word analogies (e.g. <inline-formula><tex-math>\mathit{king} - \mathit{man} + \mathit{woman} \simeq \mathit{queen}</tex-math></inline-formula>). However, works such
as [5, 6] showed that the expectations regarding relation prediction from entity embeddings
were excessively optimistic. Approaches such as TransE [7] have partly “fixed” the problem
by introducing techniques that modify the position of concepts in the embedding space depending
on the relations in which they occur in the Knowledge Base. However, such methods can still
treat as invalid relations that they have not seen in the knowledge base used for training,
even when these relations are correct.</p>
      <p>Some works have shown that it is actually possible to discover relations by looking at the
embeddings of the concepts. Gábor et al. [8] tried various ways of combining the embeddings of
the two entities to see which ones were the most useful to predict relations. They showed that the vector
offset method for analogies is the least efficient in capturing generic semantic relations at a
large scale, while pairwise similarities can be better exploited in an additive or concatenative
setting. Atzori and Balloccu [9] proposed an algorithm to deduce the existence
of the hypernymy (or is_a) relation between two concepts from their word embeddings.
Their work tries to disentangle the contextual information contained in the embedding space.
They obtained the best accuracy in hypernym discovery for unsupervised systems on data from
SemEval-2018 [10].</p>
      <p>In this work, we carry out an empirical study, in the wake of these works, to understand
the value of concept embeddings for relation classification. Despite the diffusion of and the
interest in this task, only a few works have carried out an experimental study on the role
of embeddings for relation classification, and always on specific types of relations: for instance,
temporal [11] and discourse-based relations [12]. We considered two datasets: a general one
and a domain-specific one, both including the hypernymy relation. This choice was
important because we wanted to verify whether it is possible to learn a hypernymy detection
model from a general dataset and apply it to a domain-specific one, and what the
accuracy loss in doing so would be. In the rest of the paper, Section 2 presents the datasets used, Section 3
the different types of embeddings used, and Section 4 the experiments carried out. Finally, Section 5
presents our conclusions from these experiments. The code used for this work is available at
https://github.com/chejuro/Relation-prediction-from-embeddings.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Knowledge Bases and Datasets</title>
      <sec id="sec-2-1">
        <title>2.1. WordNet and WN18RR</title>
        <p>The WN18 dataset was introduced in 2013 by Bordes et al. [13]. It included the full 18 relations
scraped from WordNet for roughly 41,000 synsets (the equivalent of a concept in WordNet).
This dataset was affected by leakage due to the presence of symmetric relations; therefore,
in 2018 Dettmers et al. [14] introduced the WN18RR dataset. This dataset features 11 relations, no pair of
which is reciprocal. WordNet is a manually curated resource of semantic concepts, restricted
to more “linguistic” relations compared to those expected in general world knowledge. The
most common semantic relation is hypernymy, representing more than 45% of all
relations, while some relations are very rare, such as antonymy (1.2%) or attribute (0.4%) [15].
The number of synsets is more than 175,000 in the latest version.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Artificial Intelligence Knowledge Graph</title>
        <p>AI-KG [16] is a large-scale, automatically generated knowledge graph built from a
large number of articles; it describes around 2.3M triplets that are connected by 55 semantic
relations. These relations include hypernymy (encoded as the skos:broader relation) and others
related to the scientific discourse (for instance: method A uses resource B to perform task C).
Similarly to WordNet, AI-KG also has some reciprocal relations and an imbalance problem. After
cleaning and filtering out the least frequent relations, we keep 10 relations with 971,571 triplets
(the figure reported in Section 4.2.1). The details about the relations are shown in Table 1.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Word Embeddings</title>
      <sec id="sec-3-1">
        <title>3.1. Word2Vec</title>
        <p>Word2Vec [4] is one of the best-known and most widespread word vectorization techniques.
Word2Vec embeddings are learned using either the CBOW (Continuous Bag-of-Words) or the
skip-gram model. In the CBOW approach, the model predicts the target word given a context of
surrounding words. The context words are summed together to form a continuous bag-of-words
representation, which is then used as input to the model to predict the target word. The CBOW
algorithm is trained by iteratively adjusting the word embeddings to improve the accuracy of
the predicted target word given its context. Compared to the skip-gram model, which predicts
the context words given a target word, CBOW is faster to train.</p>
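        <p>The difference between the two training schemes can be illustrated with a minimal sketch that only builds the training examples (no embeddings are learned here). The function names, the window size and the toy sentence are assumptions made for illustration and are not taken from the Word2Vec implementation.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def cbow_examples(tokens, window=2):
    """Return (context_words, target_word) pairs: CBOW predicts the target from its context."""
    examples = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:
            examples.append((context, target))
    return examples


def skipgram_examples(tokens, window=2):
    """Return (target_word, context_word) pairs: skip-gram makes one prediction per context word."""
    return [(target, c)
            for context, target in cbow_examples(tokens, window)
            for c in context]


# Toy sentence, chosen only for illustration.
sentence = "knowledge graphs often suffer from incompleteness".split()
cbow = cbow_examples(sentence, window=1)
skip = skipgram_examples(sentence, window=1)

print(cbow[1])               # (['knowledge', 'often'], 'graphs')
print(len(cbow), len(skip))  # 6 10
```
]]></preformat>
        <p>With the same window, skip-gram produces more training examples than CBOW (10 against 6 in this toy case), which is consistent with CBOW being the faster of the two to train.</p>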
      </sec>
      <sec id="sec-3-2">
        <title>3.2. GloVe (Global Vectors)</title>
        <p>GloVe [17] is another shallow (non-deep) word representation model, proposed in 2014 by
researchers at Stanford. It uses a co-occurrence matrix that describes how frequently different
words appear together in a corpus of text. The co-occurrence matrix is then factorized to obtain
word embeddings that capture the semantic and syntactic relationships between words. GloVe
is able to capture both local and global relationships between words, and it has been shown to
perform well on a wide range of NLP tasks, including language modeling, sentiment analysis,
and machine translation.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. FastText</title>
        <p>FastText is an extension of Word2Vec proposed by Facebook in 2016. It addresses an important
problem with embeddings: Out-Of-Vocabulary (OOV) words. With GloVe and Word2Vec, if
a word was not seen during training, it is not possible to obtain an embedding for it. FastText addresses the
problem by breaking words into several character n-grams (sub-word tokens). Therefore,
even OOV words can be represented by assembling the vectors of their sub-word tokens.</p>
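        <p>The sub-word idea can be sketched as follows. This is a simplified illustration, not the FastText implementation: the function names and the toy n-gram table are hypothetical, the n-gram vectors are stored in a plain dictionary rather than hashed into buckets, and an OOV vector is taken to be the average of the known n-gram vectors. The word is wrapped in angle-bracket boundary markers before the n-grams are extracted.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    wrapped = "<" + word + ">"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def oov_vector(word, ngram_vectors, dim):
    """Average the vectors of the known n-grams; return a zero vector if none is known."""
    known = [ngram_vectors[g] for g in char_ngrams(word) if g in ngram_vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[d] for v in known) / len(known) for d in range(dim)]


print(char_ngrams("where", 3, 3))  # ['<wh', 'whe', 'her', 'ere', 're>']

# Toy, made-up 2-dimensional n-gram table (not real FastText vectors).
toy_table = {"<wh": [1.0, 0.0], "ere": [0.0, 1.0]}
print(oov_vector("where", toy_table, dim=2))  # [0.5, 0.5]
```
]]></preformat>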
      </sec>
      <sec id="sec-3-4">
        <title>3.4. ConceptNet Numberbatch</title>
        <p>ConceptNet Numberbatch [18] is another set of Word2Vec-like embeddings trained on a
heterogeneous set of sources, including Wikipedia and ontologies such as OpenCyc and even
WordNet. The concept repository is based on ConceptNet, a semantic network that encodes
general knowledge about the world in a machine-readable format. These embeddings have
been conceived with the objective of including both structured and unstructured data sources,
which allows them to capture both explicit and implicit relationships between words. Therefore,
these embeddings are intended to address the flaws highlighted by [5] and [6] regarding the ability of word
embeddings to capture semantic relations between words.</p>
      </sec>
      <sec>
        <title>3.5. BERT</title>
        <p>BERT [19] is a bidirectional Transformer-based language model pretrained using
a combination of a masked language modeling objective and next sentence prediction on a
very large corpus. BERT is based on the Transformer model [20], which relies on
the attention mechanism to capture the contextual relationships between the words in a
phrase. The original Transformer consists of an encoder that reads the input text and a decoder that generates
a prediction. Since BERT only has to produce a language representation, it only needs the
encoder part; during pre-training, the masked parts of the training sentence are predicted from
the encoder output. The encoder input to BERT is a
sequence of tokens that are converted into vectors and then processed by the neural network. The
main advantage of BERT embeddings is that they are contextual, i.e. a word does not have a
fixed embedding: its embedding changes as a function of its context. BERT embeddings can also be
computed for OOV words as they use sub-word tokenization, similarly to FastText.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.6. Sentence-BERT</title>
        <p>One of the problems of word embeddings is that they represent a single word or a
compound word (if this compound word has been labeled as such in the training corpora). If
a concept is represented by multiple words, the usual way to obtain a representation is by
averaging or taking the maximum over the embedding dimensions, but this method has been shown to
be sub-optimal. Sentence-BERT [21] is a modified version of the pre-trained BERT network
that uses siamese and triplet network structures to produce semantically meaningful sentence
embeddings that can be compared using cosine similarity. To create a fixed-size sentence embedding,
Sentence-BERT adds pooling to the token embeddings produced by BERT. This network is
capable of encoding phrase semantics and therefore it can be used to encode concepts that are
expressed using multiple words.</p>
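        <p>The two baseline ways of building a multi-word concept vector mentioned above, averaging and taking the maximum over each dimension, can be sketched as below. The 3-dimensional token vectors are made up for illustration; real embeddings have 300 or 768 dimensions. The same mean operation is one possible pooling step over token embeddings.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def mean_pool(token_vectors):
    """Average the token vectors dimension by dimension."""
    dim = len(token_vectors[0])
    return [sum(v[d] for v in token_vectors) / len(token_vectors) for d in range(dim)]


def max_pool(token_vectors):
    """Keep the largest value of each dimension across the token vectors."""
    dim = len(token_vectors[0])
    return [max(v[d] for v in token_vectors) for d in range(dim)]


# Made-up vectors for the two tokens of a concept such as "neural network".
tokens = [[0.25, -1.0, 0.5],
          [0.75, 1.0, 0.0]]

print(mean_pool(tokens))  # [0.5, 0.0, 0.25]
print(max_pool(tokens))   # [0.75, 1.0, 0.5]
```
]]></preformat>
        <p>Both operations return a vector of the same size as a single token vector, whatever the number of words in the concept.</p>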
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Classification experiments</title>
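      <p>Given a pair of entities or concepts <inline-formula><tex-math>(h, t)</tex-math></inline-formula> extracted from a Knowledge Base <inline-formula><tex-math>K</tex-math></inline-formula>, our objective
is to predict a relation <inline-formula><tex-math>r</tex-math></inline-formula> such that <inline-formula><tex-math>(h, r, t)</tex-math></inline-formula> is a valid relation in <inline-formula><tex-math>K</tex-math></inline-formula>. Therefore, we can treat the
problem as a multi-class classification task in which, given the representation for the <inline-formula><tex-math>(h, t)</tex-math></inline-formula> pair,
we predict the target <inline-formula><tex-math>r</tex-math></inline-formula>. We select from the dataset a training set (80% of triplets) in which <inline-formula><tex-math>r</tex-math></inline-formula> is
known for each pair of concepts and use the rest of the dataset as a validation set.</p>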
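      <p>There are various possibilities regarding how to combine the representations (i.e. the
embeddings) of <inline-formula><tex-math>h</tex-math></inline-formula> and <inline-formula><tex-math>t</tex-math></inline-formula>. According to the study by [8], analogy combinations do not
work well; therefore we considered averaging and concatenation. Given <inline-formula><tex-math>\mathbf{e}(h) = (h_1, \ldots, h_n)</tex-math></inline-formula>
and <inline-formula><tex-math>\mathbf{e}(t) = (t_1, \ldots, t_n)</tex-math></inline-formula>, with <inline-formula><tex-math>n</tex-math></inline-formula> the embedding size, for averaging we obtain <inline-formula><tex-math>\mathbf{e}(h, t)</tex-math></inline-formula> as
the pairwise mean:
<disp-formula><tex-math>\mathbf{e}(h, t) = \left( \frac{h_1 + t_1}{2}, \frac{h_2 + t_2}{2}, \ldots, \frac{h_n + t_n}{2} \right)</tex-math></disp-formula>
For concatenation:
<disp-formula><tex-math>\mathbf{e}(h, t) = (h_1, \ldots, h_n, t_1, \ldots, t_n)</tex-math></disp-formula>
The size <inline-formula><tex-math>n</tex-math></inline-formula> of the embeddings varies depending on the model. For
word2vec and derived models it is 300 dimensions, while for BERT and S-BERT it is 768.</p>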
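      <p>The two combination operations can be written in a few lines. This is a minimal sketch with hypothetical function names and made-up 3-dimensional vectors; it only illustrates the definitions above and is not the code used in the experiments.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def average_pair(e_h, e_t):
    """Pairwise mean of two embeddings of the same size n (result has size n)."""
    if len(e_h) != len(e_t):
        raise ValueError("both embeddings must have the same size")
    return [(a + b) / 2 for a, b in zip(e_h, e_t)]


def concat_pair(e_h, e_t):
    """Concatenation of two embeddings of size n (result has size 2n)."""
    return list(e_h) + list(e_t)


# Made-up 3-dimensional embeddings for the two concepts of a pair.
e_h = [1.0, 0.0, -2.0]
e_t = [3.0, 1.0, 2.0]

print(average_pair(e_h, e_t))  # [2.0, 0.5, 0.0]
print(concat_pair(e_h, e_t))   # [1.0, 0.0, -2.0, 3.0, 1.0, 2.0]

# The mean is symmetric in its arguments, the concatenation is not.
print(average_pair(e_h, e_t) == average_pair(e_t, e_h))  # True
print(concat_pair(e_h, e_t) == concat_pair(e_t, e_h))    # False
```
]]></preformat>
      <p>Note that the mean gives the same vector whichever concept comes first, so the direction of an asymmetric relation such as hypernymy is not visible in the averaged representation, whereas the concatenation keeps the two concepts in separate positions.</p>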
      <sec id="sec-4-1">
        <title>4.1. WN18RR dataset</title>
        <sec id="sec-4-1-1">
          <title>4.1.1. Logistic Regression model</title>
          <p>Logistic regression was chosen as a basic classification algorithm because it has several
advantages over other algorithms: it is efficient to train, effective for multi-class
problems, and its coefficients can be used to estimate feature importance. Table 2 and Table
3 present the classification results for every word embedding approach with the pairwise mean
operation and with vector concatenation. Accuracy is calculated as the number of correct predictions over
the total number of examples, independently of the class. Precision is calculated as the
macro-average of the per-class precision (TP/(TP+FP)). F1-score is the harmonic mean of the
macro-average precision and the macro-average recall (TP/(TP+FN)). According to the results, the
concatenation operation works significantly better; this can be explained by the fact that
concatenation preserves the information of each entity in the final vector, whereas the mean operation mixes
that information. This result confirms the conclusions by [8].</p>
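          <p>The metric definitions above can be made concrete with the following sketch. The function name and the toy labels are illustrative assumptions; the F1-score is computed, as described in the text, as the harmonic mean of the macro-averaged precision and recall (which differs from averaging per-class F1 values).</p>
          <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Accuracy, macro precision, macro recall and their harmonic mean (F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[t] += 1  # t was the right answer but was missed

    def ratio(num, den):
        return num / den if den else 0.0

    accuracy = ratio(sum(tp.values()), len(y_true))
    macro_p = sum(ratio(tp[c], tp[c] + fp[c]) for c in labels) / len(labels)
    macro_r = sum(ratio(tp[c], tp[c] + fn[c]) for c in labels) / len(labels)
    f1 = ratio(2 * macro_p * macro_r, macro_p + macro_r)
    return {"accuracy": accuracy, "precision": macro_p, "recall": macro_r, "f1": f1}


# Toy labels, invented for illustration.
y_true = ["hypernym", "hypernym", "hypernym", "antonym"]
y_pred = ["hypernym", "hypernym", "antonym", "antonym"]
scores = evaluate(y_true, y_pred)
print(scores["accuracy"], scores["precision"])  # 0.75 0.75
print(round(scores["recall"], 4), round(scores["f1"], 4))  # 0.8333 0.7895
```
]]></preformat>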
        </sec>
        <sec id="sec-4-1-2">
          <title>4.1.2. Logistic Regression interpretation for hypernymy relation</title>
          <p>We considered the concatenation representation to derive an interpretation of the embeddings
from the Logistic Regression model, and we performed feature selection. This selection can
highlight the most important features, that is, the dimensions in the embedding space that are
expected to carry the most information regarding the hypernymy relation. We then trained
a new logistic regression model on the reduced set of features and compared the results with
those obtained with the model based on all features.</p>
          <p>We extracted the ten most important features (see Table 4) from the logistic regression model
based on ConceptNet-NB embeddings. It is interesting to observe that only 2 of these features
are related to the subject of the relation (the hyponym or more specific word) while the others
are related to the object (the hypernym). It is difficult to tell what the individual features mean,
as each dimension of the embeddings is not linked to any specific linguistic feature. We used
these 10 best features to retrain a new “compressed” logistic regression model. We obtained a
precision of 0.7227 on the validation set, while with all 600 features it was 0.83426. According
to this result, the hypernymy relation is not encoded by a subset of dimensions (although some
are more informative than others), but rather the full embedding is required to capture the
meaning of the semantic relation.</p>
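          <p>One simple way to rank features from a linear model is by the absolute value of its coefficients. The sketch below illustrates this under two assumptions that are not stated in the text: that importance is measured by coefficient magnitude, and that the first half of the concatenated vector holds the subject embedding and the second half the object embedding. The coefficients are made up.</p>
          <preformat preformat-type="code"><![CDATA[
```python
def top_k_features(coefficients, n, k=10):
    """Rank the 2n coefficients of one class by magnitude.

    Returns (index, side, dimension) triples, where side tells whether the
    feature comes from the subject half or the object half of the
    concatenated vector (assumed layout: [subject | object]).
    """
    if len(coefficients) != 2 * n:
        raise ValueError("expected one coefficient per concatenated dimension")
    ranked = sorted(range(2 * n), key=lambda i: abs(coefficients[i]), reverse=True)
    return [(i, "subject" if i < n else "object", i % n) for i in ranked[:k]]


# Made-up coefficients for embeddings of size n = 3 (real ones have n = 300).
coefs = [0.1, -0.9, 0.05, 0.7, -0.2, 1.5]
selected = top_k_features(coefs, n=3, k=3)
print(selected)  # [(5, 'object', 2), (1, 'subject', 1), (3, 'object', 0)]
```
]]></preformat>
          <p>Counting the entries labeled as subject or object gives the kind of breakdown discussed above (2 subject-side features out of 10 in the reported experiment).</p>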
        </sec>
        <sec id="sec-4-1-3">
          <title>4.1.3. Multi-Layer Perceptron (MLP)</title>
          <p>Next, we implemented a Multi-Layer Perceptron as a deep model baseline. It consists of 5 to 10
fully connected linear layers, with alternating hidden layers of size 300 and 100 and a final layer
with one unit per relation to predict. The loss is the cross-entropy loss.
Between consecutive layers we have an activation function and a dropout layer (dropout = 0.2). Table 5
shows the different hyperparameters of the model and the number of hidden layers. ReLU was
also tested as the activation function; the outcome does not change much, although it is the best one. According to
the results, the MLP works much better than logistic regression.</p>
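          <p>The layer layout described above can be spelled out with a small helper that lists the layers in order. This is only a structural sketch: the helper name is hypothetical, the count of 5 to 10 layers is assumed to include the output layer, and the input size of 600 and the 11 output units correspond to concatenated 300-dimensional embeddings and the WN18RR relations.</p>
          <preformat preformat-type="code"><![CDATA[
```python
def mlp_plan(input_dim, n_layers, n_relations, hidden_sizes=(300, 100), dropout=0.2):
    """List the layers of the MLP: hidden sizes alternate, output has one unit per relation."""
    if n_layers < 2:
        raise ValueError("need at least one hidden layer and the output layer")
    plan = []
    in_dim = input_dim
    for i in range(n_layers - 1):
        out_dim = hidden_sizes[i % len(hidden_sizes)]
        plan.append(("linear", in_dim, out_dim))
        plan.append(("activation",))
        plan.append(("dropout", dropout))
        in_dim = out_dim
    plan.append(("linear", in_dim, n_relations))  # scores fed to the cross-entropy loss
    return plan


plan = mlp_plan(input_dim=600, n_layers=5, n_relations=11)
linear = [step for step in plan if step[0] == "linear"]
print(len(linear))            # 5
print(linear[0], linear[-1])  # ('linear', 600, 300) ('linear', 100, 11)
```
]]></preformat>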
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. AI-KG dataset</title>
        <sec id="sec-4-2-1">
          <title>4.2.1. Classification models</title>
          <p>The AI-KG dataset, with 10 relations and 971,571 triplets, was divided into train and validation
sets of size 90% and 10% respectively. Similarly to the WordNet experiments, we trained
Logistic Regression and MLP models, which obtained precision scores of 0.5365 and 0.6197
respectively. In Figure 1 we show the confusion matrix for the MLP model.</p>
        </sec>
        <sec id="sec-4-2-2">
          <title>4.2.2. Class imbalance problem</title>
          <p>We can notice from the confusion matrix that there is a class imbalance problem. As we saw
in Table 1, there is a large difference in the number of triplets supported by different relations.
To overcome this problem, we grouped together some relations to reduce the problem
to 5 relations. We did this because in AI-KG some relations are semantically similar and
differ only in the category of one of the entities involved. For instance, methodUsedBy,
otherEntityUsedBy, materialUsedBy and taskUsedBy were grouped into a single class, usedBy. In
this way, we obtained a more balanced situation with 5 classes and consequently slightly better
results: 0.7811 in precision. The confusion matrix for this reduced dataset is shown in Figure 2.
We also show in Table 6 the precision and recall scores for each relation.</p>
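          <p>The grouping step amounts to relabeling triplets with a mapping from fine-grained relations to merged classes. The sketch below only includes the usedBy group named above; the other groups are not listed in the text and are therefore left out. The function name and the toy triplets are invented for illustration.</p>
          <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter

# Only the group explicitly named in the text; other groups are not specified.
GROUPS = {
    "methodUsedBy": "usedBy",
    "otherEntityUsedBy": "usedBy",
    "materialUsedBy": "usedBy",
    "taskUsedBy": "usedBy",
}


def merge_relations(triples, groups=GROUPS):
    """Replace each relation by its merged class; unlisted relations are kept as they are."""
    return [(h, groups.get(r, r), t) for h, r, t in triples]


# Toy triplets invented for illustration (not taken from AI-KG).
triples = [
    ("neural network", "methodUsedBy", "image classification"),
    ("corpus", "materialUsedBy", "language model"),
    ("parsing", "taskUsedBy", "question answering"),
    ("convolutional neural network", "skos:broader", "neural network"),
]
merged = merge_relations(triples)
counts = Counter(r for _, r, _ in merged)
print(counts)  # Counter({'usedBy': 3, 'skos:broader': 1})
```
]]></preformat>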
        </sec>
        <sec id="sec-4-2-3">
          <title>4.2.3. Hypernymy relation prediction</title>
          <p>Another hypothesis that we wanted to test was whether a model trained on WordNet (general knowledge)
could be applied to more domain-specific data (AI-KG) to predict the
existence of a hypernymy relationship between two entities. The hypernymy relationship can
be mapped to the “skos:broader” relation in AI-KG. In Figure 3 we can see that similar concepts
are arranged in a similar way in the two knowledge graphs, which would suggest a certain
compatibility of the relations learned in WordNet with those present in AI-KG. However,
the best model obtained only 0.13 in precision. This is still better than random choice:
in the AI-KG dataset there are 80,425 hypernymy relations and 721,156 non-hypernymy
relations, so hypernymy pairs are about 10% of all pairs (the ratio of hypernymy to
non-hypernymy pairs is around 11%). Nevertheless, this
is an interesting result that casts some doubt on the generalization capability of relation
prediction models.</p>
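          <p>The chance level can be checked directly from the two counts given above. The short computation below shows both quantities that can be read as a baseline: the share of hypernymy pairs among all pairs, which is the expected precision of a guesser that labels pairs as hypernymy at random, and the ratio of hypernymy to non-hypernymy pairs. Variable names are illustrative.</p>
          <preformat preformat-type="code"><![CDATA[
```python
n_hypernymy = 80_425  # hypernymy (skos:broader) pairs in AI-KG, as reported above
n_other = 721_156     # non-hypernymy pairs, as reported above

total = n_hypernymy + n_other
prevalence = n_hypernymy / total        # share of hypernymy pairs among all pairs
ratio_to_other = n_hypernymy / n_other  # hypernymy pairs per non-hypernymy pair

print(total)                     # 801581
print(round(prevalence, 4))      # 0.1003
print(round(ratio_to_other, 4))  # 0.1115
```
]]></preformat>
          <p>Either way, the reported precision of 0.13 is only slightly above what would be expected by chance.</p>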
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>In this work, we tested various embedding types and combinations of embeddings for the
automatic relation classification task. The classification has been tested on two datasets based on
the WordNet and AI-KG Knowledge Bases. Our results confirm what emerged in previous works:
it is difficult to extrapolate semantic relations from embeddings alone. As our experiment with
reduced dimensions shows, hypernymy cannot be captured just by a sub-space of the embedding
space, even if some dimensions seem to carry more information regarding this relation than
others. We also showed how imbalance problems may affect relation classification in some
more specific knowledge bases such as AI-KG, and that models built to predict a relation on a
general knowledge base cannot be used to predict the same relation on a different, more specific
knowledge base. In the future, we plan to extend this experimentation to further embeddings and
Knowledge Bases.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This project has been carried out as part of the “projet d’établissement” titled “Plongements
lexicaux pour l’acquisition et la structuration de connaissances”, funded by Sorbonne Paris Nord
University.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          , G. Kobilarov,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ives</surname>
          </string-name>
          ,
          <article-title>Dbpedia: A nucleus for a web of open data</article-title>
          , in: K. Aberer, K.-S. Choi,
          <string-name>
            <given-names>N.</given-names>
            <surname>Noy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Allemang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-I.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Nixon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Golbeck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mika</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Maynard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mizoguchi</surname>
          </string-name>
          , G. Schreiber,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cudré-Mauroux</surname>
          </string-name>
          (Eds.),
          <source>The Semantic Web</source>
          , Springer Berlin Heidelberg, Berlin, Heidelberg,
          <year>2007</year>
          , pp.
          <fpage>722</fpage>
          -
          <lpage>735</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] F. M. Suchanek, G. Kasneci, G. Weikum, Yago: A large ontology from wikipedia and wordnet, Journal of Web Semantics 6 (2008) 203–217.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] M. Destandau, J.-D. Fekete, The missing path: Analysing incompleteness in knowledge graphs, Information Visualization 20 (2021) 66–82. URL: https://doi.org/10.1177/1473871621991539. doi:10.1177/1473871621991539.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] T. Mikolov, I. Sutskever, K. Chen, G. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, Advances in Neural Information Processing Systems 26 (2013).</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] T. Linzen, Issues in evaluating semantic spaces using word analogies, in: Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, Association for Computational Linguistics, Berlin, Germany, 2016, pp. 13–18. URL: https://aclanthology.org/W16-2503. doi:10.18653/v1/W16-2503.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] A. Rogers, A. Drozd, B. Li, The (too many) problems of analogical reasoning with word vectors, in: Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017), Association for Computational Linguistics, Vancouver, Canada, 2017, pp. 135–148. URL: https://aclanthology.org/S17-1017. doi:10.18653/v1/S17-1017.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston, O. Yakhnenko, Translating embeddings for modeling multi-relational data, Advances in Neural Information Processing Systems 26 (2013).</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] K. Gábor, H. Zargayouna, I. Tellier, D. Buscaldi, T. Charnois, Exploring vector spaces for semantic relations, in: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Copenhagen, Denmark, 2017, pp. 1814–1823. URL: https://aclanthology.org/D17-1193. doi:10.18653/v1/D17-1193.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] M. Atzori, S. Balloccu, Fully-unsupervised embeddings-based hypernym discovery, Information 11 (2020) 268. doi:10.3390/info11050268.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] J. Camacho-Collados, C. Delli Bovi, L. Espinosa-Anke, S. Oramas, T. Pasini, E. Santus, V. Shwartz, R. Navigli, H. Saggion, SemEval-2018 task 9: Hypernym discovery, in: Proceedings of the 12th International Workshop on Semantic Evaluation, Association for Computational Linguistics, New Orleans, Louisiana, 2018, pp. 712–724. URL: https://aclanthology.org/S18-1115. doi:10.18653/v1/S18-1115.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] P. Mirza, S. Tonelli, On the contribution of word embeddings to temporal relation classification, in: The 26th International Conference on Computational Linguistics, ACL, 2016, pp. 2818–2828.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] C. Braud, P. Denis, Comparing word representations for implicit discourse relation classification, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015, pp. 2201–2211.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston, O. Yakhnenko, Translating embeddings for modeling multi-relational data, Advances in Neural Information Processing Systems 26 (2013).</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] T. Dettmers, P. Minervini, P. Stenetorp, S. Riedel, Convolutional 2d knowledge graph embeddings, CoRR abs/1707.01476 (2017). URL: http://arxiv.org/abs/1707.01476. arXiv:1707.01476.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] M. Maziarz, M. Piasecki, S. Szpakowicz, The chicken-and-egg problem in wordnet design: Synonymy, synsets and constitutive relations, Lang. Resour. Eval. 47 (2013) 769–796. URL: https://doi.org/10.1007/s10579-012-9209-9. doi:10.1007/s10579-012-9209-9.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] D. Dessì, F. Osborne, D. Reforgiato Recupero, D. Buscaldi, E. Motta, H. Sack, AI-KG: an automatically generated knowledge graph of artificial intelligence, in: The Semantic Web–ISWC 2020: 19th International Semantic Web Conference, Athens, Greece, November 2–6, 2020, Proceedings, Part II 19, Springer, 2020, pp. 127–143.</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] J. Pennington, R. Socher, C. D. Manning, Glove: Global vectors for word representation, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014, pp. 1532–1543.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[18] R. Speer, J. Chin, C. Havasi, Conceptnet 5.5: An open multilingual graph of general knowledge, CoRR abs/1612.03975 (2016). URL: http://arxiv.org/abs/1612.03975. arXiv:1612.03975.</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>[19] J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: pre-training of deep bidirectional transformers for language understanding, CoRR abs/1810.04805 (2018). URL: http://arxiv.org/abs/1810.04805. arXiv:1810.04805.</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>[20] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need, CoRR abs/1706.03762 (2017). URL: http://arxiv.org/abs/1706.03762. arXiv:1706.03762.</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>[21] N. Reimers, I. Gurevych, Sentence-bert: Sentence embeddings using siamese bert-networks, CoRR abs/1908.10084 (2019). URL: http://arxiv.org/abs/1908.10084. arXiv:1908.10084.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>