<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using Word Embeddings for Visual Data Exploration with Ontodia and Wikidata</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gerhard Wohlgenannt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nikolay Klimov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dmitry Mouromtsev</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniil Razdyakonov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dmitry Pavlov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yury Emelyanov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Intern. Lab. of Information Science and Semantic Technologies, ITMO University</institution>
          ,
          <addr-line>St. Petersburg</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Vismart Ltd.</institution>
          ,
          <addr-line>St. Petersburg, Russia https://vismart.biz</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>One of the big challenges in Linked Data consumption is to create visual and natural language interfaces to the data that are usable for non-technical users. Ontodia provides support for diagrammatic data exploration, showcased in this publication in combination with the Wikidata dataset. We present improvements to the natural language interface regarding exploring and querying Linked Data entities. The method uses models of distributional semantics to find and rank entity properties related to user input in Ontodia. Various word embedding types and model settings are evaluated, and the results show that user experience in visual data exploration benefits from the proposed approach.</p>
      </abstract>
      <kwd-group>
        <kwd>Linked Data querying</kwd>
        <kwd>word embeddings</kwd>
        <kwd>Ontodia</kwd>
        <kwd>Wikidata</kwd>
        <kwd>natural language interface</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Linked Data (LD) is a vast data source that is accessible to both machines and
humans. End users in particular face high barriers, such as finding relevant
datasets, understanding the schema, or being familiar with query languages such
as SPARQL [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. One of the tools that provide an intuitive way to discover LD
for non-technical users is Ontodia<sup>3</sup>. Ontodia is an open-source library for OWL
and RDF diagramming and visual exploration. In its current version, natural
language (NL) search in the properties of a given entity only finds properties
whose labels exactly match the search term. Here, we investigate a method that makes the
search more flexible and abstracts users from the underlying data schemata by
leveraging word embeddings to provide properties that are semantically related
to a user query. Using Wikidata<sup>4</sup> as the underlying dataset, we aim to (i) investigate
whether word embeddings are useful for the given problem, (ii) evaluate which types
of pre-trained embedding models, and which parameters, are best suited for the
task, and (iii) provide a prototype to demonstrate the benefits of the method.
      </p>
      <p>We do not aim at full-fledged question answering over LD with NL to
SPARQL transformation, but at improving the search functionality in
diagrammatic LD exploration.</p>
      <sec id="sec-1-1">
        <title>3 http://www.ontodia.org</title>
      </sec>
      <sec id="sec-1-2">
        <title>4 https://www.wikidata.org</title>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Query expansion for keyword queries is a classical problem in information
retrieval. A traditional way of keyword expansion is the use of dictionaries such
as WordNet to find synonyms or hypo- and hypernyms. This method suffers
from sparse data regarding named entities and missing coverage of specialized
domains. In the Semantic Web field, for example, Augenstein et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] propose a method
to map keywords to LD resources by finding the properties that are related to
semantic similarity between resources. In contrast to our work, which searches
in entity properties, Augenstein et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] focus primarily on finding resources
(entities). Freitas et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] propose a complex system for querying
heterogeneous and distributed datasets, which abstracts users from the underlying data
schemata. The system combines entity search, a Wikipedia-based semantic
relatedness measure, and spreading activation to answer NL queries.
      </p>
      <p>
        Challenges and future directions in Question Answering on LD are presented
in Shekarpour et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The application of word embeddings and deep learning
is listed prominently among the promising techniques for future investigation. In
line with this recommendation, we apply distributional semantics for the natural
language query interface of Ontodia. In general, word embeddings transform the
vocabulary of a given corpus into a continuous low-dimensional vector space
representation. They have been successfully applied, for example, to word similarity
computations, but also to more complex natural language tasks [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>System Description</title>
      <p>The work presented in this paper extends Ontodia with improved search
capabilities. As mentioned, Ontodia is an open-source tool<sup>5</sup> for simple OWL and
RDF visual data exploration. Ontodia is often integrated with metaphactory<sup>6</sup>
as a semantic platform backend. In a typical data exploration scenario, the user
starts querying the dataset at the system entry point<sup>7</sup>. From the search results, the user
can switch to Ontodia to explore the data space. In the current version,
search in the connections of an entity only finds literal matches of the search
term in the property labels. This limits the ease of use with unfamiliar datasets.
For example, when looking for family relations of the entity Van Gogh, the system will not
find any matching properties due to missing exact lexical matches, see Figure 1.</p>
      <p>The prototype presented here makes use of (a) aliases for property labels
defined in Wikidata, and (b) distributional semantics in the form of word
embeddings to find suitable properties related to a user query. Figure 2 shows
the results using the new search functionality, which are a combination of: (i)
exact matches of the input term in the property labels, (ii) exact matches in
property aliases, and (iii) related properties according to the word embedding
model used, ordered by descending semantic similarity.</p>
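      <p>The combination of the three result tiers can be sketched as follows. This is a minimal illustration, not the Ontodia implementation: the function name, the property identifiers, the toy labels and aliases, and the choice of case-insensitive substring matching are all assumptions made for the example.</p>
      <preformat><![CDATA[
```python
def combine_results(query, properties, related_ranking):
    """Merge three result tiers, keeping the first occurrence of each property.

    properties: dict mapping a property id to {"label": str, "aliases": [str]}
    related_ranking: property ids already sorted by descending similarity
    (Hypothetical data layout, chosen for illustration only.)
    """
    q = query.strip().lower()
    # Tier (i): the query term occurs literally in the property label.
    label_hits = [pid for pid, p in properties.items()
                  if q in p["label"].lower()]
    # Tier (ii): the query term occurs literally in one of the aliases.
    alias_hits = [pid for pid, p in properties.items()
                  if any(q in alias.lower() for alias in p["aliases"])]
    # Tier (iii): embedding-based ranking, appended after the literal hits.
    merged = []
    for pid in label_hits + alias_hits + list(related_ranking):
        if pid not in merged:
            merged.append(pid)
    return merged


# Toy data; the ids, labels and aliases are made up for this sketch.
toy_properties = {
    "p_family_name": {"label": "family name", "aliases": ["surname"]},
    "p_sibling": {"label": "sibling", "aliases": ["brother or sister"]},
    "p_relative": {"label": "relative", "aliases": ["family member"]},
}
toy_related = ["p_relative", "p_sibling", "p_family_name"]
result = combine_results("family", toy_properties, toy_related)
```
]]></preformat>
      <p>In this toy run the label match comes first, the alias match second, and the remaining property from the similarity ranking last, so literal matches are never pushed down by embedding-based suggestions.</p>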
      <sec id="sec-3-1">
        <title>5 https://github.com/ontodia-org/ontodia</title>
      </sec>
      <sec id="sec-3-2">
        <title>6 http://www.metaphacts.com/product</title>
      </sec>
      <sec id="sec-3-3">
        <title>7 https://wikidata.metaphacts.com/resource/Start</title>
        <p>The updated search interface also allows for a new way of data exploration,
where the user is interested in a certain topic, for example family or politics, and
can then explore all entity properties (connections) related to the topic.
The prototype described here is available at:
http://ontodia-prop-suggest.apps.vismart.biz/wikidata.html.</p>
        <sec id="sec-3-3-1">
          <title>The Method</title>
          <p>A central ingredient of the method is the word embedding model. The models
were trained on a Wikipedia corpus (and in some cases additional textual
sources) and contain continuous vector space representations of the words from
the corpora, which capture the distributional semantics of the words.</p>
          <p>First, the Wikidata properties need to be added to the model vector space.
For every property, we split the property label (rdfs:label) into a list of words
and remove stopwords. The vector representation of a property is created as the
vector sum of its word vectors. A variant of the system also includes the words from
the property descriptions to create the property vectors. At runtime, the same
process is applied to the natural language user query provided in the search box.
The query is split into single words, stopwords are removed, and the vector
representation is the sum of the query word vectors. Finally, the system ranks
the properties by cosine similarity between the query vector and each of the property
vectors to find the most relevant properties.</p>
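          <p>The steps above (tokenize, drop stopwords, sum the word vectors, rank by cosine similarity) can be sketched in a few lines. The three-dimensional vectors, the stopword list, and the function names below are made up for illustration, and skipping out-of-vocabulary words is an assumption of this sketch; a real system would take the vectors from a pre-trained model.</p>
          <preformat><![CDATA[
```python
import math

# Tiny illustrative stopword list (assumption, not the list used by the authors).
STOPWORDS = {"of", "the", "a", "an", "in", "to", "for"}

# Toy 3-dimensional embeddings; a pre-trained model would supply real vectors.
EMBEDDINGS = {
    "father": [0.9, 0.1, 0.0],
    "mother": [0.8, 0.2, 0.0],
    "family": [0.85, 0.15, 0.05],
    "place": [0.0, 0.9, 0.1],
    "birth": [0.3, 0.7, 0.0],
    "date": [0.0, 0.1, 0.9],
}


def text_vector(text, embeddings=EMBEDDINGS):
    """Sum the vectors of all tokens that are neither stopwords nor unknown."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for word in text.lower().split():
        if word in STOPWORDS or word not in embeddings:
            continue  # assumption: out-of-vocabulary words are ignored
        total = [t + v for t, v in zip(total, embeddings[word])]
    return total


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def rank_properties(query, property_labels):
    """Return the labels sorted by descending similarity to the query."""
    query_vec = text_vector(query)
    scored = [(cosine(query_vec, text_vector(label)), label)
              for label in property_labels]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored]


ranking = rank_properties("family", ["date of birth", "father", "place of birth"])
```
]]></preformat>
          <p>Because the property vectors do not depend on the query, they can be computed once in a preprocessing step; only the query vector and the similarity scores are computed at runtime.</p>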
          <p>The method is simple and computationally efficient. In this publication, the
focus is on the evaluation of the method, and especially on comparing the
performance of various types of word embeddings.</p>
        </sec>
        <sec id="sec-3-3-2">
          <title>Implementation</title>
          <p>The presented method and the accompanying code were implemented in Python
and can be found on GitHub<sup>8</sup>. The main modules include a preprocessing phase,
where the vectors for the Wikidata properties are constructed and persisted,
the module to rank properties according to user input, and the tools for the
evaluation of the system. For integration with Ontodia, we created a web service
that takes the user input in JSON format, computes the property rankings, and
returns them in JSON format to Ontodia for display to the user.</p>
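          <p>The JSON round trip of such a web service might look roughly like the sketch below. The field names ("query", "properties", "ranking"), the function names, and the word-overlap scorer are hypothetical; the scorer is only a stand-in for the embedding-based similarity, and the actual service interface may differ.</p>
          <preformat><![CDATA[
```python
import json


def word_overlap(query, label):
    """Stand-in scorer (Jaccard overlap of words), NOT the embedding model."""
    q, l = set(query.lower().split()), set(label.lower().split())
    union = q | l
    return len(q & l) / len(union) if union else 0.0


def handle_request(body, score=word_overlap):
    """Parse a JSON request, rank the given properties, return a JSON string.

    Assumed request shape (hypothetical):
      {"query": "...", "properties": [{"id": "...", "label": "..."}, ...]}
    """
    request = json.loads(body)
    query = request["query"]
    scored = [{"id": p["id"], "score": score(query, p["label"])}
              for p in request["properties"]]
    # sorted() is stable, so ties keep the order of the request.
    scored = sorted(scored, key=lambda item: item["score"], reverse=True)
    return json.dumps({"ranking": scored})


example_body = json.dumps({
    "query": "family",
    "properties": [
        {"id": "p_date_of_birth", "label": "date of birth"},
        {"id": "p_family_name", "label": "family name"},
    ],
})
example_response = json.loads(handle_request(example_body))
```
]]></preformat>
          <p>Passing the scorer as a parameter keeps the request handling independent of the ranking model, so the embedding-based ranker can be plugged in without touching the JSON code.</p>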
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Evaluation</title>
      <p>First, this section describes aspects of the evaluation setup, such as the Wikidata dataset,
the gold standard data used, system settings, and the word embedding models.
Then follows a detailed presentation of the evaluation results, including a discussion of
aspects like dataset quality and result interpretation.</p>
      <sec id="sec-4-1">
        <title>Evaluation Setup</title>
        <p><bold>Wikidata Dataset.</bold> Wikidata is an open knowledge base, which can be
exported and interlinked with other datasets on the Linked Data web. Wikidata
is the central data storage for projects like Wikipedia. The dataset currently
includes around 28 million items, and, more relevant for this work, there are 3323
properties defined to describe and connect the entities. The properties have
labels in various languages, and aliases (called "also known as") for many of the
labels. We focus on English language labels, for which currently 4603 aliases are
defined. Additionally, properties usually have a short textual description, which
we also use in our method to create property representations.</p>
        <p><bold>Gold Standard Dataset.</bold> The aliases manually defined in Wikidata are an
obvious source to be used as a gold standard dataset to evaluate our method.
For this purpose, each of the 4603 English language aliases is used as a query
term, and the system suggests a ranking of properties similar to the term. 1736
of the 3323 properties actually have aliases defined.</p>
        <sec id="sec-4-1-1">
          <title>8 https://github.com/gwohlgen/ontodia_search_properties</title>
          <p>
            Despite the varying quality of aliases (details in the Discussion section), we
decided to use them as a gold standard dataset. Eventually, the proposed method
can even be applied to help detect questionable alias definitions in the future.
          </p>
          <p><bold>System Settings.</bold> In the evaluations, we experimented with various system
settings and word embedding models. The types of word embedding models are
described below; the most important system settings include:</p>
          <list list-type="bullet">
            <list-item>
              <p><bold>Use description text (Boolean):</bold> For creating the representations of
properties in vector space, we compared the results of using only the words from
the property labels versus words from property labels and description texts.</p>
            </list-item>
            <list-item>
              <p><bold>Dimensions of vector model:</bold> Some predefined vector models are
available with different numbers of vector dimensions (for example 50 vs. 100
vs. 300 dimensions). A lower number of dimensions makes the model more
computationally efficient, but it may lose semantic nuances.</p>
            </list-item>
            <list-item>
              <p><bold>Number of words in the model:</bold> In the pre-trained models, the word
vectors are ordered by descending word frequency in the training corpus.
Big models with hundreds of thousands of vectors occupy a lot of memory
and take a long time to load. Therefore, we compared the performance of models
with 300,000 words with smaller models containing the 10,000 most frequent words.</p>
            </list-item>
          </list>
          <p><bold>Word Embeddings.</bold> One of the main goals was to evaluate which of the
pre-trained word embedding models is best suited for the task at hand. The
pre-trained models available are not trained on exactly the same corpus, but all
include English Wikipedia. The following word embedding types were evaluated:</p>
          <list list-type="bullet">
            <list-item>
              <p><bold>fastText:</bold> FastText [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ] is an extension of the original Word2vec [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ] model
which uses sub-word information. Words are represented as bags of character
n-grams. FastText generates better word embeddings for rare words, and
takes morphological information into account. Here, we applied a model
trained on Wikipedia 2016<sup>9</sup>. Two variants were compared: a model with
300,000 words, and a small model with only the 10,000 most frequent words.</p>
            </list-item>
            <list-item>
              <p><bold>GloVe:</bold> GloVe [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ] factors the logarithm of the co-occurrence matrix that
reflects the position of the context words in the word window. We used a
model pre-trained on a Wikipedia 2014 and Gigaword 5 corpus (6B tokens)<sup>10</sup>.
Variants include combinations of models with 300, 100 or 50 dimensions, and
300,000 versus only 10,000 word vectors.</p>
            </list-item>
            <list-item>
              <p><bold>LexVec:</bold> LexVec [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ] is a word embedding method which factorizes PPMI
matrices and combines characteristics of techniques like Word2vec and GloVe.
LexVec performs well on word similarity and semantic analogy tasks, but
struggles on syntactic analogies. The model used was trained on a 7B token
corpus of English Wikipedia 2015 and NewsCrawl<sup>11</sup>. Again, we evaluated
variants of 300,000 versus 10,000 word vectors.</p>
            </list-item>
          </list>
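          <p>Since the pre-trained vector files are sorted by word frequency, a smaller model can be obtained simply by reading only the first N entries. The loader below is a minimal sketch under the assumption of a word2vec-style text format (a header line with vocabulary size and dimension, then one word and its vector per line); the function name and the sample data are made up.</p>
          <preformat><![CDATA[
```python
import io


def load_top_vectors(handle, limit):
    """Read at most `limit` vectors from a word2vec-style text stream.

    Assumes the first line is "<vocabulary size> <dimension>" and that the
    remaining lines are ordered by descending word frequency.
    """
    header = handle.readline().split()
    dim = int(header[1])
    vectors = {}
    for line in handle:
        if len(vectors) >= limit:
            break  # the rest of the file holds only less frequent words
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# Made-up four-word "model" standing in for a large vector file.
sample = io.StringIO(
    "4 2\n"
    "the 0.1 0.2\n"
    "of 0.3 0.4\n"
    "family 0.5 0.6\n"
    "sibling 0.7 0.8\n"
)
small_model = load_top_vectors(sample, limit=2)
```
]]></preformat>
          <p>With gensim, the limit argument of KeyedVectors.load_word2vec_format serves a comparable purpose. The trade-off is that any query or label word outside the retained vocabulary contributes nothing to the summed vector.</p>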
        </sec>
        <sec id="sec-4-1-2">
          <title>9 https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md</title>
          <p><sup>10</sup> https://github.com/stanfordnlp/GloVe</p>
          <p><sup>11</sup> https://github.com/alexandres/lexvec</p>
          <p>In the main evaluation, which aims to judge the suitability of various word
embedding types, we experiment with different models and settings. As stated, the
task is as follows: for each of the aliases defined for Wikidata properties, we create
a ranking of related properties. The word vectors of the alias words are compared
to the vectors representing the properties. Every alias is compared to all 3323
properties, which is much harder than the real-world task of searching only in
the properties of a given entity. The latter task is evaluated in the next section.</p>
          <p>Table 1 presents an overview of the results. Column one states the embedding
model type and the settings, namely the model size (either 300,000 or 10,000
words) and the dimensions of the vectors. The Top-N metrics reflect the ratio of
system suggestions where the correct property is among the top N entries of the generated
ranking. MRR is the well-known mean reciprocal rank. The lower part of the table
includes some results for models which only use the words from the property label
to create the property vectors, but not from the description text (WO-D).</p>
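          <p>Both metrics can be computed from the rank at which the correct property appears for each alias query. The sketch below uses made-up ranks purely for illustration; the function names are not taken from the authors' code, and the numbers are unrelated to the reported results.</p>
          <preformat><![CDATA[
```python
def top_n_accuracy(ranks, n):
    """Share of queries whose correct property is ranked at position n or better."""
    return sum(1 for rank in ranks if rank <= n) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries (ranks are 1-based)."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Made-up 1-based ranks of the correct property for five alias queries.
example_ranks = [1, 3, 2, 10, 1]

top1 = top_n_accuracy(example_ranks, 1)
top3 = top_n_accuracy(example_ranks, 3)
mrr = mean_reciprocal_rank(example_ranks)
```
]]></preformat>
          <p>Top-N only records whether the correct property appears among the first N suggestions, whereas MRR also rewards placing it higher within the ranking.</p>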
          <p>[Table 1: results per embedding model and setting; recoverable column headers: Top 1, Top 3.]</p>
          <p>The fastText model with 300,000 word vectors and 300 vector dimensions
performs best over all metrics. We also experimented with a bigger fastText
model with around 2.5 million word vectors, but those additional rare words just
increased memory consumption, while the performance stayed almost the same. On
the other hand, it is evident that reducing the model size to 10,000 words affects
performance negatively. Over all model types, reducing model size from 300,000
to 10,000 words led to a sharp drop in accuracy. Regarding model types, fastText
is best suited for the task, followed by LexVec, and lastly GloVe. As seen in the
last part of the table, using the words from the description text to represent
property vectors is helpful. Finally, using fine-grained word representations with
larger vectors (50 versus 100 versus 300 dimensions) has a strong positive effect.</p>
          <p><bold>Property Search for Single Entities.</bold> In the evaluations above, we measure
the accuracy for matching aliases against all 3323 properties in Wikidata.
However, in an interactive scenario of visual data exploration with Ontodia,
the user query is typically restricted to the properties defined for a specific
entity. This scenario was simulated and evaluated by randomly choosing 1150
entities from the Wikidata dataset, and performing the evaluation with their
properties and aliases. In total, about 85% of the properties had one or more
aliases defined, with an average of 5.9 aliases per property. Table 2 presents
the evaluation using the fastText and LexVec models on the task of finding the
corresponding entity property for all aliases defined for an entity.</p>
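          <p>[Table 2: results for property search within single entities (fastText and LexVec); recoverable column headers: Top 1, Top 3.]</p>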
          <p>Again, fastText outperforms the LexVec embeddings. When ranking the
entity properties for the alias term by similarity, in over 70% of cases the first
ranked property is correct with respect to the gold standard. For the Top-3
results, the number is 87.49%, and the MRR is 0.80. The results make us confident
that the new search feature has a very positive impact on user experience. The
runtime of a query is typically under 10 ms, which is well suited for interactive systems.</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>Discussion</title>
        <p><bold>Dataset Quality.</bold> During the evaluation and the inspection of the results, we
found various issues with Wikidata dataset quality, which (i) explain part of
the misclassifications of the method, and (ii) provide hints on improving dataset
quality, especially the quality of aliases. First of all, in 14 cases the alias was exactly
the same term as the property label. More interestingly, many aliases are not
proper synonyms. For example, property P582 with label "end time" has aliases
such as "divorced", or simply "to". Similarly, P150 with label "contains administrative
territorial entity" has aliases such as "divides into", "contains", and "has villages",
some of which make it hard for the system to link to the correct property.</p>
      </sec>
      <sec id="sec-4-3">
        <title>System Performance</title>
        <p>The experiments summarized in Table 1 indicate that the fastText algorithm is
best suited for the task, followed by LexVec. System configuration, especially the
model vocabulary size and the number of vector dimensions, is crucial for system
performance, and should only be compromised if decreasing the memory footprint is
inevitable. Furthermore, including the property descriptions in the property vectors
provides better property representations. In our real-world use case (Section 4.2),
the method demonstrates sufficient performance to improve user experience.</p>
        <p>Regarding computational performance, using Python and the gensim library,
the fastText model with 300,000 vectors and 300 dimensions consumes ca. 650 MB
of memory, whereas a 10,000-word model requires 130 MB. The runtime for a query against
all 3323 properties is around 300 ms; for the interactive use case, query time is
usually below 10 ms.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>In this publication, we present a method for simple and powerful search in entity
properties of Linked Data using natural language. A prototype of the method
is integrated into the Ontodia tool using Wikidata as the data source. The method
applies models of distributional semantics to find properties related to user input.
The contributions include (i) the presentation of a method for searching in Linked
Data which applies word embeddings to the given task in an efficient way, (ii) an
extensive evaluation of various types of word embedding models and parameters
such as model size and dimensionality against a gold standard, and (iii) the provision
of the implementation and an online prototype. In future work, we will apply the
presented approach to other datasets, and investigate the integration with more
powerful question answering techniques for Linked Data.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work was supported by the Government of the Russian Federation (Grant
074-U01) through the ITMO Fellowship and Professorship Program.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Augenstein</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gentile</surname>
            ,
            <given-names>A.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Norton</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ciravegna</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Mapping keywords to linked data resources for automatic query expansion</article-title>
          . In: Cimiano, P., et al. (eds.)
          <source>ESWC 2013</source>
          . pp.
          <fpage>101</fpage>
          –
          <lpage>112</lpage>
          . Springer LNCS, Berlin, Heidelberg (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bojanowski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grave</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joulin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Enriching word vectors with subword information</article-title>
          .
          <source>arXiv preprint arXiv:1607.04606</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Freitas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oliveira</surname>
            ,
            <given-names>J.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Riain</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>da Silva</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Curry</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Querying linked data graphs using semantic relatedness: A vocabulary independent approach</article-title>
          .
          <source>Data &amp; Knowledge Engineering</source>
          <volume>88</volume>
          ,
          <fpage>126</fpage>
          –
          <lpage>141</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Ghannay</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Favre</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Estève</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Camelin</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Word embedding evaluation and combination</article-title>
          . In: Calzolari, N., et al. (eds.)
          <source>LREC 2016</source>
          . ELRA, Paris, France (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corrado</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Efficient estimation of word representations in vector space</article-title>
          .
          <source>arXiv preprint arXiv:1301.3781</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Pennington</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          :
          <article-title>GloVe: Global vectors for word representation</article-title>
          . In:
          <source>EMNLP</source>
          . pp.
          <fpage>1532</fpage>
          –
          <lpage>1543</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Salle</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Idiart</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Villavicencio</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Enhancing the LexVec distributed word representation model using positional contexts and external memory</article-title>
          .
          <source>CoRR abs/1606.01283</source>
          (
          <year>2016</year>
          ), http://arxiv.org/abs/1606.01283
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Shekarpour</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lukovnikov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>A.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Endris</surname>
            ,
            <given-names>K.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thakkar</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lange</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Question answering on linked data: Challenges and future directions</article-title>
          .
          <source>CoRR abs/1601.03541</source>
          (
          <year>2016</year>
          ), http://arxiv.org/abs/1601.03541
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>