<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Lessons Learned from a Knowledge-driven Search Application on-top of Large Data Sets</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dennis Diefenbach</string-name>
          <aff>Université de Lyon, CNRS UMR, Laboratoire Hubert Curien, France</aff>
          <email>diefenbach@univ-st-etienne.fr</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pierre Tardiveau</string-name>
          <aff>Télécom Saint-Étienne, France</aff>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Both</string-name>
          <aff>DATEV eG, Germany</aff>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kamal Singh</string-name>
          <aff>Université de Lyon, CNRS UMR, Laboratoire Hubert Curien, France</aff>
          <email>kamal.singh@univ-st-etienne.fr</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pierre Maret</string-name>
          <aff>Université de Lyon, CNRS UMR, Laboratoire Hubert Curien, France</aff>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>32</fpage>
      <lpage>39</lpage>
      <abstract>
        <p>The Web stores huge amounts of data, and the amount of stored data is continuously increasing. Hence, when using a search application over time, users are likely to be confronted with unknown instances or properties while issuing their search questions. From this observation we derive the research challenge of how data can automatically be visualized for a user without leaving the search application. Here, we present observations from a search-driven application from the field of Question Answering on top of the Web of Data. This Web application uses Wikidata (a data source derived from Wikipedia) and several other sources to provide access to general knowledge and to specific knowledge from particular domains. Hence, the size of the data set is very large. The problem is how to tackle the sheer amount of available instances and properties (volume), the high variety due to the ambiguity of natural language questions, and the broad field represented by a general-purpose knowledge base. Data (instances and properties) need to be visualized so that they can be explored with respect to different dimensions and at different granularities. Additionally, feedback interaction points are required to make the system learn over time and deal with the ambiguity of natural language questions. In this paper we provide an overview of the challenges we have identified and the solutions derived from them.</p>
      </abstract>
      <kwd-group>
        <title>CCS Concepts</title>
        <kwd>Information systems → Web searching and information discovery</kwd>
        <kwd>Human-centered computing → Human computer interaction (HCI); Visualization systems and tools</kwd>
      </kwd-group>
      <kwd-group>
        <title>Additional Key Words and Phrases</title>
        <kwd>Question Answering</kwd>
        <kwd>User Interface</kwd>
        <kwd>Big Data</kwd>
        <kwd>User Feedback</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 INTRODUCTION</title>
      <p>
        In the past, user interfaces (UIs) provided access to a predictable data set. UI designers and developers
were aware of the variety and volume of the data as well as its velocity. This is fundamentally
changing due to the characteristics of Big Data and Big Data applications. The corresponding challenges were
previously discussed, e.g., in [
        <xref ref-type="bibr" rid="ref23 ref3 ref5">3, 5, 23</xref>
        ]. We primarily use the common classification of the “3 Vs of Big Data”
challenges<fn id="fn1"><label>1</label><p>It is well known that several other classifications are available, e.g., [<xref ref-type="bibr" rid="ref23">23</xref>] uses 4V and [<xref ref-type="bibr" rid="ref21 ref7">7, 21</xref>] use 5V.</p></fn> (cf. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]), as used by many other authors ([
        <xref ref-type="bibr" rid="ref17 ref18 ref22">17, 18, 22</xref>
        ]), to highlight some of the challenges from the
visual perspective:
      </p>
      <list list-type="bullet">
        <list-item><p>The volume of the data might be beyond the scale a human will ever be capable of processing (referred to in this paper
as Challenge 1). For example, the number of entities in Wikipedia/Wikidata is around 47 million<sup>2</sup>.
Hence, a single person cannot know all of them, nor is it possible to show even fractions of this set in a
search result.</p></list-item>
        <list-item><p>The variety of the data is expanding beyond a limit that can be presented in a reasonable way
(Challenge 2). For example, the Wikipedia entity “Florence” (Italian city, located in Tuscany) contains 530
properties<sup>3</sup> (e.g., the continent, the head of government and the time zone). Hence, even the visualization of information
about a single entity needs to be addressed to enable proper user understanding and interaction.</p></list-item>
        <list-item><p>The velocity of the data is increasing (Challenge 3). For example, data integration processes are adding new
instances to the considered data set due to the permanent growth of data in general<sup>4</sup>.</p></list-item>
        <list-item><p>Consequently, the processes evaluating the data are very complicated and not easy to understand for
common users. Hence, the required trust needs to be addressed by interface components that provide insights
into the data and thereby establish confidence in the provided application. This challenge is not covered by the
3V classification. In this paper, we refer to it as Challenge 4.</p></list-item>
      </list>
      <p>VisBIA 2018 – Workshop on Visual Interfaces for Big Data Environments in Industrial Applications. Co-located with AVI 2018 – International
Conference on Advanced Visual Interfaces, Resort Riva del Sole, Castiglione della Pescaia, Grosseto (Italy), 29 May 2018.
© 2018 Copyright held by the owner/author(s).</p>
      <p>In this paper we provide good practices for applications on top of Big Data. These insights are derived
from a Question Answering (QA) application that uses the data of the Wikipedia/Wikidata Knowledge Base (KB) to answer
factoid questions. Hence, it covers general knowledge considered to be relevant for mankind (Wikidata: zipped
file size ca. 23 GB<sup>5</sup>). A concrete example would be to answer a question like “Give me museums in Florence?” using
information contained in a Knowledge Base like Wikidata. As the data used represents actual facts, i.e., knowledge,
we refer to our approach as knowledge-driven.</p>
      <p>
        We elaborate our insights using the prototypical UI called Trill [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], a user interface that can be used to
display the results of QA systems over large KBs. All screenshots in this publication show Trill combined in the
backend with the QA system WDAqua-core1 [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. A live demo is available at www.wdaqua.eu/qa.
      </p>
      <p>The paper is organized as follows. In Section 2 we give a brief overview of related work. Sections 3 and
4 address the challenges mentioned above: the former provides insights into the visualizations, while the latter
addresses the interaction points built into the UI. Examples from Trill are provided throughout.
Conclusions and future work are provided in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2 RELATED WORK</title>
      <p>
        In Big Data environments, visualizations are important to cope with the volume, variety and velocity of the data.
For example, [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] presents visualization techniques to explore the human connectome, i.e., how the neurons in the
human brain are connected. [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] presents visualization techniques for scientists to better understand aspects of
tropical meteorology from large-scale spatiotemporal climate simulation data. Finally, [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] describes methods for
the visualization of fluid dynamics.
      </p>
      <p>
        Only a few works offer an enhanced UI, although the challenges beyond traditional search are well known
(e.g., [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]) and addressed by UIs with enhanced visual components (e.g., [
        <xref ref-type="bibr" rid="ref15 ref21">15, 21</xref>
        ]).
      </p>
      <p>
        In the domain of QA, users also have to deal with huge amounts of data. In recent years, a big effort was made
to develop high-quality QA systems capable of answering natural language questions using unstructured or
(semi-)structured data. This is also the case in the more particular domain of QA over KBs: one can count more
than 40 systems evaluated on publicly available benchmarks. For an exhaustive list we refer to [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Most of these
works focus on the translation of natural language questions to formal representations, particularly to SPARQL
queries. Some exceptions that we are aware of are SINA [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], QAKiS [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and Platypus<sup>6</sup>. In the following we use
the user interface (UI) named Trill for providing examples. It might be considered a reference implementation
of how to cope with the large amount of data that a user may have to deal with when large answer sets are
computed. While the UIs of the above-cited QA systems only provide simple representations of the answers, Trill
is able to present answer sets from different dimensions and with different granularities.
      </p>
      <fn-group>
        <fn id="fn2"><label>2</label><p>Number as of 28/04/2018. Updated statistics can be found at www.wikidata.org</p></fn>
        <fn id="fn3"><label>3</label><p>query.wikidata.org/#SELECT%20%28count%28%2a%29%20as%20%3Fc%29%0AWHERE%20%7B%0A%20%20wd%3AQ2044%20%3Fp%20%3Fo%20.%0A%7D</p></fn>
        <fn id="fn4"><label>4</label><p>cf. tools.wmflabs.org/wikidata-todo/stats.php</p></fn>
        <fn id="fn5"><label>5</label><p>cf. wikidata.org/wiki/Wikidata:Database_download</p></fn>
      </fn-group>
    </sec>
    <sec id="sec-3">
      <title>3 VISUALIZATION</title>
      <p>An important aspect of QA over KBs is the requirement to present the answers, which may form a complex data structure,
to a (regular) user. Note that most research focuses on the process of answer generation, i.e., on deriving a formal
representation from a natural language question (e.g., a SQL query for Relational Database Management
Systems, or SPARQL for triplestores). While this is the first step to make data accessible to end users, it is certainly
not the last. For example, the answer to a SPARQL query is a set of URIs, which are the identifiers of the
actual data of the corresponding result items; e.g., the unique identifier of the city of Florence (Italy) is
https://www.wikidata.org/wiki/Q2044. Via this URI a large amount of information can be retrieved, e.g., for Florence
there are 530 associated properties. Clearly, only some of these can be shown to the user, because the volume and
variety of the data are too large (cf. Challenge 1).</p>
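      <p>The property count quoted above comes from the count query linked in the corresponding footnote. As a minimal sketch, the following Python snippet rebuilds that link from the plain SPARQL text. The helper name build_query_ui_url is hypothetical, the URL scheme (percent-encoded query after the “#”) simply mirrors the footnote link, and no request is sent.</p>
      <preformat><![CDATA[
```python
from urllib.parse import quote

# Count query from the footnote: number of statements attached to Florence (wd:Q2044).
FLORENCE_COUNT_QUERY = (
    "SELECT (count(*) as ?c)\n"
    "WHERE {\n"
    "  wd:Q2044 ?p ?o .\n"
    "}"
)


def build_query_ui_url(sparql: str) -> str:
    """Return a link that opens the query in the Wikidata query UI.

    Hypothetical helper: the query text is percent-encoded and appended
    after '#', mirroring the link given in the footnote.
    """
    return "https://query.wikidata.org/#" + quote(sparql, safe="")


url = build_query_ui_url(FLORENCE_COUNT_QUERY)
print(url)
```
]]></preformat>
      <p>Swapping wd:Q2044 for another entity identifier yields the corresponding count for that entity.</p>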
      <p>For designing the UI, the following question needs to be answered: Which information should be presented
to the user such that it provides an acceptable user experience?</p>
    </sec>
    <sec id="sec-4">
      <title>3.1 Aggregation of Answer Sets</title>
      <p>In the case of multiple entities, we offer three different ways to visualize the answer set. We call these different
ways answer aggregations.</p>
      <list list-type="simple">
        <list-item><label>(1)</label><p>The first aggregates the labels and consists of the list of labels of the answer items. An example is shown in
Figure 1c.</p></list-item>
        <list-item><label>(2)</label><p>The second aggregates the pictures of the answer entities. The answer entity depictions are shown next to
each other with a small textual label. An example is shown in Figure 1a.</p></list-item>
        <list-item><label>(3)</label><p>Finally, the third aggregation happens on the level of geographic coordinates. We aggregate the geographic
coordinates and display the points together on a map. By clicking on a point, the name of the entity
appears. An example is shown in Figure 1b.</p></list-item>
      </list>
      <p>Clearly, this approach addresses Challenge 1: the aggregation level increases from (1), the default solution
for representing data, to (2) and (3).</p>
      <p>
        The aggregation level of images (2) is higher than that of textual representations. Additionally, visual
representations of a search result item decrease the cognitive interpretation time compared to textual
representations, and the images out of the search context are expected to be identified faster (cf. [
        <xref ref-type="bibr" rid="ref2 ref25 ref6">2, 6, 25</xref>
        ]). Hence,
despite the larger number of images within the viewport of the user it can also be assumed that users will process
the search result visualized by images faster. The aggregation level of a map is even higher. If required, a map of
the whole world can be presented visualizing huge numbers of items (e.g., following [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]).
      </p>
      <p>However, it is not always meaningful to show a particular type of aggregation. While the labels of the items
within the answer set can always be shown, this is not the case for images and geographic coordinates. For
example, for a list of persons the aggregation by geographic coordinates is meaningless, although geospatial
properties are attached to a person entity (e.g., location of birth or death, various travel locations). To decide
which aggregation to show, the ratios of entities having an associated image and associated geographic coordinates are computed.</p>
      <p>a) UI element showing the museums in Florence
aggregated by image. The user can see for each museum a picture,
the name of the museum and, by clicking on the "i" icon,
get more information about the entity.</p>
      <p>b) UI element showing the museums in Florence
aggregated by geo-location. The user can see how the museums
are distributed in space.</p>
      <fn-group>
        <fn id="fn6"><label>6</label><p>http://sina.aksw.org, http://live.ailao.eu, http://qakis.org/qakis2/, https://askplatyp.us/?</p></fn>
      </fn-group>
    </sec>
    <sec id="sec-5">
      <title>c) UI element showing the museums in Florence aggregated by label</title>
      <p>To get more information about a museum,
it suffices to click on the corresponding instance.</p>
    </sec>
    <sec id="sec-6">
      <title>d) This figure shows all the information that is shown to the user for the answer entity “Casa Buonarroti”, a museum in Florence</title>
      <p>If the ratio is above a predefined threshold, the corresponding aggregation is displayed. The aggregations are
prioritized as follows:</p>
      <list list-type="simple">
        <list-item><label>(1)</label><p>If the ratio of items with an available depiction is above the predefined threshold, the depictions are shown.</p></list-item>
        <list-item><label>(2)</label><p>Else, if the ratio of items with available geographic coordinates is above the predefined threshold, a map showing
the search result items as pins is presented to the user.</p></list-item>
        <list-item><label>(3)</label><p>Else, the list of labels of the items of the result set is shown.</p></list-item>
      </list>
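      <p>The prioritization above can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not Trill's actual implementation: the AnswerItem type, the function name choose_aggregation and the threshold value 0.5 are hypothetical, since the text only states that a predefined threshold is used.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AnswerItem:
    """Hypothetical container for one item of the answer set."""
    label: str
    image_url: Optional[str] = None                    # depiction, if any
    coordinates: Optional[Tuple[float, float]] = None  # (lat, lon), if any


def choose_aggregation(items: List[AnswerItem], threshold: float = 0.5) -> str:
    """Pick 'images', 'map' or 'labels' following the described priority.

    The default threshold of 0.5 is an assumption made for this sketch.
    """
    if not items:
        return "labels"
    image_ratio = sum(item.image_url is not None for item in items) / len(items)
    geo_ratio = sum(item.coordinates is not None for item in items) / len(items)
    if image_ratio > threshold:
        return "images"
    if geo_ratio > threshold:
        return "map"
    return "labels"  # labels can always be shown


# Illustrative data only: two of three items have a depiction.
museums = [
    AnswerItem("Uffizi", image_url="uffizi.jpg", coordinates=(43.77, 11.26)),
    AnswerItem("Casa Buonarroti", image_url="casa.jpg"),
    AnswerItem("Bargello"),
]
print(choose_aggregation(museums))
```
]]></preformat>
      <p>Because the label list is the fallback, the function always returns a displayable aggregation, even for an empty or sparsely described answer set.</p>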
      <p>The presented aggregations improve the capabilities to handle Challenge 1 (the volume of the data), and thanks
to the adaptation to the entity type, Challenge 2 (the variety of the data) is addressed too. Note that by aggregating the
data on high-level concepts and by adapting the information shown to the associated semantic types,
Challenge 3 (the velocity) can also be handled.</p>
    </sec>
    <sec id="sec-7">
      <title>3.2 Answer Set Item</title>
      <p>Each of the aggregations allows a specific entity to be explored in more depth. We explored existing KBs like Wikidata
and DBpedia and found that there are a number of useful facts that can be shown directly to the user without
being overwhelming. The following information (if available) is shown (cf. Figure 1d):</p>
      <list list-type="order">
        <list-item><p>the label of the answer entity in the corresponding language,</p></list-item>
        <list-item><p>a textual description retrieved from Wikipedia,</p></list-item>
        <list-item><p>a map indicating the location of the entity, taken from OpenStreetMap,</p></list-item>
        <list-item><p>an image depicting the entity, which is the best way to immediately identify the answer entity,</p></list-item>
        <list-item><p>external links related to the entity, such as the webpage, the Wikipedia entry, and links to Twitter, Facebook,
Instagram, Google Scholar and ORCID.</p></list-item>
      </list>
      <p>Finally, for some entities, for example songs (like Longview (song by Green Day)), a video is shown.<sup>7</sup> This information not
only adds context to the answer entity, increasing the confidence of the user, but also gives direct opportunities to
consume information that is directly linked to the searched entity.</p>
      <p>
        However, besides the properties that were picked manually due to their indisputable importance, there might be a huge
number of additional properties (cf. Challenge 2, Section 1). For this reason, a mechanism is required to
distinguish more important properties from less important ones. Consequently, we introduce an approach that computes
the top-k most relevant facts for the entity and presents only them to the user. These properties are computed
using the SummaServer [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. This not only provides some relevant facts to contextualize the entity but also
increases the discoverability of the dataset itself. The user can browse through the proposed facts and discover the
dataset in a serendipitous way. In Trill, k was set to 5, as shown in Figure 1d.
      </p>
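      <p>Conceptually, the top-k selection amounts to ranking the facts of an entity by a relevance score and keeping the k best. The following Python sketch illustrates only this selection step. It does not reproduce the SummaServer or its interface: the Fact type, the function top_k_facts and the score values are hypothetical and chosen purely for illustration.</p>
      <preformat><![CDATA[
```python
import heapq
from typing import List, NamedTuple


class Fact(NamedTuple):
    """Hypothetical fact about an entity with an assumed relevance score."""
    prop: str
    value: str
    score: float


def top_k_facts(facts: List[Fact], k: int = 5) -> List[Fact]:
    """Return the k facts with the highest score, best first."""
    return heapq.nlargest(k, facts, key=lambda fact: fact.score)


# Illustrative facts about Florence; the scores are made up for this sketch.
facts = [
    Fact("continent", "Europe", 0.4),
    Fact("country", "Italy", 0.9),
    Fact("time zone", "UTC+01:00", 0.2),
    Fact("located in", "Tuscany", 0.8),
]
for fact in top_k_facts(facts, k=2):
    print(fact.prop, "=", fact.value)
```
]]></preformat>
      <p>With k = 5, as used in Trill, an entity with hundreds of properties is reduced to a handful of facts that fit next to the label, description and image.</p>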
      <p>Hence, the answer presentation also adapts to the properties of the data, addressing Challenge 1 and
Challenge 2.</p>
    </sec>
    <sec id="sec-8">
      <title>4 USER INTERACTION AND FEEDBACK</title>
      <p>
        In this section we describe the interaction possibilities that we implemented in Trill [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The goal of
these interaction methods is to handle the huge amount of information which we have to deal with. Note that in
applications built upon Big Data, collecting feedback is mostly crucial due to the underlying AI algorithms (here,
Question Answering) and their improvement.<sup>8</sup> Hence, each user feedback is added to the training data to be used
for improving the search quality (cf. [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]) after the next release of the UI.<sup>9</sup> This follows the high-level intention
to increase the data accessibility by improving the system’s quality (over time) w.r.t. the computed answer set while
exploiting the user interaction. Hence, touchpoints for users need to be implemented. Here, three
methods for feedback collection are presented.
      </p>
      <fn-group>
        <fn id="fn7"><label>7</label><p>Longview (song by Green Day)</p></fn>
        <fn id="fn8"><label>8</label><p>In Trill the collected log files are used to construct a new benchmark called WDAqua-core0 that is described in [<xref ref-type="bibr" rid="ref13">13</xref>] and is available under [<xref ref-type="bibr" rid="ref8">8</xref>].</p></fn>
        <fn id="fn9"><label>9</label><p>Note that even “wrong” user feedback will not break this approach, as machine learning algorithms are capable of handling conflicting data.</p></fn>
      </fn-group>
    </sec>
    <sec id="sec-9">
      <title>a) Whenever a user asks a question, a simple box appears asking the user for feedback</title>
      <p>It can be seen in the lower part of this figure.</p>
    </sec>
    <sec id="sec-10">
      <title>b) The “Did you mean” functionality allows a user to interact with the QA system to specify the specific entities the question referred to</title>
      <p>The interpretation is then adjusted.</p>
    </sec>
    <sec id="sec-11">
      <title>c) Expert users who understand SPARQL can choose between all SPARQL queries generated by the question answering system</title>
      <p>By clicking on the corresponding SPARQL query, the answer set is recomputed.</p>
      <p>Fig. 2. Different types of user feedback interfaces that are integrated into Trill. a) shows a very simple user feedback interface
that is used to collect training data. b) and c) present interfaces that can be used when the question is ambiguous.</p>
      <p>The first interaction method (i) addresses common users. Therefore, the requirements for interaction are
kept intentionally very low. When a user asks a question and the answer is computed, an easy-to-understand form
is shown, offering the user the option to provide feedback on whether the computed answer is correct or wrong.
An example is given in Figure 2a.</p>
      <p>The second interaction is used to engage the user when the question was not interpreted correctly by the QA
system. To deal with such cases we offer two interfaces. One addresses experienced end users and the other one expert
users (i.e., users that are able to understand the technical query language used in the backend).</p>
      <p>Experienced end users can interact with the QA system to specify the specific entities the question referred to,
i.e., they clarify the context/interpretation of the current question (ii). The interpretation is then adjusted.
Note that this problem particularly arises on KBs containing large volumes of data, since the user’s question can be
interpreted in many different ways (e.g., the question “How old is Paris?” might be interpreted as the request for the
age of a place, a person, a mythical hero, etc.). An example is given in Figure 2b.</p>
      <p>In the expert interface (iii), multiple queries generated by the QA system are listed (in Trill, SPARQL queries
are shown). By clicking on one of them, an expert user triggers the computation and visualization of the corresponding
answer set. This way an expert user can interact with the system and find an answer even if the system could not
directly provide one. An example is given in Figure 2c.</p>
      <p>By providing these user touchpoints (i), (ii), and (iii), we address Challenge 4.</p>
    </sec>
    <sec id="sec-12">
      <title>5 CONCLUSION</title>
      <p>In this paper we gave an overview of typical challenges that need to be addressed when implementing
applications driven by Big Data. To provide insights into the good practices experienced, we used a real-world
question answering system to address the four identified challenges that arise when establishing applications on top of
Big Data. The system covered the collected requirements w.r.t. (i) the presentable number of data elements in a
result set (aggregation), (ii) the variety of the properties of data elements (selection), and (iii) different depths of
involvement derived from the capabilities of the expected user groups (regular user, experienced user, expert user).</p>
      <p>Acknowledgments. Parts of this work received funding from the European Union’s Horizon 2020 research and
innovation programme under the Marie Skłodowska-Curie grant agreement No. 642795, project: Answering
Questions using Web Data (WDAqua).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Johanna</given-names>
            <surname>Beyer</surname>
          </string-name>
          , Markus Hadwiger, Ali Al-Awami,
          <string-name>
            <given-names>Won-Ki</given-names>
            <surname>Jeong</surname>
          </string-name>
          , Narayanan Kasthuri, Jeff W Lichtman, and
          <string-name>
            <given-names>Hanspeter</given-names>
            <surname>Pfister</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Exploring the connectome: Petascale volume visualization of microscopy data streams</article-title>
          .
          <source>IEEE computer graphics and applications 33</source>
          ,
          <issue>4</issue>
          (
          <year>2013</year>
          ),
          <fpage>50</fpage>
          -
          <lpage>61</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>David</given-names>
            <surname>Beymer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Peter Z.</given-names>
            <surname>Orton</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Daniel M.</given-names>
            <surname>Russell</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>An Eye Tracking Study of How Pictures Influence Online Reading</article-title>
          . In Human-Computer Interaction - INTERACT
          <year>2007</year>
          ,
          <string-name>
            <given-names>Cécilia</given-names>
            <surname>Baranauskas</surname>
          </string-name>
          , Philippe Palanque, Julio Abascal, and Simone Diniz Junqueira Barbosa (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg,
          <fpage>456</fpage>
          -
          <lpage>460</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Christian</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Boncz</surname>
          </string-name>
          , Michael L Brodie, and
          <string-name>
            <given-names>Orri</given-names>
            <surname>Erling</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>The Meaningful Use of Big Data: Four Perspectives-Four Challenges</article-title>
          .
          <source>SIGMOD Record 40</source>
          ,
          <issue>4</issue>
          (
          <year>2011</year>
          ),
          <fpage>57</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Elena</given-names>
            <surname>Cabrio</surname>
          </string-name>
          , Julien Cojan, Alessio Palmero Aprosio, Bernardo Magnini, Alberto Lavelli, and
          <string-name>
            <given-names>Fabien</given-names>
            <surname>Gandon</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>QAKiS: an open domain QA system based on relational patterns</article-title>
          . In International Semantic Web Conference, ISWC
          <year>2012</year>
          . CEUR-WS.org.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>CL Philip</given-names>
            <surname>Chen</surname>
          </string-name>
          and
          <string-name>
            <given-names>Chun-Yang</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Data-intensive applications, challenges, techniques and technologies: A survey on Big Data</article-title>
          .
          <source>Information Sciences</source>
          <volume>275</volume>
          (
          <year>2014</year>
          ),
          <fpage>314</fpage>
          -
          <lpage>347</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Terry L</given-names>
            <surname>Childers</surname>
          </string-name>
          and
          <string-name>
            <given-names>Michael J</given-names>
            <surname>Houston</surname>
          </string-name>
          .
          <year>1984</year>
          .
          <article-title>Conditions for a picture-superiority effect on consumer memory</article-title>
          .
          <source>Journal of consumer research 11</source>
          ,
          <issue>2</issue>
          (
          <year>1984</year>
          ),
          <fpage>643</fpage>
          -
          <lpage>654</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Yuri</given-names>
            <surname>Demchenko</surname>
          </string-name>
          , Paola Grosso, Cees De Laat, and
          <string-name>
            <given-names>Peter</given-names>
            <surname>Membrey</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Addressing big data issues in scientific data infrastructure</article-title>
          .
          <source>In Collaboration Technologies and Systems (CTS)</source>
          ,
          <source>2013 International Conference on. IEEE</source>
          ,
          <fpage>48</fpage>
          -
          <lpage>55</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Dennis</given-names>
            <surname>Diefenbach</surname>
          </string-name>
          .
          <year>2017</year>
          . WDAquaCore0Questions. https://github.com/WDAqua/WDAquaCore0Questions.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Dennis</given-names>
            <surname>Diefenbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Shanzay</given-names>
            <surname>Amjad</surname>
          </string-name>
          , Andreas Both,
          <string-name>
            <given-names>Kamal</given-names>
            <surname>Singh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Pierre</given-names>
            <surname>Maret</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Trill: A reusable Front-End for QA systems</article-title>
          . In ESWC P&amp;D.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Dennis</given-names>
            <surname>Diefenbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Both</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kamal</given-names>
            <surname>Singh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Pierre</given-names>
            <surname>Maret</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Towards a Question Answering System over the Semantic Web</article-title>
          .
          <source>Semantic Web Journal (under review)</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Dennis</given-names>
            <surname>Diefenbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Niousha</given-names>
            <surname>Hormozi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Shanzay</given-names>
            <surname>Amjad</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Both</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Introducing Feedback in Qanary: How Users can interact with QA systems</article-title>
          . In ESWC P&amp;D.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Dennis</given-names>
            <surname>Diefenbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Vanessa</given-names>
            <surname>Lopez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kamal</given-names>
            <surname>Singh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Pierre</given-names>
            <surname>Maret</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Core techniques of question answering systems over knowledge bases: a survey</article-title>
          .
          <source>Knowledge and Information Systems</source>
          (
          <year>2017</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Dennis</given-names>
            <surname>Diefenbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Tanon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kamal</given-names>
            <surname>Singh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Pierre</given-names>
            <surname>Maret</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Question Answering Benchmarks for Wikidata</article-title>
          .
          In
          <source>ISWC</source>
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Dennis</given-names>
            <surname>Diefenbach</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Thalhammer</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>PageRank and Generic Entity Summarization for RDF Knowledge Bases</article-title>
          .
          In
          <source>ESWC</source>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Marian</given-names>
            <surname>Dörk</surname>
          </string-name>
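          ,
          <string-name>
            <given-names>Sheelagh</given-names>
            <surname>Carpendale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Collins</surname>
          </string-name>
          , and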
          <string-name>
            <given-names>Carey</given-names>
            <surname>Williamson</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Visgets: Coordinated visualizations for web-based information exploration and discovery</article-title>
          .
          <source>IEEE Transactions on Visualization and Computer Graphics</source>
          <volume>14</volume>
          ,
          <issue>6</issue>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>David</given-names>
            <surname>Ellsworth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Bryan</given-names>
            <surname>Green</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Moran</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Interactive Terascale Particle Visualization</article-title>
          .
          In
          <source>Proceedings of the Conference on Visualization '04 (VIS '04)</source>
          . IEEE Computer Society, Washington, DC, USA,
          <fpage>353</fpage>
          -
          <lpage>360</lpage>
          . https://doi.org/10.1109/VISUAL.2004.55
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Gartner</surname>
          </string-name>
          . [n. d.]. IT Glossary - Big Data. Available from https://www.gartner.com/it-glossary/big-data/, Accessed 2018-05-20.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Doug</given-names>
            <surname>Laney</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>3-D Data Management: Controlling Data Volume, Velocity, and Variety</article-title>
          .
          <volume>6</volume>
          (
          <issue>01</issue>
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
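          <string-name>
            <given-names>Teng-Yok</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xin</given-names>
            <surname>Tong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Han-Wei</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Pak Chung</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Samson</given-names>
            <surname>Hagos</surname>
          </string-name>
          , and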
          <string-name>
            <given-names>L Ruby</given-names>
            <surname>Leung</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Feature tracking and visualization of the Madden-Julian Oscillation in climate simulation</article-title>
          .
          <source>IEEE Computer Graphics and Applications</source>
          <volume>33</volume>
          ,
          <issue>4</issue>
          (
          <year>2013</year>
          ),
          <fpage>29</fpage>
          -
          <lpage>37</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>James K</given-names>
            <surname>Rayson</surname>
          </string-name>
          .
          <year>1999</year>
          .
          <article-title>Aggregate towers: Scale sensitive visualization and decluttering of geospatial data</article-title>
          .
          In
          <source>Information Visualization, 1999. (Info Vis '99) Proceedings. 1999 IEEE Symposium on</source>
          . IEEE
          ,
          <fpage>92</fpage>
          -
          <lpage>99</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
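          <string-name>
            <given-names>Tuukka</given-names>
            <surname>Ruotsalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Giulio</given-names>
            <surname>Jacucci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Petri</given-names>
            <surname>Myllymäki</surname>
          </string-name>
          , and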
          <string-name>
            <given-names>Samuel</given-names>
            <surname>Kaski</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Interactive intent modeling: Information discovery beyond search</article-title>
          .
          <source>Commun. ACM</source>
          <volume>58</volume>
          ,
          <issue>1</issue>
          (
          <year>2015</year>
          ),
          <fpage>86</fpage>
          -
          <lpage>92</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Philip</given-names>
            <surname>Russom</surname>
          </string-name>
          et al.
          <year>2011</year>
          .
          <article-title>Big data analytics</article-title>
          .
          <source>TDWI Best Practices Report, Fourth Quarter</source>
          <volume>19</volume>
          ,
          <issue>4</issue>
          (
          <year>2011</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>34</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Schroeck</surname>
          </string-name>
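          ,
          <string-name>
            <given-names>Rebecca</given-names>
            <surname>Shockley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Janet</given-names>
            <surname>Smart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Dolores</given-names>
            <surname>Romero-Morales</surname>
          </string-name>
          , and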
          <string-name>
            <given-names>Peter</given-names>
            <surname>Tufano</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Analytics: The real-world use of big data</article-title>
          .
          <source>IBM Global Business Services</source>
          <volume>12</volume>
          (
          <year>2012</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Saeedeh</given-names>
            <surname>Shekarpour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Edgard</given-names>
            <surname>Marx</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Axel-Cyrille</given-names>
            <surname>Ngonga Ngomo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Sören</given-names>
            <surname>Auer</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Sina: Semantic interpretation of user queries for question answering on interlinked data</article-title>
          .
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Roger N.</given-names>
            <surname>Shepard</surname>
          </string-name>
          .
          <year>1967</year>
          .
          <article-title>Recognition memory for words, sentences, and pictures</article-title>
          .
          <source>Journal of Verbal Learning and Verbal Behavior</source>
          <volume>6</volume>
          ,
          <issue>1</issue>
          (
          <year>1967</year>
          ),
          <fpage>156</fpage>
          -
          <lpage>163</lpage>
          . https://doi.org/10.1016/S0022-5371(67)80067-7
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Trotman</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Learning to rank</article-title>
          .
          <source>Information Retrieval</source>
          <volume>8</volume>
          ,
          <issue>3</issue>
          (
          <year>2005</year>
          ),
          <fpage>359</fpage>
          -
          <lpage>381</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Ryen W</given-names>
            <surname>White</surname>
          </string-name>
          and
          <string-name>
            <given-names>Resa A</given-names>
            <surname>Roth</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Exploratory search: Beyond the query-response paradigm</article-title>
          .
          <source>Synthesis Lectures on Information Concepts, Retrieval, and Services</source>
          <volume>1</volume>
          ,
          <issue>1</issue>
          (
          <year>2009</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>98</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>