<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Semantic Combining for Exploration of Environmental and Disease Data Dashboard for Clinician Researchers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Albert Navarro-Gallinad</string-name>
          <email>albert.navarro@adaptcentre.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alan Meehan</string-name>
          <email>alan.meehan@adaptcentre.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Declan O'Sullivan</string-name>
          <email>declan.osullivan@adaptcentre.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ADAPT Centre for Digital Content, Trinity College Dublin, Dublin, Ireland School of Computer Science and Statistics, Trinity College Dublin</institution>
          ,
          <addr-line>Dublin</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <fpage>73</fpage>
      <lpage>85</lpage>
      <abstract>
        <p>While Semantic Web technologies facilitate the integration of heterogeneous data sources through the Resource Description Framework (RDF) and ontologies, they present an obstacle for non-technical researchers who want to access and explore the data to meet their needs. To address this problem, visual tools and analytical platforms with a user-centred approach are an emerging solution. This paper outlines the design of a dashboard called Semantic Combining for Exploration of Environmental and Disease data (SCEED), an initial visual tool designed for use by clinician researchers to explore and retrieve combined environmental and disease data for further analysis. The evaluation of SCEED consists of a combination of standard usability and effectiveness methods, using the AVERT project as a case study. In the AVERT project, clinician researchers need to address the challenges of querying specific vasculitis flare clinical data for a particular patient to retrieve linked environmental data from a triplestore, and downloading the chosen data as input for their statistical models. The initial evaluation has concluded that the SCEED dashboard is an adequate initial design to fulfil the clinician researchers' requirements, and points towards an interface that engages clinician researchers directly with Linked Data. Furthermore, this paper helps to highlight the difficulty of conducting usability evaluations with small sample sizes and how evaluation metrics can be combined to assess the requirements for developing an effective tool.</p>
      </abstract>
      <kwd-group>
        <kwd>Semantic Web</kwd>
        <kwd>dashboard</kwd>
        <kwd>usability evaluation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Semantic Web technologies have a steep learning curve which can present an
obstacle for non-technical researchers when trying to access and explore the
data for their needs. Visualization tools and analytical platforms operating on
top of Semantic Web architectures can support accessing and exploring Linked
Data for non-technical or non-domain expert users by aiding query formulation
in an intelligible manner [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. A dashboard approach with multiple coordinated
views offers advantages for statistical data including the integration of multiple
data sources, details of the underlying data, a flexible data analysis layer and
a reusable framework [
        <xref ref-type="bibr" rid="ref16 ref18 ref3">16, 18, 3</xref>
        ]. Furthermore, a user-centred approach to
dashboard design provides an easy and intuitive interface to be used by a focused
group [
        <xref ref-type="bibr" rid="ref11 ref16 ref18 ref3">16, 18, 3, 11</xref>
        ], and the use of standard usability evaluation with standard
post-questionnaires enables comparison of prototype tools with later versions of
the tool, as well as with other tools.
      </p>
      <p>At the moment, clinician researchers typically require knowledge engineers to
access, explore and retrieve the data that they are interested in when datasets
are implemented using standard Semantic Web technologies. Therefore, there is an
opportunity to propose a semantic analytical platform following a user-centred
approach. In particular, health-related statisticians and clinicians with
statistical experience (hereafter clinician researchers) lack tools to explore clinical and
environmental linked data, which will be used as input to train their models
(described further in Section 3).</p>
      <p>This paper outlines the design of the Semantic Combining for Exploration
of Environmental and Disease data (SCEED) dashboard, an initial visual tool
designed to be used by clinician researchers to explore and retrieve combined
environmental and disease data for further analysis. The contribution of this
paper is the SCEED dashboard itself along with an initial evaluation of the
usability and effectiveness through a standard usability test.</p>
      <p>The paper is structured as follows. Section 2 reviews related research. Section
3 overviews the AVERT project. Section 4 outlines the design and
implementation of the SCEED dashboard. Section 5 describes the evaluation method, the
results and analysis of the user experiment and discusses the outcome of this
initial evaluation. Section 6 concludes the paper and states the future work.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>This section overviews the state of the art in Semantic Web-based visual tools
to meaningfully explore linked data for clinician researchers. We have classified
the reviewed tools based on the usability evaluation of their visual techniques.</p>
      <p>
        Relevant tools where non-standard usability evaluation was used. The
Granatum project addresses the computational challenges that genomic
scientists have in analysing single-cell RNA sequencing data with a graphical analysis
pipeline [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. As part of the project, Hasnain et al. 2014 [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] evaluate the
developed Linked Biomedical Dataspace for supplementing drug discovery with domain
experts, following a user-centred approach for bioinformaticians and biomedical
researchers. This dataspace uses ReVealD as the visual query system, which
is evaluated with the 'Tracking Real-time User Experience (TRUE)' methodology,
and which later became an integrated platform for this project. Kamdar et al. 2014 [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
evaluate this platform for biomedical researchers with metrics such as the number
of steps and the time taken per step to complete a task. The fact that an ad hoc usability
questionnaire was used in this particular study hinders further comparison
with other studies. Likewise, Villanueva-Rosales et al. 2015 [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] use an ad hoc
usability questionnaire to evaluate, with multidisciplinary participants, an
experimental graphical user interface for The Earth, Life and Semantic Web Project
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In addition, Scharl et al. 2017 [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] assess the usability of a semantic
analytical platform with heuristic evaluation, formative usability tests
and feedback from actual users (communication professionals), all of which are
non-standard usability/effectiveness metrics.
      </p>
      <p>
        Relevant tools where standard usability evaluation was used. Sabol
et al. 2014 [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] present a toolchain to explore and visually analyse Linked Data
for non-Semantic Web experts. The authors evaluate the work from a formative
usability angle with quantitative metrics, namely the standard NASA Task Load
Index (TLX) for workload and the time per task, and qualitative metrics from a
think-aloud protocol. Furthermore, Dafli et al. 2015 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] evaluate the usability and efficacy of the
OpenLabyrinth extension with specific scenarios aimed at health professionals. The
metrics used were the System Usability Scale (SUS) questionnaire, a standard method,
in combination with the eViP questionnaire and expert reviews. The SUS
questionnaire is also used by Brașoveanu et al. 2016 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] in combination with the
time per task and discussion of the task results for a user study with tourism
researchers and practitioners; and by Zainab et al. 2015 [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] with an additional
custom questionnaire designed to test the FedViz interface for researchers and
engineers in the Semantic Web field.
      </p>
      <p>In contrast to the related work outlined above, our approach focuses
on providing access to, and exploration of, linked environmental and clinical data
for clinician researchers. In this paper, we present a dashboard with a
user-centred approach aimed at clinician researchers and assessed with a standard
usability/effectiveness evaluation, enabling comparison with later versions of the
SCEED tool, and with other related tools.</p>
    </sec>
    <sec id="sec-3">
      <title>AVERT Background</title>
      <p>
        AVERT<sup>1</sup> and HELICAL<sup>2</sup> are two projects in the field of Healthcare Data
Linkage which share the same data integration approach based on Semantic Web
technologies [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], providing a scalable semantic architecture for data related to
rare chronic diseases. The data model links clinical data for patients with ANCA
vasculitis (a rare kidney disease) with environmental data, with the goal of
predicting when flares of the disease may occur for individual patients.
      </p>
      <p>
        In the AVERT semantic architecture [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], Semantic Web technologies are
used to combine multiple diverse data sources, namely medical registries and
environmental data, that share spatial and temporal features. These datasets are
converted to RDF [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] with R2RML [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], a mapping language from relational
databases to RDF datasets, and stored in a triplestore, allowing information retrieval
through semantic queries. SPARQL then enables the retrieval, manipulation
and linkage of the stored data. Currently, a knowledge engineer performs these
SPARQL queries to fulfil the clinician researchers' needs, a human-in-the-loop
approach.
      </p>
      <sec id="sec-3-1">
        <title>1 https://www.tcd.ie/medicine/thkc/avert/ 2 http://helical-itn.eu/</title>
        <p>The intention going forward in such projects is to allow the clinician
researchers themselves (through a dashboard) to access and explore the clinical
and environmental data that are represented and linked through Semantic Web
technologies. The data will be used as input to train their models to
potentially find associations and relationships between environmental factors and the
disease flares of patients.</p>
        <p>An effective tool should therefore satisfy the following requirements,
derived from expert consensus within AVERT:</p>
        <p>Requirement 1: enable the clinician researcher to query specific clinical
patient data to retrieve linked environmental data, without the need for knowledge
of the underpinning Semantic Web technologies;</p>
        <p>Requirement 2: support the clinician researcher's understanding of the
use and limitations of the linked environmental data to support identification of
flares for rare diseases;</p>
        <p>Requirement 3: allow for the download of selected clinical and environmental
data to be used as input in statistical models for data analysis.</p>
        <p>The SCEED dashboard is a prototype tool aimed at satisfying these three
requirements.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Design and Implementation</title>
      <p>The development of the dashboard was motivated by the needs of clinician
researchers in the AVERT project, who are not Semantic Web experts, to identify
relevant environmental data that should be linked to longitudinal ANCA
vasculitis patient clinical data to support spatio-temporal analysis of the data. This
is done in order to support prediction of flares for individual patients and to
ultimately support the discovery of environmental factors that trigger the disease
in the patient cohort.</p>
      <p>
        The dashboard operates on top of AVERT's semantic architecture (see Section
3), where data from multiple data sources is uplifted to RDF [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]: weather and
pollution data (27.5M triples) along with infectious disease data and clinical
data (2.6M triples). This relevant data is retrieved from a triplestore supporting
GeoSPARQL [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] queries, which are key given the spatio-temporal nature of our data.
      </p>
      <p>The initial dashboard was designed to have four tabs (see Fig. 1), each of
which is described next.</p>
      <p>Query tab. In the Query tab of the dashboard, Fig. 1 part a), the user
can select options for the different flare-related parameters. These selected
options are then substituted into a SPARQL query template, which is URL-encoded and
executed against the data in the triplestore. This tab is aimed at satisfying
Requirement 1.</p>
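      <p>The template substitution and URL encoding described for the Query tab can be sketched as follows. This is a minimal illustration and not the SCEED source code: the endpoint URL, the ex: vocabulary and the function names build_query and build_request_url are assumptions introduced for the example, since the paper does not list the AVERT ontology terms or the query text.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from string import Template
from urllib.parse import urlencode

# Hypothetical endpoint and vocabulary: the paper gives neither the
# triplestore URL nor the AVERT ontology terms.
ENDPOINT = "http://localhost:3030/avert/sparql"

QUERY_TEMPLATE = Template("""\
PREFIX ex: <http://example.org/avert#>
SELECT ?date ?variable ?value
WHERE {
  ?flare ex:patientId "$patient_id" .
  ?obs ex:linkedToFlare ?flare ;
       ex:daysBeforeFlare ?days ;
       ex:date ?date ;
       ex:variable ?variable ;
       ex:value ?value .
  FILTER (?days <= $days_before)
}
""")


def build_query(patient_id: str, days_before: int) -> str:
    """Substitute the options selected in the Query tab into the template."""
    # Reject anything that could break out of the quoted literal.
    if not patient_id.isalnum():
        raise ValueError("patient_id must be alphanumeric")
    return QUERY_TEMPLATE.substitute(
        patient_id=patient_id, days_before=int(days_before)
    )


def build_request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """URL-encode the query as the 'query' parameter of a SPARQL GET request."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_request_url(build_query("P001", 30)))
```
]]></preformat>
      <p>Validating the selected values before substitution keeps a free-text value from altering the structure of the query; with drop-down menus populated from the triplestore this check is mostly a safeguard.</p>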
      <p>Link data tab. This is the first tab the user sees after submitting a query,
Fig. 1 part b). The aim of this tab is to provide the environmental data linked
to the clinical patient data queried in the Query tab. This data is displayed as a
table with a hover feature that displays the description of each variable, gathered from
the climate data store provided by the European Centre for Medium-Range
Weather Forecasts (ECMWF) parameter database<sup>3</sup>. The table displayed after
the submission of the query can be downloaded as a CSV file (in support of
Requirement 3).</p>
      <p>Standard data tab. In the Standard data tab, Fig. 1 part c), the user can
compare different environmental variables for a better understanding of their
variability (in support of Requirement 2), since the variables have been
statistically standardized, that is, converted to the same scale as
standard scores (Z-scores). The table highlights some values with a
colour encoding that depends on the category of the value, and is available to download
as a CSV file.</p>
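      <p>As an illustration of the standardization behind the Standard data tab, the sketch below computes Z-scores as z = (x - mean) / standard deviation and assigns each value a category for highlighting. The use of the sample standard deviation, the threshold of 2 and the category names are assumptions made for the example; the paper does not state which were used in SCEED.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize a series: z = (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (an assumption)
    if sigma == 0:
        # A constant series has no variability to standardize.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def category(z, threshold=2.0):
    """Hypothetical categories used to decide which cells to highlight."""
    if z >= threshold:
        return "high"
    if z <= -threshold:
        return "low"
    return "typical"


if __name__ == "__main__":
    temperature = [281.2, 283.9, 279.5, 290.1, 282.4]  # made-up values
    for raw, z in zip(temperature, z_scores(temperature)):
        print(raw, round(z, 2), category(z))
```
]]></preformat>
      <p>Because every variable ends up on the same unitless scale, values such as temperature and precipitation can be compared side by side in one table.</p>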
      <p>Comparative data tab. A CSV file is stored with the data from the previously
submitted query. These data are then compared in an interactive plot whose legend
allows options to be selected and deselected, which is key for comparing the
environmental data related to multiple flares (see Fig. 1 part d). The multiline plot allows
a user to identify trends and seasonality, make comparisons and check
for the outliers previously discovered in the Standard data tab, improving their
comprehension of the environmental data prior to the patient's flare event
(in support of Requirement 2).</p>
      <p>Visualization tab. In this tab, Fig. 1 part e), the user can visualize the
last submitted query to have a cleaner view of each variable. The tab aims
to provide a quick insight into the data prior to download, to make sure it
meets the statistician's needs.</p>
      <p>The SCEED dashboard (shown in Fig. 1) is coded in Python (3.6), using
the Dash (1.7.0)<sup>4</sup> package as a framework that facilitates building cross-platform
analytical platforms. The dashboard is coded dynamically: the drop-down
options for each parameter are populated from the data available in the triplestore
endpoint. Therefore, if new data is added to the triplestore according to the same
data model, the dashboard will react accordingly, showing the newly available data.
This approach is ideal when managing both clinical data (since data collection is
an ongoing process) and environmental time series data, which is constantly
being updated.</p>
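      <p>One way the drop-down options could be derived from the triplestore is sketched below. It assumes the endpoint returns results in the standard SPARQL 1.1 Query Results JSON format and builds a list of label/value dictionaries, the shape accepted by the options argument of Dash's dcc.Dropdown. The function name dropdown_options, the variable name and the sample response are made up for the example; the paper does not describe the actual implementation.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def dropdown_options(sparql_json, variable):
    """Turn SPARQL JSON results into a list of label/value option dicts."""
    bindings = sparql_json["results"]["bindings"]
    # Keep distinct values only, so the menu mirrors what the endpoint holds.
    values = {row[variable]["value"] for row in bindings if variable in row}
    return [{"label": v, "value": v} for v in sorted(values)]


if __name__ == "__main__":
    # Made-up response in the SPARQL 1.1 Query Results JSON format.
    response = {
        "head": {"vars": ["patient"]},
        "results": {
            "bindings": [
                {"patient": {"type": "literal", "value": "P002"}},
                {"patient": {"type": "literal", "value": "P001"}},
                {"patient": {"type": "literal", "value": "P002"}},
            ]
        },
    }
    print(dropdown_options(response, "patient"))
```
]]></preformat>
      <p>Rebuilding the options from a query each time the page loads means that newly added patients or variables appear in the menus without any change to the dashboard code, provided they follow the same data model.</p>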
    </sec>
    <sec id="sec-5">
      <title>Evaluation</title>
      <p>An experiment was undertaken to evaluate the usability of the initial SCEED
dashboard in accessing environmental data linked with de-identified clinical
patient records, by clinician researchers who have no practical experience
with Semantic Web technologies. The user experiment consisted of a brief
introduction to the dashboard's background, the tasks to be completed by the
participants and a follow-up post-questionnaire.</p>
      <sec id="sec-5-1">
        <title>3 https://apps.ecmwf.int/codes/grib/param-db 4 https://dash.plotly.com/</title>
        <sec id="sec-5-1-1">
          <title>Experimental Setup and Execution</title>
          <p>
            The target group was clinician researchers who are not Semantic Web experts,
who would be users of the analytical platform and who had data exploration
needs in the health domain similar to those of the AVERT project. These targeted
selection criteria resulted in the recruitment of seven participants: PhD students
(3) and professors (4), who were experienced in analysing clinical data with
statistical models. The participants were within the 30-50 age range, had a
female to male ratio of 2:5 and were international with fluent English. This sample
size meets the requirements for evaluating a first-stage prototype of a novel user-interface
design that has a specialised nature (exploration of clinical and
linked environmental data) [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ].
          </p>
          <p>The experiment started with the participants signing the informed consent
document, which commenced with a short explanation about the purpose of
the dashboard, its main target contribution to research and a mention of the
semantic technologies operating in the back-end. This was the first contact with
the SCEED dashboard; no participant had interacted with or seen the
dashboard previously. Furthermore, each participant was asked to adopt a role
while testing SCEED: that of a researcher with access to clinical patient data
who would like to extract and comprehend environmental data related to patient
flares.</p>
          <p>
            Clinical data was simulated from the AVERT data model tailored to support
the chosen tasks. Environmental data was obtained from ERA5, the fifth
generation ECMWF atmospheric reanalysis of the global climate [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ], and reduced
to four variables (columns in the data table of Fig. 1), again to support timely
exploration of the dashboard for the given tasks.
          </p>
          <p>
            Each participant was asked to follow a concurrent think-aloud protocol (CTA)
[
            <xref ref-type="bibr" rid="ref2">2</xref>
            ]. The think-aloud protocol requires listening
to the participant's process while completing the tasks as well as encouraging the
think-aloud action. The participants' think-aloud statements and extra feedback
were recorded by hand by the experiment designer during the evaluation, for a
qualitative analysis with Grounded Theory [
            <xref ref-type="bibr" rid="ref10 ref22">10, 22</xref>
            ].
          </p>
          <p>As the experiment was conducted during COVID-19 restrictions, synchronous
remote testing was used, through a video conferencing platform
with remote control functions. Interestingly, we observed that the remote
nature of the study reinforced the ideal spectator role within the
participant-observer interaction, an optimal testing environment for the CTA method. An
hour was allocated to each participant to explore the dashboard, complete the
tasks and fill out the post-questionnaire.</p>
          <p>Each participant was asked to complete a series of tasks carefully selected
to assess the three core requirements of the dashboard stated previously. These
tasks were written down and given together with the informed consent document at
the start of the video conference. The observer manually tracked the time spent
per task with a stopwatch, stopping it when the participant explicitly commented
that the task had been completed. Each task below is set out as the
instruction followed by the criterion for when the task is considered completed by
the participant. The tasks were sequenced and numbered as follows:</p>
          <p>1. Submit a query for a specific patient's flare. The task will be complete
when an environmental data table is displayed in the LinkData tab.</p>
          <p>2. Explain the meaning of each column in the environmental table.
The task will be complete when the participant has hovered over the column
headings of the main table and read the description of the environmental
variables.</p>
          <p>3. Try different aggregation approaches. The task will be complete after
the participant has explored the different spatial aggregations available in
the Query section and the main table reacts/changes accordingly.</p>
          <p>4. Compare variables for the same flare in the standard data tab. The
participant will have finished this task after selecting the Std.Data tab.</p>
          <p>5. Compare environmental data from different flares in the
comparison tab. The participant will have finished this task when they have
successfully compared two flares in the Comp.Data tab.</p>
          <p>6. Visualize Link Data variables in the visualization tab. The task will
be completed when the participant successfully visualizes the environmental
data prior to download in the Vis.Data tab.</p>
          <p>7. Download useful raw data for the researcher's needs. The
participant will have finished this task after downloading the data from either the
LinkData or Std.Data tabs.</p>
          <p>After the completion of the tasks, the participants were asked to complete a
Post-Study System Usability Questionnaire (PSSUQ) (described further in the
next section) to evaluate the user experience with a quantitative metric.</p>
          <p>The methods described above include a CTA protocol, successful completion
of the tasks and time on task to support the PSSUQ standard questionnaire.
The CTA protocol provides feedback to understand the effective task completion
and time on task in a meaningful way. Together, these methods combine quantitative
and qualitative metrics to evaluate the usability of the SCEED tool.</p>
        </sec>
        <sec id="sec-5-1-2">
          <title>Results</title>
          <p>Quantitative results. Time on task. The box plots in Fig. 2 compare
participants' times spent on task; all the participants completed every task successfully. The
spread of the time values per task, given by the length of the boxes (IQR), is below 1 min
for the simplest tasks of submitting a query and downloading the data (T1 and
T7); between 1 and 2 min for the tasks of selecting different query parameters and
tabs (T3, T4 and T6); and around 3 min for the more complex tasks of explaining
the meaning of the data and comparing it (T2 and T5). Furthermore, the median
follows a similar pattern to the spread, with most of the tasks below 3.5 minutes,
increased by 1 min for T2 and doubled for the most complex task of
comparing patients' flares (T5). The box plots in Fig. 2 also identified 3 outliers and
proved to be suitable for studying the patterns of this data with a sample size of 7.
</p>
          <p>[Fig. 2: box plots of the time spent on each task; x-axis: Tasks (T1 to T7).]</p>
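          <p>The quantities read from such box plots (median, IQR and outliers) can be computed as in the following sketch. It assumes the common convention that a point is an outlier when it lies more than 1.5 times the IQR beyond the quartiles, and quartiles obtained by linear interpolation; the paper does not state which convention the plots use, and the times below are made-up values rather than the study data.</p>
          <preformat preformat-type="code"><![CDATA[
```python
from statistics import quantiles


def box_plot_summary(times):
    """Quartiles, IQR and outliers under the 1.5 x IQR convention."""
    # The "inclusive" method interpolates linearly between data points.
    q1, median, q3 = quantiles(times, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [t for t in times if t < lower or t > upper]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}


if __name__ == "__main__":
    # Made-up completion times in minutes for seven participants.
    task_times = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 12.0]
    print(box_plot_summary(task_times))
```
]]></preformat>
          <p>With only seven values per task, a single slow participant can fall outside the fences, which is why the outliers are better interpreted alongside the think-aloud notes than in isolation.</p>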
          <p>
            PSSUQ: The Post-Study System Usability Questionnaire. The PSSUQ is a
general questionnaire with 19 questions, meant to assess the evolution of usability during the
development of a system [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ]; the second version of the questionnaire
was used in this study. The PSSUQ follows a 7-point Likert scale and assesses
four different metrics: system usefulness (SysUse), information quality
(InfoQual), interface quality (IntQual) and overall, averaged from questions 1-8, 9-15, 16-18 and
1-19, respectively. The questionnaire results and aggregations per group for the
SCEED dashboard are presented as box plots in Fig. 3. This visualization allows
us to compare the distributions without any assumptions, which is again adequate for
our sample size. Most of the PSSUQ scores have a median of 2 and a spread
between 1 and 1.5 points, reduced to ≤ 1 for the four averaged metrics with larger
sample sizes. Moreover, Q7 and Q16, which indicate the ease of learning and
the pleasantness of the interface, got the best scores. However, Q12 and Q18, regarding
how easily information is found and whether the needed functions are available, have an increased
median of 3 (on this scale, higher is worse), and Q9 has a spread of 3.5 points,
indicating a diversity of opinions on the error messages. The identified outliers
for the individual questions are from 2 participants, who were not satisfied with
the system use and features available. Furthermore, some participants provided
qualitative comments through the PSSUQ open comment section that are coherent with
the previous results.
          </p>
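          <p>The aggregation of the 19 items into the four PSSUQ metrics can be sketched as below, using the item ranges given above. Averaging over the answered items only when an item is left blank is an assumption of this example, as are the function name pssuq_scores and the sample responses.</p>
          <preformat preformat-type="code"><![CDATA[
```python
# Item ranges for the four metrics (items are numbered from 1 to 19).
SCALES = {
    "SysUse": range(1, 9),      # questions 1-8
    "InfoQual": range(9, 16),   # questions 9-15
    "IntQual": range(16, 19),   # questions 16-18
    "Overall": range(1, 20),    # questions 1-19
}


def pssuq_scores(responses):
    """Average one participant's 7-point ratings per metric.

    `responses` maps item number to a rating from 1 to 7, or None when the
    item was not answered. Lower scores indicate higher satisfaction.
    """
    scores = {}
    for name, items in SCALES.items():
        answered = [responses[i] for i in items if responses.get(i) is not None]
        scores[name] = sum(answered) / len(answered) if answered else None
    return scores


if __name__ == "__main__":
    # Made-up responses: every item rated 2, except Q9 left blank.
    participant = {i: 2 for i in range(1, 20)}
    participant[9] = None
    print(pssuq_scores(participant))
```
]]></preformat>
          <p>Computing the four metrics per participant first, and only then plotting their distribution, keeps the aggregated box plots comparable with the per-question ones.</p>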
          <p>[Fig. 3: box plots of the PSSUQ scores; x-axis: PSSUQ Item/Scale (Q1 to Q19, SysUse, InfoQual, IntQual, Overall); y-axis: Score.]</p>
          <p>Qualitative results. The experiment also followed a concurrent think-aloud
protocol providing the study with qualitative data, analysed by means of Grounded
Theory. The observer/note-taker coded and categorized the annotations
manually, using the note-taker's criteria alone. The categories of the annotations
were recurrent and natural, commenting upon the important features of the
dashboard. All the participants discussed the usefulness and understanding of
the patient IDs exploration, the Z-scores, the comparative plot visualization
and the downloading approach of patient flare linked environmental data in the
dashboard. However, the rest of the features, including the date selection (days
before flare), the spatial aggregation feature, hovering for the variable
information display and the usefulness of the Vis tab, presented a variety of patterns.
Participants expressed that the linked environmental data would be more useful
if provided as a time series for a specific period instead of only dates before
the flare. Moreover, clinician researchers wanted the possibility to explore and
download all data in a summarized way.</p>
          <p>The emerging themes from the Grounded Theory analysis were (1) accessing
flare-related environmental data, (2) associating multiple patients and (3)
exploring longitudinal data. These themes acknowledged the perception associated with
the complex topic of comprehending linked environmental data to support rare
disease research (Requirement 2).</p>
          <p>The SCEED dashboard was developed to support clinician researchers
in exploring clinical data linked with environmental data by querying the data,
viewing these datasets as tables and visualizations for comprehension and
downloading the data for their models, without previous knowledge of Semantic
Web technologies. We conducted a user experiment to evaluate the usability of the SCEED
dashboard through the completion of 7 tasks using standard methodologies: time on
task, CTA and PSSUQ. These tasks were selected to
assess the three core requirements.</p>
          <p>First, all the participants were successful in completing the tasks, displaying
a similar pattern (Fig. 2). This pattern suggests that the time spent to complete
each task increases with the complexity of the tasks. This complexity directly
relates to the difficulty of fulfilling the requirements by the clinician researchers,
supporting the achievement of Requirements 1 and 3.</p>
          <p>
            Second, the analysis of the PSSUQ responses leads to a better understanding
of the SCEED dashboard specific features. The PSSUQ aggregated results in
Fig. 3 show the known consistent pattern of poor ratings for InfoQual relative
to IntQual and for Q9 [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ], supporting the robustness of the questionnaire with
fewer than 15 participants [
            <xref ref-type="bibr" rid="ref20">20</xref>
            ]. Moreover, these aggregated results are lower than
the norm defined for the PSSUQ version 2 [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ] (the lower the value, the higher
the satisfaction), and provide a reference for the next versions of the dashboard.
The open comments of this questionnaire indicated that the system was easy to
use and had good features, while requiring additional ones, as reflected by the Q18
score (see Fig. 3), to fulfil Requirement 2.
          </p>
          <p>Third, the CTA allowed us to understand participants' thoughts as they
occurred while completing the tasks. The categorization of the think-aloud
statements made clear that the dashboard needs improvements in a number of areas,
which will be addressed in the next versions. Furthermore, the emerging themes
of the Grounded Theory analysis endorse the previous statements regarding the
fulfilment of Requirements 1 and 3, and acknowledge alternatives for how
Requirement 2 could be addressed in the following versions. These next versions will be
updated with a more focused multiple-patient approach over a selected date range
to improve the environmental data exploration.</p>
          <p>When the results of the various evaluation methods described above are
examined together, further insights emerge. A number of
examples of these insights are worth presenting. The emotional responses noted
during the CTA provide an explanation for the three outliers in Fig. 2: these
participants were curious and wanted to explore all the functionalities during
the tasks. On the other hand, the dispersed values for task 2 can be explained by the
inefficient formulation of the task, since some participants discovered the hover
functionality while performing task 1, resulting in quicker times.</p>
          <p>Having selected only clinician researchers provides more relevant results than
enlarging the sample size for this first version. However, this poses a
challenge when making statements about the quantitative results, which is why we
combine different metrics in the evaluation. Another limitation of this work is
the manual evaluation of the qualitative data with a limited number of
participants, which limits the depth of the qualitative results. These limitations will be
addressed in later evaluations with a more comprehensive and automatic
design, and by increasing the sample size with the involvement of more real end users
with different interests, for a wider acceptance of the prototype.</p>
          <p>However, the evaluation conducted on this paper, which combines different
standard metrics, could be beneficial for assessing other tools with low sample
sizes. Finally, the results on this initial evaluation hold promise for producing
an interface that will engage clinician researchers directly with Linked Data.
</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusions and Future Work</title>
      <p>Based on its design and the results of the evaluation, the
SCEED dashboard is an adequate initial design to fulfil the clinician researchers'
requirements: querying the clinical data of a specific patient, retrieving the
environmental data linked to that patient's vasculitis flare data from the triplestore, and
downloading meaningful data to be used as input for statistical models.
However, new features are necessary to help users understand the uses and limitations of
the environmental data for rare disease flare discovery.</p>
      <p>Combining the time per task, the CTA protocol and the PSSUQ
provided enough data to assess the usability of the dashboard, highlighting its
successful aspects and identifying the items that need to be improved and the new
features to be added. In the future, this methodology will be used as a baseline
to track the evolution of the dashboard.</p>
      <p>Acknowledgements. This research was conducted with the financial
support of HELICAL as part of the European Union’s Horizon 2020 research and
innovation programme under the Marie Skłodowska-Curie Grant Agreement No.
813545 at the ADAPT SFI Research Centre at Trinity College Dublin.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Battle</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kolas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Enabling the geospatial Semantic Web with Parliament and GeoSPARQL</article-title>
          .
          <source>Semantic Web</source>
          <volume>3</volume>
          (
          <issue>4</issue>
          ),
          <fpage>355</fpage>
          -
          <lpage>370</lpage>
          (
          <year>2012</year>
          ). https://doi.org/10.3233/SW-2012-0065
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Boren</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramey</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Thinking aloud: Reconciling theory and practice</article-title>
          .
          <source>IEEE Transactions on Professional Communication</source>
          <volume>43</volume>
          (
          <issue>3</issue>
          ),
          <fpage>261</fpage>
          -
          <lpage>278</lpage>
          (
          <year>Sep 2000</year>
          ). https://doi.org/10.1109/47.867942
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Brașoveanu</surname>
            ,
            <given-names>A.M.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sabou</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scharl</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hubmann-Haidvogel</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fischl</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Visualizing statistical linked knowledge for decision support</article-title>
          .
          <source>Semantic Web</source>
          <volume>8</volume>
          (
          <issue>1</issue>
          ),
          <fpage>113</fpage>
          -
          <lpage>137</lpage>
          (
          <year>Jan 2017</year>
          ). https://doi.org/10.3233/SW-160225
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Chavira</surname>
            ,
            <given-names>L.A.G.</given-names>
          </string-name>
          :
          <article-title>The Earth Life and Semantic Web Project Experiment GUI</article-title>
          .
          <source>Tech. rep.</source>
          , U.S. Department of Health &amp; Human Services
          (
          <year>Jul 2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Dadzie</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rowe</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Approaches to visualising Linked Data: A survey</article-title>
          .
          <source>Semantic Web</source>
          <volume>2</volume>
          (
          <issue>2</issue>
          ),
          <fpage>89</fpage>
          -
          <lpage>124</lpage>
          (
          <year>Jan 2011</year>
          ). https://doi.org/10.3233/SW-2011-0037
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Dafli</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antoniou</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ioannidis</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dombros</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Topps</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bamidis</surname>
            ,
            <given-names>P.D.</given-names>
          </string-name>
          :
          <article-title>Virtual Patients on the Semantic Web: A Proof-of-Application Study</article-title>
          .
          <source>Journal of Medical Internet Research</source>
          <volume>17</volume>
          (
          <issue>1</issue>
          ),
          <elocation-id>e16</elocation-id>
          (
          <year>2015</year>
          ). https://doi.org/10.2196/jmir.3933
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sundara</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Atkinson</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>R2RML: RDB to RDF Mapping Language</article-title>
          . https://www.w3.org/TR/r2rml/ (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8. Copernicus Climate Change Service (C3S):
          <article-title>ERA5: Fifth Generation of ECMWF Atmospheric Reanalyses of the Global Climate</article-title>
          .
          <source>Copernicus Climate Change Service Climate Data Store (CDS)</source>
          . ECMWF. https://cds.climate.copernicus.eu/cdsapp#!/home
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Hasnain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kamdar</surname>
            ,
            <given-names>M.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hasapis</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zeginis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Warren</surname>
            ,
            <given-names>C.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deus</surname>
            ,
            <given-names>H.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ntalaperas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tarabanis</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehdi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Decker</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Linked Biomedical Dataspace: Lessons Learned Integrating Data for Drug Discovery</article-title>
          . In:
          <source>International Semantic Web Conference 2014</source>
          . pp.
          <fpage>114</fpage>
          -
          <lpage>130</lpage>
          . Lecture Notes in Computer Science, Springer International Publishing (
          <year>2014</year>
          ). https://doi.org/10.1007/978-3-319-11964-9_8
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Khan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Qualitative Research Method: Grounded Theory</article-title>
          .
          <source>International Journal of Business and Management</source>
          <volume>9</volume>
          (
          <year>Oct 2014</year>
          ). https://doi.org/10.5539/ijbm.v9n11p224
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Koopman</surname>
            ,
            <given-names>R.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kochendorfer</surname>
            ,
            <given-names>K.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moore</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehr</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wakefield</surname>
            ,
            <given-names>D.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yadamsuren</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Coberly</surname>
            ,
            <given-names>J.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kruse</surname>
            ,
            <given-names>R.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wakefield</surname>
            ,
            <given-names>B.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Belden</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          :
          <article-title>A Diabetes Dashboard and Physician Efficiency and Accuracy in Accessing Data Needed for High-Quality Diabetes Care</article-title>
          .
          <source>The Annals of Family Medicine</source>
          <volume>9</volume>
          (
          <issue>5</issue>
          ),
          <fpage>398</fpage>
          -
          <lpage>405</lpage>
          (
          <year>Sep 2011</year>
          ). https://doi.org/10.1370/afm.1286
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Lassila</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Swick</surname>
            ,
            <given-names>R.R.</given-names>
          </string-name>
          :
          <article-title>Resource Description Framework (RDF) Model and Syntax Specification</article-title>
          . https://www.w3.org/TR/1999/REC-rdf-syntax-19990222/ (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Psychometric Evaluation of the PSSUQ Using Data from Five Years of Usability Studies</article-title>
          .
          <source>Int. J. Hum. Comput. Interaction</source>
          <volume>14</volume>
          ,
          <fpage>463</fpage>
          -
          <lpage>488</lpage>
          (
          <year>Sep 2002</year>
          ). https://doi.org/10.1080/10447318.2002.9669130
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          :
          <article-title>IBM computer usability satisfaction questionnaires: Psychometric evaluation and instructions for use</article-title>
          .
          <source>International Journal of Human-Computer Interaction</source>
          <volume>7</volume>
          (
          <issue>1</issue>
          ),
          <fpage>57</fpage>
          -
          <lpage>78</lpage>
          (
          <year>Jan 1995</year>
          ). https://doi.org/10.1080/10447319509526110
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Macefield</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>How To Specify the Participant Group Size for Usability Studies: A Practitioner's Guide</article-title>
          .
          <source>Journal of Usability Studies</source>
          <volume>5</volume>
          (
          <issue>1</issue>
          ),
          12
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>McKenna</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Staheli</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
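          <string-name>
            <surname>Fulcher</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meyer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :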
          <article-title>BubbleNet: A Cyber Security Dashboard for Visualizing Patterns</article-title>
          .
          <source>Computer Graphics Forum</source>
          <volume>35</volume>
          (
          <issue>3</issue>
          ),
          <fpage>281</fpage>
          -
          <lpage>290</lpage>
          (
          <year>2016</year>
          ). https://doi.org/10.1111/cgf.12904
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Reddy</surname>
            ,
            <given-names>B.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Houlding</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hederman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Canney</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Debruyne</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Brien</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meehan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Sullivan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Little</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          :
          <article-title>Data linkage in medical science using the resource description framework: The AVERT model</article-title>
          .
          <source>HRB Open Research</source>
          <volume>1</volume>
          ,
          <issue>20</issue>
          (Mar
          <year>2019</year>
          ). https://doi.org/10.12688/hrbopenres.12851.2
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Reynolds</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cyganiak</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>The RDF Data Cube Vocabulary</article-title>
          . https://www.w3.org/TR/vocab-data-cube/ (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Sabol</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tschinkel</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veas</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoefler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mutlu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Granitzer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Discovery and Visual Analysis of Linked Data for Humans</article-title>
          .
          <source>The Semantic Web – ISWC 2014, Lecture Notes in Computer Science</source>
          <volume>8796</volume>
          ,
          <fpage>309</fpage>
          -
          <lpage>324</lpage>
          (Oct
          <year>2014</year>
          ). https://doi.org/10.13140/2.1.3744.2566
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Sauro</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          :
          <article-title>Quantifying the User Experience: Practical Statistics for User Research</article-title>
          . Elsevier, Cambridge, 2nd edn. (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Scharl</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Herring</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rafelsberger</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hubmann-Haidvogel</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kamolov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fischl</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Föls</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weichselbraun</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Semantic Systems and Visual Tools to Support Environmental Communication</article-title>
          .
          <source>IEEE Systems Journal</source>
          <volume>11</volume>
          (
          <issue>2</issue>
          ),
          <fpage>762</fpage>
          -
          <lpage>771</lpage>
          (
          <year>Jun 2017</year>
          ). https://doi.org/10.1109/JSYST.2015.2466439
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Tie</surname>
            ,
            <given-names>Y.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Birks</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Francis</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Grounded theory research: A design framework for novice researchers</article-title>
          .
          <source>SAGE Open Medicine</source>
          (Jan
          <year>2019</year>
          ). https://doi.org/10.1177/2050312118822927
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Villanueva-Rosales</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chavira</surname>
            ,
            <given-names>L.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>del Rio</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pennington</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>eScience through the Integration of Data and Models: A Biodiversity Scenario</article-title>
          . In:
          <source>2015 IEEE 11th International Conference on E-Science</source>
          . pp.
          <fpage>171</fpage>
          -
          <lpage>176</lpage>
          (
          <year>Aug 2015</year>
          ). https://doi.org/10.1109/eScience.2015.77
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Zainab</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saleem</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehmood</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zehra</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Decker</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hasnain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>FedViz: A Visual Interface for SPARQL Queries Formulation and Execution</article-title>
          . In:
          <source>VOILA@ISWC2015</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolfgruber</surname>
            ,
            <given-names>T.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tasato</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arisdakessian</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garmire</surname>
            ,
            <given-names>D.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garmire</surname>
            ,
            <given-names>L.X.</given-names>
          </string-name>
          :
          <article-title>Granatum: A graphical single-cell RNA-Seq analysis pipeline for genomics scientists</article-title>
          .
          <source>Genome Medicine</source>
          <volume>9</volume>
          (
          <issue>1</issue>
          ),
          <elocation-id>108</elocation-id>
          (Dec
          <year>2017</year>
          ). https://doi.org/10.1186/s13073-017-0492-3
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>