=Paper=
{{Paper
|id=Vol-1658/paper6
|storemode=property
|title=Probing Interactivity in Open Data for General Practice. An Evidence-Based Approach
|pdfUrl=https://ceur-ws.org/Vol-1658/paper6.pdf
|volume=Vol-1658
|authors=Federico Cabitza,Francesco Del Zotti,Angela Locoro
|dblpUrl=https://dblp.org/rec/conf/avi/CabitzaZL16
}}
==Probing Interactivity in Open Data for General Practice. An Evidence-Based Approach==
<pdf width="1500px">https://ceur-ws.org/Vol-1658/paper6.pdf</pdf>
<pre>
    Probing interactivity in open data for General
       Practice. An evidence-based approach

      Federico Cabitza (1,2), Francesco Del Zotti (3), and Angela Locoro (2)

             (1) IRCCS Istituto Ortopedico Galeazzi, Milano, Italy
             (2) Università degli Studi di Milano-Bicocca, Milano, Italy
                    {cabitza,angela.locoro}@disco.unimib.it
             (3) Centro Studi FIMMG Verona and NetAudit, Verona, Italy
                                 delzotti@libero.it


        Abstract. We undertook a user study to evaluate whether the perceived
        utility of some common open datasets for family doctors would increase
        if they were rendered as interactive heat maps. We also investigated
        whether Parallel Coordinates (PCs) are perceived as a convenient
        diagram to summarize data on multiple patients, and whether making PCs
        interactive would increase their informativity. We surveyed 29 expert
        family doctors through a questionnaire and found that interactive maps
        make health datasets be perceived as more useful, especially with regard
        to registers on exposure to carcinogens. PCs were found to be
        informative visualizations, and making them interactive increases their
        perceived informativity. In light of this evidence, designing for better
        interactivity is worthy of further efforts in Human-Data Interaction
        research.

        Keywords: evidence-based design, open data, general practice, data
        visualization


1     Introduction
An evidence-based approach to IT design [1] (see footnote 4) means backing up
any claim of utility or usability of a particular solution or tool with
empirical studies that collect data from real users (including the perceptions
of domain experts) and apply standard statistical tests and group comparisons
[11]. This approach is typical of the Human-Computer Interaction field, and it
has also recently been proposed for studies in the cognate research area known
as Human-Data Interaction (HDI) [5]. HDI concerns the design and study of
innovative means by which users can retrieve and explore large amounts of data,
and of the optimal ways in which these data can be conveniently represented and
conveyed to users so as to make them valuable and informative [6].
4   Of course, the parallel with the approach to medical practice known as
    “evidence-based medicine” is not coincidental at all. Indeed, it constitutes a
    sort of natural extension of that methodological attitude to the IT-based tools
    that can affect medical decisions and care quality much as the knowledge
    retained by doctors and nurses about clinical conditions and treatments does.


    As the name suggests, HDI lies at the intersection of the fields of
human-computer interaction, data and information visualization, and cognitive
ergonomics. Other fields, which focus on very large databases, data warehouses
and cooperative information systems, are concerned with the collection of
increasingly complex and vast datasets and with the improvement and management
of their information quality. HDI studies focus instead on how to improve the
user experience of the information services that these datasets enable, and on
the assessment of dimensions of this experience such as informativity,
intuitiveness, clarity, interactivity, and confidence in decision making [4].
    In this paper we focus on health datasets and on how these can positively
inform the general practice of family doctors as consumers of those datasets.
In particular, we report the results of an exploratory user study aimed at
evaluating whether the perceived utility of some common open datasets and
visualization tools would increase if the data consumers were allowed to
interact with the data and explore them. This study is related to similar
initiatives aimed at evaluating the value of data visualization tools in the
healthcare sector. To our knowledge, the most relevant of the recent
initiatives is the “Visualizing Health” project (see footnote 5), developed by
the University of Michigan and the Robert Wood Johnson Foundation. In this
project, which is subtitled “a scientifically vetted style guide for
communicating health data”, the researchers compared tens of different styles
for communicating risk-related information in order to select the best ones for
the general public according to a series of empirical tests involving real
users. The main finding, which we also support, is that when it comes to
presenting health information there is no single “best” graphic, as this
depends on the intended goal and, in a subtler way, on what matters most to the
intended users of the diagrams and infographics [4]. This calls for an
evidence-based approach to this kind of evaluation, which must be tailored to
specific groups of people (e.g., practitioners, patients) and contexts of use
[13].
    In this paper we focus on the context of use of health open data, from the
perspective of health practitioners. Nowadays, an increasing amount of data is
collected by health agencies and hospital institutions, often with the
necessary collaboration of General Practitioners (GPs). Building health
datasets, keeping them up to date and making them open, freely accessible and,
above all, usable in everyday medical practice (i.e., the primary HDI concern)
requires skills and resources, in particular the time of doctors and the money
of citizens and taxpayers [8]. For this reason, there is a need for further
studies that involve potential users and data consumers in the early
identification of useful datasets, in prioritizing their digitization and
publication on open data platforms, and in exploring novel means that could
improve the end-user experience. Our study is a preliminary yet evidence-based
[11] contribution to this strand of research.


5   http://www.vizhealth.org/


2     Method
This study addresses two related research questions on open data in healthcare.
First: which health open data are perceived as more useful in general practice
(if any)? Second: would visualizing these open data by means of interactive
(that is, browsable, zoomable, etc.) maps make these data be perceived as more
useful and usable? In addition, we also address the question of whether
diagrams displaying parallel coordinates [9] (see Figure 2) are an effective
means to represent patient data at various levels of description and to help
GPs understand population trends, discover (frequency-based) normal and
abnormal conditions, and detect correlations between multiple conditions.


Fig. 1. The static map (on the left, a); the heat map (on the right, b) resulting from
interacting with the map depicted on the left.


    To these aims, the authors conceived a short questionnaire to be administered
as a Computer-Aided Web Self-Interview (CAWI). This questionnaire consisted
of four sections, each of which was to be rendered on a different Web page. On the
first page, the platform was to present a list of 13 datasets that the authors
had previously selected among the health open datasets made available on the
DatiOpen.it platform (see footnote 6), which is one of the biggest Italian open
data portals, collecting more than 2,000 datasets. In particular, the respondents were
asked to choose at least one dataset (and at most three) that they would want
to consult in an online and customized “visual dashboard” supporting their own
professional practice.
    On the second page we further selected 6 datasets for their suitability to be
mapped onto a georeferenced map like the one depicted in Figure 1.a, and asked
the doctors to evaluate the degree of utility for their profession of each dataset
on a six-value ordinal scale, from 1 (not useful at all) to 6 (very useful).
6   http://www.datiopen.it/en


    Then, we asked the doctors to imagine being able to interact with the map,
so that they could browse the map by dragging, and zoom in and out to the
desired level of visualization (e.g., province, city, neighborhood if possible)
and get a heat map like the one depicted in Figure 1.b. We then asked whether
visualizing the above datasets on such a browsable map would make the data
more useful or not (on a 5-value scale ranging from 1, “the data would be much
less useful”, to 5, “the data would be much more useful”, passing through a
middle value of 3, “it wouldn’t make almost any difference”).
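
    The zoom levels mentioned above (province, city, and so on) can be thought
of as re-aggregating the same records at a different administrative level
before colouring the map. The following is only a minimal sketch of that idea:
the record fields, the place names and the counts are hypothetical and are not
taken from the DatiOpen.it datasets or from the questionnaire.

```python
from collections import defaultdict

# Hypothetical records: one count located at three administrative levels.
RECORDS = [
    {"region": "Veneto", "province": "Verona", "city": "Verona", "cases": 12},
    {"region": "Veneto", "province": "Verona", "city": "Legnago", "cases": 3},
    {"region": "Veneto", "province": "Padova", "city": "Padova", "cases": 7},
    {"region": "Lombardia", "province": "Milano", "city": "Milano", "cases": 20},
]


def aggregate(records, level, value="cases"):
    """Sum the chosen value over every area of the requested zoom level."""
    totals = defaultdict(int)
    for row in records:
        totals[row[level]] += row[value]
    return dict(totals)


def intensity(totals):
    """Scale totals to the 0..1 range so they can drive a colour ramp."""
    peak = max(totals.values(), default=0)
    return {area: (count / peak if peak else 0.0) for area, count in totals.items()}


# Zooming out shows regions; zooming in shows provinces of the same data.
print(intensity(aggregate(RECORDS, "region")))
print(intensity(aggregate(RECORDS, "province")))
```

    Keeping aggregation separate from colour scaling means that a change of zoom
level only changes the `level` argument, while the rendering step stays the same.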
    In the third section, the questionnaire displayed a parallel coordinates
diagram like the one depicted in Figure 2. We asked the doctors to indicate, on
a six-value ordinal scale, the degree of informativity of such a way of
representing patient data (from 1, not informative at all, to 6, very
informative).
    Then we asked them to imagine being able to interact with the diagram, so as:
1) to select one specific line (that is, a patient) and see that line highlighted
while the other lines fade slightly into the background; 2) to draw small
rectangles over one or more vertical axes, in order to have the system highlight
only the patient-lines intersecting the axes within those data ranges (see
Figure 3). Similarly to the case of dataset utility, we asked whether the
interactive diagram would be more or less informative than its static version.
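
    The second interaction (drawing rectangles over axes) amounts to keeping
only the patient-lines that fall inside every selected range. A minimal sketch
of that filter follows. The patient records, the axis names and the helper
names `inside` and `brushed` are hypothetical, introduced only for
illustration; the questionnaire showed static pictures, not a working tool.

```python
# Hypothetical patient records, one dictionary per line of the diagram.
PATIENTS = [
    {"id": 1, "sex": "F", "team": "A", "age": 64},
    {"id": 2, "sex": "M", "team": "A", "age": 58},
    {"id": 3, "sex": "F", "team": "C", "age": 66},
    {"id": 4, "sex": "F", "team": "B", "age": 81},
    {"id": 5, "sex": "F", "team": "B", "age": 52},
]


def inside(value, brush):
    """A brush is a (low, high) tuple on numeric axes, a set on categorical ones."""
    if isinstance(brush, tuple):
        low, high = brush
        return low <= value <= high
    return value in brush


def brushed(patients, brushes):
    """Return the patients whose line crosses every brushed axis within its range."""
    return [
        patient
        for patient in patients
        if all(inside(patient[axis], brush) for axis, brush in brushes.items())
    ]


# A selection in the spirit of Figure 3: female patients, two teams, age 50-70.
selection = brushed(PATIENTS, {"sex": {"F"}, "team": {"A", "B"}, "age": (50, 70)})
print([patient["id"] for patient in selection])
```

    Patients outside the selection are not discarded by the diagram; they would
simply be drawn faded, so the function only decides which lines to highlight.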
    Finally, the last page of the questionnaire contained two profile-related ques-
tions: one regarding the work experience of the respondent; the other one re-
garding the perceived familiarity with IT.
    We invited the potential respondents by sending an e-mail to the family
doctors subscribed to the NetAudit mailing list (see footnote 7); this is a list
of approximately 120 Italian GPs who are interested in medical audit and
research initiatives regarding the assessment of performance in general practice
and the continuous improvement of the related medical quality and outcomes.

7   http://www.netaudit.org/


Fig. 2. Parallel coordinates to visualize patient data on multiple dimensions and see
both population-level tendencies and inter-dimensional correlations (inspired by [7]).


Fig. 3. The same diagram depicted in Figure 2 after three specific ranges of data
have been specified to filter the patient data to be highlighted: more precisely,
female patients treated by two specific medical teams (equipe) and between 50 and
70 years of age.


3     Results
The questionnaire was left open for two weeks after the invitation had been
sent to potential respondents. During this period no reminder was sent. When
we closed the survey, 29 GPs had accessed the questionnaire; 23 of them had
completely filled in every item of the questionnaire in an average time of 5.9
minutes (SD=2.6 minutes). In this respondent sample, 52% had been family
doctors for more than 30 years and 91% for more than 20 years; no respondent had
less than 10 years of work experience.
    The majority of the respondents (65%) claimed to be competent in Information
Technology (IT), but neither experts nor enthusiasts. Experts and enthusiasts
were, respectively, approximately one third and one fifth of the sample. We
detected a statistically significant and moderately negative correlation between
age and IT expertise (-.49, p=.018), as could be expected.
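
    The text does not say which correlation coefficient was computed. Since
both profile items are ordinal, one plausible choice is a rank correlation; the
sketch below assumes Spearman's coefficient, computed as Pearson's correlation
on average ranks. The function names and the toy data are hypothetical, and no
p-value is computed here.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the tie
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation (assumed here, not confirmed by the paper)."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical answers: experience band (1-4) and self-rated IT expertise (1-3).
experience = [4, 4, 3, 3, 2, 4, 3, 2]
expertise = [1, 2, 2, 3, 3, 1, 2, 3]
print(round(spearman(experience, expertise), 2))
```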
    As described in the previous section, respondents could choose up to three
datasets that they considered most useful to visualize on a personal dashboard
for their practice, out of a list of 13 datasets. The datasets chosen most
frequently by the respondents were:

 1. Regional distribution of the percentage of elderly people (see footnote 8)
    involved in the program of Integrated Home Care (Assistenza Domiciliare
    Integrata).
 2. Number (and details) of the Rehabilitation facilities for the elderly (Istituti
    di Riabilitazione Extraospedaliera per Anziani).
 3. Inpatient care average length of stay.

   Each of these datasets was selected by approximately one third of the sample
(38%, 33% and 30%, respectively). The other datasets collected far fewer
preferences; “number of cases affected by pneumococcal disease by year” was not
chosen by any respondent.
8   That is, people older than 65 years.


   With regard to the utility of datasets to be displayed on a georeferenced map,
we found statistical significance for two datasets only, both of which were
considered useful for general practice, namely:

 1. Number of registers on exposure to carcinogens (.05 vs. .95, p<.001).
 2. Hospitalization rates, by regime, patient gender and region (.26 vs. .74, p=.035).

    For the other datasets we did not find statistical significance; however, the
“number of medical prescriptions delivered” can be considered the least
appreciated dataset (.64 vs. .36).
    We also asked whether putting all of the above datasets into an interactive
map, instead of a static one, would change the perceived utility of the datasets
themselves. The majority claimed that mapping the data interactively would make
a difference, and that the datasets would thereby be more useful (.83 vs. .17,
p=.003; one third of the GPs even claimed that the data “would be much more
useful”). The ranking of the relative utility of mapping open datasets is
reported in Table 1.


    Table 1. Open datasets to be visualized in maps ranked by perceived utility.

       Dataset                             Priority level   Perceived utility   Median
                                           (sig.)           (sig.)

       Carcinogen exposure registers       Higher (*)       Positive (***)      5
       National Drug Consumption           Uncertain (NS)   Positive (NS)       4
       Home care patients                  Uncertain (NS)   Positive (NS)       4
       Hospitalization rate                Uncertain (NS)   Positive (*)        4
       Incidence of invasive diseases      Uncertain (NS)   Positive (NS)       4
       Incidence of occupational diseases  Uncertain (NS)   Positive (NS)       4
       Number of prescriptions             Uncertain (NS)   Negative (NS)       3


    With regard to the last research question addressed in this short study,
parallel coordinates diagrams (see Figures 2 and 3) were found to be an
informative visualization to render patient data on multiple dimensions. The
responses were positive for the majority (.64 vs. .36), although this difference
was not statistically significant (p=.286). Furthermore, almost every GP involved
in the user study claimed that making such a diagram interactive would make it
even more informative (.96 vs. .04, p<.001; 60% said “much more informative”).
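
    The paper reports pairs of proportions with p-values (for example .83 vs.
.17, p=.003) without naming the test or giving raw counts. One reading that is
consistent with those figures is an exact two-sided binomial (sign) test on the
dichotomized answers against an even split. The sketch below assumes that
reading; the function name and all counts are assumptions chosen to match the
reported proportions, not data from the study.

```python
from math import comb


def sign_test_p(positive: int, negative: int) -> float:
    """Two-sided exact binomial test against an even 50/50 split."""
    n = positive + negative
    if n == 0:
        raise ValueError("at least one response is required")
    larger = max(positive, negative)
    # Probability of a split at least this lopsided in one direction ...
    tail = sum(comb(n, k) for k in range(larger, n + 1)) / 2 ** n
    # ... doubled for the two-sided test (the null distribution is symmetric).
    return min(1.0, 2 * tail)


# Assumed (positive, negative) counts; NOT reported in the paper.
ASSUMED_SPLITS = {
    "interactive map more useful": (19, 4),
    "hospitalization rates useful": (17, 6),
    "static parallel coordinates informative": (14, 8),
}

for label, (pos, neg) in ASSUMED_SPLITS.items():
    share = pos / (pos + neg)
    print(f"{label}: {share:.2f} positive, p = {sign_test_p(pos, neg):.3f}")
```

    Under these assumed counts the function returns values that round to .003,
.035 and .286. This agreement is suggestive only; it does not confirm which
test the authors actually ran.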


4   Discussion and Conclusions
The characteristics of the respondent sample, in terms of work experience and
attitude towards Health IT, suggest that the respondents can be considered
representative of a population of real domain experts and that the potential
(positive) bias introduced by the IT enthusiasts should be low. Besides the
results reported above, the questionnaire also allowed the respondents to leave
free-text comments in order to suggest datasets and applications to be
considered in further studies, give advice on potential improvements to the
visual tools presented, and more generally share remarks with the research team.
    Through these items, three respondents suggested performing a similar
usability study on one of the most widely adopted GP governance tools in Italy,
i.e., MilleGPG (see footnote 9). They also noted an almost total lack of this
kind of study, with a few exceptions (e.g., [3]), both for a class of
applications that are used (or must be used) by thousands of GPs in Italy and for
applications that can foster collaboration between healthcare operators by means
of free-text comments, tag labelling and semantically aware classifications
[10], for example for transforming tabular data into valuable visualizations.
    Moreover, we report two comments that suggested relevant improvements to
the Parallel Coordinates (PC) dashboard. One GP suggested implementing PCs that
explicitly indicate central tendency and dispersion parameters on their axes
(wherever applicable), such as means and medians, standard deviations and
interquartile ranges, both at the level of the single GP’s patient set and (if
available) at the regional and national level. This doctor noted how the bundle
of lines, by virtue of its variable thickness at the intersection with the
different coordinates, would nicely and visually represent the frequency
distribution of the values within the subject population. Another GP suggested
allowing the user to switch from a cross-sectional view (i.e., the default one),
where each line represents the current conditions of a single patient, to a
longitudinal view. In such a view, the current conditions would be represented
as a highlighted line, while the bundle of the other lines would represent the
different conditions that the same patient exhibited in the past, as these are
recorded in the medical record of the GP or in the patient’s health record.
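
    The first suggestion (marking summary parameters on each axis) can be
sketched with Python's standard `statistics` module. The axis names and values
below are hypothetical, and the quartiles use the module's default "exclusive"
method, which is only one of several conventions for computing an interquartile
range.

```python
from statistics import mean, median, quantiles, stdev


def axis_summary(values):
    """Summary markers for one numeric axis of a parallel coordinates diagram."""
    q1, _, q3 = quantiles(values, n=4)  # default method is "exclusive"
    return {
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),  # sample standard deviation
        "iqr": (q1, q3),
    }


# Hypothetical values read off two numeric axes of a GP's patient set.
AXES = {
    "age": [50, 60, 70, 80, 90],
    "systolic_pressure": [118, 125, 131, 140, 152, 160],
}

for axis, values in AXES.items():
    print(axis, axis_summary(values))
```

    The same function could be applied to regional or national values, where
available, to draw the comparison markers that the respondent asked for.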
    We are aware of the limitations of this study. For instance, the involved
doctors had to imagine the novel interaction by looking at static pictures of
how the visualization tools would change according to their actions. However,
giving doctors real dashboards to interact with, and asking about their
perceptions, would have been problematic because it would have been impossible to
give them uniform training and to check remotely their interaction with the
tool under evaluation. For this reason, we believe that studies like the present
one, where interactivity is simulated in terms of its effects, can still be
useful in a preliminary phase of development, in order to understand whether
endowing visual tools with some kind of interactivity would be worth the effort.
    In light of the evidence collected in this study, we can conclude that
designing for better interactivity of open data platforms for general practice
is worth further efforts. From the HDI perspective, this conclusion was
desirable; although it might also seem easily predictable, it could not be taken
for granted in a committed evidence-based approach to Health IT design
[12, 11, 6]. For this reason, this study is a small but necessary contribution
to the research agenda outlined in Section 1, which is aimed at gaining feedback
and design-oriented indications from both caregivers and patients to improve
their experience in interacting with the next data visualization tools to come.

9   https://www.millegpg.it/


References
1. Ammenwerth, E., & Rigby, M. (Eds.). 2016. Evidence-Based Health Informatics:
   Promoting Safety and Efficiency Through Scientific Methods and Ethical Policy
   (Vol. 222). IOS Press.
2. Cabitza, F., & Simone, C. 2010. WOAD: a framework to enable the end-user devel-
   opment of coordination-oriented functionalities. Journal of Organizational and End
   User Computing (JOEUC), 22(2), 1-20.
3. Cabitza, F., Del Zotti, F., & Misericordia, P. 2014. Electronic Records for General
   Practice – Where we Are, Where we should Head to Improve Them. In Healthinf ’14:
   Proceedings of the International Conference on Health Informatics, (pp. 535-542).
   INSTICC.
4. Cabitza, F., Locoro, A., & Batini, C. 2015. A User Study to Assess the Situated
   Social Value of Open Data in Healthcare. Procedia Computer Science, 64, 306-313.
5. Cabitza, F., & Locoro, A. 2016. Human-Data Interaction in Healthcare:
   Acknowledging Use-related Chasms to Design for a Better Health Information. In
   Proceedings of the 8th IADIS International Conference on e-Health 2016, Part of
   the Multi Conference on Computer Science and Information Systems, MCCSIS 2016.
6. Cabitza, F., Fogli, D., Giacomin, M., & Locoro, A. 2016. Valuable Visualization of
   Healthcare Information: from the quantified self data to conversations. In AVI 2016:
   Proceedings of the International Working Conference on Advanced Visual Interfaces,
   Bari, Italy, 7-10 June 2016. pp. 376-380. ACM.
7. Croon, R. D., Klerkx, J., & Duval, E. 2015. Design and evaluation of an interactive
   proof-of-concept dashboard for General Practitioners. In ICHI 2015: Proceedings of
   the International Conference on Healthcare Informatics, 2015. pp. 150-159. IEEE.
8. Eysenbach, G. 2008. Medicine 2.0: social networking, collaboration, participation,
   apomediation, and openness. Journal of medical Internet research, 10(3).
9. Inselberg, A. 2009. Parallel coordinates. In Ling Liu, Tamer Ozsu (Eds.) Encyclo-
   pedia of Database Systems. pp. 2018-2024. Springer.
10. Locoro, A., Grignani, D., & Mascardi, V. 2011. MANENT: An infrastructure for
   integrating, structuring and searching digital libraries. Studies in Computational
   Intelligence, Springer, 375, 315-341.
11. Longhurst, C. A., Palma, J. P., Grisim, L. M., Widen, E., Chan, M., & Sharek, P.
   J. 2013. Using an Evidence-Based Approach to EMR Implementation to Optimize
   Outcomes and Avoid Unintended Consequences. Journal of Healthcare Information
   Management (JHIM), 27(3), 79.
12. Rigby, M., Ammenwerth, E., Beuscart-Zephir, M., Brender, J., Hypponen, H.,
   Melia, S., Nykanen, P., Talmon, J., & de Keizer, N. 2013. Evidence Based Health
   Informatics: 10 years of efforts to promote the principle. Yearb Med Inform,
   8(1), 34-46.
13. Solomon, J., Scherer, A. M., Exe, N. L., Witteman, H. O., Fagerlin, A., & Zikmund-
   Fisher, B. J. 2016. Is This Good or Bad?: Redesigning Visual Displays of Medical
   Test Results in Patient Portals to Provide Context and Meaning. In Proceedings
   of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing
   Systems. pp. 2314-2320. ACM.



</pre>