<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A data narrative about tuberculosis pandemic in Gabon</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Raymond Ondzigue Mbenga</string-name>
          <email>raymond.ondziguembenga@etu.univ-tours.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Veronika Peralta</string-name>
          <email>veronika.peralta@univ-tours.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thomas Devogele</string-name>
          <email>thomas.devogele@univ-tours.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Faten El Outa</string-name>
          <email>faten.elouta@etu.univ-tours.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sydney Maghendji Nzondo</string-name>
          <email>sydneymaghendji@yahoo.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Edgard Brice Ngoungou</string-name>
          <email>ngoungou2001@yahoo.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DEBIM-UREMCSE, Univ. of Health Sciences</institution>
          ,
          <addr-line>Libreville</addr-line>
          ,
          <country country="GA">Gabon</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LIFAT, University of Tours</institution>
          ,
          <addr-line>Blois</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Data narration is the activity of crafting narratives supported by facts extracted from data analysis, using interactive visualizations. It allows the transmission of findings in the data, by visual means, in order to facilitate their reception by a target audience. Despite their recognized utility in public health, data narratives are typically limited to the transmission of treatment recommendations to educate the general public. This paper describes the crafting of a data narrative about the tuberculosis pandemic in Gabon, intended for an audience of health professionals and authorities. Specifically, we describe and illustrate all phases of the crafting process, combining best practices in data narration and epidemic intelligence.</p>
      </abstract>
      <kwd-group>
        <kwd>Data narrative</kwd>
        <kwd>tuberculosis</kwd>
        <kwd>Gabon</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Data narration, i.e. narrating with data visualisations [<xref ref-type="bibr" rid="ref1">1</xref>], is receiving increasing interest in several communities such as journalism, business, e-government and data science. A data narrative is defined as a structured composition of messages, which convey findings over the data, and are typically delivered via visual means in order to facilitate their reception by an intended audience [<xref ref-type="bibr" rid="ref2">2</xref>]. More specifically, data narratives can be seen as ordered sequences of steps, each of which may contain words, images, visualisations, audio, video, or any combination thereof, and which are based on data [<xref ref-type="bibr" rid="ref3">3</xref>].</p>
      <p>Despite their recognised usefulness in public health [<xref ref-type="bibr" rid="ref4">4</xref>], where information on health problems must compete with thousands of other communication messages (infodemics), data narratives are little used, and are typically limited to the transmission of treatment recommendations and patient feedback to educate the general public. Few works aim at conveying scientific results to an expert audience to report on the health situation and the results of implemented policies, and more generally, to help decision making. Furthermore, such data narratives are built following ad-hoc processes. More generally, there is a lack of processes and methodological guidelines for data narration in public health.</p>
      <p>In this paper we describe the crafting of a data narrative about the tuberculosis (TB) pandemic in Gabon. This narrative is intended for health authorities and experts in Epidemic Intelligence (EI), with the goal of describing the epidemiological situation of tuberculosis in a pilot area of study, the Libreville-Owendo-Akanda (LBVO-A) health region, before a countrywide move.</p>
      <p>Specifically, we describe, for each step of the process, the key methodological aspects, the major difficulties and the particularities of the epidemiological domain. Indeed, the crafting process customizes general data narration processes by incorporating specific features of epidemiology, best practices in EI and communication to epidemiologists and health authorities. In particular, data collection, processing and analysis have a key place in the process, such tasks being at the core of EI. In addition, classical statistical analysis is enriched with other data mining tasks and confronted with the state of the art, the latter being a specific requirement when addressing a scientific audience. From a technical perspective, the underlying system supports spatio-temporal data of very heterogeneous quality. Indeed, geographic information systems are well-suited for EI applications [<xref ref-type="bibr" rid="ref5">5</xref>].</p>
      <p>The main applied and methodological contributions are:</p>
      <list list-type="bullet">
        <list-item><p>A detailed description of the crafting process, from the collection of analytical needs to the visual rendering of results.</p></list-item>
        <list-item><p>A discussion of the main challenges in data narration in the field of epidemiology.</p></list-item>
        <list-item><p>A summary of the main messages learned from the data, which form the heart of the narrative.</p></list-item>
        <list-item><p>An example of usage of data narration as a decision support tool to enable decision makers to define a better tuberculosis control strategy.</p></list-item>
      </list>
      <sec id="sec-1-1">
        <p>The paper is organized as follows. Section 2 describes related work on data narration and epidemic intelligence processes. Section 3 describes the main phases of the crafting process and the obtained results. Finally, Section 4 presents lessons learned and Section 5 concludes.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>In this section we provide some background on tuberculosis and we describe related works on data narration and EI processes. We also review the use of data narration in public health.</p>
      <sec id="sec-2-1">
        <title>2.1. About tuberculosis</title>
        <p>Tuberculosis is an infectious disease caused by a bacterium, Mycobacterium tuberculosis, which is transmitted by air when an infected person coughs or sneezes. In the most frequent form of TB (pulmonary TB) the pathogen attacks the lungs, but in more complex forms (extra-pulmonary TB) it can also affect other organs (e.g. lymph nodes, bones, kidneys, etc.). Typical symptoms of TB are fever with night sweats, chronic cough, fatigue, shortness of breath and loss of appetite.</p>
        <p>Tuberculosis continues to be a serious public health problem whose magnitude requires global attention. According to the World Health Organization (WHO), this pandemic infected 10 million people around the world in 2019. Within the African region, Sub-Saharan Africa is the most affected, with nearly two million cases reported each year. Among these countries, Gabon is the second most affected with an annual incidence of 428 cases per 100,000 inhabitants, behind Zimbabwe (562 cases per 100,000 inhabitants). The LBVO-A health region alone accounts for more than 50% of tuberculosis cases notified in Gabon.</p>
      </sec>
      <sec id="sec-2-1-1">
        <title>2.2. Data narration process</title>
        <p>Data narration, i.e. the crafting of a data narrative, is a complex process at the crossroads of several domains: data processing, data analysis, data visualisation, communication, among others. Despite numerous contributions in each of these fields, few works propose global methodologies, describing the whole data narration process.</p>
        <p>Firstly, Chen et al. [<xref ref-type="bibr" rid="ref6">6</xref>] distinguished two main phases typically used for data narration: (a) visual analytics, which requires seeing all aspects of complex data and exploring their interrelationships, and is supported by multiple coordinated views and sophisticated interaction techniques, from (b) storytelling, which is meant to convey only interesting and/or important information extracted through the analysis, presented in a simple and easily understandable way. They proposed an intermediate phase, data synthesis, in which the analyst assembles and organizes information pieces to be communicated, facilitating the telling of visual analytical findings in a compelling narrative.</p>
        <p>The same three phases (but named differently) are described by Lee et al. [7]: (i) explore data, to retrieve findings among data, (ii) make a story, to turn findings into a sequence of narrative pieces and build the plot of the narrative, and (iii) tell a story, to render the plot via visual means.</p>
        <p>Recently, a conceptual model of data narrative [<xref ref-type="bibr" rid="ref2">2</xref>] formalised its main concepts and their relationships. The model aims to guide the author through the data narration process: define the analysis goals and break them down into analytical questions, collect and explore data through a set of collectors that allow manipulating data with varied tools for answering analytical questions, analyze data and underline the findings, bring out messages (from the findings) to communicate to the audience, structure messages to build the narrative in terms of acts and episodes, and render it with visual means via dashboards. Even if the conceptual model can be instantiated by different processes (with different workflows), in a demonstration scenario [8], the authors illustrate a particular procedure in 4 phases: (i) goal setting, (ii) data exploration, (iii) narrative organisation, and (iv) presentation.</p>
        <p>We highlight that despite naming discrepancies, the described proposals agree on the major phases of the data narration process.</p>
      </sec>
      <sec>
        <title>2.3. Epidemic intelligence process</title>
        <p>The World Health Organization (WHO) proposed an EI process [9], with several protocols depending on whether data is collected through Indicator-based surveillance (IBS) or Event-based surveillance (EBS). While the former deals with data that has been previously validated, the latter focuses on new data, mostly based on rumours and unverified information, therefore requiring a validation process. Given the retrospective character of data narration and the importance of data validation, we consider only IBS protocols, relying on sources in the health sector.</p>
        <p>The process is organized into five main phases: (i) detection of raw data, which consists in selecting data sources and collecting data, (ii) triage of relevant data and information, which concerns data analysis (data quality and descriptive and analytical epidemiology) and data interpretation (qualitative assessment of the significance of findings), (iii) verification of signal, which consists in confirming the authenticity and conformity of the findings and their characteristics, generally by cross-checking using other reliable sources, (iv) risk assessment of the event, which implies determining the level of risk to human health and the potential control measures that can be implemented, and (v) communication, which concerns the communication of indicators to different audiences (experts, health authorities, partners, population, etc.) to help in the decision-making process.</p>
        <p>Apart from the communication phase, the phases of the EI process proposed by WHO correspond to the data exploration phase of the data narration process. Indeed, nothing is advised concerning the way of crafting messages (explaining the learned and validated signals) and structuring them in a coherent discourse. Finally, the communication phase of the EI process corresponds to the presentation phase of the data narration process, but also to effective dissemination of the information. In particular, the WHO suggests that various supports may be used to share information, but the use of visual artifacts is not detailed.</p>
      </sec>
      <sec id="sec-2-1-2">
        <title>2.4. Data narration in public health</title>
        <p>Many works describe the use of data narration techniques in public health. They mainly concern the communication phase, dealing with storytelling issues.</p>
        <p>In the field of public health, there are many communication strategies, but the most effective is storytelling [10].</p>
        <p>There is a growing trend to use storytelling as a research and intervention tool on public health issues, especially those with a strong disease prevention component [11].</p>
        <p>Indeed, storytelling is used to educate populations on health protection practices, to advocate for improved clinical care and to encourage efforts to combat infectious diseases [12].</p>
        <p>For example, in [13], authors tested a new communication modality to promote prevention messages and colorectal cancer screening among Latinos. In [14], authors conducted a survey to determine the effectiveness of a rational emotional digital storytelling therapy on HIV/AIDS knowledge and risk perception among school children in Enugu State, Nigeria.</p>
        <p>In [15], authors developed a digital storytelling intervention on diabetes to raise awareness among immigrant and refugee populations with limited English proficiency.</p>
        <p>In another study conducted in rural Alaskan communities [16], authors investigated whether digital stories could influence participants’ feelings about cancer, and whether viewing the digital stories led to a change (or intention to change) in health behavior.</p>
        <p>These studies showed that storytelling can increase patients’ positive responses and increase their level of knowledge.</p>
        <p>To address the need for compelling and successful information visualizations in biomedical sciences, authors of [17] propose a theoretical framework for visual storytelling and illustrate its potential application through two visual stories, one on vaccine safety and the other on cancer immunotherapy. Both examples, based on data, combine multiple media (photographs, illustrations, choropleth maps, tables, graphs, and diagrams) with text to create powerful visual stories for selected target audiences.</p>
        <p>A COVID-19 monitoring dashboard that provides data exploration and visualization capabilities was designed in [18]. One of its applications is to support data storytelling.</p>
        <p>According to the state of the art, data storytelling techniques have been used many times in the field of public health, both applied to population awareness and education, and recently also to EI. The proposed data narratives were crafted in an ad-hoc way, following neither EI nor data narration processes, and only the presentation phase (data storytelling) was reported. To the best of our knowledge, this article is the first to describe the complete process of crafting a data narrative about public health phenomena.</p>
      </sec>
      <sec>
        <title>3. Crafting of a data narrative about tuberculosis</title>
        <p>In this section we describe the crafting of a data narrative to convey the tuberculosis pandemic situation to health authorities in Gabon. The crafting process is inspired by classical data narration processes [<xref ref-type="bibr" rid="ref2 ref6">6, 7, 2</xref>] (described in Subsection 2.2), enriched with particular tasks from the WHO EI process [9]. The following sub-sections describe each of the phases and the obtained results.</p>
      </sec>
      <sec id="sec-2-1-3">
        <title>3.1. Goal setting</title>
        <p>In this phase, the goal of the data narrative is defined. From this objective, one or more analytical questions can be derived.</p>
        <sec>
          <title>Objective definition</title>
          <p>The objective of this data narrative is to describe the epidemiological situation of tuberculosis in the LBVO-A health region of Gabon between 2016 and 2018. Specifically, it aims to provide answers to decision-makers on the epidemiological profile of the disease, in order to enable them to better orient their decisions for an effective response.</p>
        </sec>
        <sec>
          <title>Analytical questions formulation</title>
          <p>By refining this objective and conducting interviews with various officials, we obtained the following initial list of analytical questions.</p>
          <p>Q1: What are the epidemiological characteristics of tuberculosis? We aim to describe the profile (frequency, variations) of tuberculosis according to the characteristics of the tuberculosis patients;</p>
          <p>Q2: What is the spatial and temporal distribution of patients? Identifying the most affected areas is essential to target response actions.</p>
          <p>Other analytical questions may arise during data exploration.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>3.2. Data exploration</title>
        <p>Data exploration consists in exploring an epidemiological dataset for trends, patterns and correlations. Considering tasks proposed in EI and data narration processes, we proceed in 7 steps: (i) data collection, (ii) data processing (to solve data quality problems), (iii) data analysis (descriptive and analytical epidemiology), (iv) data interpretation (qualitative assessment of findings), (v) verification (cross-checking and comparison with the state of the art), (vi) epidemic risk assessment, and (vii) message formulation (key messages, based on findings, to the target audience). These steps are not supposed to be executed sequentially, and many back and forth transitions may be necessary.</p>
        <sec>
          <title>Data collection</title>
          <p>Sociodemographic and clinical data were collected from 7968 medical records of patients undergoing TB treatment at the Nkembo Specialized Hospital in Libreville. They were completed with geographical data about the administrative boundaries (neighbourhoods and districts) of the LBVO-A region.</p>
          <p>Patients and their treatments are described in terms of the following dimensions:</p>
          <list list-type="bullet">
            <list-item><p>Time: The year of the treatment (2016 to 2018);</p></list-item>
            <list-item><p>Type of TB: Clinical form (pulmonary, extra-pulmonary, multidrug-resistant or unknown);</p></list-item>
            <list-item><p>Age: The age of the patient, organised in two levels: age and age group;</p></list-item>
            <list-item><p>Gender: The gender of the patient (male or female);</p></list-item>
            <list-item><p>Profession: The professional status of the patient;</p></list-item>
            <list-item><p>Geography: The patient’s place of residence, organised in two levels: neighbourhood and district;</p></list-item>
            <list-item><p>HIV status: The patient’s HIV status, whether tested (negative, positive) or unknown;</p></list-item>
            <list-item><p>Treatment outcome (cured, completed, failure, deceased, lost to follow-up<sup>1</sup>, discontinued and transferred).</p></list-item>
          </list>
          <p><sup>1</sup> According to the WHO, a "lost to follow-up" is a patient whose treatment has been interrupted for at least two consecutive months and for whom no treatment outcome has been assigned (including patients transferred to another treatment unit and those whose treatment outcome is not known).</p>
        </sec>
        <sec id="sec-2-2-1">
          <title>Data processing</title>
          <p>Collected data were processed in order to solve several data quality problems, including disambiguation of empty fields, standardization of data types, and correction of geographic boundaries (e.g. for overlapping districts). Age groups were computed and professional statuses were aggregated. Clean data were stored in a PostgreSQL spatial data warehouse.</p>
        </sec>
        <sec>
          <title>Data analysis</title>
          <p>Data were analyzed using the interactive BI tool Tableau Desktop<sup>2</sup>, devising many queries and noting findings.</p>
          <p>Below, we present four examples of the analyses carried out. Since tuberculosis mostly affects vulnerable people, we began by studying univariate distributions of the dimensions most concerned by this aspect (profession, age, HIV status and geography). The findings led to a set of new queries.</p>
          <p>The distribution of cases by professional status, illustrated in Figure 1, shows a predominance of students (21.94%) and unemployed patients (17.81%). The proportion of patients with unknown profession is also high (22.39%).</p>
          <p>The distribution of cases by age group (see top part of Figure 3) shows more patients among young adults (20-34 years) and mature adults (35-64 years).</p>
          <p>The spatial and spatio-temporal distributions of cases by district and by neighbourhood (figures omitted because of their size) did not show any spatial correlation, although such correlation is very common in other spatial epidemiology studies. In order to investigate further, we studied the distribution by neighbourhood typology (see Figure 2), where the trends are more marked (p-value = 0.01). Additional data sources were selected for this analysis, and additional statistical tools were used.</p>
          <p>The distribution of cases by treatment outcome revealed that 74.60% of patients are lost to follow-up. This alarming finding led to a new analytical question:</p>
          <p>Q3: What are the epidemiological characteristics of patients lost to follow-up?</p>
          <p><sup>2</sup> https://www.tableau.com/</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>Data interpretation</title>
          <p>Hereafter, we present a detailed example of data interpretation.</p>
          <p>The high proportion of patients being students could be linked to the high consumption of drugs in schools. Indeed, a study carried out at the Paul Idjendjé Gondjout School in Libreville [19] showed that among the surveyed pupils, aged between 14 and 20 years, 40% consumed alcohol, 20% smoked tobacco and 15% smoked cannabis.</p>
          <p>To put these findings into perspective, given the relatively young population of Gabon, we studied the age pyramid in the study area (middle part of Figure 3), taken from Gabon’s 2013 General Population and Housing Census (RGPL 2013), which shows that more than half of the population is under 22 years old.</p>
          <p>These two results allow the calculation of prevalence (number of patients in the study period per 1000 inhabitants) by age group (bottom part of Figure 3). It can be seen that children are little affected by tuberculosis, but young adults are only slightly more affected than older adults. Finally, it should be noted that the small number of senior patients does not allow the formulation of statistically valid insights.</p>
        </sec>
        <sec id="sec-2-2-3">
          <title>Verification</title>
          <p>Analysis results showed that more than half of the patients (57.71%) come from poor neighbourhoods (see Figure 2). This result is in line with those of [20].</p>
          <p>Other results (not described in the data analysis paragraphs above) were also verified. For example, men are more affected by tuberculosis (61.60% of patients), which is confirmed by other studies [21, 22]; pulmonary forms of TB were observed in a large majority of patients (90%) and drug-resistant forms were marginal (0.28%), which is in line with professional knowledge.</p>
          <p>Other results deviate from the state of the art. For example, the proportion of patients lost to follow-up (75%) is too high compared to other countries in the Sub-Saharan African sub-region (e.g. in Mali [23]). This discrepancy must also be reported. In addition, the proportion of unemployed patients (17.81%), although substantial, is very low compared to those found in other studies [24, 25]. This proportion must be read together with the high number of patients with unknown profession (see Figure 1).</p>
        </sec>
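        <sec>
          <title>Illustrative sketch: prevalence by age group</title>
          <p>As a minimal sketch of the prevalence calculation described under data interpretation (number of patients in the study period per 1000 inhabitants, by age group), the Python fragment below groups patient ages and divides the case counts by the population of each group. It is not the authors' implementation, which relied on Tableau Desktop and a PostgreSQL spatial data warehouse. The function names, the toy data and the bounds of the "children" and "seniors" groups are assumptions made for illustration; only the 20-34 and 35-64 groups are stated in the text.</p>
          <preformat>
```python
from collections import Counter

# Age-group bounds in whole years. The 20-34 and 35-64 groups come from
# the text; the "children" and "seniors" bounds are assumptions.
AGE_GROUPS = [
    ("children", 0, 19),
    ("young adults", 20, 34),
    ("mature adults", 35, 64),
    ("seniors", 65, 150),
]


def age_group(age):
    """Return the label of the group containing `age` (whole years).

    Missing or out-of-range ages are labelled "unknown", which mirrors
    the disambiguation of empty fields mentioned under data processing.
    """
    if age is None:
        return "unknown"
    for label, low, high in AGE_GROUPS:
        if age in range(low, high + 1):
            return label
    return "unknown"


def prevalence_per_1000(patient_ages, population_by_group):
    """Patients in the study period per 1000 inhabitants, per age group."""
    cases = Counter(age_group(a) for a in patient_ages)
    return {
        group: 1000 * cases[group] / inhabitants
        for group, inhabitants in population_by_group.items()
        if inhabitants  # skip groups with no census population
    }


# Hypothetical toy data, not taken from the study.
ages = [25, 31, 40, 7, None, 70, 22]
population = {
    "children": 2000,
    "young adults": 1000,
    "mature adults": 1000,
    "seniors": 500,
}
rates = prevalence_per_1000(ages, population)
```
          </preformat>
          <p>With real data, the population counts would come from the census age pyramid and the patient ages from the cleaned medical records. Patients whose age is unknown are counted separately and do not contribute to any group's prevalence.</p>
        </sec>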
        <sec id="sec-2-2-4">
          <title>Epidemic risk assessment</title>
          <p>There is a risk of spreading tuberculosis among the student population.</p>
          <p>Patients lost to follow-up may lead to a generalized TB epidemic in the country. Geographically disadvantaged areas are at greater risk.</p>
        </sec>
        <sec id="sec-2-2-5">
          <title>Message formulation</title>
          <p>The findings obtained during data analysis and interpretation, after validation and risk analysis, allow the formulation of the following messages (exhaustive list) for the audience:</p>
          <p>M1: Apart from patients with unknown profession (22.39%), students are the most affected by TB (21.94%). The influence of drug use is a lead to be explored.</p>
          <p>M2: The proportion of unemployed patients (17.81%) is lower than in other African countries.</p>
          <p>M3: A high proportion of patients (45.81%) are between 20 and 34 years old.</p>
          <p>M4: The highest prevalence values (from 16.73 to 17.42 per 1000 inhabitants, in 3 years) also occur among patients between 20 and 34 years old, but prevalence slightly drops for older patients.</p>
          <p>M5: Children are little affected by TB (prevalence is lower than 2.22 per 1000 inhabitants, in 3 years).</p>
          <p>M6: The proportion of patients lost to follow-up (74.60%) is alarming. It is higher than in other African countries.</p>
          <p>M7: There is neither spatial nor spatio-temporal correlation.</p>
          <p>M8: The spatial and temporal evolution is correlated with the typology of the neighbourhoods (p-value = 0.01), with precarious and mixed neighbourhoods being significantly more affected, but showing a slight downward trend.</p>
          <p>M9: A very large majority of patients (90%) suffer from the pulmonary form, which is consistent with professional knowledge. Multidrug-resistant TB has been recorded in 0.28% of patients.</p>
          <p>M10: The distribution by clinical form of those lost to follow-up is very similar to that of all patients.</p>
          <p>M11: Among the patients, there is a male predominance (61.60%).</p>
          <p>M12: The proportion of patients with unknown HIV status is very high (65%). This does not allow this criterion to be considered in the patient profile.</p>
          <p>M13: The sixth district of Libreville records the highest proportion of cases (27.64%).</p>
          <p>M14: In all the districts, the general trend is a slight drop in prevalence. However, two districts (1st of Akanda and 2nd of Owendo) show particular trends and greater variations, partially explained by their small populations.</p>
          <p>M15: There are two hot districts for TB, Les-PK and Nzeng-Ayong, with respectively 13.% and 11.58% of patients.</p>
          <p>M16: More than half of the patients (57.71%) come from poor neighbourhoods.</p>
          <p>M17: In the distribution of patients lost to follow-up by type of neighbourhoods, we find the same trend as for all patients.</p>
          <p>M18: TB affects all social classes and genders, with most patients being adults and living in poor neighbourhoods.</p>
          <p>M19: The profile of the patients lost to follow-up, in terms of gender, age, occupation and location, is very similar to that of other patients.</p>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>3.3. Narrative structuring</title>
        <p>In this phase, messages are assembled and ordered in a coherent and logical plot in order to facilitate their understanding and attract the audience. We started by setting the audience and selecting the messages to convey to that audience. Then, we chose the narrative structure and organized the messages.</p>
        <sec>
          <title>Audience choice</title>
          <p>The target audience for this TB data narrative is composed of health authorities in Gabon, including the head of the National Tuberculosis Control Program (PNLT), the head of the Institute of Epidemiology and Endemic Control (IELE), policy makers, as well as epidemiologists and doctors involved in the fight against TB.</p>
        </sec>
        <sec>
          <title>Message selection</title>
          <p>All messages formulated during data exploration are selected to be communicated to the target audience.</p>
        </sec>
        <sec id="sec-2-3-1">
          <title>Structure choice</title>
          <p>Following data narration recommendations [2], we organize the plot of the narrative in several acts. Acts are composed of episodes, each one telling a message.</p>
          <p>The plot is organized in 8 acts, listed in the first columns of Table 1.</p>
          <p>The first act introduces the study intention. The second act presents the salient messages and the next 6 acts focus on a dimension describing the patients, respectively, occupation, gender, age, HIV status and geography. The final act presents conclusions and recommendations.</p>
          <p>We have opted for this structuring, as it facilitates the understanding of a patient profile, according to the various dimensions that make it up. It is intended that the audience should first look at the introduction and general presentation, but then they should be able to navigate between the following acts according to their needs. The order of navigation has no impact on the conclusions.</p>
          <p>This structure is well known in modern storytelling as Martini-Glass [26]. It combines several types of interactivity in a balanced way: the author lays down a narrative path (Acts I and II) and the reader interacts and explores the available paths to better understand the data (Acts III to VIII). This makes navigation through the story flexible, according to the needs of the audience.</p>
        </sec>
        <sec id="sec-2-3-2">
          <title>Message mapping</title>
          <p>Finally, messages were mapped to acts, as shown in Table 1.</p>
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>3.4. Presentation</title>
        <sec id="sec-2-4-1">
          <p>This phase concerns the choice of the visual representation (e.g. interactive dashboards, infographics, slideshow, video) and the setting of visual artifacts (graphics, colors, text, etc.) for telling acts and episodes and attracting the audience's attention.</p>
          <p><bold>Visual representation choice.</bold> We designed and implemented two versions of the data narrative with different visual renderings: (i) an interactive narrative, composed of interconnected interactive dashboards, and (ii) a video<sup>3</sup>, capturing a particular navigation through the interactive narrative, with audio explanations. Both visual versions are implemented in French; the figures and data samples presented in this paper were translated to ensure readability.</p>
          <p>We used Tableau Desktop to render the interactive
narrative and OBS Studio for recording the video.</p>
          <p><bold>Dashboard implementation.</bold> We tested several visualisation possibilities, set up the various charts, maps and tables, and supplemented them with explanatory text, visual effects and audio explanations.</p>
          <p>Each act is represented by an interactive dashboard,
with the exception of Act VII (geography), which,
needing to display several maps, is broken down into several
dashboards. Each episode (concerning a message) is
represented by one or more visual artefacts (e.g. charts,
maps, tables, text, images, audio).</p>
          <p>As an example, Figure 4 presents the rendering of Act
II (general presentation). It takes the form of a welcome
screen, which gives access to the dashboards corresponding
to the following acts (in the top menu). Several types
of visualizations are used to highlight the messages. The
interface offers the possibility of filtering by neighbourhood,
district and year, for in-depth spatio-temporal focus,
and of zooming in on a map region.</p>
          <p>Unlike previous works, which target the general public,
this narration targets experts in epidemiology and public
health, who are decision-makers. To our knowledge, this
article is the first to describe
the process of crafting a data narrative considering the
peculiarities of the epidemiological field and highlighting
the main methodological challenges. Concretely, (i) data
analysis is completed by a comparison with other
state-of-the-art studies, (ii) messages are supported by statistical
tests, (iii) the geographic component is very important,
(iv) the restitution must be both guided and interactive,
and (v) back and forth transitions among process phases
and steps are necessary.</p>
          <p>In addition to the Gabonese health authorities, who were
fully satisfied with the experience, the data narrative was
presented to professionals from other African countries
[27]. We hope that this initiative will inspire
other teams to reproduce the experience in other health
fields, in particular to facilitate understanding of the
epidemiological situation of other infectious diseases (COVID-19,
cholera, dysentery, yellow fever, Ebola, etc.).</p>
          <p>As future work, we plan to produce a data
narrative based on simulation results. Concretely, we
are studying the spread of the tuberculosis epidemic with
a multi-agent system whose parameters will be taken
from the underlying spatial data warehouse, and which
will be used to test several scenarios and health policies.</p>
        </sec>
        <sec id="sec-2-4-2">
          <title>4. Lessons learned</title>
          <p>
            The crafting process is inspired by state-of-the-art models
            and processes [
            <xref ref-type="bibr" rid="ref2 ref6">2, 6, 7</xref>
            ]. However, several peculiarities of
the application context have led us to enrich this process.
This section presents the main lessons learned during
this adaptation.
          </p>
          <p>First, statistical data analysis is not sufficient for public
health decision making. A systematic comparison of the figures
obtained with those reported in the state of the art is
imperative in order to distinguish global phenomena from
regional or seasonal peculiarities. Thus, decision-makers
can judge which dimensions of the patient profile
agree with the situation in other countries, for
which joint actions can be put in place, and which are specific
to the Gabonese population. Similarly, the results
obtained must undergo extensive testing in order to establish
their statistical validity. As the target audience is
predominantly scientific, these results can be communicated in
the narrative.</p>
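          <p>As an illustration of the kind of test that can support a
message, the following minimal sketch computes a Pearson chi-square
test of independence on a 2x2 contingency table in plain Python. The
function name, the variable names and the counts are hypothetical and
are not taken from the study; the tests actually applied to each
message may differ.</p>
          <preformat><![CDATA[
```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). With one degree of freedom, the
    chi-square survival function equals erfc(sqrt(x / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts (NOT from the study): rows are two patient groups,
# columns are "lost to follow-up" and "followed".
hypothetical_counts = [[30, 70], [10, 90]]
statistic, p_value = chi_square_2x2(hypothetical_counts)
print(f"chi2 = {statistic:.2f}, p = {p_value:.4f}")
```
]]></preformat>
          <p>A small p-value suggests that the two proportions differ by
more than chance alone would explain. When expected counts are
small, an exact test would be a safer choice than this
approximation.</p>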
          <p>Second, unlike seasonal narratives (frequent in data
journalism), in scientific narratives analytical questions
are not all known in advance. On the contrary, new
questions may arise during data analysis. We illustrated this
with the study of patients lost to follow-up in Subsection
3.2. Iterations between goal setting and data exploration
phases are often necessary. New findings can also impact
previous messages and require updating.</p>
          <p>Third, the geographic component is very important in
assessing the spatial and spatiotemporal extent of health
problems. Restitution in the form of maps should be
favoured, and spatial correlations should also be reported.</p>
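          <p>One common way to quantify spatial correlation is the global
Moran's I index. The sketch below is only an assumption about how
such a correlation could be computed; we do not claim it is the
indicator used in the narrative. The district values and the
adjacency matrix are hypothetical.</p>
          <preformat><![CDATA[
```python
def morans_i(values, weights):
    """Global Moran's I for values observed on n spatial units.

    weights[i][j] is the spatial weight between units i and j
    (for example 1 if they are adjacent, 0 otherwise, 0 on the diagonal).
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    squared_sum = sum(d * d for d in deviations)
    if total_weight == 0 or squared_sum == 0:
        raise ValueError("need non-zero weights and non-constant values")

    # Weighted sum of cross-products of deviations over all pairs of units.
    cross = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / total_weight) * (cross / squared_sum)


# Hypothetical example: four districts along a line, each adjacent to its
# immediate neighbours, with made-up incidence values.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [10, 9, 2, 1]    # similar values sit next to each other
alternating = [10, 1, 10, 1]  # neighbours always differ

print(morans_i(clustered, adjacency))
print(morans_i(alternating, adjacency))
```
]]></preformat>
          <p>Positive values indicate that neighbouring units tend to have
similar values (clusters), while negative values indicate that
neighbours tend to differ.</p>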
          <p>Finally, the data narrative should allow interactive
navigation between dashboards. There are different
profiles among the decision-makers. On the one hand, we
find various needs in terms of the dimensions and indicators
studied, for which a thematic organization (such as that
implemented in Acts III to VII) is well suited. On the
other hand, health authorities need a more
comprehensive and guided reading of the narrative. The challenge
is to find a rendering that balances guidance and
interactivity.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Conclusion</title>
      <sec id="sec-3-1">
        <p>In this article, we have described the process of crafting a data narrative about the tuberculosis (TB) pandemic in Gabon.</p>
        <p>[7] B. Lee, N. H. Riche, P. Isenberg, S. Carpendale, More than telling a story: Transforming data into visually shared stories, IEEE Computer Graphics and Applications 35 (2015).</p>
        <p>[8] F. E. Outa, M. Francia, P. Marcel, V. Peralta, P. Vassiliadis, Supporting the generation of data narratives, in: ER Forum, Demo and Posters 2020, Vienna, Austria, 2020.</p>
        <p>[9] World Health Organization, Early detection, assessment and response to acute public health events: implementation of early warning and response with a focus on event-based surveillance, Technical document, 2014.</p>
        <p>[10] N. Patel, N. Patel, Modern technology and its use as storytelling communication strategy in public health, MOJ Public Health 6 (2017) 338–341.</p>
        <p>[11] B. McCall, L. Shallcross, M. Wilson, C. Fuller, A. Hayward, Storytelling as a research tool and intervention around public health perceptions and behaviour: a protocol for a systematic narrative review, BMJ open 9 (2019).</p>
        <p>[12] E. K. Tsui, A. Starecheski, Uses of oral history and digital storytelling in public health research and practice, Public health 154 (2018) 24–30.</p>
        <p>[13] L. K. Larkey, J. Gonzalez, Storytelling for promoting colorectal cancer prevention and early detection among latinos, Patient education and counseling 67 (2007) 272–278.</p>
        <p>[14] B. Ezegbe, C. Eseadi, M. O. Ede, J. N. Igbo, A. Aneke, D. Mezieobi, G. C. Ugwu, A. U. Ugwoezuonu, E. Elizabeth, K. R. Ede, et al., Efficacy of rational emotive digital storytelling intervention on knowledge and risk perception of hiv/aids among schoolchildren in nigeria, Medicine 97 (2018).</p>
        <p>[15] J. W. Njeru, C. A. Patten, M. M. Hanza, T. A. Brockman, J. L. Ridgeway, J. A. Weis, M. M. Clark, M. Goodson, A. Osman, G. Porraz-Capetillo, et al., Stories for change: development of a diabetes digital storytelling intervention for refugees and immigrants to minnesota using qualitative methods, BMC public health 15 (2015) 1–11.</p>
        <p>[16] M. Cueva, R. Kuhnley, L. Revels, N. E. Schoenberg, M. Dignan, Digital storytelling: a tool for health promotion and cancer awareness in rural alaskan communities, International journal of circumpolar health 74 (2015).</p>
        <p>[17] T. Botsis, J. E. Fairman, M. B. Moran, V. Anagnostou, Visual storytelling enhances knowledge dissemination in biomedical science, Journal of biomedical informatics 107 (2020).</p>
        <p>[18] A. S. Peddireddy, D. Xie, P. Patil, M. L. Wilson, D. Machi, S. Venkatramanan, B. Klahn, P. Porebski, P. Bhattacharya, S. Dumbre, et al., From 5vs to 6cs: Operationalizing epidemic data management with covid-19 surveillance, in: 2020 IEEE Int. Conf. on Big Data, 2020, pp. 1380–1387.</p>
        <p>[19] Y. Mboumba Sambo, Lutte contre la consommation de la drogue en milieu scolaire au gabon: Cas du Lycée Paul IDJENDJE GONDJOUT de la commune de Libreville, Dissertation, INJS Gabon, 2018.</p>
        <p>[20] E. Engohan Alloghe, M. Toung Mve, S. Ramarojoana, J. J. Iba Ba, D. Nkoghe, Epidemiologie de tuberculose infantile au centre antituberculeux de libreville de 1997–2001, Med trop 66 (2006) 469–471.</p>
        <p>[21] B. Melki, S. Saad, H. Daghfous, M. Khelifa, F. Tritar, Forme grave de la tuberculose : le pyopneumothorax tuberculeux, Revue des Maladies Respiratoires 32 (2015).</p>
        <p>[22] B. Larbani, M. Terniche, S. Taright, M. Makhloufi, La prise en charge de la tuberculose pulmonaire dans une unité de contrôle de la tuberculose d’alger, Revue des Maladies Respiratoires 34 (2017).</p>
        <p>[23] A. Sylla, B. Marchou, N. Kassi, N. Ello, T. Aba, G. Kouakou, C. Mossou, E. Ehui, S. Eholié, E. Biassagnéné, Co-infection tuberculose/vih: à propos de 717 cas suivis dans un service de maladies infectieuses en afrique subsaharienne, Médecine et Maladies Infectieuses 47 (2017) S137–S138.</p>
        <p>[24] G. Tékpa, V. Fikouma, R. M. M. Téngothi, J. de Dieu Longo, A. P. A. Woyengba, B. Kofi, Aspects épidémiologiques et cliniques de la tuberculose en milieu hospitalier à bangui, The Pan African Medical Journal 33 (2019).</p>
        <p>[25] S. Niang, E. Abdallahi, K. Thiam, F. B. R. Mbaye, M. Cisse, A. Dieng, N. T. Badiane, et al., Aspects épidémiologiques, diagnostiques et évolutifs de la tuberculose pulmonaire à microscopie positive au district sanitaire de saint-louis, Revue Africaine de Médecine Interne 5 (2018) 65–69.</p>
        <p>[26] E. Segel, J. Heer, Narrative visualization: Telling stories with data, IEEE TVCG 16 (2010).</p>
        <p>[27] R. Ondzigue Mbenga, V. Peralta, T. Devogele, S. Maghendji, E. B. Ngoungou, Processus de narration de données en intelligence épidémique avec application à la pandémie de tuberculose au gabon, in: 8e Journées Camerounaises d’Informatique Médicale, 2021.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hullman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Drucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. H.</given-names>
            <surname>Riche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lee</surname>
          </string-name>
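          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fisher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Adar</surname>
          </string-name>
          ,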
          <article-title>A deeper understanding of sequence in narrative visualization</article-title>
          ,
          <source>IEEE TVCG</source>
          <volume>19</volume>
          (
          <year>2013</year>
          )
          <fpage>2406</fpage>
          -
          <lpage>2415</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F. E.</given-names>
            <surname>Outa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Francia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marcel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Peralta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Vassiliadis</surname>
          </string-name>
          ,
          <article-title>Towards a conceptual model for data narratives</article-title>
          ,
          <source>in: ER</source>
          <year>2020</year>
          , Vienna, Austria,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Kosara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Mackinlay</surname>
          </string-name>
          ,
          <article-title>Storytelling: The next step for visualization</article-title>
          ,
          <source>IEEE Computer</source>
          <volume>46</volume>
          (
          <year>2013</year>
          )
          <fpage>44</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bouman</surname>
          </string-name>
          ,
          <article-title>Storytelling makes public health statistics more accessible</article-title>
          ,
          <source>European Journal of Public Health</source>
          <volume>27</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Rivest</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bédard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-J.</given-names>
            <surname>Proulx</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nadeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hubert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pastor</surname>
          </string-name>
          ,
          <article-title>Solap technology: Merging business intelligence with geospatial technology for interactive spatio-temporal exploration and analysis of data</article-title>
          ,
          <source>ISPRS Journal of Photogrammetry and Remote Sensing</source>
          <volume>60</volume>
          (
          <year>2005</year>
          )
          <fpage>17</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Andrienko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. V.</given-names>
            <surname>Andrienko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. H.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Turkay</surname>
          </string-name>
          ,
          <article-title>Supporting story synthesis: Bridging the gap between visual analytics and storytelling</article-title>
          ,
          <source>IEEE Trans. Vis. Comput. Graph</source>
          .
          <volume>26</volume>
          (
          <year>2020</year>
          )
          <fpage>2499</fpage>
          -
          <lpage>2516</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>