=Paper= {{Paper |id=Vol-2884/paper_131 |storemode=property |title=Health Care Misinformation: an Artificial Intelligence Challenge for Low-resource languages |pdfUrl=https://ceur-ws.org/Vol-2884/paper_131.pdf |volume=Vol-2884 |authors=Sarah Luger,Martina Anto-Ocrah,Tapo Allahsera,Christopher Homan,Marcos Zampieri,Michael Leventhal }} ==Health Care Misinformation: an Artificial Intelligence Challenge for Low-resource languages== https://ceur-ws.org/Vol-2884/paper_131.pdf
Health Care Misinformation: An artificial intelligence challenge for low-resource languages

Sarah Luger¹, Martina Anto-Ocrah², Tapo Allahsera³, Christopher M. Homan³, Marcos Zampieri³, Michael Leventhal⁴

¹ Orange Silicon Valley, San Francisco, CA, USA
² University of Rochester Medical Center, Rochester, NY, USA
³ Rochester Institute of Technology, Rochester, NY, USA
⁴ Centre National de l’Education en Robotique et en Intelligence Artificielle (RobotsMali), Bamako, Mali

sarah.luger@orange.com | martina_anto-ocrah@urmc.rochester.edu | (aat3261 | cmhvcs | mazgla)@rit.edu | mleventhal@robotsmali.org


AAAI Fall 2020 Symposium on AI for Social Good. Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

In this paper, we motivate using state-of-the-art artificial intelligence technologies to address challenges presented by low-resource languages. We also reflect on both the importance and priorities of AI research with respect to the less wealthy economies of the world. We explore the contributions of colonialism to language (in)accessibility and public health misinformation during the Covid-19 pandemic in the African region. Using the West African country of Mali as a case study, we discuss the historic contribution of colonial educational systems to the creation of disenfranchised populations. These populations are left with limited access to important medical information that can mean life or death in the current Covid-19 pandemic. We propose a humans-in-the-loop neural machine translation (NMT) solution to medical information translation. In our solution, the state-of-the-art NMT approach is applied to the low-resource language Bambara, which is spoken by a majority of the Malian people. By implementing a crowdsourced Bambara language data collection and translation component in this machine learning problem, we engage the local Malians. The aim of this project is to address the lack of Bambara language resources and leverage current best practice in order to undo some of the artefacts of colonialism. We describe the unique challenges and research issues raised by this novel application of AI technology.

Background

AI research can contribute to the diminution of global inequities born of the colonial period. However, the research community, with notable exceptions, fails either to recognize this opportunity or to be interested in it because research priorities reflect the same mindset that produced colonialism. Based on well-publicized instances of bias in AI systems over the past several years (Buolamwini and Gebru 2018; Angwin et al. 2016), the priority for current “AI for Social Good” or fairness, accountability, and transparency research is to address problems that reflect Western challenges with institutional racism and sexism. This Western-centric approach, while nascent, misses opportunities to increase financial, educational, and health well-being elsewhere. There is tremendous opportunity, especially in machine translation (MT) of medical information, to increase digital inclusion and health for low-resource language speakers.

Andrew Ng, through his deeplearning.ai newsletter, ran a survey asking what the AI community should focus on in order to promote social good. The authors of this paper believe that a top focus should be to solve problems in developing countries, where AI could have an enormous impact, and to help create expertise in the developing world. Such impact and expertise mean that the people who live in the developing world can control their own technology and their own destiny.

In 2018, the McKinsey Global Institute published research outlining the financial benefits of corporate and national AI investment. One of their top four analyses was that

    [the] adoption of AI could widen gaps between countries, companies, and workers...AI leaders (mostly developed economies) could capture an additional 20 to 25 percent in economic benefits compared with today, while emerging economies may capture only half their upside (Bughin et al. 2018).

Another of the top four analyses was:

    The pace of AI adoption and...how countries choose to embrace these technologies (or not) will likely impact the extent to which their businesses, economies, and societies can benefit. The race is already on among companies and countries. In all cases, there are trade-offs that need to be understood and managed appropriately in order to capture the potential of AI for the world economy (Bughin et al. 2018).

At this juncture, artificial intelligence is a field that faces both vast promise and daunting peril. We seek to raise
awareness of the challenges faced by communities isolated by a lack of language resources, especially digital ones. In addition, we present a repeatable use case for low-resource languages: using neural machine translation with humans-in-the-loop to improve global access to health care information.

As a team working on AI with the full participation of an African research team, on a project for Africans, we encounter the notion of “social good” frequently, as well as the intrinsically related concepts of “fairness,” “accountability,” and “transparency.” We have formed the view that this notion almost always revolves around problems and perspectives as perceived and defined in the societies of plenty. When we (some of the authors), as Africans, look from any side of the political spectrum at the debate in the developed countries, we do not sense that there is real conviction that our fates are interlinked on a global level.

Wealthier nations may feel that international institutions, which they fund, are addressing global needs. Many Africans understand that the primary function of these institutions is to keep African suffering out of the developed world. Wealthier nations may point to foreign aid as their contribution to alleviating that suffering. Many Africans see that foreign aid is always tied narrowly to the priorities defined by the donors and not the people purportedly aided, that the great bulk of the money goes back into the donor country through salaries paid to consultants and goods purchased in the donor country, and that the overall effect is to suppress the development of local industry and expertise.

Preliminary findings from our work show that what many Africans would love to have instead is access to the same resources that many people in the wealthier economies and past colonial powers have to educate themselves, to start businesses, and to create opportunities and solutions to the problems that they face. Historically, these countries have labored in many systematic ways to deny such access to ex-colonial countries, and they continue to do so in numerous systematic ways to this day.

Challenge

This paper begins with background information on the suppression of native-language education in the era of colonialism as a paradigmatic historical example of a widespread, long-term policy to deny Africans access to the resources needed to develop their own intellectual capacity and the capacity to solve problems relevant to them. We argue that the continuity of the colonial mindset is reflected in the fact that, today, only minuscule resources exist for Africans to learn AI, that Africans are systematically, en masse, denied access to resources to learn AI and participate in AI communities that exist only outside of Africa, and that “AI for Social Good” has not even considered a problem as basic as applying natural language processing (NLP) to the languages that Africans speak.

We present a case study covering the impact of this inattention and denial of resources on health care information in Africa as the Covid-19 epidemic swept the world. We use our own efforts to study the problems of developing NLP for an under-resourced African language, Bambara, as a generalizable use case. Exploring Bambara MT illustrates that the AI challenges and research problems aiming to do social good must also reflect the priorities of the inhabitants of financially under-resourced countries.

Colonialism and its Legacy

To gain a sense of the significance of misinformation to public health crises, consider the situation in Mali in May 2020. As the death toll at that time had reached over 300,000 people globally, and news of the increasing Covid-19 mortality rates dominated media outlets, the United Nations announced the “Verified” initiative (Kreider 2020) to fight Covid-19 misinformation. And yet it was only in the following month that they began donating medical relief to Mali in order to support an integrated and quick response to the Covid-19 crisis (for the Coordination of Humanitarian Affairs 2020).

French and British colonial language perspectives

Since the colonial era, language has served to disenfranchise African populations. However, different approaches to colonization by France and Britain led to very different outcomes in this regard. In order to save costs and provide the appearance of a moral justification for colonialism, the British relied on missionaries to manage education in their colonies. This approach was inherently decentralized, with individual missions having great liberty in how they taught. This allowed them to provide most instruction in the local vernacular, and teach English as a second language as a specific topic.

France, by contrast, used language to drive assimilation and effectively “turn” Africans into French people (Cogneau and Moradi 2014; Benavot and Riddle 1988; Garnier and Schafer 2006). Schools required government certification, the hiring of government-certified teachers, and adherence to a government-sanctioned curriculum. All instruction was in French only. The colonial state was the primary educator, and only those who could navigate the administrative and cost barriers received an education.

These divergent approaches led to significant disparities in school enrollment and literacy levels in the colonies, with higher school enrollments and literacy in the less centralized British-format system, compared to the more centralized French system (Cogneau and Moradi 2014; Benavot and Riddle 1988; Garnier and Schafer 2006).

Modern-day ramifications of colonial language on the Covid-19 crisis in Mali

In Mali, which was colonized by France for 68 years, French remains the official language. Yet only 20% of the population have mastered it, due to the high costs of and barriers to educational resources (Mingat and Suchaut 2000; ArcGIS 2020). Most Malians are multilingual, and the majority of them speak Bambara, the primary language of the predominant ethnic group (Mingat and Suchaut 2000). Due to a paucity of information about Covid-19 in Bambara, those 15.2 million Malians with fluency in Bambara but not
French have limited access to critical public health information, such as viral transmission modes, use of personal protective equipment, movement restrictions, quarantine measures, and social distancing protocols. Absent the capacity to widely disseminate crucial, novel information, efforts to combat Covid-19 in some of the most vulnerable and disenfranchised Malian communities continue to be challenging.

Using AI to improve health information

In this section we present our strategy to improve Bambara language resources. We begin with leveraging emergent neural machine translation technology, which relies on aligning corresponding text from Bambara and French. Then, we describe our preliminary study, which uses a relatively small amount of data and helps identify the challenges of human translation for Bambara and similar, primarily spoken, languages. Finally, we discuss the importance of crowdsourcing and the development of our neural machine translation system.

Proposed Solution

Text alignment is a process that creates a correspondence between a ground-truth translation and a newly generated translation. In situations like this with low-resource languages, alignment begins by using a trained Bambara to French translator on a data set of Bambara to French sentences to create a loose correlation between the sets. From there, an automated aligner processes the translated French sentences and the ground-truth French to create an "alignment".

Word alignment models (Och and Ney 2004) are very important in neural and statistical MT pipelines. Poor alignment performance tends to lead to poor MT performance. Several studies have investigated the relation between high-quality word alignment and MT quality in terms of automatic metrics such as BLEU scores (Fraser and Marcu 2007). Obtaining high-quality word alignment depends on the availability of suitable (often large) parallel corpora, which is a known challenge for low-resource languages like Bambara. There have been studies proposing methods to improve word alignment models for low-resource language pairs (Xiang, Deng, and Zhou 2010; McCoy and Frank 2018), including the use of a resource-richer pivot language to improve word alignment between a low-resource pair (triangulation) (Levinboim and Chiang 2015). However, to the best of our knowledge, there have been no studies addressing Bambara specifically.

As noted, building Bambara language capacity in Mali via MT requires constructing Bambara-language information from source data in another language (ACALAN 2020). Quickly scaling MT technology, however, depends on sufficient amounts of translated text from source to target language to train the translation system before it can achieve state-of-the-art levels. Bambara lacks such training data and has been considered (from the perspective of MT training data) a low-resource language (Wu et al. 2016).

Thus, MT technology that uses a humans-in-the-loop approach can engage local Malians to bridge the language divide. Using crowdsourcing platforms, Malians can be recruited to translate small amounts of Bambara to French (and vice versa). This crowdsourcing process can create the training data necessary for implementing MT technology (Wu et al. 2016; Leventhal et al. 2020). Crowdsourcing begins the digital data development cycle aimed at transitioning Bambara out of the low-resource language category. Highly digitally-resourced languages leverage sufficient data to improve the quality of their automated translations. This transition would also reduce unnecessary burdens placed on local governments that are plagued by the devastating Covid-19 pandemic, whilst still reeling from the effects of colonialism.

There have been many attempts to use machine translation for Covid-19 response (Way et al. 2020; TAUS 2020; without borders 2020; Project 2020), but only the last two of these, Translators without Borders (without borders 2020) and The Endangered Languages Project (Project 2020), consider African languages. We see these efforts as motivation for bottom-up solutions through crowdsourcing so that the same success in MT modeling can be achieved for Bambara. Broadly, our goal is to use Bambara as a test case for modeling best practice for future initiatives in low-resource language data collection, crowdsourced labor training and annotation, and high-quality NMT model building.

Preliminary study

We undertook a preliminary study of NMT, collecting data and creating a model to translate between Bambara and French and English. The goal of this work was not only to elucidate the challenges of NLP for this particular language and, in general, for under-resourced languages, but also to gather data for the preparation of a full-scale attack on the problem. This work is described in more detail in Luger, Homan, and Tapo (2020) and Leventhal et al. (2020).

Data Collection and Preparation

The data for our initial study consists of a dictionary dataset from SIL Mali¹ with examples of sentences used to demonstrate word usage in Spanish, French, English, and Bambara, and a tri-lingual health guide titled “Where there is no doctor”².

Data preparation, including alignment, proved to be about 60% of the overall time spent in person-hours on the experiment and required on-the-ground organization and recruitment of skilled volunteers in Mali.

Most of the dictionary examples of expressions in Bambara are formatted as dictionary entries followed by their translations in French and in English. Most of these are single sentences, so there is sentence-to-sentence alignment in the majority of cases. However, there remains a sufficient number of exceptions to render automated pairing impossible. Part of the problem lies in the unique linguistic and cultural elements of the bambaraphone environment; it is often not possible to meaningfully translate an expression in Bambara without giving an explanation of the context.

¹ https://www.sil-mali.org/en/content/introducing-sil-mali
² https://gafe.dokotoro.org/
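As an illustration of the pairing problem described for the dictionary data, the sketch below routes entries either to automatic pairing or to human review. It is not the authors' alignment tool: the `DictionaryEntry` layout, the naive sentence splitter, and the placeholder strings are all assumptions made for this example, and a real splitter would need to handle Bambara orthography.

```python
import re
from dataclasses import dataclass


@dataclass
class DictionaryEntry:
    """One dictionary example (hypothetical layout, not the SIL Mali format)."""
    bambara: str
    french: str


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def triage(entries):
    """Pair one-sentence entries automatically; send the rest to human review."""
    auto_pairs, needs_review = [], []
    for entry in entries:
        bm = split_sentences(entry.bambara)
        fr = split_sentences(entry.french)
        if len(bm) == 1 and len(fr) == 1:
            auto_pairs.append((bm[0], fr[0]))
        else:
            # Explanatory or multi-sentence translations need a human aligner.
            needs_review.append(entry)
    return auto_pairs, needs_review


# Placeholder text only; these are not real dictionary examples.
entries = [
    DictionaryEntry("BM sentence one.", "FR sentence one."),
    DictionaryEntry("BM sentence two.", "FR sentence two, part a. FR sentence two, part b."),
]
auto_pairs, needs_review = triage(entries)
print(len(auto_pairs), "paired automatically;", len(needs_review), "sent to review")
```

A sentence-count check like this only identifies the easy cases. Entries whose translation embeds cultural context would still pass through the human reviewers described in the text.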
The medical health guide is aligned by chapters, each of which is roughly aligned by paragraphs. But at the paragraph level there are too many exceptions for automated pairing to be feasible. Further, at the sentence level many of the bambaraphone-specific problems found in the dictionary dataset are present here, particularly in explanations of concepts that can be succinctly expressed in English or French but for which Bambara lacks terminology and the bambaraphone environment lacks an equivalent physical or cultural context.

Both datasets required manual alignment by individuals fluent in written Bambara and either French or English, and able to exercise expert-level judgment on linguistic and, occasionally, medical questions. Access to such human expertise was a major factor limiting the quantity of data we were able to align. We implemented a software alignment tool to manually align sentences and to save those sentence pairs that a human editor considered properly aligned. In separate tasks, four annotators with a middle school level understanding of Bambara performed alignment on French-Bambara and English-Bambara sentence pairs using the tool.

The final dataset contains 2,146 parallel sentences of Bambara-French and 2,158 parallel sentences of Bambara-English, a tiny amount of data for NMT compared to massive state-of-the-art models that are trained on millions of sentences (Arivazhagan et al. 2019).

Thus, our NMT is a transformer (Vaswani et al. 2017) of appropriate size for a relatively smaller training dataset (van Biljon, Pretorius, and Kreutzer 2020). It has six layers with four attention heads for encoder and decoder; the transformer layer has a size of 1024, the hidden layer a size of 256, and the embeddings have 256 units. Embeddings and vocabularies are not shared across languages, but the softmax layer weights are tied to the output embedding weights. The model is implemented with the Joey NMT framework (Kreutzer, Bastings, and Riezler 2019) based on PyTorch (Paszke et al. 2019).

Training runs for 120 epochs in batches of 1024 tokens each. The ADAM optimizer (Kingma and Ba 2014) is used with a constant learning rate of 0.0004 to update model weights. This setting yielded the highest BLEU (Papineni et al. 2002) compared to decaying or warmup-cooldown learning rate scheduling. For regularization, we experimented with dropout and label smoothing. The best values were 0.1 for dropout and 0.2 for label smoothing across the board. For inference, beam search with a width of 5 is used. The remaining hyperparameters are documented in the Joey NMT configuration files.

Neural Machine Translation Results

Translation results were evaluated both automatically and with human evaluators. We obtained BLEU scores of approximately 20 for our best model. BLEU, or “bilingual evaluation understudy,” measures the text output of automated machine translation, with higher scores indicating output closer to that of a professional human translator (Papineni et al. 2002).

Two human evaluators, native speakers of Bambara and self-assessed to be fluent in English and French, evaluated a random sampling of 41 translations of Bambara, 21 into English and 20 into French. The evaluators did not collaborate with each other. The evaluators were asked to assess several aspects of the translations, including identifying specific parts that were well or poorly translated. Finally, the evaluators were asked to identify those translations that succeeded in conveying most of the meaning of the Bambara source, and to assign them a quality score. Of these 41 sentences, one evaluator classified 8 sentences as nearly perfect or very good while the second gave 17 this rank. All 8 of the first evaluator’s translations were selected by the second. The Cohen's kappa score of the pair is 0.5141, indicating moderate agreement (Viera and Garrett 2005).

Our analysis suggests that we did not provide sufficient guidance as to what constitutes an acceptable translation to our human Bambara evaluators. Further, one evaluator was simply more lenient than the other in what they deemed acceptable for meeting the subjective label of “nearly perfect or very good translation”. Moreover, we had difficulty formulating translation criteria due to limited experience with human translation of Bambara, in addition to the ab initio nature of this experiment with machine translation of Bambara. Moving forward, our results will inform the development of more rigorous criteria in future experiments.

Conclusion

Our study constitutes the first attempt at modeling automatic translation for the low-resource language of Bambara. We identified challenges for future work, such as the development of alignment tools for small-scale datasets, the need for a general domain evaluation set, and better training of human translation evaluators. The current limitation of processing written text as input might furthermore benefit from the integration of spoken resources through speech recognition or speech translation, since Bambara is primarily spoken and the lack of standardization in writing complicates the creation of clean reference sets and consistent evaluation.

Future Work

Moving forward we would like to take advantage of the human-in-the-loop approach described here to create more resources to improve word alignment and MT systems for low-resource languages in general and Bambara in particular. Another avenue we would like to explore is the use of monolingual data. The health care domain is rich in resources for English (e.g. UMLS³, SNOMED⁴, NCBO’s BioPortal⁵) and such monolingual data can be used to improve the performance of MT systems on the English side of the English-Bambara translation pair (Burlot and Yvon 2019). Finally, the use of term banks, either manually or automatically compiled, is another under-explored avenue for low-resource languages (Haque, Penkale, and Way 2014), which we believe can be particularly helpful for technical domains such as medicine and health care.

³ https://www.nlm.nih.gov/research/umls/index.html
⁴ http://www.snomed.org/
⁵ https://bioportal.bioontology.org/
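As a note on the inter-evaluator agreement reported in the results, the sketch below computes Cohen's kappa from the counts given in the text, under the assumption that each translation is collapsed to a binary label ("nearly perfect or very good" versus everything else). That gives 8 sentences selected by both evaluators, 0 by the first only, 9 by the second only, and 24 by neither. The function name and the binary collapse are choices made for this example, not the authors' procedure.

```python
def cohen_kappa_binary(both_yes, only_a, only_b, both_no):
    """Cohen's kappa for two raters and two categories, from a 2x2 count table."""
    n = both_yes + only_a + only_b + both_no
    observed = (both_yes + both_no) / n      # proportion of items the raters agree on
    a_yes = (both_yes + only_a) / n          # rater A's rate of "yes"
    b_yes = (both_yes + only_b) / n          # rater B's rate of "yes"
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)  # agreement expected by chance
    return (observed - expected) / (1 - expected)


# Counts derived from the text: 41 sentences, 8 vs. 17 selected, all 8 shared.
kappa = cohen_kappa_binary(both_yes=8, only_a=0, only_b=9, both_no=24)
print(f"kappa = {kappa:.4f}")
```

Under this binary assumption the value comes out at about 0.51, in the same "moderate agreement" band as the reported 0.5141. The small difference may reflect a finer-grained scoring scheme in the original analysis, which the text does not specify.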
   In addition, we have made data sets, including aligned and      equipment. https://reliefweb.int/report/mali/support-malis-
annotated French and Bambara sentence pairs available to           COVID-19-response-plan-united-nations-hands-over-48-
the machine translation and AI for Good community: Bam-            tons-medical-supplies.
bara Data Repository6 . Please reach out to us regarding           Fraser, A., and Marcu, D. 2007. Measuring word alignment
these low-resource language resources as we are attempting         quality for statistical machine translation. Computational
to make as much of our research as possible available to the       Linguistics 33(3):293–303.
community.
                                                                   Garnier, M., and Schafer, M. 2006. Educational model and
                                                                   expansion of enrollments in sub-saharan africa. Sociology
of Education - SOCIOL EDUC 79:153–176.

                      Acknowledgments
We would like to thank Julia Kreutzer, Arthur Nagashima,
the Masakhane machine translation for Africa community,
and SIL Mali (footnote 7). This work would not have been
possible without their valuable insight and contributions to
ongoing progress in this field. Earlier versions of this work
are described in Luger, Homan, and Tapo (2020) and
Leventhal et al. (2020).

   6 https://github.com/israaar/mt_bambara_data
   7 https://www.sil-mali.org/en/content/introducing-sil-mali

                           References
ACALAN. 2020. African Academy of Languages, African
Union. Historical background.
https://acalan-au.org/aboutus.php.
Angwin, J.; Larson, J.; Mattu, S.; and Kirchner, L. 2016.
Machine bias. ProPublica, May 23:2016.
ArcGIS. 2020. Mali languages.
https://www.arcgis.com/home/item.html?id=b5b4f736b5714f32b12a0322e5405734.
Arivazhagan, N.; Bapna, A.; Firat, O.; Lepikhin, D.;
Johnson, M.; Krikun, M.; Chen, M. X.; Cao, Y.; Foster, G.;
Cherry, C.; Macherey, W.; Chen, Z.; and Wu, Y. 2019.
Massively multilingual neural machine translation in the
wild: Findings and challenges.
Benavot, A., and Riddle, P. 1988. The expansion of primary
education, 1870-1940: Trends and issues. Sociology of
Education 61:191.
Bughin, J.; Seong, J.; Manyika, J.; Chui, M.; and Joshi, R.
2018. McKinsey Global Institute notes from the AI frontier:
Modeling the impact of AI on the world economy. McKinsey
Global Institute, September 1:64.
Buolamwini, J., and Gebru, T. 2018. Gender shades:
Intersectional accuracy disparities in commercial gender
classification. In Friedler, S. A., and Wilson, C., eds.,
Conference on Fairness, Accountability and Transparency,
FAT 2018, 23-24 February 2018, New York, NY, USA,
volume 81 of Proceedings of Machine Learning Research,
77–91. PMLR.
Burlot, F., and Yvon, F. 2019. Using monolingual data
in neural machine translation: a systematic study. arXiv
preprint arXiv:1903.11437.
Cogneau, D., and Moradi, A. 2014. British and French
educational legacies in Africa.
https://voxeu.org/article/british-and-french-educational-legacies-africa.
United Nations Office for the Coordination of Humanitarian
Affairs. 2020. Support for Mali’s COVID-19 response plan:
The United Nations hands over 48 tons of medical supplies and
Haque, R.; Penkale, S.; and Way, A. 2014. Bilingual
termbank creation via log-likelihood comparison and
phrase-based statistical machine translation. In Proceedings
of the 4th International Workshop on Computational
Terminology (Computerm), 42–51.
Kingma, D. P., and Ba, J. 2014. Adam: A method for
stochastic optimization. arXiv preprint arXiv:1412.6980.
Kreider, K. 2020. Why the Western system of COVID-19
response won’t work in Africa.
https://theowp.org/reports/why-the-western-system-of-COVID-19-response-wont-work-in-africa/.
Kreutzer, J.; Bastings, J.; and Riezler, S. 2019. Joey NMT:
A minimalist NMT toolkit for novices. In Proceedings of
the 2019 Conference on Empirical Methods in Natural
Language Processing and the 9th International Joint
Conference on Natural Language Processing (EMNLP-IJCNLP):
System Demonstrations, 109–114. Hong Kong, China:
Association for Computational Linguistics.
Leventhal, M.; Tapo, A.; Luger, S.; Zampieri, M.; and
Homan, C. M. 2020. Assessing human translations from
French to Bambara for machine learning: a pilot study. arXiv
preprint arXiv:2004.00068.
Levinboim, T., and Chiang, D. 2015. Multi-task word
alignment triangulation for low-resource languages. In
Proceedings of the 2015 Conference of the North American
Chapter of the Association for Computational Linguistics:
Human Language Technologies, 1221–1226.
Luger, S.; Homan, C. M.; and Tapo, A. 2020. Towards a
crowdsourcing platform for low resource languages: A
collectivist approach. AAAI: Human Computation (HCOMP).
McCoy, R. T., and Frank, R. 2018. Phonologically informed
edit distance algorithms for word alignment with
low-resource languages. Proceedings of the Society for
Computation in Linguistics 1(1):102–112.
Mingat, A., and Suchaut, B. 2000. Les systèmes éducatifs
africains. Une analyse économique comparative.
https://www.scirp.org/(S(351jmbntvnsjt1aadkposzje))/reference/ReferencesPapers.aspx?ReferenceID=2215667.
Och, F. J., and Ney, H. 2004. The alignment template
approach to statistical machine translation. Computational
Linguistics 30(4):417–449.
Papineni, K.; Roukos, S.; Ward, T.; and Zhu, W.-J. 2002.
BLEU: a method for automatic evaluation of machine
translation. 311–318.
Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.;
Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga,
L.; Desmaison, A.; Kopf, A.; Yang, E.; DeVito, Z.; Raison,
M.; Tejani, A.; Chilamkurthy, S.; Steiner, B.; Fang, L.; Bai,
J.; and Chintala, S. 2019. PyTorch: An imperative style,
high-performance deep learning library. In Wallach, H.;
Larochelle, H.; Beygelzimer, A.; d’Alché Buc, F.; Fox, E.;
and Garnett, R., eds., Advances in Neural Information
Processing Systems 32. Curran Associates, Inc. 8026–8037.
Endangered Languages Project. 2020. COVID-19 information
in indigenous, endangered, and under-resourced languages.
https://endangeredlanguagesproject.github.io/COVID-19/.
TAUS. 2020. Corona corpus. https://md.taus.net/corona.
van Biljon, E.; Pretorius, A.; and Kreutzer, J. 2020. On
optimal transformer depth for low-resource language
translation. “AfricaNLP” Workshop at the 8th International
Conference on Learning Representations.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones,
L.; Gomez, A. N.; Kaiser, Ł.; and Polosukhin, I. 2017.
Attention is all you need. In Advances in Neural Information
Processing Systems (NeurIPS).
Viera, A., and Garrett, J. 2005. Understanding interobserver
agreement: the kappa statistic. Fam Med 37(5):360–3.
Way, A.; Haque, R.; Xie, G.; Gaspari, F.; Popovic, M.; and
Poncelas, A. 2020. Facilitating access to multilingual
COVID-19 information via neural machine translation. arXiv
preprint arXiv:2005.00283.
Translators without Borders. 2020. TWB glossary for COVID-19.
https://translatorswithoutborders.org/twb-glossary-for-covid-19/.
Wu, Y.; Schuster, M.; Chen, Z.; Le, Q. V.; Norouzi, M.;
Macherey, W.; Krikun, M.; Cao, Y.; Gao, Q.; Macherey, K.;
Klingner, J.; Shah, A.; Johnson, M.; Liu, X.; Kaiser, Ł.;
Gouws, S.; Kato, Y.; Kudo, T.; Kazawa, H.; Stevens, K.;
Kurian, G.; Patil, N.; Wang, W.; Young, C.; Smith, J.; Riesa,
J.; Rudnick, A.; Vinyals, O.; Corrado, G.; Hughes, M.; and
Dean, J. 2016. Google’s neural machine translation system:
Bridging the gap between human and machine translation.
Xiang, B.; Deng, Y.; and Zhou, B. 2010. Diversify and
combine: Improving word alignment for machine translation
on low-resource languages. In Proceedings of the ACL 2010
Conference Short Papers, 22–26.