=Paper= {{Paper |id=Vol-3151/short4 |storemode=property |title=Analyzing the composition of remedies in ancient pharmacopeias with FCA |pdfUrl=https://ceur-ws.org/Vol-3151/short4.pdf |volume=Vol-3151 |authors=Agnès Braud,Xavier Dolques,Pierre Fechter,Nicolas Lachiche,Florence Le Ber,Véronique Pitchon |dblpUrl=https://dblp.org/rec/conf/icfca/BraudDFLBP21 }} ==Analyzing the composition of remedies in ancient pharmacopeias with FCA== https://ceur-ws.org/Vol-3151/short4.pdf
      Analyzing the composition of remedies in
         ancient pharmacopeias with FCA

      Agnès Braud (1), Xavier Dolques (1), Pierre Fechter (2), Nicolas Lachiche (1),
                  Florence Le Ber (1), and Véronique Pitchon (3)

(1) Université de Strasbourg, CNRS, ENGEES, ICube UMR 7357, F67000 Strasbourg
                {agnes.braud,dolques,nicolas.lachiche}@unistra.fr
                          florence.leber@engees.unistra.fr
      (2) Université de Strasbourg, CNRS, ESBS UMR 7242, F67000 Strasbourg
                                 p.fechter@unistra.fr
   (3) Université de Strasbourg, CNRS, Archimede UMR 7044, F67000 Strasbourg
                                  pitchon@unistra.fr



       Abstract. This paper presents collaborative work conducted in an
       interdisciplinary project studying remedies in medieval Arabic
       pharmacopeias. The goal is to find new molecules in plants, or
       combinations of active principles, that could substitute for
       antibiotics, which are reaching their limits. Formal Concept Analysis
       is used to discover co-occurrences of ingredients. We describe the
       difficulties inherent to these data and the results of preliminary
       analyses.


1    Introduction

This paper presents preliminary work conducted in an interdisciplinary project
that aims at studying remedies in medieval Arabic pharmacopeias. The project
gathers researchers in history, microbiology, botany and computer science. The
general goal is to discover the active ingredients in remedies and to test them
experimentally on bacteria. The extracted knowledge could be used to find
molecules in plants, or combinations of active principles, that could substitute
for antibiotics, which are reaching their limits.
    In this work, we focus on medieval remedies prescribed for urinary problems.
We present a dataset of remedies extracted from 5 pharmacopeias. This dataset
is rather small, but its modeling is not straightforward. Although the database
is not yet complete, some analyses can already be done on the composition of
remedies. Formal Concept Analysis (FCA) [3] appears particularly suitable for
this task, and the preliminary results we obtained with it were found very
interesting by historians and microbiologists.
    Section 2 presents the process for carrying out the project, and in
particular data collection. Section 3 describes the data, their characteristics
and the difficulties resulting from them. In Sect. 4, we present our preliminary
analyses and show the benefits of using FCA in this context. Finally, Sect. 5
draws some conclusions.
RealDataFCA'2021: Analyzing Real Data with Formal Concept Analysis, June 29,
2021, Strasbourg, France
       Copyright © 2021 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).
2   Process overview

The whole process of this research work is presented in Fig. 1.




                      Fig. 1. Overview of the project process


    The first step is to select pharmacopeias and, within them, remedies. This
is done by a historian who reads Arabic, on the basis of the symptoms indicated
in the description of each remedy and of her knowledge of Arabic medicine. Note
that Arabic medicine is based on the theory of humours, which considers that the
body is constituted of four humours, associated with the four elements (water,
fire, earth, air) and having some qualities (hot, cold, dry, wet). The second
step consists in clarifying, with the help of a botanist, some plants which are
not described clearly enough in the text. The next step is to build a database
to store the remedies, whose modeling difficulties will be discussed in the next
section. Then comes the formulation of questions that experts have on the data
and that can be answered using FCA. The analysis of lattices and rules may bring
new insights or questions both in history and on the use of plants in remedies.
Some hypotheses on the use of plants stemming from this analysis will be studied
in collaboration with a botanist (for the identification and characterization of
therapeutic properties of plants) and a pharmacologist studying drugs based on
natural substances (for the identification of the active molecules in plants),
and then tested on bacteria by microbiologists.
    In this project, 5 pharmacopeias have been selected:

 – two versions of the pharmacopeia of Sābūr ibn Sahl (9th century). Sābūr ibn
   Sahl was born in Jundishāpūr (Iran) and was chief of the hospital there before
   moving to Iraq. A version of the pharmacopeia [6] was for the pharmacists
   of ‘Aḍudī’s hospital in Baghdad (this pharmacopeia will be designated by
   ISA), and a shorter one [4] for pharmacists outside of the hospital who were
   considered less skillful (designated by ISP);
 – the pharmacopeia of Al-Kindī (9th century) [8], who studied in Bassora and
   Baghdad. This pharmacopeia will be designated by AK;
 – the pharmacopeia of Al-Tilmīdh (11th-12th centuries) [5], who was chief of
   ‘Aḍudī’s hospital in Baghdad. This pharmacopeia will be designated by AT;
 – the pharmacopeia of Al-Samarqandī [9]. This pharmacopeia will be
   designated by AS.

Fig. 2. Example of pages of manuscripts - Left: Dioscorides Pedanius of Anazarbos,
Kitab Disqūrīdis (Dioscorides’s book), circa 1100–1200, National Library of France.
Right: Al-Kindī Folio 98r (photo credit: Aya Sofia, Istanbul, Ms. N° 3603)

    The historian based her choice first on the availability of books describing
the pharmacopeias, then on a preference for works from around Baghdad, because
of her knowledge of this area and because Arabic pharmacopeias have a very rich
content, and finally on covering a long period to obtain a wide historical view.
A few other criteria are linked to interesting historical questions.
    Pharmacopeias are available as translations and/or original manuscripts.
Figure 2 shows pages of two manuscripts. Both translations and original
manuscripts may contain errors: a translation may have to be checked against an
original manuscript, and a manuscript may contain errors made by copyists, so
that several different manuscripts are necessary. Among several hundred
remedies, 38 prescribed for urinary symptoms were found. Among these, we
removed 3 for this work: 1 from ISA which was already in ISP with exactly the
same recipe, and 2 universal remedies with more than 50 ingredients (1 in ISP,
1 in ISA).
    In the following, remedy identifiers will be prefixed by the acronym of the
pharmacopeia in which they appear.


3   Data characteristics
The dataset that has been collected is rather small, but its collection requires
a lot of time and the data are actually complex. The description of a remedy is
composed of a list of ingredients with quantities (see Table 1 for an example)
and a description of the preparation. Note that some of the ingredients are
present in the recipe to treat the disease, while others may be present to
counter the side effects of the former, but this is not indicated.


                      Table 1. List of ingredients of a remedy

     - Armenian borax         - Ginger                 - Peeled sweet almonds
     - Kerman cumin           - White pepper           - Rue leaves
     - Parsley                - Scammony               - Dates from Hairūn




    Ingredients belong to different categories: plant, mushroom, mineral, animal
(like honey or musk) and metal. The description of plants is not homogeneous:
 – plant descriptions correspond to different taxonomic ranks (e.g. species
   for Scammony, genus for Euphorbia, several families for Fern);
 – some descriptions correspond to a plant, others to a part of a plant, or even
   a part of a part of... a plant (e.g., Acorn/Acorn shell/Inside peel of acorn
   shell, or Thymus without any part although just a part is probably used).
   Note that several parts of the same plant can be used in a remedy;
 – some ingredients have associated transformations (e.g., Peeled and
   grilled bitter almonds);
 – some ingredients are a mix of ingredients (e.g., Persian za’atar);
 – a few ingredients have an origin (e.g., Kerman cumin), which is important
   information since it usually reflects a high quality;
 – some are still to be determined (e.g., Idrūmaghmū).
    Data modeling requires expert knowledge to determine which information is
important to capture; for example, should Acorn and Inside peel of acorn shell
be comparable? Moreover, for work on active principles, having a precise
description of plants is important, as different species may have different
properties. The exact species may be implicit for the author, who is used to a
specific one or to the local one. More precise information on the species can be
inferred by checking the translation, by searching for contextual information in
books, or by relying on illustrations present in the book. In this last case,
the botanist may help, as well as with knowledge of the parts of plants
traditionally used. In the end, the discussions on data modeling have raised new
questions. For example, after a discussion on the problem of representing acorn
and the different parts used, the historian decided to study further in books
the use of oak.
    For all these reasons, a relational database is not well suited to storing
these data. A graph database offers more flexibility to represent them and to
adapt the representation to new cases. We have thus chosen this type of
database. For the same reasons, building a context for FCA requires choosing
which information to retain.
    Currently, the data on which we work are drawn from the translations of the
texts. The work to clarify species and parts of plants is in progress, led by
the historian and the botanist, and will take some time, but it is already
possible to start some analyses.


4   Preliminary analyses

For our preliminary analyses on the data, we have worked on two questions
asked by the microbiologist: “which ingredients appear the most often?” (Q1)
and “which ingredients often appear together?” (Q2). As shown in previous
works [2], FCA-based approaches are well suited to this type of question.
    We have worked both on the ingredients as they appear in the recipes, and
on a variant in which each ingredient coming from a plant is replaced by the
corresponding plant alone (without part or transformation, such as Oak for
Acorn shell or Almond tree for Peeled sweet almonds). The plant can be at the
level of species, genus, family, etc. Let R denote the set of remedies, I the
set of ingredients without modification and IP the set of ingredients in which
those coming from plants are replaced by the plant. Let contains ⊆ R × I and
containsP ⊆ R × IP be the two incidence relations describing the composition of
remedies. We have thus built one context CI = (R, I, contains) and another
context CIP = (R, IP, containsP).
Recall that we work on 35 remedies concerning urinary symptoms. They are
distributed in the 5 pharmacopeias as follows: 3 in AK, 5 in AS, 7 in AT, 6
in ISA and 14 in ISP. Their recipes contain between 2 and 21 ingredients, giving
170 different ingredient descriptions in CI and 144 attributes in CIP. The
lattice obtained from CI (LCI) contains 138 concepts, the one obtained from CIP
(LCIP) 151.
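
    The construction of the two contexts can be illustrated with a short Python
sketch. The three remedies R1 to R3 and their recipes are hypothetical and serve
only as an illustration; the ingredient-to-plant mapping reuses examples from
the text (oak parts, Celery seeds). The function names are assumptions
introduced here, and the concept enumeration is a naive closure under
intersection, adequate for a small context but not necessarily the algorithm
used in the project.

```python
# Hypothetical toy context CI: remedy identifier -> set of ingredients.
CI = {
    "R1": {"Acorn", "Indian nard"},
    "R2": {"Acorn shell", "Indian nard"},
    "R3": {"Grilled acorn shell", "Celery seeds"},
}

# Ingredient -> plant, following the examples given in the text.
PLANT_OF = {
    "Acorn": "Oak",
    "Acorn shell": "Oak",
    "Grilled acorn shell": "Oak",
    "Celery seeds": "Celery",
}


def to_plant_level(context, plant_of):
    """Build CIP: replace each plant-derived ingredient by its plant."""
    return {
        remedy: {plant_of.get(ingredient, ingredient) for ingredient in ingredients}
        for remedy, ingredients in context.items()
    }


def concepts(context):
    """Enumerate all formal concepts as (extent, intent) pairs.

    Intents are exactly the intersections of object rows, plus the full
    attribute set, so we close the family of intents under intersection.
    """
    all_attributes = frozenset().union(*context.values())
    intents = {all_attributes}
    for ingredients in context.values():
        intents |= {intent & ingredients for intent in intents}
    result = []
    for intent in intents:
        extent = frozenset(r for r, ingr in context.items() if intent <= ingr)
        result.append((extent, intent))
    return result


CIP = to_plant_level(CI, PLANT_OF)

if __name__ == "__main__":
    for name, context in (("CI", CI), ("CIP", CIP)):
        found = concepts(context)
        print(name, "has", len(found), "concepts")
        for extent, intent in sorted(found, key=lambda c: -len(c[0])):
            print("  ", sorted(intent), "<-", sorted(extent))
```

On this toy data, the three oak parts are distinct attributes in CI, whereas in
CIP the single attribute Oak is shared by all three remedies, which mirrors the
effect discussed for Oak in the analysis of Q1.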
    Starting with question Q1 (“which ingredients appear the most often?”), we
study the lattices and in particular the concepts with the biggest extents. Table 2
shows some of the concepts obtained in LCI and the ones in LCIP corresponding
to the same ingredients but considering the plant. The extent is presented so that
the different pharmacopeias are separated. Concept ({Celery seeds}, {AK114,
AT16, AT20, AT72, ISA137, ISA162, ISP100, ISP16, ISP185, ISP186, ISP78,
ISP9}) from LCI reveals that Celery seeds is the ingredient that appears the
most often (12 times) in remedies. Then Indian nard appears 9 times, Saffron
and Black pepper 7 times. Of course, it is possible to know which ingredients
appear most often by computing statistics in the database, but the extent of
the concept also shows that Celery seeds appears in 4 out of 5 pharmacopeias.
Similarly, lattice LCIP reveals Celery as appearing the most often in remedies.
There is however a small change, with Oak appearing with 8 occurrences, while
Acorn appeared only 6 times in LCI. As can be seen in Table 2, this is due
to the use of different parts of oak with different transformations (Acorn in 6
remedies, Acorn shell and Grilled inside peel of acorn shell in remedy AT137,
Grilled acorn shell in AT80). There are thus 3 concepts introducing oak parts in
LCI. Moreover, among the 6 remedies in the extent of the concept introducing
Acorn shell, 2 are from the ISA pharmacopeia and none from ISP, although both
are from the same author. This may be due to the fact that ISP was for
pharmacists outside of the hospital, who were considered less skillful than
those from the hospital, and acorn shell is toxic and thus should be used
carefully.
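
    The reading of extents described above can be reproduced with a few lines
of Python. The sketch below takes the extents of four LCI concepts as listed in
Table 2 and derives, for each ingredient, its number of remedies and the
pharmacopeias in which it appears. It assumes that the pharmacopeia acronym is
the leading run of capital letters of a remedy identifier; the function names
are introduced here for illustration only.

```python
import re

# Extents of four attribute concepts of LCI, copied from Table 2.
EXTENTS = {
    "Celery seeds": ["AK114", "AT16", "AT20", "AT72", "ISA137", "ISA162",
                     "ISP100", "ISP16", "ISP185", "ISP186", "ISP78", "ISP9"],
    "Indian nard": ["AT137", "AT72", "AT80", "ISA137", "ISA162", "ISA261",
                    "ISP16", "ISP27", "ISP78"],
    "Saffron": ["AT16", "ISP16", "ISP186", "ISP27", "ISP6", "ISP70", "ISP9"],
    "Black pepper": ["AS78", "ISA137", "ISA261", "ISP47", "ISP54", "ISP6",
                     "ISP70"],
}


def pharmacopeia(remedy_id):
    """Return the acronym prefix of a remedy identifier (assumed convention)."""
    return re.match(r"[A-Z]+", remedy_id).group(0)


def summarize(extents):
    """List (ingredient, number of remedies, pharmacopeias), most frequent first."""
    rows = []
    for ingredient, remedies in extents.items():
        sources = sorted({pharmacopeia(r) for r in remedies})
        rows.append((ingredient, len(remedies), sources))
    rows.sort(key=lambda row: -row[1])  # stable sort keeps ties in input order
    return rows


if __name__ == "__main__":
    for ingredient, count, sources in summarize(EXTENTS):
        print(f"{ingredient}: {count} remedies in {len(sources)} pharmacopeias {sources}")
```

Celery seeds comes first, with 12 remedies spread over AK, AT, ISA and ISP,
which matches the observation that it appears in 4 out of 5 pharmacopeias.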


                 Table 2. Examples of concepts in LCI and LCIP

            Concepts of LCI                                Concepts of LCIP
Intent                    Extent (cardinality)        Intent        Extent (cardinality)

Celery seeds              AK114,                      Celery        AK114,
                          AT16, AT20, AT72,                         AT16, AT20, AT72,
                          ISA137, ISA162,                           ISA137, ISA162,
                          ISP100, ISP16, ISP185,                    ISP100, ISP16, ISP185,
                          ISP186, ISP78, ISP9 (12)                  ISP186, ISP78, ISP9 (12)

Indian nard               AT137, AT72, AT80,          Indian nard   AT137, AT72, AT80,
                          ISA137, ISA162, ISA261,                   ISA137, ISA162, ISA261,
                          ISP16, ISP27, ISP78 (9)                   ISP16, ISP27, ISP78 (9)

Saffron                   AT16,                       Saffron       AT16,
                          ISP16, ISP186, ISP27,                     ISP16, ISP186, ISP27,
                          ISP6, ISP70, ISP9 (7)                     ISP6, ISP70, ISP9 (7)

Black pepper              AS78,                       Black pepper  AS78,
                          ISA137, ISA261,                           ISA137, ISA261,
                          ISP47, ISP54, ISP6,                       ISP47, ISP54, ISP6,
                          ISP70 (7)                                 ISP70 (7)

Acorn                     AS124b, ASN39,              Oak           AS124b, ASN39,
                          AT137, AT81,                              AT137, AT377, AT80,
                          ISA125, ISA231 (6)                        AT81,
                                                                    ISA125, ISA231 (8)

Cypress, Olibanum bark,   AT137 (1)
Olibanum, Lavender,
Cumin, Acorn shell,
Grilled inside peel of
acorn shell, Indian nard

Cypress, Olibanum bark,   AT80 (1)
Olibanum, Lavender,
Grilled acorn shell,
Indian nard




    For question Q2 (“which ingredients often appear together?”), association
rules are a good starting point. We computed those with a minimal support of
3 remedies and a minimal confidence of 80%. For context CI, we obtained 15
rules. Among these rules, 10 were implication rules. Three of these implication
rules are given in Table 3.

         Table 3. Examples of implication rules obtained on context CI

              Rule                                      Support
              Celery seeds → Fennel seeds                  7
              Opium → Saffron                              6
              Asarabacca, Indian nard → Celery seeds       4

    One appeared with a support of 7: Celery seeds → Fennel seeds, another
one with a support of 6: Opium → Saffron. With a support of 4, we had 2
implication rules involving 3 ingredients, such as: Asarabacca, Indian nard →
Celery seeds. Context CIP leads to more rules: 24, among which 20 are
implication rules. If we consider the first rule (Celery seeds → Fennel seeds),
we can then explore the subconcepts of the concept introducing Celery seeds to
see which ingredients other than Fennel seeds also appear with it. Figure 3
shows an extract of this part of the lattice. This analysis is interesting for
microbiologists, as it helps them make hypotheses on the respective roles of the
ingredients and on which ones play similar roles. Overall, the experts were very
interested in the results obtained with FCA on this first dataset.
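
    The notions of support and confidence used above can be made concrete with a
minimal Python sketch. It is restricted, as a simplifying assumption, to rules
with a single ingredient on each side, and it runs on a hypothetical context
whose remedies R1 to R5 and ingredients x, y, z are placeholders, not data from
the pharmacopeias. The default thresholds are those stated in the text (support
of at least 3 remedies, confidence of at least 80%); the function names are
introduced here for illustration.

```python
from itertools import permutations

# Hypothetical toy context: remedy identifier -> set of ingredients.
TOY = {
    "R1": {"x", "y"},
    "R2": {"x", "y", "z"},
    "R3": {"x", "y"},
    "R4": {"x", "z"},
    "R5": {"x"},
}


def support(context, items):
    """Number of remedies whose recipe contains every ingredient in items."""
    return sum(1 for ingredients in context.values() if items <= ingredients)


def pair_rules(context, min_support=3, min_confidence=0.8):
    """Return rules a -> b as (a, b, support, confidence) tuples.

    support    = number of remedies containing both a and b
    confidence = support(a and b) / support(a)
    A rule with confidence 1.0 is an implication rule.
    """
    ingredients = sorted(set().union(*context.values()))
    rules = []
    for a, b in permutations(ingredients, 2):
        both = support(context, {a, b})
        if both < min_support:
            continue
        confidence = both / support(context, {a})
        if confidence >= min_confidence:
            rules.append((a, b, both, confidence))
    return rules


if __name__ == "__main__":
    for a, b, supp, conf in pair_rules(TOY):
        kind = "implication" if conf == 1.0 else "association"
        print(f"{a} -> {b} (support {supp}, confidence {conf:.0%}, {kind})")
```

In this toy context only y → x is kept: x and y occur together in 3 remedies and
every remedy containing y also contains x, whereas x → y reaches a confidence of
only 3/5 and is discarded.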


First row (concept introducing Celery seeds):
  Concept_ingredient_136  {Celery seeds}
      U_AK114, U_AT16, U_AT20, U_AT72, U_ISA137, U_ISA162, U_ISP100, U_ISP16,
      U_ISP185, U_ISP186, U_ISP78, U_ISP9

Second row:
  Concept_ingredient_53   {Celery seeds, Peeled bitter almonds}
      U_ISP185, U_ISP186
  Concept_ingredient_121  {Celery seeds, Asarabacca}
      U_AT72, U_ISA137, U_ISA162, U_ISP16, U_ISP185
  Concept_ingredient_126  {Celery seeds, Indian nard}
      U_AT72, U_ISA137, U_ISA162, U_ISP16, U_ISP78
  Concept_ingredient_127  {Celery seeds, Anise}
      U_AK114, U_ISA162, U_ISP16, U_ISP185, U_ISP78
  Concept_ingredient_134  {Celery seeds, Fennel seeds}
      U_AK114, U_AT16, U_AT72, U_ISA162, U_ISP186, U_ISP78, U_ISP9
  Concept_ingredient_119  {Peeled snake melon seeds, Celery seeds}
      U_AT20, U_AT72, U_ISP186, U_ISP9
  Concept_ingredient_46   {Celery seeds, Gum ammoniac}
      U_AT72, U_ISP100
  Concept_ingredient_120  {Celery seeds, Opium, Saffron}
      U_AT16, U_ISP16, U_ISP186, U_ISP9
  Concept_ingredient_39   {Light-colored poppy, Celery seeds}
      U_AT16, U_AT20

Third row:
  Concept_ingredient_113  {Celery seeds, Indian nard, Asarabacca}
      U_AT72, U_ISA137, U_ISA162, U_ISP16
  Concept_ingredient_88   {Celery seeds, Anise, Asarabacca}
      U_ISA162, U_ISP16, U_ISP185
  Concept_ingredient_108  {Celery seeds, Anise, Indian nard}
      U_ISA162, U_ISP16, U_ISP78
  Concept_ingredient_107  {Celery seeds, Indian nard, Fennel seeds}
      U_AT72, U_ISA162, U_ISP78
  Concept_ingredient_106  {Celery seeds, Anise, Fennel seeds}
      U_AK114, U_ISA162, U_ISP78
  Concept_ingredient_110  {Peeled snake melon seeds, Celery seeds, Wine, Fennel seeds}
      U_AT72, U_ISP186, U_ISP9

              Fig. 3. Extract of lattice LCI extracted from context CI



    The work in [1] shares the objective of finding co-occurring ingredients,
but in a different pharmacopeia and for skin, mouth, or eye infections.
It uses community detection in a network of ingredients. FCA is a more direct
process that gives a complete view of the combinations of ingredients. Moreover,
the expert can see the remedies corresponding to each combination and exploit
association rules. An FCA-based approach has been used in a similar project
focusing on the identification of local plants for addressing sanitary problems
in Sub-Saharan Africa [10, 7]. In this project, data represent plants or parts
of plants, and how, and in which country, they are used to treat animal or plant
diseases. They are collected from various contemporary texts and concern
contemporary practices, whereas our data come from medieval texts and thus carry
more uncertainty.
5    Conclusion
This project highlights several interesting points about working on such real
data. The first is the necessity and richness of an interdisciplinary approach.
Indeed, interdisciplinarity is required at each step of the process. Data
preparation requires historians, botanists and computer scientists. The
questions asked of the data, which lead to the analyses carried out by computer
scientists, come from historians and microbiologists. Finally, the results are
discussed with historians and microbiologists, who will also perform
experimental tests of interesting hypotheses. The questioning evolves throughout
the process with the discussions and new results. The second point is that FCA
appears particularly suitable for expert needs: even the simple analyses of
lattices and rules presented in this paper have revealed useful insights.
    The next steps of the project will aim at enriching the database, both by
clarifying ingredients and by integrating other data such as quantities for the
ingredients, symptoms for which remedies are prescribed, preparation of the
remedies, remedies for other kinds of diseases, and information on plants such
as toxicity. More complex analyses will then be possible.
    Currently, the amount of data the historian has to work with is rather
small; however, if it increases, we may have to search for techniques to
facilitate the exploration of the results.

References
 1. Connelly, E., del Genio, C.I., Harrison, F.: Data mining a medieval medical text
    reveals patterns in ingredient choice that reflect biological activity against
    infectious agents. mBio 11(1) (2020)
 2. Couceiro, M., Napoli, A.: Elements About Exploratory, Knowledge-Based, Hy-
    brid, and Explainable Knowledge Discovery. In: ICFCA 2019, Frankfurt, Germany.
    LNAI, vol. 11511, pp. 3–16. Springer (Jun 2019)
 3. Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations.
    Springer Verlag (1999)
 4. Kahl, O.: The Small dispensatory: Sābūr ibn Sahl. Brill, Leiden (2003)
 5. Kahl, O.: The dispensatory of Ibn at-Tilmīdh, Arabic text, English translation,
    study and glossary. Brill, Leiden (2007)
 6. Kahl, O.: Sābūr ibn Sahl’s Dispensatory in the Recension of the ‘Aḍudī Hospital.
    Brill, Leiden; Boston (2009)
 7. Keip, P., Gutierrez, A., Huchard, M., Le Ber, F., Sarter, S., Silvie, P., Martin,
    P.: Effects of Input Data Formalisation in Relational Concept Analysis for a Data
    Model with a Ternary Relation. In: ICFCA 2019, Frankfurt, Germany. LNCS, vol.
    11511, pp. 191–207. Springer (Jun 2019)
 8. Levey, M.: The Medical Formulary or Aqrābādhīn of Al-Kindī. Univ. of
    Pennsylvania Press, Philadelphia (1966)
 9. Levey, M., al Khaledy, N.: Chemistry in the Medical Formulary of Al-Samarqandī.
    Univ. of Pennsylvania Press, Philadelphia (1967)
10. Silvie, P.J., Martin, P., Huchard, M., Keip, P., Gutierrez, A., Sarter, S.: Prototyping
    a Knowledge-Based System to Identify Botanical Extracts for Plant Health in Sub-
    Saharan Africa. Plants 10(5) (2021)