Nanotate: Semantically annotating experimental protocols with nanopublications

Olga Giraldo1[0000-0003-2978-8922], Miguel Ruano2[0000-0002-7241-7089], Robin A. Richardson3[0000-0002-9984-2720], Remzi Celebi4[0000-0001-7769-4272], Michel Dumontier4[0000-0003-4727-9435], and Tobias Kuhn1[0000-0002-1267-0234]

1 Department of Computer Science, VU Amsterdam, Amsterdam, Netherlands
2 Universidad del Valle, Cali, Colombia
3 Netherlands eScience Center, Amsterdam, Netherlands
4 Maastricht University, Maastricht, Netherlands

Abstract. An experimental protocol describes a sequence of tasks executed to perform experimental research in biological and biomedical areas, e.g. genetics, immunology, neuroscience and virology. Such experimental protocols indicate, for each step, exactly how it should be executed, often including equipment, reagents, descriptions of critical steps, troubleshooting instructions, other kinds of tips, as well as any other information that researchers deem important for facilitating the reusability of the protocol. These protocols therefore have a clear systematic structure, but when published they are treated like any other scientific publication, i.e. as a narrative text in HTML or PDF format. The formal structure is therefore not easily accessible and cannot be reused. This paper addresses this problem by extracting, representing and publishing steps from experimental protocols to make them Findable, Accessible, Interoperable, and Reusable (FAIR). Our work builds upon human annotations in combination with Named Entity Recognition, delivering nanopublications. Our software toolkit, Nanotate, is based on a flexible web-based annotation environment, namely Hypothes.is, the BioPortal NER web services and the nanopublications infrastructure. Our evaluation shows that our approach is viable and our tool user-friendly.

Copyright © 2022 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

Experimental protocols are documents providing detailed descriptions of the processes by means of which results, often data, are generated in the biological sciences. For reproducibility purposes, both the data and the protocols describing the steps followed to obtain the data should be available. Protocols often include equipment, reagents, critical steps, troubleshooting guidelines, tips and other information that facilitates reusability. Experimental protocols are described in natural language and lack a formal structure; it is therefore not surprising that important details are sometimes missing, e.g. the time or temperature at which to centrifuge a sample, the precise storage conditions of a suspension, or specific features of a piece of equipment or a reagent used.

To illustrate the importance and nature of experimental protocols, let us consider "Identification of QTL Associated with Drought Tolerance in Common Bean"; it involves the execution of a number of protocols, such as sample preparation, DNA isolation and amplification of the DNA via PCR. The study as a whole can be seen as a workflow that includes several steps, each of them a protocol in its own right. Each protocol consists of a sequence of structured instructions, and each protocol, as well as each of its steps, has inputs and outputs. In this sense, the steps within each protocol can be seen as the smallest, most granular parts of the workflow.
Following this analogy, understanding the study as a container of protocols, both the steps within protocols and the protocols themselves should be reproducible.

When protocols are published, they are treated like any other scientific publication. Little attention is paid to the workflow nature implicit in this kind of document, or to the chain of provenance indicating where it comes from and how it has changed. The protocol is understood as a text-based narrative instead of a self-descriptive document [8] compliant with the Findable, Accessible, Interoperable and Reusable (FAIR) principles [21]. In addition, when protocols are not properly documented or preserved, researchers can no longer interpret, communicate, or share information effectively. To address this problem, in this paper we focus on representing and publishing steps from experimental protocols as nanopublications [10], in order to make them FAIR and machine-consumable. Specifically, we provide a semantic annotation layer in order to improve the search, organization, cataloging and maintenance of protocols. Our approach makes use of domain-expert annotations in protocols to extract information about the steps, samples, instruments and reagents participating in each step.

In this work, we investigate how we can facilitate the participation of domain experts when extracting steps from experimental protocols while hiding the complexity of representing and publishing such artifacts. To this end we developed Nanotate, which combines Named Entity Recognition and human-based annotation in a single annotation framework. The results suggest that our approach is practical; that the semantic annotations are rich, consistent between annotators and meaningful; and that the interface is perceived to be usable and user-friendly.

2 Background

Open sharing of methods is an essential element in ensuring reproducible research in the life sciences, e.g. through repositories like Bio-protocol (https://bio-protocol.org/Default.aspx), Protocol Exchange [13], protocols.io [19] and MethodsX (http://www.sciencedirect.com/science/journal/22150161). Some existing approaches focus on the semantic representation of protocols in order to provide all the information required for the replication of biomedical and biological protocols, e.g. the EXACT2 ontology [16], aimed at the semantic extraction of knowledge from biomedical protocols. The SMART Protocols ontology [9] represents protocols in the biological sciences as a workflow embedded within a document. Bioschemas also provides a simple way to add structured data to web pages, such as the LabProtocol profile (https://bioschemas.org/profiles/LabProtocol/0.6-DRAFT-2020_12_08/), which models the details of publications about experimental protocols. Provenance is an important aspect of experiments, and Reproduce Microscopy Experiments (REPRODUCE-ME) [15] is an ontology for the semantic documentation of provenance in microscopy experiments. To make scientific workflows open and FAIR, a semantic model to publish scientific workflows as FAIR data has been proposed [6].

Several tools for the manual annotation of biomedical documents have been developed to facilitate the search and retrieval of semantic content in this kind of document. Some examples are Bionotate [5], Semantator [17] and Brat [18]; they are based on a strategy of searching for terms in thesauruses or ontologies by finding occurrences of a concept chain in a text fragment using term coincidence.
3 Nanotate Approach and Implementation

To address the problem of extracting, representing and publishing steps from experimental protocols, we developed Nanotate, a web-based tool that facilitates the publication of annotated steps as nanopublications. The goal of our approach is to allow end users to directly publish nanopublications about annotated protocol steps. Our approach is highly scalable: Nanotate extends an existing annotation framework, Hypothes.is, which is widely used and has an extensive community of users, thus supporting both developers and end users. More importantly, Hypothes.is is highly adaptable; further adaptations of the interface are possible while reusing the backend and authentication mechanisms.

Nanotate prioritizes human-based annotation of specific parts of the scientific discourse, in this case the identification of the steps in experimental protocols and the material entities participating in each of them (samples, equipment, reagents). Nanotate incorporates the capability to automatically recognize samples, equipment and reagents with classes from 8 ontologies (OBI [2], SP [9], BAO [1], EFO [12], CHEBI [11], UBERON, NCBI TAXON [7] and ERO [20]), by way of the BioPortal API.

3.1 Nanotate architecture

The Nanotate architecture is shown in Figure 1.

Fig. 1. Nanotate architecture

The workflow starts on the protocol's website, called here the "annotated site". We replaced the Hypothes.is annotator and sidebar with our own annotation user interface. We use the BioPortal API to consult the ontologies containing terminology related to the samples, equipment and reagents to be annotated. To produce nanopublications about individual annotated steps we use the Python libraries nanopub (https://github.com/fair-workflows/nanopub) and fairworkflows (https://github.com/fair-workflows/fairworkflows), which allow for searching and publishing nanopublications and support the construction of FAIR scientific workflows using nanopublications [14]. Below, we further describe the components of the Nanotate architecture. In order to explain each stage of the process, a running example is available online (https://git.io/JDFOV).

Generation of annotations (from HTML to JSON). The following components participate in this process: i) annotated site, ii) Nanotate bookmark, iii) Nanotate client and iv) BioPortal API. i) Annotated site: We start on the web page of the protocol to be annotated; Nanotate takes full experimental protocol documents in HTML as input. ii) Nanotate bookmark: Nanotate does not use the Hypothes.is client ('annotator' and 'sidebar'); instead, it uses the bookmarklet approach, in a similar way to the HelloWorldAnnotated demo (https://github.com/judell/HelloWorldAnnotated). The "Nanotate bookmark" redirects from an "annotated site" to the "Nanotate client" through a URL that contains data about the selected text to be annotated. iii) Nanotate client: Our UI works as a template guiding users through the annotation process and posting the annotations to Hypothes.is. The UI includes a tag box to add one or multiple tags from the available options ("sample", "equipment", "reagent", "input", "output", "step"). After labeling the selected text, a JSON file in which Hypothes.is stores the annotation is generated. iv) BioPortal API: The next step is adding context to the tagged text by using ontology terms. Nanotate connects to the BioPortal API to consult the 8 ontologies mentioned above; a minimal sketch of such a lookup is shown below.
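To make this lookup concrete, the following minimal sketch shows how such a query against the BioPortal Annotator REST service can be issued from Python. The endpoint and response shape follow BioPortal's public API; the API key placeholder, the parameter choices and the helper name suggest_terms are illustrative assumptions rather than Nanotate's actual code.

import requests

BIOPORTAL_ANNOTATOR = "https://data.bioontology.org/annotator"
API_KEY = "YOUR-BIOPORTAL-API-KEY"  # placeholder; BioPortal requires a personal key
# BioPortal acronyms of the 8 ontologies consulted by Nanotate (assumed spelling)
ONTOLOGIES = "OBI,SP,BAO,EFO,CHEBI,UBERON,NCBITAXON,ERO"

def suggest_terms(tagged_text):
    """Return candidate ontology class URIs for a text span tagged as
    sample, equipment or reagent."""
    params = {
        "text": tagged_text,
        "ontologies": ONTOLOGIES,
        "longest_only": "true",  # prefer matches covering the whole span
        "apikey": API_KEY,
    }
    response = requests.get(BIOPORTAL_ANNOTATOR, params=params, timeout=30)
    response.raise_for_status()
    # Each hit carries the matched class; its URI is what ends up in the
    # annotation JSON when the user accepts a suggestion.
    return [hit["annotatedClass"]["@id"] for hit in response.json()]

print(suggest_terms("phosphate buffered saline"))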
When a tagged text can be linked to an ontology term, the resulting JSON includes the URI of the ontology term. All annotations about steps and the material entities participating in each of them are posted to Hypothes.is.

Generation of nanopublications (from JSON to RDF). In this stage, each annotated step and its components are published as nanopublications; the nanopub library participates in this process. To generate the nanopublications, users simply press the "Nanopub" button located next to each annotated step. The nanopub library was configured to group annotations according to their text position: an annotation that does not have the tag "step" is grouped under an annotation that does have the tag "step" whenever the former's text span is contained in the latter's. All annotations are sent via API to Nanotate, where they are validated and subsequently published. In addition, the published nanopublications are stored locally in a MongoDB data store. A minimal sketch of this grouping and publication process is shown below.
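The sketch below illustrates the two operations just described: grouping annotations by span containment and publishing one step as a nanopublication. It assumes the 1.x interface of the nanopub library (Publication.from_assertion and NanopubClient.publish); the simplified annotation fields (start, end, tags, text, ontology_uri) and the use of P-Plan for the step type are our own illustrative choices, not necessarily the exact RDF Nanotate emits.

import rdflib
from nanopub import NanopubClient, Publication

PPLAN = rdflib.Namespace("http://purl.org/net/p-plan#")

def group_by_step(annotations):
    """Attach each non-step annotation to the step annotation whose
    character span contains it."""
    steps = [a for a in annotations if "step" in a["tags"]]
    for step in steps:
        step["parts"] = [a for a in annotations if "step" not in a["tags"]
                         and step["start"] <= a["start"] and a["end"] <= step["end"]]
    return steps

def publish_step(step):
    """Build a small assertion graph for one annotated step and publish it."""
    assertion = rdflib.Graph()
    node = rdflib.BNode()  # the library mints a trusty URI upon publication
    assertion.add((node, rdflib.RDF.type, PPLAN.Step))
    assertion.add((node, rdflib.RDFS.label, rdflib.Literal(step["text"])))
    for part in step["parts"]:
        if part.get("ontology_uri"):  # URI attached during the BioPortal lookup
            assertion.add((node,
                           rdflib.URIRef("http://purl.org/dc/terms/references"),
                           rdflib.URIRef(part["ontology_uri"])))
    client = NanopubClient(use_test_server=True)  # assumed flag for the test server
    return client.publish(Publication.from_assertion(assertion_rdf=assertion))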
Generation of RDF workflows. Once all the nanopublications for the individual steps are available, the next step is creating the corresponding workflow. In the Nanotate client, users consult the nanopublications and press the "new workflow" button. Then, users fill in two fields: i) label, to give the new workflow a name, and ii) description, a short description of the new workflow. Finally, users select the nanopublications that are part of the workflow. The resulting nanopublications are then published as above; the published workflow nanopublications are also stored locally in a MongoDB data store. One plausible shape of such a workflow assertion is sketched below.
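As an illustration of the underlying representation, the following rdflib sketch shows one plausible shape of such a workflow assertion, using the P-Plan vocabulary and the DUL precedes relation on which the fairworkflows library builds. The exact triples Nanotate publishes may differ; the function name and the step URIs passed in are placeholders.

import rdflib

PPLAN = rdflib.Namespace("http://purl.org/net/p-plan#")
DUL = rdflib.Namespace("http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#")
DCT = rdflib.Namespace("http://purl.org/dc/terms/")

def workflow_assertion(label, description, step_nanopub_uris):
    """Assemble the assertion graph for a workflow built from the
    nanopublication URIs of its steps (kept in execution order)."""
    g = rdflib.Graph()
    plan = rdflib.BNode()  # replaced by a trusty URI when published
    g.add((plan, rdflib.RDF.type, PPLAN.Plan))
    g.add((plan, rdflib.RDFS.label, rdflib.Literal(label)))
    g.add((plan, DCT.description, rdflib.Literal(description)))
    previous = None
    for uri in step_nanopub_uris:
        step = rdflib.URIRef(uri)
        g.add((step, PPLAN.isStepOfPlan, plan))
        if previous is not None:
            g.add((previous, DUL.precedes, step))  # preserve step order
        previous = step
    return g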
Nanotate is a free and open-source tool. The code behind the tool is available on GitHub: https://github.com/nanotate-tool. End users can install it by creating a bookmark. The documentation on how to install and use Nanotate is available at http://doi.org/10.5281/zenodo.5101941, and a running instance of the tool can be found at https://nanotate.bitsfetch.com/.

4 Evaluation Design

To assess our approach we carried out a controlled annotation study with domain experts. We evaluated the nature and consistency of the annotations and the subjective usability of the tool. The methodology is presented below.

Materials. We worked with: i) six open-access protocols in molecular biology from Bio-protocols and Nature Protocol Exchange (see Table 1); ii) human annotators: three annotators with experience in laboratory techniques; iii) the annotation tool, Nanotate: the web-based tool used in this study (during the training sessions the participants learnt how to use it); iv) training documentation: we provided the participants with a detailed training document on how to install and use the tool, which gives examples of samples, equipment and reagents and of how this information should be annotated; and v) a questionnaire to evaluate the subjective usability of the Nanotate tool: a usability questionnaire with the ten standard questions of the System Usability Scale (SUS) [4]. Experts in life sciences rated the tool following the standard ten questionnaire items on a five-point scale ranging from "strongly agree" to "strongly disagree". The questionnaire can be found online (https://forms.gle/o6JYQ7xY7wVFsqqG8).

Table 1. Set of annotated protocols

#  Protocol ID                   Source                    Steps
1  DOI:10.21769/BioProtoc.323    Bio-protocols             7
2  DOI:10.21203/rs.2.1347/v2     Nature Protocol Exchange  12
3  DOI:10.1038/protex.2013.007   Nature Protocol Exchange  11
4  DOI:10.21203/rs.2.1645/v2     Nature Protocol Exchange  8
5  DOI:10.21769/BioProtoc.1751   Bio-protocols             16
6  DOI:10.21769/BioProtoc.1077   Bio-protocols             11

Methods. Our controlled annotation study consists of a series of activities organized in the following stages: i) training session, ii) assignment of protocols to annotators, iii) review of annotations and iv) generation of data for the analysis. In the first stage, a virtual session was organised with each annotator in order to train them in the use of the Nanotate tool and give them some tips about good annotation practices; the meetings were carried out using Google Hangouts. In the second stage, the six protocols presented in Table 1 were annotated by the three human annotators. Then, in the third stage, virtual meetings were scheduled with the annotators in order to resolve doubts and inconsistencies, specifically when we had to deal with nanopublication validation problems: for a subset of protocol steps the annotations were made but the corresponding nanopublication was generated incorrectly, probably due to problems in the HTML of those protocols. Finally, the data, in the form of the semantic annotations structured as nanopublications as well as the answers to the usability questionnaire, were analyzed. We focused on tag distribution, ontology coverage, completeness analysis and inter-annotator agreement.

5 Results

5.1 Tag distribution

A total of 232 part-of-speech entities were tagged with one of the six available categories (sample, equipment, reagent, input, output, step). The results of this stage are summarized in Figure 2 (the full data is available online at http://doi.org/10.5281/zenodo.5089323). In this study, 6.4% (17) of the samples were also tagged as "input", and 5.7% (15) of the samples were also tagged as "output" (see Figure 2); 22 parts-of-speech were tagged only as "sample". These results indicate that the categories "input" and "output" were the least used. The results also indicate that 65 steps could be identified and annotated (see Figure 2). Finally, 19.7% (52) of the parts-of-speech were tagged as "equipment" and 21.6% (57) as "reagent" (see the left-hand side of Figure 2).

5.2 Annotations mapped to ontology terms

Parts-of-speech tagged as "sample", "equipment" and "reagent" were mapped to ontology terms from the 8 aforementioned ontologies available in BioPortal. The results of this stage are summarized in Figure 2 (right; the full data is available online at http://doi.org/10.5281/zenodo.5089726).

Fig. 2. Tag distribution (left). Comparison to ontology matching (right): total number of annotations (blue), those linked to ontology terms (red) and those with an ontology class available (yellow).

59.6% (31) of the texts tagged as "equipment" were mapped to ontology terms from OBI (8), BAO (20), SP (2) and ERO (1). 43.9% (25) of the texts tagged as "reagent" could be mapped to ontology terms from CHEBI (17), OBI (3), SP (4) and BAO (1). 39.7% (23) of the texts tagged as "sample" could be mapped to ontology terms from NCBI TAXON (3), UBERON (1), CHEBI (11), BAO (2), SP (5) and EFO (1). We manually analyzed the cases that were not mapped to an ontology term to find out the reasons. We found the following four main reasons: i) terminology not represented in the available ontologies, for instance reagent trade names (e.g. TRIzol) and acronyms or short names (e.g. PBMCs, PBS); ii) names of specific types of reagents (e.g. extraction buffer, which is a buffer) and samples (e.g. human venous blood, which is blood); in this second case, only superclasses are available in the ontologies used (see the yellow bar in Figure 2), as these ontologies do not reach this level of granularity; iii) plural words (like mosquitoes, bacteria), which are not represented in our subset of ontologies; and iv) typos, for instance "Micropistle" instead of "Micropestle".

5.3 Completeness checks

As the above analyses focused only on the annotated entities, we should also have a look at completeness, i.e. indications of the extent to which entities were missed. First steps should always specify an input, while last steps should specify an output. In the set of analyzed protocols, we found that 100% of first steps included an annotated input, and 83.33% (five of the six) of last steps included an annotated output. For equipment annotations, steps in such protocols almost always involve the use of some equipment, so we checked how often these were correctly covered. We found that in 55.38% (36) of the steps, at least one piece of equipment was not explicitly mentioned in the text. Domain experts are able to infer the use of some equipment from the narrative; for instance, experimental actions like "centrifuge", together with additional information such as the temperature of centrifugation (e.g. 4 °C), help to deduce the use of a "refrigerated centrifuge" in a particular step (more examples are available in a table online at https://doi.org/10.5281/zenodo.5528934; this table includes links to nanopublications whose assertions are missing such inferred equipment). We can therefore conclude that the completeness is far from perfect, but the annotations cover a substantial part of the mentioned entities.

5.4 Inter-annotator agreement

Annotators categorized the annotations into the different workflow component classes (i.e., step, reagent, equipment, input, output) with perfect agreement. However, in some cases, the spanning text of these annotations overlapped but did not match perfectly. We observed that the annotators chose the same spanning text to annotate "step" and "sample/input" (samples that were also tagged as input). Partial matches, where one or more annotators highlighted slightly different text, were identified for 9.6% of the texts tagged as "equipment", 15.4% of the texts tagged as "sample", 17.5% of the texts tagged as "reagent" and 20% of the texts tagged as "sample/output" (samples that were also tagged as output). As presented in subsection 5.2, we found a low incidence of annotations linked to ontology terms. However, the annotators showed high agreement in linking annotations to ontology terms (Fleiss' kappa: 0.70). The full data, including the set of annotations used to calculate inter-annotator agreement, is available online (http://doi.org/10.5281/zenodo.5095720). The agreement statistic can be computed as sketched below.
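For reproducibility, agreement figures of this kind can be computed with the statsmodels implementation of Fleiss' kappa, as in the sketch below; the small rating matrix is illustrative, not the study data.

import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# One row per annotated span, one column per annotator; each cell encodes
# the ontology term (or "no term") chosen by that annotator as an integer id.
ratings = np.array([
    [0, 0, 0],  # all three annotators linked the span to the same term
    [1, 1, 1],
    [2, 2, 3],  # one annotator diverged
    [4, 4, 4],
])
table, _ = aggregate_raters(ratings)  # per-span counts of each category
print(fleiss_kappa(table))  # the study reports 0.70 on the real data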
5.5 Subjective usability by questionnaire (SUS)

The table at https://doi.org/10.5281/zenodo.5528946 summarizes the results of the SUS. The participants were the three annotators of the annotation study described above and one additional expert who used the tool but did not participate in the annotation study. Overall, the tool obtained a SUS score of 93.12 (on the 0-100 SUS scale), which falls between "excellent" and "best imaginable" on the adjective scale [3]. From these results it is clear that the tool was well received and that users hardly experienced any problems using it.
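For reference, a SUS score is computed from the ten five-point items with the standard procedure defined by Brooke [4], as sketched below; the example responses are illustrative, not the participants' actual answers.

def sus_score(responses):
    """Standard SUS scoring: odd-numbered items contribute (response - 1),
    even-numbered items contribute (5 - response); the sum is scaled by 2.5
    to a 0-100 range."""
    assert len(responses) == 10
    total = sum(r - 1 if i % 2 == 0 else 5 - r  # 0-based i: even i = odd item
                for i, r in enumerate(responses))
    return 2.5 * total

print(sus_score([5, 1, 5, 1, 5, 1, 5, 2, 5, 1]))  # -> 97.5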
6 Discussion and conclusions

This work involves the manual annotation of protocols in order to extract information about the samples, instruments and reagents participating in each step. The methodological aspects involved the participation of domain experts, reusing existing resources while focusing on how end users could make the valuable information in experimental protocols readily available in a semantic manner. We developed Nanotate, a tool to publish nanopublications from annotated protocol steps. It extends the Hypothes.is platform by making it compliant with the nanopublication workflow, and it adds NER capabilities from BioPortal, all in a single annotation framework.

From this experiment we concluded that the Nanotate tool, the guidelines on what and how to annotate, and the knowledge of the experts in laboratory techniques were key to achieving high consistency in the annotations and the subsequent nanopublications. This is a consequence of a standardized annotation process for publishing individual protocol steps and their participants (samples, equipment, reagents). We also found some missing elements in protocols, for example missing inputs or outputs. Some equipment could be inferred by experts but could not be annotated because it was not explicitly mentioned. Such inaccuracies limit the reproducibility and reusability of protocols. It is necessary to improve reporting structures for experimental protocols; this requires collective efforts from authors, peer reviewers, editors and funding bodies [9].

The main limitation of this study was the small number of annotators and protocols. Our study started during the first year of the COVID-19 pandemic, and it was difficult to plan the virtual sessions and find annotators. Nanotate facilitates the annotation of web protocols in HTML format; as future work, we want to facilitate the annotation of protocols available in other formats.

Acknowledgments

This work was supported by the Dutch Research Council (NWO) (No. 628.011.011).

References

1. Abeyruwan, S., Vempati, U.D., Küçük-McGinty, H., Visser, U., Koleti, A., et al.: Evolving BioAssay Ontology (BAO): modularization, integration and applications. Journal of Biomedical Semantics 5(Suppl 1), S5 (2014)
2. Bandrowski, A., Brinkman, R., Brochhausen, M., Brush, M.H., Bug, B., et al.: The Ontology for Biomedical Investigations. PLOS ONE 11(4), e0154556 (Apr 2016)
3. Bangor, A., Kortum, P.T., Miller, J.T.: An empirical evaluation of the System Usability Scale. Intl. Journal of Human-Computer Interaction 24(6) (Jul 2008). https://doi.org/10.1080/10447310802205776
4. Brooke, J.: SUS: A quick and dirty usability scale. Usability Eval. Ind. 189 (Nov 1995)
5. Cano, C., Monaghan, T., Blanco, A., Wall, D.P., Peshkin, L.: Collaborative text-annotation resource for disease-centered relation extraction from biomedical text. J Biomed Inform 42(5), 967-977 (2009). https://doi.org/10.1016/j.jbi.2009.02.001
6. Celebi, R., Rebelo Moreira, J., Hassan, A.A., Ayyar, S., Ridder, L., et al.: Towards FAIR protocols and workflows: the OpenPredict use case. PeerJ Computer Science 6, e281 (2020). https://doi.org/10.7717/peerj-cs.281
7. Federhen, S.: Type material in the NCBI Taxonomy Database. Nucleic Acids Res 43, D1086-98 (2015)
8. Giraldo, O., Garcia, A., Corcho, O.: A guideline for reporting experimental protocols in life sciences. PeerJ 6, e4795 (May 2018). https://doi.org/10.7717/peerj.4795
9. Giraldo, O., García, A., López, F., Corcho, O.: Using semantics for representing experimental protocols. Journal of Biomedical Semantics 8(1), 52 (Nov 2017). https://doi.org/10.1186/s13326-017-0160-y
10. Groth, P., Gibson, A., Velterop, J.: The anatomy of a nanopublication. Information Services & Use 30, 51-56 (2010). https://doi.org/10.3233/ISU-2010-0613
11. Hastings, J., de Matos, P., Dekker, A., Ennis, M., Harsha, B., et al.: The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013. Nucleic Acids Res 41, D456-63 (2013)
12. Malone, J., Holloway, E., Adamusiak, T., Kapushesky, M., Zheng, J., et al.: Modeling sample variables with an Experimental Factor Ontology. Bioinformatics 26(8), 1112-1118 (2010)
13. Nature Protocols: Introducing the new Protocol Exchange site. Nat Protoc 14, 1945 (Jun 2019). https://doi.org/10.1038/s41596-019-0199-6
14. Richardson, R.A., Celebi, R., van der Burg, S., Smits, D., Ridder, L., Dumontier, M., Kuhn, T.: User-friendly composition of FAIR workflows in a notebook environment. In: Proceedings of the 11th Knowledge Capture Conference (K-CAP '21), pp. 1-8. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3460210.3493546
15. Samuel, S., König-Ries, B.: REPRODUCE-ME: Ontology-based data access for reproducibility of microscopy experiments. In: Blomqvist, E., Hose, K., Paulheim, H., Lawrynowicz, A., Ciravegna, F., Hartig, O. (eds.) The Semantic Web: ESWC 2017 Satellite Events, pp. 17-20. Springer International Publishing, Cham (2017)
16. Soldatova, L.N., Nadis, D., King, R.D., Basu, P.S., Haddi, E., et al.: EXACT2: the semantics of biomedical protocols. BMC Bioinformatics 15(Suppl 14), S5 (2014). https://doi.org/10.1186/1471-2105-15-S14-S5
17. Song, D., Chute, C.G., Tao, C.: Semantator: annotating clinical narratives with Semantic Web ontologies. AMIA Jt Summits Transl Sci Proc 2012, 20-29 (2012)
18. Stenetorp, P., Pyysalo, S., Topić, G., Ohta, T., Ananiadou, S., Tsujii, J.: brat: a web-based tool for NLP-assisted text annotation. In: Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics, pp. 102-107. Association for Computational Linguistics, Avignon, France (Apr 2012). https://aclanthology.org/E12-2021
19. Teytelman, L., Stoliartchouk, A., Kindler, L., Hurwitz, B.L.: Protocols.io: Virtual communities for protocol development and discussion. PLoS Biology 14(8), e1002538 (Aug 2016). https://doi.org/10.1371/journal.pbio.1002538
20. Torniai, C., Brush, M., Vasilevsky, N., Segerdell, E., Wilson, M., et al.: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned. vol. 833, pp. 101-108 (2011)
21. Wilkinson, M.D., Dumontier, M., Aalbersberg, I.J.J., Appleton, G., Axton, M., et al.: The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3, 160018 (Mar 2016). https://doi.org/10.1038/sdata.2016.18