
Exploring the Use of Ontology Components for Distantly-Supervised Disease and Phenotype Named Entity Recognition

Sumyyah Toonsi (0, 1), Şenay Kafkas (0, 1), Robert Hoehndorf (0, 1)

0 Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Kingdom of Saudi Arabia

1 Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Kingdom of Saudi Arabia

The lack of curated corpora is one of the major obstacles for Named Entity Recognition (NER). With the advancements in deep learning and the development of robust language models, distant supervision utilizing weakly labeled data is often used to alleviate this problem. Previous approaches utilized weakly labeled corpora from Wikipedia or from the literature. However, to the best of our knowledge, none of them explored the use of the different ontology components for disease/phenotype NER under the distant supervision scheme. In this study, we explored whether different ontology components can be used to develop a distantly supervised disease/phenotype entity recognition model. We trained different models by considering ontology labels, synonyms, definitions, axioms and their combinations, in addition to a model trained on literature. Results showed that content from the disease/phenotype ontologies can be exploited to develop a NER model performing at the state-of-the-art level. In particular, models that utilised both the ontology definitions and axioms showed competitive performance compared to the model trained on literature. This relieves the need for finding and annotating external corpora. Furthermore, models trained using ontology components made zero-shot predictions on the test datasets, which we did not observe for the models trained on the literature-based datasets.

Keywords: Named Entity Recognition, Text mining, Ontologies

1. Introduction

Named Entity Recognition (NER) is a Natural Language Processing (NLP) task that aims to identify and classify named entities such as organisations, persons, diseases and genes in text. NER is challenging due to the nature of language, which includes abbreviations, synonymous entities, and, in general, variable descriptions of entities.

Early methods for NER used dictionaries due to their applicability and time efficiency. However, lexical approaches such as the NCBO (National Center for Biomedical Ontology) annotator [1], ZOOMA [2], and the OBO (Open Biological and Biomedical Ontologies) annotator [3] are not able to recognise new concepts and cannot detect all variations of expressions. This is because once dictionaries are constructed with terms, they can only find exact matches to those terms. Hence, dictionary-based approaches suffer from low recall.

With the emergence of machine learning, better NER methods were developed. This was possible through exposing statistical models to curated text in which mentions of entities are identified by human curators. Subsequently, these models were able to generalize to unseen entities better than previous methods. For instance, GNormPlus [4] was developed to find gene/protein mentions using a supervised model which demonstrated competitive results at the time. Although supervised methods showed remarkable improvements in performance, they require curated instances for the model to learn. That is, the model expects instances of text where mentions of entities are clearly provided in order to learn to distinguish concepts of interest. This becomes a serious problem when one wants to recognise a novel or unexplored concept. Moreover, supervised methods often fail to recognise concepts that are not covered by the curated corpora.

To alleviate the need for curated corpora, distant supervision was explored for NER. In particular, distantly supervised models are trained on a weakly labeled training set, i.e., one obtained from an imprecise source. For instance, dictionaries could be used to annotate text with exact matches, which can produce both false positives and false negatives. Methods like BOND [5], PatNER [6], ChemNER [7], PhenoTagger [8], Conf-MPU [9] and that of Dong and colleagues [10] demonstrated the potential of distant supervision for NER. The aforementioned methods created weakly labeled sets using labels and synonyms found in ontologies/vocabularies to extract training instances from unlabeled corpora. Later, these instances were used to train different models which in some cases outperformed state-of-the-art methods.

Inspired by the advances achieved by distant supervision, we explored the contribution of different components of ontologies (labels and synonyms, definitions, and complex axioms) to the task of NER under the distant supervision scheme. In all of the previously mentioned distantly supervised NER methods, only labels and synonyms of ontologies/vocabularies were used to create the weakly labeled corpora from literature. The use of different ontology components to develop NER models has not been comprehensively explored for diseases/phenotypes. In addition to the use of labels and synonyms, in this study, we go a step further to explore the use of definitions and axioms to develop a disease/phenotype NER model. We hypothesize that the dense and rich knowledge found in ontologies can be used to develop NER models without the need for external corpora such as literature abstracts. We conducted our experiments on disease and phenotype entity recognition because the study of diseases and phenotypes is important for understanding disease diagnosis, treatment and epidemiology.

2. Materials and Methods

2.1. Ontologies, literature resource and benchmark corpora

2.1.1. Ontologies

We used the Disease Ontology (DO) [11] (on 15/April/2022) (downloaded on 1/March/2022) and the MEDIC vocabulary [12] in our study. DO is an ontology from the Open Biomedical Ontologies (OBO) [11], whereas MEDIC is a vocabulary of disease terms represented in the Web Ontology Language (OWL) [12]. We used the Human Phenotype Ontology (HPO) [13] (downloaded on 5/Jan/2022) for the phenotype concepts.

2.1.2. Literature

We used Medline [14] as a literature resource to generate our abstract-based weakly labeled dataset. To select abstracts that cover ontology concepts, we used an in-house index covering 32,923,095 Medline records (downloaded on Dec-15-2022) generated using Elasticsearch [15].

2.1.3. Benchmark corpora

To evaluate the named entity recognition models, we used four benchmark corpora: the NCBI–Disease Corpus [16], the disease and phenotype subsets of the MedMentions Corpus [17], and GSC+ [18]. NCBI–Disease is a widely used corpus where disease mentions are annotated and reviewed by multiple annotators. MedMentions is a large corpus annotated with an extensive set of Unified Medical Language System (UMLS) concepts. We selected the abstracts with disease annotations from MedMentions and named this the MedMentions–disease Corpus. To form this corpus, we used UMLS-to-MeSH mappings from UMLS to obtain the MeSH codes and selected the disease concepts which exist in our disease dictionary (described in Section 2.2). Similarly, we selected the abstracts with phenotype concepts for which we found UMLS-to-HPO mappings and named this dataset MedMentions–phenotypes. GSC+ is a widely used benchmarking dataset covering phenotype concepts, particularly from HPO. We used the test dataset version released by [8]. Table 1 shows the distribution of the abstracts and annotations in the four benchmark corpora.

2.2. Dictionary generation

We generated and used two dictionaries to weakly label Medline abstracts for disease and phenotype concepts. To generate our dictionaries, first, we extracted the labels and synonyms of all concepts from MEDIC, DO and HPO. Second, we filtered out the potentially ambiguous labels/synonyms from the dictionary: these are often stop words, short labels/synonyms (1 or 2 characters long), and labels/synonyms shared by two different concepts. For example, DO contains the synonym "go" for the "geroderma osteodysplasticum" concept (DOID:0111266). The synonym "go" is ambiguous with the verb "go". Filtering out ambiguous names is a common practice in text mining workflows that rely on lexical matches. We used the Natural Language Toolkit (NLTK) stop words [19] and filtered out any exact match with the labels/synonyms in MEDIC, DO and HPO. We did not find any match with the list of stop words in these sources. We also filtered out the labels/synonyms having fewer than 3 characters to avoid false positives. Additionally, for the generation of the dictionary for diseases, we filtered out all the disease labels/synonyms which exactly match protein labels/synonyms from the HUGO Gene Nomenclature Committee (HGNC) Database [20] to avoid false positive matches with protein names. Third, we generated the plural form of each label/synonym by using the Inflect Python module [21]. For example, the module generates "tetanic cataracts" for the given multi-word term "tetanic cataract" (DOID:13822). Our final disease dictionary covers 244,903 disease labels and synonyms of 29,374 distinct concepts from MEDIC and DO. The final phenotype dictionary covers 79,010 phenotype labels and synonyms of 14,631 distinct concepts from HPO.
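The filtering step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_dictionary`, the toy stop-word set, the lower-casing of names, and the toy concepts `X:1` and `X:2` are assumptions made for illustration, and plural generation with Inflect is left out.

```python
from collections import defaultdict

# Tiny illustrative stop-word set (an assumption; the paper uses the NLTK list).
STOP_WORDS = {"go", "the", "and", "was"}


def build_dictionary(concept_names, protein_names=frozenset(), min_len=3):
    """Map each unambiguous, lower-cased name to its single concept ID.

    concept_names: dict of concept ID -> iterable of labels/synonyms.
    """
    owners = defaultdict(set)
    for concept_id, names in concept_names.items():
        for name in names:
            owners[name.strip().lower()].add(concept_id)

    blocked = {p.lower() for p in protein_names}
    dictionary = {}
    for name, ids in owners.items():
        if len(name) < min_len:   # 1- or 2-character names
            continue
        if name in STOP_WORDS:    # collides with a common word
            continue
        if len(ids) > 1:          # shared by two different concepts
            continue
        if name in blocked:       # exact match with a protein name
            continue
        dictionary[name] = next(iter(ids))
    return dictionary


# Toy input: two DO concepts from the text plus two hypothetical concepts.
toy = {
    "DOID:0111266": ["geroderma osteodysplasticum", "GO"],
    "DOID:13822": ["tetanic cataract"],
    "X:1": ["shared name"],
    "X:2": ["shared name"],
}
result = build_dictionary(toy)
```

In this toy run, "GO" is dropped for being too short and "shared name" for belonging to two concepts, so only the two unambiguous disease names remain.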

2.3. Ontology components used

An ontology, as previously described in [22], has four main components:

• Classes and relations, where classes and relations are assigned unique identifiers.

• Domain vocabulary, where labels and synonyms are linked to ontology classes and relations.

• Textual definitions, where descriptions about classes and relations are provided, usually in natural language.

• Formal axioms, where relations between concepts are described in some formal language and possibly linked to other ontologies and sources.

We used labels and synonyms, textual definitions, and formal axioms components separately to create weakly labeled corpora and the statistics are reported in Table 2.

2.4. Training dataset construction

2.4.1. Abstracts from literature

To generate the training set for distant supervision, first, we retrieved the relevant literature by searching the indexed Medline for exact matches of each label/synonym from the dictionaries. We retrieved the top 1 to 5 Medline abstract/title hits per concept, identified using the default Elasticsearch relevance scoring settings (TF-IDF [23] based scoring). Second, we used the dictionaries to annotate the downloaded abstracts lexically and converted the annotations to the I-O-B format [24] by using spaCy [25]. I-O-B is a common format for tagging tokens in a chunking task, where B indicates the first token (Beginning) of an annotation, I a subsequent (Inside) token of the same annotation, and O a token that is not annotated (Outside). Finally, we obtained two sets of corpora: one for the disease concepts and the other for the phenotype concepts. We found 16,307 distinct phenotype labels/synonyms belonging to 6,962 classes from HPO in at least one Medline record by searching the indexed literature. These concepts are covered by 16096, 31372, 46032, 60098 and 74087 distinct Medline abstracts/titles at top 1, 2, 3, 4 and 5 hits, respectively, and we used them as our training sets for phenotypes. We found 35,333 distinct disease labels/synonyms linked to 8,400 distinct concepts from MEDIC and DO in at least one Medline record. These concepts are covered by 41698, 81007, 118295, 154060 and 187462 distinct Medline abstracts/titles at top 1, 2, 3, 4 and 5 hits, respectively, and we used them as our training sets for disease concepts.
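The lexical annotation and I-O-B conversion can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: tokens are produced by whitespace splitting rather than spaCy, matching is case-insensitive and prefers the longest dictionary entry, and the function name `iob_tag` is hypothetical.

```python
def iob_tag(tokens, dictionary_terms):
    """Tag tokens with B/I/O using longest exact dictionary matches."""
    terms = {tuple(t.lower().split()) for t in dictionary_terms}
    max_len = max((len(t) for t in terms), default=0)
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(lowered[i:i + n]) in terms:
                matched = n
                break
        if matched:
            tags[i] = "B"
            for j in range(i + 1, i + matched):
                tags[j] = "I"
            i += matched
        else:
            i += 1
    return tags


tokens = "Patients with tetanic cataract were examined".split()
tags = iob_tag(tokens, ["tetanic cataract", "cataract"])
print(list(zip(tokens, tags)))
```

Preferring the longest match keeps "tetanic cataract" as one mention instead of tagging only "cataract"; any mention missing from the dictionary stays tagged O, which is one source of the false negatives typical of weak labels.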

2.4.2. Labels and synonyms

Using the labels and synonyms taken directly from the ontologies, we created two sets, one for phenotypes and one for diseases. For phenotypes, the labels and synonyms extracted from HPO were directly considered as positives, as shown in Table 3. For diseases, we used the labels and synonyms from DO and added those from MEDIC as well. The labels and synonyms were retrieved from the dictionary described in Section 2.2.

2.4.3. Definitions

Definitions in DO are available in natural language. To associate a concept with its definition, we added the concept label/synonyms to the beginning of the definition, as shown in Table 3. For concepts which lacked definitions, we simply included their labels/synonyms with a dummy sentence replicated for all. For instance, if a disease does not have a definition, its dummy definition is " is a disease" (preceded by the concept label). Since definitions can include other concepts (e.g. parent concepts) in their description, mentions of such concepts can be troublesome. To partially resolve this issue, we annotated the definitions with the dictionaries described in Section 2.2. Matches against the dictionaries were treated as positive mentions of concepts. In total, we retrieved 9,435 definitions from DO and used dummy definitions for 19,939 concepts. For phenotypes, we included definitions for 10,202 concepts and used dummy definitions for 2,451 concepts.

2.4.4. Axioms

Axioms are not readily available for natural language tasks since they are expressed in a formal language. To tackle this issue, we first processed the axioms as previously described in [26]. Next, we replaced ontology identifiers with their labels/synonyms. We also included axioms which reference external ontologies and replaced their identifiers with names, as shown in Table 3.

For diseases, we used 30,834 axioms from DO. For phenotypes, we included 37,062 axioms from HPO. Axioms for both diseases and phenotypes included references to external ontologies, which we downloaded and processed to map their identifiers to their names. The external ontologies that were included are: the Basic Formal Ontology (BFO) [27], the Chemical Entities of Biological Interest (ChEBI) [28], the Cell Ontology (CL) [29], the Gene Ontology (GO), the Relation Ontology (RO) [30], and the Uber-anatomy Ontology (UBERON) [31].
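How definitions and axioms are turned into training text can be sketched as follows. The exact textual form of the processed axioms is not given here, so the axiom string, the identifier `EX:0000001`, its name, the identifier pattern, and both function names are hypothetical and serve only to illustrate the identifier-to-name replacement and the label-plus-definition construction.

```python
import re

# Matches identifiers such as DOID:13822 or UBERON_0000955 (assumed ID shapes).
ID_PATTERN = re.compile(r"\b[A-Za-z]+[:_]\d+\b")


def definition_sentence(label, definition=None, dummy="is a disease"):
    """Prepend the concept label to its definition, or to a dummy sentence."""
    return f"{label} {definition if definition else dummy}"


def verbalize_axiom(axiom, id_to_name):
    """Replace every known ontology identifier in an axiom with its name."""
    return ID_PATTERN.sub(lambda m: id_to_name.get(m.group(0), m.group(0)), axiom)


# Hypothetical axiom and name mapping, for illustration only.
names = {"DOID:13822": "tetanic cataract", "EX:0000001": "example structure"}
axiom = "DOID:13822 SubClassOf has_location some EX:0000001"
print(verbalize_axiom(axiom, names))
print(definition_sentence("tetanic cataract"))
```

Identifiers that are missing from the mapping are left unchanged, so an incomplete external ontology does not silently remove parts of an axiom.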

2.5. Named entity recognition using distant supervision

NER refers to identifying boundaries of entity mentions in text (disease and phenotype mentions in our case). We used distant supervision to train our models by using BioBERT to recognise disease and phenotype mentions in text. Figure 1 depicts the system overview.

BioBERT is a BERT (Bidirectional Encoder Representations from Transformers) [ 32 ] pretrained language model based on large biomedical corpora. BERT is a contextualized word representation model trained using masked language modeling. It provides self-supervised deep bidirectional representations from unlabeled text by jointly conditioning on both left and right contexts. The pre-trained BERT model can be fine-tuned with an additional output layer to generate models for various desired NLP tasks. We used simpletransformers [ 33 ] which provides a wrapper model to distantly supervise an entity recognition model. More specifically, the wrapped model is used to fine-tune BERT models by adding a token-level classifier on top that classifies tokens into one of the output classes which are I-O-B (Inside-Outside-Beginning). In the training phase, our models are initialised with weights from BioBERT-Base v1.1 [ 34 ] and then fine-tuned on the disease and phenotype entity recognition task using our training corpora.
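Because the classifier assigns one I-O-B tag per token, entity mentions have to be reassembled from the tag sequence. The following is a minimal sketch of that decoding step; the function name `iob_to_spans` and the choice to treat a stray I tag as the start of a mention are assumptions, not details taken from the study.

```python
def iob_to_spans(tags):
    """Convert a B/I/O tag sequence into (start, end) token spans, end exclusive."""
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            # A new mention starts here; a stray I without a B is treated as B.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O":
            if start is not None:
                spans.append((start, i))
                start = None
        # An I tag with an open mention simply extends that mention.
    if start is not None:
        spans.append((start, len(tags)))
    return spans


print(iob_to_spans(["O", "B", "I", "O", "B"]))
```

Each span indexes into the token list, so character offsets for evaluation can be recovered from the tokenizer's token positions.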

3. Results

[Figure 1: System overview. Training phase: ontology components (labels/synonyms, axioms, definitions); dictionary construction (labels, synonyms, plurals); distant dataset generation from indexed PubMed titles and abstracts; training a deep learning model (BioBERT) with Simple Transformers. Test phase: test text, named entity recognition, annotated text.]

We set up our experiments on four separate benchmarking corpora covering phenotype and disease concepts: NCBI–disease, MedMentions–disease, MedMentions–phenotype and GSC+. We reported our NER results using the Precision, Recall and F-score metrics. We used a relaxed scheme to calculate the metrics, where we considered any partial overlap between the prediction and the curated annotations to be a true positive. That is, predictions are considered to be positives whenever the indices (locations in text) of the prediction and the curated annotations overlap.
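The relaxed matching scheme can be sketched as follows. The text specifies only that any overlap counts as a true positive; counting precision over predicted spans and recall over curated spans, as well as the function names and the example offsets, are assumptions made for this illustration.

```python
def overlaps(a, b):
    """True if two (start, end) character spans share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def relaxed_prf(predicted, gold):
    """Precision, recall and F1 where any partial overlap counts as a match."""
    matched_pred = sum(any(overlaps(p, g) for g in gold) for p in predicted)
    matched_gold = sum(any(overlaps(g, p) for p in predicted) for g in gold)
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical character offsets, end exclusive.
predicted = [(0, 5), (10, 15), (30, 35)]
gold = [(3, 8), (12, 20), (40, 45), (50, 55)]
precision, recall, f1 = relaxed_prf(predicted, gold)
```

Here two of the three predictions overlap a curated span and two of the four curated spans are overlapped, giving a precision of 2/3, a recall of 1/2 and an F1 of 4/7.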

Table 4 shows the performance of the disease NER models which are distantly supervised on different ontology components or on abstracts (the best F1-score is achieved at top 1, see Additional File 1) on the disease test sets (see Table 1). For the sake of comparison, we also included a supervised BioBERT model that is trained on the NCBI–disease training set. Our results showed that supervised BioBERT trained on the curated set performed the best on NCBI–disease (0.94 F1-score) because concepts are highly conserved in this dataset. To fairly compare the performance of the methods, we further evaluated the models on the MedMentions–disease dataset. Results showed that the distantly supervised models (trained on abstracts and on definitions plus axioms) achieved higher F1-scores (0.68 for abstracts and 0.67 for definitions plus axioms) compared to the model trained on the curated set (0.66 F1-score), which is biased towards the NCBI–disease dataset (we found an 80% overlap in concept IDs between the NCBI training and test sets). The models trained solely on labels and synonyms, on axioms, or on definitions showed lower F1-scores compared to the model trained on abstracts. On the other hand, the model trained on definitions plus axioms achieved a competitive F1-score compared to the model trained on abstracts. This result is more evident on the MedMentions–disease test set.

4. Discussion

Our main goal was to explore whether ontology components can help to develop distantly supervised disease/phenotype entity recognition models which are competitive with the state of the art. To that end, we exploited ontological components to create textual context using the labels/synonyms, axioms and definitions. We observed that utilising the context in ontologies via distant supervision aids in developing a NER model at the state-of-the-art level. The models trained solely on labels and synonyms achieved the lowest performance, simply due to the lack of context, whereas the models incorporating context such as axioms and definitions improved upon the models that lack context.

The disease NER model trained on the axioms and definitions achieved a competitive F1-score compared to the model trained on the abstracts only. However, we observed a 6% discrepancy between the phenotype NER models trained on the abstracts (the best F1-score is achieved at top 2) and on axioms and definitions together. To investigate the reason for this discrepancy, we focused on the False Positive (FP) predictions obtained on the GSC+ test corpus. The model trained on the weakly labeled abstracts produced 440 FPs, while the model trained on the phenotype definitions and axioms produced 608 FPs. We found that 184 out of the 608 FPs are produced only by the model trained on definitions and axioms and not by the one trained on the abstracts. We randomly sampled 20 FPs from these 184 FPs for further manual analysis. Our manual analysis of these 20 FPs showed that all of them were actually True Positives that had been missed by the GSC+ dataset. For example, we found that "Uniparental disomy" (HP:0032382) in PMID:8103288 was captured correctly by the model but was missed by the GSC+ annotations. More importantly, we observed that the majority of the FPs were not introduced in the definitions and axioms training corpus but were rather predicted as zero-shot instances (i.e. instances that were not seen by the model during training). For example, "Angelman syndrome" in PMID:8786067, which does not correspond to any label/synonym in HPO and does not exist in the corpus, was annotated by the model trained on definitions and axioms. Furthermore, the model trained on literature abstracts did not have these FPs since they were specifically included as classes in the training set. Details of our manual analysis can be found in Additional File 1.

We conducted our study on DO and HPO. These ontologies are widely used and therefore contain dense content which can help to generate sufficiently large weakly labeled datasets. Although the approach is generic and its utility can be explored for any given ontology, the performance would depend on the density of the content of the ontology of choice. That is, if the ontology does not sufficiently describe a concept, it is not possible to obtain a well-performing model.

5. Conclusion

In conclusion, our analysis showed that the ontology components can provide a suitable corpus to build a NER model that is competitive with the state of the art. This alleviates the need for annotating a large number of abstracts and facilitates the creation of weakly labeled training corpora. Easily obtained corpora are desirable since they reduce both the computational and time overheads. To the best of our knowledge, this is the first work that uses ontology axioms to build disease/phenotype NER models.

Additionally, the models trained on ontology components were capable of zero-shot learning on the test datasets. This was not the case for the models trained on curated sets and the models trained on the large weakly labeled literature abstracts. Our approach is generic and its utility can be explored with any other given ontology which has sufficient content that describes the concept of interest.

Acknowledgments

We thank Dr. Mahmut Uludağ for his technical assistance in processing MEDLINE data. This work has been supported by funding from King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. URF/1/4355-01-01, URF/1/4675-01-01, URF/1/4697-01-01, URF/1/5041-01-01, REI/1/5334-01-01, FCC/1/1976-46-01 and FCC/1/1976-34-01.

Additional Files

• Additional file 1: AdditionalFile1.xls. The first sheet, named "performance_on_abstracts", contains the performances of the models trained on the weakly labeled abstract datasets selected based on the top 1 to 5 hits from the Elasticsearch index. The second sheet, named "manual_error_analysis", contains our manual analysis results on the False Positives from the GSC+ dataset. The file is available from GitHub: https://github.com/bio-ontology-research-group/OntoNER

References

[1] Jonquet, N. H. Shah, M. A. Musen, The open biomedical annotator, in: American Medical Informatics Association Symposium on Translational BioInformatics, AMIA-TBI'09, San Francisco, CA, USA, 2009, pp. 56–60.

[2] Kapushesky, et al., Gene expression atlas update-a value-added database of microarray and sequencing-based functional genomics experiments, Nucleic Acids Research 40 (2011) D1077–D1081. URL: https://doi.org/10.1093/nar/gkr913. doi:10.1093/nar/gkr913.

[3] Taboada, Rodriguez, Martinez, Pardo, M. J. Sobrido, Automated semantic annotation of rare disease cases: a case study, Database 2014 (2014) bau045–bau045. URL: https://doi.org/10.1093/database/bau045. doi:10.1093/database/bau045.

[4] C.-H. Wei, H.-Y. Kao, Z. Lu, GNormPlus: An integrative approach for tagging genes, gene families, and protein domains, BioMed Research International 2015 (2015) 1–7. URL: https://doi.org/10.1155/2015/918710. doi:10.1155/2015/918710.

[5] Liang, Yu, Jiang, Er, Wang, Zhao, Zhang, Bond: Bert-assisted open-domain named entity recognition with distant supervision, in: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '20, Association for Computing Machinery, New York, NY, USA, 2020, p. 1054–1064. URL: https://doi.org/10.1145/3394486.3403149. doi:10.1145/3394486.3403149.

[6] Wang, Guan, Zhang, Li, J. Han, Pattern-enhanced named entity recognition with distant supervision, in: 2020 IEEE International Conference on Big Data (Big Data), 2020, pp. 818–827. doi:10.1109/BigData50022.2020.9378052.

[7] Wang, Hu, Song, Garg, Xiao, J. Han, ChemNER: Fine-grained chemistry named entity recognition with ontology-guided distant supervision, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, 2021, pp. 5227–5240. URL: https://aclanthology.org/2021.emnlp-main.424. doi:10.18653/v1/2021.emnlp-main.424.

[8] L. Luo, S. Yan, P.-T. Lai, D. Veltri, A. Oler, S. Xirasagar, R. Ghosh, M. Similuk, P. N. Robinson, Z. Lu, PhenoTagger: a hybrid method for phenotype concept recognition using human phenotype ontology, Bioinformatics 37 (2021) 1884–1890. URL: https://doi.org/10.1093/bioinformatics/btab019. doi:10.1093/bioinformatics/btab019.

[9] K. Zhou, Y. Li, Q. Li, Distantly supervised named entity recognition via confidence-based multi-class positive and unlabeled learning, in: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Dublin, Ireland, 2022, pp. 7198–7211. URL: https://aclanthology.org/2022.acl-long.498. doi:10.18653/v1/2022.acl-long.498.

[10] H. Dong, V. Suárez-Paniagua, H. Zhang, M. Wang, A. Casey, E. Davidson, J. Chen, B. Alex, W. Whiteley, H. Wu, Ontology-driven and weakly supervised rare disease identification from clinical notes, BMC Medical Informatics and Decision Making 23 (2023). URL: https://doi.org/10.1186/s12911-023-02181-9. doi:10.1186/s12911-023-02181-9.

[11] L. M. Schriml, et al., Human Disease Ontology 2018 update: classification, content and workflow expansion, Nucleic Acids Research 47 (2018) D955–D962. URL: https://doi.org/10.1093/nar/gky1032. doi:10.1093/nar/gky1032.

[12] A. P. Davis, T. C. Wiegers, M. C. Rosenstein, C. J. Mattingly, MEDIC: a practical disease vocabulary used at the Comparative Toxicogenomics Database, Database 2012 (2012) bar065. URL: https://doi.org/10.1093/database/bar065. doi:10.1093/database/bar065.

[13] S. Köhler, et al., Expansion of the human phenotype ontology (HPO) knowledge base and resources, Nucleic Acids Research 47 (2018) D1018–D1027. URL: https://doi.org/10.1093/nar/gky1105. doi:10.1093/nar/gky1105.

[14] NCBI, Pubmed, 1996. https://pubmed.ncbi.nlm.nih.gov/, Last accessed on 2022-04-18.

[15] N. Elastic, Swiftype, Elastic search, 2010. https://www.elastic.co/, Last accessed on 2022-04-18.

[16] R. I. Doğan, R. Leaman, Z. Lu, NCBI disease corpus: A resource for disease name recognition and concept normalization, Journal of Biomedical Informatics 47 (2014) 1–10. URL: https://doi.org/10.1016/j.jbi.2013.12.006. doi:10.1016/j.jbi.2013.12.006.

[17] S. Mohan, D. Li, Medmentions: A large biomedical corpus annotated with umls concepts, 2019. URL: https://arxiv.org/abs/1902.09476. doi:10.48550/ARXIV.1902.09476.

[18] M. Lobo, A. Lamurias, F. M. Couto, Identifying human phenotype terms by combining machine learning and validation rules, BioMed Research International 2017 (2017) 1–8. URL: https://doi.org/10.1155/2017/8565739. doi:10.1155/2017/8565739.

[19] I. Brigadir, Nltk stop words, 2019. https://github.com/igorbrigadir/stopwords/blob/master/en/nltk.txt, Last accessed on 2022-09-14.

[20] S. Tweedie, B. Braschi, K. Gray, T. E. M. Jones, R. L. Seal, B. Yates, E. A. Bruford, Genenames.org: the HGNC and VGNC resources in 2021, Nucleic Acids Research 49 (2020) D939–D946. URL: https://doi.org/10.1093/nar/gkaa980. doi:10.1093/nar/gkaa980.

[21] P. Dyson, Inflect python module, 2022. https://pypi.org/project/inflect/, Last accessed on 2022-09-14.

[22] R. Hoehndorf, P. N. Schofield, G. V. Gkoutos, The role of ontologies in biological and biomedical research: a functional perspective, Briefings in bioinformatics 16 (2015) 1069–1080.

[23] Sammut, G. I. Webb (Eds.), TF-IDF, Springer, Boston, MA, 2010, pp. 986–987. URL: https://doi.org/10.1007/978-0-387-30164-8_832. doi:10.1007/978-0-387-30164-8_832.

[24] L. A. Ramshaw, M. P. Marcus, Text chunking using transformation-based learning, in: ACL Third Workshop on Very Large Corpora, 1995, pp. 82–94. doi:10.48550/arXiv.cmp-lg/9505040.

[25] Honnibal, I. Montani, spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing, 2017. To appear.

[26] F. Z. Smaili, Gao, Hoehndorf, Onto2Vec: joint vector-based representation of biological entities and their ontology-based annotations, Bioinformatics 34 (2018) i52–i60. URL: https://doi.org/10.1093/bioinformatics/bty259. doi:10.1093/bioinformatics/bty259.

[27] Arp, Smith, A. D. Spear, Building ontologies with Basic Formal Ontology, The MIT Press, Cambridge, Massachusetts; London, England, 2015; 2016.

[28] Hastings, et al., Chebi in 2016: Improved services and an expanding collection of metabolites, Nucleic acids research 44 (2016) D1214-9. URL: https://europepmc.org/articles/PMC4702775. doi:10.1093/nar/gkv1031.

[29] Bakken, Cowell, B. D. Aevermann, Novotny, Hodge, J. A. Miller, Lee, I. Chang, McCorrison, Pulendran, et al., Cell type discovery and representation in the era of high-content single cell phenotyping, BMC bioinformatics 18 (2017) 7–16.

[30] R. P. Huntley, M. A. Harris, Alam-Faruque, J. A. Blake, Carbon, Dietze, E. C. Dimmer, R. E. Foulger, D. P. Hill, V. K. Khodiyar, et al., A method for increasing expressivity of gene ontology annotations using a compositional approach, BMC bioinformatics 15 (2014) 1–11.

[31] C. J. Mungall, Torniai, G. V. Gkoutos, S. E. Lewis, M. A. Haendel, Uberon, an integrative multi-species anatomy ontology, Genome biology 13 (2012) 1–20.

[32] Devlin, M.- Chang, Lee, Toutanova, in: Proceedings of the 2019 Conference of the North, Association for Computational Linguistics, 2019. URL: https://doi.org/10.18653/v1/n19-1423. doi:10.18653/v1/n19-1423.

[33] T. C. Rajapakse, Simple transformers, https://github.com/ThilinaRajapakse/simpletransformers, 2019.

[34] Lee, Yoon, Kim, C. H. So, Kang, Biobert github repository, 2019. (https://github.com/dmis-lab/biobert).