
C-STSS: A Context-based Short Text Semantic Similarity approach applied to biomedical named entity linking

Asma Djellal (0, 1), Maya Souilah Benabdelhafid (0, 1)

(0) Ecole Supérieure de Comptabilité et de Finance (ESCF), Constantine, Algeria
(1) LIRE Laboratory, Abdelhamide Mehri Constantine 2 University, Constantine, Algeria

This research paper delves into Human-Computer Interaction by investigating Knowledge Graph-based Question Answering systems in the biomedical domain. The study leverages Knowledge Graphs as potent tools to enhance Named Entity Linking in short texts, where limited context poses challenges. Conventional linking methods struggle with linking a single Named Entity because of poor context and name variation, which affects their efficiency. To address these challenges, several scholars are designing Knowledge Graph-based Question Answering Systems that tackle the name variation problem by relying on the morphological forms of Named Entities, but they rarely consider their semantic similarities. This paper introduces a Context-based Short Text Semantic Similarity approach for Named Entity Linking in the biomedical domain. The proposed approach improves the performance of Question Answering systems by utilizing contextual semantic similarities in short texts and by combining knowledge-based and corpus-based methods for fine-grained meaning comparison, which allows it to address sparseness and vocabulary mismatches. This combination is the paper's main contribution.

Keywords: Question Answering Systems, Natural Language Processing, Biomedical Named Entity Linking, Contextual Semantic Similarities, Short Text

1. Introduction

In the ever-evolving landscape of Natural Language Processing (NLP), the challenge of deciphering the nuances of short texts, particularly within specialized domains like biomedicine, has emerged as a critical area of research. Short texts, encompassing brief queries and questions, lack the extensive context often found in longer texts, posing formidable obstacles for accurate Named Entity Linking (NEL), which is a key part of developing Question Answering Systems (QAS) [1]. The core difficulty lies in disambiguating Named Entities (NE) [2], especially those sharing similar surface forms [3], and capturing subtle semantic differences essential for accurate NEL.

To address these challenges, this paper pioneers a novel approach that considers fine-grained meaning comparison by integrating knowledge-based and corpus-based methods [2]. Corpus-based methods leverage contextual information from textual data to compute general semantic relatedness between words. Meanwhile, knowledge-based methods draw upon the wealth of semantic information stored in resources like Knowledge Graphs (KG). By integrating these approaches, the study aims to overcome the sparseness and vocabulary mismatches inherent in short texts.

This paper introduces the Context-based Short Text Semantic Similarity (C-STSS) approach, a framework that aims to bridge the gap between the limited context of short biomedical texts and the rich semantic knowledge encompassed within specialized domains. By dissecting semantic similarities and leveraging domain-specific knowledge, C-STSS provides nuanced analysis, facilitating accurate NEL even in the face of sparse and mismatched vocabulary. This approach holds the promise of improving NEL within the constraints of short texts, opening new avenues for exploration at the intersection of NLP and Human-Computer Interaction (HCI).

The remainder of this paper is organized as follows. Section 2 outlines some preliminaries related to the research work. Section 3 reviews related works and analyses the drawbacks of recent biomedical NEL systems. Section 4 constitutes the bulk of the paper and presents C-STSS, our proposed approach for dealing with the NEL problem in short biomedical text. Section 5 concludes the paper and suggests directions for future work.

2. Preliminaries

[...] user queries and extracting answers by matching and reasoning in KG. For instance, to answer the question "Who is Apple CEO?" (see Figure 1), these systems tackle challenges like:

1. Named Entity Recognition (NER) identifies fragments mentioning NE in text. In the above question, the mention "Apple" is identified as a NE.
2. Named Entity Disambiguation (NED) seeks, for each NE, its corresponding meaning over a given KG, e.g., Wikidata. In our case, "Apple" can be linked to several Wikidata entries with different QIDs, e.g., Q89 (apple, the fruit) or Q312 (Apple Inc., the company).
3. Named Entity Linking (NEL) links each NE to its exact meaning over a KG, e.g., IRIs in Wikidata, based on the surrounding context. According to the question, "Apple" has to be disambiguated as "Apple Inc., the company" with the ID Q312. Therefore, the NEL task has to link it to its IRI https://www.wikidata.org/wiki/Q312.

It is important to note that this paper aligns with the prevailing research trend, employing the term NEL to encompass both the Disambiguation and Linking tasks, a convention adopted by several state-of-the-art approaches. Throughout the remainder of this paper, the primary focus is specifically on the NEL task, rather than the complete QAS. For in-depth technical insights into the NER task, interested readers are referred to the comprehensive surveys [8, 9]. The NEL process generally involves two steps:

• Retrieving Candidate Entities: The first step entails retrieving a set of candidate entities from the KG that the recognized NE may refer to. Various techniques are employed, including name dictionary-based methods [10], surface form expansion [11], and semantic relationships [12]. These methods rely primarily on string comparisons between the NE and the candidates, generating a set of potential entities. For example, "Apple" might be mapped to candidates like Q89 and Q312 in Wikidata (see Figure 1).
• Selecting the Correct Candidate: Given that a NE can often refer to a large number of candidate entities [13], the challenge lies in selecting the most relevant one. This step requires ranking the candidate entities based on the surrounding context and selecting the highest-scored candidate that best fits the meaning of the given NE. For instance, if "Apple" may refer to both the fruit and the company, then, according to the context, the correct candidate "Apple Inc." needs to be selected.

3. Related Work

In recent years, the focus of NLP research has extended from the general language domain to the biomedical field, driven by Biomedical NLP (BioNLP) shared tasks and the increasing application of BioNLP tools in areas like clinical research and quality improvement [14, 15, 16, 17]. More particularly, Biomedical QA (BioQA) systems have been introduced to enable innovative applications to effectively perceive, access, and understand complex biomedical knowledge [18]. On the one hand, we find for instance cTAKES [19], TaggerOne [20], and QuickUMLS [21], which are commonly used as rule-based, knowledge-intensive concept normalization tools. These solutions use rules to generate lexical variants for each noun phrase and then perform dictionary queries for each variant. Although they provide robust performance, they implicitly assume the availability of concept aliases in the target language and focus on normalizing mentions and recognizing NE without effectively linking them [22].

Despite these developments, BioQA systems are still immature and rarely used in real-life settings. Current research often emphasizes morphological and string similarities of NE, neglecting their semantic similarities. NEL approaches are being introduced to map various expressions, terms, or abbreviations to their corresponding common semantic representation or concept identifier in a given terminology or vocabulary. Biomedical language models are being explored to improve entity-linking strategies and to achieve automatic term mapping, and some effective approaches for English corpora have been proposed. For instance, in [23], the authors have proposed a collective inference approach, which leverages semantic information and structures in ontology to solve the NEL problem for biomedical literature. Also, in [24], scholars have proposed a graph-based linking approach which starts by constructing graphs for mentions, KG, and candidates and then exploits information entropy and a similarity algorithm to perform NEL. Like our approach, these contributions depend on the context and the KG. In addition, scholars in [25] have proposed LATTE, a LATent Type Entity linking model, leveraging latent semantic information to improve entity linking, while the authors in [26] have used semantic type information for improved entity disambiguation.

Unlike the above works, for which no benchmark has been developed to evaluate how well language models represent biomedical concepts according to their corresponding context, the authors in [27] propose a novel dataset, BioWiC, to evaluate the ability of language models to encode biomedical terms in context. Another research direction is to use, for example, BERT-based retrieve and re-rank models [28]. For instance, in [29], scholars have improved biomedical pretrained language models with knowledge.

Note that NEL is particularly challenging in short texts, such as questions, where limited contextual information hampers conventional linking methods. Addressing these challenges, this paper introduces the C-STSS approach, designed to enhance the performance of biomedical NEL systems dealing with short texts through contextual semantic similarities.

4. C-STSS Approach for Biomedical NEL

The C-STSS approach involves four main sub-processes (see Figure 2). First, the Pre-Process verifies and prepares the input question and recognizes the involved NE. Then, the Expansion generates the NE context by expanding the input question. Thereafter, Candidate Generation retrieves all NE candidates from DBpedia. Finally, the Ranking sub-process uses semantic similarities to score candidates based on the generated context, and then links the NE to the highest-scored candidate. This process frames the NEL task as a ranking problem and is detailed in the following sections.

4.1. Pre-Process

The pre-processing step is vital as it significantly influences the outcome of the linking process, ensuring that the input question is refined and suitable for subsequent analysis. In the Pre-Process stage, the input question is subjected to critical transformations. After verifying the question's structure for grammatical or spelling errors, cleaning and normalization are performed to remove unnecessary or noisy words. This involves employing NLP techniques such as tokenization [2] and stop-word removal [30], focusing on retaining only nouns, verbs, and adjectives. Finally, cTAKES [19], an open-source NLP tool, is utilized in order to recognize the involved NE.

It is noteworthy that, due to the brevity of questions, words from the entity mention are included in the context window, especially if the entity consists of two or more words. For instance, in the case of the NE "Malignant tumor", contextual words like "Malignant" and "tumor" are extracted as they contain meaningful common nouns.

In a biomedical scenario, a sample question could be: "How can Cancer be prevented and detected?". Given this question as input, the pre-process outputs the set of recognized NE and a set of words W.

Input: "How can Cancer be prevented and detected?"
Output:
• A set of words W = {Cancer, prevented, detected}
• A set of NE = {Cancer}

4.2. Expansion

The Expansion module aims to enhance the contextual semantic similarity measurement, particularly in short texts. In such cases, traditional entity-entity relatedness approaches become ineffective due to the lack of context, and vocabulary mismatch further complicates the measurement of similarity between candidate descriptions and context. To overcome these challenges, the Expansion module enriches and expands the question with semantically related words. Initially, a stemming algorithm is used to reduce each word w ∈ W to its root or stem in order to ensure a consistent comparison [31]. Then, it enriches the stemmed words by incorporating their synonyms using WordNet [32] as a background KG.

Consequently, this module enables a more comprehensive analysis of the semantic similarities between the recognized NE and its candidates by allowing:

• Lexical comparison: Words of the same family that share the same stem, e.g., "prevention" and "prevented", can be compared. These words, although slightly varied, are semantically related.
• Semantic comparison: Words with different lexical forms but similar meanings, namely synonyms, e.g., "prevented" and "avoided", bridge the vocabulary gap in short texts. Synonyms, despite their different surface forms, are strongly semantically related.

At the end of the Expansion, the context window is enriched with additional related words. Following our biomedical scenario, the set of words W is enriched, resulting in the context Ctx represented in Table 1.

Input: A set of words W = {Cancer, prevented, detected}
Output: A set of contextual words Ctx

4.3. Candidate Generation

The Candidate Generation module focuses on retrieving potential candidate entities to which the NE can refer within DBpedia, a central KG comprising over 228 million entities from Wikipedia and Wikidata. The process begins with a simple string comparison to identify candidates whose names match the NE. However, dealing with name variations is a considerable challenge in the biomedical field [33]. This variation is so extensive that a single entity can have multiple names. For instance, "decreases in hemoglobin" could refer to at least four different entities in MedDRA, which all look alike: "changes in hemoglobin", "increase in hematocrit", "hemoglobin decreased", and "decreases in platelets". To address the challenge of name variation, Candidate Generation employs several techniques:

• Exact String Match: Candidates sharing the exact string name with the NE are considered.
• Abbreviations/Acronyms: Biomedical dictionaries are utilized to handle abbreviations and acronyms common in the biomedical domain.
• Numbers: Variations in writing numbers (Arabic, Roman, or spelled out in English) are normalized for consistency.
• Adjectives: Multiple adjectives associated with a single noun through connectors like "and", "/", or "or" are separated and considered individually.
• Tokenization: Biomedical terms composed of multiple tokens connected by hyphens require dehyphenation for proper token sequence generation.

These techniques are elaborated in Table 2, which provides an example for each case.

Note that exact string matches can be retrieved using DBpedia's disambiguation pages. If multiple DBpedia entries share the same name, a disambiguation page is created to differentiate them. For that, we generate a SPARQL query, specifying the NamedEntity (disambiguation) page and the property wikiPageDisambiguates, to retrieve all links listed on this page and add them to the set of candidates.

From the previous biomedical scenario, we retrieve the set of candidates C = {...} having an exact string match with NE = Cancer by executing the SPARQL query presented in the following listing over DBpedia. The result is shown in Figure 3.

4.4. Ranking

The Ranking module holds immense importance in the NEL process as it discerns the most suitable candidate for the NE based on the question context. When provided with a context Ctx and a set of candidates C, this module uses a ranking algorithm to compute different contextual semantic similarities for each candidate. These contextual semantic similarities measure how closely the candidate aligns with the context. To this end, the algorithm computes the contextual semantic similarities according to equation (1). The candidate with the highest score is identified as the correct meaning of the NE. It is essential to highlight that the similarity between each candidate and the context is measured over its description in DBpedia.

$$\mathrm{Score}(c) = \sum \mathrm{Sim}(Ctx, c) \qquad (1)$$

Here, Sim is a semantic similarity function. For each c ∈ C, the following semantic similarities are computed:

Textual similarity: Given the context Ctx and a candidate c ∈ C, we create two vectors representing their textual content: the candidate description vector, noted as V_c, and the contextual words vector, noted as V_Ctx. It should be noted that lemmatization is applied to candidate descriptions, omitting stop words as well as very frequent and very rare words. We employ a standard Vector Space Model with a TF-IDF weighting scheme for representing both vectors: V = (w_1, w_2, ..., w_n), where each dimension w_i of V corresponds to a word weight and is defined as:

$$w_i = TF(t_i, d_c) \times IDF(t_i) \qquad (2)$$

where TF(t_i, d_c) is the Term-Frequency function and denotes the frequency of the contextual word t_i in the candidate description d_c. It assesses the significance of the contextual word within the candidate's description. IDF(t_i), the Inverse Document Frequency, is derived from the number of candidates whose descriptions contain the contextual word t_i. In order to account for words that are overly frequent in candidate descriptions, IDF(t_i) is employed to assign lower weights to these less distinguishing words.

Hence, in order to compute the textual similarity between the two vectors V_Ctx = (w_1, w_2, ..., w_n) and V_c = (w'_1, w'_2, ..., w'_n), the cosine method is applied. This method calculates the cosine of the angle between these two vectors. It is defined as:

$$\mathrm{TextSim}(Ctx, c) = \cos(V_{Ctx}, V_c) = \frac{\sum_{i} w_i\, w'_i}{\sqrt{\sum_{i} w_i^2}\,\sqrt{\sum_{i} {w'_i}^2}} \qquad (3)$$

The primary challenge with using cosine similarity in advanced models lies in vocabulary mismatch. Cosine similarity essentially measures the correlation between the words of two textual vectors [2]. Consequently, this method fails to measure similarity when the vectors do not share identical words. Even if there are semantically related words, they are not taken into account. To overcome this drawback, we opt for knowledge-based methods to expand the input question with all semantically relevant words when generating its context. This overcomes issues such as sparseness and vocabulary mismatch while assessing textual similarity.

Candidate Popularity: Measuring the popularity of entities is a crucial factor in determining their relevance to a given NE. According to [13], a simple linking method based solely on candidate popularity can achieve 71% accuracy. It is essential to note that certain candidates are exceptionally rare compared to others. For instance, consider NE = Cancer; while "Cancer (film)" might be a rare occurrence, "Cancer (astrology)" might be more common, with "Cancer (disease)" being the most popular entity. This observation can be formalized by analyzing candidates' incoming and outgoing links within DBpedia. The candidate popularity function, denoted as P(c), is defined as follows:

$$\mathrm{Pop}(c) = P(c) = \frac{L(c)}{\sum_{c_j \in C} L(c_j)} \qquad (4)$$

Here, L(c) represents the number of links pointing to the candidate c in DBpedia.

Word co-occurrence: In state-of-the-art systems, the co-occurrence feature traditionally signifies the simultaneous appearance of a set of NE within the same text, allowing them to be collectively linked. Regrettably, this approach faces limitations when applied to short texts, where the presence of multiple NE is rare. Despite that, we adapted the co-occurrence concept to measure the contextual relevance between the NE and a given candidate. In our methodology, this feature is redefined as:

"The appearance of several contextual words within a given candidate description"

Obviously, the more distinct contextual words are found within the candidate description, the more closely it aligns with the NE context. To quantify this similarity, given the NE context Ctx and a candidate c, we examine two sets of words: the set of contextual words, denoted as W_Ctx, and the set of candidate description words, denoted as W_c.

The word co-occurrence similarity function is defined as follows:

$$\mathrm{CoOcc}(Ctx, c) = \text{Co-occurrence}(W_{Ctx}, W_c) = \frac{\sum_{w \in W_{Ctx}} \mathbb{1}[w \in W_c]}{|W_{Ctx}|} \qquad (5)$$

Here, the sum signifies the count of contextual words contained within W_c. This refined definition offers a nuanced understanding of word co-occurrence, enhancing the precision of context relevance measurements.

The details provided above are condensed into the subsequent algorithm, which outlines our C-STSS approach for biomedical NEL. Given an input question, the C-STSS process employs the NER function to recognize the involved NE, generates the context using the Context function, and retrieves all potential candidates from DBpedia by employing the Candidates function. These candidates are selected based on the five cases explained earlier. The C-STSS algorithm furthermore incorporates functions in order to identify the most relevant candidate: Lemmatization is applied to omit stop words as well as very frequent and very rare words from the context and the candidate descriptions. The Words function retrieves feature words for the context and the candidate, shaping the subsequent analysis. The Frequency function uses a TF-IDF weighting scheme for representing the context and candidate vectors, ensuring a robust representation of the textual data.

Algorithm 1: C-STSS approach for biomedical NEL

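Require: Question Q
Ensure: c ∈ C having the highest score
    NE ← NER(Q)
    Ctx ← Context(Q)
    C ← Candidates(NE)
    best ← ∅
    maxScore ← 0
    minSD ← +∞
    Ctx ← Lemmatization(Ctx)
    W_Ctx ← Words(Ctx)
    V_Ctx ← Frequency(W_Ctx)
    total ← 0
    for each c ∈ C do
        total ← total + L(c)
    end for
    for each c ∈ C do
        d_c ← Lemmatization(c)
        V_c ← Frequency(d_c)
        textSim ← Cosine(V_Ctx, V_c)
        links ← L(c)
        pop ← links / total
        W_c ← Words(d_c)
        count ← 0
        for each w ∈ W_Ctx do
            if w ∈ W_c then
                count ← count + 1
            end if
        end for
        coOcc ← count / |W_Ctx|
        score ← Σ(textSim, pop, coOcc)
        sd ← sqrt((1/3) Σ_{i=1}^{3} (sim_i − mean)²)
        if (score > maxScore and sd < minSD) then
            maxScore ← score
            minSD ← sd
            best ← c
        end if
    end for
    Return (best)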
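As a rough illustration, the scoring loop of Algorithm 1 could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`tfidf_vector`, `cosine`, `rank_candidates`), the smoothed IDF formula, and the toy descriptions and link counts are all hypothetical, and the input tokens are assumed to be already lemmatized and filtered.

```python
import math
from collections import Counter


def tfidf_vector(terms, tokens, descriptions):
    """TF-IDF weight of each term in `tokens`; IDF is computed over candidate descriptions."""
    counts = Counter(tokens)
    n = len(descriptions)
    vector = []
    for term in terms:
        df = sum(1 for d in descriptions if term in d)
        idf = math.log((1 + n) / (1 + df)) + 1.0  # smoothed IDF: an assumption
        vector.append(counts[term] * idf)
    return vector


def cosine(u, v):
    """Cosine of the angle between two equally sized vectors (0.0 if either is null)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(context, candidates):
    """Return (best candidate, {candidate: (score, standard deviation)}).

    `candidates` maps a name to (description tokens, number of links).
    """
    terms = sorted(set(context))
    if not terms:
        return None, {}
    descriptions = [set(tokens) for tokens, _ in candidates.values()]
    total_links = sum(links for _, links in candidates.values())
    context_vector = tfidf_vector(terms, context, descriptions)

    best, max_score, min_sd = None, 0.0, math.inf
    results = {}
    for name, (tokens, links) in candidates.items():
        text_sim = cosine(context_vector, tfidf_vector(terms, tokens, descriptions))
        popularity = links / total_links if total_links else 0.0
        co_occurrence = sum(1 for t in terms if t in tokens) / len(terms)
        sims = (text_sim, popularity, co_occurrence)
        score = sum(sims)
        mean = score / len(sims)
        sd = math.sqrt(sum((s - mean) ** 2 for s in sims) / len(sims))
        results[name] = (score, sd)
        # Acceptance test as read from Algorithm 1: higher score AND lower deviation.
        if score > max_score and sd < min_sd:
            best, max_score, min_sd = name, score, sd
    return best, results


# Toy, already lemmatized data (hypothetical descriptions and link counts).
context = ["cancer", "prevent", "detect", "avoid"]
candidates = {
    "Cancer (disease)": (["cancer", "disease", "prevent", "detect", "treatment"], 900),
    "Cancer (astrology)": (["cancer", "zodiac", "sign", "astrology"], 90),
    "Cancer (film)": (["cancer", "film", "drama"], 10),
}
best, results = rank_candidates(context, candidates)
print(best)  # Cancer (disease)
```

With these toy inputs, the disease sense dominates on all three measures, so it is selected. Note that the conjunctive test means a later candidate replaces the current best only if it improves both the score and the standard deviation.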

To assess the similarity between words in the context and those in the candidate descriptions, the three semantic similarity metrics described above (textual similarity, candidate popularity, and word co-occurrence) are calculated and combined to score each candidate.

This similarity computation is iteratively applied to all candidate entities in order to score them. The candidate with the highest score and the lowest standard deviation is returned as the correct one.
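The upstream Expansion and Candidate Generation steps (Sections 4.2 and 4.3), which produce the inputs of this ranking, could be sketched along the following lines. Everything here is an illustrative assumption: `crude_stem` is a naive suffix stripper standing in for a real stemming algorithm, `TOY_SYNONYMS` is a hand-written table standing in for WordNet, and `disambiguation_query` only builds a query string for an assumed `<Name>_(disambiguation)` page naming convention, without contacting DBpedia.

```python
# Hypothetical stand-ins: a real system would use a proper stemmer and WordNet.
TOY_SYNONYMS = {
    "prevent": {"avoid", "forestall"},
    "detect": {"discover", "find"},
}
SUFFIXES = ("ion", "ed", "ing", "s")


def crude_stem(word):
    """Lowercase a word and strip one common suffix, keeping at least four letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def expand(words):
    """Build the context: the stem of each word plus the synonyms of that stem."""
    context = set()
    for word in words:
        stem = crude_stem(word)
        context.add(stem)
        context |= TOY_SYNONYMS.get(stem, set())
    return context


def disambiguation_query(named_entity):
    """SPARQL query listing the entries of the NE's disambiguation page."""
    page = named_entity.strip().replace(" ", "_") + "_(disambiguation)"
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?candidate WHERE {\n"
        f"  <http://dbpedia.org/resource/{page}> dbo:wikiPageDisambiguates ?candidate .\n"
        "}"
    )


context = expand(["Cancer", "prevented", "detected"])
query = disambiguation_query("Cancer")
print(sorted(context))
print(query)
```

Because both "prevention" and "prevented" reduce to the same stem in this sketch, they become comparable (lexical comparison), while the synonym table adds words such as "avoid" that share no surface form with the question (semantic comparison).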

4.5. Discussion

While various scholars focus on addressing the name variation problem in BioQA by considering morphological forms of biomedical NE, few incorporate semantic similarities. The C-STSS approach combines NE morphological forms and contextual semantic similarities. To further enhance its efficacy, our approach integrates knowledge-based methods with corpus-based ones, alleviating issues related to sparseness and vocabulary mismatch. This fusion of techniques forms the core innovation of this research.

To conclude, it is now well established that biomedical text requires methods targeted at the domain. Developments in Deep Learning and a series of successful shared challenges have contributed to steady progress in BioNLP techniques. Contributing to this ongoing progress and focusing particularly on computational methods, our future work will aim to develop and encourage research on novel approaches for analyzing biomedical text, more particularly on transformer-based models, which seem to be the future of NLP as explained in recent surveys [34, 35, 36, 37, 38].

5. Conclusion

In recent years, KG have undergone substantial growth in both theoretical frameworks and practical applications. Despite these advancements, KG-based Question Answering Systems (KGQAS) encounter persistent challenges. They face limitations due to historical precedents and excessive human intervention, necessitating innovative solutions.

Within the intricate domain of biomedicine, additional complexities emerge. Indeed, NEL in the medical domain is a newer problem. This paper presents a Context-based Short Text Semantic Similarity approach, designed to enhance biomedical NEL systems by exploiting contextual [...]

References

[1] Dimitrakis, Sgontzos, Tzitzikas, A survey on question answering systems over linked data and documents, Journal of intelligent information systems 55 (2020) 233–259.
[2] Zhu, C. A. Iglesias, Exploiting semantic similarity for named entity disambiguation in knowledge graphs, Expert Systems with Applications 101 (2018) 8–24.
[3] Navigli, Word sense disambiguation: A survey, ACM computing surveys (CSUR) 41 (2009) 1–69.
[4] Auer, Bizer, G. Kobilarov, Lehmann, Cyganiak, Ives, Dbpedia: A nucleus for a web of open data, in: international semantic web conference, Springer, 2007, pp. 722–735.
[5] Bollacker, Evans, Paritosh, Sturge, Taylor, Freebase: a collaboratively created graph database for structuring human knowledge, in: Proceedings of the 2008 ACM SIGMOD international conference on Management of data, 2008, pp. 1247–1250.
[6] Vrandečić, Wikidata: A new platform for collaborative data collection, in: Proceedings of the 21st international conference on world wide web, 2012, pp. 1063–1064.
[7] M. R. A. H. Rony, Chaudhuri, Usbeck, Lehmann, Tree-kgqa: an unsupervised approach for question answering over knowledge graphs, IEEE Access 10 (2022) 50467–50478.
[8] Al-Moslmi, M. G. Ocaña, A. L. Opdahl, Veres, Named entity extraction for knowledge graphs: A literature overview, IEEE Access 8 (2020) 32862–32881.
[9] V. Yadav, S. Bethard, A survey on recent advances in named entity recognition from deep learning models, arXiv preprint arXiv:1910.11470 (2019).
[10] Z. Yang, H. Lin, Y. Li, Exploiting the performance of dictionary-based bio-entity name recognition in biomedical literature, Computational biology and chemistry 32 (2008) 287–291.
[11] A. Reshamwala, D. Mishra, P. Pawar, Review on natural language processing, IRACST Engineering Science and Technology: An International Journal (ESTIJ) 3 (2013) 113–116.
[12] R. Meymandpour, J. G. Davis, A semantic similarity measure for linked data: An information content-based approach, Knowledge-Based Systems 109 (2016) 276–293.
[13] W. Shen, J. Wang, J. Han, Entity linking with a knowledge base: Issues, techniques, and solutions, IEEE Transactions on Knowledge and Data Engineering 27 (2014) 443–460.
[14] G. Frisoni, G. Moro, A. Carbonaro, A survey on event extraction for natural language understanding: Riding the biomedical literature wave, IEEE Access 9 (2021) 160721–160757.
[15] T. A. Koleck, C. Dreisbach, P. E. Bourne, S. Bakken, Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review, Journal of the American Medical Informatics Association 26 (2019) 364–379.
[16] I. J. B. Young, S. Luz, N. Lone, A systematic review of natural language processing for classification tasks in the field of incident reporting and adverse event analysis, International journal of medical informatics 132 (2019) 103971.
[17] E. French, B. T. McInnes, An overview of biomedical entity linking throughout the years, Journal of biomedical informatics 137 (2023) 104252.
[18] Q. Jin, Z. Yuan, G. Xiong, Q. Yu, H. Ying, C. Tan, M. Chen, S. Huang, X. Liu, S. Yu, Biomedical question answering: a survey of approaches and challenges, ACM Computing Surveys (CSUR) 55 (2022) 1–36.
[19] G. K. Savova, J. J. Masanz, P. V. Ogren, J. Zheng, S. Sohn, K. C. Kipper-Schuler, C. G. Chute, Mayo clinical text analysis and knowledge extraction system (ctakes): architecture, component evaluation and applications, Journal of the American Medical Informatics Association 17 (2010) 507–513.
[20] R. Leaman, Z. Lu, Taggerone: joint named entity recognition and normalization with semi-markov models, Bioinformatics 32 (2016) 2839–2846.
[21] L. Soldaini, N. Goharian, Quickumls: a fast, unsupervised approach for medical concept extraction, in: MedIR workshop, sigir, 2016, pp. 1–4.
[22] R. Bhowmik, K. Stratos, G. de Melo, Fast and effective biomedical entity linking using a dual encoder, arXiv preprint arXiv:2103.05028 (2021).
[23] J. G. Zheng, D. Howsmon, B. Zhang, J. Hahn, D. McGuinness, J. Hendler, H. Ji, Entity linking for biomedical literature, BMC medical informatics and decision making 15 (2015) 1–9.
[24] H. Wang, J. G. Zheng, X. Ma, P. Fox, H. Ji, Language and domain independent entity linking with quantified collective validation, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015, pp. 695–704.
[25] M. Zhu, B. Celikkaya, P. Bhatia, C. K. Reddy, Latte: Latent type modeling for biomedical entity linking, in: Proceedings of the AAAI conference on artificial intelligence, volume 34, 2020, pp. 9757–9764.
[26] S. Vashishth, D. Newman-Griffis, R. Joshi, R. Dutt, C. P. Rosé, Improving broad-coverage medical entity linking with semantic type prediction and large-scale datasets, Journal of biomedical informatics 121 (2021) 103880.
[27] H. Rouhizadeh, I. Nikishina, A. Yazdani, A. Bornet, B. Zhang, J. Ehrsam, C. Gaudet-Blavignac, N. Naderi, D. Teodoro, Biowic: An evaluation benchmark for biomedical concept representation, bioRxiv (2023) 2023–11.
[28] Y. He, Z. Zhu, Y. Zhang, Q. Chen, J. Caverlee, Infusing disease knowledge into bert for health question answering, medical inference and disease name recognition, arXiv preprint arXiv:2010.03746 (2020).
[29] Z. Yuan, Y. Liu, C. Tan, S. Huang, F. Huang, Improving biomedical pretrained language models with knowledge, arXiv preprint arXiv:2104.10344 (2021).
[30] Z. Xu, X. Luo, S. Zhang, X. Wei, L. Mei, C. Hu, Mining temporal explicit and implicit semantic relations between entities using web search engines, Future Generation Computer Systems 37 (2014) 468–477.
[31] C. Ramasubramanian, R. Ramya, Effective pre-processing activities in text mining using improved porter's stemming algorithm, International Journal of Advanced Research in Computer and Communication Engineering 2 (2013) 4536–4538.
[32] C. Fellbaum, WordNet: An electronic lexical database, MIT press, 1998.
[33] L. Chen, G. Varoquaux, F. M. Suchanek, A lightweight neural model for biomedical entity linking, in: Proceedings of the AAAI conference on artificial intelligence, volume 35, 2021, pp. 12657–12665.
[34] L. Cai, J. Li, H. Lv, W. Liu, H. Niu, Z. Wang, Incorporating domain knowledge for biomedical text analysis into deep learning: A survey, Journal of Biomedical Informatics (2023) 104418.
[35] K. S. Kalyan, A. Rajasekharan, S. Sangeetha, Ammu: a survey of transformer-based biomedical pretrained language models, Journal of biomedical informatics 126 (2022) 103982.
[36] S. Islam, H. Elmekki, A. Elsebai, J. Bentahar, N. Drawel, G. Rjoub, W. Pedrycz, A comprehensive survey on applications of transformers for deep learning tasks, Expert Systems with Applications (2023) 122666.
[37] K. Hall, V. Chang, C. Jayne, A review on natural language processing models for covid-19 research, Healthcare Analytics (2022) 100078.
[38] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you need, Advances in neural information processing systems 30 (2017).