1. Introduction

CLEF 2024 SimpleText Tasks 1-3: Use of Llama-2 for Text Simplification⋆

Rowan Mann

Tomislav Mikulandric

0 0 The University of Split , Ul. Ruđera Boškovića 31, 21000, Split , Croatia

In an era defined by the vast availability of information, the challenge of discerning reliable information is more pressing than ever. Our paper presents findings from Tasks 1, 2, and 3 of the SimpleText track at the 15th Conference and Labs of the Evaluation Forum (CLEF) 2024, aimed at advancing research in automatic simplification of scientific texts using LLaMA-2. Task 1 involved selecting relevant passages for simplified summaries, leveraging ElasticSearch and TF-IDF with cosine similarity for evaluating relevance. We achieved an average Flesch-Kincaid grade level of 0.6, indicating a moderate complexity suitable for further simplification. Task 2 focused on identifying and explaining dificult concepts. Using the LLaMA-2 13B model, we extracted and rated the dificulty of scientific terms, generating explanations for the most challenging ones. However, reliance on Wikipedia for definitions proved inconsistent, highlighting a limitation in our methodology. Task 3 addressed the simplification of scientific abstracts and sentences. We utilized LLaMA-2 to generate simplified versions, efectively maintaining the original meaning while reducing complexity and length. Human validation confirmed the preservation of essential content in the simplified texts. Our research demonstrates the eficacy of LLaMA-2 for text simplification tasks, albeit with noted challenges in obtaining reliable definitions from external sources like Wikipedia. These findings contribute to the broader goal of enhancing scientific literacy through accessible information.

eol>LLMs text simplification LLaMA-2

1. Introduction

• Task 1: What is in (or out)? Selecting passages to include in a simplified summary. • Task 2: What is unclear? Dificult concept identification and explanation • Task 3: Rewrite this! Given a query, simplify passages from scientific abstracts.

2. Task 1: Experimental Setup 2.1. Data Description

The data provided to us by CLEF consisted of 2 folders, “corpus” and “topics qrels”. The corpus includes a new vector database with sentence embedding scores and retains the previously released ElasticSearch Index for field-specific document searches. The ElasticSearch Index allows querying fields such as id, abstract, authors, title, year, and doi from the DBLP dump and is suitable for various applications like passage retrieval, Latent Dirichlet Allocation models, and training Graph Neural Networks. The Vector Database stores each article’s id and sentence-embedding vectors from their title and abstract, excluding articles with empty or very short abstracts, supporting longer queries enabled by sentence embedding.

The SimpleText 2024 Task 1 Corpus includes topics defined by articles from The Guardian’s tech section (G01 to G20) and Tech Xplore (T01 to T20), with URLs and textual content provided for participant use. Queries associated with each topic, manually verified for relevance, enable retrieval of relevant DBLP passages. This edition introduces new queries for The Guardian articles, generated by ChatGPT 4.0, focusing on specific sub-topics and provided in CSV and JSON formats. The Simpletext 2024 task1 train.qrels file ofers quality relevance judgments on a 0-2 scale for abstracts, incorporating data from previous editions and new judgments for topics G01-G15, excluding articles with nearly empty abstracts to ensure consistency with the new vector database.

2.2. Method

We created an ElasticSearch function “query elasticsearch” to query the ElasticSearch database. It took two parameters: query (the search query) and size (the number of results to return, defaulting to 100). The function sent a GET request to the ElasticSearch URL with the specified query and size and returned the search results in JSON format. # F u n c t i o n t o q u e r y t h e E l a s t i c S e a r c h d e f q u e r y _ e l a s t i c s e a r c h ( q u e r y , s i z e = 1 0 0 ) : r e s p o n s e = r e q u e s t s . g e t ( f " { ES_URL } ? q = { q u e r y }& s i z e = { s i z e } " , a u t h = ( ’ i n e x ’ , ’ q a t c 2 0 1 1 ’ ) ) i f r e s p o n s e . s t a t u s _ c o d e == 2 0 0 :

r e t u r n r e s p o n s e . j s o n ( ) [ ’ h i t s ’ ] [ ’ h i t s ’ ] e l s e : p r i n t ( " F a i l e d t o f e t c h d a t a : " , r e s p o n s e . s t a t u s _ c o d e ) r e t u r n [ ]

We used the first five examples from the “simpletext 2024 task1 queries.json” file and went over every index. (Appendix A)

We created a function to calculate how relevant the abstracts retrieved were to our search. The function created a Text Frequency Inverse Document Frequency (TF IDF), for vectorizing the texts, which assessed relevancy of words with regards to our corpus, then calculated the cosine similarity of our vectorised words. This function could then return a relevance score (rel score)

To create the combined score, we calculated word dificulty based on the Flesch Kincaid grade level. The Flesch–Kincaid grade level is one of the formulas used for assessing reading-ease, scores indicate the grade a person would have to be in US education system to understand the text. d e f f l e s c h _ k i n c a i d _ g r a d e _ l e v e l ( t e x t ) : # C o n s t a n t s f o r t h e f o r m u l a ASL = a v e r a g e _ s e n t e n c e _ l e n g t h ( t e x t ) ASW = a v e r a g e _ s y l l a b l e s _ p e r _ w o r d ( t e x t ) # C a l c u l a t i n g t h e s c o r e s c o r e = 0 . 3 9 ∗ ASL + 1 1 . 8 ∗ ASW − 1 5 . 5 9 # N o r m a l i z e s c o r e t o r a n g e from 0 t o 1 n o r m a l i z e d _ s c o r e = n o r m a l i z e ( s c o r e , m i n _ s c o r e = 0 , m a x _ s c o r e = 2 5 ) # A d j u s t m a x _ s c o r e a s n e e d e d r e t u r n n o r m a l i z e d _ s c o r e

3. Task 1: Experimental Results

We analysed our success by using elastic search to select passages and calculated scores using FKGL and normalisation. The mean of these scores was close to 0.6 which meant that the texts were more complex than everyday speech and appropriate to be used for the next tasks.

4. Task 2: Experimental Setup 4.1. Data Description

The dataset for "Task 2: Identifying and Explaining Dificult Concepts" in the SimpleText Lab is divided into training and validation folders, each containing several tab-separated files. The training folder includes documents.tsv (576 rows, 115 documents), documents users.tsv (145 rows, document and expert IDs), terms.tsv (1,910 rows, terms, dificulty, expert ID), definitions explanations.tsv (1,046 rows, definitions, explanations, expert ID), and definitions generated.tsv (589 rows, automatically generated definitions). The validation folder contains definitions explanations.tsv (960 rows, definitions without explanations), definitions generated.tsv (932 rows, automatically generated definitions), and terms.tsv (680 rows, terms, dificulty). Initial annotations were performed by multiple experts, with a second round of validation by an external expert to identify additional terms and definitions. The dataset will later include test files for the evaluation phase.

The test dataset for "Task 2: Identifying and Explaining Dificult Concepts" in the SimpleText Lab includes several tab-separated files. The documents.tsv file contains 501 rows across 55 documents with columns for document ID, sentence ID, and sentence text. The terms.tsv and definitions explanations.tsv ifles, available after the evaluation phase, provide annotated sentence IDs, extracted terms, dificulty levels (easy, medium, dificult), user-provided definitions, and explanations. Finally, the definitions generated.tsv file contains 3,816 rows with unique definition IDs and the corresponding definitions to be ranked.

4.2. Method

We created a prompt for LLAMA-2 13B model that asked the LLM to iterate over each of our source sentences and extract three scientific terms from the phrase. p r o m p t _ t e r m s = " " "

You a r e a r o b o t t h a t ONLY o u t p u t s JSON .

You r e p l y i n JSON f o r m a t w i t h t h e f i e l d ’ t e r m s ’ .

You p r o v i d e ONLY s e m i c o l o n − s e p a r a t e d l i s t o f MAXIMUM 3

s c i e n t i f i c t e r m s o f a s o u r c e s e n t e n c e ONLY .

You DO NOT add ’ S u r e , Here a r e t h e s c i e n t i f i c t e r m s o f y o u r s e n t e n c e : ’ .

Example s o u r c e s e n t e n c e : I n t h e modern e r a o f a u t o m a t i o n and r o b o t i c s , \ autonomous v e h i c l e s a r e c u r r e n t l y t h e f o c u s o f a c a d e m i c and i n d u s t r i a l r e s e a r c h . ? \ Example a n s w e r : { ’ t e r m s ’ : ’ r o b o t i c s ; autonomous v e h i c l e s ’ } Now h e r e i s my s e n t e n c e : " " "

We used Regex to help us deal with regular expressions, removing unnecessary content in the outputs. (Appendix B)

The terms were then sorted into three rows, with duplicates removed, one term per row and we prompted Llama to give us a dificulty rating of easy, medium, or dificult for our terms. (Appendix C) We used wikipedia to return definitions for the dificult terms, with limited success. (Appendix D) We also asked the LLM to provide an explanation. When creating our prompt for our LLM, we gave it a few examples of correct return phrases, that were taken from the document provided. This was to improve the ability to achieve “few-shot” results. (Appendix E)

We then created a function to remove unnecessary text. (Appendix F)

Finally, we compiled our results in a JSON file, with those terms considered “d” for dificult, generating definitions. (Appendix G)

5. Task 2: Experimental Results

The LLM was successful in generating definitions for our dificult terms, but an issue we encountered was that Wikipedia was unsuccessful in generating definitions for our terms. Therefore, this certainly harms the appropriateness of this method as many of our definitions are missing.

6. Task 3: Experimental Setup 6.1. Method

We used LLAMA-2 13B once more, creating a larger context window of 4096. We gave the sentences to the LLM, asking it to simplify the texts. Again, we instructed the LLM to remove fluf words like “Sure!” etc. This gave us an additional column for our simplified sentences, simplified snt. (Appendix H)

Once again, it was important to remove unnecessary text therefore we created a function to carry out this task. (Appendix I)

7. Task 3: Experimental Results

Our results from the LLAMA 13B model for simplifying both the source abstracts and source sentences seems promising. Based on human validation of the simplified phrases, it’s seems clear that the meaning has been preserved while reducing the complexity of words and the length of the sentences.

8. Conclusion

Our research has shown LLAMA-2 18B to be an efective model for selecting and simplifying passages from scientific texts. However, we’ve also highlighted the unreliability of relying on wikipedia for the provision of definitions in this context.

Acknowledgments References

We’d like to extend our gratitude to the University of Brest for organising the Blended Intensive Programme (BIP) AI For Humanities. We would also like to thank Liana Ermakova for her teaching of the course and Caroline L’haridon for her support during our stay in Brest. .1. Appendix A d e f main ( ) : # Read q u e r i e s from JSON f i l e i n t o a d a t a f r a m e q u e r i e s = pd . r e a d _ j s o n ( ’ / c o n t e n t / d r i v e / MyDrive / B I P / S i m p l e T e x t / t a s k 1 / t a s k 1 / t o p i c s _ q r e l s / s i m p l e t e x t _ 2 0 2 4 _ t a s k 1 _ q u e r i e s . j s o n ’ ) q u e r i e s = q u e r i e s . h e a d ( 5 ) a l l _ r e s u l t s = [ ] f o r i n d e x , q u e r y _ r o w i n q u e r i e s . i t e r r o w s ( ) : q u e r y _ t e x t = q u e r y _ r o w [ ’ q u e r y _ t e x t ’ ] t o p i c _ i d = q u e r y _ r o w [ ’ t o p i c _ i d ’ ] q u e r y _ i d = q u e r y _ r o w [ ’ q u e r y _ i d ’ ] d o c s = q u e r y _ e l a s t i c s e a r c h ( q u e r y _ t e x t ) s c o r e s = c a l c u l a t e _ r e l e v a n c e ( d o c s , q u e r y _ t e x t ) r e s u l t s = f o r m a t _ r e s u l t s ( docs , s c o r e s , t o p i c _ i d , q u e r y _ i d ) a l l _ r e s u l t s . e x t e n d ( r e s u l t s ) # Output r e s u l t s t o a JSON f i l e with open ( ’ r e s u l t s . j s o n ’ , ’w’ ) a s f :

j s o n . dump ( a l l _ r e s u l t s , f , i n d e n t = 4 ) .2. Appendix B d e f e x t r a c t _ v a l u e _ i n s i d e _ c u r l y _ b r a c e s ( t e x t ) : # Use r e g e x t o f i n d t h e v a l u e i n s i d e c u r l y b r a c e s match = r e . s e a r c h ( r " \ { ( [ ^ { } ] ∗ ) \ } " , t e x t ) i f match :

r e t u r n match . group ( 1 ) e l s e :

r e t u r n None .3. Appendix C p r o m p t _ d i f f i c u l t y = " " "

You a r e a r o b o t t h a t r a t e s t h e d i f f i c u l t y o f d i f f e r e n t ter ms . You p r o v i d e ONE LEVEL o d i f f i c u l t y f o r s c i e n t i f i c term s . You need t o c o n s i d e r two words a s one term .

P r o v i d e ONE r a t i n g f o r t h e u n d e r s t a b l i t y d i f f i c u l t y o f term p r o v i d e d .

There a r e 3 l e v e l s . You need t o use : e f o r easy , m f o r medium and d f o r d i f f i c u l t .

Give t h e r a t i n g i n s i d e o f c u r l y b r a c e s l i k e t h i s { e } You can r e p l y with ONLY one word .

Example s o u r c e : autonomous v e h i c l e s Example answer : { ’m’ }

Now h e r e i s my s e n t e n c e : " " " .4. Appendix D i m p o r t w i k i p e d i a d e f g e t _ w i k i p e d i a _ d e f i n i t i o n ( term ) : t r y : # F e t c h W i k i p e d i a summary f o r t h e term summary = w i k i p e d i a . summary ( term ) r e t u r n summary e x c e p t w i k i p e d i a . e x c e p t i o n s . D i s a m b i g u a t i o n E r r o r a s e : # I f t h e r e ’ s a d i s a m b i g u a t i o n e r r o r , h a n d l e i t a s needed r e t u r n " D i s a m b i g u a t i o n E r r o r : Ambiguous term " e x c e p t w i k i p e d i a . e x c e p t i o n s . P a g e E r r o r a s e : # I f t h e page doesn ’ t e x i s t , h a n d l e i t a s needed r e t u r n " P a g e E r r o r : Term not found " e x c e p t E x c e p t i o n a s e : # Handle o t h e r e x c e p t i o n s r e t u r n s t r ( e ) # Assuming t e s t [ ’ d i f f i c u l t y ’ ] c o n t a i n s term s f o r which you want

W i k i p e d i a d e f i n i t i o n s # t e s t [ ’ wiki ’ ] = t e s t [ ’ term ’ ] . a p p l y ( g e t _ w i k i p e d i a _ d e f i n i t i o n ) t e s t . l o c [ t e s t [ ’ d i f f i c u l t y ’ ] == ’ d ’ , ’ wiki ’ ] = t e s t . l o c [ t e s t [ ’ d i f f i c u l t y ’ ] == ’ d ’ , ’ term ’ ] . a p p l y ( g e t _ w i k i p e d i a _ d e f i n i t i o n ) t e s t .5. Appendix E p r o m p t _ e x p l a n a t i o n = " " "

You a r e a r o b o t t h a t e x p l a i n s d i f f i c u l t s c i e n t i f i c term s . DO NOT add i n t r o l i k e " Sure , I ’ d be happy t o h e l p ! " Use o n l y once s e n t a n c e and wrap t h e s e n t a n c e i n c u r l y b r a c e s . D o n t j u s t i f y your answers . D o n t g i v e i n f o r m a t i o n not mentioned i n t h e CONTEXT INFORMATION .

Example s o u r c e : w i r e l e s s network environment Example answer : { ’ a system i n which d e v i c e s makes use o f Radio Frequency c o n n e c t i o n s between nodes i n t h e network a system i n which d e v i c e s a r e c o n n e c t e d t o a network w i t h o u t t h e need f o r p h y s i c a l c a b l e s or wires ’ } Example s o u r c e : B l u e t o o t h w i r e l e s s t e c h n o l o g y Example answer : { ’ s h o r t − range w i r e l e s s communication t e c h n o l o g y t h a t a l l o w s d e v i c e s t o c o n n e c t and exchange d a t a . I t f a c i l i t a t e s d a t a exchange between d e v i c e s l i k e smartphones , computers , and p e r i p h e r a l s such a s headphones or m e d i c a l d e v i c e s . B l u e t o o t h t e c h n o l o g y e l i m i n a t e s t h e need f o r p h y s i c a l c a b l e s , p r o v i d i n g c o n v e n i e n c e and v e r s a t i l i t y i n d e v i c e c o n n e c t i v i t y . ’ } Example s o u r c e : a p p l i c a t i o n Example answer : { ’ s o f t w a r e program or t o o l d e s i g n e d t o perform s p e c i f i c t a s k s or f u n c t i o n s on e l e c t r o n i c d e v i c e s . I t can range from p r o d u c t i v i t y t o o l s and games t o u t i l i t i e s and communication p l a t f o r m s on e l e c t r o n i c d e v i c e s such a s computers , smartphones , or t a b l e t s . ’ } Example s o u r c e : PDA Example answer : { ’ PDA i s t h e acronym f o r p e r s o n a l d i g i t a l a s s i s t a n t , which i s a handheld e l e c t r o n i c d e v i c e d e s i g n e d f o r p e r s o n a l o r g a n i z a t i o n , communication , and i n f o r m a t i o n a c c e s s . PDAs may i n c l u d e f e a t u r e s such a s c a l e n d a r s , c o n t a c t l i s t s , and note − t a k i n g c a p a b i l i t i e s , s e r v i n g a s p o r t a b l e t o o l s f o r managing d a i l y t a s k s . PDA i s t h e acronym f o r p e r s o n a l d i g i t a l a s s i s t a n t , which i s a handheld e l e c t r o n i c d e v i c e c r a f t e d f o r p e r s o n a l o r g a n i z a t i o n , communication , and i n f o r m a t i o n r e t r i e v a l . PDAs o f t e n i n c o r p o r a t e f e a t u r e s l i k e c a l e n d a r s , c o n t a c t l i s t s , and note − t a k i n g c a p a b i l i t i e s , f u n c t i o n i n g a s p o r t a b l e t o o l s f o r managing d a i l y t a s k s and s t a y i n g c o n n e c t e d . While modern smartphones have l a r g e l y r e p l a c e d t r a d i t i o n a l PDAs , t h e c o n c e p t i n f l u e n c e d t h e development o f contemporary m o b i l e d e v i c e s . ’ } Example s o u r c e : p i l o t s t u d y Example answer : { ’ a p r e l i m i n a r y r e s e a r c h i n v e s t i g a t i o n c o n d u c t e d on a s m a l l s c a l e t o a s s e s s t h e f e a s i b i l i t y , and p o t e n t i a l c h a l l e n g e s o f a l a r g e r r e s e a r c h p r o j e c t . an i n i t i a l and s m a l l e r − s c a l e r e s e a r c h i n v e s t i g a t i o n u n d e r t a k e n t o e v a l u a t e t h e f e a s i b i l i t y , methodology , and p o t e n t i a l o b s t a c l e s o f a l a r g e r r e s e a r c h p r o j e c t . I t s e r v e s a s a t e s t i n g ground t o r e f i n e t h e s t u d y d e s i g n , i d e n t i f y l o g i s t i c a l i s s u e s , and enhance t h e o v e r a l l r o b u s t n e s s and e f f e c t i v e n e s s o f t h e p l a n n e d f u l l − s c a l e r e s e a r c h e n d e a v o r . ’ }

Now h e r e i s my ONE s e n t e n c e e x p l a n a t i o n : " " " .6. Appendix F d e f r e m o v e _ r e d u n d a n t _ t e x t ( t e x t ) : # D e f i n e p a t t e r n s t o s e a r c h f o r p a t t e r n s = [ r ’ ^ Hey t h e r e ! ’ , r ’ ^ Sure ! ’ , r ’ ^ As a s c i e n t i f i c j o u r n a l i s t , ’ , r ’ I \ ’m h e r e t o b r e a k down a complex s t u d y i n t o s i m p l e t e r m s f o r you \ . ’ , r ’ Here \ ’ s a s i m p l i f i e d v e r s i o n o f t h e t e x t ’ , r ’ L e t me b r e a k i t down f o r you : ’ , r ’ I \ ’m h e r e t o b r e a k down a complex s t u d y i n t o s i m p l e t e r m s f o r you \ . ’ , r ’ I \ ’m h e r e t o b r e a k down complex s c i e n t i f i c c o n c e p t s i n t o s i m p l e , easy − to − u n d e r s t a n d l a n g u a g e . ’ , r ’ I \ ’m h e r e t o b r e a k down a complex t o p i c i n t o s i m p l e r t e r m s f o r you . So , l e t \ ’ s t a l k about ’ , r ’ Here i s my one s e n t e n c e e x p l a n a t i o n of ’ .7. Appendix G # Add d e f i n i t i o n and e x p l a n a t i o n i f t h e y a r e not empty i f row [ " d i f f i c u l t y " ] == " d " : d e f i n i t i o n = row . g e t ( " d e f i n i t i o n " , None ) e x p l a n a t i o n = row . g e t ( " e x p l a n a t i o n " , None ) i f d e f i n i t i o n :

j s o n _ o b j [ " d e f i n i t i o n " ] = d e f i n i t i o n i f e x p l a n a t i o n :

j s o n _ o b j [ " e x p l a n a t i o n " ] = e x p l a n a t i o n r e t u r n j s o n _ o b j .8. Appendix H # Example usage d e f s i m p l i f y ( s n t ) : c = model . c r e a t e _ c h a t _ c o m p l e t i o n ( messages =[ { " r o l e " : " system " , " c o n t e n t " : " You a r e a s c i e n t i f i c

j o u r n a l i s t who p o p u l a r i z e s s c i e n t i f i c r e s u l t s . " } , { " r o l e " : " u s e r " , " c o n t e n t " : " S i m p l i f y t h e f o l l o w i n g t e x t : \ n " + s n t } ) r e t u r n c [ ’ c h o i c e s ’ ] [ 0 ] [ ’ message ’ ] [ ’ c o n t e n t ’ ] . s t r i p ( ) d e f s i m p l i f y ( s n t ) : c=model . c r e a t e _ c h a t _ c o m p l e t i o n ( messages = [ { " r o l e " : " system " , " c o n t e n t " : " You a r e a s c i e n t i f i c j o u r n a l i s t who p o p u l a r i z e s s c i e n t i f i c r e s u l t s . " } , " r o l e " : " u s e r " , " c o n t e n t " : " S i m p l i f y t h e f o l l o w i n g t e x t : \ n "+ s n t ) r e t u r n c [ ’ c h o i c e s ’ ] [ 0 ] [ ’ message ’ ] [ ’ c o n t e n t ’ ] . s t r i p ( ) s i m p l i f y ( " With t h e e v e r i n c r e a s i n g number o f unmanned a e r i a l v e h i c l e s g e t t i n g i n v o l v e d i n a c t i v i t i e s i n t h e c i v i l i a n and commercial domain , t h e r e i s an i n c r e a s e d need f o r autonomy i n t h e s e s y s t e m s t o o . " ) .9. Appendix I d e f r e m o v e _ r e d u n d a n t _ t e x t ( t e x t ) : # D e f i n e p a t t e r n s t o s e a r c h f o r p a t t e r n s = [ r ’ ^ Hey t h e r e ! ’ , r ’ ^ Sure ! ’ , r ’ ^ As a s c i e n t i f i c j o u r n a l i s t , ’ , r ’ I \ ’m h e r e t o b r e a k down a complex s t u d y i n t o s i m p l e t erms f o r you \ . ’ , r ’ Here \ ’ s a s i m p l i f i e d v e r s i o n o f t h e t e x t ’ , r ’ L e t me b r e a k i t down f o r you : ’ , r ’ I \ ’m h e r e t o b r e a k down a complex s t u d y i n t o s i m p l e t erms f o r you \ . ’ , r ’ I \ ’m h e r e t o b r e a k down complex s c i e n t i f i c c o n c e p t s i n t o s i m pl e , easy − to − u n d e r s t a n d l a n g u a g e . ’ , r ’ I \ ’m h e r e t o b r e a k down a complex t o p i c i n t o s i m p l e r t erms f o r you . So , l e t \ ’ s t a l k about ’ , r ’ Sure , I \ ’ d be happy t o h e l p ! ’ , r ’ Here \ ’ s a s i m p l i f i e d e x p l a n a t i o n of ’ , r ’ I n o t h e r words , ’ , r ’ I n s i m p l e terms , ’ ] # Compile r e g u l a r e x p r e s s i o n s r e g e x _ p a t t e r n s = [ r e . c o m p i l e ( p a t t e r n ) f o r p a t t e r n i n p a t t e r n s ] # Remove p a t t e r n s from t e x t f o r p a t t e r n i n r e g e x _ p a t t e r n s :

t e x t = r e . sub ( p a t t e r n , ’ ’ , t e x t ) . s t r i p ( )

[1]

Ermakova , et al., Overview of CLEF 2024 SimpleText track on improving access to scientific texts , in: L. Goeuriot , et al. (Eds.), Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024 ), Lecture Notes in Computer Science, Springer, 2024 .

[2]

SanJuan , et al., Overview of the CLEF 2024 SimpleText task 1: Retrieve passages to include in a simplified summary , in: G. Faggioli , et al. (Eds.), Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024 ), CEUR Workshop Proceedings, CEUR-WS.org, 2024 .

[3] G. M. D. Nunzio , et al., Overview of the CLEF 2024 SimpleText task 2: Identify and explain dificult concepts , in: G. Faggioli , et al. (Eds.), Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024 ), CEUR Workshop Proceedings, CEUR-WS.org, 2024 .

[4]

Ermakova , et al., Overview of the CLEF 2024 SimpleText task 3: Simplify scientific text , in: G. Faggioli , et al. (Eds.), Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024 ), CEUR Workshop Proceedings, CEUR-WS.org, 2024 .

[5] J. D'Souza , et al., Overview of the CLEF 2024 SimpleText task 4: Track the state-of-the-art in scholarly publications , in: G. Faggioli , et al. (Eds.), Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024 ), CEUR Workshop Proceedings, CEUR-WS.org, 2024 .