CLEF 2024 SimpleText Tasks 1-3: Use of Llama-2 for Text Simplification ⋆ Notebook for the SimpleText Lab at CLEF 2024

CLEF 2024 SimpleText Tasks 1-3: Use of Llama-2 for Text Simplification ⋆ Notebook for the SimpleText Lab at CLEF 2024 RowanMann rowanmann93@gmail.com Christian Albrechts-Universität zu Kiel (CAU)

Christian-Albrechts-Platz 4 24118 Kiel

TomislavMikulandric tomislav.mikulandric@gmail.com The University of Split

Ul. Ruđera Boškovića 31 21000 Split Croatia

CLEF 2024 SimpleText Tasks 1-3: Use of Llama-2 for Text Simplification ⋆ Notebook for the SimpleText Lab at CLEF 2024 1613-0073 97DF434D08A24F9E691E6AD6039F657B GROBID - A machine learning software for extracting information from scholarly documents LLMs text simplification LLaMA-2

In an era defined by the vast availability of information, the challenge of discerning reliable information is more pressing than ever. Our paper presents findings from Tasks 1, 2, and 3 of the SimpleText track at the 15th Conference and Labs of the Evaluation Forum (CLEF) 2024, aimed at advancing research in automatic simplification of scientific texts using LLaMA-2.

Task 1 involved selecting relevant passages for simplified summaries, leveraging ElasticSearch and TF-IDF with cosine similarity for evaluating relevance. We achieved an average Flesch-Kincaid grade level of 0.6, indicating a moderate complexity suitable for further simplification.

Task 2 focused on identifying and explaining difficult concepts. Using the LLaMA-2 13B model, we extracted and rated the difficulty of scientific terms, generating explanations for the most challenging ones. However, reliance on Wikipedia for definitions proved inconsistent, highlighting a limitation in our methodology.

Task 3 addressed the simplification of scientific abstracts and sentences. We utilized LLaMA-2 to generate simplified versions, effectively maintaining the original meaning while reducing complexity and length. Human validation confirmed the preservation of essential content in the simplified texts.

Our research demonstrates the efficacy of LLaMA-2 for text simplification tasks, albeit with noted challenges in obtaining reliable definitions from external sources like Wikipedia. These findings contribute to the broader goal of enhancing scientific literacy through accessible information.

Introduction

We live in an era characterised by an abundance of information available to all, almost instantaneously. However, far from creating a world defined by truth and understanding, our era seems to be accurately defined by misinformation and polarity. Fake news and algorithmically determined "echo-chambers" have helped spread conspiracy and division across the world, with consequences that reverberate far beyond their origins in cyberspace.

For the average person, it's more difficult than ever to know what information to believe. We all need to be able to understand our world, so scientific literacy is a more important skill than ever.

This paper presents the results of our analysis of Tasks 1, 2 and 3 of the SimpleText track as part of the 15th Conference and Labs of the Evaluation Forum 2024. The main goal of SimpleText is to advance research in the area of automatic simplification of scientific texts [1].

The paper deals with:

• Task 1: What is in (or out)? Selecting passages to include in a simplified summary.

• Task 2: What is unclear? Difficult concept identification and explanation • Task 3: Rewrite this! Given a query, simplify passages from scientific abstracts.

Task 1: Experimental Setup

Data Description

The data provided to us by CLEF consisted of 2 folders, "corpus" and "topics qrels". The corpus includes a new vector database with sentence embedding scores and retains the previously released ElasticSearch Index for field-specific document searches. The ElasticSearch Index allows querying fields such as id, abstract, authors, title, year, and doi from the DBLP dump and is suitable for various applications like passage retrieval, Latent Dirichlet Allocation models, and training Graph Neural Networks. The Vector Database stores each article's id and sentence-embedding vectors from their title and abstract, excluding articles with empty or very short abstracts, supporting longer queries enabled by sentence embedding. The SimpleText 2024 Task 1 Corpus includes topics defined by articles from The Guardian's tech section (G01 to G20) and Tech Xplore (T01 to T20), with URLs and textual content provided for participant use. Queries associated with each topic, manually verified for relevance, enable retrieval of relevant DBLP passages. This edition introduces new queries for The Guardian articles, generated by ChatGPT 4.0, focusing on specific sub-topics and provided in CSV and JSON formats. The Simpletext 2024 task1 train.qrels file offers quality relevance judgments on a 0-2 scale for abstracts, incorporating data from previous editions and new judgments for topics G01-G15, excluding articles with nearly empty abstracts to ensure consistency with the new vector database.

Method

We created an ElasticSearch function "query elasticsearch" to query the ElasticSearch database. It took two parameters: query (the search query) and size (the number of results to return, defaulting to 100). The function sent a GET request to the ElasticSearch URL with the specified query and size and returned the search results in JSON format. # F u n c t i o n t o q u e r y t h e E l a s t i c S e a r c h d e f q u e r y _ e l a s t i c s e a r c h ( query , s i z e = 1 0 0 ) : r e s p o n s e = r e q u e s t s . g e t ( f " { ES_URL } ? q = { q u e r y }& s i z e = { s i z e } " , a u t h = ( ' i n e x ' , ' q a t c 2 0 1 We created a function to calculate how relevant the abstracts retrieved were to our search. The function created a Text Frequency Inverse Document Frequency (TF IDF), for vectorizing the texts, which assessed relevancy of words with regards to our corpus, then calculated the cosine similarity of our vectorised words. This function could then return a relevance score (rel score)

To create the combined score, we calculated word difficulty based on the Flesch Kincaid grade level. The Flesch-Kincaid grade level is one of the formulas used for assessing reading-ease, scores indicate the grade a person would have to be in US education system to understand the text.

Task 1: Experimental Results

We analysed our success by using elastic search to select passages and calculated scores using FKGL and normalisation. The mean of these scores was close to 0.6 which meant that the texts were more complex than everyday speech and appropriate to be used for the next tasks. The test dataset for "Task 2: Identifying and Explaining Difficult Concepts" in the SimpleText Lab includes several tab-separated files. The documents.tsv file contains 501 rows across 55 documents with columns for document ID, sentence ID, and sentence text. The terms.tsv and definitions explanations.tsv files, available after the evaluation phase, provide annotated sentence IDs, extracted terms, difficulty levels (easy, medium, difficult), user-provided definitions, and explanations. Finally, the definitions generated.tsv file contains 3,816 rows with unique definition IDs and the corresponding definitions to be ranked.

Method

We created a prompt for LLAMA-2 13B model that asked the LLM to iterate over each of our source sentences and extract three scientific terms from the phrase. The terms were then sorted into three rows, with duplicates removed, one term per row and we prompted Llama to give us a difficulty rating of easy, medium, or difficult for our terms. (Appendix C)

We used wikipedia to return definitions for the difficult terms, with limited success. (Appendix D)

We also asked the LLM to provide an explanation. When creating our prompt for our LLM, we gave it a few examples of correct return phrases, that were taken from the document provided. This was to improve the ability to achieve "few-shot" results. (Appendix E)

We then created a function to remove unnecessary text. (Appendix F) Finally, we compiled our results in a JSON file, with those terms considered "d" for difficult, generating definitions. (Appendix G)

Task 2: Experimental Results

The LLM was successful in generating definitions for our difficult terms, but an issue we encountered was that Wikipedia was unsuccessful in generating definitions for our terms. Therefore, this certainly harms the appropriateness of this method as many of our definitions are missing.

Task 3: Experimental Setup

Method

We used LLAMA-2 13B once more, creating a larger context window of 4096. We gave the sentences to the LLM, asking it to simplify the texts. Again, we instructed the LLM to remove fluff words like "Sure!" etc. This gave us an additional column for our simplified sentences, simplified snt. (Appendix H) Once again, it was important to remove unnecessary text therefore we created a function to carry out this task. (Appendix I)

Task 3: Experimental Results

Our results from the LLAMA 13B model for simplifying both the source abstracts and source sentences seems promising. Based on human validation of the simplified phrases, it's seems clear that the meaning has been preserved while reducing the complexity of words and the length of the sentences.

Conclusion

Our research has shown LLAMA-2 18B to be an effective model for selecting and simplifying passages from scientific texts. However, we've also highlighted the unreliability of relying on wikipedia for the provision of definitions in this context.

.1. Appendix A d e f main ( ) :

# Read q u e r i e s from JSON f i l e i n t o a d a t a f r a m e q u e r i e s = pd . r e a d _ j s o n ( ' / c o n t e n t / d r i v e / MyDrive / BIP / S i m p l e T e x t / t a s k 1 / t a s k 1 / t o p i c s _ q r e l s / s i m p l e t e x t _ 2 0 2 4 _ t a s k 1 _ q u e r i e s . j s o n ' ) q u e r i e s = q u e r i e s . head ( 5 ) a l l _ r e s u l t s = [ ] f o r i n d e x , query_row i n q u e r i e s . i t e r r o w s ( ) :

q u e r y _ t e x t = query_row [ ' q u e r y _ t e x t ' ] t o p i c _ i d = query_row [ ' t o p i c _ i d ' ] q u e r y _ i d = query_row [ ' q u e r y _ i d ' ] d o c s = q u e r y _ e l a s t i c s e a r c h ( q u e r y _ t e x t ) s c o r e s = c a l c u l a t e _ r e l e v a n c e ( docs , q u e r y _ t e x t ) r e s u l t s = f o r m a t _ r e s u l t s ( docs , s c o r e s , t o p i c _ i d , q u e r y _ i d ) a l l _ r e s u l t s . e x t e n d ( r e s u l t s ) # Output r e s u l t s t o a JSON f i l e w i t h open ( ' r e s u l t s . j s o n ' , 'w ' ) a s f : j s o n . dump ( a l l _ r e s u l t s , f , i n d e n t = 4 )

. on a s m a l l s c a l e t o a s s e s s t h e f e a s i b i l i t y , and p o t e n t i a l c h a l l e n g e s o f a l a r g e r r e s e a r c h p r o j e c t . an i n i t i a l and s m a l l e r − s c a l e r e s e a r c h i n v e s t i g a t i o n u n d e r t a k e n t o e v a l u a t e t h e f e a s i b i l i t y , methodology , and p o t e n t i a l o b s t a c l e s o f a l a r g e r r e s e a r c h p r o j e c t . I t s e r v e s a s a t e s t i n g ground t o r e f i n e t h e s t u d y d e s i g n , i d e n t i f y l o g i s t i c a l i s s u e s , and { " r o l e " : " s y s t e m " , " c o n t e n t " : " You a r e a s c i e n t i f i c j o u r n a l i s t who p o p u l a r i z e s s c i e n t i f i c r e s u l t s . " } , { " r o l e " : " u s e r " , " c o n t e n t " : " S i m p l i f y t h e f o l l o w i n g t e x t : \ n " + s n # Remove p a t t e r n s from t e x t f o r p a t t e r n i n r e g e x _ p a t t e r n s : t e x t = r e . sub ( p a t t e r n , ' ' , t e x t ) . s t r i p ( ) r e t u r n t e x t

Appendix B

d e f f l e s c h _ k i n c a i d _ g r a d e _ l e v e l ( t e x t ) : # C o n s t a n t s f o r t h e f o r m u l a ASL = a v e r a g e _ s e n t e n c e _ l e n g t h ( t e x t ) ASW = a v e r a g e _ s y l l a b l e s _ p e r _ w o r d ( t e x t ) # C a l c u l a t i n g t h e s c o r e s c o r e = 0 . 3 9 * ASL + 1 1 . 8 * ASW − 1 5 . 5 9 # N o r m a l i z e s c o r e t o r a n g e from 0 t o 1 n o r m a l i z e d _ s c o r e = n o r m a l i z e ( s c o r e , m i n _ s c o r e = 0 , m a x _ s c o r e = 2 5 ) # A d j u s t m a x _ s c o r e a s n e e d e d r e t u r n n o r m a l i z e d _ s c o r e

p r o m p t _ t e r m s = " " " You a r e a r o b o t t h a t ONLY o u t p u t s JSON . You r e p l y i n JSON f o r m a t w i t h t h e f i e l d ' terms ' . You p r o v i d e ONLY s e m i c o l o n − s e p a r a t e d l i s t o f MAXIMUM 3 s c i e n t i f i c t e r m s o f a s o u r c e s e n t e n c e ONLY . You DO NOT add ' Sure , Here a r e t h e s c i e n t i f i c t e r m s o f your s e n t e n c e : ' . Example s o u r c e s e n t e n c e : I n t h e modern e r a o f a u t o m a t i o n and r o b o t i c s , \ autonomous v e h i c l e s a r e c u r r e n t l y t h e f o c u s o f a c a d e m i c and i n d u s t r i a l r e s e a r c h . ? \ Example answer : { ' terms ' : ' r o b o t i c s ; autonomous v e h i c l e s ' } Now h e r e i s my s e n t e n c e : " " " We used Regex to help us deal with regular expressions, removing unnecessary content in the outputs. (Appendix B)

d e f ex t r a c t _ v a l u e _ i n s i d e _ c u r l y _ b r a c e s ( t e x t ) : # Use r e g e x t o f i n d t h e v a l u e i n s i d e c u r l y b r a c e s match = r e . s e a r c h ( r " \ { ( [ ^{ } ] * ) \ } " , t e x t ) i f match : r e t u r n match . group ( 1 ) e l s e : r e t u r n None .3. Appendix C p r o m p t _ d i f f i c u l t y = " " " You a r e a r o b o t t h a t r a t e s t h e d i f f i c u l t y o f d i f f e r e n t t e r m s . You p r o v i d e ONE LEVEL o d i f f i c u l t y f o r s c i e n t i f i c t e r m s . You need t o c o n s i d e r two words a s one term . P r o v i d e ONE r a t i n g f o r t h e u n d e r s t a b l i t y d i f f i c u l t y o f term p r o v i d e d . There a r e 3 l e v e l s . You need t o u s e : e f o r easy , m f o r medium and d f o r d i f f i c u l t . Give t h e r a t i n g i n s i d e o f c u r l y b r a c e s l i k e t h i s { e } You can r e p l y w i t h ONLY one word . Example s o u r c e : autonomous v e h i c l e s Example answer : { ' m' } Now h e r e i s my s e n t e n c e : " " " .4. Appendix D i m p o r t w i k i p e d i a d e f g e t _ w i k i p e d i a _ d e f i n i t i o n ( term ) : t r y : # F e t c h W i k i p e d i a summary f o r t h e term summary = w i k i p e d i a . summary ( term ) r e t u r n summary e x c e p t w i k i p e d i a . e x c e p t i o n s . D i s a m b i g u a t i o n E r r o r a s e : # I f t h e r e ' s a d i s a m b i g u a t i o n e r r o r , h a n d l e i t a s n e e d e d r e t u r n " D i s a m b i g u a t i o n E r r o r : Ambiguous term " e x c e p t w i k i p e d i a . e x c e p t i o n s . P a g e E r r o r a s e : # I f t h e page doesn ' t e x i s t , h a n d l e i t a s n e e d e d r e t u r n " P a g e E r r o r : Term n o t f o u n d " e x c e p t E x c e p t i o n a s e : # Handle o t h e r e x c e p t i o n s r e t u r n s t r ( e ) # Assuming t e s t [ ' d i f f i c u l t y ' ] c o n t a i n s t e r m s f o r which you want W i k i p e d i a d e f i n i t i o n s # t e s t [ ' wi k i ' ] = t e s t [ ' term ' ] . a p p l y ( g e t _ w i k i p e d i a _ d e f i n i t i o n ) t e s t . l o c [ t e s t [ ' d i f f i c u l t y ' ] == ' d ' , ' w i ki ' ] = t e s t . l o c [ t e s t [ ' d i f f i c u l t y ' ] == ' d ' , ' term ' ] . a p p l y ( g e t _ w i k i p e d i a _ d e f i n i t i o n ) t e s t .5. Appendix E p r o m p t _ e x p l a n a t i o n = " " " You a r e a r o b o t t h a t e x p l a i n s d i f f i c u l t s c i e n t i f i c t e r m s . DO NOT add i n t r o l i k e " Sure , I ' d be happy t o h e l p ! " Use o n l y once s e n t a n c e and wrap t h e s e n t a n c e i n c u r l y b r a c e s . D o n t j u s t i f y your a n s w e r s . D o n t g i v e i n f o r m a t i o n n o t m e n t i o n e d i n t h e CONTEXT INFORMATION . Example s o u r c e : w i r e l e s s network e n v i r o n m e n t Example answer : { ' a s y s t e m i n which d e v i c e s makes u s e o f R a d i o F r e q u e n c y c o n n ec t i o n s between n o d e s i n t h e network a s y s t e m i n which d e v i c e s a r e c o n n e c t e d t o a network w i t h o u t t h e need f o r p h y s i c a l c a b l e s o r w i r e s ' } Example s o u r c e : B l u e t o o t h w i r e l e s s t e c h n o l o g y Example answer : { ' s h o r t − r a n g e w i r e l e s s c o m m u n i c a t i o n t e c h n o l o g y t h a t a l l o w s d e v i c e s t o c o n n e c t and e x c h a n g e d a t a . I t f a c i l i t a t e s d a t a e x c h a n g e between d e v i c e s l i k e s m a r t p h o n e s , co m p u t e r s , and p e r i p h e r a l s s u c h a s h e a d p h o n e s o r m e d i c a l d e v i c e s . B l u e t o o t h t e c h n o l o g y e l i m i n a t e s t h e need f o r p h y s i c a l c a b l e s , p r o v i d i n g c o n v e n i e n c e and v e r s a t i l i t y i n d e v i c e c o n n e c t i v i t y . ' }Example s o u r c e : a p p l i c a t i o n Example answer : { ' s o f t w a r e program o r t o o l d e s i g n e d t o p e r f o r m s p e c i f i c t a s k s o r f u n c t i o n s on e l e c t r o n i c d e v i c e s . I t can r a n g e from p r o d u c t i v i t y t o o l s and games t o u t i l i t i e s and c o m m u n i c a t i o n p l a t f o r m s on e l e c t r o n i c d e v i c e s s u c h a s co m p u t e r s , s m a r t p h o n e s , o r t a b l e t s . ' } Example s o u r c e : PDA Example answer : { ' PDA i s t h e acronym f o r p e r s o n a l d i g i t a l a s s i s t a n t , which i s a h a n d h e l d e l e c t r o n i c d e v i c e d e s i g n e d f o r p e r s o n a l o r g a n i z a t i o n , communication , and i n f o r m a t i o n a c c e s s . PDAs may i n c l u d e f e a t u r e s s u c h a s c a l e n d a r s , c o n t a c t l i s t s , and note − t a k i n g c a p a b i l i t i e s , s e r v i n g a s p o r t a b l e t o o l s f o r managing d a i l y t a s k s . PDA i s t h e acronym f o r p e r s o n a l d i g i t a l a s s i s t a n t , which i s a h a n d h e l d e l e c t r o n i c d e v i c e c r a f t e d f o r p e r s o n a l o r g a n i z a t i o n , communication , and i n f o r m a t i o n r e t r i e v a l . PDAs o f t e n i n c o r p o r a t e f e a t u r e s l i k e c a l e n d a r s , c o n t a c t l i s t s , and note − t a k i n g c a p a b i l i t i e s , f u n c t i o n i n g a s p o r t a b l e t o o l s f o r managing d a i l y t a s k s and s t a y i n g c o n n e c t e d . While modern s m a r t p h o n e s have l a r g e l y r e p l a c e d t r a d i t i o n a l PDAs , t h e c o n c e p t i n f l u e n c e d t h e d e v e l o p m e n t o f c o n t e m p o r a r y m o b i l e d e v i c e s . ' } Example s o u r c e : p i l o t s t u d y Example answer : { ' a p r e l i m i n a r y r e s e a r c h i n v e s t i g a t i o n c o n d u c t e d

en h a nc e t h e o v e r a l l r o b u s t n e s s and e f f e c t i v e n e s s o f t h e p l a n n e d f u l l − s c a l e r e s e a r c h e n d e a v o r . ' } Now h e r e i s my ONE s e n t e n c e e x p l a n a t i o n : " " " .6. Appendix F d e f r e m o v e _ r e d u n d a n t _ t e x t ( t e x t ) : # D e f i n e p a t t e r n s t o s e a r c h f o r p a t t e r n s = [ r ' ^Hey t h e r e ! ' , r ' ^S u r e ! ' , r ' ^As a s c i e n t i f i c j o u r n a l i s t , ' , r ' I \ 'm h e r e t o b r e a k down a complex s t u d y i n t o s i m p l e t e r m s f o r you \ . ' , r ' Here \ ' s a s i m p l i f i e d v e r s i o n o f t h e t e x t ' , r ' L e t me b r e a k i t down f o r you : ' , r ' I \ 'm h e r e t o b r e a k down a complex s t u d y i n t o s i m p l e t e r m s f o r you \ . ' , r ' I \ 'm h e r e t o b r e a k down complex s c i e n t i f i c c o n c e p t s i n t o s i m p l e , easy − to − u n d e r s t a n d l a n g u a g e . ' , r ' I \ 'm h e r e t o b r e a k down a complex t o p i c i n t o s i m p l e r t e r m s f o r you . So , l e t \ ' s t a l k ab o u t ' , r ' Here i s my one s e n t e n c e e x p l a n a t i o n of ' ] .7. Appendix G # Add d e f i n i t i o n and e x p l a n a t i o n i f t h e y a r e n o t empty i f row [ " d i f f i c u l t y " ] == " d " : d e f i n i t i o n = row . g e t ( " d e f i n i t i o n " , None ) e x p l a n a t i o n = row . g e t ( " e x p l a n a t i o n " , None ) i f d e f i n i t i o n : j s o n _ o b j [ " d e f i n i t i o n " ] = d e f i n i t i o n i f e x p l a n a t i o n : j s o n _ o b j [ " e x p l a n a t i o n " ] = e x p l a n a t i o n r e t u r n j s o n _ o b j .8. Appendix H # Example u s a g e d e f s i m p l i f y ( s n t ) : c = model . c r e a t e _ c h a t _ c o m p l e t i o n ( m e s s a g e s =[

t } ] ) r e t u r n c [ ' c h o i c e s ' ] [ 0 ] [ ' message ' ] [ ' c o n t e n t ' ] . s t r i p ( ) d e f s i m p l i f y ( s n t ) : c = model . c r e a t e _ c h a t _ c o m p l e t i o n ( m e s s a g e s = [{ " r o l e " : " s y s t e m " , " c o n t e n t " : " You a r e a s c i e n t i f i c j o u r n a l i s t who p o p u l a r i z e s s c i e n t i f i c r e s u l t s . " } , { " r o l e " : " u s e r " , " c o n t e n t " : " S i m p l i f y t h e f o l l o w i n g t e x t : \ n " + s n t } ] ) r e tu r n c [ ' c h o i c e s ' ] [ 0 ] [ ' message ' ] [ ' c o n t e n t ' ] . s t r i p ( ) s i m p l i f y ( " With t h e e v e r i n c r e a s i n g number o f unmanned a e r i a l v e h i c l e s g e t t i n g i n v o l v e d i n a c t i v i t i e s i n t h e c i vi l i a n and c o m m e r c i a l domain , t h e r e i s an i n c r e a s e d need f o r autonomy i n t h e s e s y s t e m s t o o . " ) .9. Appendix I d e f r e m o v e _ r e d u n d a n t _ t e x t ( t e x t ) : # D e f i n e p a t t e r n s t o s e a r c h f o r p a t t e r n s = [ r ' ^Hey t h e r e ! ' , r ' ^S u r e ! ' , r ' ^As a s c i e n t i f i c j o u r n a l i s t , ' , r ' I \ 'm h e r e t o b r e a k down a complex s t u d y i n t o s i m p l e t e r m s f o r you \ . ' , r ' Here \ ' s a s i m p l i f i e d v e r s i o n o f t h e t e x t ' , r ' L e t me b r e a k i t down f o r you : ' , r ' I \ 'm h e r e t o b r e a k down a complex s t u d y i n t o s i m p l e t e r m s f o r you \ . ' , r ' I \ 'm h e r e t o b r e a k down complex s c i e n t i f i c c o n c e p t s i n t o s i m p l e , easy − to − u n d e r s t a n d l a n g u a g e . ' , r ' I \ 'm h e r e t o b r e a k down a complex t o p i c i n t o s i m p l e r t e r m s f o r you . So , l e t \ ' s t a l k ab o u t ' , r ' Sure , I \ ' d be happy t o h e l p ! ' , r ' Here \ ' s a s i m p l i f i e d e x p l a n a t i o n of ' , r ' I n o t h e r words , ' , r ' I n s i m p l e terms , ' ] # Compile r e g u l a r e x p r e s s i o n s r e g e x _ p a t t e r n s = [ r e . c o m p i l e ( p a t t e r n ) f o r p a t t e r n i n p a t t e r n s ]

Table 11Official results for Task 1MMR Precision 10 Precision 20 NDCG 10 NDCG 20 BprefMAPT1 1 0.2170,02330,01500,01210,01060,0062 0,0025T1 2 0,54440,37330,27500,24430,21830,0963 0,0601

4. Task 2: Experimental Setup 4.1. Data Description Thedataset for "Task 2: Identifying and Explaining Difficult Concepts" in the SimpleText Lab is divided into training and validation folders, each containing several tab-separated files. The training folder includes documents.tsv (576 rows, 115 documents), documents users.tsv (145 rows, document and expert IDs), terms.tsv (1,910 rows, terms, difficulty, expert ID), definitions explanations.tsv (1,046 rows, definitions, explanations, expert ID), and definitions generated.tsv (589 rows, automatically generated definitions). The validation folder contains definitions explanations.tsv (960 rows, definitions without explanations), definitions generated.tsv (932 rows, automatically generated definitions), and terms.tsv (680 rows, terms, difficulty). Initial annotations were performed by multiple experts, with a second round of validation by an external expert to identify additional terms and definitions. The dataset will later include test files for the evaluation phase.

Table 2 Official2

results for task 2Recall overall Recall avgTask 2.20,00690,0040Task 2.2 10,00830,0084

Acknowledgments

We'd like to extend our gratitude to the University of Brest for organising the Blended Intensive Programme (BIP) AI For Humanities. We would also like to thank Liana Ermakova for her teaching of the course and Caroline L'haridon for her support during our stay in Brest.

Overview of CLEF 2024 SimpleText track on improving access to scientific texts LErmakova Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024) Lecture Notes in Computer Science LGoeuriot Springer 2024 Overview of the CLEF 2024 SimpleText task 1: Retrieve passages to include in a simplified summary ESanjuan Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024) CEUR Workshop Proceedings GFaggioli 2024 Overview of the CLEF 2024 SimpleText task 2: Identify and explain difficult concepts GM DNunzio Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024) CEUR Workshop Proceedings GFaggioli 2024 Overview of the CLEF 2024 SimpleText task 3: Simplify scientific text LErmakova Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024) CEUR Workshop Proceedings GFaggioli 2024 Overview of the CLEF 2024 SimpleText task 4: Track the state-of-the-art in scholarly publications JSouza Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024) CEUR Workshop Proceedings GFaggioli 2024