Preface

The CLEF 2021 conference is the twenty-second edition of the popular CLEF campaign and workshop series, which has run since 2000 and contributes to the systematic evaluation of multilingual and multimodal information access systems, primarily through experimentation on shared tasks. In 2010 CLEF was launched in a new format, as a conference with research presentations, panels, poster and demo sessions, and laboratory evaluation workshops. These are proposed and operated by groups of organizers volunteering their time and effort to define, promote, administrate, and run an evaluation activity.

CLEF 2021 (http://clef2021.clef-initiative.eu/) was organized by the University “Politehnica” of Bucharest, Romania, from 21 to 24 September 2021. The continued outbreak of the COVID-19 pandemic affected the organization of CLEF 2021. After detailed discussions, the CLEF Steering Committee and the organizers of CLEF 2021 decided to run the conference fully virtually. The conference format remained the same as in past years and consisted of keynotes, contributed papers, lab sessions, and poster sessions, including reports from other benchmarking initiatives from around the world. All sessions were organized and run online.

Fifteen lab proposals were received and evaluated in peer review based on their innovation potential and the quality of the resources created. To identify the best proposals, besides the well-established criteria from previous editions of CLEF, such as topical relevance, novelty, potential impact on future world affairs, likely number of participants, and the quality of the organizing consortium, this year we further stressed the connection to real-life usage scenarios, and we tried as much as possible to avoid overlaps among labs in order to promote synergies and integration. The 12 selected labs represented scientific challenges based on new data sets and real-world problems in multimodal and multilingual information access.
These data sets provide unique opportunities for scientists to explore collections, to develop solutions for these problems, to receive feedback on the performance of their solutions, and to discuss the issues with peers at the workshops.

We continued the mentorship program to support the preparation of lab proposals by newcomers to CLEF. The CLEF newcomers mentoring program offered help, guidance, and feedback on the writing of draft lab proposals by assigning a mentor to proponents, who helped them prepare and mature the lab proposal for submission. If a lab proposal fell into the scope of an already existing CLEF lab, the mentor helped the proponents get in touch with those lab organizers and join forces.

Building on previous experience, the labs at CLEF 2021 demonstrate the maturity of the CLEF evaluation environment by creating new tasks, new and larger data sets, new ways of evaluation, or more languages. Details of the individual labs are described by the lab organizers in these proceedings. Below is a short summary of them.

ARQMath: Answer Retrieval for Mathematical Questions (https://www.cs.rit.edu/~dprl/ARQMath) considers the problem of finding answers to new mathematical questions among posted answers on the community question answering site Math Stack Exchange. The goal of the lab is to develop methods for mathematical information retrieval based on both text and formula analysis.

BioASQ (http://www.bioasq.org/workshop2021) challenges researchers with large-scale biomedical semantic indexing and question answering (QA). The challenges include tasks relevant to hierarchical text classification, machine learning, information retrieval, QA from texts and structured data, multi-document summarization, and many other areas. The aim of the BioASQ workshop is to push the research frontier towards systems that use the diverse and voluminous information available online to respond directly to the information needs of biomedical scientists.
CheckThat!: Identification and Verification of Political Claims (https://sites.google.com/view/clef2021-checkthat) aims to foster the development of technology capable of both spotting and verifying check-worthy claims in political debates in English, Arabic, and Italian. The concrete tasks were to assess the check-worthiness of a claim in a tweet, check whether a (similar) claim has been previously verified, retrieve evidence to fact-check a claim, and verify the factuality of a claim.

ChEMU: Cheminformatics Elsevier Melbourne University (http://chemu2021.eng.unimelb.edu.au/) proposes two key information extraction tasks over chemical reactions from patents. Task 1 aims to identify chemical compounds and their specific types, i.e., to assign the label of a chemical compound according to the role it plays within a chemical reaction. Task 2 requires identification of event trigger words (e.g., “added” and “stirred”), which all have the same type, “EVENT TRIGGER”, and then determination of the chemical entity arguments of these events.

CLEF eHealth (https://clefehealth.imag.fr/) aims to support the development of techniques to aid laypeople, clinicians, and policy-makers in easily retrieving and making sense of medical content to support their decision making. The goals of the lab are to develop processing methods and resources in a multilingual setting to enrich difficult-to-understand eHealth texts and provide valuable documentation.

eRisk: Early Risk Prediction on the Internet (https://erisk.irlab.org/) explores challenges of evaluation methodology, effectiveness metrics, and other processes related to early risk detection. Early detection technologies can be employed in different areas, particularly those related to health and safety. The 2020 edition of the lab focused on texts written in social media for the early detection of signs of self-harm and depression.
ImageCLEF: Multimedia Retrieval (https://www.imageclef.org/2021) provides an evaluation forum for visual media analysis, indexing, classification/learning, and retrieval in medical, nature, security, and lifelogging applications, with a focus on multimodal data, that is, data from a variety of sources and media.

LifeCLEF: Multimedia Life Species Identification (https://www.imageclef.org/LifeCLEF2021) aims at boosting research on the identification and prediction of living organisms in order to bridge the taxonomic gap and improve our knowledge of biodiversity. Through its biodiversity informatics challenges, LifeCLEF is intended to push the boundaries of the state of the art in several research directions at the frontier of multimedia information retrieval, machine learning, and knowledge engineering.

LiLAS: Living Labs for Academic Search (https://clef-lilas.github.io/) aims to bring together researchers interested in the online evaluation of academic search systems. The long-term goal is to foster knowledge on improving the search for academic resources such as literature and research data, and the interlinking between these resources, in fields from the life sciences and the social sciences. The immediate goal of this lab is to develop ideas, best practices, and guidelines for a full online evaluation campaign at CLEF 2021.

PAN: Digital Text Forensics and Stylometry (http://pan.webis.de/) is a networking initiative for digital text forensics, where researchers and practitioners study technologies that analyze texts with regard to originality, authorship, and trustworthiness. PAN provides evaluation resources consisting of large-scale corpora, performance measures, and web services that allow for meaningful evaluations. The main goal is to provide sustainable and reproducible evaluations and to get a clear view of the capabilities of state-of-the-art algorithms.
SimpleText: (Re)Telling Right Scientific Stories to Non-Specialists via Text Simplification (https://www.irit.fr/simpleText/) aims to create a community interested in generating simplified summaries of scientific documents and to contribute to making science truly open and accessible to everyone. The goal is to generate a simplified abstract of multiple scientific documents based on a given query.

Touché: Argument Retrieval (https://touche.webis.de/) is the first shared task on the topic of argument retrieval. Decision-making processes, be it at the societal or at the personal level, eventually come to a point where one side challenges the other with a why-question, which is a prompt to justify one’s stance. Thus, technologies for argument mining and argumentation processing are maturing at a rapid pace, giving rise for the first time to argument retrieval.

As a group, the 152 lab organizers were based in 22 countries, with Germany and France leading the distribution. Despite CLEF’s traditionally Europe-based audience, 44 (28.9%) organizers were affiliated with institutions outside of Europe. The gender distribution was skewed: 75% of the organizers were male.

CLEF has always been backed by European projects that complement the incredible amount of volunteer work performed by lab organizers and the CLEF community with the resources needed for its necessary central coordination, in a similar manner to the other major international evaluation initiatives such as TREC, NTCIR, FIRE, and MediaEval. Since 2014, however, CLEF no longer has direct support from European projects and is working to transform itself into a self-sustainable activity.
This has been made possible thanks to the establishment in late 2013 of the CLEF Association (http://www.clef-initiative.eu/association), a non-profit legal entity which, through the support of its members, ensures the resources needed to smoothly run and coordinate CLEF.

Acknowledgments

We would like to thank the mentors who helped shepherd the preparation of lab proposals by newcomers: Bogdan Ionescu, University “Politehnica” of Bucharest, Romania; Henning Müller, University of Applied Sciences Western Switzerland (HES-SO), Switzerland.

We would like to thank the members of CLEF-LOC (the CLEF Lab Organization Committee) for their thoughtful and elaborate contributions to assessing the proposals during the selection process: Donna Harman, National Institute of Standards and Technology (NIST), USA; Martin Braschler, Zurich University of Applied Sciences, Switzerland; Paolo Rosso, Universitat Politècnica de València, Spain.

Last but not least, without the important and tireless effort of the enthusiastic and creative proposal authors, the organizers of the selected labs and workshops, the colleagues and friends involved in running them, and the participants who contribute their time to making the labs and workshops a success, the CLEF labs would not be possible. Thank you all very much!

July 2021

Guglielmo Faggioli, Nicola Ferro, Alexis Joly, Maria Maistro, Florina Piroi

Organization

CLEF 2021, Conference and Labs of the Evaluation Forum – Experimental IR meets Multilinguality, Multimodality, and Interaction, was hosted (online) by the University “Politehnica” of Bucharest, Romania.

General Chairs

K. Selçuk Candan, Arizona State University, USA
Bogdan Ionescu, University “Politehnica” of Bucharest, Romania

Program Chairs

Lorraine Goeuriot, Université Grenoble Alpes, France
Birger Larsen, Aalborg University Copenhagen, Denmark
Henning Müller, University of Applied Sciences Western Switzerland, Switzerland

Lab Chairs

Alexis Joly, INRIA Sophia-Antipolis, France
Maria Maistro, University of Copenhagen, Denmark
Florina Piroi, Vienna University of Technology, Austria

Lab Mentorship Chair

Lorraine Goeuriot, Université Grenoble Alpes, France

Publicity Chairs

Liviu-Daniel Stefan, University “Politehnica” of Bucharest, Romania
Mihai Dogariu, University “Politehnica” of Bucharest, Romania

Outreach Program Chairs

Yu-Gang Jiang, Fudan University, China - Asian Liaison
Hugo Jair Escalante, Instituto Nacional de Astrofisica, Optica y Electronica, Mexico - Central American Liaison
Fabio A. Gonzalez, National University of Colombia, Colombia - South American Liaison
Ben Herbst, Praelexis, South Africa - African Liaison
Abdulmotaleb El Saddik, University of Ottawa, Canada - North American Liaison

Industry & Sponsorship Chairs

Şeila Abdulamit, Vodafone, Romania
Mihai-Gabriel Constantin, University “Politehnica” of Bucharest, Romania
Bogdan Boteanu, University “Politehnica” of Bucharest, Romania

Website & Social Media Chair

Denisa Ionașcu, University “Politehnica” of Bucharest, Romania

Finance Chair

Ion Marghescu, University “Politehnica” of Bucharest, Romania

Proceedings Chairs

Guglielmo Faggioli, University of Padua, Italy
Nicola Ferro, University of Padua, Italy

CLEF Steering Committee

Steering Committee Chair

Nicola Ferro, University of Padua, Italy

Deputy Steering Committee Chair for the Conference

Paolo Rosso, Universitat Politècnica de València, Spain

Deputy Steering Committee Chair for the Evaluation Labs

Martin Braschler, Zurich University of Applied Sciences, Switzerland

Members

Khalid Choukri, Evaluations and Language resources Distribution Agency (ELDA), France
Paul Clough, University of Sheffield, United Kingdom
Fabio Crestani, Università della Svizzera italiana, Switzerland
Carsten Eickhoff, Brown University, USA
Norbert Fuhr, University of Duisburg-Essen, Germany
Lorraine Goeuriot, Université Grenoble Alpes, France
Julio Gonzalo, National Distance Education University (UNED), Spain
Donna Harman, National Institute of Standards and Technology (NIST), USA
Evangelos Kanoulas, University of Amsterdam, The Netherlands
Birger Larsen, University of Aalborg, Denmark
David E. Losada, Universidade de Santiago de Compostela, Spain
Mihai Lupu, Vienna University of Technology, Austria
Josiane Mothe, IRIT, Université de Toulouse, France
Henning Müller, University of Applied Sciences Western Switzerland (HES-SO), Switzerland
Jian-Yun Nie, Université de Montréal, Canada
Eric SanJuan, University of Avignon, France
Giuseppe Santucci, Sapienza University of Rome, Italy
Jacques Savoy, University of Neuchâtel, Switzerland
Laure Soulier, Pierre and Marie Curie University (Paris 6), France
Theodora Tsikrika, Information Technologies Institute (ITI), Centre for Research and Technology Hellas (CERTH), Greece
Christa Womser-Hacker, University of Hildesheim, Germany

Past Members

Djoerd Hiemstra, Radboud University, The Netherlands
Jaana Kekäläinen, University of Tampere, Finland
Séamus Lawless, Trinity College Dublin, Ireland
Carol Peters, ISTI, National Council of Research (CNR), Italy (Steering Committee Chair 2000–2009)
Emanuele Pianta, Centre for the Evaluation of Language and Communication Technologies (CELCT), Italy
Maarten de Rijke, University of Amsterdam (UvA), The Netherlands
Alan Smeaton, Dublin City University, Ireland