=Paper=
{{Paper
|id=Vol-2696/paper_273
|storemode=property
|title=Overview of LiLAS 2020 - Living Labs for Academic Search Workshop Lab (extended abstract)
|pdfUrl=https://ceur-ws.org/Vol-2696/paper_273.pdf
|volume=Vol-2696
|authors=Philipp Schaer,Johann Schaible,Leyla Jael Garcia Castro
|dblpUrl=https://dblp.org/rec/conf/clef/SchaerSG20
}}
==Overview of LiLAS 2020 - Living Labs for Academic Search Workshop Lab (extended abstract)==
Philipp Schaer (ORCID 0000-0002-8817-4632), TH Köln - University of Applied Sciences, Germany, philipp.schaer@th-koeln.de
Johann Schaible (ORCID 0000-0002-5441-7640), GESIS - Leibniz Institute for the Social Sciences, Germany, johann.schaible@gesis.org
Leyla Jael Garcia Castro (ORCID 0000-0003-3986-0510), ZB MED - Information Centre for Life Sciences, Germany, ljgarcia@zbmed.de

This is an extended abstract of the paper originally published in [7]. Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CLEF 2020, 22-25 September 2020, Thessaloniki, Greece.

1 Introduction and Background

In our previous work [7,8], we described the motivation and the outline of a new CLEF evaluation lab. The Living Labs for Academic Search (LiLAS) lab fosters the discussion, research, and evaluation of academic search systems by applying the concept of living labs to the domain of academic search [9]. This extended abstract summarizes the main ideas and contributions discussed in these previous articles.

Academic search is a long-standing challenge in Information Retrieval, and it is more relevant than ever. With the rise of the COVID-19 pandemic, it gained new momentum in the IR community through initiatives like TREC-COVID. However, test collections and specialized data sets like CORD-19 only allow for system-oriented experiments, while the evaluation of algorithms in real-world environments remains available only to researchers from industry. In LiLAS, we open up two academic search platforms to allow participating researchers to evaluate their systems in a Docker-based research environment.

The need for innovation in academic search is shown by the stagnating system performance in controlled evaluation campaigns, as demonstrated in TREC and CLEF meta-evaluation studies [10,1]. User studies in real-world scientific information systems and digital libraries show a similar picture. Although massive collections of scientific documents are available in platforms like arXiv, PubMed, and other digital libraries, central user needs and requirements remain unsatisfied. The central mission is to find both relevant and high-quality documents, if possible directly on the first result page. Besides this ad-hoc retrieval problem, other tasks, such as the recommendation of relevant cross-modality content (including research data sets) or specialized tasks like expert finding, are not even considered here. On top of that, relevance in academic search is multi-layered [4] and a topic that drives research communities like the Bibliometrics-enhanced Information Retrieval (BIR) workshops [5].

CLEF and TREC hosted the Living Labs for Information Retrieval (LL4IR) and Open Search (TREC-OS) initiatives [2], which are the predecessors of LiLAS. The goal of LiLAS is to expand the knowledge on improving the search for academic resources such as literature and research data, and the interlinking between these resources. LiLAS cooperates with two academic search system providers from the life sciences and the social sciences. Both system providers support LiLAS by allowing participants of the lab to deploy experimental search components in their production online systems. We will have access to the click logs of these systems and use them to run A/B tests or more complex interleaving experiments; a minimal sketch of how such an interleaving comparison could work is given below.
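To make the interleaving idea more concrete, the following is a minimal sketch of a simplified team-draft-style interleaving comparison in Python. It is not the LiLAS/STELLA implementation; the function names, the per-round coin flip, and the click-credit scheme are illustrative assumptions only.

```python
import random


def team_draft_interleave(ranking_a, ranking_b):
    """Merge two ranked lists round by round (simplified team-draft style).

    Returns the interleaved list plus a map that remembers which system
    contributed each document, so clicks can later be credited to 'A' or 'B'.
    """
    interleaved, assignment, seen = [], {}, set()
    it_a, it_b = iter(ranking_a), iter(ranking_b)

    def pick(iterator, label):
        # Take the next not-yet-shown document from one system, if any is left.
        for doc in iterator:
            if doc not in seen:
                seen.add(doc)
                interleaved.append(doc)
                assignment[doc] = label
                return True
        return False

    while True:
        # Each round, a coin flip decides which system places its document first.
        first_a = random.random() < 0.5
        order = [("A", it_a), ("B", it_b)] if first_a else [("B", it_b), ("A", it_a)]
        progressed = False
        for label, iterator in order:
            progressed = pick(iterator, label) or progressed
        if not progressed:
            break
    return interleaved, assignment


def credit_clicks(assignment, clicked_docs):
    """Credit each click to the system that contributed the clicked document."""
    credits = {"A": 0, "B": 0}
    for doc in clicked_docs:
        if doc in assignment:
            credits[assignment[doc]] += 1
    return credits


# Toy example: production ranking (A) vs. experimental ranking (B).
production = ["d1", "d2", "d3", "d4"]
experimental = ["d3", "d5", "d1", "d6"]
ranking, origin = team_draft_interleave(production, experimental)
print(ranking)                              # interleaved result list shown to the user
print(credit_clicks(origin, ["d3", "d5"]))  # e.g. {'A': ..., 'B': ...}, depending on coin flips
```

In the lab itself, this kind of interleaving and click crediting is handled by the evaluation infrastructure rather than by the participants.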
Our living lab platform STELLA makes this possible by bringing platform operators and researchers together and providing a methodological and technical framework for online experiments [3].

2 Evaluation Infrastructure

We use STELLA as our living lab evaluation infrastructure. STELLA aims to make it easier to evaluate academic information retrieval and recommendation systems [3]. Figure 1 gives an overview of how the steps flow from a researcher's or developer's idea to the evaluation feedback, so that changes can be tuned and improved. It all starts with an idea, for instance adding synonyms to the keywords used by an end-user when searching for information. Developers will work on a modified version of the production system that includes the change they want to analyze. Whenever an end-user visits the system, everything will look as usual. Once the search keywords are entered, STELLA will show the end-user some results from the experimental system and some results from the regular production system. End-users will continue their regular interaction with the system. Based on the retrieved documents and the subsequent interaction, STELLA will create an evaluation profile together with some statistics. Researchers and developers will then analyze STELLA's feedback and react accordingly to reach the usage level they are aiming for.

STELLA's infrastructure relies on the container virtualization environment Docker [6], which makes it easy for STELLA to run multiple experimental systems in a multi-container environment and to compare them to each other as well as to the production system. The core component of STELLA is a central Application Programming Interface (API) connecting data and content providers with experimental systems, a.k.a. participant systems or participants, encapsulated as Docker containers. Further information can be found on the project website (https://stella-project.org/), including technical details in a series of regularly published blog posts.

Fig. 1. STELLA workflow, an online living lab supporting testing from ideas to evaluation: Participants package their systems with the help of Docker containers that are deployed in the backend of academic information retrieval and recommendation systems. Users interact directly with the system, with a percentage of traffic diverted to the experimental features. Researchers and developers retrieve results and deliver feedback to tune and improve changes.
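As a rough illustration of what "a participant system encapsulated as a Docker container" could look like, the sketch below wraps a toy ranker behind an HTTP endpoint that a central API could call. The endpoint path, parameter, and response fields are hypothetical assumptions for illustration and do not describe STELLA's actual REST interface; the project website above is the authoritative source.

```python
# Hypothetical participant service: a toy ranker behind an HTTP endpoint.
# Endpoint and field names are illustrative, not STELLA's actual API.
from flask import Flask, jsonify, request

app = Flask(__name__)

# Toy document store standing in for the platform's index.
DOCUMENTS = {
    "doc1": "living labs for academic search evaluation",
    "doc2": "covid-19 literature retrieval with cord-19",
    "doc3": "research data recommendation in the social sciences",
}


def score(query, text):
    """Naive term-overlap score as a stand-in for a real ranking model."""
    q_terms = set(query.lower().split())
    d_terms = set(text.lower().split())
    return len(q_terms & d_terms)


@app.route("/ranking", methods=["GET"])
def ranking():
    # Rank all documents for the incoming query and return an ordered id list.
    query = request.args.get("query", "")
    ranked = sorted(DOCUMENTS, key=lambda d: score(query, DOCUMENTS[d]), reverse=True)
    return jsonify({"query": query, "itemlist": ranked})


if __name__ == "__main__":
    # Inside a container this would typically listen on all interfaces.
    app.run(host="0.0.0.0", port=5000)
```

Such a service would then be packaged with a standard Dockerfile so that the evaluation infrastructure can run it alongside other participant containers and divert a share of live queries to it.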
3 Conclusion and Outlook

Currently, STELLA supports two main tasks, ad-hoc retrieval and recommendation, both of which are used within LiLAS. The two academic search systems LIVIVO and GESIS Search come from the two disjoint scientific domains of the life sciences and the social sciences and include different metadata on research articles, data sets, and many other entities. For the next CLEF lab in 2021, we will focus on (1) ad-hoc retrieval for life science documents and (2) research data recommendations on social science topics. These tasks allow us to use the different data types available in the platforms and offer participants the unique opportunity to have their solutions tested in real-time environments.

Acknowledgements

This work was partially funded by the German Research Foundation (DFG) under project no. 407518790.

References

1. Armstrong, T.G., Moffat, A., Webber, W., Zobel, J.: Improvements that don't add up: ad-hoc retrieval results since 1998. In: Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM '09), pp. 601-610. ACM, Hong Kong, China (2009). https://doi.org/10.1145/1645953.1646031
2. Balog, K., Schuth, A., Dekker, P., Tavakolpoursaleh, N., Schaer, P., Chuang, P.Y.: Overview of the TREC 2016 Open Search track. In: TREC. Special Publication 500-321. National Institute of Standards and Technology (NIST) (2016)
3. Breuer, T., Schaer, P., Tavakolpoursaleh, N., Schaible, J., Wolff, B., Müller, B.: STELLA: Towards a framework for the reproducibility of online search experiments. In: Proceedings of the Open-Source IR Replicability Challenge (OSIRRC) @ SIGIR (2019)
4. Carevic, Z., Schaer, P.: On the connection between citation-based and topical relevance ranking: Results of a pretest using iSearch. In: Proceedings of the First Workshop on Bibliometric-enhanced Information Retrieval (BIR 2014), co-located with ECIR 2014, Amsterdam, The Netherlands, April 13, 2014. CEUR Workshop Proceedings, vol. 1143, pp. 37-44. CEUR-WS.org (2014), http://ceur-ws.org/Vol-1143/paper5.pdf
5. Mayr, P., Scharnhorst, A., Larsen, B., Schaer, P., Mutschke, P.: Bibliometric-enhanced information retrieval. In: Advances in Information Retrieval - 36th European Conference on IR Research, ECIR 2014, Amsterdam, The Netherlands, April 13-16, 2014, Proceedings. Lecture Notes in Computer Science, vol. 8416, pp. 798-801. Springer (2014). https://doi.org/10.1007/978-3-319-06028-6_99
6. Merkel, D.: Docker: lightweight Linux containers for consistent development and deployment. Linux Journal 2014(239), 2:2 (Mar 2014)
7. Schaer, P., Schaible, J., Garcia Castro, L.J.: Overview of LiLAS 2020 - Living Labs for Academic Search. In: Arampatzis, A., Kanoulas, E., Tsikrika, T., Vrochidis, S., Joho, H., Lioma, C., Eickhoff, C., Cappellato, L., Névéol, A., Ferro, N. (eds.) Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Eleventh International Conference of the CLEF Association (CLEF 2020). Lecture Notes in Computer Science, vol. 12260 (2020)
8. Schaer, P., Schaible, J., Müller, B.: Living Labs for Academic Search at CLEF 2020. In: Jose, J.M., Yilmaz, E., Magalhães, J., Castells, P., Ferro, N., Silva, M.J., Martins, F. (eds.) Advances in Information Retrieval. Lecture Notes in Computer Science, vol. 12036, pp. 580-586. Springer International Publishing, Cham (2020)
9. Schaible, J., Breuer, T., Tavakolpoursaleh, N., Müller, B., Wolff, B., Schaer, P.: Evaluation infrastructures for academic shared tasks. Datenbank-Spektrum 20(1), 29-36 (Mar 2020). https://doi.org/10.1007/s13222-020-00335-x
10. Yang, W., Lu, K., Yang, P., Lin, J.: Critically examining the "neural hype": Weak baselines and the additivity of effectiveness gains from neural ranking models. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '19), pp. 1129-1132. ACM Press, Paris, France (2019). https://doi.org/10.1145/3331184.3331340