=Paper=
{{Paper
|id=Vol-3177/paper15
|storemode=property
|title=A Comprehensive Dataset for Modern Learning to Rank Solutions (Abstract)
|pdfUrl=https://ceur-ws.org/Vol-3177/paper15.pdf
|volume=Vol-3177
|authors=Domenico Dato,Sean MacAvaney,Franco Maria Nardini,Raffaele Perego,Nicola Tonellotto
|dblpUrl=https://dblp.org/rec/conf/iir/DatoMN0T22
}}
==A Comprehensive Dataset for Modern Learning to Rank Solutions (Abstract)==
Domenico Dato (Istella, Italy), Sean MacAvaney (University of Glasgow, UK), Franco Maria Nardini (ISTI-CNR, Italy), Raffaele Perego (ISTI-CNR, Italy), Nicola Tonellotto (University of Pisa, Italy)

IIR 2022: 12th Italian Information Retrieval Workshop, June 29-30, 2022, Milan, Italy

Abstract

In recent years, interest in neural Learning-to-Rank (LtR) approaches based on pre-trained language models has grown. These techniques have proven very effective at ranking tasks such as question answering and ad-hoc document ranking, largely because deep neural networks can capture complex language patterns and learn to extract effective features from text. Over the same period, feature-based LtR methods reached maturity, and research in this area focused primarily on specific aspects such as efficiency and diversification. The two research areas have progressed almost entirely in isolation, and the effectiveness of neural LtR approaches relative to traditional feature-based LtR methods has not yet been well established. A major reason for this separation is the lack of publicly available datasets enabling a direct comparison: LtR datasets providing query-document feature vectors do not contain the raw query and document text, while the benchmarks commonly used for evaluating neural models (e.g., MS MARCO and TREC Robust) provide text but no query-document feature vectors.

In this presentation, we introduce Istella22, a new dataset that enables such comparisons by providing both query/document text and the strong query-document feature vectors used by an industrial search engine. The dataset, detailed in a resource paper presented at ACM SIGIR 2022 [1], consists of a comprehensive corpus of 8.4M web documents, a collection of query-document pairs described by 220 hand-crafted features, relevance judgments on a five-grade scale, and a set of 2,198 textual queries used for testing. Istella22 thus enables a fair evaluation of traditional learning-to-rank and transfer ranking techniques on the same data: LtR models exploit the feature-based representations of the training samples, while pre-trained transformer-based neural rankers can be evaluated on the corresponding textual content of queries and documents.

Through preliminary experiments on Istella22, we find that neural re-ranking approaches lag behind LtR models in terms of effectiveness. However, when the scores of neural models are added as features, LtR models identify them as strong ranking signals.

References

[1] D. Dato, S. MacAvaney, F. M. Nardini, R. Perego, N. Tonellotto. The Istella22 Dataset: Bridging Traditional and Neural Learning to Rank Evaluation. In: Proceedings of ACM SIGIR, 2022.
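To make the feature-based side of this comparison concrete, the following is a minimal sketch of training a LambdaMART ranker on hand-crafted feature vectors. It assumes the vectors are distributed in SVMlight format with qid annotations, as is common for the Istella LETOR releases; the file name `train.svm` is hypothetical.

```python
# Minimal LambdaMART training sketch on SVMlight-formatted feature vectors.
# The file name "train.svm" is hypothetical; the layout (a 5-grade relevance
# label, a qid, and 220 hand-crafted features per query-document pair)
# follows the abstract's description.
import itertools

import numpy as np
import lightgbm as lgb
from sklearn.datasets import load_svmlight_file

X_train, y_train, qid_train = load_svmlight_file("train.svm", query_id=True)

# LightGBM wants group sizes (documents per query) in order of appearance;
# rows belonging to the same query are assumed to be contiguous.
group_train = np.array(
    [sum(1 for _ in g) for _, g in itertools.groupby(qid_train)]
)

ranker = lgb.LGBMRanker(
    objective="lambdarank",  # LambdaMART-style listwise objective
    n_estimators=500,
    learning_rate=0.05,
)
ranker.fit(X_train, y_train, group=group_train)
```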
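On the neural side, the textual queries and documents can be scored directly with a pre-trained cross-encoder re-ranker. The sketch below uses a public MS MARCO cross-encoder purely as an example; the query-document pairs shown are invented placeholders, and how the text is actually loaded from Istella22 is not specified here.

```python
# Sketch of scoring query-document *text* pairs with an off-the-shelf
# cross-encoder. The model is a public MS MARCO re-ranker used only as
# an example; the pairs below are made-up placeholders.
from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2", max_length=512)

pairs = [
    ("cheap flights to milan", "Compare fares and book flights to Milan ..."),
    ("learning to rank tutorial", "An introduction to gradient-boosted ranking ..."),
]
neural_scores = model.predict(pairs)  # one relevance score per (query, doc) pair
```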
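Finally, the observation that LtR models treat neural scores as strong signals suggests a simple fusion recipe: append the neural re-ranker score to the 220 hand-crafted features and retrain. The sketch below uses small random stand-ins for the feature matrix, scores, and labels so that it runs in isolation; it illustrates the idea rather than the experimental setup of [1].

```python
# Sketch: appending a neural re-ranker score as a 221st feature so the
# LtR model can exploit it as a signal. Random stand-ins replace the real
# Istella22 data to keep the example self-contained.
import numpy as np
import lightgbm as lgb
from scipy.sparse import csr_matrix, hstack

rng = np.random.default_rng(0)
n_queries, docs_per_query = 4, 10
n_rows = n_queries * docs_per_query

X = csr_matrix(rng.random((n_rows, 220)))  # stand-in hand-crafted features
neural = rng.random((n_rows, 1))           # stand-in neural re-ranker scores
y = rng.integers(0, 5, size=n_rows)        # 5-grade relevance labels
group = [docs_per_query] * n_queries       # documents per query

X_aug = hstack([X, csr_matrix(neural)], format="csr")  # column 221 = neural score

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=50,
                        min_child_samples=1)
ranker.fit(X_aug, y, group=group)

# Gain-based importance of the appended neural-score feature:
print(ranker.booster_.feature_importance(importance_type="gain")[-1])
```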