<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>X. Wu, Q. Liu, J. Qin, Y. Yu, Peerrank: Robust learning to rank with peer loss over noisy labels, IEEE Access</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1109/ACCESS.2022.3142096</article-id>
      <title-group>
        <article-title>SOUR: an Outliers Detection Algorithm in Learning to Rank (Abstract)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Federico Marcuzzi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claudio Lucchese</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Salvatore Orlando</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Università Ca' Foscari Venezia</institution>
          ,
          <addr-line>Via Torino, 155, 30170 Mestre, Venezia VE</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>10</volume>
      <issue>2022</issue>
      <fpage>11</fpage>
      <lpage>12</lpage>
      <abstract>
        <p>Outlier data points are known to negatively affect the learning process of regression or classification models, yet their impact in the learning-to-rank scenario has not been thoroughly investigated so far. In this talk we present our effort to address this research problem. The full version of this work will appear at ICTIR 2022 [1]. We designed SOUR, a learning-to-rank method that detects and removes outliers before building an effective ranking model. We limit our analysis to gradient boosting decision trees, but our algorithm can be easily adapted to handle different learning strategies, such as artificial neural networks. SOUR searches for outlier instances that are consistently ranked incorrectly in several consecutive iterations of the learning process. We performed an extensive evaluation on three publicly available datasets and empirically demonstrated that i) removing a limited number of outlier data instances before re-training a new model provides statistically significant improvements in terms of effectiveness, and ii) SOUR outperforms state-of-the-art de-noising and outlier detection methods such as [2]. Finally, we investigated how the removal of the outliers affects the ensemble structure and found that the ensemble leaves were purer when trained without the outliers.</p>
      </abstract>
      <kwd-group>
        <kwd>Information Retrieval</kwd>
        <kwd>Learning to Rank</kwd>
        <kwd>Machine Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body />
  <back>
    <ref-list />
  </back>
</article>