<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Appendix C Results of the Robust Track</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Giorgio Maria Di Nunzio</string-name>
          <email>dinunzio@dei.unipd.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicola Ferro</string-name>
          <email>ferro@dei.unipd.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Information Engineering, University of Padua, Italy</institution>
        </aff>
      </contrib-group>
      <fpage>641</fpage>
      <lpage>682</lpage>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Introduction</p>
      <p>Results for CLEF 2008 Ad-hoc Robust Track</p>
      <p>3. Individual Experiment Results and Graphs</p>
      <p>This section provides the individual results for each official experiment. For each experiment, the following tables and graphs are shown:
- Overall statistics and information
- Interpolated recall vs precision averages plot
- Average precision statistics and box plot
- Average precision comparison to median plot
- Document cutoff levels vs precision at DCL plot
- R-Precision statistics and box plot
- R-Precision comparison to median plot
Topics and experiments are both identified by DOIs. The DOI prefix for a topic is 10.2452, and the DOI of a topic is built by appending its number to this prefix: for topic 200-AH, the corresponding DOI is 10.2452/200-AH.</p>
      <p>List of Submitted Experiments</p>
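      <p>The rule for building a topic DOI (the prefix 10.2452 followed by the topic number) can be sketched in Python as follows. This is a minimal illustration and not part of the original appendix: the names TOPIC_DOI_PREFIX and topic_doi are hypothetical, and replacing the Unicode minus sign (U+2212) with an ASCII hyphen is an assumption about how topic identifiers copied from the plots might need cleaning.</p>
      <preformat>
```python
# Hypothetical helper: builds a topic DOI from the prefix given in the text.
TOPIC_DOI_PREFIX = "10.2452"


def topic_doi(topic_id: str) -> str:
    """Return the DOI for a topic identifier such as '200-AH'."""
    # Assumption: identifiers copied from plots may use U+2212 instead of '-'.
    cleaned = topic_id.strip().replace("\u2212", "-")
    if not cleaned:
        raise ValueError("topic identifier must not be empty")
    return f"{TOPIC_DOI_PREFIX}/{cleaned}"


print(topic_doi("200-AH"))  # prints 10.2452/200-AH
```
      </preformat>
      <p>Experiment DOIs follow their own scheme, so the helper is deliberately limited to topics.</p>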
      <p>[Table: submitted experiments. Surviving columns: topic fields (TD, TDN) and participants (ufrgs, uniba, unine)]</p>
      <p>Track Overview Results and Graphs</p>
      <p>Ad-Hoc Robust Monolingual English Test Task Top 5 Participants - Comparison to Median Average Precision by Topic (Topics 191-AH to 265-AH)</p>
      <p>Experiments (axis: Average Precision)
ufrgs [Experiment UFRGS_R_MONO2_TEST; MAP 33.95%; Not Pooled]
ixa [Experiment EN2ENNOWSD; MAP 35.34%; Not Pooled]
ufrgs [Experiment UFRGS_R_MONO1_TEST; MAP 31.20%; Not Pooled]
inaoe [Experiment INAOEF; MAP 28.46%; Not Pooled]
know-center [Experiment ASSO; MAP 27.72%; Not Pooled]
inaoe [Experiment INAOEV; MAP 25.82%; Not Pooled]
uniba [Experiment MONO11NUS2F; MAP 19.24%; Not Pooled]
uniba [Experiment MONO1TDNUS2F; MAP 16.81%; Not Pooled]
uniba [Experiment MONO13NUS2F; MAP 15.48%; Not Pooled]
uniba [Experiment MONO12NUS2FOUT; MAP 14.57%; Not Pooled]
uniba [Experiment MONO14NUS2F; MAP 6.87%; Not Pooled]</p>
      <p>Track Overview Results and Graphs AH-ROBUST-MONO-EN-TEST-CLEF2008
10.2455/TUKEY_T_TEST.488B4DC45C240AEDD7AED91CF79383BE</p>
      <p>Ad-Hoc Robust Monolingual English Test Task - Tukey T test with "top group" highlighted
Axis: arcsin(sqrt(Average Precision))</p>
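      <p>The axis label of this plot indicates that average precision is shown after an arcsin(sqrt(p)) transformation, which is commonly used to stabilize the variance of proportions. The sketch below only illustrates that transformation. The function name arcsine_sqrt is hypothetical, the input values are arbitrary examples, and how the transformed values enter the Tukey T test is not described in this appendix.</p>
      <preformat>
```python
import math


def arcsine_sqrt(average_precision: float) -> float:
    """Map a proportion in [0, 1] to arcsin(sqrt(p)), in radians."""
    if not (average_precision >= 0.0 and 1.0 >= average_precision):
        raise ValueError("average precision must lie in [0, 1]")
    return math.asin(math.sqrt(average_precision))


# Arbitrary example proportions, not per-topic data from the track.
for p in (0.0, 0.25, 0.5, 1.0):
    print(p, round(arcsine_sqrt(p), 4))
```
      </preformat>
      <p>The result ranges from 0 to pi/2 radians, which matches the scale of the axis ticks visible in the plot.</p>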
      <p>Ad-Hoc Robust Monolingual English Test Task Top 5 Participants - Comparison to Median R-Precision by Topic (Topics 191-AH to 265-AH)
unine [Experiment UNINEROBUST4; R-Prec 42.99%; Not Pooled]
geneva [Experiment ISILEMTDN; R-Prec 38.05%; Not Pooled]
ucm [Experiment BM25_BO1; R-Prec 36.15%; Not Pooled]
ixa [Experiment EN2ENNOWSDPSREL; R-Prec 36.12%; Not Pooled]
ufrgs [Experiment UFRGS_R_MONO2_TEST; R-Prec 32.81%; Not Pooled]</p>
      <p>Experiments (axis: R-Precision)
ufrgs [Experiment UFRGS_R_MONO2_TEST; R-Prec 32.81%; Not Pooled]
ixa [Experiment EN2ENNOWSD; R-Prec 33.14%; Not Pooled]
ufrgs [Experiment UFRGS_R_MONO1_TEST; R-Prec 30.43%; Not Pooled]
know-center [Experiment ASSO; R-Prec 27.45%; Not Pooled]
inaoe [Experiment INAOEF; R-Prec 27.35%; Not Pooled]
inaoe [Experiment INAOEV; R-Prec 25.53%; Not Pooled]
uniba [Experiment MONO11NUS2F; R-Prec 21.20%; Not Pooled]
uniba [Experiment MONO1TDNUS2F; R-Prec 19.00%; Not Pooled]
uniba [Experiment MONO13NUS2F; R-Prec 17.01%; Not Pooled]
uniba [Experiment MONO12NUS2FOUT; R-Prec 16.43%; Not Pooled]
uniba [Experiment MONO14NUS2F; R-Prec 8.57%; Not Pooled]</p>
      <p>Ad-Hoc Robust Monolingual English Test Task - Tukey T test with "top group" highlighted</p>
      <p>UNINEROBUST4
UNINEROBUST1
ISILEMTDN
ISILEMTD
EN2ENNOWSDPSREL
BM25_KLD
BM25_BO1
BM25_BO1_AVICTF
BM25_BASELINE
EN2ENNOWSD
UFRGS_R_MONO2_TEST
UFRGS_R_MONO1_TEST
ASSO
INAOEF
INAOEV
MONO11NUS2F
MONO1TDNUS2F
MONO13NUS2F
MONO12NUS2FOUT
MONO14NUS2F</p>
      <p>ixa</p>
      <p>Precision averages (%) for individual queries
4
AUTOMATIC
Spanish; Castilian
title, description
false
topics, UBC docs
best</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Standard Recall Levels vs Mean Interpolated Precision (ES2ENUBCDOCSPSREL)</p>
      <p>R_PRECISION: Maximum, Minimum, First Quartile, Second Quartile, Third Quartile, Interquartile range, Mean, Standard Deviation, Lower Outlier Threshold, Upper Outlier Threshold, Mean With No Outliers, Std With No Outliers</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Comparison to Median R-Precision by Topic (Topics 141-AH to 165-AH)</p>
      <p>ufrgs
Precision averages (%) for individual queries
1
AUTOMATIC
Spanish; Castilian
title, description
false</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Standard Recall Levels vs Mean Interpolated Precision (UFRGS_R_BI_WSD1_TEST)</p>
      <p>R_PRECISION: Maximum, Minimum, First Quartile, Second Quartile, Third Quartile, Interquartile range, Mean, Standard Deviation, Lower Outlier Threshold, Upper Outlier Threshold, Mean With No Outliers, Std With No Outliers</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Box plot of the Topics of the Experiment</p>
      <p>uniba
Precision averages (%) for individual queries
2
AUTOMATIC
Spanish; Castilian
title, description,
false
NLevels (Keyword</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Standard Recall Levels vs Mean Interpolated Precision (CROSSWSD11NUS2F)</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Box plot of the Topics of the Experiment</p>
      <p>R_PRECISION: Maximum, Minimum, First Quartile, Second Quartile, Third Quartile, Interquartile range, Mean, Standard Deviation, Lower Outlier Threshold, Upper Outlier Threshold, Mean With No Outliers, Std With No Outliers</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Retrieved documents vs Mean Precision (CROSSWSD11NUS2F)</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Box plot of the Topics of the Experiment</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Comparison to Median R-Precision by Topic (Topics 141-AH to 165-AH)</p>
      <p>uniba
Precision averages (%) for individual queries
3
AUTOMATIC
Spanish; Castilian
title, description,
false
NLevels (Synset</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Standard Recall Levels vs Mean Interpolated Precision (CROSSWSD12NUS2F)</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Box plot of the Topics of the Experiment</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Comparison to Median Average Precision by Topic (Topics 141-AH to 165-AH)</p>
      <p>R_PRECISION: Maximum, Minimum, First Quartile, Second Quartile, Third Quartile, Interquartile range, Mean, Standard Deviation, Lower Outlier Threshold, Upper Outlier Threshold, Mean With No Outliers, Std With No Outliers</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Retrieved documents vs Mean Precision (CROSSWSD12NUS2F)</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Box plot of the Topics of the Experiment</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Comparison to Median R-Precision by Topic (Topics 141-AH to 165-AH)</p>
      <p>uniba
Precision averages (%) for individual queries
1
AUTOMATIC
Spanish; Castilian
title, description
false
synset expansion</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Standard Recall Levels vs Mean Interpolated Precision (CROSSWSD1NUS2F)</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Box plot of the Topics of the Experiment</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Comparison to Median Average Precision by Topic (Topics 141-AH to 165-AH)</p>
      <p>R_PRECISION
Maximum: 1.0000
Minimum: 0.0000
First Quartile: 0.0000
Second Quartile: 0.0000
Third Quartile: 0.0832
Interquartile range: 0.0832
Mean: 0.0792
Standard Deviation: 0.1665
Lower Outlier Threshold: 0.0000
Upper Outlier Threshold: 0.2037
Mean With No Outliers: 0.0267
Std With No Outliers: 0.0537</p>
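      <p>The statistics named in this table can be computed from a list of per-topic scores along the following lines. This is a hedged sketch rather than the procedure used for the appendix: the function name box_plot_summary is hypothetical, and the quartile method (inclusive), the 1.5 x IQR outlier fences, and the sample standard deviation are all assumptions. The reported values suggest a different convention for the thresholds (a third quartile of 0.0832 with an interquartile range of 0.0832 would give an upper fence of 0.2080 under this rule, not 0.2037), so the sketch is not expected to reproduce the table exactly.</p>
      <preformat>
```python
import statistics


def box_plot_summary(values, whisker=1.5):
    """Summarize per-topic scores with the statistics named in the table."""
    # Assumption: inclusive quartiles and 1.5 x IQR fences; the appendix
    # does not state which conventions were actually used.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
    # Scores outside the fences are treated as outliers.
    inliers = [v for v in values if upper >= v >= lower]
    return {
        "maximum": max(values),
        "minimum": min(values),
        "first_quartile": q1,
        "second_quartile": q2,
        "third_quartile": q3,
        "interquartile_range": iqr,
        "mean": statistics.mean(values),
        "standard_deviation": statistics.stdev(values),
        "lower_outlier_threshold": lower,
        "upper_outlier_threshold": upper,
        "mean_no_outliers": statistics.mean(inliers),
        "std_no_outliers": statistics.stdev(inliers),
    }


# Made-up scores for illustration only (not data from the track).
summary = box_plot_summary([0.0, 0.1, 0.2, 0.3, 0.4, 1.0])
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```
      </preformat>
      <p>In the made-up example the score 1.0 falls above the upper fence, so the mean and standard deviation without outliers are computed over the remaining five scores.</p>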
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Retrieved documents vs Mean Precision (CROSSWSD1NUS2F)
Axis: Retrieved Documents (logarithmic scale)</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Box plot of the Topics of the Experiment</p>
      <p>Ad-Hoc Robust Word Sense Disambiguation Bilingual English Test Task - Comparison to Median R-Precision by Topic (Topics 141-AH to 165-AH)</p>
    </sec>
  </body>
  <back>
  </back>
</article>