<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">SiS at CLEF 2017 eHealth TAR Task</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Vassil</forename><surname>Kalphov</surname></persName>
							<email>vassil.kalphov.2013@uni.strath.ac.uk</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Strathclyde</orgName>
								<address>
									<settlement>Glasgow</settlement>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Georgios</forename><surname>Georgiadis</surname></persName>
							<email>georgios.georgiadis.2013@uni.strath.ac.uk</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Strathclyde</orgName>
								<address>
									<settlement>Glasgow</settlement>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Leif</forename><surname>Azzopardi</surname></persName>
							<email>leif.azzopardi@strath.ac.uk</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Strathclyde</orgName>
								<address>
									<settlement>Glasgow</settlement>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">SiS at CLEF 2017 eHealth TAR Task</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">7BFC2D39DA2BF87B68C5D8A815D3684D</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T20:27+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper presents Strathclyde iSchool's (SiS) participation in the Technological Assisted Reviews in Empirical Medicine Task. For the ranking task, we explored two ways in which assistance to reviewers could be provided during the assessment process: (i) topic models, where we use Latent Dirichlet Allocation to identify topics within the set of retrieved documents, ranking documents by the topic most likely to be relevant and (ii) relevance feedback, where we use Rocchio's algorithm to update the query model for subsequent rounds of interaction. A third approach combines the topic and relevance feedback to quickly identify the relevant abstracts. For the thresholding task, we apply a score threshold, and exclude documents which did not exceed the threshold given BM25.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>CLEF 2017 introduced a new eHealth retrieval problem -that of providing technological assistance to reviewers of systematic reviews -where the goals of the task were to explore how Information Retrieval techniques could be used to: (i) identify relevant material more quickly in the ranking challenge and (ii) identify when reviewers could stop processing documents in the thresholding challenge <ref type="bibr">[3,</ref><ref type="bibr">2]</ref>. During the review process, reviewers will routinely examine hundreds to thousands of abstracts to decided if the document (and evidence it contains) could be included in the systematic review that they are conducting <ref type="bibr">[5]</ref>. Once they have identified a subset of abstracts, which are potentially relevant, they examine the document's contents to decide whether the document should be included or excluded. The track focused on the first part, identifying potential relevant documents during the, so called, screening phase.</p><p>In this work we considered two different approaches -one which uses topic modelling and the other which uses relevance feedback. In selecting these approaches we thought that such techniques could be used in the following way. For topic modelling, we envisaged that the download abstracts could be semantically clustered -and the different clusters could be presented to the reviewerthe reviewer could then start the review process by selecting a cluster that they felt was most likely to contain the relevant documents. Since we did not have recourse to reviewers, we explored a number of different ways to automatically select the best cluster. For relevance feedback, we envisaged that as the reviewer starts to examine documents, the query could be updated to bring back the next most relevant documents, so that they would quickly find all the relevant material as soon as possible. Obviously, if the aim is to reduce the workload of the reviewers, then we need to be able to select a point where the reviewer can stop assessing documents -however, this runs the risk of losing relevant documents. To this end, we explore various heuristics to select the threshold such that we minimize effort and maximise recall (but ideally obtain total recall).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Experimental Set-up</head><p>Data: Given the list of topic descriptions the PubMed IDs were extracted, and a scripted fetched the Abstract and associated Metadata from the PubMed API. From the topics, we extracted the title for each topic and use that as the query. Indexing and Retrieval System: We used Lucene 6.2 to create a separate index for each of the topic (where stop words were removed, no stemming was applied). A Lucene Document was created where the following fields were index: Title, Abstract, Author, and Publication Name. The baseline retrieval algorithm we employed was fielded BM25 with standard parameters settings i.e. b = 0.75, and equal weights between fields (denoted as BM25). Relevance Feedback: We implemented Rocchio's Algorithm [4] in Lucenewhere feedback was used to provide relevance information. In each round of feedback, 30 documents were examined, and the query model updated, to provide a re-ranking of the subsequent documents. This was performed on the first 10%, 20%, etc of documents associated with the topic. Here, we only report the 30% runs (AL30) as these generally performed the best and little change in performance was observed on the training set using more feedback. Topic Modelling: We used MALLET toolkit, and thus Latent Dirichlet Allocation <ref type="bibr" target="#b0">[1]</ref> to semantically cluster the documents within each topic. We set the number of latent topics to 5, and α =. To rank the documents we selected one of the latent topics z i , and ordered the documents by the probability p(z i |d|). In an attempt to select the cluster that provides the best ranking, we took the BM25 ranking from above, and use the top 100 ranked documents as pseudo-relevance feedback. Then we ranked each latent topic (given the ordering by p(z i |d|)) and select the one with the highest overlap with BM25 (TMBM). Combined: Since our topic modelling approaches do not use feedback, we decided to see whether we could start the process of relevance feedback using the topic modelling run, and refine the query model accordingly. Thus, we selected the best performing method TMAL, given the training data, and used this as starting point for the active learning. Again we considered obtained feedback for the first 10%,20% and 30% of the documents per topic. Thresholded Runs To create thresholded runs, we took the BM25 run and applied a simple score based threshold. Using Lucene the scores for a query given BM25 are from 1.0 or greater, so we used thresholds 1.0, 1.5, 2.0 and 2.5. This led to a reasonable reduction in the number of documents without sacrificing much recall.</p><p>Tables <ref type="table" target="#tab_2">1 and 3</ref> report the performance on the Ranking Task, while Tables 2 and 4 report the performance on the Threshold Task. Our best performing run on the Ranking Task, in terms of normalized area under the gain curve, was AL30, which used 30% of the documents as feedback. From our results, it is clear that the topic modelling approach, as we have employed it, has not lead to significantly better improvements over the BM25 baseline. However, when inspecting the individual topic based ranking, the most probably topic, was not always the best performing topic. So we will direct more research into topic selection. This is because, when formulating queries, reviewers could use topic modelling to understand the space of documents retrieved, and then refine their query further -and thus make savings a priori rather than have to trawl through hundreds and thousands of documents. Another factor that could significantly improve our results is that for these initial runs we used the title as the query, as opposed to the boolean query provided within the topics. It is quite possible that the more complex and verbose boolean queries could lead to a better initial ranking -and so when used in conjunction with relevance feedback the relevant items could be found sooner. We leave these directions for further work. </p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Results on Training Data for Ranking Task</figDesc><table><row><cell>Run</cell><cell cols="5">Random BM25 AL30 TMBM TMAL30</cell></row><row><cell>NumRels</cell><cell>2494</cell><cell cols="2">2494 2494</cell><cell>2494</cell><cell>2494</cell></row><row><cell>NumFeed</cell><cell>0</cell><cell>0</cell><cell>44939</cell><cell>0</cell><cell>44940</cell></row><row><cell>RelsFound</cell><cell>2494</cell><cell cols="2">2494 2494</cell><cell>2494</cell><cell>2494</cell></row><row><cell>AP</cell><cell>0.044</cell><cell cols="2">0.171 0.273</cell><cell>0.15</cell><cell>0.24</cell></row><row><cell>MinRel</cell><cell>6947</cell><cell cols="2">5174 3210</cell><cell>4221</cell><cell>2835</cell></row><row><cell>WSS100</cell><cell>0.05</cell><cell cols="2">0.25 0.33</cell><cell>0.30</cell><cell>0.37</cell></row><row><cell>Area</cell><cell>0.51</cell><cell cols="2">0.83 0.91</cell><cell>0.80</cell><cell>0.90</cell></row><row><cell>NCG10</cell><cell>0.07</cell><cell cols="2">0.30 0.41</cell><cell>0.30</cell><cell>0.41</cell></row><row><cell>NCG20</cell><cell>0.18</cell><cell cols="2">0.51 0.73</cell><cell>0.57</cell><cell>0.73</cell></row><row><cell>NCG30</cell><cell>0.29</cell><cell cols="2">0.67 0.86</cell><cell>0.74</cell><cell>0.86</cell></row><row><cell>TotalCost</cell><cell>7453</cell><cell cols="3">7453 11947 7453</cell><cell>11947</cell></row><row><cell>TotalCostUniform</cell><cell>7453</cell><cell cols="3">7453 11947 7453</cell><cell>11947</cell></row><row><cell cols="2">TotalCostWeighted 7453</cell><cell cols="3">7453 11947 7453</cell><cell>11947</cell></row><row><cell>loss er</cell><cell>0.38</cell><cell cols="2">0.38 0.38</cell><cell>0.38</cell><cell>0.38</cell></row><row><cell>loss r</cell><cell>0.0</cell><cell>0.0</cell><cell>0.0</cell><cell>0.0</cell><cell>0.0</cell></row><row><cell>loss e</cell><cell>0.38</cell><cell cols="2">0.38 0.38</cell><cell>0.38</cell><cell>0.38</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>Results on Training Data for Threshold Task</figDesc><table><row><cell>Run</cell><cell cols="4">BM25 T1 T1.5 T2 T2.5</cell></row><row><cell>NumRels</cell><cell cols="4">2494 2949 2494 2494 2494</cell></row><row><cell>NumFeed</cell><cell>0</cell><cell>0</cell><cell>0</cell><cell>0</cell></row><row><cell>RelsFound</cell><cell cols="4">2494 2439 2419 2318 2187</cell></row><row><cell>AP</cell><cell cols="4">0.17 0.171 0.17 0.17 0.17</cell></row><row><cell>MinRel</cell><cell cols="4">5174 4531 3930 3453 3236</cell></row><row><cell>WSS100</cell><cell cols="4">0.25 0.23 0.22 0.19 0.19</cell></row><row><cell>Area</cell><cell cols="4">0.83 0.83 0.83 0.82 0.80</cell></row><row><cell>NCG10</cell><cell cols="4">0.30 0.30 0.30 0.30 0.30</cell></row><row><cell>NCG20</cell><cell cols="4">0.51 0.51 0.51 0.51 0.51</cell></row><row><cell>NCG30</cell><cell cols="4">0.67 0.67 0.67 0.67 0.67</cell></row><row><cell>TotalCost</cell><cell cols="4">7453 6010 5391 4700 4144</cell></row><row><cell cols="5">TotalCostWeighted 7453 6811 6978 6911 6985</cell></row><row><cell cols="5">TotalCostUniform 7453 6049 5482 5008 4714</cell></row><row><cell>loss er</cell><cell cols="4">0.38 0.30 0.24 0.20 0.16</cell></row><row><cell>loss r</cell><cell>0.0</cell><cell cols="3">0.0 0.001 0.006 0.016</cell></row><row><cell>loss e</cell><cell cols="4">0.38 0.29 0.24 0.19 0.15</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3 :</head><label>3</label><figDesc>Results on Test Data for Ranking Task</figDesc><table><row><cell>Run</cell><cell cols="5">Random BM25 AL30 TMBM TMAL30</cell></row><row><cell>NumRels</cell><cell>1857</cell><cell cols="2">1857 1857</cell><cell>1857</cell><cell>1857</cell></row><row><cell>NumFeed</cell><cell>0</cell><cell>0</cell><cell>35730</cell><cell>0</cell><cell>35432</cell></row><row><cell>RelsFound</cell><cell>1857</cell><cell cols="2">1857 1857</cell><cell>1857</cell><cell>1857</cell></row><row><cell>AP</cell><cell>0.05</cell><cell cols="2">0.17 0.22</cell><cell>0.12</cell><cell>0.16</cell></row><row><cell>MinRel</cell><cell>3722</cell><cell cols="2">2851 2290</cell><cell>3124</cell><cell>2305</cell></row><row><cell>WSS100</cell><cell>0.04</cell><cell cols="2">0.29 0.41</cell><cell>0.27</cell><cell>0.40</cell></row><row><cell>Area</cell><cell>0.48</cell><cell cols="2">0.81 0.86</cell><cell>0.73</cell><cell>0.84</cell></row><row><cell>NCG10</cell><cell>0.09</cell><cell cols="2">0.45 0.62</cell><cell>0.31</cell><cell>0.54</cell></row><row><cell>NCG20</cell><cell>0.19</cell><cell cols="2">0.65 0.79</cell><cell>0.55</cell><cell>0.77</cell></row><row><cell>NCG30</cell><cell>0.28</cell><cell cols="2">0.75 0.88</cell><cell>0.68</cell><cell>0.86</cell></row><row><cell>TotalCost</cell><cell>3918</cell><cell cols="2">3918 6300</cell><cell>3918</cell><cell>6280</cell></row><row><cell>TotalCostUniform</cell><cell>3918</cell><cell cols="2">3918 6300</cell><cell>3918</cell><cell>6280</cell></row><row><cell cols="2">TotalCostWeighted 3918</cell><cell cols="2">3918 6300</cell><cell>3918</cell><cell>6280</cell></row><row><cell>loss er</cell><cell>0.54</cell><cell cols="2">0.54 0.54</cell><cell>0.54</cell><cell>0.54</cell></row><row><cell>loss r</cell><cell>0.00</cell><cell cols="2">0.00 0.00</cell><cell>0.00</cell><cell>0.00</cell></row><row><cell>loss e</cell><cell>0.54</cell><cell cols="2">0.54 0.54</cell><cell>0.54</cell><cell>0.54</cell></row><row><cell cols="6">2. Goeuriot, L., Kelly, L., Suominen, H., Névéol, A., Robert, A., Kanoulas, E., Spijker,</cell></row><row><cell cols="6">R., Palotti, J., Zuccon, G.: CLEF 2017 eHealth evaluation lab overview. In: CLEF</cell></row><row><cell cols="6">2017 -8th Conference and Labs of the Evaluation Forum, Lecture Notes in Computer</cell></row><row><cell cols="3">Science (LNCS). Springer (September 2017)</cell><cell></cell><cell></cell><cell></cell></row><row><cell cols="6">3. Kanoulas, E., Li, D., Azzopardi, L., Spijker, R.: echnologically assisted reviews in</cell></row><row><cell cols="6">empirical medicine. In: CLEF 2017 Evaluation Labs and Workshop: Online Working</cell></row><row><cell>Notes, CEUR-WS (2017)</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell></row><row><cell cols="6">4. Rocchio, J.J.: Relevance feedback in information retrieval (1971)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4 :</head><label>4</label><figDesc>Results on Test Data for Threshold Task Khan, Park, Thomas: Use of cost-effectiveness analysis to compare the efficiency of study identification methods in systematic reviews. Systematic reviews(2016)   </figDesc><table><row><cell>Run</cell><cell cols="5">BM25 T1 T1.5 T2 T2.5</cell></row><row><cell>NumRels</cell><cell cols="5">1857 1857 1857 1857 1857</cell></row><row><cell>NumFeed</cell><cell>0</cell><cell>0</cell><cell>0</cell><cell>0</cell><cell>0</cell></row><row><cell>RelsFound</cell><cell cols="5">1857 1828 1809 1784 1758</cell></row><row><cell>AP</cell><cell cols="5">0.17 0.17 0.17 0.17 0.17</cell></row><row><cell>MinRel</cell><cell cols="5">2851 2503 2333 2068 1877</cell></row><row><cell>WSS100</cell><cell>0.29</cell><cell cols="4">28 0.27 0.23 0.22</cell></row><row><cell>Area</cell><cell cols="5">0.81 0.81 0.80 0.80 0.79</cell></row><row><cell>NCG10</cell><cell cols="5">0.45 0.45 0.45 0.45 0.45</cell></row><row><cell>NCG20</cell><cell cols="5">0.65 0.65 0.65 0.65 0.65</cell></row><row><cell>NCG30</cell><cell cols="5">0.75 0.75 0.75 0.75 0.75</cell></row><row><cell>TotalCost</cell><cell cols="5">3918 3435 3165 2824 2536</cell></row><row><cell cols="6">TotalCostUniform 3918 3786 3865 3748 3902</cell></row><row><cell cols="6">TotalCostWeighted 3918 3454 3280 3117 2905</cell></row><row><cell>loss er</cell><cell cols="5">0.54 0.54 0.38 0.33 0.27</cell></row><row><cell>loss r</cell><cell cols="5">0.00 0.001 0.005 0.01 0.014</cell></row><row><cell>loss e</cell><cell cols="5">0.54 0.54 0.38 0.32 0.26</cell></row><row><cell>5. Shemilt,</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Latent dirichlet allocation</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">M</forename><surname>Blei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">Y</forename><surname>Ng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">I</forename><surname>Jordan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="993" to="1022" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
