<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">UIC/OHSU CLEF 2018 Task 2 Diagnostic Test Accuracy Ranking using Publication Type Cluster Similarity Measures</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Aaron</forename><forename type="middle">M</forename><surname>Cohen</surname></persName>
							<email>cohenaa@ohsu.edu</email>
							<affiliation key="aff0">
								<orgName type="institution">Oregon Health &amp; Science University</orgName>
								<address>
									<settlement>Portland</settlement>
									<region>Oregon</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Neil</forename><forename type="middle">R</forename><surname>Smalheiser</surname></persName>
							<email>neils@uic.edu</email>
							<affiliation key="aff1">
								<orgName type="department">College of Medicine</orgName>
								<orgName type="institution">University of Illinois</orgName>
								<address>
									<settlement>Chicago</settlement>
									<region>Illinois</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">UIC/OHSU CLEF 2018 Task 2 Diagnostic Test Accuracy Ranking using Publication Type Cluster Similarity Measures</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">7769EDFC6086F9415791588D0C0983E8</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T02:33+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Machine Learning</term>
					<term>Support Vector Machine</term>
					<term>Publication Types</term>
					<term>Diagnostic Test Accuracy</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The CLEF 2018 Task 2 goal was to identify and rank retrieved articles relevant to conducting a systematic diagnostic test accuracy review on a given topic. The UIC/OHSU team did not attempt to rank retrieved articles by relevance directly, but rather explored the baseline value of ranking retrieved articles according to the probability that they are concerned with diagnostic test accuracy. First, a set of six publication type clusters, including a cluster of diagnostic test accuracy papers (DTAs), was built by searching PubMed from 1987-2015. We created several types of cluster similarity measures for each publication type. Similarity types included: implicit-term similarity, most important word similarity, journal similarity, and author count similarity. These similarity features were then used with weighted and unweighted linear SVM machine learning algorithms, which were trained with a data set retrieved from PubMed searches consisting of 3481 PMIDs likely to be DTAs, and 71684 PMIDs most of which are not likely to be DTAs. The trained models produce scores predicting the probability that an individual article is a DTA. The CLEF 2018 Task 2 Test PMIDs for each topic were scored and ranked, and the cutoff probability for each of the two models was determined by visual inspection of the score distribution on the test data. Cutoff probabilities chosen were 0.20 for the unweighted SVM model and 0.40 for the weighted SVM model.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>We participated in Task 2 of the CLEF 2018 e-Health challenge <ref type="bibr" target="#b0">[1]</ref> <ref type="bibr" target="#b1">[2]</ref>. The goal of this task was to identify and rank articles relevant to conducting a systematic diagnostic test accuracy review on a given topic, among those articles returned by topic-specific PubMed queries.</p><p>Figure <ref type="figure">1</ref>. PubMed query used to retrieve likely DTAs for training data.</p><p>We have been extending our prior work on probability-based tagging for specific publication types <ref type="bibr" target="#b2">[3]</ref> by developing a general system to predict probabilities for multiple publication types simultaneously <ref type="bibr" target="#b3">[4]</ref>. We applied a preliminary version of that system to six clinical publication types, reporting here only on DTA publications.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Methods</head><p>The UIC/OHSU CLEF 2018 Task 2 submission applies a machine learning approach to ranking the PMIDs retrieved by CLEF for 20 topics. The approach assigns probabilities to individual PMIDs based on the likelihood that they are DTAs. To generate positive training data, likely DTAs were retrieved using the PubMed query shown in Figure <ref type="figure">1</ref>. No information specific to the topic queries that generated each PMID list was used.</p><p>The system builds a predictive model in stages. Similarity types used as features included: implicit term similarity, most important word similarity, journal similarity, and author count similarity. Implicit term similarity measures how similar a paper is to a cluster based on terms (words, bigrams, etc.) that commonly occur with words contained in the papers within each cluster, relative to the baseline frequency across MEDLINE. A cluster "centroid-like" vector is computed as the mean of the individual cluster article vectors, where each article vector consists of the 300 weighted terms most associated with the words in the article. The cluster centroid is limited to the 300 highest total-scoring terms across the cluster. See <ref type="bibr" target="#b4">[5]</ref> for a complete and detailed description.</p><p>Most important word similarity measures the fraction of words in the paper that appear in the list of most important words computed for each cluster, as measured by the frequency of each word in that cluster versus MEDLINE as a whole <ref type="bibr" target="#b5">[6]</ref>. Journal similarity measures how representative an article's journal is of a cluster, again measured by the frequency of the journal in that cluster versus the rest of MEDLINE. 
A MeSH-based journal distance measure was used for papers published in journals that did not occur in the cluster, estimating cluster similarity from the most similar journal in the cluster <ref type="bibr" target="#b6">[7]</ref>. The author count similarity measures how selective the author count of a paper is for a particular cluster. Note that the criteria used to define DTAs in the PubMed search were NOT themselves used as features in the classification model; individual publication MeSH terms were not used directly as features in any of the similarity measures.</p><p>The four similarity measures produce one feature for each of the six publication type clusters, resulting in 24 similarity-based features. These similarity features were then used with weighted and unweighted linear SVM machine learning algorithms, which were trained with a data set retrieved from the 1987-2015 PubMed searches. The DTA cluster was used as the positive training data set, and the other clusters were combined into the negative training data set. This resulted in training data consisting of 3481 PMIDs likely to be diagnostic test accuracy papers (DTAs), and 71684 PMIDs most of which are not likely to be DTAs.</p><p>The trained weighted and unweighted SVM models were then applied to the CLEF 2018 Task 2 challenge data. The PMIDs supplied in the topic files were used to retrieve the full PubMed XML record for each article, and the XML records were used to compute the 24 similarity features for input to the trained models.</p><p>The trained models produce probability scores predicting whether or not an individual PMID is a DTA. The PMID predictions were then organized according to the CLEF 2018 Task 2 topics and ranked within each topic by probability. The cut-off probability for each of the two models was determined by visual inspection of the score distribution on the test data. 
Cutoff probabilities chosen were 0.20 for the unweighted SVM model and 0.40 for the weighted SVM model. This information was combined into the submission qrel files, ordering each topic's publication PMIDs from highest to lowest predicted probability, one file for each model. In this manner we produced two sets of predictions, submitted as two separate runs: OHSU_UIC_LIBLINW for the weighted model, and OHSU_UIC_LIBLINB for the unweighted model.</p></div>
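The pipeline described above (24 cluster-similarity features, weighted and unweighted linear SVMs, probability-ranked output) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: scikit-learn's liblinear-backed LinearSVC with inverse class-frequency weights and Platt-style calibration stands in for the weighted LIBLINW model, and all feature values, PMIDs, and class separations are synthetic.

```python
# Hypothetical sketch of the ranking stage. Assumptions: scikit-learn's
# LinearSVC (liblinear backend) approximates the weighted SVM, and
# CalibratedClassifierCV supplies the probability scores used for ranking.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.calibration import CalibratedClassifierCV

rng = np.random.default_rng(0)

# 24 features = 4 similarity measures x 6 publication-type clusters.
n_pos, n_neg, n_feat = 200, 800, 24
X = np.vstack([rng.normal(1.0, 1.0, (n_pos, n_feat)),    # likely DTAs
               rng.normal(-1.0, 1.0, (n_neg, n_feat))])  # other clusters
y = np.array([1] * n_pos + [0] * n_neg)

# "Weighted" model: inverse class-frequency weights, as in the LIBLINW run.
svm = LinearSVC(class_weight="balanced", C=1.0, max_iter=5000)
# Platt-style calibration maps SVM margins to probabilities for thresholding.
clf = CalibratedClassifierCV(svm, cv=3).fit(X, y)

# Score one topic's retrieved PMIDs (synthetic) and rank by probability.
topic_pmids = ["101", "102", "103", "104"]
topic_X = rng.normal(0.0, 1.5, (len(topic_pmids), n_feat))
probs = clf.predict_proba(topic_X)[:, 1]
ranked = sorted(zip(topic_pmids, probs), key=lambda t: -t[1])
kept = [pmid for pmid, p in ranked if p >= 0.40]  # weighted-model cutoff
```

The 0.40 threshold mirrors the weighted-model cutoff reported above; in the actual submission it was chosen by inspecting the score distribution rather than fixed in advance.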
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Results</head><p>The official overall evaluation results for our systems are shown in Table <ref type="table" target="#tab_1">1</ref>. Across the board, the liblinear system with inverse class-frequency weighting performed slightly better than the version with a bias term. These results are averages across all the topics. Judging by the comparable CLEF 2017 task, these results are roughly at the median relative to other entries: the average precision achieved by our weighted liblinear system was 0.180, which would have ranked 14th out of 33 CLEF 2017 entries <ref type="bibr" target="#b7">[8]</ref>.</p></div>
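The measures reported in Table 1 can be recomputed from any ranked list. A minimal sketch with made-up relevance labels follows; the helper functions implement the standard work-saved-over-sampling (WSS) and recall-at-percentage definitions, not the official CLEF evaluation script.

```python
# Synthetic illustration of WSS@r and Recall@pct on a ranked list,
# where 1 = relevant DTA and 0 = not relevant.

def wss_at(relevant, recall_target):
    """WSS@r = (N - k)/N - (1 - r), where k is the smallest number of
    top-ranked documents that must be screened to reach recall r."""
    n = len(relevant)
    total_rel = sum(relevant)
    found = 0
    for k, rel in enumerate(relevant, start=1):
        found += rel
        if found >= recall_target * total_rel:
            return (n - k) / n - (1 - recall_target)
    return 0.0

def recall_at_pct(relevant, pct):
    """Recall after screening the top pct fraction of the ranking."""
    k = max(1, int(round(pct * len(relevant))))
    return sum(relevant[:k]) / sum(relevant)

ranking = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]  # made-up labels, 4 relevant
w95 = wss_at(ranking, 0.95)       # -> 0.25: all 4 found by rank 7 of 10
r50 = recall_at_pct(ranking, 0.50)  # -> 0.75: 3 of 4 found in the top half
```

For this synthetic ranking, reaching 95% recall requires screening 7 of 10 documents, saving 30% of the work over random sampling minus the 5% recall allowance, hence WSS@95% = 0.25.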
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Discussion</head><p>Considering that we only ranked articles according to their probability of being a DTA, and did not evaluate query topic information at all, our approach did have some significant value in identifying articles that are relevant for inclusion in topic-specific systematic reviews.</p><p>We plan to continue developing our system, expanding the number of clusters and publication types and adding further cluster similarity measures. While the current approach uses an SVM in a one-versus-rest configuration for multi-class classification, we are also experimenting with classifiers that handle multiple categories more flexibly, such as random forests and deep neural networks. </p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1.</head><label>1</label><figDesc>Official evaluation overall results for the UIC/OHSU Task 2 system entries.</figDesc><table><row><cell>Run Label</cell><cell cols="2">OHSU_UIC_LIBLINW OHSU_UIC_LIBLINB</cell></row><row><cell>Algorithm</cell><cell>Liblinear with inverse frequency class weights</cell><cell>Liblinear with bias term</cell></row><row><cell>WSS@100%</cell><cell>0.164</cell><cell>0.154</cell></row><row><cell>WSS@95%</cell><cell>0.264</cell><cell>0.255</cell></row><row><cell>Recall@10%</cell><cell>0.296</cell><cell>0.289</cell></row><row><cell>Recall@20%</cell><cell>0.473</cell><cell>0.462</cell></row><row><cell>Recall@30%</cell><cell>0.579</cell><cell>0.562</cell></row><row><cell>Recall@40%</cell><cell>0.641</cell><cell>0.624</cell></row><row><cell>Recall@50%</cell><cell>0.695</cell><cell>0.683</cell></row><row><cell>Recall@60%</cell><cell>0.751</cell><cell>0.739</cell></row><row><cell>Recall@70%</cell><cell>0.805</cell><cell>0.793</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Overview of the CLEF eHealth Evaluation Lab</title>
		<author>
			<persName><forename type="first">H</forename><surname>Suominen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kelly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Goeuriot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Kanoulas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Azzopardi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Spijker</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF 2018 -8th Conference and Labs of the Evaluation Forum. CEUR-WS</title>
		<title level="s">Lecture Notes in Computer Science (LNCS</title>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">CLEF 2018 Technology Assisted Reviews in Empirical Medicine Overview</title>
		<author>
			<persName><forename type="first">E</forename><surname>Kanoulas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Spijker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Azzopardi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF 2018 Evaluation Labs and Workshop</title>
				<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Automated confidence ranked classification of randomized controlled trial articles: an aid to evidence-based medicine</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Cohen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">R</forename><surname>Smalheiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Mcdonagh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">E</forename><surname>Adams</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Davis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J Am Med Inform Assoc</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="707" to="717" />
			<date type="published" when="2015-05">2015 May</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Design of a generic, open platform for machine learning-assisted indexing and clustering of articles in PubMed, a biomedical bibliographic database</title>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">R</forename><surname>Smalheiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Cohen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Data Inf Manag</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="10" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Unsupervised Low-Dimensional Vector Representations for Words, Phrases and Text that are Transparent, Scalable, and produce Similarity Metrics that are Complementary to Neural Embeddings</title>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">R</forename><surname>Smalheiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Bonifield</surname></persName>
		</author>
		<idno>arXiv preprint arXiv:1801.01884, 2018</idno>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Distribution of &quot;Characteristic&quot; Terms in MEDLINE Literatures</title>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">R</forename><surname>Smalheiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">I</forename><surname>Torvik</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="266" to="276" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Three journal similarity metrics and their application to biomedical journals</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>D'Souza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">R</forename><surname>Smalheiser</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PloS One</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">12</biblScope>
			<biblScope unit="page">e115681</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">CLEF 2017 technologically assisted reviews in empirical medicine overview</title>
		<author>
			<persName><forename type="first">E</forename><surname>Kanoulas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Azzopardi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Spijker</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1" to="29" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
