<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Modelling Temporal Relationships in Pseudomonas Aeruginosa Antimicrobial Resistance Prediction in Intensive Care Unit</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Àlvar</forename><surname>Hernàndez-Carnerero</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Campus del Poblenou</orgName>
								<orgName type="institution">Universitat Pompeu Fabra</orgName>
								<address>
									<postCode>08018</postCode>
									<settlement>Barcelona</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Miquel</forename><surname>Sànchez-Marrè</surname></persName>
							<email>miquel@cs.upc.edu</email>
							<affiliation key="aff1">
								<orgName type="department">Dept. of Computer Science</orgName>
								<orgName type="laboratory" key="lab1">Knowledge Engineering and Machine Learning Group (KEMLG-UPC)</orgName>
								<orgName type="laboratory" key="lab2">Intelligent Data Science and Artificial Intelligence Research Centre (IDEAI-UPC)</orgName>
								<orgName type="institution">Universitat Politècnica de Catalunya</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Inmaculada</forename><surname>Mora-Jiménez</surname></persName>
							<email>inmaculada.mora@urjc.es</email>
							<affiliation key="aff2">
								<orgName type="department" key="dep1">Department of Signal Theory and Communications</orgName>
								<orgName type="department" key="dep2">Telematics and Computing Systems</orgName>
								<orgName type="institution">Rey Juan Carlos University</orgName>
								<address>
									<postCode>28943</postCode>
									<settlement>Madrid</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Cristina</forename><surname>Soguero-Ruiz</surname></persName>
							<email>cristina.soguero@urjc.es</email>
							<affiliation key="aff3">
								<orgName type="department" key="dep1">Department of Signal Theory and Communications</orgName>
								<orgName type="department" key="dep2">Telematics and Computing Systems</orgName>
								<orgName type="institution">Rey Juan Carlos University</orgName>
								<address>
									<postCode>28943</postCode>
									<settlement>Madrid</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sergio</forename><surname>Martínez-Agüero</surname></persName>
							<affiliation key="aff4">
								<orgName type="department" key="dep1">Department of Signal Theory and Communications</orgName>
								<orgName type="department" key="dep2">Telematics and Computing Systems</orgName>
								<orgName type="institution">Rey Juan Carlos University</orgName>
								<address>
									<postCode>28943</postCode>
									<settlement>Madrid</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Joaquín</forename><surname>Álvarez-Rodríguez</surname></persName>
							<affiliation key="aff5">
								<orgName type="department">Intensive Care Department</orgName>
								<orgName type="institution">University Hospital of Fuenlabrada</orgName>
								<address>
									<postCode>28942</postCode>
									<settlement>Madrid</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Modelling Temporal Relationships in Pseudomonas Aeruginosa Antimicrobial Resistance Prediction in Intensive Care Unit</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">83FC4FAA56509D6B5A01AD65B33E72C8</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T00:58+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this paper, we consider the prediction of antimicrobial resistance of Pseudomonas aeruginosa bacteria in nosocomial infections in the Intensive Care Unit (ICU). A Logistic Regression model was trained using information from patients' health records together with the history of past sensitivity tests (antibiograms). To predict antimicrobial resistance for a given patient, this study proposes to model temporal relationships using bacterial information from the other patients who are in the ICU at the same time. Furthermore, a training window of incremental size is used so that the training set is always as close in time as possible to the test instances to be predicted. With these contributions, experiments show promising results in predicting antimicrobial resistance even when little training data is available. These results further suggest that resistant bacteria may be spreading among patients in the ICU and that their populations mutate rapidly, changing the underlying data distribution over time.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">INTRODUCTION</head><p>Antimicrobial resistance has been increasing for decades, and new antibiotics are not being synthesized fast enough to reverse this trend <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b16">17]</ref>. A large proportion of infections caused by resistant bacteria occur during hospital stays, especially in the Intensive Care Unit (ICU) <ref type="bibr" target="#b15">[16]</ref>. There, infection rates are much higher than in other hospital divisions <ref type="bibr" target="#b8">[9]</ref>. This is due to the severely vulnerable ICU population, to the high risk of becoming infected through multiple procedures, and to the use of invasive devices that breach the anatomical protective barriers of patients (intubation, mechanical ventilation, vascular access, etc.) <ref type="bibr" target="#b2">[3]</ref>.</p><p>Several kinds of bacteria that can become multidrug-resistant are frequently found in the ICU. Among them are Acinetobacter spp., Enterococcus faecalis and Enterococcus faecium, Escherichia coli, Klebsiella pneumoniae, Pseudomonas aeruginosa and Staphylococcus aureus. In this paper, we focus on Pseudomonas aeruginosa due to its prevalence and virulence. It is naturally resistant to many antibiotics and has a remarkable capacity for acquiring new resistance mechanisms, creating therapeutic problems <ref type="bibr" target="#b5">[6]</ref>. Pseudomonas aeruginosa is considered to be multi-drug resistant (MDR) when it shows reduced in vitro susceptibility to three or more antimicrobial families <ref type="bibr" target="#b20">[21]</ref>.</p><p>Infections due to MDR microorganisms are known to be a major problem. They have a significant impact in the ICU, where they can cause additional morbidity, mortality, and health care costs <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b10">11]</ref>. 
Inappropriate initial antimicrobial treatment of P. aeruginosa is statistically linked to higher mortality compared with appropriate initial treatment. The growing MDR rate of P. aeruginosa also increases the chance of inappropriate initial antimicrobial treatment <ref type="bibr" target="#b12">[13]</ref>.</p><p>To tackle MDR in the hospital, a culture or microbiological analysis is usually performed to test whether the bacterium is resistant or susceptible to a set of antibiotics. For this purpose, the germ (bacterium) is first isolated and an antibiogram is carried out. The antibiogram represents the in vitro resistance of the bacterium to a series of antibiotics. The set of antibiotics used in the antibiogram can be selected for the specific type of bacteria being tested. The result of the antibiogram is a vector of antibiotic/sensitivity pairs <ref type="bibr" target="#b3">[4]</ref>. Antibiograms are often used by clinicians to assess bacterial susceptibility rates, as an aid in selecting empiric antibiotic therapy <ref type="bibr" target="#b9">[10]</ref>. Hence, the antibiogram result can vary between bacterial species, depending on the resistance of a particular bacterium to different antibiotics. However, groups of antibiotics quite often still show similar sensitivity when tested on a given bacterial species, regardless of strain <ref type="bibr" target="#b19">[20]</ref>.</p><p>In addition, the result of the antibiogram can help to reduce the spread of bacteria by prompting special measures such as isolation of the patient. One of the most relevant factors in the spread of bacterial resistance is so-called cross-transmission <ref type="bibr" target="#b17">[18]</ref>, which may facilitate the spread of resistant bacteria from one patient to another. Measures such as hand hygiene, skin cleansing, and contact precautions can help to prevent cross-transmission <ref type="bibr" target="#b17">[18]</ref>. 
To know how and where to take extreme caution, information about the kinds of bacteria in the ICU and their resistance plays a key role. Since ICU patients have a critical health status and the antibiogram result can take from 24h to 48h <ref type="bibr" target="#b11">[12]</ref>, it is of major importance to develop tools which can help to anticipate this result. This would contribute not only to saving patients' lives but also to preventing the spread of resistant bacteria.</p><p>For the aforementioned reasons, in the current study we propose to use a Data Mining (DM) technique and specific temporal data processing to obtain a quick estimate of the antibiogram result. Many state-of-the-art studies on the use of DM methods to predict antimicrobial susceptibility rely on whole genome sequencing <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b14">15,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b0">1]</ref>. Although it is a very promising technique, it involves very significant costs. As an alternative, we propose to use information from the ICU health records and demographic data of patients, along with historic antibiogram results, to train a DM model aiming to predict resistant bacteria in new cultures. As opposed to whole genome sequencing, our approach uses data which are already available in the vast majority of hospitals, in order to speed up the process of identifying positive cases. Similar strategies have been analyzed in the past <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b19">20,</ref><ref type="bibr" target="#b18">19]</ref>.</p><p>The remainder of this paper is structured as follows. Section 2 describes the dataset and the procedure to create new features. The experimental setup is established in Section 3, and results are presented in Section 4. Finally, conclusions, limitations of the study, and suggested future research are presented in Section 5. 
</p><p>The dataset contains the results of antibiograms performed on patients in the ICU, that is, the results of the sensitivity tests (susceptible (s) or resistant (r)) for certain pairs of bacterium and antibiotic family used in the test. It also includes demographic data of the patients and information about their ICU admission: age, gender, date of ICU admission, clinical origin of the patient before ICU admission, reason for admission, patient category, comorbidities and pluripathology (which indicates whether a patient has more than two comorbidities).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">DATA DESCRIPTION AND PREPROCESSING</head><p>As already mentioned, in this study we focus on just one of the multiple types of bacteria available in the data set: P. aeruginosa. This bacterium is considered MDR if it is resistant to three or more of the following antimicrobial families within the same culture: Aminoglycosides, Carbapenems, 4th Generation Cephalosporins, Extended-spectrum penicillins, Polymyxins and Quinolones.</p><p>In this work, we analyze all instances containing the bacterium and antimicrobial families of interest. Each instance is represented by the features described in Table <ref type="table" target="#tab_0">1</ref>. Features c&amp;car and c&amp;cf4 represent the targets to be predicted, that is, the result of the sensitivity test for P. aeruginosa against the antimicrobial families of Carbapenems and 4th Generation Cephalosporins, respectively. We consider only these two antimicrobial families because this is a first approach to the problem and doing so limits the scope of the study. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Generation of new features</head><p>In addition to the selected features, we propose to generate a new kind of feature based on the temporal information of cultures recorded in the data set. The purpose of these features is to capture the presence of resistant bacteria in the ICU over time and the "intensity" of that presence. By "intensity" we mean the number of patients infected and the number of days since a resistant bacterium was detected. For a specific instance of the data set, which represents a culture Cp of patient Pp, the cultures containing P. aeruginosa are collected for the other patients Pi (Ci = {Cij}) between 21 days and 48 hours before the date of the culture Cp. These cultures exclude those associated with the patient Pp under analysis. Note that, since the results of the test usually take 48 hours to be provided, it is not possible to use cultures taken, for instance, one hour ago. Moreover, from a clinical viewpoint, if the culture result is positive, it is kept as positive for the next 21 days. For this reason, we consider cultures collected up to 21 days before the date Dp of the current culture Cp of patient Pp.</p><p>In total, six features are created using the information of the past cultures, one for each of the antimicrobial families mentioned above: r&amp;amg, r&amp;car, r&amp;cf4, r&amp;pap, r&amp;pol and r&amp;qui. For instance, feature r&amp;amg considers only cultures tested for the Aminoglycosides family. The value of this feature is obtained from the set of past cultures of all other patients (Pi), excluding the patient under analysis (Pp). Each culture Cij on date dij for patient Pi has a sensitivity test result rij, which is 0 or 1 depending on whether the bacterium is susceptible or resistant to a specific family of antibiotics. 
In order to give more emphasis to the most recent cultures, we use a negative exponential function <ref type="bibr" target="#b1">[2]</ref> to weight the culture results associated with each patient Pi as follows:</p><formula xml:id="formula_0">f_{C_{ij}}(D_p) = \begin{cases} 0 &amp; \text{if } r_{ij} = 0 \\ n^{-(D_p - d_{ij})} &amp; \text{if } r_{ij} = 1 \end{cases}<label>(1)</label></formula><p>where n is a real number experimentally set to 1.1. Then, to compute the value of each of the six features linked to patient Pp on date Dp, the maximum outcome of Eq. (<ref type="formula">1</ref>) is determined for each patient except Pp, and these maxima are added up according to Eq. (<ref type="formula" target="#formula_1">2</ref>).</p><formula xml:id="formula_1">FV_{C_p}(D_p) = \sum_{\forall P_i \neq P_p} \max_{C_i} f_{C_{ij}}(D_p)<label>(2)</label></formula></div>
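<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of Eqs. (1) and (2), the following minimal Python sketch computes one r&amp;* feature for a single antimicrobial family. It is not the authors' implementation: the tuple layout of the cultures, the function names, and the use of whole days for the 48-hour and 21-day limits are assumptions made here for illustration; only n = 1.1 and the two time limits come from the text.</p><p>
```python
from datetime import date

N_BASE = 1.1        # value of n reported in the text
MIN_LAG_DAYS = 2    # 48 hours, expressed in whole days (assumption)
MAX_LAG_DAYS = 21   # cultures older than 21 days are ignored


def culture_weight(result, culture_day, target_day, n=N_BASE):
    """Eq. (1): 0 for a susceptible culture, n ** -(Dp - dij) for a resistant one."""
    if result == 0:
        return 0.0
    return n ** -((target_day - culture_day).days)


def feature_value(past_cultures, patient_id, target_day, n=N_BASE):
    """Eq. (2): sum, over the other patients, of their maximum weighted culture.

    past_cultures is an iterable of (patient_id, culture_day, result) tuples
    for one antimicrobial family (hypothetical layout). Returns None when no
    culture of another patient falls inside the time window, which
    corresponds to a missing value of the feature.
    """
    best = {}
    for pid, day, result in past_cultures:
        lag = (target_day - day).days
        # Skip the patient under analysis and cultures outside the window.
        if pid == patient_id or lag > MAX_LAG_DAYS or MIN_LAG_DAYS > lag:
            continue
        weight = culture_weight(result, day, target_day, n)
        best[pid] = max(best.get(pid, 0.0), weight)
    if not best:
        return None
    return sum(best.values())
```
</p><p>In this sketch a patient with several resistant cultures contributes only the most recent one, since the weight decays with the number of days elapsed, and a patient whose cultures are all susceptible contributes 0.</p></div>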
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Data preprocessing</head><p>To proceed with the model design for c&amp;car and c&amp;cf4, we created two data subsets, one associated with each target. These subsets are used to train a Logistic Regression (LR) model for each target. Training two different classifiers instead of, for instance, a single multi-class classifier allows each classifier to specialize in predicting its particular target, so that its weights are tuned individually. In order to limit the study to MDR acquired in the ICU, only instances of patients admitted to the ICU for more than 48h are considered. The final data set for c&amp;car was composed of 450 cultures and 34 features including the target, and the final data set for c&amp;cf4 was composed of 556 cultures and the same 34 features including the target. In both data sets, missing values occur in only six features. The percentages of missing values in those features are shown in Table <ref type="table" target="#tab_1">2</ref>.</p><p>Before training the models, we deal with missing data associated with the proposed features r&amp;amg, r&amp;car, r&amp;cf4, r&amp;pap, r&amp;pol and r&amp;qui. Note that for some time intervals there may not be any culture in the ICU for P. aeruginosa and a particular antimicrobial family, in which case the feature takes a missing value. This can be addressed following different approaches, such as deleting instances with missing data or imputing missing values <ref type="bibr" target="#b21">[22]</ref>. In this study we propose a strategy based on the clinical meaning of the generated features. The smaller the value of these features, the fewer patients have been infected with resistant P. aeruginosa bacteria and the longer the time since they were infected. As a result, if the r&amp;* features do not have any value, it suggests that no P. aeruginosa has been recently detected in the ICU. 
Therefore, very likely no patients have been recently infected with resistant bacteria, and the value provided by Eq. (<ref type="formula" target="#formula_1">2</ref>) should be very small. In this case, missing values are replaced by 0. Afterwards, all categorical features are converted to binary following a one-hot encoding strategy, except for the features representing dates. Since dates have an intrinsic ordering, smaller numerical values are assigned to dates further in the past, and greater values correspond to more recent dates.</p><p>Finally, the Pearson correlation between features (without considering the targets) is calculated in order to discard the most correlated features, since they provide very similar information. The methodology is as follows. When two features have a correlation coefficient higher than 0.9 or lower than -0.9, just one of them is randomly selected and the rest are discarded. A visual representation of the Pearson correlation between features for the Carbapenems and 4th Generation Cephalosporins subsets is shown in Fig. <ref type="figure" target="#fig_1">1</ref>. A similar correlation pattern is found for both subsets. In both, there is just one group of features with pairwise correlations above 0.9: date culture, year culture, start date and year admission. From these four correlated features, date culture is selected because it is the most representative among them, and the rest are discarded. After this step, both data sets had 31 features. </p></div>
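<div xmlns="http://www.tei-c.org/ns/1.0"><p>The zero imputation and the correlation-based filtering described above can be sketched as follows. This is a hedged, standard-library illustration rather than the code used in the study: the function names, the dictionary-of-columns layout and the underscore-separated feature names are hypothetical, and the keep_first argument is an assumption introduced here to mimic the choice of keeping date culture (the text otherwise describes a random selection within each correlated group).</p><p>
```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / (dev_x * dev_y)


def fill_missing_with_zero(column):
    """Replace missing values (None) of a generated feature by 0."""
    return [0.0 if value is None else value for value in column]


def drop_correlated(columns, threshold=0.9, keep_first=()):
    """Return the names of the features kept after correlation filtering.

    columns maps a feature name to its list of values. A feature is kept
    only if its absolute correlation with every feature kept so far does
    not exceed the threshold. Names listed in keep_first are visited first,
    so they survive within their correlated group.
    """
    names = sorted(columns, key=lambda name: name not in keep_first)
    kept = []
    for name in names:
        if all(threshold >= abs(pearson(columns[name], columns[other]))
               for other in kept):
            kept.append(name)
    return kept
```
</p><p>For example, calling drop_correlated(columns, keep_first=("date_culture",)) on a table in which a year column is a linear function of the date column would retain the date column and discard the year column.</p></div>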
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">EXPERIMENTAL SETUP</head><p>Once the data are processed, the first experiment evaluates the relevance of the selected and proposed features with respect to the two target features to be predicted. This is analyzed using Mutual Information (MI) <ref type="bibr" target="#b4">[5]</ref>. MI measures the mutual dependence of two variables, that is, it quantifies the amount of information that one random variable provides about another. In terms of probabilities, the MI of two jointly discrete random variables X and Y is calculated as:</p><formula xml:id="formula_2">I(X; Y) = \sum_{y \in Y} \sum_{x \in X} p_{(X,Y)}(x, y) \log \frac{p_{(X,Y)}(x, y)}{p_X(x)\, p_Y(y)}<label>(3)</label></formula><p>where p_{(X,Y)} is the joint probability mass function of X and Y, and p_X and p_Y are the marginal probability mass functions of X and Y, respectively.</p><p>Turning now to the prediction of the target, the problem has a series of special characteristics that must be considered in order to address it properly.</p><p>The first is that health records have an inherent temporal ordering. This forces us to use as training data only instances dated before the test antibiograms to be predicted. In addition, a time margin has to be respected between the training and test windows, since antibiogram results are not available immediately after the test is carried out. As before, in this particular case a time margin of 48h needs to be considered.</p><p>The second particularity encountered when trying to predict is concept drift: the concept of interest may depend on some hidden context that is not given explicitly in the form of predictive features. Changes in the hidden context over time can induce more or less radical changes in the target concept. 
Changes in hidden context may not only result in a change of the target concept, but may also cause a change of the underlying data distribution <ref type="bibr" target="#b19">[20]</ref>. In the particular domain of this study, antimicrobial resistance, the hidden context that changes over time is the set of bacterial mutations that allow bacteria to become more resistant to antibiotics as time passes. In <ref type="bibr" target="#b19">[20]</ref> it is proposed to handle concept drift with instance selection, as it is the most commonly used technique and has been found to offer good results. More specifically, the proposed technique generalizes from a window that moves over recently arrived instances and uses the learnt concepts for prediction only in the immediate future. This would be a very good approach for the problem analyzed in this study were it not for the third particularity of the data set.</p><p>The third particularity is data scarcity, which makes it difficult to learn even from temporal windows spanning several months or years. In the first data set considered, the one with target feature c&amp;car, there are a total of 450 instances, and in the second data set, with target feature c&amp;cf4, there are 556. Taking into account that those data sets represent cultures from 10 years (from 2004 to 2013), the average number of cultures per year is 45 and 56 respectively, which is a relatively small number of instances considering how fast ICU bacteria are able to mutate and change their sensitivity patterns. For this reason, we propose to use an incremental window for training, which increases its size as the test window moves towards more recent instances. That is, the training window is anchored at the first (oldest) instances and gradually grows in size, containing more instances, as more recent test instances are predicted. 
The test window, on the other hand, has a fixed size and progressively slides to select more recent instances.</p><p>The DM technique used in the experiments is Logistic Regression (LR). It is chosen because its simplicity makes it suitable as a baseline, that is, a model that is both simple to set up and has a reasonable chance of providing acceptable results. LR is a classification technique, used in this study to evaluate the feasibility of learning from the data. The classifier has to decide whether the target is sensitive or resistant, so the decision is binary.</p><p>To assess the performance of the proposed incremental window framework, experiments are carried out with different configurations. The characteristics of the defined training and test windows are the following:</p><p>• For each experiment a set of LR classifiers is trained, each for a different training-test window. With each pair of training and test windows, a simple validation is performed to maintain the temporal order. Because of the class imbalance inherent to the problem, the test window may contain more instances of one class than of the other. To get a realistic approximation of the algorithm performance, the true negatives (success in sensitive instances) and the true positives (success in resistant instances) are calculated, together with the average of these two values and the general accuracy. For a particular test window with ns sensitive instances and nr resistant instances, if the method succeeds in predicting ps sensitive instances and pr resistant instances, these values are calculated as: </p><formula>\text{Sensitive success} = \frac{p_s}{n_s}<label>(4)</label></formula><formula>\text{Resistant success} = \frac{p_r}{n_r}<label>(5)</label></formula><formula>\text{Average} = \frac{1}{2}\left(\frac{p_s}{n_s} + \frac{p_r}{n_r}\right)<label>(6)</label></formula><formula>\text{Accuracy} = \frac{p_s + p_r}{n_s + n_r}<label>(7)</label></formula><p>These metrics offer a better approximation of the performance because they make it possible to track the success rate in the minority class, which in many real problems is the most important one. 
For instance, if the test set has 8 sensitive and 2 resistant instances and the DM technique predicts all instances as sensitive, the general accuracy would reach a fairly high value of 80%, while the technique would be performing poorly at identifying resistant antibiograms, which are the ones it is most important to detect. Finally, to calculate the mean accuracy over several windows, the successes are accumulated and the performance is evaluated using Equations 4, 5, 6 and 7. In other words, the performance over several windows is not calculated by averaging the individual performance values, but by accumulating the number of successfully predicted instances in each window and using these totals to calculate the accuracy. This is done because test windows may have different numbers of instances, since not all 3-month time intervals contain the same number of antibiograms. Averaging their accuracy values would therefore not be adequate, since some instances would have more weight than others depending on the number of instances in their test window.</p><p>This section presents the results obtained after carrying out the experiments described in Section 3. In Section 4.1, the mutual dependence among features is analyzed using MI. Sections 4.2 and 4.3 evaluate the impact of the features date culture and r&amp;*, respectively, on the prediction of bacterial resistance. The improvement in prediction obtained by using the incremental training window scheme is assessed in Section 4.4.</p></div>
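<div xmlns="http://www.tei-c.org/ns/1.0"><p>The incremental training window, the sliding 3-month test window and the accumulated success metrics described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: the function names and the tuple of per-window counts are hypothetical, windows are aligned to the first day of a month, and the 48-hour margin is approximated by two whole days.</p><p>
```python
from datetime import date, timedelta


def add_months(day, months):
    """Return the first day of the month lying `months` months after `day`."""
    index = day.year * 12 + (day.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def incremental_windows(first_test_start, end, test_months=3, margin_days=2):
    """Yield (train_end, test_start, test_end) triples.

    Training uses every instance dated before train_end; since the start of
    the data is fixed, the training set only grows. The test window is
    half-open, from test_start up to but excluding test_end, and slides by
    test_months at each step.
    """
    start = first_test_start
    while end > start:
        stop = min(add_months(start, test_months), end)
        yield start - timedelta(days=margin_days), start, stop
        start = stop


def accumulated_metrics(window_counts):
    """Accumulate (ns, nr, ps, pr) counts over windows, then compute rates.

    ns and nr are the numbers of sensitive and resistant test instances;
    ps and pr are the numbers of them predicted correctly. The totals of
    ns and nr are assumed to be non-zero.
    """
    ns = nr = ps = pr = 0
    for w_ns, w_nr, w_ps, w_pr in window_counts:
        ns, nr, ps, pr = ns + w_ns, nr + w_nr, ps + w_ps, pr + w_pr
    sensitive = ps / ns
    resistant = pr / nr
    return {
        "sensitive_success": sensitive,
        "resistant_success": resistant,
        "average": (sensitive + resistant) / 2,
        "accuracy": (ps + pr) / (ns + nr),
    }
```
</p><p>With the example given in the text (8 sensitive and 2 resistant instances, all predicted as sensitive), accumulated_metrics([(8, 2, 8, 0)]) yields an accuracy of 0.8 but a resistant success of 0.0 and an average of 0.5, which is why the per-class rates are reported alongside the general accuracy.</p></div>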
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Features relevance using mutual information</head><p>The results of feature relevance using MI for the Carbapenems and 4th Generation Cephalosporins targets are given in the left and right columns of Table <ref type="table" target="#tab_3">3</ref>, respectively. MI computes the relevance or weight of a feature according to the co-occurrence of this feature and the target feature, as described in Eq. (3). It does not take into account possible interactions of this feature with other features regarding the target. In both cases, the feature date culture is by far the variable containing the most information about the feature to be predicted. For Carbapenems, date culture has a value of 0.53, while the second most important feature, age, receives a relevance value of just 0.18. For 4th Generation Cephalosporins, date culture gets a value of 0.44 and the second feature, r&amp;qui, a value of 0.16. The importance of date culture suggests that antibiogram results are highly dependent on the time at which they were performed.</p><p>In addition, it is notable that five of the six proposed features r&amp;*, which consider antibiograms of other patients in the ICU, are among the eight most relevant features for both Carbapenems and 4th Generation Cephalosporins. Therefore, we can infer that they contain a great amount of information for predicting antimicrobial resistance. The proposed feature with considerably lower importance than the others is r&amp;pol. This is probably because the data set contains fewer antibiograms for the Polymyxins antimicrobial family (296) than for Aminoglycosides (564), Carbapenems (450), 4th Generation Cephalosporins (556), Extended-spectrum penicillins (558) and Quinolones (558).</p><p>Hence, one can infer that the result of the antibiogram for a particular patient depends on the past antibiogram results of other patients in the ICU. 
To explain this fact, we hypothesize that bacteria have been spreading from one patient to another in the ICU by cross-transmission. Therefore, the fact that a patient is infected with resistant bacteria may increase the odds of another patient becoming infected as well.</p></div>
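<div xmlns="http://www.tei-c.org/ns/1.0"><p>For reference, Eq. (3) can be evaluated on empirical frequencies with a few lines of Python. The sketch below is only an illustration for two discrete variables: the function name is hypothetical, the natural logarithm is an assumption (the base is not stated in the text), and continuous features such as dates or ages would first need to be discretized or handled with a different estimator, a detail the text does not specify.</p><p>
```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two discrete sequences.

    Probabilities in Eq. (3) are replaced by relative frequencies; pairs
    that never occur contribute nothing to the sum.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))
    marginal_x = Counter(xs)
    marginal_y = Counter(ys)
    total = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = marginal_x[x] / n
        p_y = marginal_y[y] / n
        total += p_xy * math.log(p_xy / (p_x * p_y))
    return total
```
</p><p>A feature that is independent of the target gives a value of 0, whereas a balanced binary feature that coincides with a balanced binary target gives log 2, the largest value attainable for a binary target.</p></div>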
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Prediction contribution of date culture feature</head><p>After observing that date culture is the most important feature according to MI, we evaluate the behavior of this variable when predicting the result of antibiograms. To do so, the target feature is predicted in two modes: one considering all features including date culture, and a second one discarding date culture from the feature set. These predictions are made using an incremental window for training and a 3-month sliding window for test, as described in the experimental setup section. The training window initially contains the 2004 and 2005 instances, and the test window the first three months of 2006. After that, the training window grows by three months and the test window slides by three months until the end of the database is reached. Results of the accumulated accuracy for all windows are shown in the two left columns of Table <ref type="table" target="#tab_4">4</ref> for Carbapenems and in the two right columns of the same Table <ref type="table" target="#tab_4">4</ref> for 4th Generation Cephalosporins.</p><p>It is remarkable that in both cases the success percentage slightly increases when date culture is not used. We hypothesize that this may be due precisely to the high influence this feature has on the prediction of antimicrobial resistance. For time intervals where the majority of instances belong to a particular label, this feature might be forcing the DM method to learn that, in this particular interval of values of date culture, it is highly probable that the predicted instance belongs to the majority label. That is, it may be introducing a bias towards the majority label of the time interval. This is reflected in the way resistant success and sensitive success vary when date culture is removed. 
For the Carbapenems target, it can be seen that when this feature is used, success in resistant instances increases and success in sensitive instances decreases. In the Carbapenems data set there are more resistant instances (238) than sensitive instances (212). This seems to indicate that using date culture improves the accuracy of the majority class and worsens that of the minority class. The same effect is observed for 4th Generation Cephalosporins, with the roles of the classes reversed: when date culture is considered, resistant success decreases and sensitive success increases, and here the majority class is the sensitive one (350 instances) as opposed to the resistant one (206 instances). This apparently shows the bias that the date culture feature introduces towards classifying instances as the majority class, because of its high influence on the prediction.</p><p>In the next experiments, date culture is discarded since it slightly reduces the accuracy of the algorithm.  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4">Prediction contribution of incremental window scheme</head><p>Finally, the incremental training window with 3-month test windows is evaluated to assess whether it improves the success metrics compared with training windows that end further from the test set, using the same 3-month test windows. In this experiment, years 2012 and 2013 are predicted. Table <ref type="table" target="#tab_6">6</ref> reports the success metrics for the Carbapenems data set for 2012 and 2013. Figure <ref type="figure" target="#fig_3">2</ref> shows the number of test instances in each 3-month test window during those years. In the two left columns of Table <ref type="table" target="#tab_6">6</ref>, accuracy improves from 59% with a fixed training window from 2004 to 2011 to 68% with the incremental window when predicting 2012 instances. The three columns on the right of Table <ref type="table" target="#tab_6">6</ref> show that, when predicting 2013, accuracy rises from 64% with training data until 2011 to 71% when year 2012 is also used for training. With the incremental window, accuracy remains at 71%. The reason the incremental window does not improve on training data until 2012 in this case may be that the number of instances from the first months of 2013 is small, as can be seen in Figure <ref type="figure" target="#fig_3">2</ref>, so including them in the incremental training window may add little knowledge about the problem. Table <ref type="table" target="#tab_7">7</ref> and Figure <ref type="figure" target="#fig_4">3</ref> show the results of the same experiment for the 4th Generation Cephalosporins data. In the two left columns of Table <ref type="table" target="#tab_7">7</ref>, accuracy for predicting 2012 is the same with training instances until 2011 and with the incremental window. 
As before, this may be because the instances from the first months of 2012 do not provide enough knowledge to improve prediction accuracy, although in this case there are more of them, as shown in Figure <ref type="figure" target="#fig_4">3</ref>. The three columns on the right of Table <ref type="table" target="#tab_7">7</ref> show that accuracy for predicting 2013 is 75% both with training data until 2011 and with training data until 2012. With the incremental window for training, it increases to 80%. These last experiments show that, in terms of accuracy, using training data temporally as close as possible to the test set either maintains or improves the results obtained with more distant data. Therefore, we conclude that the incremental window is a good training scheme for this particular problem. Also, in these last experiments, where the test years are 2012 and 2013, the accuracy values are higher than in the previous experiments, where accuracy was accumulated from 2006 to 2013. The fact that accuracy improves as the training set grows suggests that the algorithm is able to learn the model and generalize from the training data.</p></div>
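<div xmlns="http://www.tei-c.org/ns/1.0"><p>The difference between the fixed and the incremental training windows compared in this section can be illustrated with a short sketch. It assumes a hypothetical record layout (dictionaries with a culture date and a patient identifier), toy records that are not taken from the study, and it also applies the rule that patients present in the test window are excluded from training. None of the names below come from the original implementation.</p>
<p><![CDATA[
```python
from datetime import date


def training_cutoff(test_start, fixed_end=None):
    """Exclusive upper bound on the culture dates allowed in training.

    With `fixed_end=None` the window is incremental and reaches up to the
    start of the test window; otherwise it stops at `fixed_end`.
    """
    return test_start if fixed_end is None else min(fixed_end, test_start)


def select_training(records, test_start, test_end, fixed_end=None):
    """Return the training records for one test window [test_start, test_end)."""
    cutoff = training_cutoff(test_start, fixed_end)
    # Patients with a culture inside the test window must not appear in training.
    test_patients = {
        r["patient"] for r in records if test_start <= r["date"] < test_end
    }
    return [
        r for r in records
        if r["date"] < cutoff and r["patient"] not in test_patients
    ]


# Toy records (hypothetical, for illustration only).
records = [
    {"date": date(2011, 5, 10), "patient": "p1"},
    {"date": date(2012, 8, 2), "patient": "p2"},
    {"date": date(2013, 2, 14), "patient": "p3"},
    {"date": date(2013, 3, 20), "patient": "p4"},
    {"date": date(2013, 5, 5), "patient": "p4"},
    {"date": date(2013, 6, 1), "patient": "p5"},
]
test_start, test_end = date(2013, 4, 1), date(2013, 7, 1)

fixed_04_11 = select_training(records, test_start, test_end, fixed_end=date(2012, 1, 1))
fixed_04_12 = select_training(records, test_start, test_end, fixed_end=date(2013, 1, 1))
incremental = select_training(records, test_start, test_end)

for name, subset in [("04-11", fixed_04_11), ("04-12", fixed_04_12), ("Inc.", incremental)]:
    print(name, [r["patient"] for r in subset])
```
]]></p>
<p>In this toy case the incremental window is the only one that can use the culture from February 2013, while the March 2013 culture of patient p4 is left out of every training set because the same patient appears in the test window.</p></div>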
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">CONCLUSIONS AND FUTURE WORK</head><p>In this paper we propose using health records and past antibiogram data to predict antimicrobial resistance in the ICU. We also propose using information about resistant germs recently detected in the ICU as features of the data set. To handle changes in the data distribution over time caused by the progressive mutation of bacteria, we suggest an incremental training window and a fixed-size test window, so that training and test instances are temporally as close as possible.</p><p>Experiments show that information on resistant bacteria recently detected in ICU patients contains a relatively high amount of information for predicting bacterial resistance in other patients, which could indicate that bacteria are spreading among ICU patients. It has also been observed that features encoding the specific temporal ordering of all instances in the data set tend to decrease prediction accuracy. Lastly, experiments indicate that using an incremental window for training maintains or improves success rates. Therefore, we conclude that this scheme benefits the performance of the algorithm.</p><p>As future work, we consider including further details about each patient's admission, such as the antibiotics administered and whether intubation or mechanical ventilation was required, among others, since these indicators can have an impact on the appearance of resistant bacteria. Including past antibiogram information for each particular patient whose sensibility test is to be predicted would also be an interesting approach to evaluate. To extend this study, we propose using DM algorithms other than LR to assess whether they can improve the success rates seen here. In addition, it would be advantageous to predict resistance to the six antimicrobial families relevant to P. aeruginosa mentioned in the current study, so that it would be possible to detect when a bacterium may become multidrug resistant.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>The data considered in this work is a unified and anonymized dataset specifically collected for the study of antimicrobial resistance in the ICU of the University Hospital of Fuenlabrada (UHF) in Spain. The data set covers the years from 2004 to 2013. During this time interval, 1914 patients were admitted to the ICU, and 22142 cultures were carried out from 2186 admissions. It contains 257 different types of bacteria and 26 antimicrobial families.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 .</head><label>1</label><figDesc>Figure 1. Visual representation of Pearson correlation between features for: (a) Carbapenems and (b) 4th Generation Cephalosporins subset.</figDesc><graphic coords="3,314.94,527.00,225.45,101.93" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 2 .</head><label>2</label><figDesc>Figure 2. Number of test instances for Carbapenems in years 2012 to 2013 with 3 months granularity.</figDesc><graphic coords="7,38.65,71.12,252.00,126.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 3 .</head><label>3</label><figDesc>Figure 3. Number of test instances for 4th Gen. Cephalosporins in years 2012 to 2013 with 3 months granularity.</figDesc><graphic coords="7,301.66,71.12,252.00,126.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>Feature names and their description. Target features are marked in bold.</figDesc><table><row><cell>Feature name</cell><cell>Description</cell></row><row><cell>c&amp;car</cell><cell>Result of the sensibility test to the Carbapenems family (r/s).</cell></row><row><cell>c&amp;cf4</cell><cell>Result of the sensibility test to the 4th Generation Cephalosporins family (r/s).</cell></row><row><cell>days to culture</cell><cell>Number of days elapsed from admission to the date of the culture.</cell></row><row><cell>date culture</cell><cell>Date of the culture.</cell></row><row><cell>number antibiotics</cell><cell>Number of antibiotics tested in the antibiogram.</cell></row><row><cell>culture type</cell><cell>Type of culture performed (pharynx, urine, blood, etc.).</cell></row><row><cell>culture type grouped</cell><cell>Type of culture performed grouped (respiratory, urine, surface, etc.).</cell></row><row><cell>culture type grouped 2</cell><cell>Type of culture performed grouped (clinical sample/surface).</cell></row><row><cell>day month culture</cell><cell>Day of the month on which culture is carried out.</cell></row><row><cell>month culture</cell><cell>Month on which culture is carried out.</cell></row><row><cell>year culture</cell><cell>Year on which culture is carried out.</cell></row><row><cell>origin</cell><cell>Clinical origin before ICU admission.</cell></row><row><cell>reason admission</cell><cell>Reason of admission at ICU.</cell></row><row><cell>goi A</cell><cell>Group of illness A. Considers cardiovascular events.</cell></row><row><cell>goi B</cell><cell>Group of illness B. Considers kidney failure, arthritis.</cell></row><row><cell>goi C</cell><cell>Group of illness C. Considers respiratory problems.</cell></row><row><cell>goi D</cell><cell>Group of illness D. Considers pancreatitis, endocrine.</cell></row><row><cell>goi E</cell><cell>Group of illness E. Considers epilepsy, dementia.</cell></row><row><cell>goi F</cell><cell>Group of illness F. Considers diabetes, arteriosclerosis.</cell></row><row><cell>goi G</cell><cell>Group of illness G. Considers neoplasms.</cell></row><row><cell>pluripathology</cell><cell>Number of groups of illness to which patient belongs.</cell></row><row><cell>patient category</cell><cell>Patient's clinical category.</cell></row><row><cell>age</cell><cell>Patient age.</cell></row><row><cell>gender</cell><cell>Patient's gender.</cell></row><row><cell>start date</cell><cell>Date on which the patient's admission begins.</cell></row><row><cell>day week admission</cell><cell>Day of the week on which the patient's admission begins.</cell></row><row><cell>day month admission</cell><cell>Day of the month on which the patient's admission begins.</cell></row><row><cell>month admission</cell><cell>Month on which the patient's admission begins.</cell></row><row><cell>year admission</cell><cell>Year on which the patient's admission begins.</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 .</head><label>2</label><figDesc>Features with missing values (expressed in %) in both data sets.</figDesc><table><row><cell>Dataset</cell><cell>r&amp;amg</cell><cell>r&amp;car</cell><cell>r&amp;cf4</cell><cell>r&amp;pap</cell><cell>r&amp;pol</cell><cell>r&amp;qui</cell></row><row><cell>c&amp;car</cell><cell>11.33</cell><cell>15.56</cell><cell>11.33</cell><cell>11.33</cell><cell>39.33</cell><cell>11.78</cell></row><row><cell>c&amp;cf4</cell><cell>10.79</cell><cell>17.99</cell><cell>10.79</cell><cell>10.79</cell><cell>35.97</cell><cell>11.15</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head></head><label></label><figDesc>• The size of the test windows is fixed at 3 months, a relatively short period that stays close to the training instances while containing enough test instances. • Test windows do not overlap. That is, each test window contains different instances, belonging to a different time interval. • The training window contains no antibiograms from patients who are also present in the test window. For instance, if the result of an antibiogram of a particular patient is to be predicted in the test set, the training set contains no past antibiograms of that patient. This ensures that the patients in the training and test windows are different.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3 .</head><label>3</label><figDesc>MI feature weighting for the c&amp;car and c&amp;cf4 targets. Weight given by MI to each feature, in descending order of value. The left columns contain the values for the Carbapenems antimicrobial family and the right columns those for the 4th Gen. Cephalosporins family.</figDesc><table><row><cell>Feat. name</cell><cell>Wgts. c&amp;car</cell><cell>Feat. name</cell><cell>Wgts. c&amp;cf4</cell></row><row><cell>date culture</cell><cell>0.5312</cell><cell>date culture</cell><cell>0.4440</cell></row><row><cell>age</cell><cell>0.1866</cell><cell>r&amp;qui</cell><cell>0.1637</cell></row><row><cell>days to culture</cell><cell>0.1713</cell><cell>age</cell><cell>0.1578</cell></row><row><cell>r&amp;qui</cell><cell>0.1711</cell><cell>days to culture</cell><cell>0.1505</cell></row><row><cell>r&amp;pap</cell><cell>0.1668</cell><cell>r&amp;pap</cell><cell>0.1491</cell></row><row><cell>r&amp;amg</cell><cell>0.1459</cell><cell>r&amp;amg</cell><cell>0.1295</cell></row><row><cell>r&amp;car</cell><cell>0.1382</cell><cell>r&amp;car</cell><cell>0.1289</cell></row><row><cell>r&amp;cf4</cell><cell>0.1228</cell><cell>r&amp;cf4</cell><cell>0.1137</cell></row><row><cell>day month admission</cell><cell>0.1167</cell><cell>day month admission</cell><cell>0.1038</cell></row><row><cell>origin</cell><cell>0.0981</cell><cell>number antibiotics</cell><cell>0.0799</cell></row><row><cell>number antibiotics</cell><cell>0.0936</cell><cell>origin</cell><cell>0.0792</cell></row><row><cell>reason admission</cell><cell>0.0860</cell><cell>reason admission</cell><cell>0.0608</cell></row><row><cell>day month culture</cell><cell>0.0509</cell><cell>month admission</cell><cell>0.0556</cell></row><row><cell>month culture</cell><cell>0.0333</cell><cell>goi E</cell><cell>0.0485</cell></row><row><cell>culture type</cell><cell>0.0279</cell><cell>day month culture</cell><cell>0.0393</cell></row><row><cell>month admission</cell><cell>0.0219</cell><cell>day week admission</cell><cell>0.0366</cell></row><row><cell>day week admission</cell><cell>0.0214</cell><cell>culture type</cell><cell>0.0243</cell></row><row><cell>culture type grouped</cell><cell>0.0190</cell><cell>r&amp;pol</cell><cell>0.0209</cell></row><row><cell>r&amp;pol</cell><cell>0.0144</cell><cell>pluripathology</cell><cell>0.0204</cell></row><row><cell>pluripathology</cell><cell>0.0074</cell><cell>month culture</cell><cell>0.0131</cell></row><row><cell>gender</cell><cell>0.0061</cell><cell>culture type grouped</cell><cell>0.0116</cell></row><row><cell>goi B</cell><cell>0.0040</cell><cell>patient category</cell><cell>0.0099</cell></row><row><cell>goi F</cell><cell>0.0025</cell><cell>gender</cell><cell>0.0082</cell></row><row><cell>goi A</cell><cell>0.0020</cell><cell>goi B</cell><cell>0.0030</cell></row><row><cell>culture type grouped 2</cell><cell>0.0017</cell><cell>goi F</cell><cell>0.0026</cell></row><row><cell>goi E</cell><cell>0.0008</cell><cell>goi C</cell><cell>0.0023</cell></row><row><cell>goi D</cell><cell>0.0003</cell><cell>goi G</cell><cell>0.0022</cell></row><row><cell>patient category</cell><cell>4.5e-05</cell><cell>culture type grouped 2</cell><cell>0.0010</cell></row><row><cell>goi G</cell><cell>2.0e-05</cell><cell>goi A</cell><cell>6.1e-05</cell></row><row><cell>goi C</cell><cell>2.9e-06</cell><cell>goi D</cell><cell>4.1e-09</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 4 .</head><label>4</label><figDesc>Year 2006 to 2013 prediction for Carbapenems and 4th Gen. Cephalosporins with incremental window. Comparison with and without using date culture feature for prediction of years 2006 to 2013 for both c&amp;car and c&amp;cf4 target features.</figDesc><table><row><cell></cell><cell cols="2">Carbapenems</cell><cell cols="2">4th Gen. Ceph.</cell></row><row><cell>Metric</cell><cell>With</cell><cell>Without</cell><cell>With</cell><cell>Without</cell></row><row><cell>Accuracy</cell><cell>56.96</cell><cell>57.22</cell><cell>57.49</cell><cell>60.88</cell></row><row><cell>Resistant success</cell><cell>56.88</cell><cell>50.92</cell><cell>25.26</cell><cell>40.72</cell></row><row><cell>Sensitive success</cell><cell>57.06</cell><cell>64.97</cell><cell>77.85</cell><cell>73.62</cell></row><row><cell>Average</cell><cell>56.97</cell><cell>57.94</cell><cell>51.55</cell><cell>57.17</cell></row></table></figure>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Prediction contribution of r&amp;* features</head><p>A similar experiment is carried out for the proposed r&amp;* features. Accuracy metrics are calculated with and without them, for test years between 2006 and 2013. The results of this experiment are shown in Table <ref type="table" target="#tab_5">5</ref>. The accuracy remains almost the same whether or not the r&amp;* features are used for prediction, for both antimicrobial families.</p><p>The difference is observed in resistant success and sensitive success. When the r&amp;* features are taken into account, these two metrics are more balanced, which means that the majority class and the minority class get a similar success rate, which is a desirable effect. Moreover, one can note that in both antimicrobial families resistant success increases, which means that cultures from other patients are helping to better discriminate the resistant instances. Sensitive success decreases, which is probably caused by the LR decision boundary, which after moving to better predict resistant instances lowers its performance in recognizing sensitive instances.</p></div>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 5 .</head><label>5</label><figDesc>Year 2006 to 2013 prediction for Carbapenems and 4th Gen. Cephalosporins with incremental window. Comparison with and without using the set of r&amp;* features for prediction of years 2006 to 2013 for both c&amp;car and c&amp;cf4 target features.</figDesc><table><row><cell></cell><cell cols="2">Carbapenems</cell><cell cols="2">4th Gen. Ceph.</cell></row><row><cell>Metric</cell><cell>With</cell><cell>Without</cell><cell>With</cell><cell>Without</cell></row><row><cell>Accuracy</cell><cell>57.22</cell><cell>57.97</cell><cell>60.88</cell><cell>60.68</cell></row><row><cell>Resistant success</cell><cell>50.92</cell><cell>47.25</cell><cell>40.72</cell><cell>30.41</cell></row><row><cell>Sensitive success</cell><cell>64.97</cell><cell>71.19</cell><cell>73.62</cell><cell>79.80</cell></row><row><cell>Average</cell><cell>57.94</cell><cell>59.22</cell><cell>57.17</cell><cell>55.11</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 6 .</head><label>6</label><figDesc>Year 2012 and year 2013 prediction for Carbapenems. In the two left columns, comparison of the year 2012 prediction using a fixed training window with instances from 2004 to 2011 and an incremental training window. In the three columns on the right, comparison of 2013 prediction using a fixed training window with instances from 2004 to 2011, a fixed training window from 2004 to 2012 and an incremental training window. The target feature predicted in all cases is c&amp;car.</figDesc><table><row><cell></cell><cell cols="2">2012</cell><cell cols="3">2013</cell></row><row><cell>Metric</cell><cell>04-11</cell><cell>Inc.</cell><cell>04-11</cell><cell>04-12</cell><cell>Inc.</cell></row><row><cell>Accuracy</cell><cell>59.09</cell><cell>68.18</cell><cell>64.29</cell><cell>71.43</cell><cell>71.43</cell></row><row><cell>Resistant success</cell><cell>57.14</cell><cell>66.67</cell><cell>57.14</cell><cell>85.71</cell><cell>85.71</cell></row><row><cell>Sensitive success</cell><cell>100.0</cell><cell>100.0</cell><cell>71.43</cell><cell>57.14</cell><cell>57.14</cell></row><row><cell>Average</cell><cell>78.57</cell><cell>83.33</cell><cell>64.29</cell><cell>71.43</cell><cell>71.43</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_7"><head>Table 7 .</head><label>7</label><figDesc>Year 2012 and year 2013 prediction for 4th Gen. Cephalosporins. In the two left columns, comparison of the year 2012 prediction using a fixed training window with instances from 2004 to 2011 and an incremental training window. In the three columns on the right, comparison of 2013 prediction using a fixed training window with instances from 2004 to 2011, a fixed training window from 2004 to 2012 and an incremental training window. The target feature predicted in all cases is c&amp;cf4.</figDesc><table><row><cell></cell><cell cols="2">2012</cell><cell cols="3">2013</cell></row><row><cell>Metric</cell><cell>04-11</cell><cell>Inc.</cell><cell>04-11</cell><cell>04-12</cell><cell>Inc.</cell></row><row><cell>Accuracy</cell><cell>59.57</cell><cell>59.57</cell><cell>75.0</cell><cell>75.0</cell><cell>80.0</cell></row><row><cell>Resistant success</cell><cell>15.38</cell><cell>15.38</cell><cell>0.0</cell><cell>0.0</cell><cell>0.0</cell></row><row><cell>Sensitive success</cell><cell>76.47</cell><cell>76.47</cell><cell>78.95</cell><cell>78.95</cell><cell>84.21</cell></row><row><cell>Average</cell><cell>45.93</cell><cell>45.93</cell><cell>39.47</cell><cell>39.47</cell><cell>42.11</cell></row></table></figure>
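<div xmlns="http://www.tei-c.org/ns/1.0"><p>The metrics reported in Tables 4 to 7 can be reproduced from predicted and true labels with a few lines of code. The sketch below assumes that resistant success and sensitive success are the percentages of resistant and sensitive instances that are classified correctly, and that Average is their arithmetic mean. The second assumption is consistent with the tabulated values, but both are interpretations, and the function name and the toy labels are hypothetical.</p>
<p><![CDATA[
```python
def success_metrics(y_true, y_pred, resistant="r", sensitive="s"):
    """Return accuracy, per-class success rates and their mean, in percent."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    def class_success(label):
        # Percentage of instances of class `label` that are predicted as `label`.
        predictions = [p for t, p in zip(y_true, y_pred) if t == label]
        if not predictions:
            return float("nan")  # the class is absent from this test window
        return 100.0 * sum(p == label for p in predictions) / len(predictions)

    accuracy = 100.0 * sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    resistant_success = class_success(resistant)
    sensitive_success = class_success(sensitive)
    return {
        "accuracy": accuracy,
        "resistant_success": resistant_success,
        "sensitive_success": sensitive_success,
        "average": (resistant_success + sensitive_success) / 2.0,
    }


# Toy labels (hypothetical): four resistant and two sensitive instances.
y_true = ["r", "r", "r", "r", "s", "s"]
y_pred = ["r", "r", "r", "s", "s", "r"]
metrics = success_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(name, round(value, 2))
```
]]></p>
<p>As a check of the second assumption, the first column of Table 4 gives (56.88 + 57.06) / 2 = 56.97, which matches the reported Average.</p></div>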
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ACKNOWLEDGEMENTS</head><p>We are thankful to University Hospital of Fuenlabrada, Madrid, Spain for providing the database used in this research.</p><p>This work has been partly supported by the Spanish Thematic Network "Learning Machines for Singular Problems and Applications (MAPAS)" (TIN2017-90567-REDT, MINECO/FEDER EU), by the IDEAI-UPC Consolidated Research Group Grant from Catalan Agency of University and Research Grants (AGAUR, Generalitat de Catalunya) (2017 SGR 574), by the Spanish Ministry of Economy, Industry and Competitiveness under the Research Project Klinilycs (TEC2016-75361-R), by the Science and Innovation Ministry Grants AAVis-BMR (PID2019-107768RA-I00) and BigTheory (PID2019-106623RB-C41), by the Spanish Institute of Health Carlos III (grant DTS 17/00158), by Project Ref. F656 financed by Rey Juan Carlos University, by the Young Researchers R&amp;D Project Ref. 2020-661, financed by Rey Juan Carlos University and Community of Madrid (Spain), and by the Youth Employment Initiative (YEI) R&amp;D Project Ref. TIC-11649 financed by the Community of Madrid (Spain).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Deeparg: a deep learning approach for predicting antibiotic resistance genes from metagenomic data</title>
		<author>
			<persName><forename type="first">Gustavo</forename><surname>Arango-Argoty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Emily</forename><surname>Garner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Amy</forename><surname>Pruden</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lenwood</forename><forename type="middle">S</forename><surname>Heath</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Peter</forename><surname>Vikesland</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Liqing</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Microbiome</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="15" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<author>
			<persName><surname>Balakrishnan</surname></persName>
		</author>
		<title level="m">Exponential distribution: theory, methods and applications</title>
				<imprint>
			<publisher>Routledge</publisher>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The rising problem of antimicrobial resistance in the intensive care unit</title>
		<author>
			<persName><forename type="first">Nele</forename><surname>Brusselaers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dirk</forename><surname>Vogelaers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Stijn</forename><surname>Blot</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Annals of intensive care</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">47</biblScope>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Lectura interpretada del antibiograma: una necesidad clínica</title>
		<author>
			<persName><forename type="first">Rafael</forename><surname>Cantón</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Enfermedades Infecciosas y microbiología clínica</title>
		<imprint>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="375" to="385" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Elements of information theory</title>
		<author>
			<persName><forename type="first">Thomas</forename><forename type="middle">M</forename><surname>Cover</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joy</forename><forename type="middle">A</forename><surname>Thomas</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Risk factors for multidrug-resistant pseudomonas aeruginosa nosocomial infection</title>
		<author>
			<persName><forename type="first">P</forename><surname>Defez</surname></persName>
		</author>
		<author>
			<persName><surname>Fabbro-Peray</surname></persName>
		</author>
		<author>
			<persName><surname>Bouziges</surname></persName>
		</author>
		<author>
			<persName><surname>Gouby</surname></persName>
		</author>
		<author>
			<persName><surname>Mahamat</surname></persName>
		</author>
		<author>
			<persName><surname>Daures</surname></persName>
		</author>
		<author>
			<persName><surname>Sotto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Hospital Infection</title>
		<imprint>
			<biblScope unit="volume">57</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="209" to="216" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Origin of bacterial resistance to antibiotics</title>
		<author>
			<persName><forename type="first">Milislav</forename><surname>Demerec</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of bacteriology</title>
		<imprint>
			<biblScope unit="volume">56</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">63</biblScope>
			<date type="published" when="1948">1948</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The role of whole genome sequencing in antimicrobial susceptibility testing of bacteria: report from the eucast subcommittee</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Ellington</surname></persName>
		</author>
		<author>
			<persName><surname>Ekelund</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Frank</forename><forename type="middle">Møller</forename><surname>Aarestrup</surname></persName>
		</author>
		<author>
			<persName><surname>Canton</surname></persName>
		</author>
		<author>
			<persName><surname>Doumith</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christian</forename><surname>Giske</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hajo</forename><surname>Grundman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Henrik</forename><surname>Hasman</surname></persName>
		</author>
		<author>
			<persName><surname>Holden</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Katie</forename><forename type="middle">L</forename><surname>Hopkins</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Clinical microbiology and infection</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="2" to="22" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Antibiotic susceptibility among aerobic gram-negative bacilli in intensive care units in 5 european countries</title>
		<author>
			<persName><forename type="first">Håkan</forename><surname>Hanberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">José-Angel</forename><surname>Garcia-Rodriguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Miguel</forename><surname>Gobernado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Herman</forename><surname>Goossens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lennart</forename><forename type="middle">E</forename><surname>Nilsson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marc</forename><forename type="middle">J</forename><surname>Struelens</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Jama</title>
		<imprint>
			<biblScope unit="volume">281</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="67" to="71" />
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Hospital antibiogram: a necessity</title>
		<author>
			<persName><surname>Joshi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Indian journal of medical microbiology</title>
		<imprint>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page">277</biblScope>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Clinical and economic burden of antimicrobial resistance</title>
		<author>
			<persName><forename type="first">Lisa</forename><forename type="middle">L</forename><surname>Maragakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Eli</forename><forename type="middle">N</forename><surname>Perencevich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sara</forename><forename type="middle">E</forename><surname>Cosgrove</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Expert review of anti-infective therapy</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="751" to="763" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Machine learning techniques to identify antimicrobial resistance in the intensive care unit</title>
		<author>
			<persName><forename type="first">Sergio</forename><surname>Martínez-Agüero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Inmaculada</forename><surname>Mora-Jiménez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jon</forename><surname>Lérida-García</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joaquín</forename><surname>Álvarez-Rodríguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Cristina</forename><surname>Soguero-Ruiz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Entropy</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page">603</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Pseudomonas aeruginosa bloodstream infection: importance of appropriate initial antimicrobial treatment</title>
		<author>
			<persName><forename type="first">Scott</forename><forename type="middle">T</forename><surname>Micek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ann</forename><forename type="middle">E</forename><surname>Lloyd</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">J</forename><surname>Ritchie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Richard</forename><forename type="middle">M</forename><surname>Reichley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Victoria</forename><forename type="middle">J</forename><surname>Fraser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marin</forename><forename type="middle">H</forename><surname>Kollef</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Antimicrobial agents and chemotherapy</title>
		<imprint>
			<biblScope unit="volume">49</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="1306" to="1311" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella</title>
		<author>
			<persName><forename type="first">Marcus</forename><surname>Nguyen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wesley</forename><surname>Long</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Patrick</forename><forename type="middle">F</forename><surname>McDermott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Randall</forename><forename type="middle">J</forename><surname>Olsen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Robert</forename><surname>Olson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rick</forename><forename type="middle">L</forename><surname>Stevens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Gregory</forename><forename type="middle">H</forename><surname>Tyson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shaohua</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">James</forename><forename type="middle">J</forename><surname>Davis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of clinical microbiology</title>
		<imprint>
			<biblScope unit="volume">57</biblScope>
			<biblScope unit="issue">2</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Evaluation of machine learning and rules-based approaches for predicting antimicrobial resistance profiles in gram-negative bacilli from whole genome sequence data</title>
		<author>
			<persName><forename type="first">Mitchell</forename><forename type="middle">W</forename><surname>Pesesky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tahir</forename><surname>Hussain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Meghan</forename><surname>Wallace</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sanket</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Saadia</forename><surname>Andleeb</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Carey-Ann</forename><forename type="middle">D</forename><surname>Burnham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Gautam</forename><surname>Dantas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Frontiers in microbiology</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page">1887</biblScope>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Prediction of healthcare associated infections in an intensive care unit using machine learning and big data tools</title>
		<author>
			<persName><forename type="first">Paz</forename><surname>Revuelta-Zamorano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alberto</forename><surname>Sánchez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">José</forename><forename type="middle">Luis</forename><surname>Rojo-Álvarez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joaquín</forename><surname>Álvarez-Rodríguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Javier</forename><surname>Ramos-López</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Cristina</forename><surname>Soguero-Ruiz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">XIV Mediterranean Conference on Medical and Biological Engineering and Computing</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="840" to="845" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Antibiotic and biocide resistance in bacteria: introduction</title>
		<author>
			<persName><surname>Russell</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of applied microbiology</title>
		<imprint>
			<biblScope unit="volume">92</biblScope>
			<biblScope unit="page" from="1S" to="3S" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">De-escalation as a potential way of reducing antibiotic use and antimicrobial resistance in ICU</title>
		<author>
			<persName><forename type="first">Jean-Francois</forename><surname>Timsit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Stephan</forename><surname>Harbarth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jean</forename><surname>Carlet</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<title level="m" type="main">Predicting future antibiotic susceptibility using regression-based methods on longitudinal Massachusetts antibiogram data</title>
		<author>
			<persName><forename type="first">ML</forename><surname>Tlachac</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Elke</forename><forename type="middle">A</forename><surname>Rundensteiner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kerri</forename><surname>Barton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Scott</forename><surname>Troppy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kirthana</forename><surname>Beaulac</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shira</forename><surname>Doron</surname></persName>
		</author>
		<editor>HEALTHINF</editor>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="103" to="114" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Handling local concept drift with dynamic integration of classifiers: Domain of antibiotic resistance in nosocomial infections</title>
		<author>
			<persName><forename type="first">Alexey</forename><surname>Tsymbal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mykola</forename><surname>Pechenizkiy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Padraig</forename><surname>Cunningham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Seppo</forename><surname>Puuronen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">19th IEEE Symposium on Computer-Based Medical Systems (CBMS&apos;06)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="679" to="684" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Antimicrobial-resistant pathogens associated with healthcare-associated infections: summary of data reported to the National Healthcare Safety Network at the Centers for Disease Control and Prevention, 2011-2014</title>
		<author>
			<persName><forename type="first">Lindsey</forename><forename type="middle">M</forename><surname>Weiner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Amy</forename><forename type="middle">K</forename><surname>Webb</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Brandi</forename><surname>Limbago</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Margaret</forename><forename type="middle">A</forename><surname>Dudeck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jean</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alexander</forename><forename type="middle">J</forename><surname>Kallen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jonathan</forename><forename type="middle">R</forename><surname>Edwards</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dawn</forename><forename type="middle">M</forename><surname>Sievert</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Infection control &amp; hospital epidemiology</title>
		<imprint>
			<biblScope unit="volume">37</biblScope>
			<biblScope unit="issue">11</biblScope>
			<biblScope unit="page" from="1288" to="1301" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Efficient missing data imputation for supervised learning</title>
		<author>
			<persName><forename type="first">Shichao</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xindong</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Manlong</forename><surname>Zhu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">9th IEEE International Conference on Cognitive Informatics (ICCI&apos;10)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="672" to="679" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
