<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Deep Neural Networks and Decision Tree classifier for Visual Question Answering in the medical domain</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Imane</forename><surname>Allaouzi</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Sciences and Techniques</orgName>
								<orgName type="institution">Abdelmalek Essaâdi University</orgName>
								<address>
									<settlement>Tangier</settlement>
									<country key="MA">Morocco</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Badr</forename><surname>Benamrou</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Sciences and Techniques</orgName>
								<orgName type="institution">Abdelmalek Essaâdi University</orgName>
								<address>
									<settlement>Tangier</settlement>
									<country key="MA">Morocco</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Mohamed</forename><surname>Benamrou</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Sciences and Techniques</orgName>
								<orgName type="institution">Abdelmalek Essaâdi University</orgName>
								<address>
									<settlement>Tangier</settlement>
									<country key="MA">Morocco</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Mohamed</forename><forename type="middle">Ben</forename><surname>Ahmed</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Sciences and Techniques</orgName>
								<orgName type="institution">Abdelmalek Essaâdi University</orgName>
								<address>
									<settlement>Tangier</settlement>
									<country key="MA">Morocco</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Deep Neural Networks and Decision Tree classifier for Visual Question Answering in the medical domain</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">6C59E078E898A529502832DCF56CAC00</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T02:33+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>CNN</term>
					<term>Bidirectional LSTM</term>
					<term>Decision Tree classifier</term>
					<term>Language modeling</term>
					<term>medical imaging</term>
					<term>Visual Question Answering</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper presents our contribution to the problem of visual question answering in the medical domain using a combination of deep neural networks and the decision tree classifier. In our proposed approach, we treat the task of visual question answering as a multi-label classification problem, where each label corresponds to a unique word in the answer dictionary that was built from the training set.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Visual question answering (VQA) is a new and challenging task that has attracted a surge of interest from the Artificial Intelligence (AI) community, since it combines the fields of Computer Vision (CV) and Natural Language Processing (NLP). NLP and CV are two branches of AI: the former enables computers to understand and analyze human language, while the latter enables computers to understand and process images in the same way that a human does. The main idea of VQA systems is to predict the right answer given both an image and a question about this image in natural language. The VQA task can be treated as a classification problem if the answer is chosen from among different choices, or as a generation problem if the answer is a comprehensive and well-formed textual description.</p><p>In the last few years, Deep Neural Networks have achieved state-of-the-art results in a wide range of NLP and CV applications, including image recognition <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>, machine translation <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b3">4]</ref>, image captioning <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6]</ref> and Visual Question Answering <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9]</ref>. Following this trend, this paper presents our contribution to the problem of visual question answering in the medical domain <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11]</ref> using a combination of deep neural networks (Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory) and the decision tree classifier. In our proposed approach, we treat the task of VQA as a multi-label classification problem, where each label corresponds to a unique word in the answer dictionary that was built from the training set.
The paper is organized as follows: the dataset is described in Section 2, the proposed model is described in Section 3, results are presented and discussed in Section 4, and finally Section 5 draws some conclusions and outlines future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Dataset</head><p>VQA-Med <ref type="bibr" target="#b9">[10]</ref> is a dataset generated using images from PubMed Central articles (essentially a subset of the ImageCLEF 2017 caption prediction task <ref type="bibr" target="#b11">[12]</ref>). Table 1 summarizes the distribution of the dataset.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><table><row><cell>Images</cell><cell>Questions</cell><cell>Answers</cell></row><row><cell></cell><cell>What does the CT scan show?</cell><cell>A large filling defect in the left atrium.</cell></row><row><cell></cell><cell>Where does CT coronal section of the skull show well-defined unilocular lesion?</cell><cell>In the right maxillary sinus.</cell></row><row><cell></cell><cell>Who does CT abdomen show?</cell><cell>Right adrenal pheochromacytoma.</cell></row><row><cell></cell><cell>Is there any intra-cardiac mass identified?</cell><cell>No.</cell></row><row><cell></cell><cell>What shows the limits between the stomach and mass?</cell><cell>MRI.</cell></row></table></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">The Proposed Model</head><p>VQA in the medical domain takes a medical image-question pair as input and produces an answer. In this work we assume that an answer is a concatenation of one or more words; therefore, we have treated the task as a multi-label classification problem.</p><p>Our proposed model uses the pre-trained VGG-16 <ref type="bibr" target="#b12">[13]</ref> model to extract image features, and word embedding <ref type="bibr" target="#b13">[14]</ref> along with a Bidirectional Long Short-Term Memory (BDLSTM) network <ref type="bibr" target="#b14">[15]</ref> to embed the question and extract textual features. The image features and the textual features are each passed through a fully connected layer of 512 neurons, and the two outputs are concatenated to get a fixed-length feature vector. This vector is used as a new input for the Decision Tree classifier in order to predict an answer.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Image Representation:</head><p>The model consists of three sub-models: image representation, question representation, and answer prediction.</p><p>To extract prominent features from medical images, we have used the pre-trained VGG-16 network, which was among the top-performing entries in the ImageNet 2014 challenge <ref type="bibr">[16]</ref>, achieving a 7.4% error rate on object classification. We have removed the last layer of this network to obtain an output vector of 4096 elements, which is in turn passed through a fully connected layer to get an image representation of size 512. The VGG-16 architecture is shown in Figure <ref type="figure" target="#fig_0">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Question Representation:</head><p>Recurrent neural networks (RNNs) have recently shown great success in diverse NLP tasks <ref type="bibr" target="#b15">[18,</ref><ref type="bibr" target="#b16">19]</ref>. Motivated by this success, we have used a bidirectional RNN with LSTM to process the medical questions. Bidirectional Long Short-Term Memory (BDLSTM) is an extension of the traditional LSTM; its main idea is to process sequence data in both forward and backward directions to avoid the problem of limited context that applies to any feed-forward model.</p><p>First, the question is converted to a matrix of one-hot vectors and passed through an embedding layer (with a vocabulary of 3312 and a dense embedding of 521) to obtain dense word representations that capture the relative meanings of the words. The embedded question is then fed to a BDLSTM with 512 units, followed by a fully connected layer, to get a question representation of size 512.</p></div>
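<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a minimal sketch of the one-hot and embedding step described above, the following Python snippet shows that multiplying a one-hot row by an embedding matrix simply selects the corresponding row of that matrix. The toy vocabulary, the 3-dimensional embedding values, and the helper names (one_hot, embed) are illustrative assumptions, not the values or code used in this work.</p>

```python
# Toy illustration: a one-hot vector times an embedding matrix is a row lookup.
# The vocabulary and embedding values below are made up for the example.
vocab = {"what": 0, "does": 1, "ct": 2, "show": 3}

# One row per vocabulary word, three (hypothetical) embedding dimensions.
embedding = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
]


def one_hot(index, size):
    """Return a list of zeros with a single 1 at position index."""
    return [1 if i == index else 0 for i in range(size)]


def embed(question):
    """Map each known word of a question to its dense vector."""
    vectors = []
    for word in question.lower().split():
        row = one_hot(vocab[word], len(vocab))
        # Matrix product of the one-hot row with the embedding matrix.
        dense = [
            sum(row[i] * embedding[i][d] for i in range(len(vocab)))
            for d in range(len(embedding[0]))
        ]
        vectors.append(dense)
    return vectors


question_matrix = embed("what does ct show")
```

<p>In practice an embedding layer performs this lookup directly by index, and the resulting sequence of dense vectors is what a recurrent layer such as the BDLSTM would consume.</p></div>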
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Answer prediction:</head><p>To predict an answer, we have modeled the VQA-Med task as a multi-label classification problem, since we have assumed that an answer is a concatenation of one or more words. Therefore, we have used the multi-label Decision Tree classifier, which takes as input the output from both the image representation and question representation sub-models and predicts one or more predefined labels. The total number of labels is 3109, where each label corresponds to a unique word in the answer dictionary that was created from the training set.</p><p>In the training phase, we have kept the CNN parameters frozen, and we have trained the rest of our deep neural network using a fully connected layer with sigmoid as the activation function, binary cross-entropy as the loss function and Adam as the optimizer. In addition, dropout with a probability of 0.5 was applied before the last fully connected layer and after the BDLSTM layer.</p><p>The best parameters were selected based on the validation loss, with a mini-batch size of 20 and up to 10 epochs.</p></div>
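<div xmlns="http://www.tei-c.org/ns/1.0"><p>The multi-label formulation can be illustrated with a small sketch: an answer dictionary is built from the training answers, each answer becomes a binary vector with one position per dictionary word, and a predicted vector is decoded back into words. The toy answers and the helper names (build_dictionary, encode, decode) are assumptions made for illustration only; the dictionary in this work has 3109 labels, and the classifier itself is not reproduced here.</p>

```python
# Toy sketch of treating answers as multi-label targets.
# The training answers below are invented examples.
train_answers = [
    "right maxillary sinus",
    "left atrium",
    "no",
]


def build_dictionary(answers):
    """Assign an index to every unique word, in order of first appearance."""
    dictionary = {}
    for answer in answers:
        for word in answer.lower().split():
            if word not in dictionary:
                dictionary[word] = len(dictionary)
    return dictionary


def encode(answer, dictionary):
    """Binary vector with a 1 for each dictionary word present in the answer."""
    vector = [0] * len(dictionary)
    for word in answer.lower().split():
        if word in dictionary:
            vector[dictionary[word]] = 1
    return vector


def decode(vector, dictionary):
    """Join the words whose label is 1, in dictionary order."""
    words = sorted(dictionary, key=dictionary.get)
    return " ".join(w for w, flag in zip(words, vector) if flag == 1)


answer_dict = build_dictionary(train_answers)
target = encode("left maxillary sinus", answer_dict)
decoded = decode(target, answer_dict)
```

<p>Here the answer "left maxillary sinus" decodes to "maxillary sinus left": a set of word labels carries no information about word order, so decoding can only follow the dictionary order.</p></div>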
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Results</head><p>Three metrics are used to evaluate our proposed VQA-Med model: the BLEU score <ref type="bibr" target="#b17">[20]</ref>, WBSS (Word-based Semantic Similarity), and CBSS (Concept-based Semantic Similarity). BLEU is one of the most commonly used metrics for measuring the similarity between two sentences. WBSS measures semantic similarity in the biomedical domain <ref type="bibr" target="#b18">[21]</ref>; it is based on Wu-Palmer Similarity (WUPS) <ref type="bibr" target="#b19">[22]</ref> with the WordNet ontology in the backend. CBSS is similar to WBSS, except that instead of tokenizing the predicted and ground truth answers into words, it uses MetaMap via the pymetamap wrapper to extract biomedical concepts from the answers. Before the evaluation metrics are applied, each answer undergoes the following preprocessing steps:</p><p>Lower-case: Converts each answer to lower-case.</p><p>Tokenization: Divides the answer into individual words.</p><p>Stop-words: Removes punctuation and commonly encountered English words.</p><p>Table 3 shows the results obtained on the test set.</p></div>
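<div xmlns="http://www.tei-c.org/ns/1.0"><p>The three preprocessing steps applied to each answer before scoring can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the stop-word list is a small hypothetical one, and the function name preprocess is illustrative; the exact list and tokenizer used for the official evaluation are not specified here.</p>

```python
import string

# A small, hypothetical stop-word list chosen only for this example.
STOP_WORDS = {"a", "an", "the", "in", "of", "is", "and"}


def preprocess(answer):
    """Lower-case, tokenize, and drop punctuation and stop-words."""
    lowered = answer.lower()
    # Replace punctuation with spaces so that tokens split cleanly.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    tokens = lowered.translate(table).split()
    return [t for t in tokens if t not in STOP_WORDS]


tokens = preprocess("In the right maxillary sinus.")
```

<p>With this toy stop-word list, the answer "In the right maxillary sinus." is reduced to the tokens right, maxillary and sinus.</p></div>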
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion</head><p>In this paper, we present our contribution to the task of visual question answering in the medical domain. We have treated the task as a multi-label classification problem using the decision tree classifier. However, the results on the test set are unsatisfactory, especially in terms of the BLEU metric, with a score of 0.054. Therefore, we plan to develop an LSTM model to generate answers, since the adopted classification approach ignores word order in the answer, which leads to a loss of information. We also plan to improve our visual model by using an attention mechanism, which allows the model to pay more attention to the specific image regions that are most relevant to the question instead of the whole image.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. The VGG-16 model architecture [17].</figDesc><graphic coords="4,136.10,195.35,336.10,194.05" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>The VQA-Med dataset distribution. The dataset consists of 2278 training images and 324 validation images, accompanied by 5413 and 500 question-answer pairs respectively, and a test set of 264 medical images with 500 questions. An answer can be a single word, a phrase of around 2-28 words, or a yes/no. Table 2 illustrates some examples of the training data with different types of questions and answers.</figDesc><table><row><cell></cell><cell>Images</cell><cell>Questions</cell><cell>Answers</cell></row><row><cell>Train</cell><cell>2278</cell><cell>5413</cell><cell>5413</cell></row><row><cell>Validation</cell><cell>324</cell><cell>500</cell><cell>500</cell></row><row><cell>Test</cell><cell>264</cell><cell>500</cell><cell>-</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 .</head><label>2</label><figDesc>Some examples of the training data.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3 .</head><label>3</label><figDesc>Results of our proposed model on the test set. As the table shows, our proposed model gives better results in terms of the CBSS metric (0.27) than in terms of the BLEU score (0.054) and the WBSS metric (0.10). This can be explained by the high number of labels that are not represented equally in the training set, which is known as the label imbalance problem.</figDesc><table><row><cell></cell><cell>Evaluation metrics</cell><cell></cell></row><row><cell>BLEU</cell><cell>WBSS</cell><cell>CBSS</cell></row><row><cell>0.053867018</cell><cell>0.100854295</cell><cell>0.269119831</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Very deep convolutional networks for large-scale image recognition</title>
		<author>
			<persName><forename type="first">K</forename><surname>Simonyan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zisserman</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1409.1556</idno>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page">17</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">ImageNet classification with deep convolutional neural networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Krizhevsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">E</forename><surname>Hinton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">NIPS</title>
				<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation</title>
		<author>
			<persName><forename type="first">Kyunghyun</forename><surname>Cho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Van Merrienboer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gulcehre</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</title>
				<meeting>the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)<address><addrLine>Doha</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1724" to="1734" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Sequence to Sequence Learning with Neural Networks</title>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Vinyals</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Le</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">the 27th International Conference on Neural Information Processing Systems</title>
				<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="3104" to="3112" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Show and tell: A neural image caption generator</title>
		<author>
			<persName><forename type="first">O</forename><surname>Vinyals</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Toshev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bengio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Erhan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CVPR</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Show, attend and tell: Neural image caption generation with visual attention</title>
		<author>
			<persName><forename type="first">K</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kiros</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Cho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Courville</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Salakhutdinov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Zemel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2015">2015</date>
			<publisher>ICML</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Stacked attention networks for image question answering</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Smola</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CVPR</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Simple baseline for visual question answering</title>
		<author>
			<persName><forename type="first">B</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sukhbaatar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Szlam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Fergus</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1512.02167</idno>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Ask your neurons: A deep learning approach to visual question answering</title>
		<author>
			<persName><forename type="first">M</forename><surname>Malinowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Rohrbach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Fritz</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1605.02697</idno>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Overview of ImageCLEF 2018 Medical Domain Visual Question Answering Task</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Hasan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Farri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lungren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Müller</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">CLEF working notes</title>
		<imprint>
			<date type="published" when="2018">2018</date>
			<publisher>CEUR</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Overview of ImageCLEF 2018: Challenges, Datasets and Evaluation</title>
		<author>
			<persName><forename type="first">B</forename><surname>Ionescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Müller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Villegas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>García Seco De Herrera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Eickhoff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Andrearczyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Dicente Cid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Liauchuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kovalev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Hasan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Farri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lungren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dang-Nguyen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Piras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Riegler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gurrin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Ninth International Conference of the CLEF Association</title>
				<meeting>the Ninth International Conference of the CLEF Association<address><addrLine>CLEF</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Overview of ImageCLEFcaption 2017 - the image caption prediction and concept extraction tasks to understand biomedical images</title>
		<author>
			<persName><forename type="first">C</forename><surname>Eickhoff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Schwall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>García Seco De Herrera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Müller</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note>CLEF working notes</note>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Very Deep Convolutional Networks for Large-Scale Image Recognition</title>
		<author>
			<persName><forename type="first">K</forename><surname>Simonyan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zisserman</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1409.1556</idno>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Distributed Representations of Words and Phrases and their Compositionality</title>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">S</forename><surname>Corrado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dean</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NIPS</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page">17</biblScope>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Bidirectional recurrent neural networks</title>
		<author>
			<persName><forename type="first">M</forename><surname>Schuster</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">K</forename><surname>Paliwal</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Signal Processing</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="2673" to="2681" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Speech recognition with deep recurrent neural networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Graves</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mohamed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Hinton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing</title>
				<meeting>the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="6645" to="6649" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">RNNLM-recurrent neural network language modeling toolkit</title>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kombrink</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Deoras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Burget</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Cernocky</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2011 ASRU Workshop</title>
				<meeting>the 2011 ASRU Workshop</meeting>
		<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="196" to="201" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">BLEU: a method for automatic evaluation of machine translation</title>
		<author>
			<persName><forename type="first">K</forename><surname>Papineni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Roukos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ward</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">J</forename><surname>Zhu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">40th Annual meeting of the Association for Computational Linguistics</title>
				<imprint>
			<publisher>Pennsylvania</publisher>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="311" to="318" />
		</imprint>
	</monogr>
	<note>ACL-2002</note>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">BIOSSES: a semantic sentence similarity estimation system for the biomedical domain</title>
		<author>
			<persName><forename type="first">G</forename><surname>Soğancıoğlu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Öztürk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Özgür</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Bioinformatics</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="issue">14</biblScope>
			<biblScope unit="page" from="49" to="58" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Verb semantics and lexical selection</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Palmer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 32nd annual meeting on Association for Computational Linguistics</title>
				<meeting>the 32nd annual meeting on Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="1994">1994</date>
			<biblScope unit="page" from="133" to="138" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
