<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Deep Learning Approach to Recognize Genome Functional Elements Using Diverse Genomic Data</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Nazar</forename><surname>Beknazarov</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Computer Science</orgName>
								<orgName type="laboratory">Laboratory of Bioinformatics</orgName>
								<orgName type="institution">National Research University Higher School of Economics</orgName>
								<address>
									<addrLine>11 Pokrovsky Boulevard</addrLine>
									<postCode>101000</postCode>
									<settlement>Moscow</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Seungmin</forename><surname>Jin</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Computer Science</orgName>
								<orgName type="laboratory">Laboratory of Bioinformatics</orgName>
								<orgName type="institution">National Research University Higher School of Economics</orgName>
								<address>
									<addrLine>11 Pokrovsky Boulevard</addrLine>
									<postCode>101000</postCode>
									<settlement>Moscow</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Maria</forename><surname>Poptsova</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Computer Science</orgName>
								<orgName type="laboratory">Laboratory of Bioinformatics</orgName>
								<orgName type="institution">National Research University Higher School of Economics</orgName>
								<address>
									<addrLine>11 Pokrovsky Boulevard</addrLine>
									<postCode>101000</postCode>
									<settlement>Moscow</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Deep Learning Approach to Recognize Genome Functional Elements Using Diverse Genomic Data</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">4E8AF824CB6FEFD0A1B81E1ECBA130B3</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T00:44+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>DNA secondary structures</term>
					<term>histone code</term>
					<term>histone marks</term>
					<term>epigenetics</term>
					<term>machine learning</term>
					<term>deep learning</term>
					<term>convolutional neural networks</term>
					<term>recurrent neural networks</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The revolution in genome sequencing has generated large amounts of -omics data. Once a primary genomic sequence is obtained, the next major task is to study the genomic regulatory code. Epigenetic data sets provide hints about how regulatory patterns are distributed in different tissues. Another layer of the genome regulatory code comprises DNA secondary structures, which can act as regulators of various genomic processes. Given the Big Data produced by next-generation sequencing experiments, machine learning approaches were chosen to solve the task of recognizing genomic functional elements. Earlier attempts to annotate genomes with different classes of functional elements, e.g. nucleosomal DNA, exon-intron boundaries, and enhancers, used machine learning algorithms that required manual collection of the features needed to characterize genomic regions. More recently, deep learning approaches, including convolutional neural networks and recurrent neural networks, have become successful in recognizing genomic functional elements based on sequence information only and/or with additional information on epigenetics and known regulatory elements. Here we discuss a deep learning approach and provide an example of building a deep learning model for the task of recognizing DNA secondary structures.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Deep learning is becoming popular and easy to apply to a wide range of tasks. The CNN (Convolutional Neural Network) and the RNN (Recurrent Neural Network) are the most popular deep learning architectures and can show state-of-the-art performance in the majority of applications <ref type="bibr" target="#b0">[1]</ref>. This is achieved by combining top performance in the spatial and temporal dimensions. A CNN can capture hierarchical information in space. Its mechanism is essentially to explore the input one region at a time and map each region to a specific feature space. By generating a series of convolutions at each region, the network can learn spatial features hierarchically <ref type="bibr" target="#b1">[2]</ref>. For instance, in face recognition a CNN starts by detecting lines or circles in face images, then combines these features into feature maps of the nose, eyes, and ears, and finally recognizes the face <ref type="bibr" target="#b2">[3]</ref>.</p><p>An RNN can learn temporal order using its context and, being Turing-complete, can theoretically learn any kind of function <ref type="bibr" target="#b3">[4]</ref>. Essentially, an RNN keeps passing along a context vector, which compresses the information at a given time step, to predict the outcome at future time steps. This means that an RNN can handle input of arbitrary length <ref type="bibr" target="#b4">[5]</ref>. This feature makes RNNs useful in many sequential tasks, such as machine translation, time series prediction, speech recognition, and signal processing. In practice, however, an RNN does not work well alone, especially for feature extraction and long-term prediction tasks <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5]</ref>. 
This is why combining CNN and RNN is a common practice that shows the best results in deep learning tasks <ref type="bibr" target="#b5">[6]</ref><ref type="bibr" target="#b6">[7]</ref><ref type="bibr" target="#b7">[8]</ref>.</p><p>In bioinformatics, research in deep learning has been increasing rapidly since the early 2000s, and CNNs and RNNs are widely applied to various tasks <ref type="bibr" target="#b5">[6]</ref>. For example, CNNs have been applied to predicting gene expression from epigenomic data, anomaly classification in biomedical imaging, and brain decoding in biomedical signal processing <ref type="bibr" target="#b5">[6]</ref>. RNNs have also been applied to protein structure classification and to anomaly classification in biomedical signal processing. Although combining the two models shows good performance in practice, there is a tendency to use them separately in bioinformatics tasks <ref type="bibr" target="#b5">[6]</ref>. One of the pioneering examples of a hybrid CNN and RNN model for predicting the function of a DNA sequence was implemented and tested in DanQ <ref type="bibr" target="#b6">[7]</ref>. Another hybrid CNN-RNN model was applied to the task of predicting enhancers based on histone modification marks <ref type="bibr" target="#b7">[8]</ref>. In this research, we continue testing the deep learning approach that combines the two models to recognize genome functional elements using diverse genomic data.</p><p>As a genomic functional element we chose Z-DNA, which belongs to the DNA secondary structures. The role of DNA secondary structures in the regulation of genomic processes has been confirmed experimentally for quadruplexes, cruciform structures, triplexes, and Z-DNA. Experiments on whole-genome detection of Z-DNA regions are under development, and several experimental datasets are currently available <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b9">10]</ref>. 
Building and testing machine learning models that aggregate information from experimental data is an urgent task, since there is a need for computational methods of genome annotation with functional elements. Here we tested several machine learning approaches, including deep learning, to detect Z-DNA regions. We showed that deep learning, and specifically hybrid CNN plus RNN models, achieved the best performance in the task of Z-DNA recognition.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Material and Methods</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Data on Z-DNA, epigenetics, RNA polymerase, and transcription factor binding sites</head><p>The positions of Z-DNA are taken from the dataset of a ChIP-seq experiment that identified binding sites of the Zaa protein, which binds to the left-handed form of DNA <ref type="bibr" target="#b9">[10]</ref>. To improve the quality of sequence-based prediction, we added information on the epigenetic and regulatory code. Histone mark positions and DNase hypersensitivity sites, which mark regions of open chromatin, are taken from the international consortium project Roadmap Epigenomics <ref type="bibr" target="#b10">[11]</ref>. Information on the binding sites of RNA polymerase and transcription factors is taken from the Encyclopedia of DNA Elements (ENCODE) project <ref type="bibr" target="#b11">[12]</ref>. In total, 1065 features are selected.</p><p>A DNA subsequence annotated with Z-DNA regions is treated as the output vector. A binary value is assigned to every nucleotide depending on whether it lies inside a Z-DNA region. We considered subsequences of 5000 bp; thus, every output vector has a length of 5000.</p></div>
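<div xmlns="http://www.tei-c.org/ns/1.0"><p>The per-nucleotide output vector described above can be illustrated with a minimal sketch. The function name, the half-open (start, end) interval format, and the toy coordinates are assumptions made for illustration only; the coordinate convention of the source ChIP-seq dataset may differ.</p>
```python
def zdna_labels(window_start, window_len, zdna_intervals):
    """Return one 0/1 label per nucleotide of a window.

    zdna_intervals is a list of half-open (start, end) genomic coordinates.
    This format is hypothetical; the source dataset may use another convention.
    """
    labels = [0] * window_len
    window_end = window_start + window_len
    for start, end in zdna_intervals:
        # Clip each interval to the window, then mark the covered positions.
        lo = max(start, window_start)
        hi = min(end, window_end)
        for pos in range(lo, hi):
            labels[pos - window_start] = 1
    return labels


# Toy example: one interval overlaps the left border, one lies inside,
# and one lies completely outside the 5000 bp window.
labels = zdna_labels(10000, 5000, [(9990, 10012), (12000, 12050), (20000, 20100)])
print(len(labels), sum(labels))
```
<p>In this toy case 12 + 50 = 62 positions are labeled 1, and the interval outside the window contributes nothing.</p></div>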
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Construction of train and test datasets</head><p>We encoded the human DNA sequence using one-hot encoding, in which a sequence is transformed into a binary matrix of size 4 × L, where L is the length of the sequence and the 4 rows correspond to the 4 nucleotides, TCAG. This matrix is filled with zeros and has a single one in each position, in the row of the corresponding nucleotide. Epigenomic data and the binding sites of RNA polymerase and transcription factors were added to the encoded DNA sequence. Finally, we created a set of matrices, one for every chromosome, each with the same length as the DNA sequence of that chromosome. The shape of the input matrix is 1069 × L, where 1065 rows come from the additional features and 4 from the one-hot encoded DNA, and L is the length of the sequence. In order to avoid any dependencies between Z-DNA sites and the borders of DNA subsequences, the DNA is uniformly divided into subsequences of length 5000. We then split the subsequences into train and test sets in a ratio of 4 to 1, preserving the proportion of subsequences with Z-DNA in each set.</p></div>
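<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of the encoding described above, the following sketch builds the one-hot matrix, stacks additional feature rows under it, and cuts the result into fixed-length windows. It uses plain Python lists and a toy sequence; the function names, the all-zero column for unknown symbols such as N, and the dropping of an incomplete final window are assumptions, not details taken from the paper.</p>
```python
NUCLEOTIDES = "TCAG"  # row order as given in the text


def one_hot(sequence):
    """Encode a DNA string as a 4 x L matrix (a list of 4 rows of 0/1)."""
    matrix = [[0] * len(sequence) for _ in NUCLEOTIDES]
    for col, base in enumerate(sequence.upper()):
        row = NUCLEOTIDES.find(base)
        if row >= 0:  # unknown symbols such as N stay all-zero (an assumption)
            matrix[row][col] = 1
    return matrix


def add_features(dna_matrix, feature_rows):
    """Stack extra per-position feature rows under the one-hot rows."""
    length = len(dna_matrix[0])
    if any(len(row) != length for row in feature_rows):
        raise ValueError("every feature row must have length L")
    return dna_matrix + feature_rows


def split_windows(matrix, window):
    """Cut a (rows x L) matrix into consecutive windows; drop an incomplete tail."""
    length = len(matrix[0])
    return [[row[i:i + window] for row in matrix]
            for i in range(0, length - window + 1, window)]


dna = one_hot("TCAGGN")
full = add_features(dna, [[0, 1, 1, 0, 0, 0]])  # one toy epigenetic track
windows = split_windows(full, 3)
print(len(full), len(windows))
```
<p>With the 1065 feature tracks and the window length of 5000 reported in the text, the same steps would yield windows of shape 1069 × 5000.</p></div>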
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Machine learning models</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.1.">Baseline model</head><p>In order to show the level of performance of deep learning models, we prepared a boosting classifier as a baseline. The term 'boosting' here means that it converts weak learners to strong learners. Basically, boosting is an ensemble method for improving the model predictions of any given learning algorithm. This method consists of sequential training of simple models, where each subsequent model corrects the errors of the previous one. Boosting is a well-known method in the bioinformatics domain and generally shows good results in many classification tasks [13-15].</p></div>
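<div xmlns="http://www.tei-c.org/ns/1.0"><p>The idea of sequential training, in which each new simple model corrects the errors of the previous ones, can be sketched as follows. This is not the baseline used in the paper, whose implementation is not specified here: it is a toy gradient boosting with squared loss on one-dimensional inputs, using decision stumps as weak learners. All names, the learning rate, and the data are illustrative assumptions.</p>
```python
def fit_stump(xs, residuals):
    """Find the threshold split of 1-D inputs that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if threshold >= x]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not right:
            continue  # a split must leave points on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or best[0] > error:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def predict(stumps, rate, x):
    """Sum the shrunken outputs of all stumps for one input."""
    return sum(rate * (left if threshold >= x else right)
               for threshold, left, right in stumps)


def boost(xs, ys, rounds=20, rate=0.5):
    """Fit each new stump to the residuals left by the current ensemble."""
    stumps = []
    for _ in range(rounds):
        residuals = [y - predict(stumps, rate, x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return stumps


def mse(stumps, rate, xs, ys):
    return sum((y - predict(stumps, rate, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 1, 0, 0, 1, 1]
one_round = boost(xs, ys, rounds=1)
many_rounds = boost(xs, ys, rounds=20)
print(mse(one_round, 0.5, xs, ys), mse(many_rounds, 0.5, xs, ys))
```
<p>A single stump cannot represent the alternating pattern in this toy data, whereas the ensemble reduces the training error round by round.</p></div>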
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.2.">Deep learning models</head><p>DNA has patterns in the form of one-dimensional sequence motifs, which a CNN can capture very well; on the other hand, DNA is a text, so an RNN can learn the context from it. Therefore, we expect the best result when the two models, CNN and RNN, are combined. For a proper comparison, we also trained a standalone CNN alongside the CNN + RNN.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.3.">CNN</head><p>We experimented with several hyperparameters for the CNN models. We considered different kernel sizes and strides because they may influence the result. The number of output kernels was set to 1, and we used a softmax layer at the end. Thus, these models output a vector along the length of the input in which each position corresponds to a probability value from 0 to 1. For each output position, there are C boolean values, where C is the kernel size. Every boolean value indicates the presence of Z-DNA at the corresponding nucleotide. The average of these C values was used as the target for the output cell. Since no padding is used, the number of model outputs equals the number of averaged values. That is, each model predicts the fraction of nucleotides in a given segment that lie in Z-DNA and assigns this number to the middle of the segment. Increasing the number of layers or the kernel size increases model complexity but may give better results. The next set of models has more convolutional layers with ReLU activation. In this case, the target variable is calculated in a slightly different way: averaging is performed according to the size of the last layer. The size and number of kernels in the first and second layers were selected from a predefined set of values.</p></div>
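<div xmlns="http://www.tei-c.org/ns/1.0"><p>The averaged targets for a single convolutional layer without padding can be sketched as below. The function name and the toy label vector are assumptions; the sketch only illustrates the averaging of C boolean labels per output cell as described above.</p>
```python
def averaged_targets(labels, kernel_size, stride=1):
    """Average the 0/1 labels over each window a kernel covers (no padding)."""
    n_out = (len(labels) - kernel_size) // stride + 1
    return [sum(labels[i * stride:i * stride + kernel_size]) / kernel_size
            for i in range(n_out)]


labels = [0, 0, 1, 1, 1, 0, 0, 0]
targets = averaged_targets(labels, kernel_size=4, stride=2)
print(targets)
```
<p>Here the three windows cover two, three, and one Z-DNA positions out of four, so the targets are 0.5, 0.75, and 0.25, and the number of targets matches the number of convolution outputs.</p></div>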
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.4.">CNN+RNN</head><p>This type of hybrid model was successfully implemented in DanQ <ref type="bibr" target="#b6">[7]</ref>. The CNN extracts important motifs, while the RNN learns the complex regulatory grammar between the motifs. It is assumed that the motifs detected by the CNN layer also have recurrent dependencies. In theory, such a network is able to recognize a succession of motifs on which the Z-DNA configuration depends. The model architecture used for Z-DNA detection is shown in Fig. <ref type="figure" target="#fig_1">1</ref>. There are several ways to use an RNN: one-to-one, one-to-many, many-to-one, and many-to-many (Fig. <ref type="figure" target="#fig_2">2</ref>). In this paper, we considered two approaches, many-to-many and many-to-one.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.5.">Approach many-to-one</head><p>In this case, the structure of the model is as follows. The first part of the model consists of one or several CNN layers, and each column of the resulting output is passed separately to the RNN. In our case, a multi-layer bidirectional LSTM was selected as the RNN. The number of layers in the CNN and LSTM parts, as well as the sizes of the kernels and hidden layers, are then selected. At the end and the beginning of the sequence, the RNN layer outputs 2 vectors that are associated with the long-term LSTM memory cells. Two LSTM context vectors are included because this RNN model is bidirectional. The vectors are then passed to the fully connected layer, which makes the prediction. The target variable is a boolean value indicating the presence of Z-DNA in the region covered by the sequence.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.6.">Approach many-to-many</head><p>This architecture copies the previous one except for one element. After the RNN layer, the output of the long-term memory element is ignored, and the short-term memory outputs of each direction are aggregated. Each unit of the sequence then corresponds to two vectors, which are passed to the fully connected layer, and predictions are made for each part of the sequence. The target variable in this case is calculated exactly as in the case of the CNN; that is, each unit of the sequence is mapped to the average over a certain region of the chain.</p></div>
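<div xmlns="http://www.tei-c.org/ns/1.0"><p>In both approaches the RNN receives one column per output position of the convolutional part, so the number of LSTM time steps follows from the convolution settings. The sketch below applies the standard output-length formula for a one-dimensional convolution with dilation 1 to the kernel size, stride, and padding values reported in the Results section; the function names are illustrative assumptions.</p>
```python
def conv1d_output_length(length, kernel_size, stride, padding):
    """Standard output length of a 1-D convolution (dilation 1)."""
    return (length + 2 * padding - kernel_size) // stride + 1


def stack_output_length(length, layers):
    """Apply the formula layer by layer; layers holds (kernel, stride, padding)."""
    for kernel_size, stride, padding in layers:
        length = conv1d_output_length(length, kernel_size, stride, padding)
    return length


# Settings of the best models as listed in the Results section, input length 5000.
many_to_one_steps = stack_output_length(5000, [(13, 4, 6)])
many_to_many_steps = stack_output_length(5000, [(25, 2, 12), (25, 2, 12)])
print(many_to_one_steps, many_to_many_steps)
```
<p>Under these assumptions both configurations reduce a 5000 bp window to 1250 time steps, one convolution with stride 4 in the first case and two convolutions with stride 2 in the second.</p></div>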
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Results</head><p>Quantiles were calculated for the distribution of random AUC using bootstrap sampling (Table <ref type="table" target="#tab_0">1</ref>). The first model, the boosting baseline, has rather low quality, indistinguishable from that of a random choice. The best CNN model showed an AUC of 0.69 on the test set. Its architecture is as follows. The first layer is a convolutional layer with 36 kernels, kernel size 13, stride 2, and padding 6. The second layer is a ReLU. The third layer is a convolutional layer with 2 kernels, kernel size 13, stride 2, and padding 6. The last layer is a sigmoid. The hybrid CNN+RNN performed better than the CNN model. The best model with the many-to-one approach showed an AUC of 0.865. The architecture of the best CNN+RNN model is as follows. The first layer is a convolutional layer with 64 kernels, kernel size 13, stride 4, and padding 6. The second layer is a ReLU. The output of the ReLU is sent to a bidirectional LSTM layer with hidden size 64 and 2 layers. The hidden state of the LSTM goes to a dropout layer with probability 0.7. The last, fully connected layer has 2 neurons.</p><p>The best model with the many-to-many approach showed an AUC of 0.805. The first layer is a convolutional layer with 36 kernels, kernel size 25, stride 2, and padding 12. The second layer is a ReLU. The third layer is a convolutional layer with 64 kernels, kernel size 25, stride 2, and padding 12. The fourth layer is a ReLU. The output of the ReLU is sent to a bidirectional LSTM layer with hidden size 64 and 2 layers. The hidden state of the LSTM goes to a dropout layer with probability 0.7. The last, fully connected layer has 2 neurons.</p></div>
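<div xmlns="http://www.tei-c.org/ns/1.0"><p>One possible reading of the random-AUC quantiles mentioned at the start of this section is sketched below: labels are resampled with replacement, paired with uninformative random scores, and the AUC of each resample is collected. The exact procedure used in the paper is not described, so the resampling scheme, the number of resamples, the quantile levels, and the class balance of the toy labels are all assumptions.</p>
```python
import random


def auc(labels, scores):
    """Rank-based AUC: chance that a positive outscores a negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def random_auc_quantiles(labels, n_boot=200, quantiles=(0.05, 0.5, 0.95), seed=0):
    """Quantiles of the AUC of a random predictor over bootstrap resamples."""
    rng = random.Random(seed)
    values = []
    while n_boot > len(values):
        sample = [rng.choice(labels) for _ in labels]  # bootstrap resample of labels
        if 0 in sample and 1 in sample:                # AUC needs both classes
            scores = [rng.random() for _ in sample]    # uninformative predictor
            values.append(auc(sample, scores))
    values.sort()
    return [values[int(q * (len(values) - 1))] for q in quantiles]


labels = [1] * 40 + [0] * 160  # toy class balance, not taken from the paper
low, median, high = random_auc_quantiles(labels)
print(low, median, high)
```
<p>A model whose AUC falls inside the interval between the lower and upper quantiles cannot be distinguished from a random choice at that level.</p></div>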
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusions and Discussion</head><p>The following conclusions can be drawn from the obtained results. Although the CNN model shows higher performance than the baseline, it does not handle the sequential nature of the DNA sequence. The baseline and CNN models perform much worse than a model that contains an RNN layer. The maximum quality achieved on this dataset with this set of architectures does not exceed an AUC of about 0.86, which indicates that the task can be solved using available data.</p><p>Here we presented the results of a deep learning approach to Z-DNA prediction, in particular a hybrid model of two well-known deep learning architectures, CNN and RNN. This architecture outperforms both models based only on CNN and classical machine learning models such as gradient boosting. As we expected, CNN + RNN shows better results than CNN because the RNN can capture the sequential pattern using its context. We assume that our approach may be applied to many other bioinformatics tasks that require mapping spatial data to sequential output.</p><p>One of the advantages of our approach is scalability: the system can be upgraded when more epigenetic and regulatory data become available. Thus, the same type of model can be applied to the recognition of quadruplexes or triplexes, as well as to patterns of association between DNA secondary structures and the epigenetic code. We expect that the inclusion of omics data will improve the prediction quality of the model. However, a large feature space has the drawback of increasing model training time. It would be beneficial first to find a minimal feature set that achieves the desired model quality and then to train the model on the reduced feature space. 
It will also help to find scientifically important associations between the studied functional elements and epigenetic and/or regulatory elements.</p><p>Deep neural networks are capable of effectively processing aggregated information from different levels of genome organization. At the present time, when next-generation sequencing experiments are still too expensive, machine learning models for annotating genomes with functional genomic elements are very important. For some species, next-generation sequencing experiments on the epigenomic and regulatory code are not available at all. Finding de novo or imputed novel functional elements with computational artificial intelligence systems would help researchers understand the principles and mechanisms of genome functioning.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>Modeling and Analysis of Complex Systems and Processes -MACSPro'2020, October 22-24, 2020, Venice, Italy &amp; Moscow, Russia EMAIL: nazar.s.beknazarov@gmail.com (A. 1); mpoptsova@hse.ru (A. 3) ORCID: 0000-0002-7198-8234 (A. 3);</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Architecture of a hybrid model, CNN + RNN for Z-DNA prediction. DNA sequence data transformed with one-hot encoding was concatenated with sparse vectors of epigenomic data.</figDesc><graphic coords="4,72.50,72.50,473.50,266.60" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Schematic representation of approaches for the classification using RNN architecture.</figDesc><graphic coords="4,90.95,429.20,428.30,276.04" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Experiment result</figDesc><table><row><cell>Model</cell><cell>AUC</cell><cell>Accuracy</cell></row><row><cell>Boosting</cell><cell>0.532</cell><cell>0.691</cell></row><row><cell>CNN</cell><cell>0.69</cell><cell>0.55</cell></row><row><cell>CNN+RNN</cell><cell>0.865</cell><cell>0.75</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">A comprehensive review for industrial applicability of artificial neural networks</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">R G</forename><surname>Meireles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">E M</forename><surname>Almeida</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">G</forename><surname>Simoes</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Industrial Electronics</title>
		<imprint>
			<biblScope unit="volume">50</biblScope>
			<biblScope unit="page" from="585" to="601" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Object recognition with gradient-based learning</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Lecun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Haffner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bottou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Shape, contour and grouping in computer vision</title>
				<editor>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Forsyth</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Mundy</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><forename type="middle">D</forename><surname>Gesú</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Cipolla</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin, Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="1999">1999</date>
			<biblScope unit="page" from="319" to="345" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">When face recognition meets with deep learning: an evaluation of convolutional neural networks for face recognition</title>
		<author>
			<persName><forename type="first">G</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Yi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kittler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Christmas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hospedales</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE international conference on computer vision workshops</title>
				<meeting>the IEEE international conference on computer vision workshops</meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="142" to="150" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Deep learning</title>
		<author>
			<persName><forename type="first">I</forename><surname>Goodfellow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Courville</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016">2016</date>
			<publisher>MIT Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Long short-term memory</title>
		<author>
			<persName><forename type="first">S</forename><surname>Hochreiter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schmidhuber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural computation</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page" from="1735" to="1780" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Video-based emotion recognition using CNN-RNN and C3D hybrid networks</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Fan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 18th ACM International Conference on Multimodal Interaction</title>
				<meeting>the 18th ACM International Conference on Multimodal Interaction</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="445" to="450" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">CNN-RNN: A unified framework for multi-label image classification</title>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Xu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE conference on computer vision and pattern recognition</title>
				<meeting>the IEEE conference on computer vision and pattern recognition</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="2285" to="2294" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Stock price prediction using LSTM, RNN and CNN-sliding window model</title>
		<author>
			<persName><forename type="first">S</forename><surname>Selvin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vinayakumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">A</forename><surname>Gopalakrishnan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">K</forename><surname>Menon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">P</forename><surname>Soman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Advances in Computing, Communications and Informatics (ICACCI)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1643" to="1647" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Deep learning in bioinformatics</title>
		<author>
			<persName><forename type="first">S</forename><surname>Min</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Yoon</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Brief Bioinform</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="page" from="851" to="869" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences</title>
		<author>
			<persName><forename type="first">D</forename><surname>Quang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xie</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nucleic Acids Res</title>
		<imprint>
			<biblScope unit="volume">44</biblScope>
			<biblScope unit="page">e107</biblScope>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Enhancer prediction with histone modification marks using a hybrid neural network model</title>
		<author>
			<persName><forename type="first">A</forename><surname>Lim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Lim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kim</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Methods</title>
		<imprint>
			<biblScope unit="volume">166</biblScope>
			<biblScope unit="page" from="48" to="56" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Permanganate/S1 Nuclease Footprinting Reveals Non-B DNA Structures with Regulatory Potential across a Mammalian Genome</title>
		<author>
			<persName><forename type="first">F</forename><surname>Kouzine</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Wojtowicz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Baranello</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Yamane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Nelson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Resch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">R</forename><surname>Kieffer-Kwon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">J</forename><surname>Benham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Casellas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">M</forename><surname>Przytycka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Levens</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Cell Syst</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="344" to="356" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Z-DNA-forming sites identified by ChIP-Seq are associated with actively transcribed regions in the human genome</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">I</forename><surname>Shin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Park</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">H</forename><surname>Seo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">H</forename><surname>Lim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Jeon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Huh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">Y</forename><surname>Roh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">DNA Res</title>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Integrative analysis of 111 reference human epigenomes</title>
		<author>
			<persName><surname>Roadmap Epigenomics Consortium</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kundaje</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Meuleman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ernst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bilenky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Yen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Heravi-Moussavi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kheradpour</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Ziller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Amin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">W</forename><surname>Whitaker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">D</forename><surname>Schultz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">D</forename><surname>Ward</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sarkar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Quon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">S</forename><surname>Sandstrom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Eaton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">C</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">R</forename><surname>Pfenning</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Claussnitzer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Coarfa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">A</forename><surname>Harris</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shoresh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">B</forename><surname>Epstein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Gjoneska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Leung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">D</forename><surname>Hawkins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Lister</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Hong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Gascard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">J</forename><surname>Mungall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Moore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Chuah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">K</forename><surname>Canfield</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">S</forename><surname>Hansen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kaul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">J</forename><surname>Sabo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Bansal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Carles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Dixon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">H</forename><surname>Farh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Feizi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Karlic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">R</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kulkarni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Lowdon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Elliott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">R</forename><surname>Mercer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">J</forename><surname>Neph</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Onuchic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Polak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Rajagopal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Ray</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">C</forename><surname>Sallari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">T</forename><surname>Siebenthall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">A</forename><surname>Sinnott-Armstrong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Stevens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">E</forename><surname>Thurman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">E</forename><surname>Beaudet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Boyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">L</forename><surname>De Jager</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">J</forename><surname>Farnham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">J</forename><surname>Fisher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Haussler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">J</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Marra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>McManus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sunyaev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Thomson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">D</forename><surname>Tlsty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">H</forename><surname>Tsai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">A</forename><surname>Waterland</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">Q</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">H</forename><surname>Chadwick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">E</forename><surname>Bernstein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">F</forename><surname>Costello</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Ecker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hirst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Meissner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Milosavljevic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Stamatoyannopoulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kellis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nature</title>
		<imprint>
			<biblScope unit="volume">518</biblScope>
			<biblScope unit="page" from="317" to="330" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">The Encyclopedia of DNA elements (ENCODE): data portal update</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">A</forename><surname>Davis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">C</forename><surname>Hitz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">A</forename><surname>Sloan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">T</forename><surname>Chan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Davidson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Gabdank</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Hilton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Jain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><forename type="middle">K</forename><surname>Baymuradov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Narayanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">C</forename><surname>Onate</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Graham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">R</forename><surname>Miyasato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">R</forename><surname>Dreszer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">S</forename><surname>Strattan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Jolanki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">Y</forename><surname>Tanaka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Cherry</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nucleic Acids Res</title>
		<imprint>
			<biblScope unit="volume">46</biblScope>
			<biblScope unit="page" from="D794" to="D801" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Model-based boosting in high dimensions</title>
		<author>
			<persName><forename type="first">T</forename><surname>Hothorn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bühlmann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Bioinformatics</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="page" from="2828" to="2829" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Boosting for tumor classification with gene expression data</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dettling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bühlmann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Bioinformatics</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="page" from="1061" to="1069" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Predicting protein residue-residue contacts using deep networks and boosting</title>
		<author>
			<persName><forename type="first">J</forename><surname>Eickholt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Cheng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Bioinformatics</title>
		<imprint>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="page" from="3066" to="3072" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
