<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Stacked Sparse autoencoder for unsupervised features learning in PanCancer miRNA cancer classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
<string-name>Imene Zenbout</string-name>
          <email>imene.zenbout@univ-constantine2.dz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
<string-name>Abdelkrim Bouramoul</string-name>
          <email>abdelkrim.bouramoul@univ-constantine2.dz</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
<string-name>Souham Meshoul</string-name>
          <email>sbmeshoul@pnu.edu.sa</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IFA department, NTIC faculty, Constantine 2 University</institution>
          ,
          <addr-line>CRBT, CERIST, Constantine</addr-line>
          ,
          <country country="DZ">Algeria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>IFA department, NTIC faculty, Constantine 2 University, Misc laboratory</institution>
          ,
          <addr-line>Constantine</addr-line>
          ,
          <country country="DZ">Algeria</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
<institution>Princess Nourah bint Abdulrahman University</institution>
          ,
          <addr-line>Riyadh</addr-line>
          ,
          <country country="SA">Saudi Arabia</country>
        </aff>
      </contrib-group>
      <abstract>
<p>The recent progress in cancer diagnosis is oriented toward genomic data analysis. miRNAs play an important role as cancer biomarkers, moving cancer diagnosis and therapy toward personalized medicine, with the ultimate goals of increasing survival rates and improving disease prevention. The recent explosion in genomic data generation has motivated the use of miRNA to enhance diagnosis, prognosis and treatment. In this work we explore the integrated Atlas PanCancer miRNA profiles using deep feature learning based on an unsupervised Stacked Sparse AutoEncoder (SSAE). The proposed SSAE model learns a feature representation from the data. The consistency of the learned features has been tested by classifying samples into 31 cancer types. The model's performance has been compared to state-of-the-art unsupervised feature learning models. The obtained results show the competitiveness and promising performance of our model, with an accuracy of about 95%. Index Terms: Deep learning, Bioinformatics, feature learning, Sparse autoencoders, miRNA, PanCancer.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>
        The recent and tremendous advance in high sequencing
technologies [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] have forstred the role of genomic data across
all the transcriptomic as a key answer to different biological
related questions and precisely in disease genetics. With these
new genomic and genetic data availability and transparency,
miRNA role moved from noisy particles to a highly engaged
genomic instances in gene regulation and post protein function.
This has led to a direct involving of miRNA in the occurrence
or the suppression of cancer [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
microRNA (miRNA) are classified as non-coding regulatory
genes [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], that can be found in small fragments of non-coding
RNA regions (about 21-23 nucleotide) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Since the
discovery of miRNA in 1993 by R.C.Lee [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the generation of
miRNA data using high throughput technologies [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] to
explore the direct role of miRNA and cancer diagnosis and gene
impact become intensive. The particularity of miRNA profiles
is their ability to be a direct tool in cancer analysis, therapy
and post treatment [8], which represents the main motivation
of this work. The miRNA data share the same issue with gene
expression data which is the very small sample size with regard
to the high profiles dimensionality .i.e there is some profiles
that are irrelevant in cancer diagnosis and related decisions
compared to the low number of patient samples. Obviously,
this lends itself to a dimensionality reduction problem where
it is required to extract the miRNA signature representation
that can be a relevant predictors in cancer diagnosis.
In this work we propose a deep unsupervised features learning
model, based on stacking three sparse autoencoders to learn
new features from the initial noisy miRNA profiles inputs.
The learned features through the different abstraction levels,
have been used to train classifiers to predict the cancer type
of a specific sample according to 31 different cancer type.
The proposed unsupervised and supervised models have been
trained on the Atlas PanCancer [9] data set. The particularity
of this data set is that it combines different cancer type. This
may help us to draw information from the well explored
cancer type that have a big number of samples and/or a high
correlation between the different miRNA profiles and apply
these information to classify, or understand the cancer type
with poor exploration rate. The features learning model has
been compared to some of the most known unsupervised
features learning and dimensionality reduction models, here
we used pricipal component analysis PCA and kernel principle
component analysis KPCA. The rest of the paper is organized
as follows: A literature review in section II. Section III is
devoted to a brief introduction to sparse autoencoders. Section
IV describes the data set and the preprocessing steps. Our
proposal is presented in section V along with the set of
experimental results and discussion.
      </p>
    </sec>
    <sec id="sec-2">
      <title>II. MIRNA CANCER CLASSIFICATION</title>
      <p>
        Recently, the exploration of noncoding regions rule in
cancer diagnosis and therapy is attracting a large community
of scientists. The miRNA data set analysis using statistical
and machine learning become one of the trending problems in
bioinformatics [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In cancer diagnosis and classification, we
cite the work of J.Lu et all [10] where the authors analysed
mammalian miRNA using k-nearest neighbors and
probabilistic neural network algorithm. Kotlarchyk et al [11], used
ensemble methodology to classify different cancer type based
on miRNA profiles. A statiscical support vector machine-
knearest neighbors is proposed by D. Ting-ting et al [12], where
they used t-statics to select relevent miRNA feautures and a
combination of kNN and SVM as classifiers to distinguish
between positive and negative samples in different cancer type
data set. For multiclass cancer classification, P.Yongjun [13]
used subset-based ensenmble method features selection, by
generating multiple miRNA subset based on the correlation
among miRNAs and then using classifiers to learn valuable
knoweledge from each subset to finally combine the results
of each classifier by averaging probabilities.A fuzzy
normalization based approach is proposed by M Anidha et al [14],
where the authors used relevant information gain and F-score
to select the most important features in cancer diagnosis, yet
in this work the experiments were for binary classification
tasks only. A web advisor consisting of semi-supervised
classifiers, with pearson correlation, Kappa statistics and recursive
feature elimination for selecting the best miRNA profiles,
was conducted by N.Cheerla et al [8] ,to perdict cancer type
and treatment recommendation based on the Atlas PanCancer
data set. In paper [15], the authors used Beep belief nets
and active learning to apply multi-level gene/miRNA feature
selection, and to visualize the impact between genes and
miRNAs, and select the most discriminating miRNAs profiles,
the paper tested the performance of the proposed approach
in classifying 3 cancer types. Whereas L.Fu et al [16] used
stacked auto-encoders to enhance cancer diagnosis and
treatment, by building both miRNA-miRNAs and human
diseasedisease similarities network and then use stacked autoencoder
to extract the best features set from the similarity results in
order to employit in predicting cancer type. Convolution net
works CNN were also used by A. L.Rincon et al [17], to
classify the PanCancer data types, where the authors applied
Evolutionary algorithm to optimize the architecture of the
CNN model.
      </p>
    </sec>
    <sec id="sec-3">
      <title>III. SPARSE AUTO ENCODERS</title>
      <p>An autoencoder is a symmetric neural network that
copies the input of the network to its output, passing
through a bottleneck layer that represents the latent feature
space (figure 1). A sparse autoencoder is an autoencoder that
applies a sparsity penalty on the code h during the training of
the encoder part, in addition to the reconstruction loss [18].
This sparsity penalty deactivates the low-activation nodes, which
leads to the extraction of a more relevant feature representation.
g(h) is the decoder output and h = f(x) is the encoder
output. A detailed description of the autoencoder architecture
is given in section V. Sparse autoencoders have been intensively
used for feature learning problems in different domains: emotion
detection and robotics [21], medical imaging [20], and medical
diagnosis [22].</p>
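The training objective described above can be written compactly. A standard formulation of the sparse autoencoder loss (following the textbook treatment in [18]; the penalty weight λ and the generic penalty Ω are notational placeholders, not values from this paper):

```latex
% Sparse autoencoder objective: reconstruction loss plus a sparsity penalty
\min_{\theta}\; \mathcal{L}\big(x,\, g(f(x))\big) \;+\; \lambda\,\Omega(h),
\qquad h = f(x)
```

Here L is the reconstruction loss (mean absolute error in this work), f and g are the encoder and decoder, and Ω(h) penalizes large or dense codes, which is what drives low-value nodes toward zero.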
    </sec>
    <sec id="sec-4">
      <title>IV. DATA COLLECTION AND PREPROCESSING</title>
      <p>We collected the Atlas PanCancer [9] miRNA data set,
used for predicting cancer type, from the TCGA data base
repository (accessed 10/12/2018 18:14). The miRNA data set was
generated using next generation sequencing on around 33
types of cancer in US hospitals. The initial miRNA data set
consists of more than 10 thousand patients and around 800 short
non-coding RNA profiles. We applied preprocessing to the data
matrix by eliminating the miRNA instances with more than 20% zero
values, then used a log transformation to reduce the skew of the
data, and finally used data imputation to replace the missing
values. We then divided the final data matrix into 70% of the
samples, used to train the supervised model, and 30% of the
samples, used to evaluate the performance of the trained
classifier. Table I exhibits the data set description before and
after preprocessing, and table II illustrates the distribution of
samples over the different cancer types.</p>
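The preprocessing steps above can be sketched as follows. This is a minimal numpy illustration on synthetic data, not the authors' exact pipeline: the 20% zero threshold, the log transformation, the imputation and the 70/30 split follow the text, while the log2(x+1) form and the column-mean imputation strategy are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the miRNA matrix: 100 samples x 20 profiles.
X = rng.gamma(shape=2.0, scale=50.0, size=(100, 20))
X[:, :4][rng.random((100, 4)) < 0.5] = 0.0   # 4 heavily zeroed profiles
X[rng.random(X.shape) < 0.05] = np.nan       # scattered missing values

# 1) Drop miRNA columns with more than 20% zero values.
zero_frac = np.mean(X == 0.0, axis=0)
X = X[:, zero_frac <= 0.20]

# 2) Log-transform to reduce skew; log2(x + 1) keeps zeros at 0.
X = np.log2(X + 1.0)

# 3) Impute remaining missing values with the column mean (assumed strategy).
col_means = np.nanmean(X, axis=0)
rows, cols = np.where(np.isnan(X))
X[rows, cols] = col_means[cols]

# 4) 70/30 hold-out split, as used for the supervised phase.
idx = rng.permutation(len(X))
n_train = int(0.7 * len(X))
X_train, X_test = X[idx[:n_train]], X[idx[n_train:]]
print(X_train.shape, X_test.shape)
```

The column filter runs before the log transform so that the zero fractions are measured on the raw counts.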
    </sec>
    <sec id="sec-5">
      <title>V. SSAE FEATURES LEARNING</title>
      <p>The tackled problem can be denoted as a matrix X of
dimension N × M, where N represents the number of samples and M
represents the set of non-coding regions, and each xij corresponds
to the value of miRNA i for sample j. The proposed architecture
(figure 2) consists of two phases: a dimensionality reduction
phase and a predictive phase. In phase one we used unsupervised
feature learning to train a stacked sparse autoencoder (SSAE), in
which we piled three sparse autoencoders [SAE1, SAE2, SAE3], such
that the input of SAEi is the output of SAEi-1; the particularity
of the output of an autoencoder is that it is a reconstruction of
the input with less noise. The feature vectors generated by the
three AEs were concatenated to train the predictive models. These
models are trained using supervised learning to predict the cancer
type. The two steps in our analytical architecture have been
implemented using Python 3.5 and Keras [23] with a TensorFlow
backend. The experiments were run on an HP-bs0xx laptop with an
Intel Core i7-7500U CPU @ 2.70 GHz × 4 and 8 GB of memory.</p>
      <sec id="sec-5-1">
        <title>A. First Phase</title>
        <p>In this step, we used the SSAE to extract a new
feature representation that is more accurate for multi-class
cancer diagnosis. The first sparse autoencoder SAE1 takes a
feature vector S of the matrix X, of size M, and feeds it to the
encoder; in the bottleneck layer a new latent space F1 of size
K, where K &lt; M, is generated, and based on this latent space
the decoder tries to reconstruct the input S as closely as
possible at its output, so that S ≈ S′. The output S′ of SAE1
becomes the input of SAE2, and the same steps are followed to
generate a latent space F2, with the decoder trying to
reconstruct S′ at its output S″, where S′ ≈ S″. Equally, S″ is
the input of SAE3, and the bottleneck of the third sparse
autoencoder generates the last latent feature vector F3. The
consistency of each autoencoder and its final architecture
settings were evaluated by calculating the reconstruction loss
between the input of the encoder and the output of the decoder
for each SAEi. In our proposal we used the mean absolute error
loss function (eq. 2). The three feature representations
[F1, F2, F3] generated by the sparse autoencoders were
concatenated into one feature vector F4, used to train the
classifiers.</p>
        <p>mae = (1/n) Σ_{i=1..n} |x_i − x′_i| = (1/n) Σ_{i=1..n} |e_i|   (2)
Zooming in on the architecture of each autoencoder in
phase one (table III), we describe it as follows:
* SAE1: We used a deep architecture, where the
encoder consists of two fully connected layers (494 and 250
nodes) with L2 regularization as the sparsity penalty, a latent
space layer with 50 nodes that generates the new feature space
F1, and a symmetric decoder that reconstructs the encoder input
with 250 and 494 nodes per layer respectively.
* SAE2: Equally, we used a deep autoencoder with
two fully connected layers of 494 and 150 nodes for the encoder,
to which we applied a sparse L2 regularization penalty, a 50-node
bottleneck layer to generate the new feature representation F2,
and a symmetric decoder.
* SAE3: In the last step we used the simplest form of a sparse
autoencoder: since the data had already been purified of most of
the noise by the two previous sparse autoencoders, we needed to
avoid the overfitting and underfitting traps, in which the
autoencoder would only copy the input to the output without
learning a new feature representation. So SAE3 is composed of
only one fully connected sparse layer for the encoder (494
nodes), a bottleneck layer of 50 nodes that represents the last
feature vector F3, and a 494-node fully connected layer for the
mirror decoder.</p>
        <p>To tune the weights of each autoencoder layer (table
III), we used the ReLU nonlinear activation function, while the
bottleneck layer was tuned using a Softplus activation function.
We trained the stacked autoencoder using mini-batch gradient
descent with the Adam optimizer as follows:
1- We trained SAE1 for 200 epochs with a batch size of 180
samples on the initial input data set, which contains the values
of the non-coding regions for all the available patients,
obtaining an experimental reconstruction loss of 0.56.
2- SAE2 was trained for 150 epochs with a batch size of 150 on
the reconstructed input from SAE1; the experimental
reconstruction loss after training was 0.32.
3- The output of SAE2 was used to train SAE3 for 100 epochs with
a batch size of 130; the reconstruction loss after training was
0.21.</p>
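One stage of this stacking can be sketched in plain numpy. The paper's actual implementation uses Keras; here the Softplus bottleneck and the MAE loss follow the text, while the single-hidden-layer shape, toy dimensions, learning rate and L2 weight are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def softplus(z):
    return np.logaddexp(0.0, z)   # numerically stable log(1 + e^z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def train_sparse_ae(X, code_dim, lam=1e-4, lr=0.05, epochs=500):
    """One autoencoder stage: Softplus bottleneck, linear decoder,
    MAE reconstruction loss plus an L2 weight penalty (sparsity term)."""
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, code_dim)); b1 = np.zeros(code_dim)
    W2 = rng.normal(0, 0.1, (code_dim, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        Z = X @ W1 + b1
        H = softplus(Z)                # latent code (bottleneck)
        Xhat = H @ W2 + b2             # reconstruction
        losses.append(np.abs(Xhat - X).mean()
                      + lam * ((W1**2).sum() + (W2**2).sum()))
        G = np.sign(Xhat - X) / (n * d)    # subgradient of the MAE term
        dW2 = H.T @ G + 2 * lam * W2
        db2 = G.sum(axis=0)
        dH = G @ W2.T
        dZ = dH * sigmoid(Z)               # softplus'(z) = sigmoid(z)
        dW1 = X.T @ dZ + 2 * lam * W1
        db1 = dZ.sum(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    code = softplus(X @ W1 + b1)
    recon = code @ W2 + b2
    return code, recon, losses

X = rng.random((64, 8))                # toy stand-in for miRNA features
F1, X1, losses = train_sparse_ae(X, code_dim=3)
# Stacking: the reconstruction X1 becomes the input of the next stage,
# and the codes F1, F2, F3 are later concatenated for the classifiers.
F2, X2, _ = train_sparse_ae(X1, code_dim=3)
print(round(losses[0], 3), "->", round(losses[-1], 3))
```

The reconstruction loss should fall over training, mirroring the per-stage losses (0.56, 0.32, 0.21) reported above; the paper's Adam optimizer is replaced here by plain gradient descent for brevity.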
        <p>Figure 3 shows the training process of each autoencoder,
where we can see that SAE1 converged toward its best performance
around epoch 150, SAE2 stabilized around epoch 125, and SAE3
converged rapidly to its best performance around epoch 80. After
training the three autoencoders, we extracted the latent space of
each autoencoder and concatenated the three vectors into the new
miRNA feature space used in the second phase.</p>
      </sec>
      <sec id="sec-5-2">
        <title>B. Second Phase</title>
        <p>The second phase is the classification phase, in which
we used four classifiers to predict the class of a cancer sample
among 31 cancer types. Support vector machine (SVM), decision
tree (DT), random forest (RF), and k-nearest neighbors (KNN)
were the models chosen and trained to fulfill the diagnosis task.
The performance of the models was assessed through hold-out
cross-validation, splitting the data into 70% training and 30%
testing. Besides, to evaluate the performance of our SSAE in
learning new feature representations, we compared the performance
of the trained classifiers with that of classifiers trained on
features generated by state-of-the-art unsupervised
dimensionality reduction methods, namely principal component
analysis (PCA) and kernel principal component analysis (KPCA).
The overall accuracy score of each classifier (figure 4) shows
that the predictive models trained on the feature representation
extracted by the SSAE are better at predicting the class of each
sample. SVM/SSAE scored the highest accuracy in discriminating
between the different cancer types, with a performance that
reaches approximately 95%. For DT, KPCA was able to beat our
approach by a margin of 0.02. For KNN and RF, the performance of
the classifiers under each dimensionality reduction approach was
very close, with a slight edge for our approach at accuracies of
92% and 89% respectively.</p>
        <p>Since accuracy alone is not enough to evaluate a
classifier, and since our problem is a multi-class classification
problem, we chose additional metrics to evaluate the performance
of our models across the trained classifiers. We used micro,
macro and weighted average values to evaluate the consistency of
each classifier in predicting each class (tables IV, V, VI, VII).
Table IV represents the case with the best performance for each
classifier. We conclude from the results that SVM/SSAE was the
best performing model; its micro average scores reflect the
ability of the model to predict positive samples at a high rate
(95%) for both micro average precision and micro average recall.
Equally, the macro average and weighted average results are very
promising, despite the fact that our class sizes vary. Tables V
and VII exhibit the overall performance of the classifiers, where
our feature representation learning model was able to slightly
outperform those trained on PCA and KPCA. Table VI shows the one
case where our DT/SSAE model was not able to perform better than
the DT/KPCA classifier. The collection of results tables exhibits
the high consistency of the SSAE features: across most of the
classifiers our model scored the highest values, and in all the
experiments we tested, PCA features were not able to outperform
ours; yet KNN/PCA was very close to KNN/SSAE, with equal micro
average and weighted average values, and only a small difference
captured by the macro average values.</p>
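The micro, macro and weighted averaging used above can be illustrated on a toy multi-class prediction in pure numpy (the label vectors are invented for illustration; "weighted" here means support-weighted, as is conventional):

```python
import numpy as np

y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2, 2])   # imbalanced classes
y_pred = np.array([0, 0, 0, 0, 0, 1, 1, 1, 2, 1])

classes = np.unique(y_true)
tp = np.array([np.sum((y_pred == c) & (y_true == c)) for c in classes])
fp = np.array([np.sum((y_pred == c) & (y_true != c)) for c in classes])
fn = np.array([np.sum((y_pred != c) & (y_true == c)) for c in classes])

# Micro averaging pools all decisions, so large classes dominate;
# for single-label multi-class it equals overall accuracy.
micro_p = tp.sum() / (tp.sum() + fp.sum())
# Macro averaging weights every class equally, exposing weak small classes.
macro_p = np.mean(tp / (tp + fp))
# Weighted averaging weights each class precision by its support.
support = tp + fn
weighted_p = np.sum(support / support.sum() * (tp / (tp + fp)))
print(micro_p, macro_p, weighted_p)
```

On this toy example the three averages differ (0.8, about 0.83, and 0.9), which is exactly why the tables report all of them for the size-variant cancer classes.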
        <p>Compared to the results published in [8] and [17], we
can say that our model is very powerful at discriminating between
the 31 cancer types, despite the fact that some of the cancer
types have very few samples. Cheerla et al. [8] addressed this
problem by eliminating the types that have smaller numbers of
patients, working on only 21 cancer types and using
semi-supervised learning to raise the accuracy score to 97%.
A. L. Rincon et al. [17] likewise dealt with only 29 cancer types
to reach a training accuracy of 96%. We also believe that
integrating more characteristics, such as stage and gender, into
our analytical strategy may improve the results for the 31
predicted cancer types.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>VI. CONCLUSION</title>
      <p>In this paper we have implemented a stacked sparse
unsupervised autoencoder to learn a new feature representation
that may help promote cancer genetic diagnosis based on short
non-coding RNA regions, which play a significant role in
silencing, regulating and managing the transcription process in
the human body. The learned features have been evaluated through
supervised models, where our proposed unsupervised feature
learning model was able to generate a new discriminant data
representation, leading to a method that is competitive with
state-of-the-art methods. We believe that collecting new samples,
moving toward semi-supervised classification, or integrating some
clinical information may enhance the results obtained in this
work; moreover, the use of the PanCancer data set may give our
model the flexibility to be easily applied to other cancer types
from different genomic data banks in further research.
</p>
      <p>[8] N. Cheerla, O. Gevaert. "MicroRNA based Pan-Cancer diagnosis and treatment recommendation". BMC Bioinformatics, 2017.</p>
      <p>[9] J. Liu, et al. "An integrated TCGA pan-cancer clinical data resource to drive high-quality survival outcome analytics". Cell, 2018, pp. 400-416.</p>
      <p>[10] J. Lu, et al. "MicroRNA expression profiles classify human cancers". Nature, 2005.</p>
      <p>[11] A. Kotlarchyk, T. Khoshgoftaar, M. Pavlovic, H. Zhuang, A. S. Pandya. "Identification of microRNA biomarkers for cancer by combining multiple feature selection techniques". Journal of Computational Methods in Sciences and Engineering, 2011, pp. 283-298.</p>
      <p>[12] D. Ting-ting, S. Chang-ji, D. Yan-shou, B. Yi-duo. "Analysis of miRNA expression profile based on SVM algorithm". In IOP Conference Series: Earth and Environmental Science. IOP Publishing, 2018.</p>
      <p>[13] P. Yongjun, P. Minghao, R. Keun Ho. "Multiclass cancer classification using a feature subset-based ensemble from microRNA expression profiles". Computers in Biology and Medicine, 2017, pp. 39-44.</p>
      <p>[14] M. Anidha, K. Premalatha. "An application of fuzzy normalization in miRNA data for novel feature selection in cancer classification". Biomed. Res., 2017, 28.9: 4187-4195.</p>
      <p>[15] R. Ibrahim, N. A. Yousri, M. A. Ismail, N. M. El-Makky. "Multi-level gene/MiRNA feature selection using deep belief nets and active learning". In 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE, 2014, pp. 3957-3960.</p>
      <p>[16] L. Fu, Q. Peng. "A deep ensemble model to predict miRNA-disease association". Scientific Reports, 2017.</p>
      <p>[17] A. L. Rincon, et al. "Evolutionary optimization of convolutional neural networks for cancer miRNA biomarkers classification". Applied Soft Computing, 2018, pp. 91-100.</p>
      <p>[18] I. Goodfellow, Y. Bengio, A. Courville. "Deep Learning". MIT Press, 2016.</p>
      <p>[19] M. Tschannen, O. Bachem, M. Lucic. "Recent advances in autoencoder-based representation learning". arXiv preprint arXiv:1812.05069, 2018.</p>
      <p>[20] Y.-D. Zhang, et al. "Seven-layer deep neural network based on sparse autoencoder for voxelwise detection of cerebral microbleed". Multimedia Tools and Applications, 2018, pp. 10521-10538.</p>
      <p>[21] L. Chen, et al. "Softmax regression based deep sparse autoencoder network for facial emotion recognition in human-robot interaction". Information Sciences, 2018, pp. 49-61.</p>
      <p>[22] C. Zhang, et al. "Deep Sparse Autoencoder for Feature Extraction and Diagnosis of Locomotive Adhesion Status". Journal of Control Science and Engineering, 2018.</p>
      <p>[23] F. Chollet, et al. "Keras". https://keras.io, 2015.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Cristiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Veltri</surname>
          </string-name>
          . ”
          <article-title>Methods and techniques for miRNA data analysis”. in Microarray Data Analysis</article-title>
          . Humana Press, New York, NY,
          <year>2015</year>
          . pp
          <fpage>11</fpage>
          -
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Tam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.S.</given-names>
            <surname>Tsao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.D.</given-names>
            <surname>Mcpherson</surname>
          </string-name>
          .”
          <article-title>Optimization of miRNA-seq data preprocessing”</article-title>
          . Briefings in bioinformatics,
          <year>2015</year>
          , pp
          <fpage>950</fpage>
          -
          <lpage>963</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sing</surname>
          </string-name>
          , et al. ”
          <article-title>Machine learning techniques in exploring microRNA gene discovery, targets, and functions</article-title>
          ” in Bioinformatics in MicroRNA Research. Humana Press, New York, NY,
          <year>2017</year>
          . pp
          <fpage>211</fpage>
          -
          <lpage>224</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.H.</given-names>
            <surname>Gunaratne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Coarfa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Soibam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tandon</surname>
          </string-name>
          .
          <article-title>”miRNA Data Analysis: Next-Gen Sequencing”</article-title>
          .
          . In Fan, J.B. (ed.),
          <source>Next-Generation MicroRNA Expression Profiling Technology</source>
          .
          <source>Methods in Molecular Biology (Methods and Protocols)</source>
          ,
          <year>2012</year>
          Humana Press
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.C.</given-names>
            <surname>LEE</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.L.</given-names>
            <surname>Feinbaum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ambros</surname>
          </string-name>
          . ”
          <article-title>The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14”</article-title>
          . Cell,
          <year>1993</year>
          , pp
          <fpage>843</fpage>
          -
          <lpage>854</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Xuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Qing</surname>
          </string-name>
          , L. Guo,
          <string-name>
            <given-names>L.</given-names>
            <surname>Shi</surname>
          </string-name>
          .
          <article-title>Next-generation sequencing in the clinic: promises and challenges</article-title>
          .
          <source>Cancer letters</source>
          ,
          <year>2013</year>
          ,pp
          <fpage>284</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>K.R.</given-names>
            <surname>Kukurba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.B.</given-names>
            <surname>Montgomery</surname>
          </string-name>
          .”
          <article-title>RNA sequencing and analysis”</article-title>
          .
          <source>Cold Spring Harbor Protocols</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>