<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>M-MNet for Cancer Classification by Constructing Somatic Mutation Map</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Chenxu Quan</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xin Chen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fenghui Liu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lin Qi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yun Tie</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>The First Affiliated Hospital of Zhengzhou University</institution>
          ,
          <addr-line>Zhengzhou, Henan, 450000</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Zhengzhou University</institution>
          ,
          <addr-line>Zhengzhou, Henan, 450000</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <fpage>181</fpage>
      <lpage>191</lpage>
      <abstract>
        <p>Because cancer has a high fatality rate, timely detection and treatment at an early stage are very important. Cancer classification based on somatic mutation data helps physicians identify cancer types at the genetic level and reduces the likelihood of misdiagnosis and missed diagnosis. However, somatic mutation data are one-dimensional (1-D) and highly redundant, which leads to low robustness and overfitting of the model. In addition, current models based on convolutional neural networks (CNNs) fail to take global features of the input data into account and show inferior classification performance. In this paper, we propose a gene mutation map construction method, based on the RGB three-channel principle of images, that transforms the dimension of somatic mutation data and makes it suitable for existing image classification models. Then, based on the prediction results of driver genes, a feature selection optimization (FSO) algorithm is applied to the original mutation map to address its high noise and sparsity. Furthermore, a classification network named M-MNet is introduced, based on the inverted residual module and the multi-head self-attention module. The experimental results show that the proposed method improves overall classification performance, achieving 94.62% accuracy and a 94.34% F1 score in the cancer classification task over 19 tumor cohorts.</p>
      </abstract>
      <kwd-group>
        <kwd>Somatic mutation</kwd>
        <kwd>Driver genes</kwd>
        <kwd>Cancer classification</kwd>
        <kwd>Feature selection optimization</kwd>
        <kwd>M-MNet</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        There are many types of mutations in the somatic cell genome, such as Single Nucleotide Variants
(SNVs), Insertions and Deletions (InDels), and chromosome Structural Variations (SVs) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. When
specific genomic mutations occur in somatic cells, they can promote the development and malignant
proliferation of cancer cells. The genes carrying such mutations are known as driver genes. For example, it has been confirmed
that mutations in the VHL and MET genes lead to kidney cancer [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Depending on the mechanisms
by which they induce cancer, driver genes can be classified as oncogenes or tumor suppressor genes,
both of which interact to maintain stable positive and negative regulatory signals. Oncogenes are often
expressed at low levels or not expressed in the genome. When they undergo mutation and abnormal
activation, they become carcinogenic factors that induce cancer. Tumor suppressor genes, on the other
hand, are genes in the genome that have inhibitory effects on cell growth and potential anti-cancer
effects. Mutations or inactivation of tumor suppressor genes can lead to cell carcinogenesis. However,
not all mutated genes are driver genes. There are neutral mutations in the human body that do not
promote cell carcinogenesis, and genes that undergo such mutations are called passenger genes.
      </p>
      <p>
        Cancer driver genes play a significant role in various clinical aspects of cancer prevention, early
detection and diagnosis, staging and classification, as well as rehabilitation treatment. Both driver
genes and passenger genes are mutated genes, but they have distinct roles in regulating cellular
physiological mechanisms and their pathological analysis and treatment. Therefore, it is of great
significance to identify driver genes among all mutated genes in tumor cells. Due to the high complexity
of cancer gene mutation mechanisms [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], it is challenging to identify distinguishing features that
can effectively differentiate driver genes from passenger genes, and accurately identifying driver genes
remains a major challenge. In recent years, with the rapid development of computer science and the
emergence of second-generation sequencing technologies, such as high-throughput sequencing, it
has become possible to analyze cancer driver genes with the support of big data. This data-driven
research approach significantly improves the efficiency of cancer research. Taking advantage of these
advancements, many complex computational methods have been proposed for detecting cancer driver
gene mutations and conducting an in-depth analysis of the regulatory mechanisms behind driver genes
in cancer.
      </p>
      <p>This article focuses on the prediction of driver genes and cancer classification tasks based on somatic
mutation data, aiming to address the challenges encountered in driver gene prediction and cancer
classification. Among them, the driver gene prediction methods based on mutation position face the
following problems: (1) The statistical determination of mutation probabilities for nucleotide contexts
is independent, which can lead to overfitting of background mutation signals and overlook important
signals conveyed by low-frequency mutation sites. (2) The computational complexity increases
exponentially when expanding nucleotide contexts. (3) Traditional clustering algorithms such as K-means
and DBSCAN are no longer suitable for handling complex mutation data, resulting in poor clustering
performance.</p>
      <p>The main contributions of this study are as follows:
1) This study applies the RGB three-channel principle to transform the dimension of somatic mutation
data, producing two-dimensional gene mutation images. This enables the use of existing image
classification models on somatic mutation data.
2) This study employs a feature selection optimization algorithm to address the high noise and
sparsity of the two-dimensional mutation images. Optimizing the selection of feature genes
effectively enhances the model’s generalization ability.
3) To better capture both local and global features of the images, this study proposes the
M-MNet network model, which combines the inverted residual module with the multi-head self-attention
module. The inverted residual module extracts local features, and the multi-head self-attention
module then captures global features. This enables accurate feature extraction
and classification of mutation images.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>The computational methods used for cancer driver gene discovery fall into three main classes:
methods for identifying individual cancer driver genes, methods for identifying cancer driver gene
modules, and methods for discovering personalized cancer driver genes (i.e., driver genes specific to
individual patients).</p>
      <p>
        Single cancer driver gene identification can be divided into two subcategories based on the key
techniques used in the methods: mutation-based methods and subnetwork-based methods.
Mutation-based methods utilize various features of mutations, such as their significance, functional impact, location,
and other information, to discover cancer driver genes. Methods based on the significance of gene
mutations include MuSiC [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and MutSigCV [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Methods based on the functional impact of gene
mutations include OncodriveFM [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], OncodriveFML [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], DriverML [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and others. These methods
often select driver gene candidates based on mutations in genes that have a significant functional
impact rather than evaluating the number of mutations. Therefore, these methods can often detect
low-frequency mutations that play important roles in cancer development. Methods based on the
location information of gene mutations usually rely on clustering methods and are often referred to
as hotspot-based methods [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Hotspots typically refer to high-density mutation regions, often driven
by positive selection, and are commonly found in functionally important domains or residues in the
three-dimensional protein structure [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The OncodriveCLUST method [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] is a typical representative
of hotspot methods. The MMC algorithm [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] adjusts the influence weights of mutation sites on
surrounding sites through kernel density estimation for clustered identification of mutation genes.
The clustering algorithm HotMAPS [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] considers more hotspot information in the 3D protein space
by combining tumor mutation data with PDB data. Additionally, MutPanning [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] identifies driver
factors based on the ratio of the non-synonymous substitution rate to the synonymous substitution rate (dN/dS).
Network-based methods often predict cancer driver genes by evaluating the role of genes in biological
networks and combining them with gene mutation information. DriverNet [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] reveals cancer driver
genes by assessing the impact of mutations on cancer transcriptional networks.
      </p>
      <p>
        Identification of cancer driver modules. The majority of methods for identifying cancer driver modules
are based on the mutual exclusivity of mutations. CoMEt [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] uses mutual exclusivity techniques to
detect cancer driver modules. Similarly, WeSME [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] evaluates the mutual exclusivity of gene mutations
to detect cancer driver genes. However, WeSME does not evaluate genes within the same pathway
but only considers mutually exclusive pairs of mutated genes as candidate cancer driver genes for
modularization.
      </p>
      <p>
        Personalized cancer driver gene identification. Personalized cancer driver gene identification methods
are based on gene regulatory networks to recognize driver genes. For example, PNC [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] identifies the
minimum gene set that covers all edges in a bipartite graph as cancer driver genes.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <sec id="sec-3-1">
        <title>3.1. Data Dimension Transformation</title>
        <p>Unlike traditional 1-D data, gene mutation data are random and low in frequency. Mutation
information is scattered randomly across long, redundant gene text sequences, which makes it difficult
for existing deep learning models to capture effectively. The long sequences also demand high model
complexity and performance, so conventional feature extraction struggles to support
classification. We therefore transform the dimension of somatic mutation data
based on the RGB three-channel image principle to make it suitable for training deep learning models.</p>
        <p>Figure 1 illustrates the construction process of the gene mutation map. First, the mutated genes
in each tumor cohort of the dataset are counted and sorted by chromosomal position (chromosome
1, 2, ..., 22, X, and Y). The mutated genes on the $j$-th chromosome of the $i$-th tumor cohort are arranged into
a matrix $G_{ij}$, where $0 \le j \le 23$ and $0 \le i \le 18$.</p>
        <p>Next, the mutated genes from different tumor cohorts but on the same chromosome are re-sorted. At this
stage, the set $G_{\cdot j} = G_{0j} \cup G_{1j} \cup \cdots \cup G_{18,j}$, which consists of the mutated genes from the different
tumor cohorts on the $j$-th chromosome, has length $l_j$. Assuming the constructed gene mutation map has a shape
of $N \times N$, the $j$-th chromosome occupies $r_j$ rows in the mutation map, where:</p>
        <disp-formula><label>(1)</label><tex-math><![CDATA[
r_j = \begin{cases} \lfloor l_j / N \rfloor + 1, & l_j \bmod N \neq 0 \\ l_j / N, & l_j \bmod N = 0 \end{cases}
]]></tex-math></disp-formula>
        <p>All the genes on the chromosomes then occupy $R$ rows, where $R$ is defined as:</p>
        <disp-formula><label>(2)</label><tex-math><![CDATA[
R = \sum_{j=0}^{23} r_j = \sum_{j=0}^{23} \begin{cases} \lfloor l_j / N \rfloor + 1, & l_j \bmod N \neq 0 \\ l_j / N, & l_j \bmod N = 0 \end{cases}, \qquad R \le N
]]></tex-math></disp-formula>
        <p>According to the above definition, we can build a gene mutation map of size $N \times N$ that includes
all mutated genes from the 19 tumor cohorts. In this article, single-nucleotide replacements, insertions, and deletions
are counted for each gene and assigned to the R, G, and B channels of the image,
respectively. To map mutation information onto an image, the method first selects the maximum
mutation value, $M_{\max}$, over all samples for the corresponding mutation type. $M_{\max}$ is
then used as one of the indicators for mapping mutation counts to grayscale values. For convenience,
assume that $m$ single nucleotides of mutated gene $g$ are replaced in the sample gene fragment
and that the corresponding value in the red channel is $v$. The relationship between $m$ and $v$ is:</p>
        <disp-formula><label>(3)</label><tex-math><![CDATA[
\frac{v}{255} = \frac{m}{M_{\max}}, \qquad v \le 255, \quad m \le M_{\max}
]]></tex-math></disp-formula>
        <p>[Figure 1: Construction process of the gene mutation map, with example sequences for the Replace, Insert, and Delete mutation types.]</p>
        <p>According to the above definition, the conversion from mutation counts to grayscale values in a
single-channel image can be accomplished. Finally, the single-channel images corresponding to each
mutation type in the samples are merged to obtain a two-dimensional mutation image.</p>
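        <p>As a minimal sketch of the construction rules in this subsection, the following Python snippet computes the number of rows each chromosome occupies and maps a mutation count to a channel value by linear scaling. The function names, the rounding to the nearest integer, and the toy gene counts are assumptions made for illustration and are not taken from the original implementation.</p>
<preformat><![CDATA[
```python
def rows_for_chromosome(gene_count, width):
    """Rows occupied by one chromosome: l // N if l is a multiple of N, else l // N + 1."""
    full_rows, remainder = divmod(gene_count, width)
    return full_rows if remainder == 0 else full_rows + 1


def total_rows(gene_counts, width):
    """Total rows over all chromosomes; the map is feasible when this is at most width."""
    return sum(rows_for_chromosome(count, width) for count in gene_counts)


def channel_value(mutation_count, max_count):
    """Scale a mutation count to 0..255 following v / 255 = m / M_max.

    Rounding to the nearest integer is an assumption; the text does not say
    how fractional values are discretised.
    """
    if max_count <= 0:
        return 0
    clipped = min(mutation_count, max_count)
    return round(255 * clipped / max_count)


def gene_pixel(replacements, insertions, deletions, max_counts):
    """Return the (R, G, B) value of one gene from its three mutation counts."""
    counts = (replacements, insertions, deletions)
    return tuple(channel_value(m, mx) for m, mx in zip(counts, max_counts))


# Toy example: three chromosomes with hypothetical gene counts, map width 56.
print(total_rows([57, 56, 10], 56))
# One gene with hypothetical counts and hypothetical per-channel maxima.
print(gene_pixel(2, 0, 10, (10, 5, 10)))
```
]]></preformat>
        <p>Stacking the three channel values of every gene, row by row, would then give one image per sample.</p>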
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Feature Selection Optimization</title>
        <p>
          The high noise and sparsity issues present in mutation profile data hinder the effective extraction of
feature information. Overlearning the noise in the data can increase the complexity of the model and lead
to overfitting. To avoid interference caused by high noise and sparsity in the data, this method narrows
down the scope of feature selection based on previous work [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. It employs clustering algorithms
to filter candidate driver genes and potential driver genes for feature selection optimization (FSO).
Passenger genes that have no significant relevance to tumor formation or development are eliminated.
This approach helps improve the overall training efficiency and classification performance of the model.
        </p>
        <p>Assume that the dimension of the gene mutation map after FSO is $N' \times N'$ and that $n'$ is the number
of selected feature genes for each tumor cohort. According to the principles of constructing the mutation
map, $n'$ and $N'$ must satisfy $24N' + 19n' - 24 \le N'^2$.</p>
        <p>This is because, in the dimension transformation process proposed in this paper, each row of the
mutation map holds mutation information from a single chromosome. Since mutated genes are not
distributed uniformly across the 24 human chromosomes, a chromosome with $N' + 1$ mutated
genes occupies two rows in the mutation map. To better
adapt to the training process of deep learning models, this paper sets $N'$ to 56, for which the largest integer
$n'$ satisfying the constraint is theoretically 95. This bound, however, assumes the worst case, in which
chromosomes spill over into an extra row (as when $N' + 1$ mutated genes occupy two rows), and in practice
the probability of this situation is almost zero. Therefore, in this paper,
$n'$ is set to 100. That is, for each tumor cohort, the selected mutated genes are the top 100
ranked genes according to the clustering prediction results. If the number of candidate driver genes
is less than 100, potential driver genes are selected as well. After FSO, the dimension of the gene
mutation matrix is reduced to $56 \times 56$, effectively addressing the high noise and sparsity issues of the
original dimension transformation method.</p>
        <p>[Figure 2: Overall architecture of M-MNet. Components shown: gene mutation maps, 3 × 3 CNN, inverted residual module (inverted residual structures A and B), MHSA module (MHSA layer), self-attention layer with relative position encodings $R_h$ and $R_w$ (content-content term $qk^{T}$, content-position term $qr^{T}$, softmax), and multi-head self-attention (linear projections of Q, K, and V, scaled dot-product attention, concat, linear).]</p>
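        <p>The bound on the number of feature genes per cohort can be checked with a few lines of arithmetic. The sketch below solves the constraint for the largest admissible integer and then counts the rows needed by one hypothetical distribution of genes. The function names, the assumption that the 19 cohorts contribute 1,900 distinct genes, and their near-even spread over chromosomes are illustrative assumptions only.</p>
<preformat><![CDATA[
```python
def max_genes_per_cohort(side, chromosomes=24, cohorts=19):
    """Largest integer n' with chromosomes*side + cohorts*n' - chromosomes <= side**2."""
    budget = side * side - chromosomes * side + chromosomes
    return budget // cohorts


def rows_needed(gene_counts, side):
    """Rows used when every chromosome starts on a new row (ceiling division)."""
    return sum(-(-count // side) for count in gene_counts)


side = 56
print(max_genes_per_cohort(side))  # worst-case bound on n'

# Hypothetical spread of 19 * 100 = 1900 distinct genes over 24 chromosomes:
# four chromosomes with 80 genes and twenty with 79.
counts = [80] * 4 + [79] * 20
used = rows_needed(counts, side)
print(sum(counts), used, used <= side)
```
]]></preformat>
        <p>For a map side of 56 the bound evaluates to 95, matching the value in the text, while this hypothetical spread of 1,900 genes needs only 48 of the 56 rows.</p>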
      </sec>
      <sec id="sec-3-3">
        <title>3.3. M-MNet Classification Network</title>
        <p>The proposed network model, M-MNet, is based on the inverted residual module and the multi-head
self-attention module. The overall architecture of M-MNet is illustrated in Figure 2.</p>
        <p>The network takes the gene mutation map as input. It starts with a 3 × 3 convolutional layer with
a stride of 2 to extract shallow features. Then, multiple inverted residual modules are employed to
extract high-dimensional features. The inverted residual module consists of two types of inverted
residual structures, A and B. In the final stage of the model, several multi-head self-attention modules
are introduced to capture global features. Each multi-head self-attention module consists of two 1 × 1
convolutional layers and a multi-head self-attention layer. Finally, the output is obtained through a
7 × 7 global average pooling layer.</p>
        <p>1) Inverted Residual Module: In the conventional residual structure, the dimension is first reduced
through a 1 × 1 convolutional layer and then increased again through another 1 × 1
convolutional layer. In our module, by contrast, the dimension is first increased through
a 1 × 1 convolutional layer, features are then extracted through a 3 × 3 convolutional layer, and
the dimension is finally reduced through another 1 × 1 convolutional layer. We call this structure the
“inverted residual block” because of its reversed order of dimension change. This approach effectively
utilizes the feature information from different channels at the same spatial location. Note that
while the conventional residual structure uses the ReLU activation function, the inverted
residual structure uses the ReLU6 activation function:</p>
        <disp-formula><label>(4)</label><tex-math><![CDATA[
y = \mathrm{ReLU6}(x) = \min(\mathrm{ReLU}(x), 6) = \min(\max(0, x), 6)
]]></tex-math></disp-formula>
        <p>The ReLU function can lose information when it applies a non-linear transformation to
low-dimensional features, and the output of the inverted residual structure is low-dimensional.
Therefore, in the dimension reduction step, a linear activation function is used instead of
the ReLU activation function. This choice helps preserve the information and prevent unnecessary loss.</p>
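        <p>To make the order of operations concrete, here is a minimal pure-Python sketch of ReLU6 and of the channel flow through an inverted residual structure at a single spatial location. The weights are toy values, the 3 × 3 convolution is omitted because it needs neighbouring pixels, and the skip connection is an assumption about the structure, so this should not be read as the implementation used in the experiments.</p>
<preformat><![CDATA[
```python
def relu6(x):
    """ReLU6(x) = min(max(0, x), 6)."""
    return min(max(0.0, x), 6.0)


def pointwise(vector, weights):
    """1x1 convolution at one spatial location: a linear map over channels."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def inverted_residual_pixel(vector, expand_weights, project_weights):
    """Expand channels, apply ReLU6, then project back with a linear activation."""
    expanded = [relu6(v) for v in pointwise(vector, expand_weights)]
    # The 3x3 spatial convolution would act here; it is omitted in this
    # single-pixel sketch because it needs the neighbouring pixels.
    projected = pointwise(expanded, project_weights)  # no ReLU after reduction
    # Assumed skip connection between the low-dimensional input and output.
    return [a + b for a, b in zip(vector, projected)]


# Toy weights: expand 2 channels to 4, then project 4 channels back to 2.
expand_weights = [[1, 0], [0, 1], [1, 1], [-1, -1]]
project_weights = [[1, 0, 0, 0], [0, 0, 0, 1]]
print([relu6(x) for x in (-1.0, 3.0, 8.0)])
print(inverted_residual_pixel([1.0, -2.0], expand_weights, project_weights))
```
]]></preformat>
        <p>Because the final projection is linear, negative values in the low-dimensional output are kept rather than clipped to zero.</p>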
        <p>
          2) Multi-head Self-attention Module: The core module of the Transformer network is
Multi-Head Self-Attention (MHSA), a specialized form of the multi-head attention mechanism. In
MHSA, the inputs K, V, and Q of the multi-head attention are all the hidden state matrix $H \in \mathbb{R}^{l \times d}$
of the same input sequence, where $d$ is the dimension of the hidden state and $l$ is the length
of the sequence. The Transformer network uses absolute positional encoding to enable the attention
mechanism to perceive positional information. However, recent studies have found that relative-distance-aware
positional encoding is more suitable for visual tasks, because attention then considers not only
the content of information but also the relative distances between features at different
positions, effectively linking information across objects with positional awareness. In this paper, the
two-dimensional relative positional encoding self-attention from reference [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] is used to implement
the multi-head self-attention mechanism.
        </p>
        <p>In the two-dimensional relative positional encoding attention module, all attention operations are
performed on a two-dimensional feature map. The relative distance positional encodings $R_h$ and $R_w$
correspond to the height and width of the feature map, respectively. The attention logits
are $qk^{T} + qr^{T}$, where $q$, $k$, and $r$ denote the query, key, and positional encoding (relative
distance encoding), respectively. In the architecture diagram, the ⊕ and ⊗ symbols denote element-wise addition and matrix
multiplication, respectively, and 1 × 1 denotes point-wise convolution. Note that the
Transformer network uses Layer Normalization (LN), while in our model the
normalization layer used for multi-head self-attention is Batch Normalization (BN). The MHSA block in
the Transformer includes an output projection, but the MHSA used in this paper does not. Additionally,
while the Transformer uses a single non-linear activation function in the Feed-forward Network (FFN), this
paper uses three non-linear activation functions.</p>
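        <p>The attention computation described in this subsection can be sketched for a single head in pure Python. The sketch forms the positional encoding of each location as the sum of a height encoding and a width encoding, adds the content-content and content-position logits, and applies softmax. All names, the toy tensors, and the absence of logit scaling are assumptions for illustration and do not reproduce the implementation used in the experiments.</p>
<preformat><![CDATA[
```python
import math


def matmul_transposed(a, b):
    """Return a @ b^T for two lists of row vectors."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b] for row_a in a]


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def relative_position_attention(q, k, v, r_h, r_w):
    """Single-head attention with logits q k^T + q r^T on a flattened H x W map.

    q, k, v hold one d-dimensional vector per location (H * W rows);
    r_h has H rows and r_w has W rows, each of dimension d.
    """
    # Position encoding of location (i, j) is r_h[i] + r_w[j], flattened row-major.
    r = [[a + b for a, b in zip(row_h, row_w)] for row_h in r_h for row_w in r_w]
    content_content = matmul_transposed(q, k)
    content_position = matmul_transposed(q, r)
    logits = [[c + p for c, p in zip(row_c, row_p)]
              for row_c, row_p in zip(content_content, content_position)]
    weights = [softmax(row) for row in logits]
    channels = range(len(v[0]))
    output = [[sum(w * vec[c] for w, vec in zip(row, v)) for c in channels]
              for row in weights]
    return weights, output


# Toy 1 x 2 feature map with d = 2 and hypothetical values.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 0.0], [0.0, 1.0]]
weights_plain, _ = relative_position_attention(q, k, v, [[0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]])
weights_pos, _ = relative_position_attention(q, k, v, [[0.0, 0.0]], [[0.0, 0.0], [5.0, 0.0]])
print(weights_plain[0])
print(weights_pos[0])
```
]]></preformat>
        <p>In the toy example, a non-zero width encoding shifts the first query's attention toward the second location, which is the effect the content-position term is meant to provide.</p>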
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Label Smoothing Regularization</title>
        <p>In classification algorithms, data is typically labeled using hard labels, which are represented in the
form of one-hot encoding.</p>
        <p>However, when the training data are insufficient to reflect the true distribution of the data, the network
model may suffer from overfitting, resulting in decreased generalization ability. To address this issue,
this article employs the Label Smoothing Regularization (LSR) technique to enhance the robustness of the
model.</p>
        <p>Label smoothing regularization mixes the one-hot label with a uniform distribution, which adds noise to the
target output and thereby constrains the model’s predictions. In the label smoothing regularization
strategy for multi-class tasks, the one-hot encoded label vector $y$ is replaced with a label vector $\hat{y}$
whose $i$-th component is:</p>
        <disp-formula><label>(5)</label><tex-math><![CDATA[
\hat{y}_i = \begin{cases} 1 - \varepsilon, & i = c \\ \dfrac{\varepsilon}{K - 1}, & i \neq c \end{cases}
]]></tex-math></disp-formula>
        <p>where $c$ is the index of the true class, $\varepsilon$ is a small hyperparameter typically set to 0.1, and $K$ is the number of classes. Label
smoothing regularization prevents the model from relying solely on the training data
distribution during training. By adding noise to the output, the label smoothing technique
constrains the model to some extent and mitigates overfitting. Additionally, label smoothing can make
the clusters of different classes more compact, increasing the inter-class distance and reducing the
intra-class distance. This, in turn, enhances the model’s generalization ability.</p>
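        <p>A short sketch shows what the smoothed target looks like for the 19-class setting of this paper. The function names and the example probabilities are hypothetical; the smoothing rule assigns one minus the smoothing parameter to the true class and spreads the remainder evenly over the other classes, as described in this subsection.</p>
<preformat><![CDATA[
```python
import math


def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Replace a one-hot label by 1 - epsilon on the true class and
    epsilon / (num_classes - 1) on every other class."""
    off_value = epsilon / (num_classes - 1)
    return [1.0 - epsilon if i == true_class else off_value
            for i in range(num_classes)]


def cross_entropy(target, probabilities):
    """Cross-entropy between a (possibly smoothed) target and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probabilities) if t > 0)


# 19 tumor cohorts, true class index 2 (hypothetical sample).
target = smooth_labels(2, 19)
print(target[2], target[0], sum(target))

# Hypothetical prediction: 0.82 on the true class, 0.01 on each other class.
prediction = [0.01] * 19
prediction[2] = 0.82
print(cross_entropy(target, prediction))
```
]]></preformat>
        <p>With smoothed targets, the loss still penalizes a prediction that puts almost all probability on one class, which is the constraint on over-confident outputs that the technique relies on.</p>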
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments and Discussion</title>
      <p>All experiments in this paper were conducted on Ubuntu 16.04 with Python 3.9 in an
Anaconda environment. The deep learning library used was TensorFlow. The GPU
was a GeForce RTX 2080 Ti with 12 GB of memory. During the training process, the relevant parameters
remained consistent. The optimizer employed was Momentum-SGD (Stochastic Gradient Descent with
Momentum). The number of epochs was set to 150, the initial learning rate was 0.01, the decay factor
was 0.5, and the momentum parameter was set to 0.9. Early stopping was implemented to prevent a
decrease in validation accuracy as the number of iterations increased. The evaluation metrics for model
performance in this study were accuracy, precision, recall, and F1 score.</p>
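      <p>The four evaluation metrics can be computed from true and predicted cohort labels as in the following sketch. Macro-averaging over classes is an assumption, since the text does not state which averaging scheme was used, and the function name and toy labels are hypothetical.</p>
<preformat><![CDATA[
```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    count = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / count,
        "recall": sum(recalls) / count,
        "f1": sum(f1_scores) / count,
    }


# Toy labels for two cohorts (hypothetical predictions).
y_true = ["OV", "OV", "LGG", "LGG"]
y_pred = ["OV", "LGG", "LGG", "LGG"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```
]]></preformat>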
      <sec id="sec-4-1">
        <title>4.1. Dataset</title>
        <p>The experimental dataset was downloaded from the TCGA (The Cancer Genome Atlas), which consists
of somatic mutation data from 19 tumor cohorts. The data format is in MAF (Mutation Annotation
Format) files, which include various mutation types such as silent, missense, nonsense, splice site,
and frameshift insertions/deletions. After obtaining the somatic mutation dataset, this study initially
converted the MAF files into TSV (Tab-Separated Values) format. Subsequently, filtering was applied to
each cohort dataset to exclude synonymous mutations and minimize false positives. This study focused
on single nucleotide variations and excluded structural variations and insertions/deletions from the
somatic mutation data.</p>
        <p>The final dataset contains a total of 6,906 samples, 236,245 gene elements, and 1,678,190 single
nucleotide mutation positions. The gene list from the Cancer Gene Census (CGC) was retrieved from
the COSMIC (Catalogue of Somatic Mutations in Cancer) website in April 2022. The genomic coordinates
for coding genes were obtained from the ENCODE website, using Gencode version 19.</p>
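        <p>The filtering step described in this subsection can be sketched with the Python standard library. The sketch reads MAF-style tab-separated records, drops comment lines, keeps single-nucleotide records that are not silent, and writes the result as TSV. The column names follow the MAF format, but the records are toy data rather than TCGA data, and the function names are hypothetical.</p>
<preformat><![CDATA[
```python
import csv
import io


def filter_maf(lines):
    """Keep single-nucleotide, non-synonymous records from MAF-formatted lines."""
    data_lines = (line for line in lines if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    return [
        row for row in reader
        if row["Variant_Type"] == "SNP" and row["Variant_Classification"] != "Silent"
    ]


def to_tsv(rows, columns):
    """Serialise the selected columns of the kept records as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Toy records (not real TCGA data) using standard MAF column names.
sample = [
    "#version 2.4\n",
    "Hugo_Symbol\tVariant_Classification\tVariant_Type\n",
    "VHL\tMissense_Mutation\tSNP\n",
    "MET\tSilent\tSNP\n",
    "TP53\tFrame_Shift_Del\tDEL\n",
    "TP53\tNonsense_Mutation\tSNP\n",
]
kept = filter_maf(sample)
print(to_tsv(kept, ["Hugo_Symbol", "Variant_Classification"]))
```
]]></preformat>
        <p>In this toy input, the silent record and the frameshift deletion are removed, leaving the missense and nonsense single-nucleotide records.</p>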
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Ablation Study</title>
        <p>To validate and interpret the effectiveness and necessity of the proposed feature selection optimization
algorithm, we conducted classification tasks using three different network models: VGG-16, Inception
ResNet-V2, and MobileNet-V2. The input data for these models were gene mutation maps obtained by
dimensionality transformation of somatic mutation data. The FSO algorithm was employed to perform
feature selection on the two-dimensional gene mutation maps.</p>
        <p>Table 1 compares the classification performance of the different network models when gene mutation
map construction is used alone and when it is combined with the feature selection optimization
algorithm. Compared to
VGG-16, FSO+VGG-16 achieved an improvement of 6.11 percentage points in accuracy, 5.27 percentage
points in precision, and 5.69 percentage points in F1 score. FSO+Inception ResNet-V2 improved by about
1 percentage point in all metrics compared to the original model, and FSO+MobileNet-V2
improved by about 2 percentage points in all metrics compared to MobileNet-V2.</p>
        <p>[Figure 3 panels: (a) VGG-16; (b) FSO+VGG-16.]</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Feature Visualization</title>
        <p>To explain more clearly how the feature selection optimization module improves
overall model performance, this subsection uses t-SNE (t-distributed Stochastic
Neighbor Embedding) to visualize the features extracted by the model. t-SNE projects the features that
enter the final classification layer onto a two-dimensional plane for a total
of 690 gene mutation map samples, which reflects the model’s classification performance to some extent.</p>
        <p>From Figure 3, it can be observed that after undergoing feature selection optimization, several tumor
cohorts such as SARC, OV, and GBM are no longer clustered together with cohorts like LGG, PRAD,
and THCA. The classification performance has improved, but there are still some individual samples
from cohorts like OV, LIHC, and BRCA that cannot be correctly classified.</p>
        <p>In addition, this study compares the proposed M-MNet model with the previously mentioned
best-performing MobileNet-V2 network through visualization. Figure 4 illustrates the classification
results of the MobileNet-V2 network and the M-MNet network proposed in this paper for the 690 test
samples from the 19 tumor cohorts.</p>
        <p>From the figure, it can be observed that the MobileNet-V2 network misclassifies certain samples from
tumor cohorts such as PRAD, LIHC, SARC, and COAD into the OV cohort, and it also misclassifies some
samples from the CESC and UCEC cohorts into the SKCM cohort. In the M-MNet network,
these misclassifications are reduced, indicating the effectiveness and robustness of the proposed network model.</p>
        <p>[Figure 4 panels: (a) MobileNet-V2; (b) M-MNet.]</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Classification performance</title>
        <p>We trained the M-MNet network model using gene mutation maps before and after feature selection
optimization. We also compared the results with the previous models, and the experimental results are
shown in Table 2.</p>
        <p>Comparing the experimental results in the fifth and seventh rows of Table 2, we can see that the
MobileNet-V2 network achieved an accuracy of 91.18%, precision of 91.42%, recall of 91.18%, and F1
score of 91.30%. Compared to the MobileNet-V2 network model with only inverted residual blocks, the
M-MNet network showed improvements in all metrics, with an average increase of 2 percentage points.
Furthermore, comparing the results in the sixth and eighth rows, it can be observed that FSO+M-MNet
outperformed FSO+MobileNet-V2 with an average increase of 1 percentage point, including a 1.28
percentage point improvement in accuracy. This demonstrates the effectiveness and necessity of combining the
multi-head self-attention module with the inverted residual module. FSO+M-MNet achieved improvements
of 1.43, 0.96, 1.43, and 1.19 percentage points in accuracy, precision, recall, and F1 score, respectively,
compared to M-MNet. Compared with its effect on the VGG-16 model, the FSO module had a smaller impact on the
overall performance of the M-MNet model. The feature selection optimization algorithm therefore
remains effective for the proposed M-MNet network, but the improvement it brings
becomes smaller when it is applied to models with better
original performance. The gene mutation map construction, feature selection optimization algorithm,
and M-MNet network proposed in this paper all contributed to the improvement in model performance,
demonstrating the effectiveness and robustness of this approach in cancer classification tasks.</p>
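        <p>As a minimal sketch of how the four metrics discussed above can be derived from a confusion matrix,
the following Python snippet assumes support-weighted averaging over cohorts. The averaging scheme is an
assumption, not something stated in this section; under it, recall equals accuracy by construction, which
would be consistent with the identical accuracy and recall values reported for MobileNet-V2. The function
name weighted_metrics and the toy matrix are hypothetical and are not taken from the experiments.</p>
        <preformat>
```python
import numpy as np


def weighted_metrics(confusion):
    """Return accuracy and support-weighted precision, recall, and F1.

    confusion[i, j] counts samples of true class i predicted as class j.
    Support-weighted averaging is an assumption made for this sketch.
    """
    confusion = np.asarray(confusion, dtype=float)
    true_pos = np.diag(confusion)
    support = confusion.sum(axis=1)    # samples per true class
    predicted = confusion.sum(axis=0)  # samples per predicted class

    # Divide by at least 1 so empty rows or columns yield 0 instead of NaN.
    precision = true_pos / np.maximum(predicted, 1.0)
    recall = true_pos / np.maximum(support, 1.0)
    denom = precision + recall
    f1 = 2.0 * precision * recall / np.where(denom == 0.0, 1.0, denom)

    weights = support / support.sum()
    return {
        "accuracy": float(true_pos.sum() / confusion.sum()),
        "precision": float(np.sum(weights * precision)),
        "recall": float(np.sum(weights * recall)),
        "f1": float(np.sum(weights * f1)),
    }


# Hypothetical 3-class confusion matrix, not data from the paper.
toy = [[8, 1, 1],
       [0, 9, 1],
       [2, 0, 8]]
scores = weighted_metrics(toy)
print(scores)
```
        </preformat>
        <p>With macro averaging instead of support weighting, recall and accuracy would generally differ
whenever the cohorts have unequal numbers of test samples.</p>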
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>This paper proposes a feature extraction method that combines the construction of gene mutation maps
based on the RGB three-channel principle with feature selection optimization. To overcome the high noise
and sparsity of the constructed gene mutation maps, an FSO algorithm is applied for further feature
extraction. To obtain a classification model with better performance and stronger robustness, this
paper also proposes the M-MNet classification model, which is based on the inverted residual module and
multi-head self-attention module and is trained using label smoothing regularization. The combined
feature extraction method effectively solves the problem of high-dimensional mutation data being unsuitable
for existing convolutional neural networks. Experimental results show that the overall performance of
M-MNet is better than that of the other classification models compared in this paper.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Acknowledgments</title>
      <p>Not Applicable.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Tomao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Papa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rossi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Strudel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Vici</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tomao</surname>
          </string-name>
          ,
          <article-title>Emerging role of cancer stem cells in the biology and treatment of ovarian cancer: basic knowledge and therapeutic possibilities for an innovative approach</article-title>
          ,
          <source>Journal of Experimental &amp; Clinical Cancer Research</source>
          <volume>32</volume>
          (
          <year>2013</year>
          )
          <fpage>48</fpage>
          -
          <lpage>48</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>W. M.</given-names>
            <surname>Linehan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Ricketts</surname>
          </string-name>
          ,
          <article-title>The metabolic basis of kidney cancer</article-title>
          ,
          <source>Seminars in Cancer Biology</source>
          <volume>23</volume>
          (
          <year>2012</year>
          )
          <fpage>46</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Meyerson</surname>
          </string-name>
          , S. Gabriel, G. Getz,
          <article-title>Advances in understanding cancer genomes through second-generation sequencing</article-title>
          ,
          <source>Nature Reviews Genetics</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Sirvan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Peronne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Deepak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Salendra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. F.</given-names>
            <surname>Thomas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kishore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vinay</surname>
          </string-name>
          ,
          <article-title>Sysmut: decoding the functional significance of rare somatic mutations in cancer</article-title>
          ,
          <source>Briefings in Bioinformatics</source>
          (
          <year>2022</year>
          )
          4
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Candia</surname>
          </string-name>
          , E. Bayarsaikhan,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tandon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Budhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>The genomic landscape of mongolian hepatocellular carcinoma</article-title>
          ,
          <source>Nature Communications</source>
          <volume>11</volume>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gonzalez-Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Lopez-Bigas</surname>
          </string-name>
          ,
          <article-title>Functional impact bias reveals cancer drivers</article-title>
          ,
          <source>Nucleic Acids Research</source>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L.</given-names>
            <surname>Mularoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sabarinathan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Deu-Pons</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gonzalez-Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>López-Bigas</surname>
          </string-name>
          ,
          <article-title>Oncodrivefml: a general framework to identify coding and non-coding regions with cancer driver mutations</article-title>
          ,
          <source>Genome Biology</source>
          <volume>17</volume>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>H.</given-names>
            <surname>Yi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Juze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Xinyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wei-Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Shu-Hsuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liyuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yaning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Qingbiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pengyuan</surname>
          </string-name>
          ,
          <article-title>Driverml: a machine learning algorithm for identifying driver genes in cancer sequencing studies</article-title>
          ,
          <source>Nucleic Acids Research</source>
          (
          <year>2019</year>
          )
          <fpage>e45</fpage>
          -
          <lpage>e45</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          , J. Ma,
          <article-title>Quantification of egfr mutations in primary and metastatic tumors in non-small cell lung cancer</article-title>
          ,
          <source>Journal of Experimental &amp; Clinical Cancer Research</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , H. Liu,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Landscape of homologous recombination-related (hrr) genes mutations in colon cancer</article-title>
          ,
          <source>Journal of Clinical Oncology</source>
          <volume>39</volume>
          (
          <year>2021</year>
          )
          <fpage>e15525</fpage>
          -
          <lpage>e15525</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S. W. K.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. J.</given-names>
            <surname>Rouhani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. F.</given-names>
            <surname>Brunner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Brzozowska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Aitken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Abascal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Moore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Nikitopoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chappell</surname>
          </string-name>
          ,
          <article-title>Convergent somatic mutations in metabolism genes in chronic liver disease</article-title>
          ,
          <source>Nature</source>
          <volume>598</volume>
          (
          <year>2021</year>
          )
          <fpage>473</fpage>
          -
          <lpage>478</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>W.</given-names>
            <surname>Poole</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Leinonen</surname>
          </string-name>
          , I. Shmulevich,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Knijnenburg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bernard</surname>
          </string-name>
          ,
          <article-title>Multiscale mutation clustering algorithm identifies pan-cancer mutational clusters associated with pathway-level changes in gene expression</article-title>
          ,
          <source>PLOS Computational Biology</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C.</given-names>
            <surname>Tokheim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Niknafs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Gygax</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Ryan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Masica</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Karchin</surname>
          </string-name>
          ,
          <article-title>Exome-scale discovery of hotspot mutation regions in human cancer using 3d protein structure</article-title>
          ,
          <source>Cancer Research</source>
          (
          <year>2016</year>
          )
          <fpage>3719</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Pacheco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Reikowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Stettner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Qiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bouvier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bertram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Faisal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Brummel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Libuda</surname>
          </string-name>
          ,
          <article-title>Identification of the reversible skin layer on co</article-title>
          ,
          <source>ACS Catalysis</source>
          <volume>12</volume>
          (
          <year>2022</year>
          )
          <fpage>3256</fpage>
          -
          <lpage>3268</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M. F. B.</given-names>
            <surname>Asad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. N. A.</given-names>
            <surname>Hallak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sukari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Baca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nagasaka</surname>
          </string-name>
          ,
          <article-title>Prognostic impact of xpo1 mutations in metastatic non-small cell lung cancer (nsclc)</article-title>
          ,
          <source>Journal of Clinical Oncology</source>
          <volume>39</volume>
          (
          <year>2021</year>
          )
          <fpage>e20533</fpage>
          -
          <lpage>e20533</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M. D. M.</given-names>
            <surname>Leiserson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-T.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Vandin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Raphael</surname>
          </string-name>
          ,
          <article-title>Comet: a statistical approach to identify combinations of mutually exclusive alterations in cancer</article-title>
          ,
          <source>Genome Biology</source>
          <volume>16</volume>
          (
          <year>2015</year>
          )
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Y. A.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Madan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. M.</given-names>
            <surname>Przytycka</surname>
          </string-name>
          ,
          <article-title>Wesme: Uncovering mutual exclusivity of cancer drivers and beyond</article-title>
          , arXiv e-prints (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Stevens</surname>
          </string-name>
          ,
          <article-title>Deep learning in cancer and infectious disease: Novel driver problems for future hpc architecture</article-title>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>C.</given-names>
            <surname>Quan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tie</surname>
          </string-name>
          ,
          <article-title>Lrt-cluster: A new clustering algorithm based on likelihood ratio test to identify driving genes</article-title>
          ,
          <source>Interdisciplinary Sciences: Computational Life Sciences</source>
          <volume>15</volume>
          (
          <year>2023</year>
          )
          <fpage>217</fpage>
          -
          <lpage>230</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>A.</given-names>
            <surname>Srinivas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.-Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shlens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Abbeel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <article-title>Bottleneck transformers for visual recognition</article-title>
          ,
          <source>2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)</source>
          (
          <year>2021</year>
          )
          <fpage>16514</fpage>
          -
          <lpage>16524</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <article-title>A novel image-to-knowledge inference approach for automatically diagnosing tumors</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>229</volume>
          (
          <year>2023</year>
          )
          <fpage>120450</fpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S0957417423009521. doi:10.1016/j.eswa.2023.120450.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>C.</given-names>
            <surname>Szegedy</surname>
          </string-name>
          , W. Liu,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sermanet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Reed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Anguelov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Erhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vanhoucke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rabinovich</surname>
          </string-name>
          ,
          <article-title>Going deeper with convolutions</article-title>
          ,
          <source>in: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          . doi:10.1109/CVPR.2015.7298594.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>D.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          , L. Liu,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>Fusion of human cognitive knowledge and machine inference for breast cancer detection</article-title>
          ,
          <source>in: 2023 International Conference on Advanced Robotics and Mechatronics (ICARM)</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>179</fpage>
          -
          <lpage>184</lpage>
          . doi:10.1109/ICARM58088.2023.10218759.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>M.</given-names>
            <surname>Schlichtkrull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kipf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bloem</surname>
          </string-name>
          , R. van den Berg, I. Titov,
          <string-name>
            <given-names>M.</given-names>
            <surname>Welling</surname>
          </string-name>
          ,
          <article-title>Modeling relational data with graph convolutional networks</article-title>
          ,
          <source>in: Extended Semantic Web Conference</source>
          ,
          <year>2017</year>
          . URL: https://api.semanticscholar.org/CorpusID:5458500.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>A.</given-names>
            <surname>Dosovitskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Beyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kolesnikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Weissenborn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Unterthiner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dehghani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Minderer</surname>
          </string-name>
          , G. Heigold,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gelly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Houlsby</surname>
          </string-name>
          ,
          <article-title>An image is worth 16x16 words: Transformers for image recognition at scale</article-title>
          , arXiv abs/2010.11929 (
          <year>2020</year>
          ). URL: https://api.semanticscholar.org/CorpusID:225039882.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>D.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>Fully automated interpretable breast ultrasound assisted diagnosis system</article-title>
          ,
          <source>in: 2023 International Conference on Advanced Robotics and Mechatronics (ICARM)</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>173</fpage>
          -
          <lpage>178</lpage>
          . doi:10.1109/ICARM58088.2023.10218807.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>