<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Abnormal Traffic Detection Based on Character-Level Convolutional Neural Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Runjie Liu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bingjie Guo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chaoqiang Ji</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lei Shi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Supercomputing Center in Zhengzhou, Zhengzhou University</institution>
          ,
          <addr-line>Zhengzhou 450001, Henan</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <fpage>177</fpage>
      <lpage>186</lpage>
      <abstract>
<p>The common preprocessing approach for abnormal traffic detection is feature-level preprocessing, which is relatively complex and requires a priori knowledge of the features (their number, meaning, characteristics, etc.). This paper proposes an abnormal traffic detection method based on character-level preprocessing. Starting from the fine-grained character level, network traffic data is treated as a sequence of characters; the sequence is encoded as vectors using the four character encodings of the character-level convolutional neural network, and the vectors of each record are aggregated into a matrix and fed into an improved convolutional neural network. Experimental results show that the ASCII encoding model performs best. Compared with the Sparse encoding model, verification time is reduced by 1.07 s and accuracy is increased by 3.21%; compared with the feature-level preprocessing model, preprocessing is simplified, no prior knowledge of the data is required, and accuracy is increased by 4.49%.</p>
      </abstract>
      <kwd-group>
<kwd>CharCNN</kwd>
        <kwd>Character encoding</kwd>
        <kwd>Abnormal traffic detection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>2. Related Work</title>
      <p>
Deep learning is becoming increasingly popular in academic research due to its strong autonomous
learning capability and has been applied to anomalous traffic detection. However, anomalous traffic
detection data include both continuous and discrete values, and the orders of magnitude of different
feature attributes differ greatly. To construct a reasonable dataset, it is usually necessary to
preprocess it. The preprocessing stage usually contains two parts: character feature numericalization
and numerical feature normalization. Character feature numericalization maps the data using one-hot
encoding or other encoding methods; numerical feature normalization scales data values into the
range [0,1] using Min-Max, Z-score, or other normalization methods to eliminate the influence of
differing magnitudes among feature attributes [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
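As an illustration of these two preprocessing steps, the following sketch (with hypothetical feature values, not the paper's code) numericalizes a character feature with one-hot encoding and normalizes a numeric column with Min-Max and Z-score:

```python
import numpy as np

# Hypothetical values for NSL-KDD's 'protocol_type' character feature.
protocols = ["tcp", "udp", "icmp"]

def one_hot(value, vocabulary):
    """Character feature numericalization: map a value to a one-hot vector."""
    vec = np.zeros(len(vocabulary))
    vec[vocabulary.index(value)] = 1.0
    return vec

def min_max(column):
    """Min-Max normalization: scale a numeric column into [0, 1]."""
    lo, hi = column.min(), column.max()
    return (column - lo) / (hi - lo) if hi > lo else np.zeros_like(column)

def z_score(column):
    """Z-score normalization: zero mean, unit variance."""
    return (column - column.mean()) / column.std()

durations = np.array([0.0, 2.0, 8.0, 10.0])
print(one_hot("udp", protocols))  # one-hot for "udp": [0, 1, 0]
print(min_max(durations))         # scaled to [0, 0.2, 0.8, 1]
```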
      <p>
        Deep-learning-based abnormal traffic detection can be divided, according to its overall
preprocessing process, into feature level, hybrid level, and character level. Most existing methods
use feature-level preprocessing. Yin et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] proposed a deep learning
approach for intrusion detection based on recurrent neural networks (RNN-IDS), where the data were
put into RNNs after feature-level preprocessing of the dataset, and the performances of the model in
binary and multi-classification were investigated, as well as the impact of the number of neurons and
different learning rates on the performances of the proposed model. Anju Krishnan et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] proposed a
one-dimensional convolutional neural network-based intrusion detection (1DCNN-IDS), where data
after numerical and normalized feature-level preprocessing of character features were fed into a
1DCNN, and experimental results showed superiority over traditional machine learning models. Based on
the advantages of CNN in the image domain, Xiao et al. [10] transformed the data after feature-level
preprocessing and dimensionality reduction using PCA or AE into grayscale images input to CNN for
feature extraction and classification, and the experimental results were better than traditional machine
learning algorithms. However, feature-level preprocessing can only be performed with a priori
knowledge of the features in the dataset (their number, meaning, characteristics, etc.), and in a real
network environment, where character features can take many more possible values, the
numericalization of character features will be incomplete and generalization suffers.
      </p>
      <p>In terms of hybrid-level preprocessing, Min et al. [11] combined feature-level and word-level
processing using two modern NLP techniques, word embedding and Text-CNN: salient features were
extracted from the payload, and a random forest was then run on the combination of statistical
features and payload features for classification. This method still requires prior knowledge of the
statistical features in the dataset, which likewise complicates preprocessing.</p>
      <p>In terms of character-level preprocessing, Lin et al. [12] proposed an intrusion detection method
based on character-level convolutional networks (CharCNN). The traffic data is treated as a special
text sequence: each character is first transformed into a character vector using one-hot encoding, the
vectors are assembled into a matrix, and the matrix is fed directly into the convolutional neural
network model. The final experimental results show the highest accuracy compared with traditional
machine learning models. This process simplifies preprocessing by eliminating the need for prior
knowledge of the dataset, but the character vectors formed using one-hot encoding increase the
dimensionality, which makes the input very sparse.</p>
      <p>Given the advantages of character-level preprocessing, CharCNN itself has been studied further.
Karpau et al. [13] introduced four encoding methods for CharCNN in NLP: Sparse, Sparse Group,
ASCII, and ASCII Group, of which ASCII had the fastest convergence speed. Prusa et al. [14] used a
UTF-8 character embedding, encoding each character as a vector of its UTF-8 binary value.
Experiments showed that training was faster, performance was higher, and learning from
character-level text was easier.</p>
      <p>Therefore, this paper proposes an anomalous traffic detection model based on a character-level
convolutional neural network (ATD-CharCNN). Applying the character-level convolutional neural
network from the field of NLP to anomalous traffic detection, the network traffic data is treated as a
special kind of character text at a more fine-grained character level. The main work of this paper is as
follows:
(1) Four character encodings for the character-level convolutional neural network are introduced for
character-level preprocessing of network traffic data; this method simplifies preprocessing and
requires no prior knowledge of the dataset's features.</p>
      <p>(2) Experiments are conducted on the public dataset NSL-KDD [15], and then the effectiveness of the
four character encodings in abnormal traffic detection is verified and compared with the detection
performance of the model with feature-level preprocessing.</p>
    </sec>
    <sec id="sec-2">
<title>3. Abnormal Traffic Detection Based on CharCNN</title>
      <p>This paper does not use the mainstream preprocessing method but instead introduces the
character-level convolutional neural network used in natural language processing into abnormal
traffic detection, which simplifies the preprocessing steps and makes the model more generalizable.
The overall process, shown in Figure 1, mainly includes three parts: (i) data preprocessing: encoding
the traffic data character by character and assembling the vectors into a matrix; (ii) construction of the
CNN model; (iii) binary and multi-class training and testing on the classic NSL-KDD dataset.</p>
    </sec>
    <sec id="sec-3">
<title>3.1. The binary classification and multi-classification of NSL-KDD</title>
      <p>The distribution of the NSL-KDD dataset is shown in Table 1, and the network traffic in the dataset
can be divided into two categories: Normal traffic and Abnormal traffic. NSL-KDD can also be divided
into multi-classification in more detail: Normal traffic, Dos attack traffic, Probe attack traffic, R2L
attack traffic, and U2R attack traffic.</p>
      <p>From NSL-KDDTrain+, we can see that the amounts of normal and abnormal data in the binary
classification are fairly balanced, so the detection effect is higher when the model performs binary
classification; however, only the presence of abnormal traffic can be detected, not which kind of
abnormal traffic it is. In multi-classification, the abnormal traffic is classified more specifically, but
the distribution of the five traffic classes in the training set is unbalanced overall; although specific
categories of abnormal traffic can be detected, the detection rate is not high.</p>
    </sec>
    <sec id="sec-4">
<title>3.2. Character-level preprocessing</title>
      <p>For the neural network to better identify these features, the character features need to be
numericalized before model training, and normalization is required to eliminate singular values. For
example, the NSL-KDD dataset has three character-type discrete features, 'protocol_type', 'service',
and 'flag', which are first quantized and encoded into numerical representations. With this kind of
preprocessing, the quantification and normalization of features can only be carried out with prior
knowledge of the dataset's features, including their number, meaning, and characteristics.</p>
      <p>The character-level preprocessing method in this paper does not require such prior knowledge and
only needs to process each character in the data. The model preprocesses traffic feature sequences to
a fixed character length l0. If a sequence has fewer than l0 characters, spaces are added to the head of
the feature sequence to bring it to length l0; if a sequence has more than l0 characters, the excess
characters at the head of the sequence are deleted, since the head usually carries the basic features of
the traffic, which correlate only weakly with attacks [12].</p>
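A minimal sketch of this length adjustment, assuming (per the description above) that both padding and truncation act on the head of the sequence; `fit_to_length` is a hypothetical helper, not from the paper:

```python
L0 = 177  # fixed character length l0 used in this paper

def fit_to_length(sequence: str, l0: int = L0) -> str:
    """Pad with spaces at the head, or drop excess head characters,
    so that every traffic feature sequence has exactly l0 characters."""
    if len(sequence) < l0:
        return " " * (l0 - len(sequence)) + sequence  # pad the header
    return sequence[len(sequence) - l0:]              # keep the last l0 chars

print(len(fit_to_length("0,tcp,http,SF,181,5450")))  # 177
```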
      <p>A character table was compiled from all characters that may appear in the dataset. Table 2
contains the 26 letters in both upper and lower case, the ten digits, and the 3 punctuation marks that
appear in the dataset. The characters are encoded and mapped in four ways, as follows:</p>
      <p>Sparse: Sparse encoding is essentially one-hot encoding. All characters appearing in the dataset
are first counted and numbered, and each character is encoded as a vector that is all zeros except for a
1 at that character's index. The dataset uses the 65 characters of Table 2; after uppercase letters are
converted to lowercase to reduce sparsity, the remaining 39 characters are each encoded as a 1×39
vector. For example, the Sparse encoding of 'a' is a vector with a leading 1 followed by 38 zeros:
[1,0,0,0,0,0,...,0,0]. Any character outside the character table is encoded as the all-zero vector of 39
zeros: [0,0,0,0,0,0,...,0,0].</p>
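A sketch of Sparse encoding under an assumed character-table ordering (Table 2's exact order and its three punctuation marks are not reproduced here, so the letters-digits-punctuation layout below is a guess):

```python
import string
import numpy as np

# Assumed ordering of the 39-character table: 26 lowercase letters,
# 10 digits, then 3 punctuation marks (the marks themselves are a guess).
CHAR_TABLE = string.ascii_lowercase + string.digits + ",._"

def sparse_encode(ch: str) -> np.ndarray:
    """One-hot 1x39 vector; uppercase folds to lowercase; characters
    outside the table map to the all-zero vector."""
    vec = np.zeros(len(CHAR_TABLE))
    idx = CHAR_TABLE.find(ch.lower())
    if idx >= 0:
        vec[idx] = 1.0
    return vec

print(sparse_encode("a")[:5])  # a 1 at index 0, zeros elsewhere
```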
      <p>
        Sparse Group: This encoding adds 4 bits to Sparse to identify the character type. The character
types are lowercase letters, uppercase letters, digits, and punctuation. For example, the Sparse Group
encoding of the capital letter A is [1,0,...,0,0,1,1,0,0]: the Sparse encoding of 'a', [1,0,0,0,0,...,0,0],
with [1,1,0,0] appended to mark it as an uppercase letter. This encoding provides more information
about the characters and enhances the model's awareness [13].
      </p>
      <p>
        ASCII: The first two encodings contain many zeros and have high dimensionality, making the
input very sparse. ASCII encoding is a non-sparse, eight-bit binary encoding based on the ASCII
character set. Each character in the character table is encoded as a 1×8 vector according to its binary
ASCII code value; for example, the ASCII encoding of A is [0,1,0,0,0,0,0,1].
      </p>
      <p>
        ASCII Group: Like Sparse Group, this encoding appends 4 bits identifying the character type,
giving a 1×12 vector. For example, the ASCII Group code of A is [0,1,1,0,0,0,0,1,1,1,0,0].
      </p>
      <p>
        Encoding examples for the four schemes (Table 3):
      </p>
      <p>
        Sparse:
'a': [1,0,0,0,0,...,0,0];
'A': not encoded separately (uppercase is folded to lowercase);
',': [0,0,0,0,...,1,...,0];
'1': [0,0,0,0,...,1,...,0].
      </p>
      <p>
        Sparse Group:
'a': [1,0,...,0,0,1,0,0,0];
'A': [1,0,...,0,0,1,1,0,0];
',': [0,...,1,...,0,0,0,1,0];
'1': [0,...,1,...,0,0,0,0,1].
      </p>
      <p>
        ASCII:
'a': [0,1,1,0,0,0,0,1];
'A': [0,1,0,0,0,0,0,1];
',': [0,0,1,0,1,1,0,0];
'1': [0,0,1,1,0,0,0,1].
      </p>
      <p>
        ASCII Group:
'a': [0,1,1,0,0,0,0,1,1,0,0,0];
'A': [0,1,1,0,0,0,0,1,1,1,0,0];
',': [0,0,1,0,1,1,0,0,0,0,0,1];
'1': [0,0,1,1,0,0,0,1,0,0,1,0].
      </p>
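The ASCII and ASCII Group schemes can be sketched as follows; the order of the four type bits is inferred from the ASCII Group examples, and uppercase letters are folded to lowercase before the 8-bit encoding, as the worked example for 'A' suggests:

```python
def ascii_encode(ch: str):
    """1x8 vector: the 8-bit binary value of the character's ASCII code."""
    return [int(b) for b in format(ord(ch), "08b")]

def type_bits(ch: str):
    """4 bits marking letter, uppercase, digit, punctuation (order assumed)."""
    return [int(ch.isalpha()), int(ch.isupper()),
            int(ch.isdigit()), int(not ch.isalnum())]

def ascii_group_encode(ch: str):
    """1x12 vector: uppercase reuses the lowercase ASCII bits and is
    marked via the type bits, as in the worked example for 'A'."""
    return ascii_encode(ch.lower()) + type_bits(ch)

print(ascii_encode("A"))        # [0, 1, 0, 0, 0, 0, 0, 1]
print(ascii_group_encode("A"))  # [0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0]
```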
      <p>Taking ASCII encoding as an example, the traffic feature sequence of one traffic record is finally
converted into a matrix of size l0×8. This matrix is the input used to train and test the convolutional
neural network. The character length l0 is set to 177, which is not only greater than the full character
length of any traffic feature sequence in the experimental NSL-KDD dataset but also greater than the
character length of most real network traffic feature sequences [12].</p>
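Putting the pieces together, one record can be turned into the l0×8 input matrix roughly like this (a sketch, with head-padding and head-truncation as described above):

```python
import numpy as np

L0 = 177  # fixed sequence length l0

def ascii_bits(ch: str):
    """8-bit binary ASCII vector of one character."""
    return [int(b) for b in format(ord(ch), "08b")]

def to_matrix(record: str) -> np.ndarray:
    """Encode one traffic record as an l0 x 8 ASCII matrix for the CNN."""
    padded = record.rjust(L0)[-L0:]  # space-pad the head, trim to l0
    return np.array([ascii_bits(c) for c in padded])

m = to_matrix("0,tcp,http,SF")
print(m.shape)  # (177, 8)
```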
      <p>Figure 2 shows the process of ASCII-encoding one piece of traffic data and assembling the result
into a matrix. Each block combination represents the ASCII-encoded vector of one character. For
example, the eight-bit ASCII code values of "t", "c", and "p" are {01110100}, {01100011}, and
{01110000}, respectively. White squares represent 0 and black squares represent 1. All the
ASCII-encoded vectors of a flow's feature sequence are aggregated into a feature matrix that serves as
the CNN input.</p>
      <p>By performing character-level preprocessing on traffic data, preprocessing no longer needs to
consider the meaning and content of features, and the process is simplified. In addition, in a real
network, if a new value of a feature is encountered, especially for a character feature, the
preprocessing module of the model does not need to be readjusted.</p>
    </sec>
    <sec id="sec-5">
      <title>3.3. Structure of ATD-CharCNN model</title>
      <p>Based on the CharCNN-IDS model [12], this paper improves the preprocessing method and neural
network structure, strengthening the model's feature extraction and discrimination abilities. In the
CharCNN structure proposed by Zhang et al. [16], the convolutional neural network has 9 layers: 6
convolutional layers and 3 fully connected layers. Compared with text classification in natural
language processing, the traffic data sequences converted to text input here are much smaller (for
example, l0 in the natural language setting is typically 1014, while l0 for the traffic dataset in this
paper is 177). Therefore, in the ATD-CharCNN model, 3 convolutional blocks and 3 fully connected
blocks are enough to identify complex text patterns without consuming too many resources; too many
network layers would also cause overfitting. Based on the experimental results, the convolutional
neural network in ATD-CharCNN is set as follows: the network contains 6 layers in total, 3
convolutional and 3 fully connected. The layout of the entire network model is shown in Table 4
(Conv: convolutional layer; FC: fully connected layer).</p>
      <p>The traffic data is processed as special text, so the convolutional layers above are all
one-dimensional convolutions. To increase the model's nonlinear mapping ability, ReLU is used as
the activation function for each layer, and max pooling layers are added between the convolutional
layers to reduce feature redundancy and output locally optimal features. Dropout regularization in the
fully connected layers helps avoid overfitting. The output of the last fully connected layer uses
LogSoftmax for binary and multi-classification. LogSoftmax is the logarithm of softmax; it avoids
overflow and underflow, speeds up computation, and improves numerical stability. Its mathematical
expression is shown in formula (1):</p>
      <p>Li = log( e^zi / ∑j=1..k e^zj )    (1)</p>
      <p>where zi is the output value of category i in the fully connected layer, and k is the number of output
nodes, that is, the total number of categories classified.</p>
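A minimal PyTorch sketch of such a 3-convolution, 3-fully-connected network with ReLU, max pooling, Dropout, and a LogSoftmax output; the channel counts, kernel sizes, and dropout rate are placeholders, since Table 4's exact values are not reproduced here:

```python
import torch
import torch.nn as nn

class ATDCharCNN(nn.Module):
    """Sketch of the 3-conv + 3-FC layout; hyperparameters are placeholders."""
    def __init__(self, in_bits=8, l0=177, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_bits, 64, kernel_size=7), nn.ReLU(), nn.MaxPool1d(3),
            nn.Conv1d(64, 64, kernel_size=5), nn.ReLU(), nn.MaxPool1d(3),
            nn.Conv1d(64, 64, kernel_size=3), nn.ReLU(), nn.MaxPool1d(3),
        )
        with torch.no_grad():  # infer the flattened size from a dummy input
            n_flat = self.features(torch.zeros(1, in_bits, l0)).numel()
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(n_flat, 256), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(256, 128), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(128, n_classes), nn.LogSoftmax(dim=1),
        )

    def forward(self, x):  # x: (batch, 8, l0), the encoded character matrices
        return self.classifier(self.features(x))

model = ATDCharCNN()
out = model(torch.randn(4, 8, 177))
print(out.shape)  # torch.Size([4, 2])
```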
    </sec>
    <sec id="sec-6">
<title>4. Experiment</title>
    </sec>
    <sec id="sec-7">
      <title>4.1. Performance metrics</title>
      <p>The model is trained and tested on the Jiutian·Bisheng platform. The graphics card is an NVIDIA
Tesla V100 with 32 GB of memory. The ATD-CharCNN model in this paper is implemented on the
PyTorch 1.8 deep learning framework.</p>
      <p>The following metrics are used to evaluate the performance of ATD-CharCNN: Accuracy,
Precision, Recall, F1-score, False Positive Rate (FPR), and AUC (Area Under Curve), as well as
training and testing time. The larger the AUC value, the better the classification performance. The
specific expressions are shown in formulas (2)-(6):</p>
      <p>Accuracy = (TP + TN) / (TP + TN + FP + FN)    (2)</p>
      <p>Precision = TP / (TP + FP)    (3)</p>
      <p>Recall = TP / (TP + FN)    (4)</p>
      <p>F1-score = 2 × Precision × Recall / (Precision + Recall)    (5)</p>
      <p>FPR = FP / (FP + TN)    (6)</p>
      <p>where TP is positive cases detected as positive, TN is negative cases detected as negative, FP is
negative cases detected as positive, and FN is positive cases detected as negative. Accuracy reflects
the classifier's ability to judge all types of samples; Precision is the proportion of correct attack
predictions among all attack predictions; Recall, the proportion of correctly detected attack samples
among all attack samples, reflects the classifier's ability to detect network attacks; F1-score is the
weighted harmonic mean of precision and recall; and the False Positive Rate (FPR) is the proportion
of normal traffic misclassified as abnormal among all normal traffic.</p>
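These metrics follow directly from the four confusion-matrix counts; a small helper (not from the paper) illustrating the standard definitions:

```python
def metrics(tp, tn, fp, fn):
    """Standard detection metrics from the four confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)
    return accuracy, precision, recall, f1, fpr

acc, prec, rec, f1, fpr = metrics(tp=90, tn=80, fp=20, fn=10)
print(acc, rec, fpr)  # 0.85 0.9 0.2
```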
    </sec>
    <sec id="sec-8">
<title>4.2. Experimental Results and Analysis</title>
    </sec>
    <sec id="sec-9">
<title>4.2.1. Experimental data and parameter settings</title>
      <p>The experiments in this paper apply the four character encodings of CharCNN to abnormal traffic
detection and compare their performance. The experimental dataset is NSL-KDD (introduced in
Section 3.1): the training set NSL-KDDTrain+ contains 125,973 records, and the test set
NSL-KDDTest+ contains 22,544 records. In both the binary and multi-classification experiments, the
batch size of the training process is 256, and training is optimized with the Adam optimizer using a
cross-entropy loss function.</p>
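A sketch of such a training setup in PyTorch, using random stand-in tensors instead of the encoded NSL-KDD data and a placeholder classifier; note that NLLLoss on a LogSoftmax output computes exactly the cross-entropy loss:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Random stand-ins for the encoded NSL-KDD tensors (real shape: N x 8 x 177).
X = torch.randn(512, 8, 177)
y = torch.randint(0, 2, (512,))
loader = DataLoader(TensorDataset(X, y), batch_size=256, shuffle=True)

# Placeholder classifier ending in LogSoftmax, as in the model above.
model = nn.Sequential(nn.Flatten(), nn.Linear(8 * 177, 2), nn.LogSoftmax(dim=1))
optimizer = torch.optim.Adam(model.parameters())
criterion = nn.NLLLoss()  # with LogSoftmax, this is the cross-entropy loss

for epoch in range(2):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()

print(float(loss))
```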
    </sec>
    <sec id="sec-10">
<title>4.2.2. Performance in binary classification</title>
      <p>Binary classification divides network traffic into normal traffic and abnormal traffic. The
NSL-KDDTrain+ dataset is used to train the ATD-CharCNN models with the four character
encodings. The training loss and accuracy curves are shown in Figure 3 and Figure 4 below. Table 5
shows the training time, test time, and experimental results on NSL-KDDTest+ for each character
encoding.</p>
      <p>From Figure 3 and Figure 4 above, we can see that "Sparse" has lower error and higher accuracy
on the training set than "ASCII" and fits the training set better. From Table 5, however, among the
four character encodings "ASCII" outperforms "Sparse" on the test set: shorter time, higher accuracy,
lower false positive rate, and larger AUC value. The "Group" encodings with type-identifier bits are
better in terms of accuracy and AUC value but worse in false positive rate, training time, and
verification time. Combining the curves with the training and test set accuracies in the table, "ASCII"
has the better generalization.</p>
      <p>ASCII is an 8-bit binary non-sparse encoding, so it is less memory-intensive and takes less time to
train than Sparse encoding, whose 39-bit vectors contain 38 zeros. ASCII is also an international
character encoding covering upper- and lowercase English letters, digits, and punctuation marks. For
datasets whose characters all fall within the ASCII character table, ASCII encoding is more time- and
memory-efficient and achieves higher accuracy. In the field of NLP, a model that distinguishes upper-
and lowercase characters generally outperforms one restricted to lowercase. Comparing the
accuracies of Sparse Group and ASCII encoding in Table 5, artificially adding bits that identify the
character type yields no detection benefit over an encoding that already distinguishes those character
types on its own.</p>
    </sec>
    <sec id="sec-11">
<title>4.2.3. Performance in multi-classification</title>
      <p>As described in Section 3.1, multi-classification of the network traffic data in the NSL-KDD
dataset uses five classes: normal traffic, Dos attack traffic, Probe attack traffic, R2L attack traffic, and
U2R attack traffic. There are few samples of the latter two attack types. The Recall and Accuracy
values of the ATD-CharCNN multi-classifications for each character encoding are shown in Table
6.</p>
      <p>From the table, recall is larger for the attack types with large sample sizes, and the detection rate
of ATD-CharCNN for Dos and Probe attack traffic is very high; the low detection rates for the latter
two types are due to their small sample sizes, from which the model learns little valid information.
From the overall Accuracy values, the ATD-CharCNN model with ASCII encoding has the best
multi-classification performance, while the four extra identifying bits of the ASCII Group encoding
add little in multi-classification, since the ASCII character set itself already identifies the character
type.</p>
    </sec>
    <sec id="sec-12">
<title>4.2.4. Comparison: ATD-CharCNN and CNN-IDS</title>
      <p>To verify the effectiveness of the proposed character-level preprocessing in abnormal traffic
detection, the detection accuracies of convolutional neural network models with feature-level
preprocessing for intrusion detection are listed and compared with the detection accuracy of the
model proposed in this paper; the results are shown in Table 7.</p>
      <p>As can be seen from Table 7, the metric values of the proposed character-level preprocessing are
better than those of feature-level preprocessing, indicating better detection performance.
Character-level preprocessing requires neither numericalization based on a priori knowledge of the
dataset nor normalization of singular values, which simplifies preprocessing and improves detection
performance.</p>
    </sec>
    <sec id="sec-13">
<title>5. Conclusion and future work</title>
      <p>An ATD-CharCNN model with character-level preprocessing is proposed and experimentally
validated on the NSL-KDD dataset. The effectiveness of character-level preprocessing in anomalous
traffic detection is shown, with ASCII encoding giving the best detection performance in both binary
and multi-classification, the shortest time, and a high detection rate. Compared with the feature-level
preprocessing model, the model in this paper greatly simplifies preprocessing and increases accuracy
by 4.49%, improving detection performance. The model has a low detection rate for minority attacks
in the five-class setting; in future work, the data imbalance will be further reduced to improve the
detection rate for minority attacks.</p>
    </sec>
    <sec id="sec-14">
<title>6. References</title>
      <p>[10] Xiao Y, Xing C, Zhang T, et al. An intrusion detection model based on feature reduction and
convolutional neural networks[J]. IEEE Access, 2019, 7: 42210-42219.
[11] Min E X, Long J, Liu Q, et al. TR-IDS: Anomaly-based intrusion detection through
text-convolutional neural network and random forest[J]. Security and Communication Networks,
2018: 1-9.
[12] Lin S Z, Shi Y, Xue Z. Character-level intrusion detection based on convolutional neural
networks[C]//International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, 2018: 18.
[13] Karpau A, Hofmann M. Input encodings for character level convolutional neural
networks[J]. Digitale Welt, 2020, 4(1): 26-31.
[14] Prusa J D, Khoshgoftaar T M. Designing a better data representation for deep neural
networks and text classification[C]//2016 IEEE 17th International Conference on Information
Reuse and Integration (IRI), 2016: 411-416.
[15] University of New Brunswick. NSL-KDD dataset [DS/OL]. https://www.unb.ca/cic/datasets/nsl.html.
[16] Zhang X, Zhao J, LeCun Y. Character-level convolutional networks for text
classification[C]//Advances in Neural Information Processing Systems, 2015: 649-657.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Rezaei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Eslami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-A.</given-names>
            <surname>Amini</surname>
          </string-name>
          , et al.
          <article-title>Hierarchical Three-module Method of Text Classification in Web Big Data</article-title>
          [C]//2020 6th International Conference on Web Research (ICWR),
          <year>2020</year>
          :
          <fpage>58</fpage>
          -
          <lpage>65</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Li</surname>
            <given-names>H</given-names>
          </string-name>
          and
          <string-name>
            <surname>Li G M.</surname>
          </string-name>
          <article-title>Research on Facial Expression Recognition Based on LBP and Deep Learning</article-title>
          [C]//2019 International Conference on Robots &amp;
          <source>Intelligent System (ICRIS)</source>
          ,
          <year>2019</year>
          :
          <fpage>94</fpage>
          -
          <lpage>97</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Qu</surname>
            <given-names>F K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            <given-names>J L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>W C</given-names>
          </string-name>
          , et al.
          <source>Design of Intelligent Classification Trash Bin Based on Speech Recognition[C]//7th International Conference on Electronic Technology and Information Science</source>
          ,
          <year>2022</year>
          :
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Fu</surname>
            <given-names>Y X</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            <given-names>T L</given-names>
          </string-name>
          , Ma Z L.
          <article-title>One-Hot based CNN malicious code detection technology[J]</article-title>
          .
          <source>Computer Applications and Software</source>
          ,
          <year>2020</year>
          ,
          <volume>37</volume>
          (
          <issue>1</issue>
          ):
          <fpage>304</fpage>
          -
          <lpage>308</lpage>
          +
          <fpage>333</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Michiaki</given-names>
            <surname>Ito</surname>
          </string-name>
          , Hitoshi Iyatomi.
          <source>Web Application Firewall using Character-level Convolutional Neural Network[C]//IEEE 14th International Colloquium on Signal Processing and its Applications (CSPA)</source>
          .
          <year>2018</year>
          :
          <fpage>103</fpage>
          -
          <lpage>106</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Rafael</given-names>
            <surname>Brinhosa</surname>
          </string-name>
          ,
          <string-name>
            <surname>Marcos</surname>
            <given-names>A</given-names>
          </string-name>
          , et al.
          <article-title>Comparison between LSTM and CLCNN in detecting malicious requests in web attacks[C]//in</article-title>
          <source>Proceedings of the 21st Brazilian Symposium on Information and Computational Systems Security, Belém</source>
          ,
          <year>2021</year>
          :
          <fpage>113</fpage>
          -
          <lpage>126</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Jian</surname>
            <given-names>S J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            <given-names>Z G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Du</surname>
            <given-names>D</given-names>
          </string-name>
          , et al.
          <article-title>A Survey of Network Intrusion Detection Technology[J]</article-title>
          .
          <source>Journal of Cyber Security</source>
          ,
          <year>2020</year>
          ,
          <volume>5</volume>
          (
          <issue>4</issue>
          ):
          <fpage>96</fpage>
          -
          <lpage>122</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Yin</surname>
            <given-names>C L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            <given-names>Y F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fei J L</surname>
          </string-name>
          , et al.
          <article-title>A deep learning approach for intrusion detection using recurrent neural networks[J]</article-title>
          .
          <source>IEEE Access</source>
          ,
          <year>2017</year>
          (5):
          <fpage>21954</fpage>
          -
          <lpage>21961</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Anju</given-names>
            <surname>Krishnan</surname>
          </string-name>
          and
          <string-name>
            <given-names>S. T.</given-names>
            <surname>Mithra</surname>
          </string-name>
          .
          <article-title>A Modified 1D-CNN Based Network Intrusion Detection System[J]</article-title>
          .
          <source>IJRESM</source>
          ,
          <year>2021</year>
          ,
          <volume>4</volume>
          (
          <issue>6</issue>
          ):
          <fpage>291</fpage>
          -
          <lpage>294</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>