<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Study of Data Scarcity Problem for Automatic Detection of Deceptive Speech Utterances</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Velichko</surname>
            <given-names>Alena</given-names>
          </name>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Karpov</surname>
            <given-names>Alexey</given-names>
          </name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>In recent years there has been considerable interest in the task of contactless automatic detection of deception in speech utterances. This task belongs to the paralinguistic area, which studies models for such aspects of speech as emotions, voice characteristics, psychophysiological traits, etc. Despite the high relevance of the topic, the lack of data containing deceptive information remains a problem. This paper presents an analysis of methods aimed at dealing with this problem. Over-sampling algorithms such as SMOTE and ADASYN were explored.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>was used along with lexical features and regularization. Results with oversampling were more optimistic than
others. The best performance was achieved with the use of random sampling and bagging classifier in terms
of accuracy of 58% and F1-score of 66%. The potential of oversampling technique was also presented in one of
other paralinguistic tasks, depression detection [Cummins et al., 2014]. All experiments were conducted using
Audio/Visual Emotion Challenge and Workshop (AVEC) 2013 dataset. This paper presented an oversampled
extraction technique for the i-vector system in small datasets. Although oversampled classification accuracies
of KL-means were lower than the standard ones, oversampled accuracies of i-vectors outperformed the
standard ones by almost 16% in terms of mean accuracy. A study of imbalanced learning in sentiment analysis
was presented in [Ah-Pine and Soriano-Morales, 2016]. Authors used three synthetic oversampling techniques
(SMOTE, BorderLineSMOTE and ADASYN) for tweet-polarity classification. Oversampling improved the
results of decision trees and l1 penalized logistic regression. Authors found out that all three methods improved
recognition of the minority class as well as obtained large increase of the overall geometric mean criterion.</p>
      <p>In [Kaya and Karpov, 2017] the authors proposed another approach to handling imbalanced data: weighted
kernel classifiers. The pipeline combines suprasegmental acoustic features, FV encoding and multi-level
normalization. The system effectively handled the class imbalance and did not need oversampling. The
results were promising, since they outperformed the baseline system results in the Snoring Sub-Challenge and were
on the same level as the baseline results in the Addressee Sub-Challenge. The authors of [Akhtiamov et al., 2019] applied
an augmentation technique to the addressee detection task in their cross-corpora experiments. They used an approach
called mixup; it performed well for neural networks with predefined acoustic features, but did not
give a significant improvement in performance for e2e models and did not benefit linear classifiers and
simple architectures without regularization at all. A novel imbalance-learning-based framework for movie fear
recognition was presented in [Zhang et al., 2018]. The authors conducted experiments using four different sampling
techniques: SMOTE, Random Sampling, Hardsampling and Softsampling. The latter two methods were
proposed to combine the advantages of oversampling and undersampling. The results reached state-of-the-art
performance on Recall and F1-measure in the MediaEval 2017 Emotional Impact of Movies Task. In [Ashihara
et al., 2019] the authors investigated the whispered speech detection task and the impact of imbalanced learning on it. They
used a class-aware sampling method for the training phase, which helped to diminish the effect of class
imbalance. The proposed system achieved the best ROC-AUC score of almost 1.0 in close/neutral conditions
and almost 0.9 in the far-field condition.</p>
      <p>The best result right now in deception detection in speech was achieved by [Mendels et al., 2017] and
[Montaci´e et al., 2016] within the framework of ComParE-2017 and ComParE-2016 accordingly. Authors of
the first paper presented a system that used acoustic and lexical features and Random Forest Classifier. The
best performances in terms of F1-measure and precision were 63.9% and 76.1% accordingly. In the second
paper authors elaborated a system with the use of prosodic ques, base feature set. The system reached the
UAR of 74.9%. The base system [Schuller, 2016] on the competitions in 2016 performed with result in terms
of UAR of 68.3%.
3</p>
    </sec>
    <sec id="sec-2">
      <title>Technique Description</title>
      <p>In the case of imbalanced data, it is worthwhile to pay attention to the metrics used for evaluating classifiers&#8217;
performance. Metrics such as Precision, Unweighted Average Recall (UAR), F-score and Mean Squared Error
(MSE) can help us monitor the situation, and the confusion matrix can be useful as well.</p>
      <p>In the case of a binary classification task, the confusion matrix in general looks as presented in Table 1.
Formulas 1-5 present the metrics mentioned above. Precision focuses on False Positive errors while Recall
focuses on False Negative errors. UAR is the mean of the Recall of class 1 and the Recall of class 2. F1-measure is the
harmonic mean of Precision and Recall. MSE measures the average of the squares of the errors, in other
words, the averaged squared difference between true values and predicted values.</p>
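      <p>The metrics above can be computed directly with scikit-learn; the following is a toy sketch with hypothetical labels, not data from our experiments:</p>

```python
import numpy as np
from sklearn.metrics import (precision_score, recall_score,
                             f1_score, mean_squared_error)

# Hypothetical predictions for a binary deception task (1 = deceptive)
y_true = np.array([1, 1, 1, 1, 0, 0])
y_pred = np.array([1, 1, 0, 1, 0, 1])

precision = precision_score(y_true, y_pred)          # focuses on False Positives
uar = recall_score(y_true, y_pred, average="macro")  # mean of per-class Recalls
f1 = f1_score(y_true, y_pred)                        # harmonic mean of P and R
mse = mean_squared_error(y_true, y_pred)             # averaged squared error

print(precision, uar, f1)  # 0.75 0.625 0.75
```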
      <p>There are two main ways of countering the imbalanced data problem. In the first case, we remove a
part of the majority class objects; this is called undersampling. In the second case, we synthetically add objects
to the minority class; this is called oversampling. It should be pointed out that undersampling is preferable
when the overall amount of data reaches tens or hundreds of thousands of objects, while oversampling should be used
when the overall amount of data is smaller.</p>
      <p>Precision = TruePositive / (TruePositive + FalsePositive)</p>
      <p>Recall = TruePositive / (TruePositive + FalseNegative)</p>
      <p>UAR = (Recall of class 1 + Recall of class 2) / 2</p>
      <p>F1 = 2 * Precision * Recall / (Precision + Recall)</p>
      <p>MSE = average over all objects of (true value - predicted value) squared</p>
      <sec id="sec-2-1">
        <title>There also exist other techniques for working with imbalanced data:</title>
      </sec>
      <sec id="sec-2-2">
        <title>1. Collect more data;</title>
        <p>2. Use other classifiers. For example, decision trees are good at classifying imbalanced data (C4.5, C5.0,
CART, Random Forest);</p>
        <p>3. Use penalized models. Such models penalize the classifier for wrong predictions on the
minority class (penalized Linear Discriminant Analysis, penalized Support Vector Machines). This approach
is appropriate when it is necessary to use a certain classifier and when it is impossible to resample the
dataset;</p>
      </sec>
      <sec id="sec-2-3">
        <title>4. Use another concept such as anomaly detection and change detection.</title>
        <sec id="sec-2-3-1">
          <title>Reducing number of objects in the majority class:</title>
          <p>Random undersampling. This technique computes the number of majority class objects that
should be removed to achieve the optimal balance between classes, and then randomly removes objects with
or without replacement.</p>
          <p>Tomek links. This technique removes the majority class objects that overlap the minority class objects until
all nearest-neighbor pairs are of the same class. The method is widely used to remove noise from data
[Tomek, 2010].</p>
          <p>Condensed Nearest Neighbor Rule. The main aim of this method is to teach the classifier to find
differences between similar objects belonging to different classes [Hart, 1968].</p>
          <p>One-sided selection. This method combines two of the abovementioned methods: first the
condensed nearest neighbor rule is applied, then the Tomek links method [Kubat and Matwin, 1997].</p>
          <p>Neighborhood cleaning rule. The aim of this method is to remove all objects that adversely affect
the classification of minority class objects. In the first step it classifies the data using the 3-nearest neighbors
method. In the second step it removes correctly classified majority class objects and the neighbors of wrongly
classified majority class objects [Laurikkala, 2001].</p>
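          <p>The Tomek links idea can be sketched in a few lines of NumPy; this is a toy illustration assuming Euclidean distance, not the implementation used in the experiments:</p>

```python
import numpy as np

def tomek_links(X, y, majority_label):
    """Drop majority-class members of Tomek links: pairs of opposite-class
    objects that are each other's nearest neighbors."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # ignore self-distances
    nn = d.argmin(axis=1)                # nearest neighbor of every object
    keep = np.ones(len(X), dtype=bool)
    for i, j in enumerate(nn):
        if nn[j] == i and y[i] != y[j] and y[i] == majority_label:
            keep[i] = False              # remove the majority member of the link
    return X[keep], y[keep]

# A majority object at 1.4 overlaps the minority object at 1.0
X = [[0.0], [1.0], [1.4], [5.0], [5.2]]
y = [1, 1, 0, 0, 0]                      # class 0 is the majority
X_res, y_res = tomek_links(X, y, majority_label=0)
print(list(y_res))  # [1, 1, 0, 0]
```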
        </sec>
        <sec id="sec-2-3-2">
          <title>Increasing number of objects in the minority class:</title>
          <p>Random oversampling. Depending on the required balance between classes, this technique randomly chooses
minority class objects to copy or duplicate.</p>
          <p>SMOTE (Synthetic Minority Oversampling Technique). In contrast to the abovementioned technique,
SMOTE does not copy or duplicate minority class objects; it creates new similar objects. Using the k-nearest
neighbors method, SMOTE takes the vector between a minority class object and one of its neighbors, then multiplies the
vector by a random number between 0 and 1. New objects are created by adding the obtained
value to the initial object. The similarity of the objects can be regulated by changing the parameter
of the k-nearest neighbors method, and it is also possible to set the number of objects to generate. A disadvantage of the
method is the increased density of minority class objects, which could introduce noise into the data in the case of equally
distributed majority class objects [Chawla et al., 2002].</p>
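          <p>The SMOTE interpolation step described above can be sketched as follows (a simplified illustration in NumPy, not the reference implementation):</p>

```python
import numpy as np

def smote_sketch(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic objects: take the vector from a random minority
    object to one of its k nearest neighbors, scale it by a random number
    in [0, 1) and add it to the initial object."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    neighbors = d.argsort(axis=1)[:, :k]     # k nearest neighbors per object
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))         # random minority object
        j = rng.choice(neighbors[i])         # one of its neighbors
        gap = rng.random()                   # random scaling factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)

X_min = [[0, 0], [1, 0], [0, 1], [1, 1]]     # minority class objects
X_new = smote_sketch(X_min, n_new=5)
# every synthetic object lies on a segment between two minority objects
```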
          <p>ADASYN (Adaptive Synthetic Minority Oversampling). This technique is similar to SMOTE
but uses a density function to automatically identify the number of objects that should be created for
every minority class object. In doing so, the weights of minority class objects change adaptively depending on
how difficult they are for the classifier to learn. Hence, new objects are created mainly around minority class
objects that are difficult to learn [Haibo et al., 2008].</p>
          <p>In our case the most efficient way was to change the metrics for evaluating classifiers&#8217; performance and to use
trees and oversampling. We chose the following metrics: Precision, Unweighted Average Recall, F-measure
and Mean Squared Error. We decided to use oversampling of the minority class, namely ADASYN, SMOTE and
two variants of SMOTE: BorderLineSMOTE (which finds borderline minority class objects on the basis of which new
objects are created) [Han et al., 2005] and SVMSMOTE (which uses the SVM method to identify minority class objects
for creating new objects) [Nguyen et al., 2011].</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experiments</title>
      <p>We used two databases to train and test the models: the Deceptive Speech Database [Schuller, 2016] and the multimodal
Real-Life Trial Deception Detection Dataset [Pérez-Rosas et al., 2015]. The following open-source toolkits were
used: FFmpeg to extract audio recordings from the multimodal dataset (https://www.ffmpeg.org/), Praat to
preprocess the audio data (http://www.fon.hum.uva.nl/praat/), openSMILE to extract 6373 low-level acoustic
features (including pitch, energy, spectral and cepstral features; http://audeering.com/technology/opensmile/),
and Scikit-learn for the implementations of the methods [Pedregosa et al., 2011].</p>
      <p>We then oversampled the training set of every fold of the 10-fold cross-validation, so the test sets did not contain
synthetic data, while the training sets were balanced. The total number of objects in the training and test sets
was up to 1528, depending on the balance of classes in each training set.</p>
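      <p>The fold-wise procedure can be sketched as follows; for brevity this sketch balances the training part of each fold by simple random duplication of minority objects rather than the SMOTE-family methods, so only the mechanics of keeping test folds free of synthetic data are shown:</p>

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC
from sklearn.metrics import recall_score

def oversample_by_duplication(X, y, rng):
    """Balance classes by randomly duplicating minority class objects."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts_X, parts_y = [X], [y]
    for c, n in zip(classes, counts):
        idx = np.flatnonzero(y == c)
        extra = rng.choice(idx, size=target - n, replace=True)
        parts_X.append(X[extra])
        parts_y.append(y[extra])
    return np.concatenate(parts_X), np.concatenate(parts_y)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))           # synthetic stand-in features
y = np.array([0] * 150 + [1] * 50)
uars = []
for train, test in StratifiedKFold(10, shuffle=True, random_state=0).split(X, y):
    # oversample the training part only; test folds keep the original balance
    X_tr, y_tr = oversample_by_duplication(X[train], y[train], rng)
    clf = SVC().fit(X_tr, y_tr)
    uars.append(recall_score(y[test], clf.predict(X[test]), average="macro"))
print(round(float(np.mean(uars)), 3))
```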
      <p>Four classifiers were chosen according to previous works [Velichko, Budkov et al., 2018; Velichko,
Budkov et al., 2019]: Bagging with k-Nearest Neighbors as the base classifier, k-Nearest Neighbors (k-NN), Support
Vector Classifier (SVC) and Random Forest. The parameters of the methods were found using grid
search and are presented in Table 2. Figure 1 presents the proposed architecture of the system.</p>
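      <p>A grid search of this kind can be sketched with GridSearchCV; the grids below are illustrative placeholders, not the actual parameter values, which are given in Table 2:</p>

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 6))            # synthetic stand-in features
y = rng.integers(0, 2, size=150)

searches = {
    "SVC": GridSearchCV(
        SVC(), {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]},
        scoring="recall_macro", cv=5),   # recall_macro corresponds to UAR
    "RandomForest": GridSearchCV(
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [None, 10]},
        scoring="recall_macro", cv=5),
}
for name, search in searches.items():
    search.fit(X, y)
    print(name, search.best_params_)
```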
      <p>Experiments were run without oversampling and with SMOTE, ADASYN, SVMSMOTE and BorderLineSMOTE, each with nn=3 and nn=5 nearest neighbors; the results for the four classifiers are given in the tables.</p>
    </sec>
    <sec id="sec-4">
      <title>Discussion</title>
      <p>As we can see in the tables above, the best results were achieved with the use of the Support Vector Classifier and the
SMOTE oversampling technique with 3 neighbors. The results of the model were: a UAR of 73.5%, a mean
F1-score of 75.0% and a Precision of 77.0%.</p>
      <p>The proposed deception detection system outperformed the results of the baseline system presented in 2016 and
our previous system [Velichko, Budkov and Karpov, 2018] in terms of UAR by 5.2% and 2.5% respectively, but
underperformed the winners by 1.4% of UAR. We also outperformed the result of the system presented in 2017
in terms of Precision and F1-score by 0.9% and 11.1% respectively; see Table 4.</p>
      <p>The results of the experiments show that some of the proposed methods, such as Random Forest and the Support
Vector Classifier, can achieve good results with oversampled data, which is promising. By contrast, methods
such as Bagging and k-Nearest Neighbors did not achieve a significant increase in performance with oversampled data,
but worked quite well without sampling techniques.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>In this paper we examined an important paralinguistic problem, the deception detection task in particular,
namely the data scarcity problem in classification. The most popular techniques aimed at countering imbalanced
data were reviewed, and the most suitable approaches for our task were chosen. A set of experiments was conducted
using the 10-fold cross-validation method. Four classifiers with parameters found by grid search were
used to find the best oversampling technique for our data. The best results were achieved using a combination of
the following methods: the Support Vector Classifier with the SMOTE oversampling technique, which resulted in 73.5%
in terms of Unweighted Average Recall, a mean Precision of 77.0% and a mean F1-score of 75.0%. The proposed
system can be used in fields such as banking, prevention of telephone and online terrorism and fraud,
polygraph research, etc.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This research is supported by the Russian Science Foundation (project No. 18-11-00145).</p>
    </sec>
    <sec id="sec-7">
      <title>References</title>
      <p>[Schuller and Weninger, 2012] Schuller, B., Weninger, F. (2012) Ten Recent Trends in Computational
Paralinguistics. In: Esposito, A., Esposito, A.M., Vinciarelli, A., Hoffmann, R., Müller, V.C. (eds) Cognitive
Behavioural Systems. Lecture Notes in Computer Science, vol. 7403. Springer, Berlin, Heidelberg.</p>
      <p>[Chow and Louie, 2017] Chow, A., Louie, J.N. (2017) Detecting Lies via Speech Patterns.</p>
      <p>[Cummins et al., 2014] Cummins, N., Epps, J., Sethu, V., Krajewski, J. (2014) Variability compensation in
small data: Oversampled extraction of i-vectors for the classification of depressed speech. In: Proceedings of
ICASSP-2014, IEEE, pp. 970-974. doi: 10.1109/ICASSP.2014.6853741.</p>
      <p>[Ah-Pine and Soriano-Morales, 2016] Ah-Pine, J., Soriano-Morales, E.-P. (2016) A Study of Synthetic
Oversampling for Twitter Imbalanced Sentiment Analysis. In: Workshop on Interactions between Data Mining
and Natural Language Processing (DMNLP), Sep 2016, Riva del Garda, Italy. hal-01504684.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <source>[Kaya and Karpov</source>
          , 2017] Kaya,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Karpov</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.A.</surname>
          </string-name>
          (
          <year>2017</year>
          )
          <article-title>Introducing Weighted Kernel Classifiers for Handling Imbalanced Paralinguistic Corpora: Snoring, Addressee and Cold</article-title>
          .
          <source>In: Proceedings of INTERSPEECH2017</source>
          , Stockholm, Sweden, pp.
          <fpage>3527</fpage>
          -
          <lpage>3531</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Akhtiamov et al.,
          <year>2019</year>
          ] Akhtiamov,
          <string-name>
            <given-names>O.</given-names>
            ,
            <surname>Siegert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            ,
            <surname>Karpov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Minker</surname>
          </string-name>
          ,
          <string-name>
            <surname>W.</surname>
          </string-name>
          (
          <year>2019</year>
          )
          <article-title>Cross-Corpus Data Augmentation for Acoustic Addressee Detection</article-title>
          .
          <source>In: Proceedings of the SIGDial 2019 Conference</source>
          , Stockholm, Sweden, pp.
          <fpage>274</fpage>
          -
          <lpage>283</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Zhang et al.,
          <year>2018</year>
          ] Zhang, X., Cheng, X.,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Fang</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <surname>T.</surname>
          </string-name>
          (
          <year>2018</year>
          )
          <article-title>Imbalance Learning-based Framework for Fear Recognition in the MediaEval Emotional Impact of Movies Task</article-title>
          .
          <source>In: Proceedings of INTERSPEECH-2018</source>
          , Hyderabad, India, pp.
          <fpage>3678</fpage>
          -
          <lpage>3682</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Ashihara et al.,
          <year>2019</year>
          ] Ashihara,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Shinohara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Sato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Moriya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Matsui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Fukutomi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Yamaguchi</surname>
          </string-name>
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Aono</surname>
          </string-name>
          ,
          <string-name>
            <surname>Y.</surname>
          </string-name>
          (
          <year>2019</year>
          )
          <article-title>Neural Whispered Speech Detection with Imbalanced Learning</article-title>
          .
          <source>In: Proceedings of INTERSPEECH-2019</source>
          , Graz, Austria, pp.
          <fpage>3352</fpage>
          -
          <lpage>3356</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [Mendels et al.,
          <year>2017</year>
          ] Mendels,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Levitan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.I.</given-names>
            ,
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Hirschberg</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.</surname>
          </string-name>
          (
          <year>2017</year>
          )
          <article-title>Hybrid acoustic-lexical deep learning approach for deception detection</article-title>
          .
          <source>In: Proceedings of INTERSPEECH-2017</source>
          , Stockholm, Sweden, pp.
          <fpage>1472</fpage>
          -
          <lpage>1476</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Montacié et al.,
          <year>2016</year>
          ] Montacié,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Caraty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-J.</given-names>
          </string-name>
          (
          <year>2016</year>
          )
          <article-title>Prosodic Cues and Answer Type Detection for the Deception Sub-Challenge</article-title>
          .
          <source>In: Proceedings of INTERSPEECH-2016</source>
          , San Francisco, USA, pp.
          <fpage>2016</fpage>
          -
          <lpage>2020</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <source>[Schuller</source>
          , 2016] Schuller,
          <string-name>
            <surname>B.</surname>
          </string-name>
          (
          <year>2016</year>
          )
          <article-title>The INTERSPEECH 2016 Computational Paralinguistics Challenge: Deception, Sincerity and Native Language</article-title>
          . In:
          <source>Proceedings of INTERSPEECH-2016</source>
          , San Francisco, USA, pp.
          <fpage>2001</fpage>
          -
          <lpage>2005</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <source>[Tomek</source>
          , 2010] Tomek,
          <string-name>
            <surname>I.</surname>
          </string-name>
          (
          <year>2010</year>
          )
          <article-title>Two modifications of CNN</article-title>
          . In: IEEE Transactions on Systems, Man, and Cybernetics, vol.
          <volume>6</volume>
          , pp
          <fpage>769</fpage>
          -
          <lpage>772</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <source>[Hart</source>
          , 1968] Hart,
          <string-name>
            <surname>P.</surname>
          </string-name>
          (
          <year>1968</year>
          )
          <article-title>The condensed nearest neighbor rule In Information Theory</article-title>
          , IEEE Transactions on, vol.
          <volume>14</volume>
          (
          <issue>3</issue>
          ), pp.
          <fpage>515</fpage>
          -
          <lpage>516</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <source>[Kubat and Matwin</source>
          , 1997] Kubat,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Matwin</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          (
          <year>1997</year>
          )
          <article-title>Addressing the curse of imbalanced training sets: one-sided selection In ICML</article-title>
          , vol.
          <volume>97</volume>
          , pp.
          <fpage>179</fpage>
          -
          <lpage>186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <source>[Laurikkala</source>
          , 2001] Laurikkala,
          <string-name>
            <surname>J.</surname>
          </string-name>
          (
          <year>2001</year>
          )
          <article-title>Improving identification of difficult small classes by balancing class distribution</article-title>
          . Springer Berlin Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [Chawla et al.,
          <year>2002</year>
          ] Chawla,
          <string-name>
            <given-names>N. V.</given-names>
            ,
            <surname>Bowyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. W.</given-names>
            ,
            <surname>Hall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. O.</given-names>
            ,
            <surname>Kegelmeyer</surname>
          </string-name>
          ,
          <string-name>
            <surname>W. P.</surname>
          </string-name>
          (
          <year>2002</year>
          )
          <article-title>SMOTE: synthetic minority over-sampling technique</article-title>
          <source>Journal of artificial intelligence research</source>
          ,
          <fpage>321</fpage>
          -
          <lpage>357</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [Haibo et al.,
          <year>2008</year>
          ] He,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Edwardo</surname>
          </string-name>
          <string-name>
            <given-names>A.</given-names>
            <surname>Garcia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Li</surname>
          </string-name>
          (
          <year>2008</year>
          )
          <article-title>ADASYN: Adaptive synthetic sampling approach for imbalanced learning</article-title>
          <source>In IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)</source>
          , pp.
          <fpage>1322</fpage>
          -
          <lpage>1328</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [Han et al.,
          <year>2005</year>
          ] Han,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Wen-Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            ,
            <surname>Bing-Huan</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          (
          <year>2005</year>
          )
          <article-title>Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning Advances in intelligent computing</article-title>
          ,
          <volume>878</volume>
          -
          <fpage>887</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [Nguyen et al.,
          <year>2011</year>
          ] Nguyen,
          <string-name>
            <given-names>H. M.</given-names>
            ,
            <surname>Cooper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. W.</given-names>
            ,
            <surname>Kamei</surname>
          </string-name>
          ,
          <string-name>
            <surname>K.</surname>
          </string-name>
          (
          <year>2011</year>
          )
          <article-title>Borderline over-sampling for imbalanced data classification</article-title>
          <source>International Journal of Knowledge Engineering and Soft Data Paradigms</source>
          ,
          <volume>3</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>4</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [Pérez-Rosas et al.,
          <year>2015</year>
          ]
          <string-name>
            <surname>Pérez-Rosas</surname>
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abouelenien</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mihalcea</surname>
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burzo</surname>
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2015</year>
          )
          <article-title>Deception detection using real-life trial data</article-title>
          <source>Proceedings of ACM</source>
          , pp.
          <fpage>59</fpage>
          -
          <lpage>66</lpage>
          . doi: 10.1145/2818346.2820758.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [Pedregosa et al.,
          <year>2011</year>
          ]
          <string-name>
            <surname>Pedregosa</surname>
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varoquaux</surname>
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gramfort</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michel</surname>
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thirion</surname>
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grisel</surname>
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blondel</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prettenhofer</surname>
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weiss</surname>
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dubourg</surname>
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vanderplas</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passos</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cournapeau</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brucher</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perrot</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duchesnay</surname>
            <given-names>E.</given-names>
          </string-name>
          (
          <year>2011</year>
          )
          <article-title>Scikit-learn: Machine Learning in Python</article-title>
          <source>Journal of Machine Learning Research</source>
          , vol.
          <volume>12</volume>
          , pp.
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [Velichko, Budkov et al.,
          <year>2018</year>
          ]
          <string-name>
            <surname>Velichko</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Budkov</surname>
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kagirov</surname>
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karpov</surname>
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2018</year>
          )
          <article-title>Comparative Analysis of Classification Methods for Automatic Deception Detection in Speech</article-title>
          <source>In Proc. 20th International Conference on Speech and Computer SPECOM-2018</source>
          , Springer, LNAI vol.
          <volume>11096</volume>
          , pp.
          <fpage>737</fpage>
          -
          <lpage>746</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [Velichko, Budkov et al.,
          <year>2019</year>
          ]
          <string-name>
            <surname>Velichko</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Budkov</surname>
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kagirov</surname>
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karpov</surname>
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2020</year>
          )
          <article-title>Applying Ensemble Learning Techniques and Neural Networks to Deceptive and Truthful Information Detection Task in the Flow of Speech</article-title>
          In:
          <string-name>
            <surname>Kotenko</surname>
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Badica</surname>
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Desnitsky</surname>
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>El Baz</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ivanovic</surname>
            <given-names>M.</given-names>
          </string-name>
          (eds)
          <source>Intelligent Distributed Computing XIII. IDC 2019. Studies in Computational Intelligence</source>
          , vol.
          <volume>868</volume>
          . Springer, Cham.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [Velichko et al.,
          <year>2018</year>
          ]
          <string-name>
            <surname>Velichko</surname>
            <given-names>A.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Budkov</surname>
            <given-names>V.Yu.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karpov</surname>
            <given-names>A.A.</given-names>
          </string-name>
          (
          <year>2018</year>
          )
          <article-title>Issledovanie metodov klassifikatsii dlya avtomaticheskogo opredeleniya istinnoi ili lozhnoi informatsii v rechevykh soobshcheniyakh [Study of classification methods for automatic truth and deception detection in speech]</article-title>
          <source>Nauchnyi vestnik Novosibirskogo gosudarstvennogo tekhnicheskogo universiteta - Science bulletin of the Novosibirsk state technical university</source>
          , no.
          <volume>3</volume>
          (
          <issue>72</issue>
          ), pp.
          <fpage>21</fpage>
          -
          <lpage>32</lpage>
          . doi: 10.17212/1814-1196-2018-3-21-32 (In Rus.).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>