Uzbek Sentiment Analysis Based on Local Restaurant
Reviews
Sanatbek Matlatipov1 , Hulkar Rahimboeva1 , Jaloliddin Rajabov1 and
Elmurod Kuriyozov2
1 National University of Uzbekistan named after Mirzo Ulugbek, 4 Universitet St, Tashkent, 100174, Uzbekistan
2 Universidade da Coruña, CITIC, Grupo LYS, Depto. de Computación y Tecnologías de la Información, Facultade de Informática, Campus de Elviña, A Coruña 15071, Spain


Abstract
Extracting useful information for sentiment analysis and classification from large amounts of user-generated feedback, such as restaurant reviews, is a crucial task of natural language processing: it not only supports customer satisfaction through personalized services, but can also influence the further development of a company. In this paper, we present the collection of a restaurant-review dataset for sentiment analysis in the Uzbek language, a member of the Turkic family that is heavily affected by the low-resource constraint, and provide further analysis of the novel dataset by evaluating different techniques, from logistic regression based models and support vector machines to deep learning models such as recurrent and convolutional neural networks. The paper includes detailed information on how the data was collected, how it was pre-processed for better quality, as well as the experimental setups used for evaluation. The overall evaluation results indicate that performing pre-processing steps, such as stemming for agglutinative languages, yields better results, eventually reaching 91% accuracy with the best-performing model.

Keywords
Sentiment Analysis, Uzbek Language, Dataset, Support Vector Machine, RNN, CNN




1. Introduction
The power of Natural Language Processing (NLP) techniques relies on large amounts of labelled data in many applications. Sentiment analysis is the process of analyzing and labelling the opinions posted by consumers. Consumers usually post their feedback about places and foods on popular applications such as Google Maps1 or Yelp2. These platforms encourage consumers to actively participate in reviewing, and massive user-generated restaurant reviews allow consumers to fully express their needs while helping merchants provide real-time and personalized service [1].
The International Conference and Workshop on Agglutinative Language Technologies as a challenge of Natural
Language Processing (ALTNLP), June 7-8, Koper, Slovenia
s.matlatipov@nuu.uz (S. Matlatipov); h.rahimboyeva@nuu.uz (H. Rahimboeva); j.rajabov@nuu.uz (J. Rajabov); e.kuriyozov@udc.es (E. Kuriyozov)
https://sanatbek.uz/ (S. Matlatipov)
ORCID: 0000-0002-6895-3436 (S. Matlatipov); 0000-0002-3259-7708 (H. Rahimboeva); 0000-0002-0369-6707 (J. Rajabov); 0000-0003-1702-1222 (E. Kuriyozov)
                                       © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).




1 Google Maps: https://www.google.com/maps
2 Yelp: https://www.yelp.com
Moreover, restaurant reviews express the composition of clients' emotional necessities and are an important source of information about consumers' choices [2]. Currently, opinion mining achieves very high accuracy, especially after applying deep learning methods, for high-resource languages [3]. However, applying deep learning and machine learning techniques to different types of domains [4] and gathering large, high-quality corpora [5] play an important role in the development of low-resource languages. The language we focus on is Uzbek, which is spoken by around 34 million native speakers in Uzbekistan and elsewhere in Central Asia and China3. Uzbek is a null-subject and highly agglutinative language in which a single word can form a meaningful sentence [6, 7]. To our knowledge, there is no previous work on sentiment classification based on restaurant-domain feedback for Uzbek. This paper makes the following contributions:

    • A restaurant-domain annotated corpus is created for sentiment analysis, collected from Google Maps locations serving Uzbek cuisine, where reviews of local national food are the primary target. The corpus contains 4500 positive and 3710 negative reviews after manually removing major errors and cleaning. The annotation is based on the 5-star rating provided by Google Maps: reviews rated from 1 to 3 stars are considered negative, and those rated 4 or 5 stars positive. We found that some reviews were written in other languages, such as English, Kyrgyz and Russian. Rather than discarding them, we translated them into Uzbek using the official Google Translate API.
    • Pre-processing of the corpus is applied in two steps. The first step removes URLs and punctuation and lower-cases the text. The second step removes stop words [8] from the dataset, where the list of stop words is generated with the TF-IDF algorithm and validated by accuracy evaluation. Then, we apply a stemming algorithm [7, 9] based on an electronic dictionary of Uzbek word endings, which uses a combinatorial approach covering the parts of speech of the Uzbek language: nouns, adjectives, numerals, verbs, participles, moods and voices. The advantages of this algorithm are that it is lexicon-free and that a single operation (a lookup in the dictionary of endings of the language) both segments the word into suffixes and performs a morphological analysis of the word.
    • Machine learning and deep learning algorithms are applied. Furthermore, a deep learning (recurrent neural network) model fed with fastText4 pre-trained word embeddings is applied to improve the accuracy.

All resources, including the corpus and the source code used for crawling and classification, are uploaded to a public repository5. The paper is structured as follows: after this introduction, Section 2 describes related work. It is followed by a description of the methodology in Section 3 and continues with Section 4, which focuses on the evaluation setup. Section 5 presents the experimental results and discussion, and Section 6 concludes the paper and highlights future work.


   3 https://en.wikipedia.org/wiki/Uzbek_language
   4 https://fasttext.cc/docs/en/crawl-vectors.html
   5 https://github.com/SanatbekMatlatipov/restauranat-sentiment/tree/main
Figure 1: Research framework.


2. Related Work
In recent years, several works were done in the NLP field for Uzbek, including sentiment analysis
datasets [10, 11], created by collecting and analyzing Google Play app reviews, with two types
of data: A medium-size manually annotated dataset and a larger-size dataset automatically
translated from English. [12] obtained bilingual dictionaries for six Turkic languages and applied
them to cross-lingually align word embeddings, backed by a bilingual dictionary induction
evaluation task. They showed that aligned word embeddings obtained for a low-resource language can benefit from resource-rich, closely related languages. Another similar paper
[13] investigated the effect of emoji-based features in opinion classification of Uzbek texts. A
semantic evaluation dataset was presented with semantic similarity and relatedness scores in
word pairs as well as its analysis for Uzbek in a recent work [14]. There is also a very recent growing trend in NLP that makes use of large pre-trained models, which can be seen in work on Uzbek that trained a neural Transformer-based language model on a raw Uzbek corpus [15].
   Looking at the field of sentiment analysis globally, [16] used various sentiment analysis techniques, such as machine learning and deep learning, with the aim of taking into account the differences in opinions and thoughts expressed on popular social platforms such as Twitter, Reddit, Tumblr and Facebook.


3. Methodology
In this paper, we propose a machine learning and deep learning based sentiment analysis framework for the restaurant-domain dataset (Figure 1). The framework includes data collection using a web crawler, pre-processing (cleaning, stop-word removal, lexicon-free stemming), construction of the TF-IDF weight matrix, and ML and DL models for sentiment analysis.
Figure 2: Feedback sample


3.1. Data collection
We started by looking at the sources available for crawling Uzbek-language data. However, the usual sources, such as Twitter or movie reviews, are not well suited for Uzbek. Therefore, we decided to collect restaurant reviews, since restaurants are among the places local people most enjoy giving feedback about. This makes sense, as Uzbek cuisine is one of the most popular throughout the Commonwealth of Independent States (CIS) and Central Asia; in most Central Asian cities, for instance, it is easy to find busy restaurants specializing in Uzbek cuisine6. We crawled all local restaurants in Tashkent from Google Maps. First, we selected a list of more than 140 URLs with at least 3 reviews each and retrieved all the information shown in Figure 2. While crawling, we respected Google's anti-spam and anti-DDoS policies, as there are certain limitations on harvesting data. The source code is available in the repository.
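   As an illustration of this step, the following minimal sketch shows how such reviews could be collected with Selenium in Python. The CSS selectors, the example URL list and the output file name are hypothetical placeholders (Google Maps markup changes frequently), so the actual crawler in our repository may differ.

```python
import csv
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

# Hypothetical list of restaurant URLs selected beforehand (more than 140 in our case).
URLS = ["https://www.google.com/maps/place/EXAMPLE"]

driver = webdriver.Chrome()
rows = []
for url in URLS:
    driver.get(url)
    time.sleep(5)  # crawl slowly to respect anti-spam / anti-DDoS limitations
    # The selectors below are placeholders and must be adapted to the current Maps markup.
    for review in driver.find_elements(By.CSS_SELECTOR, "div.review-item"):
        text = review.find_element(By.CSS_SELECTOR, "span.review-text").text
        stars = len(review.find_elements(By.CSS_SELECTOR, "img.star-active"))
        rows.append({"url": url, "text": text, "stars": stars})
driver.quit()

with open("reviews_raw.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["url", "text", "stars"])
    writer.writeheader()
    writer.writerows(rows)
```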

3.2. Data pre-processing
The collection of texts with star ratings in the crawled dataset was noisy and required manual correction. Comments containing only emojis, names or other irrelevant content, such as username mentions, URLs or specific app names, were removed. Those written in languages other than Uzbek (mostly Russian and some English) were translated using the official Google Translate API. Although people in Uzbekistan use the official Latin alphabet, the old Cyrillic alphabet remains equally popular, especially among adults. Comments written in Cyrillic were converted to Latin using the Uzbek machine transliteration tool [17]. Then, we applied stop-word removal to discard low-information words from the comments and focus on the important content. The technique is based on [8], which proposes an algorithm for automatic detection of single-word stop words using TF-IDF (term frequency - inverse document frequency). After that, each word is processed with the lexicon-free stemming algorithm of [7] to reduce the vocabulary size caused by prefixes and suffixes. The basic idea is to use a combinatorial approach over the candidate word endings. Table 1 shows processed data that is ready for the TF-IDF vectorizer.
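   A minimal sketch of this pre-processing pipeline is given below. The functions transliterate_to_latin and stem_uzbek are hypothetical stand-ins for the transliteration tool [17] and the lexicon-free stemmer [7], and the stop-word set is only an illustrative subset of the TF-IDF-derived list [8].

```python
import re
import string

def transliterate_to_latin(text: str) -> str:
    # Placeholder: the real tool [17] converts Uzbek Cyrillic script to the Latin alphabet.
    return text

def stem_uzbek(token: str) -> str:
    # Placeholder: the real stemmer [7] strips endings using a dictionary of Uzbek suffixes.
    return token

# Illustrative subset only; the full list is detected automatically with TF-IDF [8].
STOP_WORDS = {"va", "bilan", "uchun"}

URL_RE = re.compile(r"https?://\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def preprocess(review: str) -> list[str]:
    """Clean one review and return its stemmed tokens."""
    text = transliterate_to_latin(review)        # Cyrillic -> Latin where needed
    text = URL_RE.sub(" ", text).lower()         # remove URLs and lower-case
    text = text.translate(PUNCT_TABLE)           # remove punctuation
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [stem_uzbek(t) for t in tokens]
```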
   We selected a set of words to visualize the word count. Figure 3 shows that people tend to
give more positive feedback than negative on the domain of restaurants.

   6 BBC Travel: https://www.bbc.com/travel/article/20191117-is-uzbek-cuisine-actually-to-die-for
Table 1
The example of a chosen review before and after processing it.
                            Review                                                  After processing
 Birinchi Milliy taomlardan biri - keng assortimentli taomlar!         Bir/ milliy/ taom/ keng/ assortiment/
 Gastro-turistlar uchun juda jozibali joy - bu yerda barcha            taom/ gastro/ turist/ juda/ joziba/ joy/
 turdagi milliy taomlar mavjud.Yagona salbiy tomoni shundaki,          tur/ milliy/ taom/ mavjud/ salbiy/ tomon/
 bunday yirik muassasa uchun to’xtash joyi kichik. Narxlar             yirik/ muassa/ to’xta/ joy/ kichik/ narx/
 nisbatan arzon! Turistlar uchun juda arzon!                           arzon/ turist/ arzon




Figure 3: The visualisation of some selected examples of Uzbek words taken from positive and negative
reviews with their log counts.


4. Evaluation
The collected novel dataset has been split into training and testing subsets for evaluation with an 8:2 ratio. After the data cleaning process, we have the original dataset as follows, where $\vec{x}_i$ represents the feature vectors and $y_i$ the annotated labels:

$$(\vec{x}_i, y_i), \qquad i = 1, 2, 3, \ldots, N \qquad (1)$$
$$\vec{x}_i = (x_{i1}, x_{i2}, \ldots, x_{im}), \qquad i = 1, 2, 3, \ldots, N \qquad (2)$$

$N$ and $m$ are equal to the number of reviews and the length of the feature vector, respectively.
   Then we calculate the TF-IDF score for each feature vector $\vec{x}_i$, which vectorises words by taking into account the frequency of a word within a given review and its frequency across reviews. The final result over all $\vec{z}_i$ is a sparse matrix.

$$\vec{z}_i = TF(\vec{x}_i) \times IDF(\vec{x}_i), \qquad i = 1, 2, 3, \ldots, N \qquad (3)$$
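A minimal sketch of this step with Scikit-learn [19] is shown below; the input file name and column names are hypothetical placeholders, while the 80/20 split follows the ratio described above.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Hypothetical file produced by the pre-processing step: one cleaned review per row.
df = pd.read_csv("reviews_clean.csv")  # columns: "text" (processed review), "label" (0/1)

X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42, stratify=df["label"])

vectorizer = TfidfVectorizer()
Z_train = vectorizer.fit_transform(X_train)   # sparse TF-IDF matrix: one row per review
Z_test = vectorizer.transform(X_test)
print(Z_train.shape)                          # (number of training reviews, feature-vector length m)
```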

4.1. Machine learning algorithms
The logistic regression model is

$$h(\vec{z}) = \frac{1}{1 + \exp(-\vec{z})} \qquad (4)$$

$$P(y \mid \vec{z}) = \begin{cases} h(\vec{z}), & \text{if } y = +1 \ (\text{positive}) \\ 1 - h(\vec{z}), & \text{if } y = -1 \ (\text{negative}) \end{cases}$$
The logistic regression model [18] is a classification algorithm based on the logistic (sigmoid) function, which maps any real-valued input into the interval between 0 and 1. For sentiment analysis, the hypothesis in (4) estimates whether a review is positive or negative. The Support Vector Machine (SVM) model has the following response function:

$$h(\vec{z}) = \mathrm{sign}(\vec{z}) \qquad (5)$$

The SVM algorithm is known for fast and dependable classification of two-class problems. Classification is performed by finding a hyperplane separating the two classes, positive and negative reviews, in the model. We implemented the LR and SVM models using the Scikit-learn [19] machine learning library in Python with default configuration parameters. For the LR models, we implemented a variant based on word n-grams (unigrams and bigrams) and one based on character n-grams (with n ranging from 1 to 4). We also tested a model combining the word and character n-gram features.
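The sketch below illustrates how these variants could be assembled with Scikit-learn, assuming the X_train, y_train, X_test, y_test splits from the previous section contain raw review texts and labels; combining the word and character vectorizers through a FeatureUnion is our assumption, not necessarily the exact original implementation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC

def word_tfidf():
    # Word unigrams and bigrams, as in the LR word n-gram model.
    return TfidfVectorizer(analyzer="word", ngram_range=(1, 2))

def char_tfidf():
    # Character n-grams with n from 1 to 4.
    return TfidfVectorizer(analyzer="char", ngram_range=(1, 4))

models = {
    "LR word n-grams": Pipeline([("feats", word_tfidf()), ("clf", LogisticRegression())]),
    "LR char n-grams": Pipeline([("feats", char_tfidf()), ("clf", LogisticRegression())]),
    "LR word + char n-grams": Pipeline([
        ("feats", FeatureUnion([("word", word_tfidf()), ("char", char_tfidf())])),
        ("clf", LogisticRegression()),
    ]),
    "SVM linear kernel": Pipeline([("feats", word_tfidf()), ("clf", LinearSVC())]),
}

for name, model in models.items():
    model.fit(X_train, y_train)               # raw review texts and labels from the 80/20 split
    print(name, model.score(X_test, y_test))  # accuracy on the held-out test set
```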

4.2. Deep Learning algorithms
Keras [20] is used on top of TensorFlow [21]. The fastText pre-trained word embeddings of size 300 [22] for the Uzbek language are applied. For the CNN model, we used a multi-channel CNN with 256 filters and three parallel channels with kernel sizes of 2, 3 and 5, and a dropout of 0.3. The output of the hidden layer is the concatenation of the max-pooling of the three channels. For the RNN, we use a bidirectional network of 100 GRUs. The output of the hidden layer is the concatenation of the average and max-pooling of the hidden states. For the combination of deep learning models, we stacked the CNN on top of the GRU. In all three cases, the final output is obtained through a sigmoid activation function [23] applied to the previous layer. In all cases, the Adam optimization algorithm [24], an extension of stochastic gradient descent, was chosen for training, with standard parameters: learning rate 𝛼 = 0.0001 and exponential decay rates 𝛽1 = 0.9 and 𝛽2 = 0.999. Binary cross-entropy was used as the loss function. The same steps, with slightly different parameters, were used in a work that provides guidance on using CNNs for sentiment classification [25]. Inspired by their example, which clearly illustrates the steps of deep learning based sentiment classification using a CNN, a visualisation of our steps is shown in Figure 4.
Figure 4: The illustration of steps taken in deep learning based sentiment classification using CNN,
inspired by [25].
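   For reference, a minimal Keras sketch of the multi-channel CNN described above is shown below. The vocabulary size, sequence length and the embedding matrix built from the fastText vectors are assumed to be prepared beforehand, and the exact layer arrangement is our reading of the description rather than the original code.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Assumed to be prepared beforehand: tokenizer vocabulary size, padded sequence length,
# and an embedding matrix filled with the 300-dimensional fastText vectors [22].
vocab_size, max_len, embed_dim = 20000, 100, 300
embedding_matrix = np.zeros((vocab_size, embed_dim))  # placeholder: fill with fastText vectors

inputs = layers.Input(shape=(max_len,))
emb = layers.Embedding(vocab_size, embed_dim,
                       embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
                       trainable=False)(inputs)

# Three parallel channels with 256 filters and kernel sizes 2, 3 and 5, as described above.
channels = []
for kernel_size in (2, 3, 5):
    conv = layers.Conv1D(256, kernel_size, activation="relu")(emb)
    channels.append(layers.GlobalMaxPooling1D()(conv))

hidden = layers.Dropout(0.3)(layers.Concatenate()(channels))
outputs = layers.Dense(1, activation="sigmoid")(hidden)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
```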


4.3. Evaluation metrics
Confusion matrices [26] are used to determine the gap between predicted and true values, as shown in Table 2. Precision, recall and F1-score are used as evaluation metrics for model performance.

Table 2
Confusion matrix
                         Classes        Positive                Negative
                        Positive    True Positive (TP)      False Negative (FN)
                        Negative    False Positive (FP)     True Negative (TN)

  The calculation of precision and recall is shown below:
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN} \qquad (6)$$
The F1-score, which takes into account both precision and recall, is calculated as follows:
$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (7)$$
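These metrics can be computed directly with Scikit-learn, as in the short sketch below, where model stands for any of the fitted classifiers from the previous sections.

```python
from sklearn.metrics import classification_report, confusion_matrix

y_pred = model.predict(X_test)            # `model` is any fitted classifier from above
print(confusion_matrix(y_test, y_pred))   # scikit-learn layout: [[TN, FP], [FN, TP]]
print(classification_report(y_test, y_pred, target_names=["negative", "positive"]))
```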
5. Results and Discussion
This section presents a detailed description of the results obtained by the evaluation process
using both machine learning and deep learning techniques applied to the collected novel
sentiment analysis dataset of restaurant reviews.

5.1. Experiment Results
The experiments described above were performed for all models, and the results can be seen in Table 3.

Table 3
Experiment results of sentiment analysis for all evaluation techniques, including models, their distinctive
parameters, as well as evaluation metrics, such as Precision (Prec.), Recall (Rec.), F1-score (F1), as well
as Accuracy (Acc.).
       Model name + Parameters                        Sentiment    Prec.    Rec.    F1     Acc.
       Logistic Regression based on word n-grams      Positive     88%      98%     93%    89%
                                                      Negative     88%      67%     74%
       Logistic Regression based on char. n-grams     Positive     87%      51%     92%    87%
                                                      Negative     83%      97%     64%
       Logistic Regression (word + char. n-grams)     Positive     95%      95%     92%    91%
                                                      Negative     90%      89%     90%
       SVM based on linear kernel                     Positive     88%      97%     92%    88%
                                                      Negative     84%      71%     80%
       RNN without word embeddings                    Positive     90%      95%     92%    88%
                                                      Negative     78%      64%     70%
       RNN with word embeddings                       Positive     90%      95%     93%    88%
                                                      Negative     80%      65%     72%
       CNN (multichannel)                             Positive     90%      96%     93%    89%
                                                      Negative     83%      64%     72%

   The Logistic Regression (LR) model based on word n-grams obtained a binary classification accuracy of 89% on the dataset, while the one based on character n-grams, despite its better handling of misspelt words, reached 87%. Combining word and character n-gram features improved the accuracy to 91%, which is the winner of this comparison. The Support Vector Machine with a linear kernel showed 88% accuracy overall. The recurrent neural network models without and with fastText embeddings achieved the same accuracy (88%). The convolutional neural network performed slightly worse (89%) than the best LR model. This is likely due to the lack of data, since neural-network models require large amounts of data for better performance.

5.2. Discussion and limitations
The amount of unstructured data in the restaurant domain keeps growing, which calls for high-accuracy sentiment analysis; this is especially the case for low-resource languages. Based on Google Maps review data (Tashkent locations) obtained by web crawling, the paper has evaluated several ML and DL methods. It was observed that the LR algorithm outperforms the others, which makes sense given that our dataset is relatively small. The research also has some theoretical and practical implications: we believe that gathering massive user reviews in this domain can help consumers make decisions in the best manner, for example with lower cost and faster speed. However, we also want to point out some limitations of this research. The dataset we gathered has an unbalanced number of positive and negative reviews, which can bias the results. Moreover, we used the review rating in the annotation process, although in reality consumers may sometimes give a high rating score while the textual polarity is negative, and vice versa.


6. Conclusion
In this paper, we have presented a novel dataset in the restaurant domain for the Uzbek language, with 8210 reviews annotated with positive or negative labels, crawled from Google Maps using the URLs of all relevant locations in the capital city Tashkent and labelled according to their star scores. We then applied a full set of pre-processing steps to the dataset, which contributed to increasing the accuracy of our baseline models. Further analysis of the collected dataset was presented through evaluations using both machine learning and deep learning techniques. The best accuracy result (91%) on the dataset was obtained using a logistic regression model with word and character n-grams.
   In the near future, we plan to extend this work by collecting more data, which would allow analysing restaurant reviews effectively at a practical level. Work is also underway to remove the evaluation bias of the training experiments by using cross-validation in the data splitting.


Acknowledgments
This work has partially received funding from ERDF/MICINN-AEI (SCANNER-UDC, PID2020-
113230RB-C21), and from Centro de Investigación de Galicia ”CITIC”, funded by Xunta de
Galicia and the European Union (ERDF - Galicia 2014-2020 Program), by grant ED431G 2019/01.
Elmurod Kuriyozov was funded for his PhD by El-Yurt-Umidi Foundation under the Cabinet of
Ministers of the Republic of Uzbekistan.


References
 [1] R. Anaya-Sánchez, S. Molinillo, R. Aguilar-Illescas, F. Liébana-Cabanillas, Improving
     travellers' trust in restaurant review sites, Tourism Review 74 (2019) 830–840.
     doi:10.1108/TR-02-2019-0065.
 [2] E. Marine-Roig, S. A. Clave, A method for analysing large-scale UGC data for tourism:
     Application to the case of catalonia, in: Information and Communication Technologies in
     Tourism 2015, Springer International Publishing, Cham, 2015, pp. 3–17.
 [3] J. Barnes, R. Klinger, S. S. i. Walde, Assessing state-of-the-art sentiment models on state-
     of-the-art sentiment datasets, arXiv preprint arXiv:1709.04219 (2017).
 [4] L. Zhang, S. Wang, B. Liu, Deep learning for sentiment analysis: A survey, Wiley
     Interdiscip. Rev. Data Min. Knowl. Discov. 8 (2018). URL: https://doi.org/10.1002/widm.1253.
     doi:10.1002/widm.1253.
 [5] M. Artetxe, I. Aldabe, R. Agerri, O. Perez-de Viñaspre, A. Soroa, Does corpus quality
     really matter for low-resource languages?, 2022. URL: https://arxiv.org/abs/2203.08111.
     doi:10.48550/ARXIV.2203.08111.
 [6] G. Matlatipov, Z. Vetulani, Representation of Uzbek morphology in prolog, in: Aspects of
     Natural Language Processing. Lecture Notes in Computer Science, volume 5070, Springer,
     2009.
 [7] S. Matlatipov, U. Tukeyev, M. Aripov, Towards the uzbek language endings as a language
     resource, in: M. Hernes, K. Wojtkiewicz, E. Szczerbicki (Eds.), Advances in Computational
     Collective Intelligence, Springer International Publishing, Cham, 2020, pp. 729–740.
 [8] K. Madatov, S. Bekchanov, J. Vičič, Automatic detection of stop words for texts in the
     uzbek language, 2022.
 [9] U. Tukeyev, A. Turganbayeva, B. Abduali, D. Rakhimova, D. Amirova, A. Karibayeva,
     Lexicon-free stemming for kazakh language information retrieval, in: 2018 IEEE 12th
     International Conference on Application of Information and Communication Technologies
     (AICT), 2018, pp. 1–4. doi:10.1109/ICAICT.2018.8747021.
[10] I. Rabbimov, S. Kobilov, I. Mporas, Opinion classification via word and emoji embedding
     models with lstm, in: International Conference on Speech and Computer, Springer, 2021,
     pp. 589–601.
[11] E. Kuriyozov, S. Matlatipov, Building a new sentiment analysis dataset for uzbek language
     and creating baseline models, in: Multidisciplinary Digital Publishing Institute Proceedings,
     volume 21, 2019, p. 37.
[12] E. Kuriyozov, Y. Doval, C. Gómez-Rodríguez, Cross-lingual word embeddings for Turkic
     languages, in: Proceedings of the 12th Language Resources and Evaluation Conference,
     European Language Resources Association, Marseille, France, 2020, pp. 4054–4062. URL:
     https://aclanthology.org/2020.lrec-1.499.
[13] I. Rabbimov, I. Mporas, V. Simaki, S. Kobilov, Investigating the effect of emoji in opinion
     classification of uzbek movie review comments, in: A. Karpov, R. Potapova (Eds.), Speech
     and Computer, Springer International Publishing, Cham, 2020, pp. 435–445.
[14] U. Salaev, E. Kuriyozov, C. Gómez-Rodríguez, Simreluz: Similarity and relatedness scores
     as a semantic evaluation dataset for uzbek language, arXiv preprint arXiv:2205.06072
     (2022).
[15] B. Mansurov, A. Mansurov, Uzbert: pretraining a bert model for uzbek, arXiv preprint
     arXiv:2108.09814 (2021).
[16] Y. Chandra, A. Jana, Sentiment analysis using machine learning and deep learning, in:
     2020 7th International Conference on Computing for Sustainable Global Development
     (INDIACom), 2020, pp. 1–4. doi:10.23919/INDIACom49435.2020.9083703.
[17] U. Salaev, E. Kuriyozov, C. Gómez-Rodríguez, A machine transliteration tool between
     uzbek alphabets, arXiv preprint arXiv:2205.09578 (2022).
[18] E. Christodoulou, J. Ma, G. S. Collins, E. W. Steyerberg, J. Y. Verbakel, B. Van Calster,
     A systematic review shows no performance benefit of machine learning over logistic
     regression for clinical prediction models, Journal of Clinical Epidemiology 110 (2019) 12–22.
     URL: https://www.sciencedirect.com/science/article/pii/S0895435618310813. doi:https:
     //doi.org/10.1016/j.jclinepi.2019.02.004.
[19] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel,
     P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher,
     M. Perrot, E. Duchesnay, Scikit-learn: Machine learning in Python, Journal of Machine
     Learning Research 12 (2011) 2825–2830.
[20] F. Chollet, et al., Keras, https://github.com/fchollet/keras, 2015.
[21] M. Abadi, et al., TensorFlow: Large-scale machine learning on heterogeneous systems,
     2015. URL: https://www.tensorflow.org/, software available from tensorflow.org.
[22] E. Grave, P. Bojanowski, P. Gupta, A. Joulin, T. Mikolov, Learning word vectors for 157
     languages, in: Proceedings of the International Conference on Language Resources and
     Evaluation (LREC 2018), 2018.
[23] A. C. Marreiros, J. Daunizeau, S. J. Kiebel, K. J. Friston, Population dynamics: variance and
     the sigmoid activation function, Neuroimage 42 (2008) 147–157.
[24] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint
     arXiv:1412.6980 (2014).
[25] Y. Zhang, B. Wallace, A sensitivity analysis of (and practitioners’ guide to) convolutional
     neural networks for sentence classification, arXiv preprint arXiv:1510.03820 (2015).
[26] M. Sokolova, G. Lapalme, A systematic analysis of performance measures for classi-
     fication tasks, Information Processing & Management 45 (2009) 427–437. URL: https:
     //www.sciencedirect.com/science/article/pii/S0306457309000259. doi:https://doi.org/
     10.1016/j.ipm.2009.03.002.