<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Exploring the Efects of Diferent Embedding Algorithms and Neural Architectures on Early Detection of Alzheimer's Disease</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Minni Jain</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rishabh Doshi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vibhu Sehra</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Divyashikha Sethia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Engineering, Delhi Technological University</institution>
          ,
          <addr-line>Delhi</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <fpage>376</fpage>
      <lpage>383</lpage>
      <abstract>
        <p>Alzheimer's Disease (AD) is an irrecoverable, progressive neurodegenerative disorder that deteriorates the cognitive and linguistic abilities of a person over time. Ample research has been done on the early detection of AD; it remains a challenging task. Doctors use the patient's history, laboratory tests, and change in behaviour to diagnose the disease. Natural Language Processing(NLP) techniques can help automate the detection of AD, as Language impairments accompany this disease. This work aims to analyze the efect of diferent Embedding models on the DementiaBank dataset in order to detect the disease. The work uses both Generic and domain-specific Word Embeddings on the three deep learning models - CNN, Bidirectional LSTM(BLSTM), and CNN+BLSTM. Results indicate that for a specific picture description task like cookie theft description, domain-specific Word Embeddings tend to work better. Lastly, it is discussed how results are afected by the use of diferent Embedding models (Fasttext, Word2Vec, GloVe).</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Alzheimer's Disease</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>Word Embeddings</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Cookie theft Description task</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        medical treatment is not very useful after the diagnosis
of the disease. Hence the early detection of Alzheimer’s
Alzheimer’s Disease(AD) is a brain disorder that slowly is still a challenge in medical science. There have been
damages the nerve connections in the Brain. It is the many attempts to diagnose the disease with the help
most common type of dementia and symptoms of AD of neuroimaging techniques, but non-imaging
techinclude communication dificulties, memory loss, poor niques are essential to personalize the treatment for
judgment, and changing mood and personality1. More a patient and monitor disease progression. Machine
than 50 million people are diagnosed with Alzheimer’s learning can detect the language deficits that often
acDisease every year 2. This challenge has grown sub- company dementia and therefore can be used for ealry
stantially over the years with the ageing of the pop- detection of Alzheimer’s Disease. Previously, many
ulation and the agerelated nature of many dementia- Natural Language Processing (NLP) techniques were
producing neurodegenerative diseases [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This num- proposed to help in early detection of Alzheimer’s
Disber of cases for Alzheimer’s Disease will continue to ease. These techniques treat the problem as a
supergrow in the coming years. There is no proven health vised learning problem. Previous research works like
care method to cure AD. Hence, it is necessary to de- [
        <xref ref-type="bibr" rid="ref2">2, 3, 4</xref>
        ] made use of transcripts obtained from
intervelop a new method to detect AD in a patient. Around views with patients to detect Alzheimer’s disease by
50 to 90% of dementia cases are left undiagnosed by using various machine learning and deep learning
alstandard clinical examinations [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Early detection of gorithms. Further, other studies like [
        <xref ref-type="bibr" rid="ref7">5, 6, 7</xref>
        ] used
acousAlzheimer’s Disease is still a massive issue in the cur- tic features obtained from the audio recordings of the
rent scenario. Alzheimer’s Disease progresses over the interviews for the classification task. Our study aims
years, and sometimes patients can have the disease to explore the efect of various Word Embeddings and
for 20 years before showing symptoms. At this point, neural architectures on transcripts obtained from the
cookie theft description task of DementiaBank.
      </p>
      <p>ISIC’2021: International Semantic Intelligence Conference, Feb 25–27, This paper makes use of both generic and
domain2021, New Delhi, Delhi , India specific Word Embeddings that are trained on the
tranD"osmhiin);nvijiabihnu@sedhtrua.a@c.ginm(aMil..cJoamin)(;Vd.oSsehhirrais);hdaibvhy2a6s@higkmhaa@il.dcotum.a(cR.i.n scripts. Out of all the presented models, the CNN +
(D. Sethia) Bidirectional LSTM models that make use of Fasttext
domain-specific Word Embeddings provides the best
CPWrEooUrckReshdoinpgs 1IhStpNh:/c1e6u1r3-w-0st.o7r3gtpC©Cso:2E/m0/2Umw0oRCnwospWLwyircie.oganhrsltekzfAo.sorthttrrhiogbisup/ptaiaoPplnezrr4ho.b0yecIniietmtseeardnueatirhtnisoorg-nsda.slUe((sCmeCCpEeBerYnUm4tiR.ti0tae)-.d/W1un0Sd_e.rosCirgrgena)tisve irnespuulttst.o Sthenetmenocdeeslso,batanidnetdhefrooumtptuhteistrtahnescprriepdtsictaerde
2https://www.alz.org/alzheimers-dementia/facts-figures label (Healthy or Alzheimer’s), no feature engineering
was involved in the process. Hence, this paper inves- had achieved an accuracy of 87.5% using the sparse
tigates how the task of detecting Alzheimer’s Disease vector representations of 4, 5 n-grams. The dataset
is afected by the use of various domain-specific and was equally divided by making use of 99 dementia
trangeneric Embeddings on diferent neural architectures. scripts and 99 control transcripts from the dataset.
Re</p>
      <p>
        The rest of the paper comprises section 2, which cently, [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] proposed the use of 3 diferent deep
learnconsists of the Related works followed by our proposed ing algorithms- 2D-CNN, LSTM, and 2D CNN - RNN
work and experimental setup in sections 3 and 4, re- models by making use of the complete Dementia bank
spectively. Then we present our results and discussion dataset which consists of 1017 Alzheimer’s transcripts
in sections 5 and 6, respectively, which is followed by and 243 control transcripts. They used each utterance
the conclusion and future work in section 7. as a separate data sample, therefore obtaining 14362
utterance samples. They achieve the best accuracy of
91.1% using the CNN-RNN model by using Word
Em2. Related Work beddings along with POS tagged data to the classifier.
[3] used a Hierarchical attention network (HAN) on
This section discusses the previous research, done in the transcripts obtained from DementiaBank Dataset.
the field of Alzheimer’s detection using the various They made use of Word Embeddings along with
demomachine learning and deep learning techniques. graphic features for the prediction task obtaining an
accuracy of 86.9%. [11] proposed a model that
com2.1. Machine Learning Techniques bined bidirectional hierarchical recurrent neural
network with an attention mechanism for dementia
detection. [
        <xref ref-type="bibr" rid="ref14">12</xref>
        ] showed that fine tuned BERT model
outperformed the models that used hand crafted feature
engineering. Table.4 summarizes the approach used
by previous research works.
      </p>
      <sec id="sec-1-1">
        <title>Existing research found on early detection of Alzheim</title>
        <p>
          er’s Disease using Natural language processing made
use of various machine learning techniques. [
          <xref ref-type="bibr" rid="ref4">8</xref>
          ] used
three diferent machine learning algorithms - namely
Decision trees, Support Vector Machine, and K-Nearest
neighbours on a sample of 80 conversations to achieve
the best accuracy of 79.5% using their Decision tree 3. Proposed Work
model. [9]proposed a model using Support Vector
machine making use of 14 lexical features, nine syntac- 3.1. Preprocessing
tic features, and n-grams extracted from the Pitt
Corpus in Dementia Bank Dataset by using 99 dementia This work uses the transcripts in the Dementia Bank
transcripts and 99 control transcripts from the dataset. dataset [
          <xref ref-type="bibr" rid="ref15">13</xref>
          ], which are available in the form of CHAT
They used Area Under Curve (AUC) metric to test the transcription [
          <xref ref-type="bibr" rid="ref16">14</xref>
          ]. The transcripts are passed through
performance of the algorithm achieving a maximum a series of steps as given below and illustrated in Fig. 1.
AUC score of 0.93 by using the top 1000 features ob- PyLangAcq library [
          <xref ref-type="bibr" rid="ref17">15</xref>
          ], which is a powerful library
tained using a Leave Pair Out Cross-Validation (LPOCV) that can handle CHAT data, reads the transcripts. We
crossvalidation technique. then convert all obtained utterances to lower text and
        </p>
        <p>
          Further, [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] used the DementiaBank dataset to ex- remove all punctuations. We use 99 transcripts from
tract the acoustic measures and semantic measures to each set (Dementia and Control) from the Cookie Theft
predict the clinical scores of the patients by making task as suggested by [9, 10] where they made use of an
use of the bivariate dynamic Bayes network. [5] ex- equal number of dementia and control patients.
tracted acoustic features from the DementiaBank
dataset and created a regression model to predict clinical 3.2. Word Embeddings used for early
scores (MMSE) used for dementia prediction. [6] made detection of Alzheimer’s Disease
use of acoustic features on various Machine Learning
models like Logistic Regression, KNN, Naive Bayes, This work uses three types of Word Embeddings-
WorDummy classifier, Random Forests, and achieved the d2Vec [
          <xref ref-type="bibr" rid="ref18">16</xref>
          ], Glove [
          <xref ref-type="bibr" rid="ref19">17</xref>
          ] and, Fasttext [18]. These
embest accuracy of 78% with Logistic regression classi- beddings are chosen because they are widely used and
ifer. have diferent architechtures which may tell us the best
way to proceed with the problem in hand. All the
2.2. Deep Learning Techniques Word Embeddings have a 300-dimensional vector
representation for each Word. For each of the types
men[10] had made use of Deep-Deep neural networks and tioned above, two-Word Embeddings are used,
Domainspecific and generic Word Embeddings. All the tran- the 1D Convolution layer, ReLU [22] as the activation
scripts from DementiaBank are used to create the do- function for the Dense layers, and Softmax for
classimain specific Word Embeddings stated above. The max- ifcation.
imum size of a transcript was 498 words. Hence, we
keep the size of the Word Embedding as (500,300). 3.3.2. Bi-Directional LSTM Model
3.2.1. Domain-Specific Word Embeddings
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>Domain-Specific Word Embeddings are Embeddings</title>
        <p>that are trained on a specific corpus that contains data
from the interested domain. They are highly efective
for a specific domain but require extra training time.</p>
        <p>
          Gensim library [19] is used to create Word2vec [
          <xref ref-type="bibr" rid="ref18">16</xref>
          ]
and Fasttext [18] Word Embeddings from the corpus.
        </p>
        <p>
          Glove3 library is used to create the GloVe Embeddings 3.3.3. Hybrid CNN + Bi-Directional LSTM Model
[
          <xref ref-type="bibr" rid="ref19">17</xref>
          ].
        </p>
        <p>The model has a series of the Bidirectional LSTM layer
and Dropout [23] layer; further layers consist of a Dense
network for classification. The Dropout layers are
added to prevent overfitting in the model and dropout rate
is kept at 30%. All the layers use default Tanh
activation except the last one, which uses Softmax for
classification.</p>
      </sec>
      <sec id="sec-1-3">
        <title>This model is a combination of the above two mod</title>
        <p>
          3.2.2. Generic Word Embeddings els. We pass the Embeddings through a series of
1Dconvolutional layers followed by a MaxPooling layer,
Generic Word Embeddings are Embeddings that are with two bidirectional LSTM layers stacked over the
trained on vast generic corpora. Hence these Embed- Maxpool layer. A dense network follows this. Fig. 2.
dings reduce training time and often give outstanding illustrates the proposed model. The Activations used
results. The work trains the pretrained Glove [
          <xref ref-type="bibr" rid="ref19">17</xref>
          ] Em- for CNN and bidirectional LSTM is Tanh, while we use
beddings on 6 billion words. It trains Word2vec Em- ReLU [22] activation for dense layers followed by a
bedding, which includes word vectors for a vocabulary SoftMax function for classification.
of 3 million words and phrases on roughly 100 billion
words from a Google News dataset. It also trains Fast- 3.4. Training Details
text [18] Embedding, which contains vectors for 1
million words, on Wikipedia 2017, UMBC web base
corpus, and statmt.org news dataset having a total of 16
billion tokens.
        </p>
      </sec>
      <sec id="sec-1-4">
        <title>The above-stated models are trained using the Adam</title>
        <p>Optimizer [24] for 30 epochs, each using Binary
crossentropy as the loss function. L2 regularization [21] is
applied in each layer has  = 10−5</p>
        <sec id="sec-1-4-1">
          <title>3.3. Deep Learning Models Used</title>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4. Experimental Details</title>
      <p>This section explains the deep learning models that
are used for the classification of control and dementia
patients. Keras functional API [20] is used to create
all the deep learning models explained below. To
address the concern of overfitting, we use L2 regularizer
[21] as the kernel initializer. Due to the small size of
the dataset, the research makes use of 10-fold
crossvalidation on each model. The model atempts to
capture the language impairments that are often seen in
the ealry phases of dementia. The Annexure provides
the details of the model architecture.</p>
      <sec id="sec-2-1">
        <title>This work uses Pitt Corpus, which is the largest En</title>
        <p>
          glish dataset available in DementiaBank [
          <xref ref-type="bibr" rid="ref15">13</xref>
          ].
DementiaBank is a part of the TalkBank project initiated by
Carnegie Mellon University. The National Institute of
Aging funds it. The project encourages research for
human communication. It uses the Codes for the
Human Analysis of Transcripts (CHAT) system [
          <xref ref-type="bibr" rid="ref16">14</xref>
          ],
which provides automatic analysis and testing. The CHAT
system is commonly used in many datasets to
provide uniformity and easy usage. Various participants
from each group (Control and dementia) visited
annu3.3.1. CNN Model ally for the interview. Pitt Corpus [
          <xref ref-type="bibr" rid="ref15">13</xref>
          ] is a collection
In this work, the CNN model consists of a combina- of transcripts and audio files that were collected as a
tion of 1DConvolution layers with an increasing num- part of a longitudinal study conducted by Alzheimer’s
ber of kernels followed by MaxPool layers. A Dense and Related dementia at the University of Pittsburgh
network follows this. We use the Tanh activation for School of Medicine. This dataset contains interviews
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>3https://github.com/JonathanRaiman/glove</title>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Results</title>
      <sec id="sec-3-1">
        <title>All the three neural models - 1D CNN, Bidirectional</title>
        <p>LSTM(BLSTM), and 1D CNN + Bidirectional LSTM
(CThe work uses the Cookie theft part of the corpus as
it contains the maximum number of participants, and
previous researchers have used it.</p>
        <p>• Cookie Theft: Patients see an image provided The paper aims to explore how the diferent Word
Emby the Boston Diagnostic Aphasia Examination, bedding models and types of Embeddings perform on
and then the patients (Control and Dementia) diferent neural models. It uses both the domain
sperecall the events taking place in the image (Fig. 3). cific and the generic Word Embeddings to classify the
transcripts. However, since the domain-specific Word
• Fluency: This task is done only for dementia Embeddings have been trained on the same corpus
bepatients where they respond to a word Fluency ing used, it generally provides better results. As the
task. cookie theft data comprises of explaining a particular
• Recall: The Dementia Patients undergo a story image, the vocabulary found in the transcripts is
limrecall test. ited, and as a result, it is easier to understand the
relationship between words. Using Domain-specific,
Fast• Sentence: The Dementia Patients perform a Sen- text, and Word2vec provides better results than their
tence construction task. Generic counterparts. Results indicate that Glove
Embeddings provide similar results on both types of Word
Embeddings.</p>
        <p>If we had a combination of diferent tasks (not only
cookie theft) having a larger corpus and vocabulary,
Generic Embedding might perform better.</p>
        <p>Results indicate that Word2vec has the lowest
accuracy amongst the three Embedding models. This is
possible because domain-specific Word2vec requires
a larger corpus to develop the semantic relation as it
only captures local word relations. The domain
specific Fasttext Embedding gives the best result since it
does not require a large corpus as it breaks each word
into character n-grams, thereby increasing the
vocabulary size.</p>
        <p>Results also indicate that the hybrid CNN + BLSTM
model achieves the highest accuracy of 90.6%. The
CNN + BLSTM model works better than any single use
of either of the model, because:</p>
      </sec>
      <sec id="sec-3-2">
        <title>Compared to similar previous works like [2] and</title>
        <p>
          [3] use a Word Embeddings layer that is trained along
with the neural architecture, this study uses three Word
Embedding models and from each Embedding model,
a domainspecific and pre-trained Embedding is
created to identify how diferent Embedding models and
• CNN model captures the short-term dependen- the type of data on which the Embeddings are trained
cies in text. afects the performance of detecting Alzheimer’s
Disease. [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] breaks down each transcript into utterances
• LSTM model captures long term dependencies and considers them as separate data samples thereby
in the text. Bidirectional LSTM is better than
the LSTM as it trains on two LSTM cells instead
of one cell in a single input sequence.
        </p>
        <p>Accuracy</p>
        <p>Precision</p>
        <p>Recall</p>
        <p>F1-score
creating 14362 samples as compared to our 198
samples which are complete transcripts of a patient.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>7. Conclusion and Future Work</title>
      <sec id="sec-4-1">
        <title>This study employs three Word Embedding algorithms</title>
        <p>on three diferent Neural Models that make use of CNN
and Bidirectional LSTM for Alzheimer’s Disease
Classification. For each word embedding algorithm 2
different types of word embeddings were used - Domain
Specific and Generic Embeddings, where it was found
that Domain Specific word embeddings performed
better than Generic Word Embeddings. This work was
limited by the small amount of dataset available. In
future, we may gather a larger dataset that may help
in creation of a more generalized embedding. Further,
we can also extend the dataset for people speaking
different languages.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>A. Appendix</title>
      <sec id="sec-5-1">
        <title>A.1. Neural Model Details</title>
        <sec id="sec-5-1-1">
          <title>We used the following neural models. The batch size was kept at 10. In the last dense layer of each model softmax activation function was used. Other dense layers use a rectified linear activation function.</title>
        </sec>
        <sec id="sec-5-1-2">
          <title>Each CNN-1D layer in brackets represents(no-of-filters , kernel-size)</title>
          <p>CNN-1D(8,3) → CNN-1D(10,3) → MaxPool-1D(3) →
CNN-1D(12,3) → CNN-1D(14,3) → MaxPool-1D(3)
→ Flatten() → Dense(20,Relu) → Dense(10,Relu) →
Dense(2,Softmax)</p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. W.</given-names>
            <surname>Bondi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. C.</given-names>
            <surname>Edmonds</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. P.</given-names>
            <surname>Salmon</surname>
          </string-name>
          ,
          <article-title>Alzheimer's disease: past, present, and future</article-title>
          ,
          <source>Journal of the International Neuropsychological Society</source>
          <volume>23</volume>
          (
          <year>2017</year>
          )
          <fpage>818</fpage>
          -
          <lpage>831</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Karlekar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Niu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bansal</surname>
          </string-name>
          ,
          <article-title>Detecting linguistic characteristics of Alzheimer's dementia by interpreting neural models</article-title>
          ,
          <source>arXiv preprint A.1</source>
          .2. BLSTM arXiv:
          <year>1804</year>
          .
          <volume>06440</volume>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Each</surname>
            <given-names>LSTM</given-names>
          </string-name>
          <article-title>layer in brackets represents(no-of-lstm-</article-title>
          <string-name>
            <surname>cells-</surname>
            [3]
            <given-names>W.</given-names>
          </string-name>
          <string-name>
            <surname>Kong</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Jang</surname>
            , G. Carenini,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Field</surname>
          </string-name>
          ,
          <article-title>A neuin-that-layer) ral model for predicting dementia from language, Bidir(LSTM(16)) → Dropout(0.3) → Bidir(LSTM(8)) in: Machine Learning for Healthcare Conference, → Bidir(LSTM(4)) → Bidir(LSTM(2</article-title>
          )) →
          <source>Dropout(0.2)</source>
          <year>2019</year>
          , pp.
          <fpage>270</fpage>
          -
          <lpage>286</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <article-title>→ Dense(8) → Dense(2,Softmax) [4] SP</article-title>
          .reOd.iOctrivime aLyien,gJ.uSis.-tiMc .
          <source>FWea.tKu.rJe.sGfoolrdeAn,lzLheeaimrneinr'gs A.1</source>
          .3. CNN+
          <article-title>BLSTM Disease and related Dementias using Verbal Utterances</article-title>
          ,
          <source>in: Proceedings Workshop on ComCNN-1D</source>
          (
          <issue>8</issue>
          ,3) → CNN-1D(
          <issue>10</issue>
          ,3) →
          <article-title>MaxPool-1D(3) → putational Linguistics and Clinical Psychology: CNN-1D(16,3) → CNN-1D(20,3) → MaxPool-1D(3) From Linguistic Signal to Clinical Reality,</article-title>
          <year>2014</year>
          , →
          <article-title>Bidir(LSTM(8)) → BatchNorm() → Bidir(LSTM(</article-title>
          <year>16</year>
          )) pp.
          <fpage>78</fpage>
          -
          <lpage>87</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <source>→ Dense(64</source>
          ,
          <string-name>
            <surname>Relu</surname>
            <given-names>)</given-names>
          </string-name>
          →
          <source>Dense(32</source>
          ,
          <string-name>
            <surname>Relu</surname>
            <given-names>)</given-names>
          </string-name>
          →
          <source>Dense (2</source>
          ,Soft- [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Al-Hameed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Benaissa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Christensen</surname>
          </string-name>
          ,
          <article-title>Simmax) ple and robust audio-based detection of biomark-</article-title>
          [18]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          , T. Mikolov,
          <article-title>ers for alzheimer's disease</article-title>
          ,
          <source>in: 7th Workshop Enriching Word Vectors with Subword Informaon Speech and Language Processing for Assistive tion, arXiv:1607.04606</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Technologies</surname>
          </string-name>
          (SLPAT),
          <year>2016</year>
          , pp.
          <fpage>32</fpage>
          -
          <lpage>36</lpage>
          . [19]
          <string-name>
            <given-names>R.</given-names>
            <surname>Rehurek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sojka</surname>
          </string-name>
          , Software framework for [6]
          <string-name>
            <given-names>V.</given-names>
            <surname>Masrani</surname>
          </string-name>
          ,
          <article-title>Detecting dementia from written topic modelling with large corpora, in: Proceedand spoken language</article-title>
          ,
          <source>Ph.D. thesis</source>
          , University of ings International Workshop on New Challenges British Columbia,
          <year>2018</year>
          . for NLP Frameworks,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Yancheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rudzicz</surname>
          </string-name>
          , Vector-space topic mod- [20]
          <string-name>
            <surname>Keras</surname>
          </string-name>
          ,
          <article-title>Deep learning for humans, els for detecting Alzheimer's disease</article-title>
          , in: Pro- https://github.com/fchollet/keras,
          <year>2015</year>
          .
          <article-title>Last ceedings Annual Meeting of the Association for accessed on Nov 2019</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Computational</given-names>
            <surname>Linguistics</surname>
          </string-name>
          ,
          <year>2016</year>
          , pp.
          <fpage>2337</fpage>
          -
          <lpage>2346</lpage>
          . [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          ,
          <source>Deep learning in neural net</source>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Guinn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Habash</surname>
          </string-name>
          ,
          <article-title>Language analysis of works: An overview</article-title>
          ,
          <source>Neural Networks</source>
          <volume>61</volume>
          (
          <year>2015</year>
          )
          <article-title>speakers with dementia of the Alzheimer's type</article-title>
          ,
          <fpage>85</fpage>
          -
          <lpage>117</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <source>in: AAAI Fall Symposium Series</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>8</fpage>
          -
          <lpage>13</lpage>
          . [22]
          <string-name>
            <given-names>A. F.</given-names>
            <surname>Agarap</surname>
          </string-name>
          ,
          <source>Deep Learning using Rectified Lin</source>
          [9]
          <string-name>
            <given-names>S. O.</given-names>
            <surname>Orimaye</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. S.-M. Wong</surname>
            ,
            <given-names>K. J.</given-names>
          </string-name>
          <string-name>
            <surname>Golden</surname>
          </string-name>
          , ear Units (ReLU), arXiv:
          <year>1803</year>
          .
          <volume>08375</volume>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>C. P.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. N.</given-names>
            <surname>Soyiri</surname>
          </string-name>
          , Predicting probable [23]
          <string-name>
            <given-names>N.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          , et al.,
          <string-name>
            <surname>Dropout</surname>
            :
            <given-names>A Simple</given-names>
          </string-name>
          <string-name>
            <surname>Way</surname>
          </string-name>
          <article-title>Alzheimer's disease using linguistic deficits and to Prevent Neural Networks from Overfitting, biomarkers</article-title>
          ,
          <source>BMC Bioinformatics 18</source>
          (
          <year>2017</year>
          ).
          <source>Journal of Machine Learning Research</source>
          <volume>15</volume>
          (
          <year>2014</year>
          ) [10]
          <string-name>
            <given-names>S. O.</given-names>
            <surname>Orimaye</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. S.-M. Wong</surname>
            ,
            <given-names>C. P.</given-names>
          </string-name>
          <string-name>
            <surname>Wong</surname>
          </string-name>
          ,
          <year>Deep 1929</year>
          -
          <year>1958</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <article-title>language space neural network for classifying</article-title>
          [24]
          <string-name>
            <given-names>D. P.</given-names>
            <surname>Kingma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ba</surname>
          </string-name>
          ,
          <article-title>A Method for Stochastic Opmild cognitive impairment and alzheimer-type timization</article-title>
          , ?arXiv:
          <fpage>1412</fpage>
          .6980 (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          dementia,
          <source>PloS one 13</source>
          (
          <year>2018</year>
          ). [25]
          <string-name>
            <given-names>L.</given-names>
            <surname>Hebert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Scherr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Bienias</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Ben</surname>
          </string-name>
          [11]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mirheidari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Reuber</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . Venneri, nett,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Evans</surname>
          </string-name>
          ,
          <article-title>Alzheimer disease in the US D</article-title>
          .
          <string-name>
            <surname>Blackburn</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Christensen</surname>
          </string-name>
          ,
          <article-title>Automatic hi- population: prevalence estimates using the 2000 erarchical attention neural network for detect- census</article-title>
          ,
          <source>Arch Neurol</source>
          <volume>60</volume>
          (
          <year>2003</year>
          )
          <fpage>1119</fpage>
          -
          <lpage>1122</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <source>ing ad</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>4105</fpage>
          -
          <lpage>4109</lpage>
          . doi:
          <volume>10</volume>
          .21437/ Interspeech.2019-
          <volume>1799</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Balagopalan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Eyre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rudzicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Novikova</surname>
          </string-name>
          ,
          <article-title>To bert or not to bert: Comparing speech and language-based approaches for alzheimer's disease detection</article-title>
          ,
          <year>2020</year>
          . arXiv:
          <year>2008</year>
          .01551.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Becker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Boller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. L.</given-names>
            <surname>Lopez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Saxton</surname>
          </string-name>
          ,
          <string-name>
            <surname>K.</surname>
          </string-name>
          <article-title>McGonigle, The natural history of Alzheimer's disease. Description of study cohort and accuracy of diagnosis</article-title>
          .,
          <source>Archives of Neurology</source>
          <volume>51</volume>
          (
          <year>1994</year>
          )
          <fpage>585</fpage>
          -
          <lpage>594</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B.</given-names>
            <surname>Macwhinney</surname>
          </string-name>
          ,
          <article-title>The CHILDES project: tools for analyzing talk</article-title>
          ,
          <source>Child Language Teaching and Therapy</source>
          <volume>8</volume>
          (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Burkholder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. B.</given-names>
            <surname>Flinn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. R.</given-names>
            <surname>Coppess</surname>
          </string-name>
          ,
          <article-title>Working with CHAT transcripts in Python</article-title>
          ,
          <source>Technical report TR-2016-02</source>
          ,
          <string-name>
            <surname>Technical</surname>
            <given-names>Report</given-names>
          </string-name>
          , Department of Computer Science, University of Chicago,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , et al.,
          <source>Eficient Estimation of Word Representations in Vector Space, in: Proceedings International Conference on Learning Representations</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Pennington</surname>
            , Jefrey,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>C. D.</given-names>
          </string-name>
          <string-name>
            <surname>Manning</surname>
          </string-name>
          , Glove:
          <article-title>Global vectors for word representation</article-title>
          ,
          <source>in: Proceedings International Conference Empiricial Methods in Natural Language Processing</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>