=Paper=
{{Paper
|id=Vol-3026/paper14
|storemode=property
|title=Ensemble Learning in Detecting ADHD Children by Utilizing the Non-Linear Features of EEG Signal
|pdfUrl=https://ceur-ws.org/Vol-3026/paper14.pdf
|volume=Vol-3026
|authors=Pham Thi Viet Huong,Nguyen Anh Tu,Tran Anh Vu
}}
==Ensemble Learning in Detecting ADHD Children by Utilizing the Non-Linear Features of EEG Signal==
Ensemble learning in detecting ADHD children by
utilizing the non-linear features of EEG signal*
Pham Thi Viet Huong1, Nguyen Anh Tu2, Tran Anh Vu2**
1 International School, Vietnam National University, Hanoi, Vietnam
2 Hanoi University of Science and Technology, Hanoi, Vietnam
**vu.trananh@hust.edu.vn
Abstract. Electroencephalogram (EEG) plays a critical role in the assessment
of Attention-Deficit Hyperactivity Disorder (ADHD) in patients. In this paper,
we propose a novel method that uses the non-linear features of the EEG signal
to discriminate children with ADHD from a healthy group. Since most previous
research has focused on linear features of EEG, this paper opens a new perspective
on analyzing EEG for the task of detecting ADHD in humans. Our dataset was
published in 2020 on ieee-dataport.org. We use Fractal Dimensions (FD) as
non-linear features, combined with different feature selection methods. Finally, we
use ensemble learning as the classifier to discriminate children with ADHD from the
healthy group. Our results support the methodology, as it achieves higher accuracy
than state-of-the-art studies.
Keywords: Attention-Deficit Hyperactivity Disorder (ADHD), Electroencephalogram
(EEG), Fractal Dimension (FD), Ensemble learning.
1 Introduction
Attention Deficit Hyperactivity Disorder (ADHD) is a mental disorder characterized
by an ongoing pattern of inattention and/or hyperactivity-impulsivity that interferes
with functioning or development [1]. According to recent studies, around 5% of
children are affected by ADHD, with boys having a higher risk than girls [1] [2].
Normally, ADHD symptoms appear at preschool age and become critical at primary
school age. The main problems of ADHD in children are a lack of concentration and
weak regulation of their behavior, so they do not react appropriately to the
surrounding environment [3] [4] [5]. Therefore, early diagnosis of ADHD is extremely
important in preventing later complications such as negative impacts on children's
social interactions.
* Copyright © by the paper’s authors. Use permitted under Creative Commons License Attribu-
tion 4.0 International (CC BY 4.0). In: N. D. Vo, O.-J. Lee, K.-H. N. Bui, H. G. Lim, H.-J.
Jeon, P.-M. Nguyen, B. Q. Tuyen, J.-T. Kim, J. J. Jung, T. A. Vo (eds.): Proceedings of the
2nd International Conference on Human-centered Artificial Intelligence (Computing4Human
2021), Da Nang, Viet Nam, 28-October-2021, published at http://ceur-ws.org
** Corresponding Author.
130 Huong et al.
Usually, the diagnosis of ADHD is mainly based on the Diagnostic and Statistical
Manual of Mental Disorders (DSM) or the International Classification of Diseases
(ICD) [1] [6]. This diagnosis depends heavily on how a parent or teacher perceives
the psychologist's questions and on the truthfulness of their answers. To minimize
this subjective factor, objective ways have been developed to identify children with
symptoms of ADHD. One way is to use the electroencephalogram (EEG) in the diagnosis
[7] [8] [9] [10], which is a recording of brain activity. To obtain an EEG, small
sensors are attached to the scalp to pick up the electrical signals produced when
brain cells send messages to each other.
EEG processing has become one of the most widely used techniques for ADHD diagnosis
due to its accessibility and low cost. Researchers have developed several methods
that use EEG to differentiate the ADHD group from the healthy group. The very first
research on developing a rationale for the diagnosis of ADHD was carried out over
15 years in [11]. The author found that in people with ADHD, theta activity
increased and beta power was dramatically reduced. In [12], 30 ADHD children and 30
healthy children were studied, and the results showed that the ADHD group had greater
absolute power in delta and theta oscillations in all regions of the brain. ADHD
adults and healthy groups were classified using a support vector machine based on
power spectra in [13].
The most commonly used machine learning algorithms for classification of ADHD
patterns using EEG are Logistic Regression [14], Linear Discriminant Analysis (LDA)
[15], K-Nearest Neighbor [16], Support Vector Machine (SVM) [17], Independent
Component Analysis (ICA) [18], Fast Fourier and Wavelet Transform [19] and Neural
Networks [20] [21]. Deep learning methods are also used to perform the task, for
example, convolutional neural networks (CNN) [22] [23].
Non-linear features of the EEG signal, such as entropy and the Lyapunov exponent,
were exploited to differentiate the ADHD group in [24]. To improve the classification
results, the double input symmetrical relevance (DISR) and minimum Redundancy Maximum
Relevance (mRMR) methods were used to choose the best features to feed into the
neural network. The extracted non-linear features revealed that non-linear indices
were greater in different regions of the brain of the ADHD children compared to
healthy children. As expected, ADHD children were slower and less accurate in
cognitive tasks.
Our proposed method also uses the non-linear features of the EEG signal. We use
fractal dimension (FD) based metrics, namely the Higuchi, Katz and Petrosian fractal
dimensions, to describe the chaotic pattern in the EEG signal. Instead of using
existing Matlab tools to select the features, such as DISR and mRMR [24], we apply
different methods: the filter method, Correlation-based Feature Selection (CFS), the
Lasso method, the logistic method, the wrapper method and recursive feature
elimination (RFE), which dig more into the physics of the EEG signal. After feature
selection, we use ensemble learning to perform the classification. Our results are
better than those of current research with the same purpose.
Our paper is organized as follows. Section 1 is the introduction. Section 2 presents
the dataset and methodology we use to perform the task. Section 3 shows the
experiment and results. Section 4 concludes the paper.
2 Data and Methodology
2.1 Dataset
Our dataset is taken from ieee-dataport.org, which is IEEE's dataset storage and
dataset search platform. The dataset contains the EEG signals of 61 children with
ADHD and 60 healthy controls (boys and girls, aged 7-12). The ADHD group was
diagnosed using DSM-IV criteria by a qualified psychiatrist, and this group was given
Ritalin for up to 6 months. DSM-IV is the official guide of the American Psychiatric
Association, which is intended to offer a framework for categorizing disorders and
defining diagnostic criteria for the disorders listed. None of the children in the
control group had a history of psychiatric disorders, epilepsy, or any report of
high-risk behavior. EEG recording was performed according to the 10-20 standard with
19 channels (Fz, Cz, Pz, C3, T3, C4, T4, Fp1, Fp2, F3, F4, F7, F8, P3, P4, T5, T6,
O1, O2) at a sampling frequency of 128 Hz. The A1 and A2 electrodes, located on the
earlobes, were the references.
The EEG recording protocol was based on visual attention tasks, since visual
attention is one of the impairments in ADHD children. A series of cartoon character
pictures was shown to the children, and they were instructed to count the characters.
The number of characters in each image was chosen at random between 5 and 16, and the
images were large enough for the children to see and count easily. To maintain
continuous stimulation during the signal recording, each image was presented
immediately and without interruption after the child's response. As a result, the
length of the EEG recording during this cognitive visual task was determined by the
child's performance (i.e. response speed).
2.2 Methodology
Data preprocessing
EEG recording was performed with 19 channels at a sampling frequency of 128 Hz. The
obtained signal was in the range 0-64 Hz, as shown in Fig. 1. We process the signal
with a Fast Fourier Transform (FFT) filter to remove the noise at 50 Hz and obtain
the clean signal shown in Fig. 2.
Fig. 1. Original EEG signal at Fp1
Fig. 2. Processed EEG signal at Fp1
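The frequency-domain removal of the 50 Hz component described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 1 Hz half-width of the removed band and the synthetic test signal are assumptions, and a direct O(N^2) DFT is used only to keep the sketch dependency-free (an FFT routine would normally take its place).

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) discrete Fourier transform (stand-in for an FFT routine)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            for f in range(n)]

def idft(spectrum):
    """Inverse DFT; the real part is returned, which is exact for real signals."""
    n = len(spectrum)
    return [sum(spectrum[f] * cmath.exp(2j * math.pi * f * t / n)
                for f in range(n)).real / n
            for t in range(n)]

def remove_band(signal, fs, f0=50.0, half_width=1.0):
    """Zero every frequency bin within half_width Hz of f0 (assumed band width)."""
    n = len(signal)
    spectrum = dft(signal)
    for b in range(n):
        freq = min(b, n - b) * fs / n  # frequency in Hz of bin b and of its mirror bin
        if abs(freq - f0) <= half_width:
            spectrum[b] = 0
    return idft(spectrum)

# Synthetic one-second example at the 128 Hz sampling rate used in the dataset:
# a 10 Hz "EEG-like" sine contaminated by 50 Hz interference.
fs = 128
t = [i / fs for i in range(fs)]
eeg_like = [math.sin(2 * math.pi * 10 * ti) for ti in t]
noisy = [s + 0.5 * math.sin(2 * math.pi * 50 * ti) for s, ti in zip(eeg_like, t)]
cleaned = remove_band(noisy, fs)
max_error = max(abs(c - s) for c, s in zip(cleaned, eeg_like))
```

Zeroing bins is the simplest form of FFT filtering; it leaves components outside the removed band untouched, which is why the 10 Hz component in the example is recovered.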
Feature extraction
We use the fractal dimension (FD), which is non-linear and represents the chaotic
pattern of the EEG signal. FD is a ratio that gives a statistical index of
complexity, describing how the detail in a pattern changes with the scale at which it
is measured [25] [26]. In this paper, we calculate three FDs: Higuchi, Katz and
Petrosian. All these features are computed for each of the 19 channels.
Katz Fractal Dimension is calculated as follows [25]:
<math>FD = \frac{\ln(N-1)}{\ln(N-1) + \ln\left(\frac{d}{L}\right)}</math> (1)
where L is the sum of distances between consecutive points, N is the length of the
data sequence and d is the diameter of the data sequence (the largest distance
between the first point and any other point).
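As an illustration of Eq. (1), the following sketch computes the Katz FD of a one-dimensional sequence. It assumes the common amplitude-only variant, in which distances are absolute amplitude differences between samples; the function name is hypothetical and the paper does not state which distance definition was used.

```python
import math

def katz_fd(x):
    """Katz fractal dimension, amplitude-only variant (an assumption of this sketch)."""
    n_steps = len(x) - 1  # N - 1 in Eq. (1)
    # L: sum of distances between consecutive points
    total_length = sum(abs(x[i + 1] - x[i]) for i in range(n_steps))
    # d: diameter, the largest distance between the first point and any other point
    diameter = max(abs(v - x[0]) for v in x[1:])
    return math.log(n_steps) / (math.log(n_steps) + math.log(diameter / total_length))

fd_line = katz_fd(list(range(10)))        # monotone ramp: d equals L
fd_rough = katz_fd([0, 1, 0, 2, 0, 3])    # oscillating sequence: L exceeds d
```

A monotone ramp has d = L and therefore FD = 1, while an oscillating sequence has L > d and FD > 1. A constant sequence (L = 0) is not handled by this sketch.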
Higuchi Fractal Dimension takes a time series x(1), x(2), …, x(N) as input, from
which new time series are constructed [26]:
<math>x_m^k = \left\{ x(m),\ x(m+k),\ x(m+2k),\ \ldots,\ x\left(m + \left\lfloor \frac{N-m}{k} \right\rfloor k\right) \right\}</math> (2)
for m = 1, 2, 3, …, k,
where m is the first sample and ⌊·⌋ denotes the integer part. The length
<math>L_m(k)</math> of <math>x_m^k</math> is given by
<math>L_m(k) = \frac{\sum_{i=1}^{\lfloor (N-m)/k \rfloor} \left| x(m+ik) - x(m+(i-1)k) \right| (N-1)}{\left\lfloor \frac{N-m}{k} \right\rfloor k}</math> (3)
<math>d[x_m(i), x_m(j)] = \max_{k=1,2,\ldots,m} \left( \left| s(i+k-1) - s(j+k-1) \right| \right)</math> (4)
<math>x_m(i) = \{ s(i), s(i+1), \ldots, s(i+m-1) \}; \quad 1 \le i \le N-m+1</math> (5)
where <math>m</math> and <math>r_f</math> indicate the data length and the filtering level,
respectively, <math>N</math> is the number of samples and <math>d</math> is the distance between
<math>x_m(i)</math> and <math>x_m(j)</math>.
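The Higuchi construction in Eqs. (2) and (3) can be sketched as below. Two points are assumptions rather than statements from the paper: the curve length is additionally divided by k, as in the commonly used formulation of Higuchi's method, so that the mean length scales as k^(-FD); and the FD is read off as the least-squares slope of ln L(k) against ln(1/k). The function name and the default k_max = 8 are hypothetical.

```python
import math
import random

def higuchi_fd(x, k_max=8):
    """Higuchi FD as the slope of ln(mean L_m(k)) versus ln(1/k)."""
    n = len(x)
    log_inv_k, log_len = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(1, k + 1):            # 1-based first sample, as in Eq. (2)
            n_int = (n - m) // k             # floor((N - m) / k)
            if n_int == 0:
                continue
            dist = sum(abs(x[m + i * k - 1] - x[m + (i - 1) * k - 1])
                       for i in range(1, n_int + 1))
            # Normalisation of Eq. (3), then the extra 1/k of the common formulation
            lengths.append(dist * (n - 1) / (n_int * k) / k)
        if lengths:
            log_inv_k.append(math.log(1.0 / k))
            log_len.append(math.log(sum(lengths) / len(lengths)))
    # Least-squares slope of log_len against log_inv_k
    mean_x = sum(log_inv_k) / len(log_inv_k)
    mean_y = sum(log_len) / len(log_len)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_inv_k, log_len))
    den = sum((a - mean_x) ** 2 for a in log_inv_k)
    return num / den

fd_ramp = higuchi_fd([float(i) for i in range(100)])
rng = random.Random(0)
fd_noise = higuchi_fd([rng.gauss(0.0, 1.0) for _ in range(1000)])
```

A straight ramp gives an FD of 1, and uncorrelated noise gives a value close to 2, which are the two ends of the range expected for a time series.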
Petrosian Fractal Dimension was introduced in [27]. In this calculation, consecutive
samples of a time series are subtracted to produce a new time series. Then, positive
and negative samples of this new series are mapped to 1 and -1, respectively. Hence,
the number of sign changes in the resulting binary time series is equal to the number
of local extrema in the original time series. The Petrosian FD is calculated as
<math>D = \frac{\log_{10} n}{\log_{10} n + \log_{10}\left(\frac{n}{n + 0.4 N_\Delta}\right)}</math> (6)
where <math>n</math> and <math>N_\Delta</math> are the number of samples and the number of sign changes
in the binary time series, respectively. In this algorithm, <math>N_\Delta</math> is the
important quantity, while in the Katz FD calculation the amplitude differences are
important. Hence, the Petrosian method is faster and more sensitive to noise.
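Eq. (6) translates almost directly into code. The sketch below is illustrative only; the function name is hypothetical, and zero differences are treated as producing no sign change, which is an assumption.

```python
import math

def petrosian_fd(x):
    """Petrosian FD from the number of sign changes in the difference series."""
    n = len(x)
    diff = [x[i + 1] - x[i] for i in range(n - 1)]
    # N_delta: sign changes between consecutive differences (zeros are ignored)
    n_delta = sum(1 for a, b in zip(diff, diff[1:]) if a * b < 0)
    return math.log10(n) / (math.log10(n) + math.log10(n / (n + 0.4 * n_delta)))

fd_monotone = petrosian_fd(list(range(100)))   # no local extrema
fd_zigzag = petrosian_fd([0, 1] * 50)          # an extremum at almost every sample
```

Because only sign changes are counted, the amplitude of the signal plays no role, which is consistent with the remark above that this measure is fast but sensitive to noise.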
Feature selection
At first, using all of the extracted features appears to be logical; however, this
would include irrelevant or duplicate data and reduce classification accuracy. In our
proposed method, we use several methods to select the appropriate features and
determine which method works best for our dataset. The methods that we apply to
select features in our dataset are as follows.
+) The Filter approach rates each feature based on a univariate metric and then
selects the features with the highest ranking. The following are some examples of
univariate metrics [28]:
• Variance: eliminates features that are constant or quasi-constant.
• Chi-square: a statistical test of independence, used in classification tasks, that
detects whether two variables depend on each other.
• Correlation coefficients: duplicate (highly correlated) features are removed.
• Information gain or mutual information: measures the contribution of an independent
variable to predicting the target variable.
+) The Correlation-based Feature Selection (CFS) method is a simple approach that
uses a correlation-based heuristic evaluation function to rank feature subsets. The
feature subset evaluation function in CFS is defined as follows [29] [16]:
<math>M_S = \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}}</math> (7)
where <math>M_S</math> is the evaluation of a subset S consisting of k features,
<math>\overline{r_{cf}}</math> is the average correlation value between features and class
labels, and <math>\overline{r_{ff}}</math> is the average correlation value between two
features.
+) The Lasso method constrains the sum of the absolute values of the model
parameters: it must be smaller than a predetermined value (upper bound). To do so,
the method uses a shrinkage (regularization) procedure in which the coefficients of
the regression variables are penalized, with some of them being reduced to zero. The
variables that still have a non-zero coefficient after the shrinkage procedure are
selected as part of the model. The purpose of this procedure is to reduce the
prediction error as much as possible [30].
+) The Logistic method includes a set of diagnostic tools that allow us to quantify
the proposed model's goodness-of-fit and choose features accordingly. The maximum
value of the log likelihood (LL) reached for each feature is used to evaluate the
model's performance. The deviance D is defined as [31] [32]:
D = -2 (LL of the current model - LL of the saturated model) (8)
The saturated model has as many parameters as there are samples and fits the data
with a probability of one. Low deviance values indicate a good fit or, in other
words, a strong predictive value for the features. The deviance is useful when
comparing two models.
+) The Recursive Feature Elimination (RFE) method is a wrapper-type feature selection
algorithm. The method searches for a subset of features in the training dataset,
starting with all of them and successively removing features until only the target
number remains.
+) Wrapper method: To predict the target variable, the wrapper approach searches for
the optimal subset of input features. It chooses the features that give the model the
best accuracy. Wrapper approaches use the results of previously trained models to
decide whether a new feature should be included or eliminated [28].
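Two of the selection criteria above, the CFS merit of Eq. (7) and the deviance of Eq. (8), can be illustrated with a short sketch. The function names and the toy numbers are hypothetical; the deviance is written for binary labels coded 0/1, for which the saturated model reproduces every label and therefore has a log likelihood of zero.

```python
import math

def cfs_merit(r_cf, r_ff):
    """Merit M_S of a feature subset, Eq. (7).

    r_cf: absolute feature-class correlations, one per feature (k values).
    r_ff: absolute feature-feature correlations, one per feature pair.
    """
    k = len(r_cf)
    mean_cf = sum(r_cf) / k
    mean_ff = sum(r_ff) / len(r_ff) if r_ff else 0.0
    return k * mean_cf / math.sqrt(k + k * (k - 1) * mean_ff)

def deviance(y, p):
    """D = -2 (LL_current - LL_saturated) for labels y in {0, 1} and probabilities p."""
    ll_current = sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
                     for yi, pi in zip(y, p))
    ll_saturated = 0.0  # the saturated model predicts each label with probability one
    return -2.0 * (ll_current - ll_saturated)

# Two equally relevant features: uncorrelated versus fully redundant.
merit_independent = cfs_merit([0.5, 0.5], [0.0])
merit_redundant = cfs_merit([0.5, 0.5], [1.0])

# A confident, correct model has a lower deviance than a hesitant one.
dev_good = deviance([1, 0], [0.9, 0.1])
dev_poor = deviance([1, 0], [0.6, 0.4])
```

The example shows the behaviour the text relies on: redundancy between features lowers the CFS merit, and a better-fitting logistic model yields a lower deviance.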
Ensemble learning for classification
Ensemble learning is a method of solving a computational intelligence problem by
deliberately generating and combining many models, such as classifiers or experts.
Ensemble learning is primarily used to improve a model's performance (classification,
prediction, function approximation, etc.).
The ensemble learning methods used in this study are:
- Boosted Trees: uses the AdaBoost ensemble method with the weighted majority voting
rule. The learner type is a decision tree, with a maximum of 20 splits, 30 learners,
and a learning rate of 0.1.
- Bagged Trees: uses the Bag ensemble method with the weighted average rule, 30
learners and a decision tree learner type.
- Subspace KNN: uses the Subspace ensemble method with the simple majority vote rule.
- Subspace Discriminant: uses the random subspace ensemble method with the majority
voting rule, 30 linear discriminant learners and two subspace dimensions.
- RUS Boosted Trees: combines random undersampling (RUS) with the standard AdaBoost
boosting technique, using the RUSBoost ensemble method. The learner type is a decision
tree, with a maximum of 20 splits, 30 learners, and a learning rate of 0.1.
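To make the random subspace idea concrete, the sketch below builds a small Subspace KNN ensemble: each learner is a nearest-neighbour classifier that sees only a random subset of the features, and the ensemble output is a simple majority vote. It is not the toolbox implementation used in the study. The function names, the toy data, the use of one neighbour, and the choice of 30 learners with two subspace dimensions (borrowed from the Subspace Discriminant settings above) are assumptions.

```python
import random
from collections import Counter

def knn_predict(train_X, train_y, x, features, k=1):
    """k-nearest-neighbour label for x, using only the given feature indices."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def draw_subspaces(n_features, n_learners=30, subspace_dim=2, seed=0):
    """Random subspace step: one random feature subset per learner."""
    rng = random.Random(seed)
    return [rng.sample(range(n_features), subspace_dim) for _ in range(n_learners)]

def subspace_knn_predict(subspaces, train_X, train_y, x):
    """Simple majority vote over the per-subspace predictions."""
    votes = Counter(knn_predict(train_X, train_y, x, feats) for feats in subspaces)
    return votes.most_common(1)[0][0]

# Toy data with four features; labels follow the paper's coding (1 = ADHD, -1 = control).
train_X = [[1.0, 1.0, 1.0, 1.0], [0.9, 1.1, 1.0, 0.8],
           [-1.0, -1.0, -1.0, -1.0], [-0.9, -1.1, -1.0, -0.8]]
train_y = [1, 1, -1, -1]
subspaces = draw_subspaces(n_features=4)
pred_a = subspace_knn_predict(subspaces, train_X, train_y, [0.8, 0.9, 1.2, 1.0])
pred_b = subspace_knn_predict(subspaces, train_X, train_y, [-1.2, -0.7, -0.9, -1.0])
```

Training each learner on a different feature subset decorrelates their errors, which is the reason a majority vote can outperform a single classifier trained on all features.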
3 Experiment setup and results
After pre-processing the signal, we apply the feature extraction methods to each of
the 19 channels [24]. As a result, we obtain a set of 58 features from the three
methods of calculating FD. We then apply the feature selection methods to reduce the
number of features, as shown in Table 1.
Table 1. Results of feature selection
{| class="wikitable"
! Feature selection method !! Selected features (from the set of 58)
|-
| Filter Method || 12 features
|-
| CFS Method || 7 features
|-
| Lasso Method || 38 features
|-
| Logistic Method || 15 features
|-
| RFE Method || 20 features
|-
| Wrapper Method || 25 features
|}
For more detailed results of feature selection, see Table 2.
Table 2. Detailed results of feature selection
{| class="wikitable"
! Method !! Selected features
|-
| Logistic Method || 1. Fp1_Kat, 2. F3_Hig, 3. C3_Kat, 4. C3_Hig, 5. P3_Pet, 6. O1_Hig, 7. F7_Pet, 8. F8_Pet, 9. P3_Hig, 10. P4_Pet, 11. F8_Hig, 12. T7_Pet, 13. T7_Hig, 14. P7_Pet, 15. P8_Pet, 16. P8_Hig
|-
| Lasso Method || 1. F4_Hig, 2. P7_Hig, 3. P4_Hig, 4. F7_Hig, 5. Pz_Hig, 6. C3_Hig, 7. Fp1_Kat, 8. P8_Hig, 9. O1_Kat, 10. Fp2_Hig, 11. Cz_Kat, 12. P3_Kat, 13. P8_Kat, 14. Fz_Kat, 15. Fp2_Kat, 16. C4_Kat, 17. F4_Kat, 18. T7_Kat, 19. O2_Kat, 20. F7_Kat, 21. C3_Kat, 22. T8_Hig, 23. P7_Kat, 24. P4_Kat, 25. Pz_Kat, 26. F8_Pet, 27. T8_Kat, 28. F3_Pet, 29. Cz_Hig, 30. O1_Hig, 31. C4_Hig, 32. F3_Hig, 33. O2_Hig, 34. Fp1_Hig, 35. F8_Hig, 36. Fz_Hig, 37. P3_Hig, 38. C4_Pet
|-
| Wrapper Method || 1. Fp1_Kat, 2. Fp2_Pet, 3. F3_Hig, 4. F4_Hig, 5. C3_Kat, 6. C3_Hig, 7. P3_Pet, 8. P3_Kat, 9. P3_Hig, 10. P4_Pet, 11. O2_Kat, 12. F8_Pet, 13. F8_Hig, 14. T7_Pet, 15. T7_Hig, 16. O1_Hig, 17. O2_Pet, 18. T8_Pet, 19. T8_Hig, 20. P7_Pet, 21. P8_Pet, 22. P8_Hig, 23. Fz_Hig, 24. Cz_Kat, 25. Pz_Pet
|-
| Filter Method || 1. Fp1_Kat, 2. F3_Kat, 3. F4_Kat, 4. C3_Kat, 5. P3_Kat, 6. O1_Kat, 7. Fz_Kat, 8. Cz_Kat, 9. Pz_Kat, 10. P7_Hig, 11. P8_Kat, 12. P8_Hig
|-
| RFE Method || 1. Fp1_Pet, 2. Fp2_Pet, 3. F3_Pet, 4. F3_Hig, 5. F4_Pet, 6. F4_Hig, 7. C3_Pet, 8. C3_Hig, 9. C4_Pet, 10. P3_Pet, 11. P4_Pet, 12. O1_Pet, 13. O2_Pet, 14. F8_Pet, 15. T7_Pet, 16. P7_Pet, 17. P8_Pet, 18. Fz_Pet, 19. Cz_Pet, 20. Pz_Pet
|-
| CFS Method || 1. Fp1_Kat, 2. P3_Kat, 3. P7_Hig, 4. P8_Hig, 5. Cz_Kat, 6. Pz_Kat, 7. Pz_Hig
|}
The selected features are used as input to the ensemble learning classifiers. The
data are divided into training and testing sets with a ratio of 80:20. We label ADHD
children and control children as 1 and -1, respectively. The classification accuracy
is given in Table 3. The best results are obtained with Subspace KNN and RUS Boosted
Trees. We also present the confusion matrix and the ROC curve for those cases.
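Table 3. Classification accuracy (%) of the ensemble learning methods on the training data
{| class="wikitable"
! Feature selection !! Boosted Trees !! Bagged Trees !! Subspace KNN !! Subspace Discriminant !! RUS Boosted Trees
|-
| Filter Method || 77.7 || 86.8 || 90.9 || 71.9 || 90.1
|-
| CFS Method || 87.2 || 90.9 || 91.7 || 74.8 || 90.9
|-
| Lasso Method || 79.8 || 91.3 || 91.3 || 79.8 || 94.6
|-
| Logistic Method || 90.5 || 89.7 || 89.7 || 76.4 || 92.6
|-
| RFE Method || 75.2 || 88 || 88.8 || 80.2 || 91.3
|-
| Wrapper Method || 75.2 || 91.3 || 91.3 || 81.0 || 94.6
|}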
Fig. 3. The Confusion matrix and ROC of Subspace KNN (Filter Method)
Fig. 4. The Confusion matrix and ROC of Subspace KNN (CFS Method)
Fig. 5. The Confusion matrix and ROC of RUS Boosted Trees (Lasso Method)
Fig. 6. The Confusion matrix and ROC of RUS Boosted Trees (Logistic Method)
Fig. 7. The Confusion matrix and ROC of RUS Boosted Trees (RFE Method)
Fig. 8. The Confusion matrix and ROC of RUS Boosted Trees (Wrapper Method)
The confusion matrices, showing the true positive rates/false negative rates and the
positive predictive values/false discovery rates, are illustrated in Figs. 3 to 8.
In addition, the ROC curves are all normal.
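The rates read from the confusion matrices follow directly from the four cell counts. The sketch below shows the arithmetic on a made-up example with the label coding used in this paper (1 for ADHD, -1 for control); the function names and the example labels are hypothetical and are not results from the study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FN, FP, TN), with `positive` as the ADHD label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fn, fp, tn

def rates(tp, fn, fp, tn):
    """Accuracy and the four rates shown alongside a confusion matrix."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "tpr": tp / (tp + fn),   # true positive rate
        "fnr": fn / (tp + fn),   # false negative rate
        "ppv": tp / (tp + fp),   # positive predictive value
        "fdr": fp / (tp + fp),   # false discovery rate
    }

y_true = [1, 1, 1, -1, -1, -1]
y_pred = [1, 1, -1, -1, -1, 1]
counts = confusion_counts(y_true, y_pred)
summary = rates(*counts)
```

The true positive and false negative rates sum to one, as do the positive predictive value and the false discovery rate, which is why the figures report them in pairs.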
The accuracy on the testing data is given in Table 4. The highest accuracy, 98.33%,
is obtained with logistic method feature selection and RUS Boosted Trees.
Table 4. Accuracy (%) on the training and testing data
{| class="wikitable"
! Feature selection !! Classifier !! Train (80%) !! Test (20%)
|-
| Filter Method || Subspace KNN || 90.9 || 80.0
|-
| CFS Method || Subspace KNN || 91.7 || 81.6
|-
| Lasso Method || RUS Boosted Trees || 94.6 || 95
|-
| Logistic Method || RUS Boosted Trees || 92.6 || 98.33
|-
| RFE Method || RUS Boosted Trees || 91.3 || 88.33
|-
| Wrapper Method || RUS Boosted Trees || 94.6 || 83.33
|}
Table 5. Comparison of the model accuracy with some state-of-the-art studies in this field
{| class="wikitable"
! Study !! Year !! Dataset !! Features and feature selection !! Classifier !! Accuracy
|-
| This study || 2021 || 61 ADHD children, 60 healthy children || Katz FD, Higuchi FD, Petrosian FD || Ensemble learning || 98.33%
|-
| [21] || 2016 || 31 ADHD children, 30 healthy children || Lyapunov Exponent, Katz FD, Higuchi FD, Petrosian FD || MLP NN || 93.65%
|-
| [33] || 2019 || 50 ADHD children, 51 healthy children || Mutual information, connectivity matrix || Deep CNN || 94.67%
|-
| [34] || 2019 || 50 ADHD children, 57 healthy children || Filter Bank Common Spatial Patterns, Gradient-weighted Class Activation Mapping || Deep CNN || 90.29%
|-
| [35] || 2019 || 47 ADHD children, 50 healthy children || Phase space reconstruction of EEG, CFS and PSO feature selection || SVM, NN, k-NN, and naive-Bayes classifier || 93.3%
|}
Table 5 shows that our study achieves higher accuracy than the state-of-the-art
studies with the same purpose.
4 Conclusion
In general, ADHD is a disorder that is common in children and affects how they react
to their environment. Hence, early diagnosis of its symptoms is very important for
the child's development. In this paper, we use the non-linear features of EEG signals
to differentiate between ADHD children and healthy children. Our dataset was
published in 2020 on ieee-dataport.org. So far, most studies have used linear
features (spectral, time, spatial or time-frequency features) to classify ADHD
patients. Although some of these studies have provided promising results, new
advanced methods are still needed to analyze EEG signals. Non-linear features of the
EEG signal in children have only been reported in [21], with a dataset of 31 ADHD
children and 30 healthy children. That study used the same set of non-linear features
but selected features with the tools provided in Matlab. In our study, instead of
using the Matlab tools, we used several feature selection methods that focus more on
the physics and the structure of the EEG signals. As the classifier, we use ensemble
learning, which is a simpler method than the neural network in [21]. We obtain a
better result of 98.33% accuracy with a larger and more recent dataset of 61 ADHD
children and 60 healthy controls.
Our results show that non-linear features are appropriate for analyzing and
characterizing EEG signals. The application of non-linear analysis to EEG has opened
a new door in analyzing EEG signals in order to discriminate ADHD patients from the
healthy group.
References
1. American Psychiatric Association, DSM-5 Task Force: Diagnostic and statistical
manual of mental disorders: DSM-5. 5th edn. American Psychiatric Association,
Washington DC (2013).
2. A. Meysamie, M. D. Fard and M.-R. Mohammadi: Prevalence of attention-deficit/hyperactivity
disorder symptoms in preschool-aged Iranian children. Iran J Pediatr. 21(4), (2011).
3. J. A. King, M. Colla, M. Brass, I. Heuser and D. v. Cramon: Inefficient cognitive control in
adult ADHD: evidence from trial-by-trial Stroop test and cued task switching performance.
Behav Brain Funct, 3(42), (2007).
4. F. Aboitiz, T. Ossandón, F. Zamorano, B. Palma and X. Carrasco: Irrelevant stimulus pro-
cessing in ADHD: catecholamine dynamics and attentional networks. Front Psychol. (2014).
5. M. R. Mohammadi, N. Malmir, A. Khaleghi and M. Aminiorani: Comparison of Sensorimo-
tor Rhythm (SMR) and Beta Training on Selective Attention and Symptoms in Children
with Attention Deficit/Hyperactivity Disorder (ADHD): A Trend Report. Iran J Psychiatry,
10(3), (2015).
6. World Health Organization: The ICD-10 classification of mental and behavioural
disorders: clinical descriptions and diagnostic guidelines. WHO, Geneva (1992).
7. J. F. Lubar: Discourse on the development of EEG diagnostics and biofeedback for atten-
tion-deficit/hyperactivity disorders. Biofeedback Self Regul. 16(3), (1991).
8. A. Tenev, S. Markovska-Simoska, L. Kocarev, J. Pop-Jordanov, A. Müller and G. Candrian:
Machine learning approach for classification of ADHD adults. Int J Psychophysiol. 93(1),
162-6, (2014).
9. S.-S. Poil, S. Bollmann, C. Ghisleni and R. L. O'Gorman: Age dependent electroencephalo-
graphic changes in Attention Deficit/Hyperactivity Disorder (ADHD). Clinical Neurophys-
iology, 125(8), 1626-1638, (2014).
10. A. Mazaheri, C. Fassbender, S. Coffey-Corina, T. A. Hartanto, J. B. Schweitzer and G. R.
Mangun: Differential oscillatory electroencephalogram between attention-deficit/hyperac-
tivity disorder subtypes and typically developing adolescents. Biol Psychiatry. 76(5), 422-
429, (2014).
11. J. F. Lubar: Discourse on the development of EEG diagnostics and biofeedback for atten-
tion-deficit/hyperactivity disorders. Applied Psychophysiology and Biofeedback 16(3),
201-225, (1991).
12. L. C. Fonseca, G. M. A. S. Tedrus, C. d. Moraes, A. d. V. Machado, M. P. d. Almeida and
D. O. F. d. Oliveira: Epileptiform abnormalities and quantitative EEG in children with
attention-deficit/hyperactivity disorder. Arq Neuropsiquiatr 66(3A), 462-7, (2008).
13. A. Tenev, S. Markovska-Simoska, L. Kocarev, J. Pop-Jordanov, A. Müller and G. Candrian:
Machine learning approach for classification of ADHD adults. Int J Psychophysiol. 93(1)
162-166, (2014).
14. I. Buyck and J. R. Wiersema: Resting electroencephalogram in attention deficit
hyperactivity disorder: Developmental course and diagnostic value.
Psychiatry Research 216(3), 391-397, (2014).
15. M. Duda, R. Ma, N. Haber and D. P. Wall: Use of machine learning for behavioral
distinction of autism and ADHD. Translational Psychiatry, vol. 6, (2016).
16. S. Kaur et al.: Phase Space Reconstruction of EEG Signals for Classification of ADHD and
Control Adults. Clinical EEG and Neuroscience, (2020).
17. A. E. Alchalabi, S. Shirmohammadi, A. N. Eddin and M. Elsharnouby: Detecting ADHD
patients by an EEG-based serious game. IEEE Transactions on Instrumentation and Meas-
urement, (2018).
18. J. R. Wessel: Testing Multiple Psychological Processes for Common Neural Mechanisms
Using EEG and Independent Component Analysis. Brain Topography, vol. 31, 90-100,
(2016).
19. K. Kato, K. Takahashi, N. Mizuguchi and J. Ushiba: Online detection of amplitude
modulation of motor-related EEG desynchronization using a lock-in amplifier: Comparison
with a fast Fourier transform, a continuous wavelet transform, and an autoregressive
algorithm. Journal of Neuroscience Methods, vol. 293, 289-298, (2018).
20. K. M. A. Allahverdy Armin, M. R. Mohammadi and M. N. Ali: Detecting ADHD Children
using the Attention Continuity as Nonlinear Feature of EEG. Frontiers Biomed Technol,
3(1-2), 28-33, (2016).
21. A. K. Mohammad Reza Mohammadi, A. M. Nasrabadi, S. Rafieivand, M. Begol and H.
Zarafshan: EEG classification of ADHD and normal children using non-linear features and
neural network. Biomedical Engineering Letters, vol. 6, 66-73, (2016).
22. A. Vahid, A. Bluschke, V. Roessner, S. Stober and C. Beste: Deep Learning Based on
Event-Related EEG Differentiates Children with ADHD from Healthy Controls. Journal of
Clinical Medicine, 8(7), (2019).
23. Z. Li and C. Weike: Application of Deep Convolutional Neural Networks in Attention-Def-
icit/Hyperactivity Disorder Classification: Data Augmentation and Convolutional Neural
Network Transfer Learning. Journal of Medical Imaging and Health Informatics, 9(8), 1717-
1724, (2019).
24. M. R. Mohammadi, A. M. N. Ali Khaleghi, S. Rafieivand, M. Begol and H. Zarafshan: EEG
classification of ADHD and normal children using non-linear features and neural network.
Biomedical Engineering Letters, 66-73, (2016).
25. T. Higuchi: Approach to an irregular time series on the basis of the fractal theory. Physica
D, Nonlinear Phenomena, (1988).
26. A. Petrosian: Kolmogorov complexity of finite sequences and recognition of different preic-
tal EEG patterns. In Proceedings Eighth IEEE Symposium on Computer-Based Medical
Systems, Lubbock, TX, USA, (1995).
27. R. L. M. Petre Stoica: Introduction to Spectral Analysis, the University of Michigan: Pren-
tice Hall (1997).
28. [Online]. Available: https://towardsdatascience.com/feature-selection-identifying-the-best-
input-features-2ba9c95b5cab.
29. N. Sánchez-Maroño, A. Alonso-Betanzos and M. Tombilla-Sanromán: Filter Methods for
Feature Selection – A Comparative Study. Intelligent Data Engineering and Automated
Learning - IDEAL 2007. Lecture Notes in Computer Science, vol. 4881, Springer, Berlin,
Heidelberg (2007).
30. [Online]. Available: https://beta.vu.nl/nl/Images/werkstuk-fonti_tcm235-836234.pdf.
31. O. S. Qasim and Z. Y. Algamal: Feature selection using particle swarm optimization-
based logistic regression model. Chemometrics and Intelligent Laboratory Systems,
182(15), 41-46, (2018).
32. Q. Cheng, P. Varshney and M. Arora: Logistic Regression for Feature Selection and Soft
Classification of Remote Sensing Data. IEEE Geoscience and Remote Sensing Letters, 3(4),
491-494, (2006).
33. C. He, S. Yan and L. Xiaoli: A deep learning framework for identifying children with ADHD
using an EEG-based brain network. Neurocomputing, 356(3), 83-96, (2019).
34. H. Chen, Y. Song and X. Li: Use of deep learning to detect personalized spatial-frequency
abnormalities in EEGs of children with ADHD. Journal of Neural Engineering, 16(6),
(2019).
35. S. S. Simranjit Kaur, P. Arun, D. Kaur and M. Bajaj: Phase Space Reconstruction of EEG
Signals for Classification of ADHD and Control Adults. Clinical EEG and Neuroscience,
(2019).