=Paper=
{{Paper
|id=Vol-3276/SSS-22_FinalPaper_124
|storemode=property
|title=REM Estimation Based on Combination of
Multi-Timescale Estimations and Automatic Adjustment of Personal
Bio-vibration Data of Mattress Sensor
|pdfUrl=https://ceur-ws.org/Vol-3276/SSS-22_FinalPaper_124.pdf
|volume=Vol-3276
|authors=Iko Nakari,Naoya Matsuda,Keiki
Takadama
|dblpUrl=https://dblp.org/rec/conf/aaaiss/NakariMT22
}}
==REM Estimation Based on Combination of Multi-Timescale Estimations and Automatic Adjustment of Personal Bio-vibration Data of Mattress Sensor==
Iko Nakari¹, Naoya Matsuda¹, Keiki Takadama²

The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo, Japan 182-8585

¹{iko0528, matsuda.naoya}@cas.lab.uec.ac.jp, ²keiki@inf.uec.ac.jp
In T. Kido, K. Takadama (Eds.), Proceedings of the AAAI 2022 Spring Symposium “How Fair is Fair? Achieving Wellbeing AI”, Stanford University, Palo Alto, California, USA, March 21–23, 2022. Copyright © 2022 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

===Abstract===
This paper proposes a novel REM estimation method based on the combination of REM estimations from multi-timescale logarithmic spectra calculated from overnight bio-vibration data acquired by a mattress sensor. Concretely, a separate Random Forest (RF) is trained for each spectrum scale, the number of REM estimations within a window is counted, and REM is estimated if the count exceeds a threshold. The threshold is determined automatically for each person, based on the ratio of REM estimations to the total sleep length, to account for individual differences. Human subject experiments revealed the following implications: (1) combining the RFs trained on each spectrum scale improves the Precision and Recall of REM estimation, with Accuracy, Precision, Recall and F-measure of 80.2%, 51.4%, 47.0% and 48.5%, respectively; and (2) the automatic adjustment of the threshold adapts flexibly to data with large individual differences without the need to retrain the model.

===Introduction===
According to a survey conducted by the Ministry of Health, Labour and Welfare, about one in three Japanese adults feels sleepy during the day at least three times a week. In addition, Japan has the shortest sleep time among the OECD member countries (Organization for Economic Cooperation and Development 2019), which suggests that many people in Japan are sleep-deprived. The accumulation of sleep deprivation (especially 4-6 hours of sleep) leads to a state of sleep debt. In a state of sleep debt, the ability to think and make decisions is equivalent to that after staying up all night (Van Dongen et al. 2003), which is a factor in the increased risk of industrial and traffic accidents. Sleep debt also decreases immune function and increases the risk of developing lifestyle-related diseases such as depression and dementia (Mullington et al. 2009; Holingue et al. 2018). For individuals to stay healthy and for the government to reduce health care costs, these sleep problems should be solved as soon as possible.

To solve these sleep problems, it is important to increase the amount of sleep time, but many people suffer from poor sleep quality even if they sleep for a long time. It is therefore necessary to understand sleep quality. The standard method for measuring sleep quality (sleep stage) is to evaluate biological data acquired by the Polysomnography (PSG) test based on the Rechtschaffen & Kales (R&K) method (Rechtschaffen and Kales 1968). However, the PSG test is highly restrictive: it requires a person to attach multiple electrodes to the head and body, which imposes a physical and mental burden and prevents recording sleep as it usually is. To address these problems, demand has increased for sleep stage estimation methods that use simple sensors (such as mattress sensors) as an alternative to the PSG test. For example, Watanabe developed a mattress sensor and focused on the relation between heart rate variability and sleep stage (Watanabe and Watanabe 2004). The reported accuracies of the method are as follows: 42.8% in three-stage (NREM/REM/WAKE) estimation; 82.6% in NREM estimation; 70.5% in WAKE estimation; and 38.3% in REM estimation. As the results show, the accuracy of the method is not high, especially for REM sleep estimation. This is because REM sleep estimation in the R&K method is mainly based on rapid eye movements, and mattress sensors cannot measure eye movements. Even though REM sleep has other characteristics that a mattress sensor can acquire (i.e., unstable heart and respiration rate), it is hard to estimate REM sleep for the following reasons. (1) The characteristics of REM sleep appear intermittently rather than all the time during REM sleep. (2) The heart rate becomes unstable because of body movements. (3) The heart rate is easily affected by individual differences and daily physical condition.

To tackle these problems, it is necessary to estimate REM sleep from a new perspective, including physiological characteristics. However, since it is not known what to focus on, machine learning (ML) is a good way to estimate REM sleep from a new perspective. In this study, Random Forests (Breiman 2001) is employed as the ML model because it is easier to analyze what the model has learned from the data than with deep learning (Goodfellow et al. 2016), which is widely employed because of its high prediction accuracy. Making the model easy to analyze is essential because it leads to interpretability of the model in the future. However, simply applying ML to sleep stage estimation cannot deal with problem (1) above, because each epoch (30 seconds) is estimated without considering the epochs before and after it. As a result, it is difficult to estimate REM sleep when the characteristics do not appear in that epoch. In addition, ML is not good at learning data with individual differences.
To overcome these problems, this paper aims to improve the accuracy of REM sleep estimation and proposes a novel REM sleep estimation method with a mattress sensor that considers the epochs before and after the target epoch and automatically adjusts the REM sleep estimation threshold for each person. Concretely, this paper employs the TANITA sleep scan SL511 (Japan) as the mattress sensor for acquiring bio-vibration data, and prepares several RFs for learning multi-timescale logarithmic spectra. The REM sleep estimations of the individual RFs are combined into the final estimation output, and the estimation sensitivity is automatically adjusted based on the ratio of REM sleep estimations to all epochs in an overnight sleep.

This paper is organized as follows. The next section describes the sleep mechanism, especially REM sleep. Section 3 describes related work on non-contact sleep stage estimation and RF, which is the main ML method in our proposed method, and Section 4 proposes our multi-timescale REM estimation method. The experiment is conducted in Section 5 and the results are analyzed in Section 6. Finally, our conclusion is given in Section 7.

===Sleep Mechanism===
====Sleep Stage====
The sleep stage is an indicator of the depth of sleep defined by the R&K method. The depth of sleep is classified into six stages in each epoch (30 seconds), i.e., WAKE, REM, Non-REM1 (N1), N2, N3, and N4 (N4 is often included in N3). The proportion of each sleep stage per night in healthy young adults is as follows: WAKE is 1-5%; REM is 15-25%; N1 is 5-20%; N2 is 45-75%; and N3 is 10-22% (depending on age and physical condition on that day). To determine the sleep stage, the R&K method needs biological data such as electroencephalography (EEG), electrooculogram (EOG), and electromyogram (EMG) acquired by the PSG test. Figure 1 shows an example of overnight sleep stages, where the vertical axis indicates the sleep stage and the horizontal axis indicates the time. As shown in Figure 1, the sleep of a healthy person alternates between deep sleep (N3 sleep) and shallow sleep (stages above N3), and regular sleep repeats this cycle (about 90 to 120 minutes) three to five times a night. Each cycle is connected by about 20 to 30 minutes of REM sleep, and this cycle is called the ultradian rhythm.

Figure 1: Example of the overnight sleep stage. (Vertical axis: sleep stage, WAKE, REM, NREM1, NREM2, NREM3; horizontal axis: time, 0 to 360 min.)

====Characteristics of REM Sleep====
The physiological characteristics of REM sleep are as follows:

• rapid eye movement;
• decreased skeletal muscle activity;
• increased or unstable heart rate and respiratory rate;
• changes in autonomic function.

In particular, the R&K method determines REM sleep by focusing on “rapid eye movement” and “decreased skeletal muscle activity”. This is because these two characteristics are clearly expressed in the biological data acquired by the PSG test and are easy to evaluate. Note that these characteristics occur intermittently, rather than continuously.

===Related Works===
====Sleep Stage Estimation by mattress sensor====
Watanabe et al. tried to extract the relation between the change in the heart rate and sleep stages through the frequency band containing multiple human biological rhythms, to build a foundation for sleep stage estimation from heart rate variability (HRV) (Watanabe and Watanabe 2001). They focused on two biological rhythms: the ultradian rhythm and the circadian rhythm, which is a cycle of approximately 25 hours. Their study revealed the relations between the frequency of HRV and the sleep stage, and they built a sleep stage estimation method based on the heart rate data acquired from an air mattress sensor (Watanabe and Watanabe 2004).

====Random Forests====
This study employs Random Forests (RF) (Breiman 2001). The RF model repeats random sampling from the training data, randomly constructs decision trees with different conditional branches, and classifies by majority vote over the results of those trees. In this research, Gini impurity is the splitting criterion; it becomes low when the samples contained in a node of the decision tree belong to the same class. RF processing is as follows:

1. Generate a bootstrapped sample (Sj) from the training data set (S).
2. Construct a decision tree from Sj. About one-third of the original data is left out of Sj and is called Out-Of-Bag (OOB) data. Each node is processed as follows:
(a) Extract mtry features randomly, without duplicates.
(b) Choose the feature that minimizes Gini impurity, and divide the node.
3. Repeat 1. to 2. Ntree times.

Here, Ntree is the number of decision trees to be constructed. For classification problems, it is recommended to use the square root of the total number of features for the variable mtry, which is used to divide the nodes of the decision trees.

===Proposed Method: Multi-Timescale REM Estimation===
The proposed method, Multi-Timescale REM estimation, starts by training several RFs, each on bio-vibration data of one scale, and then combines the REM predictions of the RFs.
Figure 4: Overview of Multi-Timescale REM Estimation. (Panels: (1) learning each scale spectrum by each RF, with RFs for 32 sec., 64 sec., 128 sec. and 256 sec.; (2) combining the REM predictions into the prediction count (PC) and the windowed prediction count (WPC); (3) exploring the optimal threshold (TH) and estimating REM, with example REM estimation ratios of 60% at TH = 1, 38% at TH = 2 and 21% at TH = 5, and the rule “if WPC ≥ 5 then REM; if WPC < 5 then not REM”.)

Figure 2: Examples of (a) power spectrum and (b) logarithmic spectrum (L = 64).
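As a rough illustration of how spectra like those in Figure 2 could be computed, the following sketch applies a radix-2 FFT to one L-second window sampled at 16 Hz, keeps the L × 8 bins up to 8 Hz, takes log10, and rescales to [0, 1]. The function names, the small epsilon added before the logarithm, the exclusion of the 0 Hz bin and the min-max scaling are assumptions made for this sketch; the paper does not specify these details.

```python
import cmath
import math

FS = 16  # sampling frequency of the mattress sensor in Hz (from the text)


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def log_spectrum(signal, window_sec):
    """Log10 power spectrum of one window, rescaled to [0, 1] (hypothetical helper)."""
    n = window_sec * FS
    if len(signal) != n:
        raise ValueError("signal must contain window_sec * FS samples")
    spec = fft(signal)
    # Bins 1 .. n/2 cover 1/window_sec Hz up to 8 Hz: window_sec * 8 values.
    power = [abs(c) ** 2 for c in spec[1:n // 2 + 1]]
    # Epsilon is an assumption to avoid log10(0) on empty bins.
    logp = [math.log10(p + 1e-12) for p in power]
    lo, hi = min(logp), max(logp)
    span = (hi - lo) or 1.0
    # Min-max scaling is one reading of "normalized to [0, 1]".
    return [(v - lo) / span for v in logp]


# Synthetic window: 0.25 Hz "respiration" plus a weaker 1.0 Hz "heartbeat".
signal = [math.sin(2 * math.pi * 0.25 * t / FS)
          + 0.3 * math.sin(2 * math.pi * 1.0 * t / FS)
          for t in range(32 * FS)]
spectrum = log_spectrum(signal, 32)
# With L = 32 the resolution is 1/32 Hz, so 0.25 Hz falls in bin 8 (list index 7).
```

All four window sizes used in the paper (32, 64, 128 and 256 seconds) give a power-of-two number of samples at 16 Hz, which is why a radix-2 FFT is sufficient for this sketch.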
The proposed method finally estimates REM sleep based on the REM estimation threshold adjusted for each person.

====Input data====
To extract the characteristics of bio-vibration data with ML, frequency analysis is applied to decompose the vibrations (i.e., heartbeats, respiration and body movement) into frequencies. This process is conducted as follows.

1. The Fast Fourier Transform (FFT) (Cooley and Tukey 1965) is applied to the bio-vibration data in an L-second window to convert the data to a power spectrum (note that the sampling frequency of the mattress sensor is 16 Hz, so the data size is L × 16). In this study, the window size L is set as follows to capture several scales of REM: L = {32, 64, 128, 256}. According to the sampling theorem (Shannon 1949), the FFT can analyze frequencies up to 8 Hz, so the data size of the power spectrum is L × 8 and the frequency resolution is 1/L Hz. Figure 2(a) shows an example of a power spectrum (L = 64) calculated from bio-vibration data, where the vertical axis indicates the density of the power spectrum and the horizontal axis indicates the frequency. In particular, the frequency band between 0.1 Hz and 0.3 Hz is related to respiration, and the frequency band between 0.6 Hz and 1.5 Hz is related to heartbeats. Regarding body movement (BM), the larger (smaller) the BM, the higher (lower) the density of the power spectrum. However, as shown in Figure 2(a), it is difficult to see the shape of the power spectrum above 1 Hz because of the high density of frequencies below 1 Hz.

2. To make the spectrum above 1 Hz easier to see and easier for the RF to learn, the power spectrum is converted into a logarithmic spectrum (log10). Figure 2(b) shows an example of the logarithmic spectrum converted from the power spectrum of Figure 2(a), where the vertical axis indicates the density (logarithmic value) of the spectrum and the horizontal axis indicates the frequency. Furthermore, the density of each frequency in the logarithmic spectrum is normalized to [0, 1] based on the density values over all frequencies.

3. The logarithmic spectrum is calculated every 30 seconds (the stride size is 30 seconds) and labeled with the correct sleep stage (REM/Not-REM) determined by the R&K method for the RF to learn. Figure 3 shows an example of strides (window size of 128 seconds) and how the sleep stage is labeled to the spectrum. Because the bio-vibration data in one window often span multiple sleep stages, in this study the label of a spectrum is determined by a majority vote over the proportions occupied by those sleep stages. Note that when the RF is used for REM prediction (not in the learning phase), the logarithmic spectrum is not labeled with the correct sleep stage, and the output of the prediction applies to the first epoch of the window. The number of input data that can be calculated from one subject (in the case of seven hours of sleep) is about 840.

Figure 3: Example of strides and how to label sleep stage to the spectrum. (Panels: learning phase, labeling the data; estimating phase, predicting for the first epoch.)

====REM estimation based on multiple scales spectrum====
Figure 4 shows the overview of the proposed Multi-Timescale REM Estimation. The flow of the method is as follows: (1) preparing RFs for a number of scales (one per spectrum window size) and training each RF on the spectrum of its scale; (2) combining the REM predictions of the RFs within each window (note that this window is different from the window size of the spectrum); (3) exploring the optimal threshold for REM estimation from overnight data, and detecting REM sleep when the number of REM predictions in a window counted in (2) exceeds the threshold.

Combining REM predictions by each RF: Our method outputs four REM predictions for each epoch, from the RFs of 32 sec., 64 sec., 128 sec. and 256 sec., and the REM Prediction Count (PC) is counted for each epoch, as shown at the top of Figure 4(2). REM sleep does not occur in single, isolated epochs.
Because REM sleep occurs in clusters (successive epochs), the method considers the state before and after the epoch to be estimated: it prepares a window that counts the PC over the Ne epochs before and after the epoch. This count is called the Windowed Prediction Count (WPC), as shown at the bottom of Figure 4(2).

Automatic adjustment of REM estimation threshold: The final output of the proposed REM estimation for an epoch is determined by the value of the WPC. If the WPC exceeds a certain threshold, the epoch is detected as REM sleep, as shown in Figure 4(3). The REM estimation ratio, i.e., the ratio of REM estimations to all epochs in an overnight sleep (regardless of whether they are correct), decreases as the threshold increases. According to the physiological characteristics of sleep, the proportion of REM sleep in an overnight sleep is about 20%, so the proposed method explores, for each person, the threshold at which the REM estimation ratio is about 20%, to avoid excessive or overly passive estimation.

The algorithm for exploring the optimal threshold is described in Algorithm 1. The algorithm counts the number of epochs detected as REM sleep with a given threshold in an overnight sleep and calculates the REM estimation ratio, raising the threshold until the REM count becomes 0 (see lines 6 to 13). Then, the optimal threshold is extracted as the threshold just before the one at which the REM estimation ratio exceeds 25% (see lines 15 to 20).

 Algorithm 1: Exploring optimal threshold
 1: r_list[ ]: REM estimation ratio for each threshold
 2: WPC_list[ ]: WPC for each epoch in overnight sleep
 3: N_EPOCH: Number of epochs in overnight sleep
 4: current_TH ⇐ 1: Start to explore from 1
 5: IS_CONTINUE ⇐ true
 6: while IS_CONTINUE do
 7:   REM_COUNT ⇐ count epochs detected as REM from WPC_list
 8:   r_list.append(REM_COUNT / N_EPOCH)
 9:   if REM_COUNT is equal to 0 then
 10:     IS_CONTINUE = false
 11:   end if
 12:   current_TH++
 13: end while
 14: optimal_TH: Optimal threshold found by the exploring
 15: for i = 0 to r_list.length−1 do
 16:   if r_list[i] > 0.25 then
 17:     break
 18:   end if
 19:   optimal_TH = i
 20: end for

===Experiments===
To investigate the effectiveness of the proposed Multi-Timescale REM Estimation, this paper conducted a human subject experiment with nine healthy subjects. The performance of the REM estimation is compared with that of the RFs trained with each window size of the logarithmic spectrum. The information on the subjects is summarized in Table 1.

{| class="wikitable"
|+ Table 1: Information of healthy subjects.
! ID (Age) !! WAKE !! REM !! N1 !! N2 !! N34 !! Total
|-
| A (20’s) || 46 || 176 || 41 || 421 || 164 || 848
|-
| B (20’s) || 53 || 113 || 48 || 368 || 2 || 584
|-
| C (30’s) || 103 || 184 || 35 || 382 || 0 || 704
|-
| D (40’s) || 75 || 65 || 46 || 419 || 2 || 607
|-
| E (40’s) || 44 || 85 || 53 || 390 || 23 || 595
|-
| F (40’s) || 34 || 121 || 40 || 225 || 0 || 420
|-
| G (40’s) || 53 || 110 || 80 || 407 || 1 || 651
|-
| H (50’s) || 35 || 249 || 56 || 520 || 0 || 860
|-
| I (60’s) || 98 || 159 || 27 || 436 || 0 || 720
|}

The column “ID (Age)” indicates the ID and age of the subject. The columns from “WAKE” to “N34” indicate the number of epochs in each sleep stage (WAKE, REM, NREM1, NREM2 and NREM34), and the column “Total” indicates the total number of epochs in one night. The average number of epochs (30 seconds each) of sleep is 664±130. As evaluation criteria, this study employs five evaluation indicators of REM estimation: Accuracy, Precision, Recall, F-measure and Specificity. In addition, this study evaluates the REM estimation ratio to see whether REM estimations are being made at an appropriate frequency.

====Setup====
Electrodes were attached to the body and head of each subject to acquire EEG, EOG and EMG, and the mattress sensor was placed under the mattress of the bed to acquire bio-vibration data over one night. After sleep, the correct sleep stages of each subject were determined according to the R&K method based on the data measured by PSG (with the help of a medical specialist), and the bio-vibration data measured by the mattress sensor were converted to logarithmic spectra of several scales (i.e., window sizes L = {32, 64, 128, 256, 512}), which were labeled with the correct sleep stage of each epoch.

The logarithmic spectrum of each scale is learned by a separate RF. The training data are generated from eight subjects, and the validation data are those of the remaining subject. The ratio of REM sleep to not-REM sleep in the training data is 1:3 because REM sleep accounts for about 20% of one night of sleep and to prevent excessive REM estimation. Data with large BM are excluded because BM affects the shape of the spectrum and makes it difficult for the RF to learn. The parameters of the RF are set as follows: (i) the maximum depth of a decision tree is 10; (ii) the number of decision trees is 50; (iii) the number of features employed to construct a decision tree is 16, 23, 32, 46 and 64 for window sizes 32, 64, 128, 256 and 512, respectively. The window size Ne for counting the WPC is set to 3.

===Results===
Table 2 shows the results of REM estimation. The column “Type of RF” indicates the RF trained with each spectrum window size, and the rows “Proposed 1” and “Proposed 2” are the combinations of multiple RF results.
{| class="wikitable"
|+ Table 2: Results of REM estimation by RFs learned with each window size and the proposed method. Values are expressed as mean ± standard deviation.
! Type of RF !! Accuracy !! Precision !! Recall !! F-measure !! Specificity !! REM estimation ratio
|-
| 32 sec. window || 74.1 ± 8.5 || 33.0 ± 21.3 || 9.8 ± 13.6 || 11.3 ± 10.3 || 91.5 ± 12.2 || 8.8 ± 12.3
|-
| 64 sec. window || 76.0 ± 7.1 || 38.5 ± 21.3 || 12.9 ± 18.9 || 14.2 ± 12.3 || 92.9 ± 11.6 || 8.4 ± 13.3
|-
| 128 sec. window || 76.6 ± 8.3 || 47.3 ± 14.8 || 20.6 ± 16.5 || 24.8 ± 9.8 || 91.5 ± 12.3 || 11.0 ± 13.3
|-
| 256 sec. window || 78.0 ± 7.8 || 52.8 ± 20.8 || 24.5 ± 17.4 || 29.9 ± 12.6 || 92.0 ± 11.2 || 11.6 ± 12.4
|-
| 512 sec. window || 76.3 ± 5.9 || 25.8 ± 16.8 || 4.6 ± 3.8 || 7.2 ± 5.3 || 95.6 ± 4.4 || 4.5 ± 4.1
|-
| Proposed 1 (same TH (= 2)) || 79.0 ± 8.1 || 52.8 ± 20.7 || 55.4 ± 21.3 || 51.1 ± 16.4 || 84.9 ± 9.8 || 23.3 ± 11.1
|-
| Proposed 2 (consider individual) || 80.2 ± 5.5 || 51.4 ± 15.0 || 47.0 ± 15.5 || 48.5 ± 13.7 || 89.0 ± 3.0 || 18.5 ± 3.4
|}
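The indicators in Table 2 are not defined by formula in the paper. The sketch below assumes the standard confusion-matrix definitions, with REM as the positive class, and reads the REM estimation ratio as the share of epochs predicted as REM. The function name `rem_metrics` and the toy data are hypothetical.

```python
def rem_metrics(truth, pred):
    """Per-night REM indicators in percent.

    truth, pred: per-epoch booleans (True = REM). Standard definitions are
    assumed here; the paper does not spell them out.
    """
    pairs = list(zip(truth, pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)

    def pct(num, den):
        # Return 0 when the denominator is empty (a convention chosen for this sketch).
        return 100.0 * num / den if den else 0.0

    precision = pct(tp, tp + fp)
    recall = pct(tp, tp + fn)
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "accuracy": pct(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "specificity": pct(tn, tn + fp),
        "rem_estimation_ratio": pct(tp + fp, len(pairs)),
    }


# Toy night of 10 epochs: 3 true REM epochs, 3 predicted REM epochs, 2 in common.
truth = [True, True, True, False, False, False, False, False, False, False]
pred = [True, True, False, True, False, False, False, False, False, False]
metrics = rem_metrics(truth, pred)
```

Note that the REM estimation ratio depends only on the predictions, not on the correct labels, which is what allows it to be used for threshold adjustment without PSG data.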
Figure 5: Estimation result of subject “E”. (Vertical axis: sleep stage, W, R, N12, N34; horizontal axis: time; series: correct sleep stage and the REM detections of the 32 sec., 64 sec., 128 sec. and 256 sec. RFs.)
The other columns of Table 2 are the evaluation indicators. Each value is expressed as the mean over the nine subjects ± the standard deviation, in percent. As shown in Table 2, the Accuracy, Precision, Recall and F-measure increase as the window size of the spectrum becomes wider, but when the window size is too wide (i.e., 512 sec.), these indicators drop again, and Precision, Recall and F-measure are the lowest of all results. The Specificity at 512 seconds is the largest of all results because the REM estimation ratio of the RF trained with 512 seconds is small (i.e., the REM estimation is passive).

Focusing on the results of the two proposed methods, the values of Accuracy, Recall and F-measure outperform all other results (i.e., the single RFs trained with each window size of spectrum). The values of Precision and Specificity are not the best among the results, but they are close to the best. In addition, the REM estimation ratio is larger than in any other result and is close to the average ratio of REM sleep per night (about 20%). The standard deviation of each evaluation indicator is smaller for proposed method 2, which considers individual differences, than for proposed method 1, which does not, and this suggests that the results are stable across subjects.

===Discussion===
====How does the combination of multiple RF results contribute?====
Figure 5 shows the estimation result of subject E, where the vertical axis indicates the sleep stage (WAKE, REM, NREM12, NREM34), the horizontal axis indicates the time, the blue line indicates the correct sleep stage, and the orange, gray, yellow and green lines are the REM estimation results of the 32, 64, 128 and 256 second RFs, respectively. As shown in Figure 5, the areas of actual REM sleep, marked by red circles, tend to have a concentration of REM estimations by the four types of RF. On the other hand, the areas of actual not-REM sleep tend to have few REM estimations. Based on the proposed method, the number of REM estimations by all RFs in a window of 3 epochs (1.5 minutes) before and after each epoch is shown in Figure 6. In Figure 6, the left and right vertical axes indicate the windowed REM prediction count (WPC) and the sleep stage, respectively, the horizontal axis indicates the time, and the orange and blue lines indicate the WPC and the correct sleep stage, respectively. As shown in Figure 6, the WPC tends to be high in the areas of actual REM sleep and low in the areas of actual not-REM sleep. The proposed method (the combination of multiple RF results) exploits this tendency and estimates REM sleep when the WPC exceeds a certain threshold. The threshold for REM estimation in the proposed method was determined by the sensitivity analysis of the REM estimation threshold shown in Figure 8, where the vertical axis indicates the percentage, the horizontal axis indicates the threshold, and the blue, orange, gray, yellow, purple and green lines indicate Accuracy, Precision, Recall, F-measure, Specificity and REM estimation ratio, respectively. From the figure, the smaller the threshold value, the higher the REM estimation ratio and the better the estimation of actual REM sleep (Recall in Figure 8).
Figure 6: The number of windowed REM prediction count (WPC): subject “E”. (Left vertical axis: WPC, 0 to 10; right vertical axis: sleep stage, W, R, N12, N34; horizontal axis: time; series: WPC and correct sleep stage.)
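A WPC curve like the one in Figure 6 can be derived from the per-epoch RF outputs in two steps: summing the binary predictions of the RFs into the PC, then summing the PC over the Ne epochs before and after each epoch. The sketch below is a minimal version of that computation; the function names and the clipping of the window at the start and end of the night are assumptions, since the paper does not describe edge handling.

```python
def prediction_count(per_rf_predictions):
    """PC: number of RFs that predict REM (1) for each epoch."""
    return [sum(votes) for votes in zip(*per_rf_predictions)]


def windowed_prediction_count(pc, n_e=3):
    """WPC: sum of PC over the n_e epochs before and after each epoch.

    The window is clipped at the ends of the night (assumption).
    """
    n = len(pc)
    return [sum(pc[max(0, i - n_e):min(n, i + n_e + 1)]) for i in range(n)]


# Hypothetical binary REM predictions of the four RFs for six epochs.
predictions = [
    [0, 1, 1, 0, 0, 0],  # 32 sec. RF
    [0, 1, 1, 1, 0, 0],  # 64 sec. RF
    [0, 0, 1, 1, 0, 0],  # 128 sec. RF
    [0, 1, 1, 1, 0, 0],  # 256 sec. RF
]
pc = prediction_count(predictions)               # [0, 3, 4, 3, 0, 0]
wpc_narrow = windowed_prediction_count(pc, 1)    # [3, 7, 10, 7, 3, 0]
wpc = windowed_prediction_count(pc, 3)           # [10, 10, 10, 10, 10, 7]
```

With four RFs and Ne = 3, each window covers seven epochs, so the WPC can range from 0 to 28.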
Figure 7: The number of windowed REM prediction count (WPC): subject “I”. (Left vertical axis: WPC, 0 to 30; right vertical axis: sleep stage, W, R, N12, N34; horizontal axis: time; series: WPC and correct sleep stage.)
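Because the WPC level differs so much between subjects, the threshold is searched per subject from the WPC values alone. The following sketch is one reading of the threshold search of Algorithm 1 under stated assumptions: an epoch is detected as REM when WPC ≥ TH (the rule shown in Figure 4), the ratio list is scanned from the largest threshold downward, and the result is the smallest threshold whose REM estimation ratio does not exceed 25%. The function names and the scan direction are choices made for this sketch to match the prose description, not a transcription of the listing.

```python
def rem_estimation_ratios(wpc):
    """REM estimation ratio for TH = 1, 2, ... until no epoch is detected.

    ratios[i] belongs to threshold i + 1; wpc must be non-empty.
    """
    ratios = []
    th = 1
    while True:
        rem_count = sum(1 for v in wpc if v >= th)  # WPC >= TH, as in Figure 4
        ratios.append(rem_count / len(wpc))
        if rem_count == 0:
            break
        th += 1
    return ratios


def optimal_threshold(wpc, max_ratio=0.25):
    """Smallest threshold whose REM estimation ratio stays at or below max_ratio."""
    ratios = rem_estimation_ratios(wpc)
    optimal = len(ratios)  # fallback: the threshold at which nothing is detected
    for i in range(len(ratios) - 1, -1, -1):  # scan downward (assumption)
        if ratios[i] > max_ratio:
            break
        optimal = i + 1
    return optimal


# Toy night of 20 epochs with hypothetical WPC values.
wpc_values = [0] * 8 + [1] * 4 + [2] * 4 + [3] * 2 + [5] * 2
ratios = rem_estimation_ratios(wpc_values)  # [0.6, 0.4, 0.2, 0.1, 0.1, 0.0]
best_th = optimal_threshold(wpc_values)     # 3: ratio 0.2 at TH = 3, 0.4 at TH = 2
```

The search needs only the WPC values of one night, so it can be rerun for a new subject without retraining any RF.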
Conversely, the larger the threshold value, the lower the REM estimation ratio and the better the estimation of actual not-REM sleep (Specificity in Figure 8). In this study, the threshold value (= 2) was chosen because Precision and Recall are almost equal and the F-measure is the largest at that value. However, this threshold is susceptible to individual differences (e.g., age and physical condition on the day) and must be carefully determined for each subject.

Figure 8: Sensitivity analysis of REM estimation threshold in the proposed method. (Vertical axis: percentage, 0% to 100%; horizontal axis: threshold, 1 to 10; series: Accuracy, Precision, Recall, F-measure, Specificity and REM detection ratio.)

====Consideration of individual differences in the proposed method====
As mentioned above, this section discusses the importance of the threshold setting in the proposed method using subject “I”, who requires a significantly different threshold from the other subjects. Figure 7 shows the WPC of subject I, where the left and right vertical axes indicate the number of REM estimations (WPC) and the sleep stage, respectively, the horizontal axis indicates the time, and the orange and blue lines indicate the WPC and the correct sleep stage, respectively. Compared to the results of subject “E” in Figure 6, the WPC of subject “I” is excessive, and it is not desirable to set a threshold of the same value. To deal with such individual differences, the proposed method focuses on the physiological characteristic that REM sleep accounts for about 20% of the total sleep in one night.

Table 3 shows the thresholds and results that were automatically adjusted for each subject so that the ratio of REM estimations to the overall sleep time is about 20%.

{| class="wikitable"
|+ Table 3: The results of proposed method considering the individual differences (the row “Proposed 2” in Table 2) in detail.
! ID (Age) !! TH !! REM estimation ratio !! Accuracy !! Precision !! Recall !! F-measure !! Specificity
|-
| A (20’s) || 3 || 17.2 || 87.0 || 73.1 || 60.2 || 66.0 || 94.1
|-
| B (20’s) || 4 || 19.4 || 72.4 || 29.5 || 29.2 || 29.3 || 83.0
|-
| C (30’s) || 3 || 19.4 || 77.8 || 60.7 || 44.6 || 51.4 || 89.7
|-
| D (40’s) || 2 || 9.3 || 84.8 || 23.2 || 21.3 || 22.2 || 92.0
|-
| E (40’s) || 1 || 19.6 || 87.4 || 54.8 || 74.1 || 63.0 || 89.7
|-
| F (40’s) || 4 || 20.1 || 77.7 || 63.9 || 46.1 || 53.5 || 89.9
|-
| G (40’s) || 4 || 20.0 || 84.0 || 52.7 || 61.8 || 56.9 || 88.6
|-
| H (50’s) || 1 || 21.1 || 72.7 || 54.4 || 39.4 || 45.7 || 86.4
|-
| I (60’s) || 20 || 20.2 || 78.0 || 50.7 || 45.9 || 48.2 || 87.2
|}

In Table 3, the column “ID (Age)” indicates the subject ID and age, and the columns “TH” and “REM estimation ratio” indicate the threshold for REM estimation (an integer) and the REM estimation ratio obtained with that threshold (in percent). The other columns indicate the evaluation indicators (in percent). Each threshold is set so that the REM estimation ratio is close to 20% without exceeding 25%. As shown in Table 3, the thresholds for all subjects except subject “I” are set between 1 and 4. If the threshold value of 4 were given to subject “I” as for the other subjects, the REM estimation ratio would be 86.1% and the Accuracy, Precision, Recall, F-measure and Specificity would be 35.9%, 25.7%, 99.4%, 40.9% and 17.7%, respectively. This situation should be avoided in real applications, and the proposed method reduces wrong estimations by keeping the REM estimation ratio around 20%.

In general, to improve the accuracy of the estimation for such data, it is necessary to retrain the model by collecting similar data or devising new input features, which are difficult and time-consuming tasks. By contrast, the proposed method does not need any of this to improve the accuracy; the only thing needed is to set the REM estimation threshold. In addition, the threshold can be determined automatically based on the REM estimation ratio, so the proposed method adapts easily to individual differences. Therefore, the differences in the automatically determined thresholds, as shown in column TH of Table 3, represent individual differences. Since the heart rate is increased or unstable during REM sleep, it can be inferred, for example, that if the value of the threshold is high, the overnight heart rate may be higher or more unstable than that of an average person.

===Conclusion===
This paper proposed a novel REM estimation method that combines multiple RFs trained on spectra of different timescales, and investigated its effectiveness through comparison with REM estimation by a single RF trained on the spectrum of each scale. Concretely, the proposed method trains several RFs, one for each spectrum scale, then counts the number of REM estimations within a window and estimates REM sleep if the count exceeds the threshold. Furthermore, the threshold is determined automatically for each person, based on the ratio of REM estimations to the overall sleep time, to account for individual differences. In the human subject experiments, the Accuracy, Precision, Recall and F-measure of the REM estimation were 80.2(±5.5)%, 51.4(±15.0)%, 47.0(±15.5)% and 48.5(±13.7)%, respectively. The experiments revealed the following implications: (1) combining RFs trained separately on spectra of multiple window sizes improves the Precision and Recall of REM estimation compared with an RF trained on the spectrum of only one window size; (2) the automatic adjustment of the threshold based on the ratio of REM estimations to the total sleep length adapts flexibly to data with large individual differences without the need to retrain the model.

Future tasks are as follows: (1) to investigate the validity of the combination of multiple RFs; (2) to validate whether the method is effective for other sleep stages.

===References===
Breiman, L. 2001. Random forests. Machine Learning, 45(1): 5–32.

Cooley, J. W.; and Tukey, J. W. 1965. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19(90): 297–301.

Goodfellow, I.; Bengio, Y.; and Courville, A. 2016. Deep Learning. Cambridge: MIT Press.

Holingue, C.; Wennberg, A.; Berger, S.; Polotsky, V. Y.; and Spira, A. P. 2018. Disturbed sleep and diabetes: A potential nexus of dementia risk. Metabolism, 84: 85–93.

Mullington, J. M.; Haack, M.; Toth, M.; Serrador, J. M.; and Meier-Ewert, H. K. 2009. Cardiovascular, inflammatory, and metabolic consequences of sleep deprivation. Progress in Cardiovascular Diseases, 51(4): 294–302.

Organization for Economic Cooperation and Development. 2019. GENDER EQUALITY, Gender Data Portal. https://www.oecd.org/gender/data/.

Rechtschaffen, A.; and Kales, A. 1968. A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects. Washington DC.

Shannon, C. E. 1949. Communication in the presence of noise. Proceedings of the IRE, 37(1): 10–21.

Van Dongen, H.; Maislin, G.; Mullington, J. M.; and Dinges, D. F. 2003. The cumulative cost of additional wakefulness: dose-response effects on neurobehavioral functions and sleep physiology from chronic sleep restriction and total sleep deprivation. Sleep, 26(2): 117–126.

Watanabe, T.; and Watanabe, K. 2001. Estimation of the sleep stages by the non-restrictive air mattress sensor: relation between the change in the heart rate and sleep stages. Transactions of the Society of Instrument and Control Engineers, 37(9): 821–828.

Watanabe, T.; and Watanabe, K. 2004. Noncontact method for sleep stage estimation. IEEE Transactions on Biomedical Engineering, 51(10): 1735–1748.