=Paper= {{Paper |id=Vol-3276/SSS-22_FinalPaper_124 |storemode=property |title=REM Estimation Based on Combination of Multi-Timescale Estimations and Automatic Adjustment of Personal Bio-vibration Data of Mattress Sensor |pdfUrl=https://ceur-ws.org/Vol-3276/SSS-22_FinalPaper_124.pdf |volume=Vol-3276 |authors=Iko Nakari,Naoya Matsuda,Keiki Takadama |dblpUrl=https://dblp.org/rec/conf/aaaiss/NakariMT22 }} ==REM Estimation Based on Combination of Multi-Timescale Estimations and Automatic Adjustment of Personal Bio-vibration Data of Mattress Sensor== https://ceur-ws.org/Vol-3276/SSS-22_FinalPaper_124.pdf
      REM Estimation Based on Combination of Multi-Timescale Estimations and
       Automatic Adjustment of Personal Bio-vibration Data of Mattress Sensor
                                                Iko Nakari1 , Naoya Matsuda1 , Keiki Takadama2
                                                      The University of Electro-Communications
                                                   1-5-1 Chofugaoka, Chofu, Tokyo, Japan 182-8585
                                           {iko0528, matsuda.naoya}@cas.lab.uec.ac.jp1 , keiki@inf.uec.ac.jp2


                                     Abstract
This paper proposes a novel REM estimation method based on the combination of REM estimations with multi-timescale logarithmic spectra calculated from overnight bio-vibration data acquired from a mattress sensor. Concretely, the method trains one Random Forest (RF) for each spectrum scale, counts the number of REM estimations within a window, and estimates REM if the count exceeds a threshold. The threshold is determined automatically for each person, based on the ratio of REM estimations to the total sleep length, to account for individual differences. The human subject experiments revealed the following implications: (1) combining the RFs trained on each spectrum scale improves the Precision and Recall of REM estimation, giving Accuracy, Precision, Recall and F-measure of 80.2%, 51.4%, 47.0% and 48.5%, respectively; and (2) the automatic adjustment of the threshold adapts flexibly to data with large individual differences without the need to retrain the model.

                                 Introduction
According to a survey conducted by the Ministry of Health, Labour and Welfare, about one in three Japanese adults feels sleepy during the day at least three times a week. In addition, Japan has the shortest sleep time among the OECD member countries (Organization for Economic Cooperation and Development 2019), which suggests that many people in Japan are sleep-deprived. The accumulation of sleep deprivation (especially 4-6 hours of sleep) leads to a state of sleep debt. In the state of sleep debt, the ability to think and make decisions is equivalent to that after staying up all night (Van Dongen et al. 2003), which is a factor in the increased risk of industrial and traffic accidents. It also decreases immune function and increases the risk of developing lifestyle-related diseases such as depression and dementia (Mullington et al. 2009; Holingue et al. 2018). For individuals to stay healthy and for the government to reduce health care costs, these sleep problems should be solved as soon as possible.

To solve these sleep problems, it is important to increase the amount of sleep time, but many people suffer from poor sleep quality even if they sleep for a long time. It is therefore necessary to understand sleep quality. The standard method for measuring sleep quality (sleep stage) is to evaluate biological data acquired by the Polysomnography (PSG) test based on the Rechtschaffen & Kales (R&K) method (Rechtschaffen and Kales 1968). However, the PSG test is highly restrictive and requires a person to attach multiple electrodes to his/her head and body, which imposes a physical and mental burden and prevents obtaining data from sleep as usual. To address these problems, the demand for sleep stage estimation methods using simple sensors (such as mattress sensors) has increased as an alternative to the PSG test. For example, Watanabe developed a mattress sensor and focused on the relation between heart rate variability and sleep stage (Watanabe and Watanabe 2004). The accuracies of the method are reported as follows: 42.8% in three-stage (NREM/REM/WAKE) estimation; 82.6% in NREM estimation; 70.5% in WAKE estimation; and 38.3% in REM estimation. As the results show, the accuracy of the method is not high, especially for REM sleep estimation. This is because REM sleep estimation in the R&K method is mainly based on rapid eye movements, and mattress sensors cannot measure eye movements. Even though REM sleep has other characteristics (e.g., unstable heart and respiration rates) that can be acquired from the mattress sensor, it is hard to estimate REM sleep for the following reasons. (1) The characteristics of REM sleep appear intermittently rather than throughout REM sleep. (2) The heart rate becomes unstable due to body movements. (3) The heart rate is easily affected by individual differences and daily physical condition.

To tackle these problems, it is necessary to estimate REM sleep from a new perspective, including physiological characteristics. However, since it is not known in advance what to focus on, machine learning (ML) is a good way to estimate REM sleep from a new perspective. In this study, Random Forests (Breiman 2001) is employed as the ML model because it is easier to analyze what the model has learned from the data than with deep learning (Goodfellow et al. 2016), which is widely employed because of its high prediction accuracy. Ease of analysis is essential because it leads to the interpretability of the model in the future. However, simply applying ML to sleep stage estimation cannot deal with problem (1) mentioned above, because each epoch (30 seconds) is estimated without considering the epochs before/after it. Because of this, it is difficult to estimate REM sleep when the characteristics do not appear in the epoch. In addition to that, the ML is not good at learning
___________________________________
In T. Kido, K. Takadama (Eds.), Proceedings of the AAAI 2022 Spring Symposium
“How Fair is Fair? Achieving Wellbeing AI”, Stanford University, Palo Alto, California,
USA, March 21–23, 2022. Copyright © 2022 for this paper by its authors. Use permitted
under Creative Commons License Attribution 4.0 International (CC BY 4.0).

data with individual differences.

To overcome these problems, this paper aims to improve the accuracy of REM sleep estimation and proposes a novel REM sleep estimation method with a mattress sensor that considers the epochs before/after the corresponding epoch and automatically adjusts the REM sleep estimation threshold for each person. Concretely, this paper employs the TANITA sleep scan SL511 (Japan) as the mattress sensor for acquiring bio-vibration data, and prepares several RFs for learning multi-timescale logarithmic spectra. The REM sleep estimations of the RFs are combined into the final estimation output, and the estimation sensitivity is automatically adjusted based on the ratio of REM sleep estimations to all epochs in an overnight sleep.

This paper is organized as follows. The next section describes the sleep mechanism, especially REM sleep. Section 3 describes related works on non-contact sleep stage estimation and RF, which is the main ML method in our proposed method, and Section 4 proposes our multiple-scale REM estimation method. The experiment is conducted in Section 5 and the results are analyzed in Section 6. Finally, our conclusion is given in Section 7.

                   Sleep Mechanism
Sleep Stage
The sleep stage is an indicator of the depth of sleep defined by the R&K method. The depth of sleep is classified into six stages in each epoch (30 seconds), i.e., WAKE, REM, Non-REM1 (N1), N2, N3, and N4 (N4 is often included in N3). The proportion of each sleep stage in healthy young adults per night is as follows: WAKE is 1-5%; REM is 15-25%; N1 is 5-20%; N2 is 45-75%; and N3 is 10-22% (depending on age and physical condition on that day). To determine the sleep stage, the R&K method needs biological data such as electroencephalography (EEG), electrooculogram (EOG), and electromyogram (EMG) acquired by the PSG test. Figure 1 shows an example of overnight sleep stages, where the vertical axis indicates the sleep stage and the horizontal axis indicates the time. As shown in Figure 1, sleep in a healthy person alternates between deep sleep (N3 sleep) and shallow sleep (above N3 sleep), and regular sleep repeats this cycle (about 90 to 120 minutes) three to five times a night. Each cycle is connected by about 20 to 30 minutes of REM sleep, and this cycle is called the ultradian rhythm.

Figure 1: Example of the overnight sleep stage. [Plot: vertical axis, sleep stage (WAKE, REM, NREM1, NREM2, NREM3); horizontal axis, time (0 to 360 min).]

Characteristics of REM Sleep
The physiological characteristics of REM sleep are as follows:

  • rapid eye movement;
  • decreased skeletal muscle activity;
  • increased or unstable heart rate and respiratory rate;
  • changes in autonomic function.

In particular, REM sleep is determined by focusing on “rapid eye movement” and “decreased skeletal muscle activity” in the R&K method. This is because these two characteristics are clearly expressed in the biological data acquired by the PSG test and are easy to evaluate. Note that these characteristics occur intermittently, rather than continuously.

                   Related Works
Sleep Stage Estimation by mattress sensor
Watanabe et al. tried to extract the relation between the change in the heart rate and sleep stages through the frequency band containing multiple human biological rhythms, to build a foundation for sleep stage estimation from heart rate variability (HRV) (Watanabe and Watanabe 2001). They focused on two biological rhythms: the ultradian rhythm and the circadian rhythm, which is a cycle of approximately 25 hours. Their study revealed the relations between the frequency of HRV and sleep stage, and they built a sleep stage estimation method based on the heart rate data acquired from an air mattress sensor (Watanabe and Watanabe 2004).

Random Forests
This study employs Random Forests (RF) (Breiman 2001). The RF model repeatedly draws random samples from the training data, randomly constructs decision trees with different conditional branches, and classifies by majority vote over their results. In this research, Gini impurity is the splitting criterion; it becomes low when all the samples contained in a node of the decision tree belong to the same class. RF processing is as follows:
1. Generate a bootstrapped sample (Sj) from the training data set (S).
2. Construct a decision tree from Sj. About one-third of the original data is left out of Sj and is called Out-Of-Bag (OOB) data. Each node is processed as follows:
 (a) Extract mtry features randomly, without duplicates.
 (b) Choose the feature that minimizes Gini impurity, and divide the node.
3. Repeat 1. to 2. Ntree times.
Here, Ntree is the number of decision trees to be constructed. In classification problems, it is recommended to use the square root of the total number of features for the variable mtry, which is used to divide the nodes of decision trees.

                   Proposed Method: Multi-Timescale REM Estimation
The proposed method, Multi-Timescale REM estimation, starts by training several RFs, one on each scale of bio-vibration data, then combines REM predictions by each RF,




and estimates REM sleep based on the REM estimation threshold adjusted for each person.

Input data
To extract characteristics of bio-vibration data by ML, frequency analysis is applied to decompose the vibrations (i.e., heartbeats, respiration and body movement (BM)) into frequencies. This process is conducted as follows.

1. The Fast Fourier Transform (FFT) (Cooley and Tukey 1965) is applied to the bio-vibration data in an L-second window to convert the data to a power spectrum (note that the sampling frequency of the mattress sensor is 16Hz, so the data size is L × 16). In this study, the window size (L) is set as follows to capture several scales of REM: L = {32, 64, 128, 256}. According to the sampling theorem (Shannon 1949), the frequency that can be analyzed by FFT is up to 8Hz, so the data size of the power spectrum is L×8, and the frequency resolution is 1/L Hz. Figure 2(a) shows an example of a power spectrum (L = 64) calculated from bio-vibration data, where the vertical axis indicates the density of the power spectrum and the horizontal axis indicates the frequency. In particular, the frequency band between 0.1Hz and 0.3Hz is related to respiration, and the frequency band between 0.6Hz and 1.5Hz is related to heartbeats. Regarding BM, the larger/smaller the BM, the higher/lower the density of the power spectrum. However, as shown in Figure 2(a), it is difficult to understand the shape of the power spectrum above 1Hz because of the high density of frequencies below 1Hz.
2. To make the spectrum above 1Hz easier to understand and easier for RF to learn, the power spectrum is converted into a logarithmic spectrum (log10). Figure 2(b) shows an example of the logarithmic spectrum converted from the power spectrum of Figure 2(a), where the vertical axis indicates the density (logarithmic value) of the spectrum and the horizontal axis indicates the frequency. Furthermore, the density of each frequency in the logarithmic spectrum is normalized to [0, 1] based on the density values over all frequencies.
3. This logarithmic spectrum is calculated every 30 seconds (the stride size is 30 seconds) and labeled with the correct sleep stage (REM/Not-REM) determined by the R&K method for RF to learn. Figure 3 shows an example of strides (the window size is 128 seconds) and how the sleep stage is labeled to the spectrum. Because the bio-vibration data in a window often span multiple sleep stages, in this study the sleep stage labeled to the spectrum is determined by a majority vote over the proportions occupied by those sleep stages. Note that, when using RF for REM prediction (not in the learning phase), the logarithmic spectrum is not labeled with the correct sleep stage, and the output of the prediction is for the first epoch. The number of input data that can be calculated from one subject (in the case of seven hours of sleep) is about 840.

Figure 2: Examples of (a) power spectrum and (b) logarithmic spectrum (L = 64).

Figure 3: Example of strides and how to label sleep stage to the spectrum. [Diagram with two panels, “Learning phase: labeling to the data” and “Estimating phase: predict for first epoch”, showing 30 sec. strides over 128 sec. of data along the time axis.]

REM estimation based on multiple scales spectrum
Figure 4 shows the overview of the proposed Multi-Timescale REM Estimation. The flow of the method is as follows: (1) preparing RFs for a number of scales (one per spectrum window size), and learning each scale spectrum by each RF; (2) combining the number of REM predictions by each RF in each window (note that this window is different from the window size of the spectrum); (3) exploring the optimal threshold for REM estimation from overnight data, and detecting REM sleep when the number of REM predictions in a window counted in (2) exceeds the threshold.

Figure 4: Overview of Multi-Timescale REM Estimation. [Diagram with three panels: (1) Learning each scale spectrum by each RF (RFs for 32 sec., 64 sec., 128 sec. and 256 sec.); (2) Combining the REM predictions (PC: prediction count; WPC: windowed prediction count); (3) Exploring optimal threshold and estimating REM (example: TH = 1 gives a REM estimation ratio of 60%, TH = 2 gives 38%, TH = 5 gives 21%; if WPC ≥ 5 then REM, if WPC < 5 then not REM).]

Combining REM predictions by each RF: Our method outputs four REM predictions for each epoch from the RFs of 32 sec., 64 sec., 128 sec. and 256 sec., and the REM Prediction Count (PC) is counted for each epoch as shown at the top of Figure 4(2). Since REM sleep does not occur singly (one epoch)




but in clusters (successive epochs), the method prepares a window that counts PC for Ne epochs before/after the epoch to be estimated, in order to consider the state before/after that epoch. This count is called the Windowed Prediction Count (WPC), as shown at the bottom of Figure 4(2).

Automatic adjustment of REM estimation threshold: The final output of the proposed REM estimation for an epoch is determined by the value of WPC. If the WPC exceeds a certain threshold, the epoch is detected as REM sleep, as shown in Figure 4(3). The REM estimation ratio, i.e., the ratio of REM estimations to all epochs in an overnight sleep (without considering correct and incorrect answers), decreases as the threshold increases. According to the physiological characteristics of sleep, the proportion of REM sleep per overnight sleep is about 20%, so the proposed method explores, for each person, the threshold at which the REM estimation ratio is about 20%, to avoid excessive or insufficient estimation.

The algorithm for exploring the optimal threshold is described in Algorithm 1. The algorithm counts the number of epochs detected as REM sleep by a threshold in overnight sleep, and calculates the REM estimation ratio, until the REM sleep count becomes 0 (see lines 6 to 13). Then, the optimal threshold is extracted as the previous threshold where the REM estimation ratio exceeds 25% (see lines 15 to 20).

Algorithm 1: Exploring optimal threshold
 1: r_list[ ]: REM estimation ratio for each threshold
 2: WPC_list[ ]: WPC for each epoch in overnight sleep
 3: N_EPOCH: Number of epochs in overnight sleep
 4: current_TH ⇐ 1: Start to explore from 1
 5: IS_CONTINUE ⇐ true
 6: while IS_CONTINUE do
 7:    REM_COUNT ⇐ count epochs detected as REM from WPC_list
 8:    r_list.append(REM_COUNT / N_EPOCH)
 9:    if REM_COUNT is equal to 0 then
10:       IS_CONTINUE ⇐ false
11:    end if
12:    current_TH++
13: end while
14: optimal_TH: Optimal threshold found by the exploring
15: for i = 0 to r_list.length−1 do
16:    if r_list[i] > 0.25 then
17:       break
18:    end if
19:    optimal_TH = i
20: end for

                      Experiments
To investigate the effectiveness of the proposed Multi-Timescale REM Estimation, this paper conducted a human subject experiment with nine healthy subjects. The performance of the REM estimation is compared with that of RFs trained on each window size of logarithmic spectrum. The information on the subjects is summarized in Table 1.

Table 1: Information of healthy subjects.
ID (Age)    WAKE    REM    N1     N2    N34    Total
A (20’s)      46    176    41    421    164      848
B (20’s)      53    113    48    368      2      584
C (30’s)     103    184    35    382      0      704
D (40’s)      75     65    46    419      2      607
E (40’s)      44     85    53    390     23      595
F (40’s)      34    121    40    225      0      420
G (40’s)      53    110    80    407      1      651
H (50’s)      35    249    56    520      0      860
I (60’s)      98    159    27    436      0      720

The column “ID (Age)” indicates the ID of the subject and his/her age. The columns from “WAKE” to “N34” indicate the number of epochs in each sleep stage (WAKE, REM, NREM1, NREM2 and NREM34), and the column “Total” indicates the total number of epochs in one night. The average number of epochs (30 seconds) of sleep is 664±130. As evaluation criteria, this study employs five evaluation indicators: Accuracy, Precision, Recall, F-measure and Specificity of REM estimation. In addition, this study evaluates the REM estimation ratio to see if REM estimations are being made at an appropriate frequency.

Setup
Electrodes were attached to the body and head of each subject to acquire EEG, EOG and EMG, and the mattress sensor was placed under the mattress in the bed to acquire bio-vibration data for one night. After sleep, the correct sleep stages for each subject were determined according to the R&K method based on the data measured by PSG (with the help of a medical specialist), and the bio-vibration data measured by the mattress sensor were converted to logarithmic spectra of several scales (i.e., window sizes L = {32, 64, 128, 256, 512}), which were labeled with the correct sleep stage in each epoch.

The logarithmic spectrum of each scale is learned by a separate RF. The training data are generated from eight subjects, and the validation data come from the remaining subject. The ratio of REM sleep to not-REM sleep in the training data is 1:3, because REM sleep accounts for 20% of one night's sleep and to prevent excessive REM estimation. Data with large BM are excluded because they affect the shape of the spectrum and are difficult for RF to learn. The parameters of RF are set as follows: (i) the maximum depth of a decision tree is 10; (ii) the number of decision trees is 50; (iii) the number of features employed to construct the decision tree is 16, 23, 32, 46 and 64 for window sizes 32, 64, 128, 256 and 512, respectively. The window size Ne for counting WPC is set as 3.

Results
Table 2 shows the results of REM estimations. The column “Type of RF” indicates the RF trained on each window size of spectrum, the rows “Proposed 1” and “Proposed 2” are the combinations of multiple RF results, and the other




Table 2: Results of REM estimation by RFs learned with each window size and the proposed method.
Values are percentages, expressed as mean ± standard deviation.

Type of RF                          Accuracy      Precision      Recall         F-measure      Specificity    REM estimation ratio
32 sec. window                      74.1 ± 8.5    33.0 ± 21.3     9.8 ± 13.6    11.3 ± 10.3    91.5 ± 12.2     8.8 ± 12.3
64 sec. window                      76.0 ± 7.1    38.5 ± 21.3    12.9 ± 18.9    14.2 ± 12.3    92.9 ± 11.6     8.4 ± 13.3
128 sec. window                     76.6 ± 8.3    47.3 ± 14.8    20.6 ± 16.5    24.8 ± 9.8     91.5 ± 12.3    11.0 ± 13.3
256 sec. window                     78.0 ± 7.8    52.8 ± 20.8    24.5 ± 17.4    29.9 ± 12.6    92.0 ± 11.2    11.6 ± 12.4
512 sec. window                     76.3 ± 5.9    25.8 ± 16.8     4.6 ± 3.8      7.2 ± 5.3     95.6 ± 4.4      4.5 ± 4.1
Proposed 1 (same TH (= 2))          79.0 ± 8.1    52.8 ± 20.7    55.4 ± 21.3    51.1 ± 16.4    84.9 ± 9.8     23.3 ± 11.1
Proposed 2 (consider individual)    80.2 ± 5.5    51.4 ± 15.0    47.0 ± 15.5    48.5 ± 13.7    89.0 ± 3.0     18.5 ± 3.4
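
The indicators reported in Table 2 can be read against the usual confusion-matrix definitions. The following is a minimal sketch under assumptions: REM is treated as the positive class, labels are per-epoch 0/1 values, and the function name `rem_metrics` and the toy labels are illustrative rather than taken from the paper.

```python
def rem_metrics(y_true, y_pred):
    """Compute REM-estimation indicators from per-epoch labels (1 = REM, 0 = not REM).

    Assumption: REM is the positive class; this is a sketch, not the authors' code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # REM correctly detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not-REM correctly rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false REM detection
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed REM epoch
    n = len(pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    specificity = tn / (tn + fp) if tn + fp else 0.0

    return {
        "accuracy": (tp + tn) / n,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "specificity": specificity,
        # Share of epochs estimated as REM, regardless of correctness.
        "rem_estimation_ratio": (tp + fp) / n,
    }


# Toy example with eight epochs (illustrative values only).
example = rem_metrics([1, 1, 0, 0, 0, 1, 0, 0], [1, 0, 0, 1, 0, 1, 0, 0])
print(example)
```

Because Table 2 averages each indicator over the nine subjects, the mean F-measure need not equal the harmonic mean of the mean Precision and the mean Recall.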


Figure 5: Estimation result of subject “E”. [Plot: vertical axis, sleep stage (W, R, N12, N34); horizontal axis, time (0:00:00 to 4:48:00); lines: correct sleep stage, 32sec. detection, 64sec. detection, 128sec. detection, 256sec. detection.]


columns are the evaluation indicators. Each value is expressed as the mean over the nine subjects ± the standard deviation, in percent. As shown in Table 2, the Accuracy, Precision, Recall and F-measure increase as the window size of the spectrum widens, but when the window size is too wide (i.e., 512 sec.), the Precision, Recall and F-measure fall below those of every other window size, and the Accuracy falls below that of the 128- and 256-second windows. The Specificity for 512 seconds is the largest of all results because the REM estimation ratio of the RF learned with the 512-second window is small (i.e., that RF rarely estimates REM).
   Focusing on the results of the two proposed methods, the values of Accuracy, Recall and F-measure outperform all other results (i.e., a single RF learned with each window size of spectrum). The values of Precision and Specificity are not the best among the results, but they are close to the best. In addition, the REM estimation ratio is larger than in any other result and is close to the average ratio of REM sleep per night (about 20%). The standard deviation of each evaluation indicator is smaller for proposed method 2, which considers individual differences, than for proposed method 1, which does not, and this suggests that the results are stable across subjects.

                        Discussion
How does the combination of multiple RF results contribute?

Figure 5 shows the estimation result of subject E, where the vertical axis indicates the sleep stage (WAKE, REM, NREM12, NREM34), the horizontal axis indicates the time, the blue line indicates the correct sleep stage, and the orange, gray, yellow and green lines are the REM estimation results of the RFs for 32, 64, 128 and 256 seconds, respectively. As shown in Figure 5, the areas of actual REM sleep marked by red circles tend to have a concentration of REM estimations by the four types of RF. On the other hand, the areas of actual non-REM sleep tend to have few REM estimations. Based on the proposed method, the number of REM estimations by each RF in a window interval of 3 epochs (1.5 minutes) before/after is shown in Figure 6. In Figure 6, the left and right vertical axes indicate the windowed REM prediction count (WPC) and the sleep stage, respectively, the horizontal axis indicates the time, and the orange and blue lines indicate the WPC and the correct sleep stage, respectively. As shown in Figure 6, the WPC tends to be high in the areas of actual REM sleep and low in the areas of actual non-REM sleep. The proposed method (the combination of multiple RF results) exploits this tendency and estimates REM sleep when the WPC exceeds a certain threshold. The threshold for REM estimation in the proposed method was determined by the sensitivity analysis of the REM estimation threshold shown in Figure 8, where the vertical axis indicates the percentage, the horizontal axis indicates the threshold, and the blue, orange, gray, yellow, purple and green lines indicate Accuracy, Precision, Recall, F-measure, Specificity and REM estimation ratio, respectively. From the figure, the smaller the threshold value, the higher the REM estimation ratio and the better the estimation of actual REM sleep (Recall in Figure 8), while the larger the threshold value,




[Figure 6 plot: left vertical axis = WPC (0 to 10); right vertical axis = sleep stage (W, R, N12, N34); horizontal axis = time (0:00:00 to 5:02:24); series: windowed REM prediction count (WPC) and correct sleep stage]

Figure 6: Windowed REM prediction count (WPC) of subject “E”.
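
As a rough illustration of the quantity plotted in Figure 6, the following Python sketch counts, for every epoch, the REM estimations made by all RFs within 3 epochs before and after it, and then applies a threshold. It is a sketch under assumptions, not the authors' implementation: the function names are hypothetical, the window is assumed to include the center epoch and to be truncated at the ends of the recording, and "exceeds" is read as strictly greater than the threshold.

```python
def windowed_rem_prediction_count(rf_predictions, half_window=3):
    """Return the WPC for every epoch.

    rf_predictions: list with one 0/1 sequence per RF (1 = REM estimated),
    all of the same length (one entry per epoch).
    half_window: number of epochs counted before and after each epoch.
    """
    if not rf_predictions:
        raise ValueError("at least one RF prediction sequence is required")
    n_epochs = len(rf_predictions[0])
    if any(len(p) != n_epochs for p in rf_predictions):
        raise ValueError("all RF prediction sequences must be equally long")

    # Number of RFs that estimate REM at each single epoch.
    per_epoch = [sum(p[i] for p in rf_predictions) for i in range(n_epochs)]

    wpc = []
    for i in range(n_epochs):
        # Assumption: the window is cut off at the start and end of the night.
        lo = max(0, i - half_window)
        hi = min(n_epochs, i + half_window + 1)
        wpc.append(sum(per_epoch[lo:hi]))
    return wpc


def estimate_rem(wpc, threshold):
    """Estimate REM (1) where the WPC exceeds the threshold, else not REM (0)."""
    # Assumption: "exceeds" is interpreted as strictly greater than.
    return [1 if count > threshold else 0 for count in wpc]
```

For example, with two RFs and `half_window=1`, the predictions `[[0, 1, 1, 0], [0, 0, 1, 1]]` give per-epoch counts `[0, 1, 2, 1]`, and the WPC sums these counts over each epoch and its neighbors.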

[Figure 7 plot: left vertical axis = WPC (0 to 30); right vertical axis = sleep stage (W, R, N12, N34); horizontal axis = time (0:00:00 to 6:00:00); series: windowed REM prediction count (WPC) and correct sleep stage]


Figure 7: Windowed REM prediction count (WPC) of subject “I”.

[Figure 8 plot: vertical axis = percentage (0.0% to 100.0%); horizontal axis = threshold (1 to 10); series: Accuracy, Precision, Recall, F-measure, Specificity and REM estimation ratio]

Figure 8: Sensitivity analysis of REM estimation threshold in the proposed method.

the lower the REM estimation ratio and the better the estimation of actual non-REM sleep (Specificity in Figure 8). In this study, the threshold value (= 2) was chosen because, at that value, the Precision and Recall are almost equal and the F-measure is the largest. However, this threshold is susceptible to individual differences (e.g., age and physical condition on the day) and must be carefully determined for each subject.

Consideration of individual differences in the proposed method

As mentioned above, this section discusses the importance of the threshold setting in the proposed method using subject “I”, who requires a significantly different threshold from the other subjects. Figure 7 shows the WPC of subject I, where the left and right vertical axes indicate the WPC and the sleep stage, respectively, the horizontal axis indicates the time, and the orange and blue lines indicate the WPC and the correct sleep stage, respectively. Compared to the results of subject “E” in Figure 6, the WPC of subject “I” is excessively high, so it is not desirable to use the same threshold value. To deal with such individual differences, the proposed method focuses on the physiological characteristic that REM sleep accounts for about 20% of the total sleep in one night.
   Table 3 shows, together with the results, the thresholds that were automatically adjusted for each subject so that the REM estimation ratio to the overall sleep time is about 20%. In Table 3, the column “ID (Age)” indicates the subject ID and age, and the columns “TH” and “REM estimation ratio” indicate the threshold for REM estimation (an integer) and the REM estimation ratio (in percent) obtained with that threshold. The other columns indicate the evaluation indicators (in percent). Each threshold is set so that the REM estimation ratio is close to 20% without exceeding 25%. As shown in Table 3, the thresholds for all subjects except subject “I” are between 1 and 4. If the threshold value of 4 were given to subject “I” as for the other subjects, the REM estimation ratio would be 86.1%, and the Accuracy, Precision, Recall, F-measure and Specificity would be 35.9%, 25.7%, 99.4%, 40.9% and 17.7%, respectively. This situation should be avoided in real applica-




  Table 3: Detailed results of the proposed method considering individual differences (the row “Proposed 2” in Table 2).

             ID (Age)     TH     REM estimation ratio      Accuracy    Precision    Recall     F-measure     Specificity
             A (20’s)      3            17.2                 87.0        73.1        60.2        66.0          94.1
             B (20’s)      4            19.4                 72.4        29.5        29.2        29.3          83.0
             C (30’s)      3            19.4                 77.8        60.7        44.6        51.4          89.7
             D (40’s)      2             9.3                 84.8        23.2        21.3        22.2          92.0
             E (40’s)      1            19.6                 87.4        54.8        74.1        63.0          89.7
             F (40’s)      4            20.1                 77.7        63.9        46.1        53.5          89.9
             G (40’s)      4            20.0                 84.0        52.7        61.8        56.9          88.6
             H (50’s)      1            21.1                 72.7        54.4        39.4        45.7          86.4
             I (60’s)     20            20.2                 78.0        50.7        45.9        48.2          87.2
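
The thresholds in Table 3 are described as being set so that the REM estimation ratio is close to 20% without exceeding 25%. One way such a rule could be sketched in Python is shown below. This is a hypothetical illustration rather than the authors' procedure: the function names, the search over integer thresholds starting at 1, the strict "greater than" comparison, and the tie-breaking toward the smaller threshold are all assumptions.

```python
def rem_estimation_ratio(wpc, threshold):
    """Percentage of epochs whose WPC exceeds the threshold."""
    # Assumption: "exceeds" is interpreted as strictly greater than.
    return 100.0 * sum(1 for count in wpc if count > threshold) / len(wpc)


def choose_threshold(wpc, target=20.0, upper=25.0):
    """Pick the integer threshold whose REM estimation ratio is closest
    to `target` percent without exceeding `upper` percent.

    wpc: windowed REM prediction count per epoch for one subject.
    Returns (threshold, ratio_in_percent).
    """
    if not wpc:
        raise ValueError("wpc must not be empty")

    best = None  # (gap to target, threshold, ratio)
    # The largest candidate always yields a ratio of 0%, which satisfies
    # the upper bound, so a valid threshold is always found.
    for th in range(1, max(1, max(wpc)) + 1):
        ratio = rem_estimation_ratio(wpc, th)
        if ratio > upper:
            continue
        gap = abs(ratio - target)
        # Strict comparison keeps the smaller threshold on ties (assumption).
        if best is None or gap < best[0]:
            best = (gap, th, ratio)
    return best[1], best[2]
```

Under this sketch, a subject whose WPC is high throughout the night, as for subject “I” in Figure 7, would receive a larger threshold than a subject with a generally low WPC, without any retraining of the RFs.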


tions, and the proposed method reduces wrong estimations by keeping the REM estimation ratio around 20%.
   In general, to improve the accuracy of the estimation for such data, it is necessary to retrain the model by collecting similar data or devising new input features, which are difficult and time-consuming tasks. By contrast, the proposed method needs neither of these to improve the accuracy; the only thing needed is to set the REM estimation threshold. In addition, the threshold can be determined automatically based on the REM estimation ratio, so the proposed method adapts easily to individual differences. Therefore, the differences in the automatically determined thresholds, as shown in column TH of Table 3, represent individual differences. Since the heart rate is increased or unstable during REM sleep, it can be inferred, for example, that if the value of the threshold is high, the overnight heart rate may be higher or more unstable than that of an average person.

                         Conclusion
This paper proposed a novel REM estimation method that combines multiple RFs learned with spectra of different timescales and investigated its effectiveness through comparison with REM estimation by a single RF learned with the spectrum of each scale. Concretely, the proposed method learns several RFs, one for each spectrum scale, then counts the number of REM estimations within the window and estimates REM sleep if the count exceeds the threshold. Furthermore, the threshold is automatically determined for each person based on the REM estimation ratio to the overall sleep time in order to consider individual differences. In the human subject experiments, the Accuracy, Precision, Recall and F-measure of the REM estimation were 80.2(±5.5)%, 51.4(±15.0)%, 47.0(±15.5)% and 48.5(±13.7)%, respectively. Through the experiments, the following implications have been revealed: (1) the combination of RFs learned separately with spectra of multiple window sizes improves the Precision and Recall of REM estimation compared with an RF learned with the spectrum of only one particular window size; (2) the automatic adjustment of the threshold based on the REM estimation ratio to the total sleep length can flexibly adapt to data with large individual differences without the need to retrain the model.
   Future tasks are as follows: (1) to investigate the validity of the combination of multiple RFs; (2) to validate whether the method is effective for other sleep stages.

                         References
Breiman, L. 2001. Random forests. Machine learning, 45(1): 5–32.
Cooley, J. W.; and Tukey, J. W. 1965. An algorithm for the machine calculation of complex Fourier series. Mathematics of computation, 19(90): 297–301.
Goodfellow, I.; Bengio, Y.; and Courville, A. 2016. Deep learning, volume 1. MIT press Cambridge.
Holingue, C.; Wennberg, A.; Berger, S.; Polotsky, V. Y.; and Spira, A. P. 2018. Disturbed sleep and diabetes: A potential nexus of dementia risk. Metabolism, 84: 85–93.
Mullington, J. M.; Haack, M.; Toth, M.; Serrador, J. M.; and Meier-Ewert, H. K. 2009. Cardiovascular, inflammatory, and metabolic consequences of sleep deprivation. Progress in cardiovascular diseases, 51(4): 294–302.
Organization for Economic Cooperation and Development. 2019. GENDER EQUALITY, Gender Data Portal. https://www.oecd.org/gender/data/.
Rechtschaffen, A.; and Kales, A. 1968. A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects. Washington DC.
Shannon, C. E. 1949. Communication in the presence of noise. Proceedings of the IRE, 37(1): 10–21.
Van Dongen, H.; Maislin, G.; Mullington, J. M.; and Dinges, D. F. 2003. The cumulative cost of additional wakefulness: dose-response effects on neurobehavioral functions and sleep physiology from chronic sleep restriction and total sleep deprivation. Sleep, 26(2): 117–126.
Watanabe, T.; and Watanabe, K. 2001. Estimation of the sleep stages by the non-restrictive air mattress sensor relation between the change in the heart rate and sleep stages. Transactions of the Society of Instrument and Control Engineers, 37(9): 821–828.
Watanabe, T.; and Watanabe, K. 2004. Noncontact method for sleep stage estimation. IEEE Transactions on biomedical engineering, 51(10): 1735–1748.



