<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
<journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Based on Variational Mode Decomposition and Lip-Motion-Guided Artifact Removal</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Boxiang Liu</string-name>
          <email>liubx@stu.scu.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiujuan Zheng</string-name>
          <email>xiujuanzheng@scu.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yue Ivan Wu</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>College of Electrical Engineering, Sichuan University</institution>
          ,
          <addr-line>Chengdu 610065</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>College of Electronics and Information Engineering, Sichuan University</institution>
          ,
          <addr-line>Chengdu 610065</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Key Laboratory of Information and Automation Technology of Sichuan Province</institution>
          ,
          <addr-line>Chengdu 610065</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Remote photoplethysmography (rPPG) enables non-contact heart rate (HR) monitoring and holds great promise for applications in health monitoring, emotion recognition, and beyond. However, motion introduces substantial noise within the HR frequency band, which is a primary cause of rPPG performance degradation. To enhance the robustness of rPPG in complex scenarios, we propose a signal-processing-based framework for remote HR estimation. First, variational mode decomposition (VMD) is introduced to achieve effective separation of noise and pulse components in the rPPG signal. Then, the main sources of motion artifacts are analyzed, and motion information is derived based on the positional variations of the lip landmark in the video frames. By leveraging time-delay analysis, motion-induced noise components in the rPPG signals are accurately removed. Finally, principal component analysis (PCA) is applied to reconstruct the heartbeat component from the remaining signal set, and HR estimation is further refined by exploiting the temporal continuity of physiological parameters. Using the proposed method, we achieved third place in the 4th Remote Physiological Signal Sensing (RePSS) Challenge.</p>
      </abstract>
      <kwd-group>
        <kwd>Remote photoplethysmography</kwd>
        <kwd>HR estimation</kwd>
        <kwd>signal separation</kwd>
        <kwd>noise removal</kwd>
        <kwd>pulse signal reconstruction</kwd>
        <kwd>HR refinement</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>HR is a fundamental indicator of human physiological status. Traditional HR monitoring methods mainly
include electrocardiography (ECG) and photoplethysmography (PPG), both of which require direct
contact with the skin. However, long-term use of such contact-based methods may cause discomfort
or skin irritation, making them unsuitable for sensitive populations such as infants or patients with
skin damage. As a result, rPPG technology, which enables non-contact HR measurement, has attracted
increasing attention in recent years.</p>
      <p>
        The human heartbeat causes fluctuations in blood volume within blood vessels, and these fluctuations
affect the light absorption characteristics of the vessels, manifesting as rhythmic changes in skin
color. rPPG extracts the pulse signal from these color changes. This technique
enables unobtrusive, long-term monitoring of various physiological parameters[
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ], making it
suitable for daily personal health management. However, a major obstacle to the practical deployment of
rPPG technology is its high sensitivity to illumination changes and human motion, which significantly
degrades its measurement accuracy.
      </p>
      <p>
        In the early stages of rPPG technology development, commonly used methods included blind source
separation (BSS), signal decomposition, and color space transformation. Poh et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] employed
independent component analysis (ICA) to separate the pulse component from RGB channel signals.
Song et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] applied ensemble empirical mode decomposition (EEMD) to decompose rPPG signals
from several facial sub-regions. Wang et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] introduced the Plane-Orthogonal-to-Skin (POS) method
to suppress illumination variations and specular reflection interference. Nevertheless, these traditional
methods generally overlooked the importance of accurate signal decomposition and explicit extraction
of noise source information, making them susceptible to noise interference and performance degradation
in practical applications.
      </p>
      <p>[Figure 1: Overview of the proposed framework: 1. rPPG signal extraction by pixel averaging over four facial ROIs; 2. multi-color space conversion (C1-C5); motion noise extraction and removal; PCA-based pulse reconstruction; and HR refinement.]</p>
      <p>
        In recent years, deep learning methods have rapidly advanced and have been increasingly studied in
the field of rPPG. Bousefsaf et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] employed a 3D convolutional neural network (3D CNN) to extract
features directly from raw video and used a multilayer perceptron to regress HR. Lokendra et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
incorporated facial action units (AUs) along with temporal signals from various triangular ROIs into
a multi-channel temporal convolutional network (TCN) to denoise and enhance estimation accuracy.
However, these deep learning–based approaches rely on large-scale annotated datasets and suffer from
limited interpretability.
      </p>
      <p>We propose an algorithmic framework that integrates multi-color space signal decomposition, noise
removal, pulse reconstruction, and HR refinement. The main contributions of this work are as follows:
1) We introduce VMD into the rPPG field, which significantly improves the accuracy of rPPG signal
decomposition and lays the foundation for subsequent accurate noise removal and pulse reconstruction.</p>
      <p>2) For the first time, motion reference noise is extracted based on the positional variations of lip
landmarks, and motion-induced noise is accurately identified in complex scenarios by exploiting the
common source characteristics shared between the motion components in rPPG signals and the reference
noise.</p>
      <p>3) A temporal continuity constraint of physiological parameter variation is incorporated into the HR
estimation process, further enhancing the stability of the estimated HR.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Method</title>
      <p>The proposed algorithm consists of four main steps, as illustrated in Fig. 1. First, the regions of interest
(ROIs) are segmented, and the raw rPPG signals are extracted. Second, the ROI signals are transformed
into multiple color spaces. Third, each color-space signal is decomposed using VMD. Time-delay
analysis is then performed between the decomposed components and the reference noise to eliminate
motion artifacts. Finally, PCA is applied to reconstruct the pulse waveform, and a temporal continuity
constraint is used to refine the HR estimation.</p>
      <sec id="sec-2-1">
        <title>2.1. Regions of Interest Segmentation and Raw rPPG Signal Extraction</title>
        <p>
          We employed the open source MediaPipe Face Mesh [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] to obtain facial landmarks. Four ROIs were then
defined on capillary-rich areas of the face, as shown in Fig. 1. Spatial pixel averaging is then performed
within each ROI of every frame to obtain the corresponding raw rPPG signals.
        </p>
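          <p>The spatial averaging step can be sketched as follows. This is a minimal illustration with synthetic data; in the actual pipeline the ROI masks come from MediaPipe Face Mesh landmarks, which we replace here with a hypothetical rectangular mask.</p>

```python
import numpy as np

def roi_raw_signal(frames, mask):
    """Spatially average the pixels inside one ROI for every frame.

    frames: (T, H, W, C) float video array
    mask:   (H, W) boolean ROI mask (in the paper, derived from
            MediaPipe Face Mesh landmarks; hypothetical here)
    returns: (T, C) raw rPPG signal, one sample per frame and channel
    """
    return frames[:, mask, :].mean(axis=1)

# Synthetic example: 100 frames of an 8x8, 3-channel video
rng = np.random.default_rng(0)
frames = rng.random((100, 8, 8, 3))
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True  # stand-in for a landmark-defined ROI
signal = roi_raw_signal(frames, mask)
print(signal.shape)  # (100, 3)
```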
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Color Space Conversion</title>
        <p>
          In this study, we employed five color spaces that are favorable for pulse extraction, including CHROM [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ],
POS [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], Green, NIR, and a fusion scheme that we propose to integrate the NIR and RGB channels.
        </p>
        <p>
          The variations in light intensity at the facial ROI locations are simultaneously reflected in both the
visible and NIR channels. Based on the derivation in Ref. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], we propose projecting the normalized
RGB and NIR data onto a direction orthogonal to the illumination variation, as shown in Eq. (1):
C_f = 3 × NIR_n − R_n − G_n − B_n    (1)
where NIR_n, R_n, G_n, and B_n represent the normalized NIR and RGB channel signals, respectively, and
C_f denotes the fused signal. Because the weights (3, −1, −1, −1) sum to zero, the projection cancels
intensity variations common to all four channels; thus, based on Eq. (1), the RGB and NIR data can be
effectively fused while suppressing illumination variations.
        </p>
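        <p>Eq. (1) translates directly into code. The sketch below (function name illustrative) also demonstrates why the projection suppresses illumination changes: an intensity variation common to all four normalized channels cancels exactly.</p>

```python
import numpy as np

def fuse_nir_rgb(nir_n, r_n, g_n, b_n):
    """Eq. (1): project the normalized channels onto (3, -1, -1, -1),
    a direction orthogonal to the uniform illumination direction
    (1, 1, 1, 1)."""
    return 3.0 * nir_n - r_n - g_n - b_n

# An illumination change that hits all four channels equally
# vanishes after fusion:
common = np.array([0.0, 0.5, 1.0, 0.5])
print(fuse_nir_rgb(common, common, common, common))  # [0. 0. 0. 0.]

# A component present only in the NIR channel survives:
pulse = np.array([0.1, -0.1, 0.1, -0.1])
print(fuse_nir_rgb(common + pulse, common, common, common))
```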
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Motion Noise Removal</title>
        <sec id="sec-2-3-1">
          <title>2.3.1. Variational Mode Decomposition</title>
          <p>
            VMD is a fully non-recursive signal decomposition method that decomposes a signal by formulating
and solving a constrained variational problem. It decomposes the input signal into K intrinsic mode
functions (IMFs) and effectively avoids mode mixing. The details of the VMD algorithm can be found
in Ref. [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ]. VMD decomposes the signals from multiple color spaces into a set containing both pulse
and motion noise components. Therefore, it is necessary to remove the motion components.
          </p>
        </sec>
        <sec id="sec-2-3-2">
          <title>2.3.2. Motion Noise Extraction</title>
          <p>Human motion can be categorized into rigid and non-rigid movements: rigid motion is associated with
large-scale head movements, while non-rigid motion relates to deformations of local facial regions. From
a signal processing perspective, if reference noise correlated with these movements can be identified, it
becomes possible to accurately remove motion components from the rPPG signal.</p>
          <p>Non-rigid movements are typically associated with mouth motion, while rigid movements can also
be reflected in the displacement of the mouth. Therefore, we extract facial motion information based
on the relative positional changes of lip landmarks in the image. In this study, we select the landmark
located at the center of the upper lip. For a given video sequence, a time series related to the distance
between the landmark and the origin can be obtained, which we refer to as reference noise.</p>
          <p>Since the motion noise in the rPPG signal shares a common source with the reference noise,
their waveform variations exhibit a high degree of similarity, i.e., they have a small time delay. Based
on this, we analyze the time delay τ between each VMD-decomposed signal and the reference noise
to identify motion-related components. Due to noise interference or system errors, the time delay τ
between the motion component in the rPPG signal and the reference noise may exhibit slight deviations
around zero. Based on experiments, this paper treats decomposed components whose cross-correlation
peaks satisfy |τ| ≤ 0.2 s as motion noise and removes them accordingly.</p>
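          <p>The time-delay criterion can be sketched with a cross-correlation in NumPy. This is a minimal version: the 25 fps frame rate matches the datasets used later, but the synthetic signals are illustrative, and both inputs are assumed to be mean-removed, band-limited components.</p>

```python
import numpy as np

FS = 25.0  # video frame rate in fps

def delay_at_peak(component, reference, fs=FS):
    """Lag (in seconds) of the cross-correlation peak between one
    VMD-decomposed component and the lip-landmark reference noise."""
    c = component - component.mean()
    r = reference - reference.mean()
    xcorr = np.correlate(c, r, mode="full")
    lag = int(np.argmax(np.abs(xcorr))) - (len(r) - 1)
    return lag / fs

def is_motion_noise(component, reference, fs=FS, max_delay=0.2):
    """Paper's criterion: components whose correlation peak lies at
    |tau| <= 0.2 s are treated as motion noise and removed."""
    return abs(delay_at_peak(component, reference, fs)) <= max_delay

# A component that is a slightly delayed copy of the reference
# noise is flagged, and the delay itself is recovered:
t = np.arange(0, 10, 1 / FS)
ref = np.sin(2 * np.pi * 1.2 * t)            # 1.2 Hz reference noise
comp = np.sin(2 * np.pi * 1.2 * (t - 0.08))  # delayed by 2 frames
print(delay_at_peak(comp, ref))  # 0.08
print(is_motion_noise(comp, ref))  # True
```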
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Pulse Signal Reconstruction and HR Refinement</title>
        <p>After motion noise removal, the signal set S_r is considered to contain primarily pulse components along
with some random noise. PCA is capable of extracting the common signal components shared across
multiple channels into the first principal component, while suppressing weakly correlated random noise.
Based on this, we apply PCA to S_r to reconstruct the dominant pulse component. For the reconstructed
pulse signal, the HR frequency f_hr can be calculated using the FFT, and multiplying f_hr by 60 yields
the HR value in bpm.</p>
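        <p>The reconstruction and HR readout can be sketched with an SVD-based first principal component and an FFT peak search. This is a minimal version on a synthetic signal set; the 0.6–4 Hz physiological search band is an assumption of this sketch, not taken from the paper.</p>

```python
import numpy as np

FS = 25.0  # video frame rate (fps)

def reconstruct_pulse(signals):
    """First principal component of the denoised signal set
    (signals: (n_channels, T) array), i.e. the time course
    shared across channels."""
    x = signals - signals.mean(axis=1, keepdims=True)
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return vt[0]

def estimate_hr(pulse, fs=FS):
    """HR in bpm from the dominant FFT peak inside an assumed
    0.6-4 Hz physiological band."""
    freqs = np.fft.rfftfreq(len(pulse), d=1 / fs)
    spec = np.abs(np.fft.rfft(pulse))
    band = (freqs >= 0.6) & (freqs <= 4.0)
    f_hr = freqs[band][np.argmax(spec[band])]
    return 60.0 * f_hr

# Three channels sharing a 1.4 Hz pulse plus independent noise:
rng = np.random.default_rng(1)
t = np.arange(0, 10, 1 / FS)
clean = np.sin(2 * np.pi * 1.4 * t)
signals = np.stack([clean + 0.3 * rng.standard_normal(t.size)
                    for _ in range(3)])
hr = estimate_hr(reconstruct_pulse(signals))
print(round(hr))  # 84
```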
        <p>To further improve the accuracy of HR estimation, we developed a refinement scheme based on the
continuity of HR variation. Since physiological changes in the human body occur gradually, the HRs
estimated from three consecutive temporal segments are expected to exhibit relatively small diferences.
Assuming the estimated HRs for these three segments are h_1, h_2, and h_3, the condition specified in
Eq. (2) should be satisfied.</p>
        <p>∀ i ∈ {1, 2}, |h_{i+1} − h_i| &lt; th_1    (2)</p>
        <p>In the proposed algorithm, the interval between two signal segments is 0.2 seconds, and the threshold
th_1 is set to 10 beats per minute (bpm). In addition, if the HR exceeds 122.6 bpm or falls below 35.6 bpm,
we introduce an additional constraint for HR refinement. Specifically, the three signal segments are slid
with a step size of one frame. If the HR difference between each segment and its corresponding initial
segment is less than 5 bpm, the obtained three signal segments will be used for the final HR estimation.</p>
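        <p>Eq. (2) with the stated threshold reduces to a simple consistency check (a sketch; the function name and sample values are illustrative):</p>

```python
def hrs_consistent(h1, h2, h3, th1=10.0):
    """Eq. (2): HRs of three consecutive segments (0.2 s apart)
    must pairwise differ by less than th1 = 10 bpm in sequence."""
    hs = (h1, h2, h3)
    return all(abs(hs[i + 1] - hs[i]) < th1 for i in range(2))

print(hrs_consistent(82.0, 84.0, 85.0))  # True
print(hrs_consistent(82.0, 95.0, 85.0))  # False
```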
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experiments</title>
      <sec id="sec-3-1">
        <title>3.1. Datasets</title>
        <p>
          The training set for this challenge is a subset of the VIPL-HR dataset [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], containing data from 42
subjects. To match the test set, we selected modality (b) and modality (c) for our experiments, which
consist of paired RGB and NIR videos recorded using a RealSense F200 camera. The RGB images have
a resolution of 960×540, while the NIR images are 640×480; both are recorded at 25 fps. We used 10
seconds as the signal length for HR estimation in the training set.
        </p>
        <p>
          The test set includes portions of the VIPL-HR and OBF datasets [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. The VIPL-HR test set consists of
RGB and NIR videos from 100 subjects, with an average video length of 9.74 seconds. The OBF dataset
also includes RGB and NIR videos from 100 subjects, with each video segment lasting 10 seconds.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Performance Metrics</title>
        <p>We use mean absolute error (MAE), root mean square error (RMSE), mean error (ME), and Pearson
correlation coefficient (PCC) to evaluate the performance of the algorithm.</p>
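        <p>These four metrics can be sketched in a few lines (the sample HR values are illustrative):</p>

```python
import numpy as np

def hr_metrics(est, gt):
    """MAE, RMSE, ME, and Pearson correlation between estimated
    and ground-truth HR values (bpm)."""
    est, gt = np.asarray(est, float), np.asarray(gt, float)
    err = est - gt
    return {
        "MAE": np.mean(np.abs(err)),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "ME": np.mean(err),
        "PCC": np.corrcoef(est, gt)[0, 1],
    }

m = hr_metrics([82, 90, 75], [80, 92, 75])
print(m["MAE"])  # ~1.33
```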
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>We conducted experiments on the challenge training set. Taking the motion scenario as an example, the
raw rPPG signal extracted from the face is presented in the upper subfigure of Fig. 2(a). The original
rPPG signal is severely corrupted by motion noise. The NIR and RGB channels are then fused using
Eq. (1), and the filtered signal is shown in the lower subfigure of Fig. 2(a). This signal is decomposed
using VMD and compared with the results of EEMD and wavelet decomposition, as shown in Fig. 2(b),
(c), and (d). In Fig. 2(d), wavelet decomposition is disturbed by high-frequency noise (IMF1 at 11.3 Hz).
Obvious mode mixing occurs in IMF2, and IMF3 contains only 1.2 Hz noise. This method fails to
separate the 1.4 Hz pulse component. In Fig. 2(c), EEMD shows severe mode mixing in IMF2, with
multiple signal components blended together, while IMF1 and IMF3 correspond to 1.2 Hz noise. In
contrast, VMD avoids mode mixing, with each mode displaying a distinct, sharp frequency peak that
enables accurate separation of noise and pulse signals.</p>
      <p>Using the method described in Section 2.3.2, reference noise is extracted from the lip landmark, and
the filtered result is shown in Fig. 3(a). This reference noise has the same frequency as IMF3 in Fig. 2(b).
Then, the cross-correlation function between the reference noise and IMF3 is calculated, as shown in
Fig. 3(b). The highest correlation occurs at a time delay close to zero, approximately -0.2 s. Therefore,
IMF3 is identified as motion noise and removed. PCA is then used to reconstruct the pulse signal, and
the result is shown in Fig. 3(c). The reconstructed pulse signal exhibits a rhythm consistent with the
ground truth, and the estimated HR is 84 bpm, matching the ground truth.</p>
      <p>HR estimation was performed on the entire training set and compared with conventional methods
(EEMD and wavelet) and the end-to-end method from the relevant challenge, as shown in Table 1.
Conventional methods decompose the green channel signal and estimate HR using the component with
the highest dominant-frequency amplitude. The results verify that the proposed method effectively
reduces HR estimation errors and improves PCC.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>Methods based on EEMD and wavelet decomposition are prone to mode mixing, whereas VMD can more
effectively separate noise and pulse components, making it more suitable for rPPG tasks. Conventional
methods often regard the lips as non-rigid motion regions to be excluded, overlooking the noise
information related to the rPPG signal contained in this region. In this work, we analyze the sources
of motion noise and establish a link between variations in lip landmark positions and motion noise
in the rPPG signal. We propose a time-delay analysis-based method to identify motion noise, and
successfully locate the motion component in the VMD-decomposed signals. This provides new insight
for improving the signal-to-noise ratio (SNR) of the rPPG signal in related research. As shown in Table 1,
the experimental results indicate that the proposed algorithm achieves better performance than both
the end-to-end method and the conventional methods. Finally, our algorithm achieved third place in
the 4th RePSS challenge with a score of 12.7079.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used DeepSeek and Grammarly in order to: check grammar and
spelling, and paraphrase and reword. After using these tools, the authors reviewed and edited
the content as needed and take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] B. Liu, X. Zheng, Y. Ivan Wu, Remote heart rate estimation in intense interference scenarios: A white-box framework, IEEE Transactions on Instrumentation and Measurement 73 (2024) 1-14. doi:10.1109/TIM.2024.3419088.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] X. Zheng, B. Zou, C. Zhang, H. Tu, Remote blood pressure estimation using BVP signal features from facial videos, Pattern Recognition Letters 193 (2025) 122-127. doi:10.1016/j.patrec.2025.04.010.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] M.-Z. Poh, D. J. McDuff, R. W. Picard, Non-contact, automated cardiac pulse measurements using video imaging and blind source separation, Opt. Express 18 (2010) 10762-10774. doi:10.1364/OE.18.010762.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] R. Song, J. Li, M. Wang, J. Cheng, C. Li, X. Chen, Remote photoplethysmography with an EEMD-MCCA method robust against spatially uneven illuminations, IEEE Sensors Journal 21 (2021) 13484-13494.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] W. Wang, A. C. den Brinker, S. Stuijk, G. de Haan, Algorithmic principles of remote PPG, IEEE Transactions on Biomedical Engineering 64 (2017) 1479-1491. doi:10.1109/TBME.2016.2609282.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] F. Bousefsaf, A. Pruski, C. Maaoui, 3D convolutional neural networks for remote pulse rate measurement and mapping from facial video, Applied Sciences 9 (2019) 4364. doi:10.3390/app9204364.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] B. Lokendra, G. Puneet, AND-rPPG: A novel denoising-rPPG network for improving remote heart rate estimation, Computers in Biology and Medicine 141 (2022) 105146. doi:10.1016/j.compbiomed.2021.105146.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] Y. Kartynnik, A. Ablavatski, I. Grishchenko, M. Grundmann, Real-time facial surface geometry from monocular video on mobile GPUs, 2019. arXiv:1907.06724.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] G. de Haan, V. Jeanne, Robust pulse rate from chrominance-based rPPG, IEEE Transactions on Biomedical Engineering 60 (2013) 2878-2886.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] K. Dragomiretskiy, D. Zosso, Variational mode decomposition, IEEE Transactions on Signal Processing 62 (2014) 531-544. doi:10.1109/TSP.2013.2288675.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] X. Niu, S. Shan, H. Han, X. Chen, RhythmNet: End-to-end heart rate estimation from face via spatial-temporal representation, IEEE Transactions on Image Processing 29 (2020) 2409-2423. doi:10.1109/TIP.2019.2947204.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] X. Li, I. Alikhani, J. Shi, T. Seppanen, J. Junttila, K. Majamaa-Voltti, M. Tulppo, G. Zhao, The OBF database: A large face video database for remote physiological signal measurement and atrial fibrillation detection, in: 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2018), 2018, pp. 242-249. doi:10.1109/FG.2018.00043.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] C. Hu, K.-Y. Zhang, T. Yao, S. Ding, J. Li, F. Huang, L. Ma, An end-to-end efficient framework for remote physiological signal sensing, in: 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2021, pp. 2378-2384. doi:10.1109/ICCVW54120.2021.00269.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>