=Paper=
{{Paper
|id=Vol-2820/paper4
|storemode=property
|title=Little Motion, Big Results: Using Motion Magnification to Reveal Subtle Tremors in Infants

|pdfUrl=https://ceur-ws.org/Vol-2820/AAI4H-4.pdf
|volume=Vol-2820
|authors=Girik Malik,Ish K. Gulati
|dblpUrl=https://dblp.org/rec/conf/ecai/MalikG20
}}
==Little Motion, Big Results: Using Motion Magnification to Reveal Subtle Tremors in Infants==
<pdf width="1500px">https://ceur-ws.org/Vol-2820/AAI4H-4.pdf</pdf>
<pre>
    Little Motion, Big Results: Using Motion Magnification
              to Reveal Subtle Tremors in Infants

              Girik Malik 1,2 and Ish K. Gulati 3,4

1 Khoury College of Computer Sciences, Northeastern University, 360 Huntington Avenue,
  Boston, MA 02115, USA, email: gmalik@ccs.neu.edu
2 Labrynthe Pvt. Ltd., New Delhi, India
3 Center for Perinatal Research, Abigail Wexner Research Institute, Nationwide Children's
  Hospital, 575 Children's Crossroads, Columbus, OH 43215, USA,
  email: ish.gulati@nationwidechildrens.org
4 Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA

  Copyright 2020 for this paper by its authors. Use permitted under Creative Commons
  License Attribution 4.0 International (CC BY 4.0). This volume is published and
  copyrighted by its editors. Advances in Artificial Intelligence for Healthcare,
  September 4, 2020, Virtual Workshop.

Abstract. Detecting tremors is challenging for both humans and machines. Infants exposed
to opioids during pregnancy often show signs and symptoms of withdrawal after birth, which
are easy to miss with the human eye. The constellation of clinical features, termed
Neonatal Abstinence Syndrome (NAS), includes tremors, seizures, irritability, etc. The
current standard of care uses the Finnegan Neonatal Abstinence Syndrome Scoring System
(FNASS), based on subjective evaluations. Monitoring with FNASS requires highly skilled
nursing staff, making continuous monitoring difficult. In this paper we propose an
automated tremor detection system using amplified motion signals. We demonstrate its
applicability on bedside video of an infant exhibiting signs of NAS. Further, we test
different modes of deep convolutional network based motion magnification, and identify that
dynamic mode works best in the clinical setting, being invariant to common orientational
changes. We propose a strategy for discharge and follow-up for NAS patients, using motion
magnification to supplement the existing protocols. Overall, our study suggests methods for
bridging the gap in current practices, training and resource utilization.

1     INTRODUCTION
Infants born to mothers taking prescribed or recreational opioids during pregnancy often
show signs of withdrawal after birth. The constellation of these withdrawal symptoms, known
as Neonatal Abstinence Syndrome (NAS), includes but is not limited to tremors, seizures,
shrieking cry, increased muscle tone and irritability. Seizures are one of the most
concerning and life-threatening symptoms, which account for 8% in Methadone users [7]. In
the U.S., incidence of NAS has risen six-fold from 2006 to 2016, affecting between 6 and 20
newborns per 1000 live US births [6, 20].
   The unknown probability and the multitude of symptoms pose a unique challenge in
diagnosing NAS appropriately, since all exposed infants test positive in drug tests on body
fluids but not all show troublesome symptoms. A validated scale called the Finnegan
Neonatal Abstinence Syndrome Scoring System (FNASS) is widely used to monitor and manage
therapies [4, 8]. Opioid exposed infants are initially admitted to a newborn nursery for
monitoring and care. Infants may take up to 5 days to metabolize certain drugs taken by the
mother before manifesting signs of withdrawal. Infants with qualifying scores for
pharmacologic therapy are transferred to a special care nursery. Opioids and their
derivatives are the mainstay choices, regardless of the nature of opioid exposure during
the antenatal period.
   However, the absence of a standardized therapy protocol for the treatment of NAS makes
FNASS the prime determinant for NAS treatment, which is based on highly subjective
evaluation. Most of the primary centers in both rural and urban opioid endemic areas lack
trained nurses for FNASS scoring, and as a result, infants are transferred to a higher
center for optimal scoring, monitoring and treatment. About one-half of the infants are
born at resource limited hospitals, and need to be transferred to tertiary care centres for
optimal management [1].
   We aim to overcome this subjectivity, and the limitation of requiring highly skilled
nursing training, using a vision-based objective monitoring and evaluation technique. Our
hypothesis is based on the principle of objective monitoring and evaluation of tremors,
mitigating the need for trained nurses, minimising nursing exposure and allowing the
possibility of remote monitoring. The goal is to capture tremor objectively in an affected
infant and supplement it with other parameters in the scale. We use motion magnification
[24] to amplify tremors, which are constant, involuntary, spontaneous, and repetitive
movements at high frequency but low amplitude, and are commonly confused with common
newborn jitters and other newborn movements.
   The main contributions of this paper are as follows:

- System for continuous monitoring of NAS patients using Motion Magnification
- Converting subjective visual evaluations of NAS patients to objective evaluations
- Proposal of discharge strategy for NAS patients

   We start with background on Motion Magnification in Section 2.1 and Neonatal Abstinence
Syndrome (NAS) in Section 2.2. We describe the specifics of an automated tremor detection
system in Section 3, starting with the details of the network used and experiments in
Section 3.1, and results in Section 3.2. We propose a follow-up and discharge strategy for
patients in Section 4. The challenges, strengths and limitations of this work are discussed
in Section 5. The envisioned future direction of the current research and its applicability
to other domains is discussed in Section 6.
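
As an illustrative aside (not part of the reported experiments), the idea of amplifying a
low-amplitude, high-frequency movement can be sketched in a few lines of Python. The sketch
assumes a hypothetical sinusoidal displacement of 0.2 pixels at 8 Hz, sampled at 30 frames
per second; these values and all function names are assumptions chosen for illustration
only. The factor (1 + α) follows the displacement model given in Section 2.1, with α = 10
as used for the static and dynamic modes in Section 3.1.

```python
import math


def tremor_displacement(t, amplitude=0.2, frequency_hz=8.0):
    """Hypothetical tremor model: a small sinusoidal displacement, in pixels."""
    return amplitude * math.sin(2.0 * math.pi * frequency_hz * t)


def magnify(displacement, alpha):
    """Scale a displacement by (1 + alpha), as in f(x + (1 + alpha) * delta(t))."""
    return (1.0 + alpha) * displacement


def peak(values):
    """Largest absolute value in a sequence of displacements."""
    return max(abs(v) for v in values)


fps = 30                                   # assumed camera frame rate
times = [i / fps for i in range(fps)]      # one second of frame timestamps
original = [tremor_displacement(t) for t in times]
magnified = [magnify(d, alpha=10) for d in original]

# Ratio of peak displacements; by construction this is (1 + alpha).
ratio = peak(magnified) / peak(original)
print(f"peak original:  {peak(original):.3f} px")
print(f"peak magnified: {peak(magnified):.3f} px")
print(f"ratio:          {ratio:.1f}")
```

The sketch scales the displacement directly. The network used in this paper instead
produces the magnified frame from pixel data without tracking displacement explicitly, so
this is only a toy model of the intended effect.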
2     BACKGROUND

2.1    Motion Magnification
Motion magnification can be broadly classified into two categories, Lagrangian and
Eulerian. In this paper, we use the Eulerian approach [24], which decomposes video frames
into representations useful for manipulating motion, without explicitly tracking the target
in every frame.
   Mathematically, let I(x, t) denote the image intensity at position x and time t. For
translational motion(s), we can express the observed intensities with respect to a
displacement function δ(t), such that I(x, t) = f(x + δ(t)), while the reference frame is
given by I(x, 0) = f(x). The goal of motion magnification is to produce a magnified image
representation Î, such that

   \[ \hat{I}(x, t) = f\bigl(x + (1 + \alpha)\,\delta(t)\bigr) \]

for some amplification factor α.
   For this work, we used a fully convolutional encoder-manipulator-decoder network, as
described in [17]. The network learns and applies filters directly to the examples, instead
of using temporal filters. However, the learned representations can be extended for use
with temporal filters for frequency-based motion selection. There are two main modes
considered for this work, static and dynamic. In case of static amplification, the first
frame is used as a reference, i.e. (X_0, X_t) frames are used as input; whereas dynamic
amplification uses the previous frame as reference, i.e. (X_{t-1}, X_t) are used as input,
magnifying the difference between consecutive frames. We also discuss the use of temporal
filters [21]; please see Section 3.1.

2.2    Neonatal Abstinence Syndrome (NAS)
NAS represents a clinical phenotype, as a result of opioid exposure during the antenatal
period. Opioids can easily cross the fetal blood brain barrier and accumulate in the fetus,
leading to prolonged half life, thereby increasing the severity of withdrawal symptoms
after birth [3]. A persistent exposure to high dosage of opioids during pregnancy results
in increased stimulation of neurotransmitters [19]. Noradrenaline is the most sensitive
neurotransmitter in opioid withdrawal and is secreted from the Locus coeruleus of the fetal
brain [15]. Tremor is a known symptom of a hypernoradrenergic state [11].
   The displacement caused by tremors is an important factor in classifying NAS patients.
While sometimes imperceptible to the naked eye, these movements can be identified by
amplification of motion using techniques like motion magnification [24, 5]. In case of NAS
patients, there are observed sudden, non-purposeful, and non-repetitive movements as well,
causing major displacement of limbs. The distinction of these voluntary movements from the
involuntary ones is fairly subjective in nature, making the quantitative objectification a
challenging problem. We would also like to highlight the dearth of datasets in the
direction of objective evaluation of infants with NAS, and video datasets for tremors,
making it a nascent field.

3     AN AUTOMATED TREMOR DETECTION SYSTEM

3.1    Experiments
Our study applies the neural network from [17] on an open-source bedside video of a baby
exhibiting signs of NAS. For control, we used the video of a sleeping baby from Wu et al.
[24].
   We use the deep convolutional neural network described in Oh et al. [17], with three
primary components, namely, spatial decomposition filters, representation manipulator, and
reconstruction filters, which are designed as encoder, manipulator and decoder networks.
The encoder and decoder networks are fully convolutional and use residual blocks for
generating high-quality images. Additionally, the encoder and decoder also downsample and
upsample the input using strided convolution and nearest-neighbour upsampling respectively.
The manipulator works by multiplying the difference between the two representations found
by the encoder, based on the given amplification factor (please see [17] for details).
   Two frames from the video are given as input to the encoder network. In case of dynamic
mode, the frames are adjacent, while in case of static mode, the input is the first frame
and the one at time t. The encoder behaves like a spatial decomposition filter that
extracts the shape representations from each image separately. The representation is then
fed to the manipulator for amplifying the motion. Finally, the amplified representation is
fed to the decoder, which reconstructs the modified representation into an individual
magnified frame. See Fig. 1.
   In addition to static and dynamic mode, we also show the application of linear temporal
filters, which have worked well in case of linear shape representations [12, 13, 22]. Using
the shape representation, extracted from the encoder network, the difference operation in
the manipulator network is replaced by a pixel-wise temporal filter across the temporal
axis. This new, temporally-filtered shape representation is fed to the decoder network for
generating magnified frames.
   We used weights from the network pre-trained on the synthetic dataset from [17]. The
network is trained using ℓ1-loss and the ADAM optimizer [10], with a learning rate of 10^-4
and no weight decay. The dataset consists of objects from the PASCAL VOC dataset [2]
superposed on background images from the MS COCO dataset [14]. We tested the network in
static and dynamic modes using α = 10, while for temporal mode, we set α = 20.

Figure 1. Schematic of motion magnification applied using the described architecture. Two
adjacent frames are given as input to the fully convolutional encoder network for
extracting shape and texture representations. These representations are further fed to a
manipulator network, for amplifying the motion signals. The manipulated representation is
then fed to a decoder network that upsamples the representation to construct the
motion-amplified frames.

3.2    Results
We demonstrate the application of static, dynamic and temporal filter [21] based
magnification approaches to a bedside video of an infant exhibiting the signs of NAS. We
compare the approach with application of the same algorithms to a sample baby video, as
used in [24].
   Our results clearly indicate that the dynamic method, magnifying the difference between
consecutive frames, has fewer edge artefacts compared to static and temporal mode. For
regular actions,
like breathing, the difference in the original and magnified video is insignificant. During
tremors, the video processed using dynamic mode starts exhibiting magnified movements,
wherein the body moves in a subtle pattern, while the limbs seem to move in a more
hysterical and uncontrollable manner. The caregiver's hand in the scene is also distorted
in the magnified frame, and not amplified. Dynamic mode is also invariant to orientational
changes during the video.
   In static mode, with the first frame taken as reference, body movement is less
magnified, compared to the surroundings. Keeping the first frame as anchor, it can magnify
the objects with limited displacement from their original position across the frames. It
suffers from ringing artefacts and limits the ability to operate in conditions with
frequent orientational changes (rotations). The temporal filter mode also suffers from edge
artefacts, given its inability to learn complex limb movements with the linear temporal
filters. Stationary objects are not amplified, but seem to be distorted in the static and
temporal filter case, as possible edge artefacts due to the distorted motion of the
infant's limbs. Results comparing the original and magnified frames are shown in Fig. 2.

Figure 2. Result from dynamic, static and temporal filter mode of amplification. The top
row shows frames from the original video, while the subsequent rows show the corresponding
motion-magnified frames. Observe how the subtle motion in the infant's body is picked up
and amplified by the network, while the voluntary movement of the caregiver's hand is
distorted as an artefact. The stationary objects are not influenced by the dynamic mode.

4   FOLLOW UP AND DISCHARGE STRATEGY FOR NAS PATIENTS
NAS infants often need follow-up for rebound symptoms using Finnegan scoring for 2 to 5
days after therapy is discontinued. In borderline results, infants may be kept in hospital
for longer periods [16]. This technology may have applications for improving discharge
protocols in such situations. In the current and post COVID era, focus will be on
minimizing the number of patients in the hospital, shortening the length of stay and
accessible remote monitoring. Such technology could help with monitoring infants at home
because of its low cost and ease of operability. It is possible to bring the cost of setup
in line with other at-home monitoring equipment using low-resource hardware, and to bring
down the computational costs by using networks like MobileNets [9] for the network
backbone.

5    DISCUSSION
Neonatal abstinence syndrome (NAS) management has unintended troublesome consequences
including logistical challenges of infant transfer, mother-infant dyad separation, lack of
kangaroo care of the separated infant and prolonged hospital stay, stretching resources.
Socio-economic disparity has been reported in allocation of resources for optimal
management [1]. In the current and post COVID era, we expect the healthcare system to face
deeper challenges. Some low resource models are already struggling. Those that were
struggling earlier face imminent closure. One way of emerging successfully from this crisis
is to integrate current technology in our healthcare practice, not only with the medical
devices, but also by bringing solutions for training, objectification of clinical
subjectivity, remote monitoring, and generating and utilizing the data to improve.
   In this paper, we have addressed one major issue of non-standardized clinical
monitoring. The current standard-of-care for patients with NAS is still dependent on
subjective evaluations which are prone to human errors [23]. While it is impossible to
argue for a complete automation of anything in healthcare, there are certain areas that
need innovation to be at par with standardization in other domains. In this paper, we make
the first step towards such a standardization, by objectifying a largely subjective
Finnegan scoring for NAS patients. Our use of motion magnification as a tool to detect and
amplify tremors in infants that are imperceptible to the naked eye could help in better
continuous monitoring of patients, which otherwise requires highly skilled nurse
practitioners monitoring at intermittent intervals. The current discharge and follow-up
strategy for patients
with NAS is also very loosely defined, without an objective way of catering to
misclassifications.
   For our pilot study, we tested three different modes of motion magnification, and found
that dynamic mode performed best with the current video. Our observations for static mode
were coherent with the expected behaviour for the current video, given the use of the first
frame as reference. The temporal filter mode seems to produce edge artefacts, and needs
more analysis with domain specific data and better kernels to select small motions of
interest. We believe the currently implemented linear temporal filter might not be suitable
to learn the representations of complex non-linear motion. We propose a video camera
monitoring the infant with a monocular video stream of 640x480 at 30-45 frames per second,
fixed to the bedside. This setup gives a continuous video stream, which is processed using
Eulerian Video Magnification [24, 17].
   It is often easy to confuse tremors with tremor mimickers at the bedside. Physical
manifestations of tremor in NAS infants may look like myoclonus (sudden jerking),
jitteriness or fine tremors, and are often misinterpreted as epileptic seizures, requiring
electroencephalogram (EEG) [18]. Motion magnification will be capable of diagnosing and
aiding clinical diagnosis of seizures with EEG. We propose that once a tremor signature of
NAS is established, clinical seizures of NAS will help correlate EEG findings of epileptic
focus. We hope further research in this area will explore more opportunities for
characterisation of NAS tremors, and allow healthcare practitioners to recommend
personalised therapy and management plans.
   Limitations: The small size and type of currently available datasets make them
challenging to use with deep learning methods. There is a need for more extensive data
collection, with standardized protocols approved by Institutional Review Board(s) (IRB),
specifically videos of tremors and seizures, for vision related methods, to train humans
and machines alike. In our study, we were limited by the dataset size for the same reason.
   Strengths: The video of the infant with NAS used has rotational changes of about 90
degrees in the latter half (not shown in result images), where the dynamic mode performed
as well as in the earlier part, showing its robustness to orientational changes. The study
presented is the first to report how to objectively capture NAS tremor using motion
magnification. The possible use of low-resource hardware allows easy scale-up of the system
for monitoring patients at home and in remote areas. However, we need a clinical
feasibility and validation study, along with a diverse video dataset, to compare this
innovative technology with standard of care.

6   CONCLUSION AND FUTURE DIRECTIONS
We have shown how the very subtle motion of the tremor in the center of the infant's body
is picked up by the motion magnification network, while the voluntary movement of the
caregiver's hand is distorted, and not amplified. Additionally, we highlighted the problems
in the existing subjective evaluations, and made proposals of bridging those gaps with
innovative techniques using deep neural networks. This project aligns with American Academy
of Pediatrics goals in addressing both its key issues of health disparities and health
equities by empowerment of low resource centers in disproportionately higher prevalence of
opioid addicted mothers and infants with NAS. We make some suggestions based on important
observations in the field, that we believe will improve monitoring of NAS patients, and
help with better infant care in general.
   Infants with NAS have a more shrieking high pitched cry recorded as a characteristic
acoustic signature versus the low pitched cry of healthy infants. As next steps, we are
investigating differences in acoustics for detecting high pitched cry, and if they can be
combined with our vision-based model to add more sensitivity to the sample. Once validated,
an automated video based motion magnification tool can be used to train care providers to
understand the mechanism of these pathophysiological manifestations, in low-resource
settings. Further, we propose to formulate this setup into a scoring tool for patients
showing unique tremor signatures during and after the treatment of NAS, to strengthen the
existing protocols. We also envision extrapolating automatic tremor detection to monitor
patients with stroke and Parkinson's disease in nursing homes.

ACKNOWLEDGEMENTS
We would like to thank the referees for their comments and suggestions, which helped
improve this paper considerably. GM would like to thank his advisor Prof. Ennio Mingolla,
Northeastern University, for letting him work on this research, which is not directly
related to his PhD work. IKG would like to thank Dr. Deepak Gulati, Vascular Neurologist,
The Ohio State University College of Medicine for his helpful insights.

REFERENCES
[1]    Tammy E Corr and Christopher S Hollenbeak, ‘The economic burden of neonatal
       abstinence syndrome in the united states’, Addiction, 112(9), 1590–1599, (2017).
[2]    Mark Everingham, Luc Van Gool, Christopher KI Williams, John Winn, and Andrew
       Zisserman, ‘The pascal visual object classes (voc) challenge’, International
       journal of computer vision, 88(2), 303–338, (2010).
[3]    WO Farid, SA Dunlop, RJ Tait, and GK Hulse, ‘The effects of maternally
       administered methadone, buprenorphine and naltrexone on offspring: review of
       human and animal data’, Current neuropharmacology, 6(2), 125–150, (2008).
[4]    Loretta P Finnegan, James F Connaughton Jr, Reuben E Kron, and John P Emich,
       ‘Neonatal abstinence syndrome: assessment and management.’, Addictive diseases,
       2(1-2), 141–158, (1975).
[5]    William T Freeman, Edward H Adelson, and David J Heeger, ‘Motion without
       movement’, ACM Siggraph Computer Graphics, 25(4), 27–30, (1991).
[6]    Bryant Furlow, ‘Neonatal opioid withdrawal in the usa.’, The Lancet. Child &
       adolescent health, 2(9), 629–630, (2018).
[7]    Mathew George, Joseph P Kitzmiller, Michele Burns Ewald, Katherine A ODonell,
       Melissa Lai Becter, and Steve Salhanick, ‘Methadone toxicity and possible
       induction and enhanced elimination in a premature neonate’, Journal of Medical
       Toxicology, 8(4), 432–435, (2012).
[8]    Matthew Grossman and Adam Berkwitt, ‘Neonatal abstinence syndrome’, in Seminars
       in perinatology, volume 43, pp. 173–186. Elsevier, (2019).
[9]    Andrew G Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias
       Weyand, Marco Andreetto, and Hartwig Adam, ‘Mobilenets: Efficient convolutional
       neural networks for mobile vision applications’, arXiv preprint arXiv:1704.04861,
       (2017).
[10]   Diederik P Kingma and Jimmy Ba, ‘Adam: A method for stochastic optimization’,
       arXiv preprint arXiv:1412.6980, (2014).
[11]   Samet Kose and Mesut Cetin. β-adrenergic receptor blocker use for traumatic
       memory reconsolidation in posttraumatic stress disorder, 2016.
[12]   Reginald L. Lagendijk, Jan Biemond, Andrei Rare, and Marcel J.T. Reinders,
       ‘Chapter 4 - video enhancement and restoration’, in The Essential Guide to Video
       Processing, ed., Al Bovik, 69 – 108, Academic Press, Boston, (2009).
[13]   Jinyu Li, Li Deng, Reinhold Haeb-Umbach, and Yifan Gong, ‘Chapter 4 - processing
       in the feature and model domains’, in Robust Automatic Speech Recognition, eds.,
       Jinyu Li, Li Deng, Reinhold Haeb-Umbach, and Yifan Gong, 65 – 106, Academic
       Press, Oxford, (2016).
[14]   Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva
       Ramanan, Piotr Dollár, and C Lawrence Zitnick, ‘Microsoft coco: Common objects in
       context’, in European conference on computer vision, pp. 740–755. Springer,
       (2014).
[15]   Patrick J Little, Roger R Price, Robin K Hinton, and Cynthia M Kuhn,
       ‘Role of noradrenergic hyperactivity in neonatal opiate abstinence’,
       Drug and alcohol dependence, 41(1), 47–54, (1996).
[16]   AK Mangat, GM Schmölzer, and WK Kraft, ‘Pharmacological and
       non-pharmacological treatments for the neonatal abstinence syndrome
       (nas)’, in Seminars in Fetal and Neonatal Medicine. Elsevier, (2019).
[17]   Tae-Hyun Oh, Ronnachai Jaroensri, Changil Kim, Mohamed Elgharib,
       Frédo Durand, William T Freeman, and Wojciech Matusik, ‘Learning-
       based video motion magnification’, in Proceedings of the European
       Conference on Computer Vision (ECCV), pp. 633–648, (2018).
[18]   Murali Reddy Palla, Gulam Khan, Zahra M Haghighat, and Henrietta
       Bada, ‘Eeg findings in infants with neonatal abstinence syndrome pre-
       senting with clinical seizures.’, Frontiers in pediatrics, 7, 111, (2019).
[19]   Andra M Smith, Peter A Fried, Matthew J Hogan, and Ian Cameron,
       ‘Effects of prenatal marijuana on response inhibition: an fmri study of
       young adults’, Neurotoxicology and teratology, 26(4), 533–542, (2004).
[20]   Elisha M Wachman, Davida M Schiff, and Michael Silverstein, ‘Neona-
       tal abstinence syndrome: advances in diagnosis and treatment’, Jama,
       319(13), 1362–1374, (2018).
[21]   Neal Wadhwa, Michael Rubinstein, Frédo Durand, and William T Free-
       man, ‘Phase-based video motion processing’, ACM Transactions on
       Graphics (TOG), 32(4), 1–10, (2013).
[22]   Xiaogang Wang and Chen-Change Loy, ‘Chapter 10 - deep learning for
       scene-independent crowd analysis’, in Group and Crowd Behavior for
       Computer Vision, eds., Vittorio Murino, Marco Cristani, Shishir Shah,
       and Silvio Savarese, 209 – 252, Academic Press, (2017).
[23]   Philip M Westgate and Enrique Gomez-Pomar, ‘Judging the neonatal
       abstinence syndrome assessment tools to guide future tool develop-
       ment: the use of clinimetrics as opposed to psychometrics’, Frontiers
       in pediatrics, 5, 204, (2017).
[24]   Hao-Yu Wu, Michael Rubinstein, Eugene Shih, John Guttag, Frédo Du-
       rand, and William Freeman, ‘Eulerian video magnification for reveal-
       ing subtle changes in the world’, ACM transactions on graphics (TOG),
       31(4), 1–8, (2012).

</pre>