<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>Psychoeducative Social Robots for a Healthier Lifestyle using Artificial Intelligence: a Case Study</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Valerio Ponzi</string-name>
          <email>ponzi@diag.uniroma1.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Samuele Russo</string-name>
          <email>samuele.russo@uniroma1.it</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valerio Bianco</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christian Napoli</string-name>
          <email>christian.napoli@uniroma1.it</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Agata Wajda</string-name>
          <email>agata.wajda@polsl.pl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer, Automation and Management Engineering, Sapienza University of Rome</institution>
          ,
          <addr-line>via Ariosto 25 Roma 00185</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Mathematics Applications and Methods for Artificial Intelligence, Faculty of Applied Mathematics, Silesian University of Technology</institution>
          ,
          <addr-line>44-100 Gliwice</addr-line>
          ,
          <country country="PL">Poland</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Medical Surgical Sciences and Translational Medicine, Sapienza University of Rome</institution>
          ,
          <addr-line>Via di Grottarossa 1035, Roma 00189</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Department of Psychology, Sapienza University of Rome</institution>
          ,
          <addr-line>via dei Marsi 78 Roma 00185</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>26</fpage>
      <lpage>33</lpage>
      <abstract>
<p>Smoking is the greatest preventable cause of mortality worldwide. In this paper, we present a social experiment in which a mobile robot equipped with a cigarette detector alerts smokers, in particular those smoking close to children. In our research, we compare different methods. In the first case we trained the cigarette detection model on a homemade dataset, starting from the pre-trained SSD MobileNet detection model. In the second case we analyzed how SmokingNet performs when applied to our task. Next, to distinguish between children and adults, we take advantage of a Cascade classifier and a neural network. Both networks are built to leverage TensorFlow Lite, a mobile-friendly format that enables on-device inference. When a smoking scene is identified, the mobile robot draws near the smoker and issues a warning based on the circumstances.</p>
      </abstract>
      <kwd-group>
<kwd>Smoking</kwd>
        <kwd>Object Detection</kwd>
        <kwd>Cascade Classifier</kwd>
        <kwd>Raspberry Pi</kwd>
        <kwd>TensorFlow Lite</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <title>Smoking is the leading preventable cause of death</title>
        <p>
          and disability. According to WHO (World Health Organization) [1, 2] data, tobacco directly causes over 7 million deaths, while around 1.2 million deaths are the result of non-smokers being exposed to secondhand smoke. Secondhand smoke is dangerous, especially for children, and can increase their risk of multiple health issues. For many years it has been forbidden to smoke in the presence of children in public places, except in those dedicated to smokers. In recent times, a further restriction has been imposed by the Council of Ministers, which has issued several legislative decrees making the anti-smoking regulations even more restrictive. In addition to the medical aspects, linked to the now evident consequences associated with passive smoking, there are other damages connected to the development of a fascination with cigarettes and the consequent development of a possible addiction [
          <xref ref-type="bibr" rid="ref19">3</xref>
          ]. In fact, starting from Albert Bandura's studies on social learning theory [4], it is highlighted that learning of pro-social or anti-social behaviors can also occur without direct contact with objects; that is, learning can also occur through indirect experiences, through the observation of other people [5, 6]. Bandura used the term modeling (imitation) to identify a learning process that is activated when the behavior of an observing individual changes according to the behavior of another individual who acts as a model. The behavior is thus the result of a process of acquiring information from other individuals. Furthermore, Bandura synthesizes a series of properties acting in a modeling situation, which influence the impact of the learned information on performance: the identification that is established between model and modeled is identified as a fundamental characteristic of observational (or vicarious) learning. The higher it is, the more the learning will affect the behavior of the observer. So, for example, according to this theory, a child who daily observes a reference adult who smokes will learn this behavior more easily, since he is exposed to behavior patterns that "normalize" the use of cigarettes on a daily basis. This theory is also called social learning because it focuses on the identification mechanism that links the observer to the observed. This identification process is also linked to affective aspects, and it is often found in identifying behaviors that people adopt in certain roles or social characters. It therefore becomes essential to reduce the child's exposure to potentially dangerous behaviors, such as cigarette smoking, as much as possible.
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>The aim of this research is to develop a mobile robot capable of detecting smokers and raising a warning if they are in the proximity of youngsters</title>
        <p>To achieve this, we split the objective into two main tasks: cigarette detection through a camera stream, and classification between adults and teenagers. Cigarette detection is analyzed with two different methods. In the first, we started from a pre-trained SSD MobileNet model [7] with a feature pyramid network as a subnetwork. The latter is especially well-suited for mobile-oriented applications, since it gets rid of the main memory access constraints of a large number of embedded hardware platforms. In the second case, instead, we used SmokingNet [8], which detects smoking photos by utilizing the feature extraction capabilities of convolutional neural networks. The program is hosted on a Raspberry Pi [9, 10]. Once a cigarette is detected, the detection process determines the presence of people in the frame using the Haar-cascade frontal face classifier [11, 12, 13], which requires significantly less hardware computation. The detected face is then extracted from the image and fed as input to a deep neural network. The network acts as a binary classifier trained on the UTKFace dataset to distinguish children from adults. To make the models portable, both the detection and classification models are written using the TFLite [14] library, and inference is performed on the Raspberry Pi 4. The mobile robot will approach the adult and issue a warning if there are children nearby.</p>
        <sec id="sec-1-3">
          <title>2. Related Works</title>
          <p>There has been some research on smoking detection through different methods. In [15] the authors suggest a smoking-gesture-based detection method. It captures changes in the orientation of a person's arm, and uses a machine learning pipeline that processes this data to accurately detect smoking gestures and sessions in real time.</p>
          <p>In [16] the authors proposed a machine learning method for puffing and smoking detection using data from a wrist accelerometer. More recent approaches suggest the use of latest-generation techniques based on object detection of the cigarette itself.</p>
          <p>In [17] the author presents object detection of cigarette litter on sidewalks. The system is designed to work in real time by exploiting a lighter version of YOLOv4 [18] (Tiny-YOLOv4), so that the model can be deployed on a mobile robot. The dataset used to train this network was specific to the littering problem, that is, detecting cigarette butts near sidewalks. Since our objective is the identification of situations involving people smoking, we cannot expect to achieve high performance by applying transfer learning and fine-tuning on their pretrained network.</p>
          <p>In [19] the authors use a YOLOv2 deep-learning image-based methodology for detecting the cigarette of a smoking driver. The driver's images are captured by a dual-mode visible-light and near-infrared camera, and the developed system judges whether or not there is driver smoking behavior in both day and night conditions. Since their dataset is specific to foreground cigarette detection, their results are not directly transferable to our case study.</p>
        </sec>
        <sec id="sec-1-4">
          <title>3. Datasets</title>
          <sec id="sec-1-4-1">
            <title>3.1. Cigarette Dataset</title>
            <p>The dataset was realized with the objective of identifying cigarettes in a variety of situations. The backgrounds and the quality of the pictures are varied, and the cigarette can be extremely large or extremely small in comparison to the entire image. The images were uploaded to Roboflow [20], an annotation tool that allows users to upload files, including images, annotations, and videos. It supports a wide variety of annotation formats and makes it simple to add new training data as it is collected. The format of the dataset is set to TFRecords. During the annotation process, some images were discarded due to their low quality, such as cigarettes that are obscured by other objects or barely visible to human eyes. The remaining dataset consists of 2017 images that have been divided into a training set and a test set with a 9:1 ratio: the training set contains 1816 images, while the test set contains 201 images. A representative sample of the dataset can be found in Figure 1.</p>
          </sec>
          <sec id="sec-1-2-1">
            <title>3.2. UTKFace Dataset</title>
            <p>The UTKFace dataset [21] is a large-scale face collection with a wide age range (from 0 to 116 years old). The dataset contains over 20,000 face images with age, gender, and ethnicity annotations. The images demonstrate a wide range of poses, facial expressions, illumination, occlusion, and resolution. This dataset has the potential to be used for a variety of tasks, including face detection, age estimation, age progression/regression, and landmark localization. We only use the age information, to classify those under the age of 18 as children and those over the age of 18 as adults.</p>
          </sec>
        </sec>
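        <p>As a rough illustration of this relabeling step, the age encoded in each UTKFace file name can be mapped to the two classes. The sketch below is our own minimal example, not the authors' code; it only assumes the standard UTKFace naming scheme, in which the age is the first underscore-separated field of the file name.</p>
        <preformat><![CDATA[
```python
# Sketch of relabeling UTKFace images for the binary child/adult task.
# UTKFace file names encode the age as the first underscore-separated
# field, e.g. "25_0_1_20170116174525125.jpg"; the 18-year threshold
# mirrors the split described in the text.

def utkface_label(filename: str) -> str:
    """Return 'child' or 'adult' from a UTKFace-style file name."""
    age = int(filename.split("_")[0])
    return "child" if age < 18 else "adult"

if __name__ == "__main__":
    samples = ["4_1_0_20170109193052283.jpg", "36_0_1_20170116174525125.jpg"]
    print([utkface_label(s) for s in samples])  # ['child', 'adult']
```
]]></preformat>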
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4. Cigarette detection</title>
      <sec id="sec-2-1">
        <title>4.1. SSD MobileNet</title>
        <p>To achieve a trade-off between speed and accuracy, in the first case we used the SSD MobileNet V2 object detection model with the FPN-lite feature extractor, shared box predictor, focal loss, and a 640x640 training image size. SSD [22] is a multi-category single-shot detector that is substantially faster and more accurate than the initial version of YOLO. In the literature there is a variety of more accurate but slower techniques, such as Faster R-CNN; however, since our main focus is to deploy a real-time application, we give more importance to the speed of the model. The core of SSD is predicting category scores and box offsets for a fixed set of default bounding boxes, using small convolutional filters applied to feature maps. MobileNet V2 [7] significantly improves the performance of mobile models on a variety of tasks and benchmarks. It is based on an inverted residual structure in which shortcut connections are made between thin bottleneck layers. FPN [23] exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids at marginal extra cost. It is a top-down architecture with lateral connections, developed for building high-level semantic feature maps at all scales.</p>
        <p>The model performs poorly on small objects in comparison to large objects (Figures 4 and 5 report the mAP of all objects and of large objects on the test set). This is partly due to SSD MobileNet's features and also to the compressed input size. Large objects reach an accuracy of up to 0.65, while medium and small objects have an accuracy of only 0.57 and 0.29, respectively. This results in a decrease in the overall accuracy over all images.</p>
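        <p>For readers unfamiliar with the default-box mechanism, the decoding step can be sketched as follows. This is a generic illustration of SSD-style center-size decoding, not code from the paper; the (0.1, 0.2) scale factors are the variances commonly used in SSD implementations and are an assumption here.</p>
        <preformat><![CDATA[
```python
import math

# Minimal sketch of the box decoding an SSD-style head performs: each
# default (anchor) box is refined by the predicted offsets.  The
# (0.1, 0.2) variances are common SSD defaults, assumed for this sketch.

def decode_ssd_box(anchor, offsets, var_center=0.1, var_size=0.2):
    """anchor and result are (cx, cy, w, h); offsets are (tx, ty, tw, th)."""
    acx, acy, aw, ah = anchor
    tx, ty, tw, th = offsets
    cx = acx + tx * var_center * aw      # shift the box center
    cy = acy + ty * var_center * ah
    w = aw * math.exp(tw * var_size)     # rescale the box size
    h = ah * math.exp(th * var_size)
    return (cx, cy, w, h)

if __name__ == "__main__":
    # Zero offsets leave the default box unchanged.
    print(decode_ssd_box((0.5, 0.5, 0.2, 0.3), (0.0, 0.0, 0.0, 0.0)))
```
]]></preformat>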
        <sec id="sec-2-1-1">
          <title>4.2. Training</title>
          <p>The entirety of the training process was conducted on Google Colab [24]. Colab allows anybody to write and execute arbitrary Python code through the browser, and is especially well suited to machine learning, data analysis, and education. With Colab Pro, we were able to train our model on a K80 GPU for up to 24 hours. The training procedure is carried out over 20,000 epochs.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>4.3. SmokingNet</title>
          <p>A different method for cigarette detection is SmokingNet. SmokingNet was announced in 2018; it detects smoking photos by utilizing the feature extraction capabilities of a CNN. The convolution kernels of the CNN's convolutional layers are used to extract local features of a given image, and the features extracted by the first convolutional layer directly affect the feature fusion of the deep network. Based on the shape characteristics of cigarettes, convolution kernels of four sizes are included in the first convolutional layer of SmokingNet. This method can detect smoking images by utilizing only the information of human smoking gestures and cigarette image characteristics, without requiring the actual detection of the cigarette. The model achieves an accuracy and a recall of 0.9.</p>
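          <p>The multi-size-kernel idea can be illustrated with a toy sketch (not the authors' code): kernels of several sizes scan the same image, and their responses are fused into one feature vector, so thin elongated shapes such as cigarettes are captured at more than one scale. Pure Python for clarity; a real implementation would use a deep learning framework.</p>
          <preformat><![CDATA[
```python
# Toy illustration of a first layer with kernels of several sizes.

def conv2d_valid(image, kernel):
    """2D valid cross-correlation of two lists-of-lists."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            row.append(sum(image[y + j][x + i] * kernel[j][i]
                           for j in range(kh) for i in range(kw)))
        out.append(row)
    return out

def multi_size_features(image, kernel_sizes=(1, 3, 5, 7)):
    """Max response per kernel size, fused into one feature vector."""
    feats = []
    for k in kernel_sizes:
        kernel = [[1.0 / (k * k)] * k for _ in range(k)]  # averaging kernel
        response = conv2d_valid(image, kernel)
        feats.append(max(max(row) for row in response))
    return feats

if __name__ == "__main__":
    # A 9x9 image containing a thin vertical "cigarette-like" line.
    img = [[float(x == 4) for x in range(9)] for _ in range(9)]
    print(multi_size_features(img))
```
]]></preformat>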
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Age classification</title>
      <sec id="sec-3-1">
        <title>5.1. Cascade Classifier</title>
        <p>Object detection with Haar feature-based cascade classifiers is a powerful technique proposed in 2001 by Paul Viola and Michael Jones [11]. It is a method for combining successively more complex classifiers in a cascade structure, which dramatically increases the speed of the detector by focusing attention on promising regions of the image. It is a machine learning-based technique that involves training a cascade function on a large number of positive and negative images; the function is then applied to other images in order to detect objects. OpenCV includes pretrained cascade models for the frontal face, eye, body, and even the smile. For our research, we used the default Haar cascade frontal face model.</p>
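        <p>The early-rejection structure described above can be sketched as follows. This is our own toy illustration of the cascade principle, not the actual Viola-Jones implementation: cheap stages run first and reject most candidate regions, so expensive stages only see promising ones. The stage predicates below are hypothetical.</p>
        <preformat><![CDATA[
```python
# Toy sketch of an attentional cascade with early rejection.

def make_cascade(stages):
    """stages: list of (predicate, cost) ordered from cheap to expensive."""
    def classify(region):
        for predicate, _cost in stages:
            if not predicate(region):
                return False      # early rejection: later stages never run
        return True               # survived every stage
    return classify

if __name__ == "__main__":
    # Hypothetical stages over a region summary (mean brightness, variance).
    stages = [
        (lambda r: r["mean"] > 0.2, 1),   # very cheap brightness check
        (lambda r: r["var"] > 0.01, 10),  # more expensive texture check
    ]
    detect = make_cascade(stages)
    print(detect({"mean": 0.5, "var": 0.05}))  # True
    print(detect({"mean": 0.1, "var": 0.00}))  # False
```
]]></preformat>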
      </sec>
      <sec id="sec-3-2">
        <title>5.2. Age classifier</title>
        <p>Image classification is a classical problem in computer vision: the task of assigning a label to an input image from a fixed set of categories. Despite its simplicity, it has a wide range of practical applications. With the UTKFace dataset, we trained a binary classifier to distinguish children from adults, which allows the mobile robot to give different warnings. After training a CNN model for only 10 epochs, the model achieves over 95 percent accuracy. The results can be observed in Figure 8: subfigure 8.a shows the loss tendency through the epochs, subfigure 8.b shows the accuracy over the training and validation sets, and subfigure 8.c shows the confusion matrix of predictions and ground truth on the test set. We can appreciate the absence of overfitting at the end of the training process.</p>
      </sec>
      <sec id="sec-3-3">
        <title>4.4. Comparison between SSD MobileNet model and SmokingNet</title>
        <table-wrap id="tab1">
          <label>Table 1</label>
          <caption><p>Performance of the SSD MobileNet model and SmokingNet.</p></caption>
          <table>
            <thead><tr><th>Model</th><th>Precision</th><th>Recall</th></tr></thead>
            <tbody>
              <tr><td>SSD MobileNet</td><td>0.43</td><td>0.46</td></tr>
              <tr><td>SmokingNet</td><td>0.90</td><td>0.90</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>As we can see in Table 1, the results obtained by SmokingNet are much better than those of the SSD MobileNet model. However, this is also due to the type of task, namely judging whether or not a cigarette is present in the image. This method is effective in not very crowded situations; in the presence of a large number of people, the amount of gestures to be analyzed becomes too heavy from a computational point of view, and too many elbow movements are misleading.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>6. Mobile Robot</title>
      <p>The mobile robot chosen for this research is Sapienza's robot MARRtino. MARRtino is a ROS-based low-cost differential drive robot platform that comes in many shapes. MARRtino has been designed to be easy to build and easy to program, but at the same time it uses professional software based on ROS. It is thus suitable to implement and experiment with many typical Robotics and Artificial Intelligence tasks, such as smart navigation, spoken human-robot interaction, image analysis, etc. It uses a differential wheeled drive, hence its movement is based on two separately driven wheels placed on either side of the robot body. It can thus change its direction by varying the relative rate of rotation of its wheels, and hence does not require an additional steering motion.</p>
      <p>On the front of the mobile robot, a 480p webcam is installed, which is essential for our objective of recognizing a smoking scene by detecting cigarettes, adults, and children in its view. The webcam can rotate horizontally and vertically, and thus has two degrees of freedom. After recognizing a smoking scene, the speaker in the center of the mobile robot issues a warning. All of the sensors and components are connected to a drive board.</p>
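      <p>The differential-drive steering just described can be summarized by the standard kinematic relations below. This is a generic sketch; the wheel radius and axle track are illustrative values, not MARRtino's actual specifications.</p>
      <preformat><![CDATA[
```python
# Standard differential-drive kinematics: two wheel speeds determine the
# robot's linear and angular velocity, so no steering motion is needed.

def diff_drive_velocity(omega_left, omega_right, wheel_radius=0.03, track=0.2):
    """Wheel angular speeds (rad/s) -> (linear m/s, angular rad/s)."""
    v_left = wheel_radius * omega_left
    v_right = wheel_radius * omega_right
    linear = (v_right + v_left) / 2.0    # forward speed of the body
    angular = (v_right - v_left) / track # turning rate around the center
    return linear, angular

if __name__ == "__main__":
    print(diff_drive_velocity(10.0, 10.0))  # equal speeds: straight line
    print(diff_drive_velocity(-5.0, 5.0))   # opposite speeds: turn in place
```
]]></preformat>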
      <sec id="sec-4-1">
        <title>6.1. Raspberry Pi</title>
        <p>The brain of our mobile robot is a Raspberry Pi 4B. The Raspberry Pi is a small, powerful, and low-cost embedded device. The Raspberry Pi 4B uses a Broadcom BCM2711 SoC with a 1.5 GHz 64-bit quad-core ARM Cortex-A72 processor and a 1 MB shared L2 cache. The Raspberry Pi Foundation, in collaboration with Broadcom, developed this series of miniature single-board computers (SBCs) in the United Kingdom. Initially, the Raspberry Pi initiative was geared toward promoting the teaching of fundamental computer science in schools and impoverished countries. The first model achieved greater popularity than planned, selling outside of its intended market for applications such as robotics. It is widely utilized in a variety of fields, including weather monitoring, due to its inexpensive cost, modular construction, and open architecture. Thanks to its support for HDMI and USB devices, it is also commonly utilized by computer and electronics hobbyists.</p>
        <sec id="sec-4-1-1">
          <title>6.2. Depth Camera</title>
          <p>There are numerous types of depth cameras, which vary in terms of how they receive world data or how that data is processed in order to present it in a useful format. The sensors can differ in a variety of ways, including acquisition method, resolution, and range. Stereo sensors attempt to replicate human vision by utilizing two cameras addressing the scene with a certain amount of separation between them. The images from these cameras are gathered and then utilized to extract and match visual features (important visual information) in order to create what is known as a disparity map between the cameras' viewpoints. Time of Flight (ToF) sensors illuminate the entire image and determine depth based on the time required for each photon to return to the sensor. This means that each pixel corresponds to a single beam of light projected by the device, resulting in increased data density, fewer shadows cast by objects, and simplified calibration (no stereo matching). By contrast, structured light (SL) sensors make use of a predetermined pattern projected into the scene by the IR sensor. The deformation of the pattern is then used to generate the depth map.</p>
          <p>In this research, we chose a depth camera with a structured light sensor, a specific variant of the Orbbec Astra Pro (Figure 10 shows the Astra Pro, our RGB-D camera). The Astra Pro has a higher-resolution RGB camera as well as a depth camera, and was created to be largely compatible with the existing OpenNI library. Through a Python binding for OpenNI2, we are able to obtain both RGB and depth information from the camera.</p>
          <p>The initial mobile robot was equipped with a standard monocular camera that is unable to determine the depth of a scene. We did, however, investigate the possibility of using an RGB-D camera. Due to mechanical constraints, we were unable to directly mount the camera on the mobile robot with screws, but we devised a method for attaching the camera to the mobile robot's base. When testing with the RGB-D camera, the depth of a detected item can be easily determined: the camera features a depth sensor in addition to the standard color sensor, and the depth sensor can be used to determine the proximity of an object to the camera. As a result, once our object has been detected, it is straightforward to locate the same object on the depth map and calculate its distance.</p>
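          <p>One way the distance computation just described can be sketched is below. This is our own minimal example under stated assumptions, not the deployed code: depth values are assumed to be millimetres on a map aligned with the color image, with 0 marking invalid pixels, as in many structured-light sensors; the detected bounding box is transferred onto the depth map and the median of the valid readings inside it is taken.</p>
          <preformat><![CDATA[
```python
from statistics import median

# Sketch: estimate an object's distance from the depth readings inside
# its detected bounding box.  Assumes millimetre units and 0 = invalid.

def object_distance_mm(depth_map, box):
    """depth_map: list of rows; box: (x_min, y_min, x_max, y_max)."""
    x0, y0, x1, y1 = box
    readings = [depth_map[y][x]
                for y in range(y0, y1)
                for x in range(x0, x1)
                if depth_map[y][x] > 0]          # skip invalid pixels
    return median(readings) if readings else None

if __name__ == "__main__":
    depth = [[0, 1200, 1210],
             [1190, 1205, 0],
             [0, 1198, 1202]]
    print(object_distance_mm(depth, (0, 0, 3, 3)))  # 1201.0
```
]]></preformat>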
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>7. Implementation of the Whole Structure</title>
      <p>Prior to combining everything, we need to convert the object detection model to TensorFlow Lite in order to deploy it on the Raspberry Pi. TensorFlow Lite is a free, open-source deep learning framework that enables the deployment of TensorFlow models on mobile devices and is optimized for on-device machine learning. After conversion, we can use TensorFlow Lite on our mobile robot to create predictions based on the input data. As illustrated in the plots, the TensorFlow Lite model is still capable of high-performance cigarette detection on the test photos when the cigarettes are sufficiently visible.</p>
      <p>Our task is to determine whether one or more persons are smoking and whether children are near the smoker. Therefore, the object detector will first determine the number of smokers and children in the area. For example, if it detects a cigarette, two adults, and a child, that means there are a smoker and a child there. Then the robot will approach the smoker and attempt to convince them to stop smoking, or inform them that passive smoking is dangerous for children.</p>
    </sec>
    <sec id="sec-5-hri">
      <title>8. Human-Robot Interaction</title>
      <p>In this section we see how the interaction with the robot affects people. Social psychology has shown how explicit prohibitions can lead individuals to develop a conduct opposite to what is required, whereby an explicit prohibition to do something causes the subject to disregard that prohibition and adopt the prohibited behavior. For this reason, a very important aspect was the construction of a dataset of "kind messages" that could be delivered to people to dissuade them from smoking in the presence of children. The characteristics of the messages that we considered relevant were the following: 1) they did not have to contain an explicit prohibition; 2) the sentence had to be short and understandable; 3) the sentence had to have a content that could be judged plausible by the subject; 4) the sentence had to use direct but gentle language. For example, some of the phrases used could contain a message like this: "Please smoke away from here because there is a child", or "Kindly, do not smoke in this area because there are children too close". To evaluate the "quality" of the answers, and the effectiveness of the robot, the subjects were given a questionnaire that contained three types of questions: how they evaluated the relevance of the robot with respect to the task for which it was programmed; how they assessed the robot's kindness in persuading them not to smoke in front of children; and how they assessed the robot's ability to persuade them not to smoke in general. A table (Table 2) and a chart (Figure 11) report the results of the social experiment.</p>
    </sec>
    <sec id="sec-6">
      <title>9. Conclusion</title>
      <sec id="sec-6-1">
        <table-wrap id="tab2">
          <label>Table 2</label>
          <caption><p>The questions proposed to the users after the social experiment.</p></caption>
          <table>
            <thead><tr><th>N.</th><th>Question</th><th>% YES</th><th>% NO</th></tr></thead>
            <tbody>
              <tr><td>1</td><td>Was the robot pertinent?</td><td>77</td><td>23</td></tr>
              <tr><td>2</td><td>Was the robot polite and kind?</td><td>83</td><td>17</td></tr>
              <tr><td>3</td><td>Will you smoke near children?</td><td>35</td><td>65</td></tr>
              <tr><td>4</td><td>Will you quit smoking?</td><td>29</td><td>71</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>In this research, we deployed a mobile robot to deal with the exposure of youngsters to second-hand smoke. To detect cigarettes, we compared a custom SSD MobileNet model trained on our home-made dataset with SmokingNet. Next, to discriminate between children and adults, we used a pretrained Cascade classifier as a face detector and a CNN model trained on the UTKFace dataset as an age classifier. The age classifier is only initialized when the mobile robot comes across a cigarette and the Cascade classifier detects faces. This approach improved the accuracy of distinguishing children from adults. In the literature, researchers have focused on achieving high performance of the cigarette detector; it can be shown that these models require a significant amount of computational time. The models chosen here were structured to be mobile friendly and were successfully deployed on a Raspberry Pi. In the first cigarette detection model, one of the main drawbacks was the performance of the detector on really small objects. This may also call for additional data augmentation on the dataset.</p>
        <p>The limited performance of the chosen SSD MobileNet led us to choose SmokingNet, which turns out to be much more precise. We also discovered that the Raspberry Pi's performance is insufficient for running a high-performance object detector: when running inference with TensorFlow Lite, the frame rate is only 0.55 frames per second, which is a bit slow for real-time detection. It is difficult to strike a balance between efficiency and performance. A more powerful computer, such as the Jetson Nano, would be able to run a deeper neural network, which would undoubtedly increase efficiency and performance. In this paper we have shown how, even with few computational resources available, satisfactory results can be achieved for important and large-scale problems such as second-hand smoking towards teenagers and children.</p>
        <p>In conclusion, from the answers to the questions, highlighted by the chart (Figure 11, a column chart showing the percentage of people's responses to the robot), it emerges that the use of the robot can be a good "facilitator" to communicate messages inviting people not to smoke in the presence of children. Probably this role of facilitator is made possible by the "sympathy" that the robot can arouse in the majority of the people involved in this study. In fact, as regards the questions related to how the robot was perceived, there is a very high percentage of positive answers. There is also a significant percentage of affirmative answers on the robot's ability to persuade people not to smoke in the presence of children. However, it would be necessary to consider that these latter responses could be conditioned by the sympathy aroused by the robot, rather than by a sincere intent to change one's lifestyle. Moreover, when asked about the effectiveness of the robot in convincing people to quit smoking in general, there was a considerable number of negative responses, which leads us to think that cigarette addiction is very strong and certainly requires further strategies to convince people to change their lifestyle in a stable and radical way. It would also be interesting to repeat the experiment with a control group and a follow-up interview at least six months apart.</p>
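        <p>The 0.55 frames-per-second figure can be obtained with a simple timing loop like the sketch below. This is a generic measurement idiom, not the deployed code; `run_inference` is a hypothetical stand-in for the TensorFlow Lite interpreter invocation.</p>
        <preformat><![CDATA[
```python
import time

# Sketch: average frames-per-second over a fixed number of inference calls.

def measure_fps(run_inference, frames=20):
    start = time.perf_counter()
    for _ in range(frames):
        run_inference()
    elapsed = time.perf_counter() - start
    return frames / elapsed

if __name__ == "__main__":
    # Dummy 10 ms "model" standing in for the real interpreter call.
    fps = measure_fps(lambda: time.sleep(0.01))
    print(round(fps))
```
]]></preformat>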
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>… URL: https://doi.org/10.1007/978-1-4684-7562-3_3. doi:10.1007/978-1-4684-7562-3_3.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>… URL: https://projekter.aau.dk/projekter/files/419098381/Master_Thesis_Mathiebhan.pdf.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[5] S. Russo, C. Napoli, A comprehensive solution for psychological treatment and therapeutic path planning based on knowledge base and expertise sharing, volume 2472, 2019, p. 41-47.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[6] S. Illari, S. Russo, R. Avanzato, C. Napoli, A cloud- … follow-up of hospitalized patients, volume 2694, 2020, p. 29-35.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[7] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, L.-C. Chen, Mobilenetv2: Inverted residuals and linear bottlenecks, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 4510-4520.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[8] D. Zhang, C. Jiao, S. Wang, Smoking image detection …, in: 2018 IEEE 4th International Conference on Computer and Communications (ICCC), IEEE, 2018, pp. 1509-1515.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[9] Raspberry, 2022. URL: https://www.raspberrypi.com/for-home/.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[10] M. Woźniak, D. Połap, C. Napoli, E. Tramontana, Application of bio-inspired methods in intelligent …, Information Technology and Control 46 (2017) 150-164. doi:10.5755/j01.itc.46.1.13872.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[11] P. Viola, M. Jones, Rapid object detection using a boosted cascade of simple features, in: Proceedings of the 2001 IEEE computer society conference on …, volume 1, IEEE, 2001, pp. I-I.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[12] J. Starczewski, S. Pabiasz, N. Vladymyrska, A. Mar- …, … maps for 3d face understanding, Lecture Notes … in Bioinformatics) 9693 (2016) 210-217. doi:10.1007/978-3-319-39384-1_19.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[13] N. Brandizzi, V. Bianco, G. Castro, S. Russo, A. Wajda, …tion recognition, volume 3092, 2021, p. 66-74.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[14] M. A. …, TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. URL: https://www.tensorflow.org.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[15] A. Parate, M.-C. Chiu, C. Chadowitz, D. Ganesan, …, MobiSys 2014 - Proceedings of the 12th Annual International Conference on Mobile Systems, Applications, and Services 2014 (2014). doi:10.1145/2594368.2594379.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[16] Q. Tang, D. Vidrine, E. Crowder, S. Intille, Auto- … accelerometers, ICST, 2014. doi:10.4108/icst.pervasivehealth.2014.254978.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[17] Identification of cigarette litter with the use of outdoor mobile robots (2021). URL: …</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[18] A. Bochkovskiy, C. Wang, H. M. Liao, Yolov4: Optimal speed and accuracy of object detection, abs/2004.10934 (2020). URL: https://arxiv.org/abs/2004.10934. arXiv:2004.10934.</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[19] Deep learning based driver smoking behavior detection for driving safety (2020). URL: http://www.joig.net/uploadfile/2020/0318/20200318051129839.pdf.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[20] Roboflow, https://roboflow.com, 2022. URL: https://roboflow.com.</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>[21] Zhang Zhifei, S. Y., Q. Hairong, Age progression/regression by conditional adversarial autoencoder, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2017.</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>[22] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, A. C. Berg, Ssd: Single shot multibox detector, in: European conference on computer vision, Springer, 2016, pp. 21-37.</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>[23] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, … detection, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 2117-2125.</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>[24] Colab, https://colab.research.google.com, 2022.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>