<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Is Kinesthetic Teaching What Smart Factories Really Need?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jessica Villalobos</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Enrique Coronado</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandro Carfì</string-name>
          <email>alessandro.carfi@dibris.unige.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Barbara Bruno</string-name>
          <email>barbara.bruno@unige.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fulvio Mastrogiovanni</string-name>
          <email>fulvio.mastrogiovanni@unige.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Genoa</institution>
          ,
          <addr-line>Via Opera Pia 13, 16145 Genoa</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Programming by demonstration techniques have been investigated to facilitate and speed up the setup of new robot tasks. Kinesthetic teaching (KT), i.e., teaching by physically guiding a robot in the execution of a motion, has been adopted in industrial scenarios for its ease of use. In the work described here, we analyse and discuss limits and drawbacks of KT and suggest the adoption of a set of autonomous behaviors.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The Industry 4.0 paradigm [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] requires techniques for fast and easy-to-attain robot task reconfigurations. Programming by Demonstration (PbD) addresses such requirements by making it possible to teach robots new tasks intuitively [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. PbD is characterized by two phases: teaching, in which one or multiple examples are shown by means of physical guidance, and learning, in which the examples are generalized in order to obtain a resulting robot behavior. Task execution, then, is simply the autonomous repetition of the learned behavior. When adopted in industrial scenarios, PbD has the competitive advantage of not requiring any engineering knowledge for robot reconfiguration and task teaching. As a consequence, robot operators can be specialized laborers whose knowledge mainly derives from the experience gained when operating the robot manually. However, their limited awareness of how robots work, of robot limitations, as well as of the inherent differences between robot and human motions, can lead to low-quality robot motions, e.g., inefficient trajectories or increased execution time. Is it acceptable for the quality of robot motions to be highly sensitive to the operator's teaching skills?
      </p>
      <p>In this discussion, when we refer to quality, we consider the time required for the teaching phase, insofar as it affects the duration of task execution. In particular:
– pure playback is the worst case, since execution time is equal to teaching time, and therefore any inefficiency in the teaching phase is replicated during execution;
– way-point playback optimizes execution time but requires a longer teaching time (e.g., the operator must stop at each key way point); moreover, it does not constrain the motion between pairs of way points, so a suboptimal selection of way points can lead to inefficient motions;
– generalization over different examples reduces the influence of a single bad example on execution time, but increases the required teaching time.</p>
    </sec>
    <sec id="sec-2">
      <title>Rationale and Hypotheses</title>
      <p>PbD makes it possible to set up a robot for operators without engineering knowledge about how robots work, but with practical knowledge of industrial equipment. This can lead to (problem P1) sub-optimal solutions as far as a trajectory is concerned, or to the need of introducing more complex and robust learning approaches to overcome sub-optimality, therefore resulting in a longer overall teaching time. Since different operators have different working experience, (insight I1) an experienced operator is likely to obtain better results at teaching than a less experienced one. Furthermore, operators with different skills and experience may spend a different time for the same teaching goal, which could (P2) tempt less skilled operators to speed up the teaching procedure to the detriment of the final result. Indeed, (I2) the need for fast reprogramming, lack of time, stress, inexperience, or laziness are all factors that may shift the focus towards reducing teaching time, with possible drawbacks on the final result.</p>
      <p>Starting from these considerations, we hypothesize that (hypothesis H1) the time required to teach a robot a given task, including the operator's attempts and the required number of iterations, varies among operators.</p>
      <p>Considering the robot's end-effector trajectory for a simple pick and place task, as executed at teaching time, (H2) the two spatial intervals related to grasping (Tg) and releasing (Tr) are characterized by a higher density of trajectory points with respect to nearby portions of the trajectory. As a consequence, it may be reasonable (I4) to reduce an operator's influence during Tg and Tr because of their criticality in the overall trajectory.</p>
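      <p>H2 can be checked directly on recorded data. At a fixed sampling rate, low end-effector speed means consecutive samples cluster in space, i.e., high spatial point density. The sketch below, a hypothetical example whose function name and threshold are our own assumptions, isolates such contiguous low-speed stretches as candidate grasping and releasing intervals.</p>
      <preformat>
```python
import numpy as np

def dense_segments(positions, dt, speed_thresh):
    """Flag contiguous stretches where the end-effector moves slowly:
    with fixed-rate sampling, low speed implies high spatial density
    of trajectory points (candidate grasp/release phases).
    Returns a list of (start, end) sample-index pairs."""
    positions = np.asarray(positions, dtype=float)
    speed = np.linalg.norm(np.diff(positions, axis=0), axis=1) / dt
    slow = speed < speed_thresh
    segments, start = [], None
    for i, s in enumerate(slow):
        if s and start is None:
            start = i            # a low-speed stretch begins
        elif not s and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:        # stretch runs to the end of the data
        segments.append((start, len(slow)))
    return segments
```
      </preformat>
      <p>On a pick and place demonstration, one would expect two such segments, one around the grasp and one around the release.</p>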
      <p>In order to validate the above considerations, we posit the following working hypotheses: (WH1) exposing operators to PbD training footage, or to human-human interactions in which one human role-plays as a robot engaged in PbD, reduces the variance of teaching time; (WH2) the need to manually control the opening and closing of a robot's gripper is a disturbing factor for the operator, which affects how grasping and releasing actions are taught; (WH3) the need to identify an appropriate grasping pose for the object to pick while teaching the task has a negative effect.</p>
    </sec>
    <sec id="sec-3">
      <title>Methodology</title>
      <p>We conducted a series of experiments involving 25 unpaid volunteers among students (and teachers) from a vocational education and training school in Italy, aged between 15 and 60. Volunteers have previous knowledge of industrial equipment, but not of collaborative robots, and may well be thought of as possible robot operators in factories of the future. The volunteers have been divided into two groups and asked to perform two activities, which we refer to as human-human interaction and human-Baxter interaction. The two groups perform the activities in different orders, i.e., with one group engaging in the human-human interaction activity before the human-Baxter interaction, and the other after.</p>
      <p>Human-Human Interaction: At the beginning, a seven-minute video4, intended to demonstrate how KT works on Baxter, is shown to the group of volunteers. Then, volunteers divide in pairs, with one acting as the human teacher and the other as the robot learner, and are asked to apply the same methodology to teach a randomly-chosen task to each other. No verbal communication is allowed. Possible tasks include: screwing a jar lid, stacking six boxes to form a pyramid, ordering five bottles according to their weight, composing a square with four pens, picking and placing a box, folding a shirt, driving a screw in a piece of wood. The experiment is repeated twice, swapping the roles of teacher and learner. During all the experiments, teacher and learner are in front of each other with a table in between. Once the teaching procedure ends, the learner is asked to repeat the task. Videos were recorded during the experiments.</p>
      <p>Human-Baxter Interaction: A Baxter dual-arm manipulator stands in front of a table on which two locations, namely A and B, are defined. In A, a 0.5-liter plastic bottle filled with water to about one tenth of its capacity is located, whereas in B there is an open box with a 30 × 39 cm base and a 12 cm height. The distance between A and B is 78 cm, while the distance between Baxter and the table is about 60 cm. Before the activity starts, an experimenter gives a practical demonstration on the use of KT to teach Baxter how to perform a pick and place task. Each experiment starts with the Baxter in the untucked pose, while we did not constrain the final pose. The experiment loosely follows this sequence:
– the volunteer stands in front of Baxter, on the other side of the table;
– the volunteer grasps the wrist of Baxter's left arm to activate the zero-gravity mode and starts the teaching procedure;
– the volunteer teaches Baxter how to relocate the bottle from A to B (i.e., how to put the bottle inside the open box) using KT;
– when appropriate, the volunteer pushes the buttons located on the robot's wrist to open and close its left gripper;
– the task is executed: if the robot does not succeed in performing it, the volunteer is asked to repeat the teaching procedure.</p>
      <p>At all times, the trajectory τ(t), expressed in joint space, is recorded. It is noteworthy that no constraints on teaching time have been set for this experiment.
4 https://youtu.be/4FI7LwM3V38</p>
    </sec>
    <sec id="sec-4">
      <title>Discussion: What Next for Kinesthetic Teaching?</title>
      <p>By analyzing density distributions in trajectories, we found that the grasping (Tg) and releasing (Tr) phases group high-density points. Furthermore, observations suggest that, in a typical pick and place task, the time spent in those phases is highly relevant, and amounts to about 38% of the total teaching time. These two findings prove that the process of teaching a robot how to grasp or release an object via KT is critical, and that any effort to improve it may lead to a relevant reduction of the required teaching (and possibly execution) time. The percentage of time spent during phases Tg and Tr and the densities of points in these portions of the whole trajectory are characterized by a high variance among different volunteers. This is probably due to a difference in their skills or experience, and suggests the need to reduce the operator's influence on specific difficult phases of KT, by extending KT with a number of semi-autonomous robot behaviors.</p>
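      <p>A time-share statistic of this kind, together with its spread across volunteers, can be estimated with a simple proxy: the fraction of teaching time spent in low-speed samples. The following is only a hedged sketch, not the analysis pipeline actually used; the function name and threshold are our own assumptions.</p>
      <preformat>
```python
import numpy as np

def slow_phase_fraction(positions, dt, speed_thresh):
    """Fraction of total teaching time spent at low speed, used as a
    proxy for the time share of the grasping/releasing phases."""
    positions = np.asarray(positions, dtype=float)
    speed = np.linalg.norm(np.diff(positions, axis=0), axis=1) / dt
    return float(np.mean(speed < speed_thresh))
```
      </preformat>
      <p>Applying the function to each volunteer's demonstration and taking the mean and standard deviation of the resulting fractions would quantify both the average time share and its variance across operators.</p>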
      <p>To devise such behaviors we turn to the human-human activities, and in particular to what happens when humans apply a simple form of KT to train each other. We observed that in the vast majority of cases, even without any verbal interaction, the learner autonomously deduces that a given object (e.g., a box, a pen) must be picked, orients the hand in accordance with a suitable grasping pose, and autonomously closes the hand once deemed appropriate. A similar behavior is observed during the release phase. These qualitative and quantitative observations lead us to identify two possible disturbing factors in the KT procedure: the need to control the opening and closing of the robot's gripper, as suggested by WH2, and the need to select and reach the most appropriate grasping pose, as put forth by WH3.</p>
      <p>In order to lessen the consequences of these disturbing factors, we propose a research work plan that foresees the design and implementation of two semi-autonomous robot behaviors extending the basic KT paradigm: (i) the autonomous opening and closing of a robot's gripper when appropriate, which requires the robot to understand the operator's intentions and to reason about the configuration of the objects in its workspace, and (ii) the autonomous re-orientation of the gripper according to the identified grasping pose, which requires the robot to reason about object shapes and functions. The implementation of such abilities can allow a robot to actively participate in the teaching procedure, reducing the required teaching time and facilitating human-robot interaction via KT.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>The authors would like to thank the teachers and students of the vocational
education and training school "Centro Oratorio Votivo, Casa di Carita, Arti
e Mestieri, Ovada" for their contribution to the drafting and execution of the
experiments.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Calinon</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guenter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Billard</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>On learning, representing, and generalizing a task in a humanoid robot</article-title>
          .
          <source>IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)</source>
          <volume>37</volume>
          (
          <issue>2</issue>
          ) (
          <year>2007</year>
          )
          <fpage>286</fpage>
          –
          <lpage>298</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Dillmann</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaiser</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ude</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Acquisition of elementary robot skills from human demonstration</article-title>
          .
          <source>In: Proceedings of the International symposium on intelligent robotics systems (SIRS</source>
          <year>1995</year>
          ), Pisa, Italy (
          <year>1995</year>
          )
          <fpage>185</fpage>
          –
          <lpage>192</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Kagermann</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wahlster</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Helbig</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Recommendations for implementing the strategic initiative industrie 4.0: Final report of the industrie 4.0 working group</article-title>
          .
          <source>Produktion, Automatisierung und Logistik (Frankfurt</source>
          ,
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Halbert</surname>
            ,
            <given-names>D.C.</given-names>
          </string-name>
          :
          <article-title>Programming by example</article-title>
          .
          <source>PhD thesis</source>
          , University of California, Berkeley, USA (
          <year>1984</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Inamura</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kojo</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Inaba</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Situation recognition and behavior induction based on geometric symbol representation of multimodal sensorimotor patterns</article-title>
          .
          <source>In: Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS</source>
          <year>2006</year>
          ), Beijing, China (
          <year>October 2006</year>
          )
          <fpage>5147</fpage>
          –
          <lpage>5152</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Ito</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Noda</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoshino</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tani</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Dynamic and interactive generation of object handling behaviors by a small humanoid robot using a dynamic neural network model</article-title>
          .
          <source>Neural Networks</source>
          <volume>19</volume>
          (
          <issue>3</issue>
          ) (
          <year>2006</year>
          )
          <fpage>323</fpage>
          –
          <lpage>337</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Kang</surname>
            ,
            <given-names>S.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ikeuchi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>A robot system that observes and replicates grasping tasks</article-title>
          .
          <source>In: Proceedings of the 1995 IEEE International Conference on Computer Vision</source>
          (ICCV
          <year>1995</year>
          ), Boston, USA (
          <year>June 1995</year>
          )
          <fpage>1093</fpage>
          –
          <lpage>1099</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>