                                   Object-Centric Camera Drone Control
                                   for Unconstrained Telepresence

Jiannan Li
University of Toronto
40 St George St, Toronto, ON, Canada
jiannanli@dgp.toronto.edu

Ravin Balakrishnan
University of Toronto
40 St George St, Toronto, ON, Canada
ravin@dgp.toronto.edu

Tovi Grossman
University of Toronto
40 St George St, Toronto, ON, Canada
tovi@dgp.toronto.edu

Abstract
Camera drones, a rapidly emerging technology, offer people the ability to remotely inspect an environment with a high degree of mobility and agility. However, manual remote piloting of a drone is prone to errors, while autopilot systems are not necessarily designed to support flexible visual inspection. We propose the object-centric control paradigm for efficient camera drone navigation, in which a user directly specifies the navigation of the drone camera relative to a specified object of interest. We demonstrate the strengths of this approach through our first prototype, StarHopper, and discuss future research opportunities.

Author Keywords
Drone; telepresence; object-centric

This paper is published under the Creative Commons Attribution 4.0 International (CC-BY 4.0) license. Authors reserve their rights to disseminate the work on their personal and corporate Web sites with the appropriate attribution.
Interdisciplinary Workshop on Human-Drone Interaction (iHDI 2020), CHI '20 Extended Abstracts, 26 April 2020, Honolulu, HI, US.

Introduction
Researchers in telepresence have long envisioned 'beyond being there' [1]. Replicating all relevant local experiences while remote should not be the only goal of telepresence; rather, we should also strive to create telepresence systems that enable benefits not possible when the person is physically present. Telepresence then goes from replication to augmentation. One particular instance of this vision is enabled by camera drones: our local bodies can only walk on the ground, but our remote bodies can fly.
Researchers have noted a number of social and functional issues due to the insufficient mobility of current remote robotic presence platforms [15]. As drones become more affordable and reliable, they hold the potential to enable more flexible remote presence and visual inspection experiences (e.g. [2]) for the general population.

While drones offer promise for such telepresence applications, they are challenging to control manually from a distance, due to numerous factors including their high degrees of freedom, narrow camera fields of view, and network delays [7]. Their control interfaces, typically virtual or physical joysticks for consumer drones, are also unfamiliar to many users and take extended training to master [9].

Figure 2: StarHopper system components.

To relieve the burden of manual piloting, autopilot techniques have been applied to drone control. Most existing drone autopilot interfaces are based on specifying a series of planned waypoints in a 2D or 3D global map (e.g. [9]). However, when a user wishes to perform a real-time inspection, setting waypoints a priori may not be an efficient way to produce the desired viewpoints. Some autonomous systems avoid waypoints and execute higher-level plans, such as following a subject to form canonical shots [3], but they typically do not offer the flexibility needed for exploring remote environments.

The difficulty of drone piloting poses a significant barrier to the widespread adoption of free-flying robots. The goal of this research is to design a camera drone control interface that supports efficient and flexible telepresence experiences. Our work is inspired by decades of research in interactive graphics, where many camera navigation techniques have been established (e.g. [5]). Most relevant, we build upon object-centric techniques, in which zooming, panning, and orbiting occur relative to the location of a 3D object of interest. We demonstrated the potential of this approach through our first prototype, StarHopper [6], and illustrate future research opportunities.

Figure 1: Operating a camera drone remotely to inspect an apartment. (a) The user specifies a desired view of the coffee machine by dragging on the drone's camera view. (b) The drone flies towards the specified viewpoint.

Previous Work: An Object-Centric Interface for Remote Inspection
StarHopper is a remote object-centric camera drone navigation interface that is operated through familiar touch interactions and relies on minimal geometric information about the environment (Figure 1). It consists of an overhead camera view for context and a 3D-tracked drone's first-person view for focus (Figure 2). New objects of interest can be specified through simple touch gestures on both camera views. We combine automatic and manual control via four navigation mechanisms that complement each other with unique strengths, to support efficient and flexible visual inspection. The system focuses on indoor environments, representative of tasks such as remote warehouse inspection [9] and museum visits [11], and where positional tracking technology is more reliable.
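The paper does not give StarHopper's pose equations, but the core of object-centric control can be sketched: a viewpoint is parameterized relative to the object's position by an azimuth, elevation, and distance, with the camera's yaw and pitch chosen so it faces the object. A minimal illustrative sketch (function and frame conventions are our own, not the authors'):

```python
import math

def object_centric_pose(obj_xyz, azimuth, elevation, distance):
    """Place the camera on a sphere around the object and aim it
    at the object's center. Angles in radians; world frame is
    x-forward, y-left, z-up (an assumed convention)."""
    ox, oy, oz = obj_xyz
    cx = ox + distance * math.cos(elevation) * math.cos(azimuth)
    cy = oy + distance * math.cos(elevation) * math.sin(azimuth)
    cz = oz + distance * math.sin(elevation)
    yaw = math.atan2(oy - cy, ox - cx)    # turn towards the object
    pitch = math.atan2(oz - cz, math.hypot(ox - cx, oy - cy))
    return (cx, cy, cz), yaw, pitch

# Orbiting the object is then simply a change of azimuth,
# zooming a change of distance, with aim maintained for free.
pose, yaw, pitch = object_centric_pose((2.0, 0.0, 1.0),
                                       azimuth=0.0, elevation=0.3,
                                       distance=1.5)
```

Under this parameterization, pan, zoom, and orbit all reduce to incrementing one parameter, which is what makes the object-centric remapping of joystick axes described later possible.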
Design Guidelines
We base our design for remote object-centric drone navigation on a set of guidelines grounded in our review of prior literature: (1) support situation awareness; (2) minimize reliance on environmental information; (3) combine automated and manual control; (4) support simple touch interactions; (5) respect physical constraints.

Figure 3: The StarHopper user interface. (a) Remote drone camera view. (b) Overview camera view. (c) Virtual joysticks. (d) Object-of-interest list. (e) Icon for object-centric mode.

Figure 4: Interaction with the 360 viewpoint widget. (a) The user touches the area around the ring to activate the widget. (b) The user drags the finger to adjust the viewing angle and camera height. Upon releasing the drag, the drone navigates to the specified viewpoint.

User Interface and Navigation Mechanisms
StarHopper provides a touch screen interface for users to view the drone's live video stream and to perform drone navigation (Figure 3). The drone camera feed fills the screen; the overview camera video and two virtual joysticks sit at the bottom of the interface.

The user can obtain the approximate position and dimensions of an object through a simple two-step procedure, without pre-built maps or expensive real-time 3D reconstruction. She selects the object of interest through a drag gesture, first in the overview camera view and then in the drone camera view. A computer vision algorithm triangulates the position of the object from these two regions and estimates the dimensions of a bounding cylinder around the object (see [6] for more technical details).

Inspired by camera control mechanisms in interactive graphics, we designed three object-centric physical camera navigation mechanisms for viewing an object of focus: the 360 viewpoint widget, delayed through-the-lens control, and object-centric joysticks.

360 viewpoint widget
The 360 viewpoint widget supports quickly navigating to, and focusing on, an object of interest from a user-specified viewing angle. The widget takes the shape of a semi-transparent 3D ring surrounding the focus object (Figure 4a). A 3D arrow aimed at the ring appears upon touch, indicating the desired viewing direction. The user drags a finger along the ring to set the desired viewpoint position (Figure 4b). Once the user releases the finger, the autopilot system moves the drone to the calculated viewpoint. The algorithm determines a reasonable default viewing distance based on the size of the bounding cylinder.

Delayed through-the-lens control
To use this technique, the user first rests two fingers on the drone camera view to freeze the current frame (Figure 5a). The user then performs a two-finger pinch-and-pan gesture to transform the current frame into the desired viewpoint (Figure 5b). The system then calculates a new drone position that can produce the desired viewpoint, towards which the drone navigates.
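The paper leaves the gesture-to-motion mapping to [6]; under a simple pinhole-camera assumption, one plausible version of it is that the pinch scale moves the drone along the view axis and the pan translates it parallel to the image plane. A hypothetical sketch (names and conventions are ours, not StarHopper's actual implementation):

```python
def through_the_lens_offset(pan_px, zoom_scale, depth, focal_px):
    """Map a pinch-and-pan on a frozen frame to a camera offset.

    pan_px:     (dx, dy) screen pan in pixels
    zoom_scale: pinch factor (>1 means the user enlarged the object)
    depth:      estimated distance to the object of interest, in m
    focal_px:   camera focal length in pixels
    """
    # Enlarging the image by a factor s is approximated by flying
    # to a distance of depth / s.
    new_depth = depth / zoom_scale
    forward = depth - new_depth
    # A pan of dx pixels corresponds to a lateral shift of about
    # dx * depth / f, in the opposite direction: moving the camera
    # right shifts the image content left.
    dx, dy = pan_px
    right = -dx * depth / focal_px
    up = dy * depth / focal_px
    return forward, right, up

# e.g. pinching to 2x on an object 3 m away: fly 1.5 m forward.
```

The "delayed" aspect, freezing the frame before gesturing, decouples the interaction from video latency: the drone only moves once the user commits to a target view.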
Object-centric joysticks
We remap the axes of traditional drone control joysticks to object-centric commands and add constraints to prevent manipulation errors. More specifically, under the object-centric constraints, the drone keeps the object of interest in its field of view during pan movements (Figure 7a). In object-centric zoom, the drone aims its camera at the object of interest and moves closer to or further away from it (Figure 7b). In response to orbiting commands, the drone orbits around the object while aiming at its center (Figure 7c).

Manual joysticks
In addition to the three object-centric navigation mechanisms, StarHopper also supports fully manual control. This can be useful when the user wishes to make slight adjustments to a viewpoint that the autopilot system navigated to.

Figure 5: Adjusting the camera view using delayed through-the-lens control. (a) The user rests two fingers on the screen to freeze the current view. (b) A pan-and-zoom gesture on the frozen frame specifies the desired view.

Managing objects of interest
The object-of-interest list on the right of the interface (Figure 3d) records thumbnails of all previously registered objects of interest. The user can tap a thumbnail to set it as the object of interest, and the drone will turn towards it. A double-tap on the thumbnail triggers the drone to approach that object.

Navigation mechanism properties
StarHopper comprises a set of four navigation mechanisms, ranging from fully automated to fully manual. This suite of techniques allows users to perform both flexible and efficient scene inspections by leveraging their contrasting capabilities (Table 1). We recognize the trend that a higher automation level increases efficiency but reduces flexibility. Taken together, the system offers the user both efficient and flexible navigation mechanisms: the 360 viewpoint widget, despite its high efficiency, lacks flexibility and can be complemented by delayed through-the-lens control, object-centric joysticks, and manual control.

User Study
To evaluate the navigation mechanisms of StarHopper, we conducted a user study consisting of a remote object inspection task with 12 volunteers (7 female, Mage = 26.3, SDage = 4.4). A Ryze Tello drone was used in the study. We compared StarHopper to a baseline of conventional manual joystick controls. In each trial, the participant was instructed to fly the drone from the starting position to inspect one of four sides (Left, Right, Front, Back) of an item (Figure 8) using one of the two control interfaces, StarHopper or manual joysticks (Manual). We recorded the completion time of each trial.

A repeated measures analysis of variance showed that the task was completed significantly faster with StarHopper than with egocentric manual control (F1,11 = 23.8, p < 0.001). Overall, StarHopper was 35.4% faster (StarHopper: 20.33 s, Manual: 31.45 s), demonstrating a substantial gain in efficiency (Figure 6).

Figure 6: Mean task completion time of manual control and StarHopper. Error bars represent 95% CI.
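The reported efficiency gain follows directly from the two mean completion times:

```python
manual = 31.45      # mean trial completion time (s), manual joysticks
starhopper = 20.33  # mean trial completion time (s), StarHopper

speedup = (manual - starhopper) / manual
print(f"{speedup:.1%}")  # 35.4%
```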
Future Research Opportunities
The StarHopper prototype demonstrated the potential efficiency advantage of object-centric camera drone control. More importantly, it revealed several future challenges and opportunities for better leveraging the object-centric paradigm for unconstrained telepresence.

Increasing Situation Awareness for Leveraging Greater Mobility
Supporting situation awareness has long been a key theme in robot teleoperation research [14]. With greater mobility, drone operators face a greater risk of getting lost in space [7]. Prior research has shown the effectiveness of a live exocentric overview for enhancing situation awareness in teleoperation (e.g. [8]). StarHopper incorporated a static live overview camera, but this setup reduced the area where the drone could fly. Future research can explore awareness mechanisms that do not sacrifice mobility, for example, a second, spatially coupled camera drone serving as the overhead camera [10].

Richer Interaction Using Objects-of-Interest Semantic Information
Interactions with objects of interest in StarHopper were limited to specifying desirable viewpoints, as StarHopper only exploited simple geometric information. With recent advances in image understanding, a natural next step would be enabling richer and more meaningful interactions using semantic information about objects of interest. For instance, instead of following the single default rule of placing the object at the center of the camera frame, the system could choose a more appropriate camera framing and trajectory depending on the object and relevant context: the drone could focus on the upper body of a person in conversation, or zoom in on the console of an instrument for key readings.

Figure 7: The object-centric joystick controls. Red areas indicate the joystick axes used. (a) Pan. (b) Zoom. (c) Orbit.

Design for Local Users
While tele-operated robots give remote users the ability to control viewpoints, they raise challenges for local users in accurately interpreting remote users' actions and intentions. Such challenges are exacerbated with drones, as their movements and form factors can be very different from humans'. Recent research proposed signaling drone motion intent with augmented reality [12]. However, future flight paths and waypoints can be insufficient for a remote user who operates the drone to establish common ground with a local user, for example, when they want to make sure they are discussing the same object among a number of candidates in the environment. Visualizing objects of interest can complement such signaling and facilitate communication.

Privacy Considerations
A free-roaming viewpoint such as a drone raises privacy concerns about remote users intentionally or unintentionally seeing private visual information of local users. Prior research in video-mediated communication has looked extensively into privacy issues, but largely for fixed cameras (e.g. [4]). Privacy research on drones has mostly studied perceptions of drones operated by strangers (e.g. [13]). Drones for telepresence, especially drones that work closely with humans, call for new privacy mechanisms; for example, local users could define sensitive objects or zones that remotely operated drones should always avert.

Conclusion
Remotely operated camera drones hold potential for unconstrained telepresence 'beyond being there' but require careful control interface design to realize that potential. Through prototyping and evaluating StarHopper, we showed the advantage of the object-centric control paradigm for camera drone teleoperation. We further invite the research community to consider future opportunities in applying the object-centric paradigm to develop more useful and usable camera drone control interfaces for unconstrained telepresence.
Table 1: Properties of the four control mechanisms.

Figure 8: Mean task completion time of manual control and StarHopper. Error bars represent 95% CI.

REFERENCES

[1] Jim Hollan and Scott Stornetta. 1992. Beyond Being There. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (1992), 119–125. DOI: http://dx.doi.org/10.1145/142750.142769

[2] Brennan Jones, Kody Dillman, Richard Tang, Anthony Tang, Ehud Sharlin, Lora Oehlberg, Carman Neustaedter, and Scott Bateman. 2016. Elevating Communication, Collaboration, and Shared Experiences in Mobile Video Through Drones. Proceedings of the 2016 ACM Conference on Designing Interactive Systems (2016), 1123–1135. DOI: http://dx.doi.org/10.1145/2901790.2901847

[3] Niels Joubert, Jane L. E, Dan B. Goldman, Floraine Berthouzoz, Mike Roberts, James A. Landay, and Pat Hanrahan. 2016. Towards a Drone Cinematographer: Guiding Quadrotor Cameras using Visual Composition Principles. arXiv:1610.01691 [cs] (2016). http://arxiv.org/abs/1610.01691

[4] Tejinder K. Judge, Carman Neustaedter, and Andrew F. Kurtz. 2010. The Family Window: The Design and Evaluation of a Domestic Media Space. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (2010), 2361–2370. DOI: http://dx.doi.org/10.1145/1753326.1753682

[5] Azam Khan, Ben Komalo, Jos Stam, George Fitzmaurice, and Gordon Kurtenbach. 2005. HoverCam: Interactive 3D Navigation for Proximal Object Inspection. Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games (2005), 73–80. DOI: http://dx.doi.org/10.1145/1053427.1053439

[6] Jiannan Li, Ravin Balakrishnan, and Tovi Grossman. 2020. StarHopper: A Touch Interface for Remote Object-Centric Drone Navigation. Proceedings of Graphics Interface 2020 (2020).

[7] David Pitman and Mary L. Cummings. 2012. Collaborative Exploration with a Micro Aerial Vehicle: A Novel Interaction Method for Controlling a MAV with a Hand-held Device. Advances in Human-Computer Interaction 2012 (2012). DOI: http://dx.doi.org/10.1155/2012/768180

[8] D. Saakes, V. Choudhary, D. Sakamoto, M. Inami, and T. Igarashi. 2013. A Teleoperating Interface for Ground Vehicles Using Autonomous Flying Cameras. 2013 23rd International Conference on Artificial Reality and Telexistence (ICAT) (2013), 13–19. DOI: http://dx.doi.org/10.1109/ICAT.2013.6728900

[9] Daniel Szafir, Bilge Mutlu, and Terrence Fong. 2017. Designing Planning and Control Interfaces to Support User Collaboration with Flying Robots. The International Journal of Robotics Research 36, 5–7 (2017), 514–542. DOI: http://dx.doi.org/10.1177/0278364916688256

[10] Ryotaro Temma, Kazuki Takashima, Kazuyuki Fujita, Koh Sueda, and Yoshifumi Kitamura. 2019. Third-Person Piloting: Increasing Situational Awareness Using a Spatially Coupled Second Drone. Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology (2019), 507–519. DOI: http://dx.doi.org/10.1145/3332165.3347953

[11] S. Thrun, M. Beetz, M. Bennewitz, W. Burgard, A. B. Cremers, F. Dellaert, D. Fox, D. Hähnel, C. Rosenberg, N. Roy, J. Schulte, and D. Schulz. 2000. Probabilistic Algorithms and the Interactive Museum Tour-Guide Robot Minerva. The International Journal of Robotics Research 19, 11 (2000), 972–999. DOI: http://dx.doi.org/10.1177/02783640022067922

[12] Michael Walker, Hooman Hedayati, Jennifer Lee, and Daniel Szafir. 2018. Communicating Robot Motion Intent with Augmented Reality. Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction (2018), 316–324. DOI: http://dx.doi.org/10.1145/3171221.3171253

[13] Yang Wang, Huichuan Xia, Yaxing Yao, and Yun Huang. 2016. Flying Eyes and Hidden Controllers: A Qualitative Study of People's Privacy Perceptions of Civilian Drones in the US. Proceedings on Privacy Enhancing Technologies 2016, 3 (2016), 172–190. DOI: http://dx.doi.org/10.1515/popets-2016-0022

[14] H. A. Yanco and J. Drury. 2004. "Where am I?" Acquiring Situation Awareness Using a Remote Robot Platform. 2004 IEEE International Conference on Systems, Man and Cybernetics 3 (2004), 2835–2840. DOI: http://dx.doi.org/10.1109/ICSMC.2004.1400762

[15] Lillian Yang, Brennan Jones, Carman Neustaedter, and Samarth Singhal. 2018. Shopping Over Distance Through a Telepresence Robot. Proc. ACM Hum.-Comput. Interact. 2, CSCW (2018), 191:1–191:18. DOI: http://dx.doi.org/10.1145/3274460