Design and Implementation of Asynchronous Remote Support

Irene Reisner-Kollmann* and Andrea Aschauer†
University of Applied Sciences Upper Austria

*e-mail: irene.reisner-kollmann@fh-steyr.com
†e-mail: andrea.aschauer@fh-hagenberg.com

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

We present an AR-based system that enables spatial collaboration between multiple users. The main contribution is that on-site and remote users can work together asynchronously. On the one hand, mobile users can create a new project by documenting the environment on-site and saving this data for asynchronous processing. On the other hand, they can access previously saved data on-site in an augmented reality view, where annotations and other content are visualized directly on top of the real world. Desktop users can access the data remotely and have more sophisticated editing tools. Both mobile and desktop users can revisit, view and edit a collaboration project at any time. We show how the system can be set up with standard hardware and software modules and share details about the implementation.

Index Terms: Human-centered computing—Human computer interaction (HCI)—Interaction paradigms—Mixed / augmented reality; Human-centered computing—Collaborative and social computing—Collaborative and social computing systems and tools—Asynchronous editors

1 Introduction

For many services in industrial environments it is necessary to access a remote site multiple times and to consolidate information from different people. For example, consider the following workflow for the maintenance of a large machine at a customer site. First, a customer service representative gathers all information on-site, including pictures, videos and notes from the customer. Based on this information, employees in the back office determine what exactly will be repaired and create an offer for the customer. Additionally, they add further information such as manuals or specific instructions. Finally, a technician executes the maintenance and retrieves all information from the project. If the machine needs maintenance again at a later time, it is beneficial to access this information again.

We provide a system in which multiple users can access and edit spatial data at different times, with separate interfaces for mobile users on-site and desktop users at a remote location. The system is mainly based on annotations on 2D snapshots (see Fig. 1), but other content types are supported as well. For mobile users, the content is spatially registered to 3D landmarks. For desktop users, the content is attached to 2D images, as they do not have the full spatial information.

Figure 1: Screenshot of the mobile app, where the annotations are directly overlaid in instant mode.

We identified the following goals for our asynchronous remote support system:

• Exchange information between local and remote locations
• Create a documentation and exchange information over time
• Include existing data from other sources so that it is easily accessible on-site
• Keep the mobile app simple by moving time-consuming tasks to the desktop app

The main benefit of asynchronous collaboration is that not all participants have to attend simultaneously. Participants can conduct their work at their preferred time and have the opportunity to carefully plan and review their input. Another practical problem that demands asynchronous methods is the poor internet connection at some peripheral industrial sites.

In this work, we leverage standard hardware devices and software components to create a sophisticated collaboration system. This makes it possible to benefit from the system quickly, as it does not require costly investments or disruptive organizational changes.

2 Related Work

Remote collaboration using a live video stream and augmented reality has been an active research topic in recent years [1] and has also led to commercial applications [8, 9]. These systems usually connect a user with a mobile device to a remote expert. We use a very similar concept and transfer it to an asynchronous paradigm.

There are different variations of remote support systems [3, 6, 7]. Annotations can be stabilized in world coordinates or fixed to the video frame. They can be persistent or vanish after a short time. The expert's view is either freely selectable or bound to the current view of the local user.

In one of the first papers, Gauglitz et al. [4] present a remote collaboration system for mobile devices. They coarsely reconstruct the 3D scene in the remote application in order to place 3D annotations correctly. The remote user can select any of the previous viewpoints provided by the local user.

A comprehensive review of collaboration techniques in augmented and virtual reality is given by Ens et al. [2]. They analyze mixed reality systems according to concepts from Computer Supported Cooperative Work (CSCW) research. The vast majority of the analyzed papers, namely 95%, focus on synchronous collaboration.

Irlitti et al. [5] describe theoretical considerations and challenges for asynchronous collaboration using augmented reality. They highlight the importance of comprehensive information from users and of the spatial and temporal organization of the data. In contrast to our work, they mainly focus on local collaboration rather than on remote users.

3 Design Considerations

Figure 2: Overview of the collaboration system: mobile and desktop users share the same content on the server.

Fig. 2 shows the structure of our asynchronous collaboration system. All users access the same data on the server. All data is organized in anchors, which contain a 3D pose, a 2D snapshot and the content itself. Only mobile users can create new anchors, because the 3D information from the augmented reality framework is needed. Deleting anchors is only possible for desktop users, because we want to keep the mobile app as simple as possible. A sketch of this data model is given below.
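To make the anchor-centric data model concrete, the following C# sketch shows how such a structure could look in a Unity project. It is a minimal illustration under our own assumptions, not the paper's exact implementation; all class and field names (ProjectAnchor, AnchorContent, etc.) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using UnityEngine;

// Hypothetical content categories supported by the system (see Sect. 3.1).
public enum ContentType { Annotation, Text, Audio, Video, Document }

[Serializable]
public class AnchorContent
{
    public ContentType type;
    public string file;   // path or link to the stored file (image, audio, video, document)
    public string text;   // plain text, only used for text annotations (see Sect. 3.1)
}

// One anchor bundles a 3D pose, a 2D snapshot and the attached content.
[Serializable]
public class ProjectAnchor
{
    public string id;
    public Vector3 position;        // 3D pose: position ...
    public Quaternion rotation;     // ... and orientation
    public string snapshotFile;     // 2D snapshot of the scene at creation time
    public List<AnchorContent> contents = new List<AnchorContent>();
}
```

Marking the classes as [Serializable] means the same structures can later be written to JSON files, which matches the storage approach described in Sect. 4.2.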
3.1 Content types

The most important content type is 2D annotations. Both mobile and desktop users can edit these annotations with a standard drawing tool on top of a 2D snapshot of the scene. Annotations are stored as separate images with a transparent background. They allow users to highlight specific objects in the scene or to scribble additional instructions.

A special form of 2D annotation is text. Its visualization is similar to other 2D drawings, but it is important to store it as plain text to support editing and searching.

Audio content allows the mobile user to capture information easily and quickly. Video is useful to provide an overview of the whole scene or to include external data such as instructional videos. Documents allow users to attach further information, e.g. the user guide for a specific machine. Documents are basically stored as links to external files, i.e. they can contain almost any type of data. The only prerequisite is that a suitable viewer is installed on all devices.

The system can easily be extended to other content types. We would like to incorporate interactive elements such as animated instructions or checklists in the future.

3.2 Mobile app

The mobile app is used for creating, viewing and editing content on-site. We currently support handheld devices, but other AR displays such as head-mounted displays could be used as well. Fig. 3 shows the main user interface of the mobile app.

Figure 3: Screenshot of the mobile app in icon mode.

We use the standard augmented reality view, i.e. the user sees the live video of the device's camera together with the AR annotations. The information from the AR framework, such as points and planes, can optionally be visualized. This gives the user a better understanding of whether the scene has been captured well enough. If it distracts the user from the actual information, it can simply be hidden.

As noted before, our main content type is 2D annotations. There are two modes for visualizing them in the mobile AR view:

Instant mode: As shown in Fig. 1, the 2D annotations are directly overlaid on the live camera image by rendering them as quads at the anchor's pose in three-dimensional space. Multiple annotations can be shown at the same time.

Icon mode: An icon is shown at the 3D location of the anchor. When the user taps the icon, the original 2D snapshot and the annotations are shown full-screen. This mode is especially useful in case of poor registration (see Sect. 4.1), because direct overlays are then not accurate enough. It also allows the user to see whether something in the scene has changed compared to a previous session.

Other content types are always indicated by an icon in the 3D scene and are opened by tapping the icon. We use standard software for showing videos and documents.

The user can create new anchors by tapping one of the buttons at the bottom (see Fig. 3) or by tapping directly onto the video stream. The 3D position of the anchor is computed by raycasting against the reconstructed 3D points and planes from the augmented reality framework, as sketched below. Optionally, the user can create and edit 2D annotations on the current snapshot with a simple drawing tool. Besides drawing lines with the fingertip, the user can select a color and clear the whole screen.
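With ARFoundation (see Sect. 4), this anchor placement can be realized with an AR raycast against the detected planes and feature points. The following sketch shows one possible implementation; the component structure, the serialized fields and the snapshot step are our assumptions, while the API names (ARRaycastManager.Raycast, TrackableType, ARAnchorManager.AddAnchor) follow ARFoundation as of roughly version 4 and vary between releases.

```csharp
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

public class AnchorPlacement : MonoBehaviour
{
    [SerializeField] private ARRaycastManager raycastManager;
    [SerializeField] private ARAnchorManager anchorManager;

    private static readonly List<ARRaycastHit> hits = new List<ARRaycastHit>();

    void Update()
    {
        if (Input.touchCount == 0) return;
        Touch touch = Input.GetTouch(0);
        if (touch.phase != TouchPhase.Began) return;

        // Raycast against the planes and feature points reconstructed
        // by the AR framework to find a 3D position for the new anchor.
        if (raycastManager.Raycast(touch.position, hits,
                TrackableType.PlaneWithinPolygon | TrackableType.FeaturePoint))
        {
            // The closest hit defines the 3D pose of the new anchor.
            Pose pose = hits[0].pose;
            ARAnchor anchor = anchorManager.AddAnchor(pose);
            // ...capture the 2D snapshot and attach content to the new anchor here.
        }
    }
}
```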
While our system is designed to connect on-site and remote users, it is not mandatory to have a remote user. The mobile app can also be used on its own and allows users to view and edit the data on-site multiple times.

3.3 Desktop app

The desktop app is aimed at remote users. They can review the data about the object or environment to work on, which has been provided by the mobile users. Furthermore, they can edit the data and provide additional information. Fig. 4 shows the main parts of the user interface for remote users.

Figure 4: Screenshot of the desktop app.

The desktop app visualizes the data only in 2D. Therefore, the 2D snapshots are the main point of interaction. The user can flip through all 2D snapshots. Annotations are always overlaid if they are present. Other content types can be opened with buttons.

Users can edit the 2D annotations with a drawing tool similar to the one in the mobile app. They can add any number of additional content items and assign them to one of the existing anchors. It is not possible to create new anchors in the desktop app, because the 3D information is not available.

Usually, the goal on-site is to capture as much information as possible, because it is often difficult to access the site again. The desktop app therefore offers various ways of organizing large amounts of data. A rating mechanism makes it possible to optionally hide less important anchors. The remote user also has the possibility to delete specific content or complete anchors.

4 Implementation Details

The system is implemented in Unity 3D, which has the benefit that many parts can be shared between the mobile and the desktop app. We use Unity's integrated augmented reality framework ARFoundation. ARFoundation provides a unified API for ARCore and ARKit and thus supports both Android and iOS. ARFoundation is also necessary for using Microsoft's Azure Spatial Anchors in Unity.

4.1 Registration

An important factor for the AR application is registering the current session to the existing spatial data. We provide three different approaches:

Cloud anchors: Azure Spatial Anchors are created by capturing the surrounding scene and can be shared with other Android, iOS and HoloLens devices. When the scene is viewed again, the pose of the anchor is re-established without user interaction. This approach fails if there were too many changes in the scene. In our system, we use one Azure spatial anchor per project.

Marker: Aligning the scene to a known marker is supported by all established AR frameworks. Usually, it is not possible to permanently mount markers at a customer's site. In this case, the marker location can be inferred from screenshots of previous sessions. The same printed marker can be used for each support case, and the user positions it in the scene again according to the screenshot.

Manual registration: The user selects a screenshot from a previous session and manually aligns the current camera view with it. This procedure is clearly very inaccurate. We included it as a fallback for when there is no internet connection and markers have not been used in previous sessions.

None of these registration methods is absolutely free from defects. If the registration is not good enough, the annotations will be overlaid on wrong objects. In this case it is recommended to fall back to the icon mode and show annotations only on top of the original snapshot.
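As an illustration of the cloud-anchor approach, the sketch below shows how a single per-project anchor could be saved and later re-located. The class and method names follow the public samples of the Azure Spatial Anchors Unity SDK, but the wrapper class, the event handling and the overall flow are our assumptions, not necessarily the paper's implementation.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.SpatialAnchors;
using Microsoft.Azure.SpatialAnchors.Unity;
using UnityEngine;

public class ProjectRegistration : MonoBehaviour
{
    [SerializeField] private SpatialAnchorManager anchorManager;

    // Saves one cloud anchor for the whole project and returns its identifier,
    // which can be stored in the project data on the server.
    public async Task<string> SaveProjectAnchorAsync(CloudNativeAnchor nativeAnchor)
    {
        await anchorManager.StartSessionAsync();
        await nativeAnchor.NativeToCloud();                  // wrap the local AR anchor
        CloudSpatialAnchor cloudAnchor = nativeAnchor.CloudAnchor;
        await anchorManager.CreateAnchorAsync(cloudAnchor);  // capture scene + upload
        return cloudAnchor.Identifier;
    }

    // In a later session, the stored identifier is used to re-establish the pose
    // without user interaction.
    public async Task LocateProjectAnchorAsync(string anchorId)
    {
        await anchorManager.StartSessionAsync();
        anchorManager.AnchorLocated += (sender, args) =>
        {
            if (args.Status == LocateAnchorStatus.Located)
            {
                // Align the project's anchors to the recovered pose here.
            }
        };
        var criteria = new AnchorLocateCriteria { Identifiers = new[] { anchorId } };
        anchorManager.Session.CreateWatcher(criteria);
    }
}
```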
4.2 Data storage

An important request from industrial enterprises is that data can easily be shared with other software systems. Therefore, we decided to store all content in conventional data formats. For example, all 2D snapshots and annotations are stored as individual image files. The same applies to audio, video and documents. Additional data from our system is stored in JSON files. This approach makes it easy to create importers for other software systems or to view the data directly with off-the-shelf software.
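Unity's built-in JsonUtility is one straightforward way to produce such JSON files. The sketch below serializes the hypothetical ProjectAnchor class from Sect. 3; it illustrates the storage approach rather than the exact file layout used in the system.

```csharp
using System.IO;
using UnityEngine;

public static class AnchorStorage
{
    // Writes the anchor metadata as a JSON file next to the snapshot and
    // annotation images, so other tools can parse it without our software.
    public static void Save(ProjectAnchor anchor, string directory)
    {
        string json = JsonUtility.ToJson(anchor, prettyPrint: true);
        File.WriteAllText(Path.Combine(directory, anchor.id + ".json"), json);
    }

    // Reads the metadata back; the referenced media files stay ordinary
    // image, audio and video files on disk.
    public static ProjectAnchor Load(string path)
    {
        return JsonUtility.FromJson<ProjectAnchor>(File.ReadAllText(path));
    }
}
```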
5 Conclusions and Future Work

We presented a system that allows on-site and remote users to exchange information about a specific site. Mobile users can see the information directly overlaid on the real object of interest in an augmented reality view. Remote users have a 2D view of the data and more ways of adding additional data. We showed that the system can be implemented simply with standard hardware. We think that asynchronous remote collaboration has great potential for improving collaboration on visual data over distance and time.

In the future we plan a detailed evaluation of the system. We would like to compare it to traditional data sharing and to synchronous remote support systems.

An obvious extension will be the support of head-mounted displays. Another big improvement will be to make 3D data accessible to the remote user. The remote user might then get a better overview and would have the possibility to create new anchors. An important topic will also be large scene changes between sessions and the handling of incorrect scene registration.

Acknowledgments

This work was funded by the FFG under the project Mixed Reality Based Collaboration 4 Industry (Collective Research).

References

[1] R. A. J. de Belen, H. Nguyen, D. Filonik, D. Del Favero, and T. Bednarz. A systematic review of the current state of collaborative mixed reality technologies: 2013–2018. AIMS Electronics and Electrical Engineering, 3(2):181–223, 2019.

[2] B. Ens, J. Lanir, A. Tang, S. Bateman, G. Lee, T. Piumsomboon, and M. Billinghurst. Revisiting collaboration through mixed reality: The evolution of groupware. International Journal of Human-Computer Studies, 131:81–98, 2019.

[3] O. Fakourfar, K. Ta, R. Tang, S. Bateman, and A. Tang. Stabilized annotations for mobile remote assistance. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pp. 1548–1560, 2016.

[4] S. Gauglitz, C. Lee, M. Turk, and T. Höllerer. Integrating the physical environment into mobile remote collaboration. In Proceedings of the 14th International Conference on Human-Computer Interaction with Mobile Devices and Services, pp. 241–250, 2012.

[5] A. Irlitti, R. T. Smith, S. Von Itzstein, M. Billinghurst, and B. H. Thomas. Challenges for asynchronous collaboration in augmented reality. In 2016 IEEE International Symposium on Mixed and Augmented Reality (ISMAR-Adjunct), pp. 31–35. IEEE, 2016.

[6] A. Jakl, L. Schöffer, M. Husinsky, and M. Wagner. Augmented reality for industry 4.0: Architecture and user experience. In FMT, pp. 38–42, 2018.

[7] S. Kim, M. Billinghurst, and G. Lee. The effect of collaboration styles and view independence on video-mediated remote collaboration. Computer Supported Cooperative Work (CSCW), 27(3-6):569–607, 2018.

[8] Microsoft. Remote Assist. https://dynamics.microsoft.com/de-de/mixed-reality/remote-assist/.

[9] PTC. Vuforia Chalk. https://chalk.vuforia.com/.