=Paper=
{{Paper
|id=Vol-2903/IUI21WS-HAIGEN-5
|storemode=property
|title=avatars4all
|pdfUrl=https://ceur-ws.org/Vol-2903/IUI21WS-HAIGEN-5.pdf
|volume=Vol-2903
|authors=Eyal Gruss
|dblpUrl=https://dblp.org/rec/conf/iui/Gruss21
}}
==avatars4all==
Eyal Gruss, Tel-Aviv, Israel (eyalgruss@gmail.com, eyalgruss.com)

Joint Proceedings of the ACM IUI 2021 Workshops, April 13–17, 2021, College Station, USA. © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

===Abstract===
We present an environment [1] for running First Order Motion Model [2], using a live webcam feed, in the browser over Google Colaboratory. This allows novice users to experience almost real-time live head puppeteering, or so-called "deep fake avatars", with no need for dedicated hardware, software installation or technical know-how. A rich GUI allows extensive control of model and media options, as well as some unique innovations including fast auto-calibration and a Muppets generator [3]. This, and other accompanying notebooks, serve in practice as educational, creative and activist tools.

Keywords: deepfakes, avatars, Google Colab

===1. Main===
With the advance of the Coronavirus pandemic in the beginning of 2020, the majority of human social activity has been forced online to the virtual realm. Only a few months earlier, First Order Motion Model (FOMM) [2] was released, introducing the ability of one-shot video-driven image animation. It was soon followed by Avatarify [4], a real-time environment for FOMM that enables "Avatars for Zoom, Skype and other video-conferencing apps". Is the time ripe to claim the once promised cybernetic utopia? Could we at last shed our physical shells and be whoever we want to be in Zoom-space?

"People come to the Oasis for all the things they can do, but they stay because of all the things they can be: tall, beautiful, scary, a different sex, a different species, live action, cartoon, it’s all your call." [Ready Player One film, 2018]

The repository [1] contains a few Colab notebooks that attempt to make the technology accessible for all. Requiring only a browser and a Google account, these notebooks can be operated with one click ("run all"). However, they are also flexible tools, allowing users to use and manipulate their own selected media. The live webcam environment is based on WebSocket, similar to [5]. To the author’s best knowledge, it is the fastest purely online solution for live FOMM avatars, as well as one of very few real-time webcam Colab implementations.

The GUI in figure 1 shows a multitude of controls for zooming, calibration, switching between avatars, generating new avatars, and various model and display parameters. A novel fast auto-calibration mode, which works in real time, finds the best alignment between driver and avatar based on model keypoints (rather than facial landmarks). Following Avatarify [4], which inspired this project, the user can generate new avatars based on the StyleGAN "This Person Does Not Exist" [6] website. Taking the idea further, one can also generate avatars specifically of men, women, boys, girls [7], Waifus [8], Fursonas [9] and Muppets [3], the latter developed especially for this project by Doron Adler, in collaboration with the author. One can also drag and drop local or web images on the GUI to upload new avatars, as inspired by [10]. Other innovations include: an exaggeration factor slider to lever stronger keypoint motions; an option to take your own snapshot and puppeteer it, reminiscent of Nvidia Maxine [11], which may help in understanding the mechanism; an optional post-process step in the pipeline for offline videos, using Wav2Lip [12] following FOMM, to fix the lip sync; and combining Wav2Lip with speaker diarization for automatic animated skit creation from audio ("Wav2Skit").

Figure 1: GUI for live webcam avatar in Colab. The author (left) is puppeteering a generated Muppet.

These tools were the basis for several workshops and tutorials at international festivals and conferences in 2020, including Suoja/Shelter, South Africa NAF, ADAF, Reclaim Futures, Fubar, ISEA, Technarte, EVA London, Piksel, Stuttgarter Filmwinter, Dorot-Con and MozFest [1]. They are now being introduced in elementary and middle schools in Israel with the Pisga-Cyber excellence program [13]. This is a pleasantly surprising first real-world usage of the described system.

===Broader impact and ethical implications===
This is a dangerous time. The ability to synthesize and manipulate media is improving by the day: in the quality of outcome, in the mediums, modalities and conditions dealt with, in the required compute and data resources, and in the availability and accessibility of the technology. We are in the midst of a transition period, where these facilities are still accessible mostly to the tech savvy and to those with the means to hire them. It may not be long before we have ubiquitous and seamless smartphone apps that can create perfect deep fakes. However, it is the author’s opinion that precisely in this interim, it is imperative to liberate and democratize the technology.

The advancement of technology cannot be stopped. AI and synthetic media, like electricity, fire and other technologies, can be used for good and for bad. They can be used both to infringe one’s privacy and to protect one’s privacy. They can be used to bully and to harass, or to promote self-expression and self-acceptance. Fake news is not a new problem. Blood libels have existed throughout the last millennia. Photomontage technology has been used to fake photographs as early as 1857. Videos are harder to fake, but Hollywood, Disney and government agencies have been doing so for the last century. Contemporary examples show that it is enough to change the label on an image, or slightly edit an audiovisual recording, to achieve a strong effect. The solution to combat this is in education. Making the technology accessible to educators, artists, journalists as well as the general public will serve to raise awareness, healthy skepticism and critical thinking toward media and the spectrum of contemporary possibilities in media creation and manipulation.

===References===
[1] E. Gruss, avatars4all, 2020. URL: https://github.com/eyaler/avatars4all.

[2] A. Siarohin, S. Lathuilière, S. Tulyakov, E. Ricci, N. Sebe, First order motion model for image animation, in: H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, R. Garnett (Eds.), Advances in Neural Information Processing Systems 32, Curran Associates, Inc., 2019, pp. 7137–7147. URL: https://aliaksandrsiarohin.github.io/first-order-model-website.

[3] D. Adler, E. Gruss, This muppet does not exist, 2020. URL: https://thismuppetdoesnotexist.com.

[4] A. Aliev, K. Iskakov, Avatarify, 2020. URL: https://github.com/alievk/avatarify.

[5] a2kiti, Webcam google colab, 2020. URL: https://github.com/a2kiti/webCamGoogleColab.

[6] 2020. URL: https://thispersondoesnotexist.com.

[7] 2020. URL: https://fakeface.rest.

[8] 2020. URL: https://www.thiswaifudoesnotexist.net.

[9] 2020. URL: https://thisfursonadoesnotexist.com.

[10] 2020. URL: https://terryky.github.io/tfjs_webgl_app/face_landmark.

[11] 2020. URL: https://developer.nvidia.com/MAXINE.

[12] K. R. Prajwal, R. Mukhopadhyay, V. Namboodiri, C. V. Jawahar, A lip sync expert is all you need for speech to lip generation in the wild, 2020. URL: http://bhaasha.iiit.ac.in/lipsync. arXiv:2008.10010.

[13] 2020. URL: https://pisgacyber.co.il.
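
Illustrative note on the exaggeration factor: the main text mentions a slider that levers stronger keypoint motions. The sketch below shows one plausible reading of such a control, assuming the common relative-motion scheme in which the avatar's keypoints are displaced by the driver's movement since a reference (calibration) frame. The function name `exaggerate_keypoints`, its parameters, and the 2D tuple representation are hypothetical and are not taken from the avatars4all repository or from the FOMM code base.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def exaggerate_keypoints(
    source: List[Point],
    driving_initial: List[Point],
    driving_current: List[Point],
    factor: float = 1.0,
) -> List[Point]:
    """Move each avatar keypoint by the driver's displacement, scaled by `factor`.

    Assumption (hypothetical, for illustration only): keypoints are 2D points
    in a shared normalized coordinate frame, and motion is transferred
    relative to a reference driver frame (`driving_initial`).
    factor == 1.0 transfers the motion unchanged, factor > 1.0 exaggerates it,
    and factor == 0.0 leaves the avatar still.
    """
    if not (len(source) == len(driving_initial) == len(driving_current)):
        raise ValueError("all keypoint lists must have the same length")

    moved = []
    for (sx, sy), (ix, iy), (cx, cy) in zip(source, driving_initial, driving_current):
        dx, dy = cx - ix, cy - iy  # driver displacement since the reference frame
        moved.append((sx + factor * dx, sy + factor * dy))
    return moved
```

With a factor of 2.0, a driver keypoint that moved by 0.1 to the right since calibration would shift the matching avatar keypoint by 0.2; whether the actual notebooks scale motion in exactly this way is not stated in the paper.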