=Paper=
{{Paper
|id=Vol-3217/paper7
|storemode=property
|title=Drawcto: A Multi-Agent Co-Creative AI for Collaborative Non-Representational Art
|pdfUrl=https://ceur-ws.org/Vol-3217/paper7.pdf
|volume=Vol-3217
|authors=Manoj Deshpande,Brian Magerko
|dblpUrl=https://dblp.org/rec/conf/aiide/DeshpandeM21
}}
==Drawcto: A Multi-Agent Co-Creative AI for Collaborative Non-Representational Art==
Manoj Deshpande, Brian Magerko
Georgia Institute of Technology
Atlanta, GA 30308, USA
{manojdeshpande, magerko}@gatech.edu
Abstract

Non-representational art, such as works by Wassily Kandinsky, Joan Mitchell, Willem de Kooning, etc., showcases diverse artistic expressions and challenges viewers with its interpretive open-endedness and lack of a clear mapping to our everyday reality. Human cognition and perception nonetheless aid us in making sense of, reasoning about, and discussing the perceptual features prevalent in such non-representational art. While there have been various Computational Creativity systems capable of generating representational artwork, only a few existing Computational (Co)Creative systems for visual arts can produce non-representational art. How would a co-creative AI that incorporates elements of human visual perception theory be able to collaborate with a human in co-creating non-representational art? This paper explores this challenge in detail, describes potential machine learning and non-machine-learning approaches for designing an AI agent, and introduces a new web-based, multi-agent AI drawing application, called Drawcto, capable of co-creating non-representational artwork with human collaborators.

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Playing video games can be a highly creative activity, requiring individuals to engage in creative behaviors like content creation, collaborative building, problem solving, etc. (Green and Kaufman 2015; Blanco-Herrera, Gentile, and Rokkum 2019). A model of creativity within AI agents may support new forms of creative gameplay and new applications of AI in game spaces. Inspired by this potential, we focus on exploring a specific creative interaction modality that has its roots in popular sketch-based games like Pictionary or web games like Skribbl.io (mel 2011). Previous research with these games has been limited to the training and development of computationally creative agents (Bhunia et al. 2020; Sarvadevabhatla et al. 2018); our aim is to develop a co-creative system for co-creating non-objective visual art that seeks to invoke the properties of human-computer co-creativity in ways applicable to the study, creation, and play of digital and analog games involving creative aspects.

Computational Creativity (CC) in the visual arts has gained attention since the early days of AARON (Cohen 1995). Over the years, researchers have developed various algorithms and CC systems that are capable of (co)creating artworks in a specific artistic style (Gatys, Ecker, and Bethge 2016), identifying and suggesting conceptually or visually similar objects (Karimi et al. 2020), producing strokes or pixels to complete a sketch or an image semantically (Su et al. 2020; Iizuka, Simo-Serra, and Ishikawa 2017), etc. While many of these approaches have examined creating representational artwork (e.g., realistic or impressionistic presentations of real-world scenes and objects), little work has been done in exploring how a more abstract, non-representational work could be created by (or in collaboration with) an AI, inspired by human cognition and perception, that can discuss the intention behind specific features and the composition of the artwork with a human collaborator. In an attempt to address this, we present Drawcto, a web-based, multi-agent AI drawing application capable of co-creating and discussing specific features of non-representational art with a human collaborator.

People often use "abstract art" and "non-representational art" interchangeably to refer to the same painting style, yet there are crucial differences between the two terms (Ashmore 1955). Abstract artwork distorts the view of a familiar subject (i.e., a thing, face, body, place, etc.). For example, Picasso distorts a person's face to show different views of the same figure within a single painting. The resulting artwork appears abstracted, but there are still discernible features and structures intact from the original subject. Figures 1a and 1b show examples of abstract art. Non-representational art, on the other hand, does not have a known object or thing that the artwork is trying to depict. In non-representational art (also known as non-objective art), the artist uses only visual design elements like form, shape, color, line, etc., to express themselves.

Non-representational art represents the spiritual, mystic, non-materialistic, experiential, or creative painting/thought process of the artist (Fingesten 1961), making it challenging to appreciate, contextualize, or understand. For example, Kandinsky's non-objective compositions represent his emotional experience of listening to music; Mondrian's paintings, which contain only straight lines and primary colors, represent "what is absolute, among the relativity of time and space" (Wallis 1960); and Pollock's artworks represent the action-painting process, in other words, they depict the forces that lead to their creation.
Figure 1: (From left to right) 1a: Picasso's painting (Picasso 1932), 1b: Klee's painting (Klee 1922), 1c: Rangoli designs (Balaji 2018), 1d: Joan Mitchell's composition (Bracket 1989)
Figures 1c and 1d show examples of non-representational art. Non-representational art generally is not preconceived; instead, it emerges from the artist's in-the-moment interaction with the medium, a reflection-in-design process (Schön 1983).

Generating visually sensible content in such a dynamic scenario is the main challenge for developing an AI agent for co-creating non-representational art. We cannot simply train the agent to use object detection or classification to make sense of and generate new strokes, as there are usually no recognizable objects. At the same time, we cannot generate random strokes, as they would not be visually sensible. Therefore, developing an AI that can create various strokes based on its perceptual ability to understand and reason with the quality of strokes made by the human collaborator is the challenge we address in this research.

We utilize perceptual organization theory (or Gestalt theory) for the agent(s) to make sense of and generate new strokes while co-creating a non-representational artwork. Gestalt theory describes a finite set of rules that guide and aid the reasoning of our visual system. Some of the Gestalt grouping principles are proximity, balance, continuity, similarity, etc. (Arnheim 1957). Previously, researchers have used perceptual organization theory for various applications like image segmentation, contour detection, shape parsing, etc. In this paper, we present work that attempts to circumnavigate the "authoring bottleneck" commonly associated with co-creative systems (Csinger, Booth, and Poole 1995) by using perceptual theories (like Gestalt) both to bootstrap various learning/non-learning approaches to collaborative sketching and as a basis for affording AI explainability.

We have organized the paper as follows. We examine potential learning and non-learning approaches for developing an AI agent in Related Work. The System Design section describes the current version of Drawcto and explains each component in detail. In the Discussion section, we reflect on the present drawbacks of the three drawing agents. Finally, we share potential future avenues of research we have identified for Drawcto in Future Work.

Related Work

In recent years, research on developing image/sketch-generation AI has gained a lot of interest. As a result, many research projects and AI architectures have explored image generation from various perspectives and for multiple reasons, like co-creating, sketch-based image retrieval, image completion, design ideation, image stylizing, etc. This literature review focuses on diverse learning and non-learning approaches to stroke generation for abstract or non-representational art.

Recurrent Neural Network (RNN)

The Sketch-RNN model (Ha and Eck 2017) is a sequence-to-sequence variational autoencoder (VAE) that has inspired and informed various co-creative drawing systems. Some examples in recent years are Collabdraw (Fan, Dinculescu, and Ha 2019), DuetDraw (Oh et al. 2018), Suggestive Drawing (Alonso 2017), etc. The Sketch-RNN model is trained on the QuickDraw dataset (Jongejan et al. 2016) and has learned to express images as short sequential vector strokes. The QuickDraw dataset is a collection of labeled sketches drawn in under 20 seconds for a selected object category. Facilitated by QuickDraw, the Sketch-RNN model can produce semantically meaningful strokes. In Collabdraw, for example, the user and AI collaborate by taking turns to finish a semantically accurate sketch. In projects like DuetDraw and Suggestive Drawing, the capability of the Sketch-RNN model is enhanced by combining it with other features like completing a drawing, transforming an image, doing style transfer, recommending empty space, etc. RNNs are great for co-creating with line drawings; however, the main challenge while using an RNN is that the training data needs to be sequential vectors, i.e., we cannot directly use images to train. We have developed two agents for Drawcto using this approach.

Generative Adversarial Network (GAN)

Similar to Sketch-RNN, another influential model is the GAN (Goodfellow et al. 2020). In a GAN, two neural networks, a generator and a discriminator, compete with each other to make predictions. The role of the generator is to produce output that highly resembles the actual data, and the role of the discriminator is to identify the artificially created data. Some of the research projects based on GANs are Sketch-GAN (Liu et al. 2019), Doodler GAN (Ge et al. 2021), interactive image-to-image translation using GAN (Isola et al. 2018), etc. GANs are used in various ways; for example, in Sketch-GAN, the model generates strokes for missing parts of the image.
In Doodler GAN, it is used to semantically generate strokes to co-create a surreal creature or a bird. In image-to-image translation (Iizuka, Simo-Serra, and Ishikawa 2017), GAN is used for edge detection, style generation, etc. Building on GANs, Elgammal et al. proposed the CAN (creative adversarial network) (Elgammal et al. 2017) to generate artworks deviating from existing artistic styles, resulting in non-representational art with varying degrees of complex textures and compositions. The generative capabilities of GANs are very inspiring; since a GAN works with pixels, we can use images to train the network, and it can also be very efficient for style transfer.

Transformers

A transformer is a sequence-to-sequence model that uses an attention mechanism to identify important context, helping it provide better results than an RNN (Vaswani et al. 2017). Based on the Transformer, researchers have developed an open-source ML framework called BERT (Bidirectional Encoder Representations from Transformers), which helps computers 'understand' the meaning of a word/phrase in the input text by using the surrounding text to create context (Devlin et al. 2019). BERT is a pre-trained model which can be fine-tuned using a question-and-answer dataset; researchers have utilized BERT in various text-to-sketch projects. In CalligraphyGAN (Zhuo, Fan, and Wang 2020), the authors combine BERT and a conditional GAN to create abstract artworks representing a set of Chinese characters given as input. In Sketch-BERT (Lin et al. 2020), the model learns representations that capture the sketch gestalt. The dual language-image encoder model CLIP (Contrastive Language-Image Pre-training) (Radford et al. 2021), which uses a vision transformer (Dosovitskiy et al. 2021), has inspired a whole range of drawing-related projects. For example, in CLIPDraw (Frans, Soros, and Witkowski 2021), the agent produces a set of vector strokes in diverse artistic styles satisfying a text input; Fernando et al. combine a dual encoder (Fernando et al. 2021) similar to CLIP with a neural L-system to produce abstract images corresponding to the input text. The dependency on text makes using the transformer a challenge in the context of co-creation; nevertheless, transformer-based models are potent models for image/sketch generation.

Reinforcement Learning

Many research projects deal with learning stroke generation using reinforcement learning (RL). Researchers typically train an agent by letting it interact with a simulated painting environment. The painting environment can be continuous (e.g., SPIRAL (Ganin et al. 2018), Improved-SPIRAL (Mellor et al. 2019), etc.) or differentiable (e.g., StrokeNET (Zheng, Jiang, and Huang 2018), Neural Painter (Nakano 2019), etc.). As a result of learning in this simulated environment, the RL agent learns to produce strokes and abstract artworks. Another approach to training an RL agent is limiting the number of strokes used to represent an object. For example, in Pixelor (Bhunia et al. 2020), the agent is involved in a Pictionary-like game with a human to learn the optimal stroke sequence to represent an object. Similarly, Huang et al. use RL to train an agent to paint like humans with only a small number of strokes (Huang, Zhou, and Heng 2019). Interactive learning is another approach for training an RL agent, as utilized in Drawing Apprentice (Davis et al. 2015). In Drawing Apprentice, the AI agent analyzes the user's input strokes, recognizes drawn objects, and responds with complementary strokes. RL provides various potent methods for training an agent, which we hope to experiment with in the future.

Non-Learning Approaches

There are many non-learning approaches to generating strokes for abstract/non-representational art. For example, AARON (McCorduck 1991) is an intricately authored rule-based AI developed by artist Harold Cohen. Similarly, The Painting Fool (Colton 2012) system emulates a human painter and can describe its artwork through textual descriptions following a set of rules. Drawing Apprentice also has a rule-based AI component to respond to the user's strokes by tracing, replication, or transformation. Another approach to stroke generation is using a shape grammar (Stiny 2006). For example, in Broadened Drawspace (Gün 2017), the user engages in a visual-making process with a shape-grammar-based generative system. Stroke-based rendering (SBR) methods also provide many algorithms for stroke generation. SBR algorithms are search or trial-and-error algorithms designed to optimize stroke placement by minimizing an energy function (Hertzmann 2003) or other optimization goals like the number of strokes. The generative capabilities of rule-based systems like AARON inspired us to create a rule-based agent for Drawcto.

Summary

The related research above highlights the following gaps in existing systems for (co)creating abstract artistic images. All the learning approaches are black-box approaches. The AI agent cannot justify why a certain stroke in a specific location and particular style (color, width, length, weight, etc.) makes sense for the entire composition. Even with rule-based or shape-grammar approaches, the agent cannot convey its perception of the composition as a whole. In other words, existing agents cannot reason about the visual design or justify their actions while creating the artwork.

Barring Drawing Apprentice, none of the CC systems are capable of co-creating non-representational art. But a limitation of Drawing Apprentice is that it does not have any perceptual knowledge bootstraps; it takes a black-box learning approach to train the agent, which results in it not being able to discuss its intention with a human collaborator. Even among generative systems, only a few research projects focus on creating non-representational art, and almost none discuss the intent behind the composition. It is also interesting to note that the existing systems are either single-agent or multi-feature systems.

The shortcomings mentioned above informed us to develop Drawcto, a multi-agent system for co-creating non-representational artwork, which can explain its actions based on the current state of the canvas. In the following section, we discuss the system design and various agents' logic for the current prototype of Drawcto.
Figure 2: Drawcto system design
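Figure 2 sketches this multi-agent design. As a rough illustration of what such a design implies (function and field names here are our own, not Drawcto's published code), each agent can be modeled as a function from canvas state to a stroke plus a textual justification, registered in a table so agents can be added or removed independently:

```python
# Illustrative sketch only: names and payload shapes are our assumptions,
# not Drawcto's actual API. Each agent maps the current canvas state to a
# stroke and a human-readable justification for that stroke.

def rule_based_agent(canvas_state):
    # A real agent would extract perceptual features with OpenCV here;
    # this stub only reacts to whether the last shape was left open.
    if canvas_state.get("open_contours", 0) > 0:
        return {"stroke": "close_shape",
                "why": "Your last shape was open, so I closed it."}
    return {"stroke": "new_mark",
            "why": "The canvas looked balanced, so I added a new mark."}

# Registry of available agents; modularity means adding or removing an
# entry here is enough to enable or disable an agent.
AGENTS = {"rule_based": rule_based_agent}

def respond(agent_name, canvas_state):
    """Dispatch one co-creative turn to the selected agent."""
    return AGENTS[agent_name](canvas_state)
```

In a system like the one described next, a dispatch function of this kind would sit behind the Flask HTTP endpoint that the P5.js canvas calls at the end of each human turn.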
Drawcto System Design

We developed Drawcto as an easily accessible web-based application, with the graphical interaction happening through a P5.js canvas. The canvas (frontend) communicates with the Python server (backend) using HTTP methods. The backend is responsible for the different AI agents' logic. We developed the Python server using the micro web framework Flask. Currently, we are hosting the application on Heroku. As shown in Figure 2, the user draws strokes and selects an agent on the interface; in response, the AI responds with its strokes and a textual description of its stroke intent. We have developed the backend in a modular manner, allowing us to add or remove an agent based on its performance. In the following subsections, we describe the user interface and the AI agents in detail.

Drawcto UI

We designed all of the UI features to foster a human-like collaboration between an AI and a human. To anthropomorphize the AI, we named our AI avatar "Dr. Drawctopus" and created a vector image of a cyborg octopus to represent the AI. We chose an octopus with the idea that each tentacle will correspond to a different agent, symbolically conveying that each agent is part of the same system.

Prior research (Davis et al. 2016) shows that collaboration can be improved by giving the AI character a permanent screen presence and by dynamically drawing the strokes generated by the AI. Hence, we permanently show the AI avatar and its name on the UI, and we animated a visual glyph representing the hand of Drawcto moving along the stroke.

Creating non-representational art requires time for reflection-in-design, and turn-taking interaction can facilitate this process. But turn-taking can be a dynamic process; the most straightforward approach seen in the literature, which we adopted for Drawcto, is simple turn alternation (Winston and Magerko 2017). However, we needed a way to clearly signal the beginning and end of a turn to the AI with alternating turn-taking. To overcome this, we incorporated a pencil on the interface that the human collaborator can "pick up" to signal the start of their turn and "place down" to signal its completion.

We wanted to present the AI's stroke intention to the human collaborator coherently, without disturbing the creative collaboration. Hence, we decided to show the stroke logic in a dialog box, as if Drawcto were communicating. Further, the AI dialogs were written in a way that reflects the friendly persona of the AI.

Along with the UI features mentioned above, the human collaborator can also toggle between the three agents via a clickable arrow. Figure 3 highlights the different parts of the user interface.

Figure 3: Drawcto user interface

Figure 4: (Top, from left to right) 4a: collaboration with rule-based agent, 4b: collaboration with artist agent, 4c: collaboration with quick draw agent, 4d: collaboration with all agents; (Bottom, from left to right) 4e: stylization of art made with rule-based agent, 4f: stylization of art made with artist agent, 4g: stylization of art made with quick draw agent, 4h: stylization of art made with all agents

Rule-Based Agent

When the rule-based agent is selected, it reacts to the user's stroke(s) based on a set of hand-authored rules. We derived these rules from perceptual grouping theory, such as balance, symmetry, continuity, closure, etc. (Arnheim 1957).
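Such hand-authored grouping rules can be thought of as a small decision list over perceptual features. The following is a hypothetical illustration (the feature names, thresholds, and wording are ours, not the system's actual rule set):

```python
# Hypothetical encoding of a few perceptual grouping rules (closure,
# proximity, balance) as a decision list. Feature names and thresholds
# are illustrative assumptions, not Drawcto's actual implementation.

def pick_rule(features):
    """Walk the decision list; return (rule_name, explanation)."""
    if features.get("has_open_stroke"):
        # Closure: complete a shape the user left open.
        return ("closure", "I closed your open stroke to complete the shape.")
    if features.get("num_contours", 0) > 1 and features.get("min_gap", 99) < 50:
        # Proximity: nearby strokes are perceived as a group, so join them.
        return ("proximity", "Your strokes are close together, so I connected them.")
    if features.get("empty_area_ratio", 0) > 0.5:
        # Balance: echo a similar stroke into a large empty region.
        return ("balance", "The canvas has a large empty area, so I echoed a stroke there.")
    return ("default", "I added a small mark near your last stroke.")
```

A decision list like this is also what makes the explanation cheap to produce: the triggered branch itself names the perceptual principle behind the stroke.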
Since non-representational art, in general, is not preconceived, we designed the agent to behave similarly and not have an end goal across multiple turns. Instead, the agent emulates a painter's reflection-in-design process and reacts only based on the current state of the composition on the canvas, primarily based on the collaborator's latest move. Figure 4a shows the result of an interaction with the rule-based agent.

The agent makes sense of the strokes strictly based on the observable, salient features on the canvas. We use the OpenCV library to make sense of and extract features from the strokes. The feature set includes the number of contours (for the whole canvas or the current stroke), the center of mass, white space, and four-way symmetry. We use this feature set to traverse a decision tree to find an applicable rule. Some examples of rules include: closing a stroke if the user drew an open stroke; connecting strokes if the user drew more than one separate stroke; enclosing a stroke if the number of contours is above a threshold; creating similar strokes if the canvas has an empty area; etc.

To better understand how the rule-based agent works, consider the following scenario. Assume the human collaborator draws an open shape, like a 'U' shape or a polygon with one side missing, on the canvas and finishes their turn. Then, a snapshot of the canvas is sent to the backend. With the help of functions in the OpenCV library like findContour or isClosed, we develop a feature set that indicates that the shape is open. Following this, the agent traverses a predefined decision tree and comes up with two possible moves: to close the open shape, or to draw a similar but new, distorted (scaled up or down, sheared, etc.) open shape. The agent randomly chooses between these moves, produces the relevant stroke on the canvas, and presents a textual description of the rule and why it was triggered to the human collaborator.

RNN Agents

We were curious whether we could build a data-driven agent in Drawcto that relied on latent information learned from existing artworks or sketch datasets instead of following predefined rules. We developed two separate agents to explore this "authorless" approach in Drawcto: the Quick Draw and Artist agents. Both agents are based on Google's interactive SketchRNN (Ha and Eck 2017) model.

SketchRNN Quick Draw Agent. This agent responds by producing a new stroke in exact continuation of the human collaborator's last drawn stroke. The agent utilizes the ml5.js library's SketchRNN model (Nickles, Shiffman, and McClendon 2018), and the main goal for the Quick Draw agent was to explore the stroke-generation information learned from the Quick, Draw! dataset (Jongejan et al. 2016). However, the SketchRNN model requires a particular object category to generate strokes, which is not feasible in non-representational art, as there are no objects. We developed two strategies to overcome this: first, we limited the length of the output stroke to a maximum of 30 points; second, we used the "everything" category in SketchRNN, allowing it to utilize the entire Quick, Draw! dataset for stroke generation. These strategies and the alternating turn-taking interaction resulted in a Quick Draw agent, based on the SketchRNN model, that successfully showcased its stroke-generation capabilities.

Figure 5: (From left to right) 5a: Kandinsky's painting (Kandinsky 1928), 5b: Edge extraction from the painting, 5c: Kandinsky's painting (Kandinsky 1915), 5d: Edge extraction from the painting
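The first strategy above, capping a generated stroke at 30 points, can be sketched as a small helper. SketchRNN-style strokes are sequences of (dx, dy, pen-state) steps; this function is our own illustration, not code from Drawcto or ml5.js:

```python
# Illustrative sketch of capping a SketchRNN-style stroke at 30 points.
# Each step is (dx, dy, pen_lifted); truncating mid-stroke means the
# final step must be marked pen-up so the stroke ends cleanly.

MAX_POINTS = 30

def cap_stroke(stroke, max_points=MAX_POINTS):
    """Truncate a stroke to max_points and lift the pen on the last step."""
    capped = [list(step) for step in stroke[:max_points]]
    if capped:
        capped[-1][2] = 1  # force pen-up at the final step
    return capped
```

With no object category to terminate a sketch naturally, a hard cap like this keeps the agent's contribution a short gesture rather than a full object drawing.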
Figure 4c shows the result of an interaction with the Quick Draw agent.

SketchRNN Artist Agent. To produce strokes, this agent utilizes the SketchRNN model trained on our custom dataset of around 2000 images of non-representational artworks from the web. The goal for the Artist agent was to see if the model would automatically learn visual concepts such as symmetry, shape completion, balance, etc. However, getting the correct training data was the biggest challenge. To overcome this, we obtained famous non-representational artworks, extracted edges from them, and converted them into simple sequential vector drawings. Figure 5 shows examples of the edge extraction we did to collect data. We can see that these images are composed of distinct shapes (like circles, rectangles, etc.), have a sense of composition (like positive and negative space, symmetries, etc.), and depict textures through different densities of shorter lines. We trained the Artist agent on this data, and Figure 4b shows an interaction with the artist agent.

Discussion

In the Related Work section, we identified various learning and non-learning approaches that we could take to tackle the challenge of an AI generating different semantically accurate strokes while co-creating non-representational art. From those, we chose one non-learning approach, the rule-based agent, and one learning approach, the RNN agent(s), to incorporate in the current prototype of Drawcto. In this section, we reflect on the current limitations of the three co-creative agents.

The rule-based agent currently echoes and acknowledges the user's strokes but rarely contributes a novel stroke to the composition. In other words, though the rule-based agent can generate and justify new strokes, it lacks stroke variability and can become predictable after a few uses. The RNN agents, on the other hand, especially the Quick Draw agent, produce a whole variety of strokes due to the diversity and volume of the Quick, Draw! data. The dialogues for both RNN agents give a clue about the training data but fail to reason about a particular stroke. We will have to develop a separate gestalt module to analyze the produced stroke and provide suitable explanations. We notice that a lot of the responses from the artist agent are two lines at right angles. We believe this is because each training image had a boundary; hence, the model learned it as an essential component of any drawing. However, we noticed that the artist agent could respond to the user's stroke, complementing the essence of their drawing style.

Future Work

Adding color, texture, line variation, etc., in a particular artist's style can enhance the co-creative experience. Figures 4e–4h show our initial experiment with style transfer; the images in Figure 4 show art in Kandinsky's style. In the future, we hope to develop an agent which will let the user choose to add a stroke in a particular artist's style.

Drawing is an embodied activity, and studies show that maintaining embodied interaction can improve the co-creative drawing experience (Jansen and Sklar 2021). Therefore, we plan to incorporate a robotic arm, which Drawcto can use to draw physical strokes on paper or canvas, creating a system where people can draw and co-create physically.

Lastly, for building a learning agent capable of producing strokes based on gestalt theory, reinforcement learning approaches appear to be a very promising avenue, especially with research like PQA (Qi et al. 2021).

Acknowledgments

We thank Arpit Mathur, Bhavika Devnani, Laney Light, Luowen Qiao, and Tianbai Jia for collaborating to develop this project. We thank Erin Truesdell for providing helpful insights and feedback.

References

Alonso, N. M. 2017. Suggestive Drawing Among Human and Artificial Intelligences. Master's thesis, Harvard University Graduate School of Design.

Arnheim, R. 1957. Art and Visual Perception: A Psychology of the Creative Eye. Univ of California Press.

Ashmore, J. 1955. Some differences between abstract and non-objective painting. The Journal of Aesthetics and Art Criticism 13(4):486–495.

Balaji, S. 2018. Simple apartment kolam designs.

Bhunia, A. K.; Das, A.; Muhammad, U. R.; Yang, Y.; Hospedales, T. M.; Xiang, T.; Gryaditskaya, Y.; and Song, Y.-Z. 2020. Pixelor: A competitive sketching AI agent. So you think you can sketch? ACM Transactions on Graphics 39(6):1–15.
you think you can sketch? ACM Transactions on Graphics Language-Image Encoders. arXiv:2106.14843 [cs]. arXiv:
39(6):1–15. 2106.14843.
Blanco-Herrera, J. A.; Gentile, D. A.; and Rokkum, J. N. Ganin, Y.; Kulkarni, T.; Babuschkin, I.; Eslami, S. A.; and
2019. Video games can increase creativity, but with caveats. Vinyals, O. 2018. Synthesizing programs for images using
Creativity Research Journal 31(2):119–131. Publisher: Tay- reinforced adversarial learning. In International Conference
lor & Francis. on Machine Learning, 1666–1675. PMLR.
Bracket, J. M. 1989. Untitled. San Francisco Museum of Gatys, L. A.; Ecker, A. S.; and Bethge, M. 2016. Image style
Modern Art. transfer using convolutional neural networks. In Proceed-
Cohen, H. 1995. The further exploits of AARON, painter. ings of the IEEE conference on computer vision and pattern
Stanford Humanities Review 4(2):141–158. recognition, 2414–2423.
Colton, S. 2012. The painting fool: Stories from building an Ge, S.; Goswami, V.; Zitnick, C. L.; and Parikh, D. 2021.
automated painter. In Computers and creativity. Springer. Creative Sketch Generation. arXiv:2011.10039 [cs]. arXiv:
3–38. 2011.10039.
Csinger, A.; Booth, K. S.; and Poole, D. 1995. AI meets au- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.;
thoring: User models for intelligent multimedia. In Integra- Warde-Farley, D.; Ozair, S.; Courville, A.; and Bengio, Y.
tion of Natural Language and Vision Processing. Springer. 2020. Generative adversarial networks. Communications of
283–304. the ACM 63(11):139–144.
Davis, N.; Hsiao, C.-P.; Singh, K. Y.; Li, L.; Moningi, S.; Green, G., and Kaufman, J. C. 2015. Video games and cre-
and Magerko, B. 2015. Drawing apprentice: An enactive ativity. Academic Press.
co-creative agent for artistic collaboration. In Proceedings of the 2015 ACM SIGCHI Conference on Creativity and Cognition, 185–186.
Davis, N.; Hsiao, C.-P.; Yashraj Singh, K.; Li, L.; and Magerko, B. 2016. Empirically Studying Participatory Sense-Making in Abstract Drawing with a Co-Creative Cognitive Agent. In Proceedings of the 21st International Conference on Intelligent User Interfaces, 196–207. Sonoma, CA, USA: ACM.
Devlin, J.; Chang, M.-W.; Lee, K.; and Toutanova, K. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs].
Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; Uszkoreit, J.; and Houlsby, N. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv:2010.11929 [cs].
Elgammal, A.; Liu, B.; Elhoseiny, M.; and Mazzone, M. 2017. CAN: Creative Adversarial Networks, Generating "Art" by Learning About Styles and Deviating from Style Norms. arXiv:1706.07068 [cs].
Fan, J. E.; Dinculescu, M.; and Ha, D. 2019. collabdraw: An Environment for Collaborative Sketching with an Artificial Agent. In Proceedings of the 2019 Conference on Creativity and Cognition, 556–561. San Diego, CA, USA: ACM.
Fernando, C.; Eslami, S. M. A.; Alayrac, J.-B.; Mirowski, P.; Banarse, D.; and Osindero, S. 2021. Generative Art Using Neural Visual Grammars and Dual Encoders. arXiv:2105.00162 [cs].
Fingesten, P. 1961. Spirituality, Mysticism and Non-Objective Art. Art Journal 21(1):2–6.
Frans, K.; Soros, L. B.; and Witkowski, O. 2021. CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders.
Gün, O. Y. 2017. Computing with Watercolor Shapes. In Çağdaş, G.; Özkar, M.; Gül, L. F.; and Gürer, E., eds., Computer-Aided Architectural Design. Future Trajectories, volume 724 of Communications in Computer and Information Science, 252–269. Singapore: Springer Singapore.
Ha, D., and Eck, D. 2017. A Neural Representation of Sketch Drawings. arXiv:1704.03477 [cs, stat].
Hertzmann, A. 2003. A Survey of Stroke-Based Rendering. IEEE Computer Graphics and Applications 23(4):70–81.
Huang, Z.; Zhou, S.; and Heng, W. 2019. Learning to Paint With Model-Based Deep Reinforcement Learning. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 8708–8717. Seoul, Korea (South): IEEE.
Iizuka, S.; Simo-Serra, E.; and Ishikawa, H. 2017. Globally and Locally Consistent Image Completion. ACM Transactions on Graphics (ToG) 36(4):1–14.
Isola, P.; Zhu, J.-Y.; Zhou, T.; and Efros, A. A. 2018. Image-to-Image Translation with Conditional Adversarial Networks. arXiv:1611.07004 [cs].
Jansen, C., and Sklar, E. 2021. Exploring Co-creative Drawing Workflows. Frontiers in Robotics and AI 8:577770.
Jongejan, J.; Rowley, H.; Kawashima, T.; Kim, J.; and Fox-Gieg, N. 2016. Quick, Draw! The Data.
Kandinsky, W. 1915. Untitled, c.1915. WikiArt.org.
Kandinsky, W. 1928. Auf Spitzen (On the Points).
Karimi, P.; Rezwana, J.; Siddiqui, S.; Maher, M. L.; and Dehbozorgi, N. 2020. Creative Sketching Partner: An Analysis of Human-AI Co-Creativity. In Proceedings of the 25th International Conference on Intelligent User Interfaces, 221–230.
Klee, P. 1922. Senecio.
Lin, H.; Fu, Y.; Jiang, Y.-G.; and Xue, X. 2020. SketchBERT: Learning Sketch Bidirectional Encoder Representation from Transformers by Self-supervised Learning of Sketch Gestalt. arXiv:2005.09159 [cs, stat].
Liu, F.; Deng, X.; Lai, Y.-K.; Liu, Y.-J.; Ma, C.; and Wang, H. 2019. SketchGAN: Joint Sketch Completion and Recognition With Generative Adversarial Network. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 5823–5832. Long Beach, CA, USA: IEEE.
McCorduck, P. 1991. Aaron's Code: Meta-Art, Artificial Intelligence, and the Work of Harold Cohen. Macmillan.
mel. 2011. skribbl - Free Multiplayer Drawing & Guessing Game.
Mellor, J. F. J.; Park, E.; Ganin, Y.; Babuschkin, I.; Kulkarni, T.; Rosenbaum, D.; Ballard, A.; Weber, T.; Vinyals, O.; and Eslami, S. M. A. 2019. Unsupervised Doodling and Painting with Improved SPIRAL. arXiv:1910.01007 [cs, stat].
Nakano, R. 2019. Neural Painters: A Learned Differentiable Constraint for Generating Brushstroke Paintings. arXiv:1904.08410 [cs, stat].
Nickles, E.; Shiffman, D.; and McClendon, B. O. 2018. ml5-library/src/SketchRNN at main · ml5js/ml5-library.
Oh, C.; Song, J.; Choi, J.; Kim, S.; Lee, S.; and Suh, B. 2018. I Lead, You Help but Only with Enough Details: Understanding User Experience of Co-Creation with Artificial Intelligence. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 1–13. Montreal, QC, Canada: ACM.
Picasso, P. 1932. Girl before a Mirror.
Qi, Y.; Zhang, K.; Sain, A.; and Song, Y.-Z. 2021. PQA: Perceptual Question Answering. arXiv:2104.03589 [cs].
Radford, A.; Kim, J. W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; Krueger, G.; and Sutskever, I. 2021. Learning Transferable Visual Models From Natural Language Supervision. arXiv:2103.00020 [cs].
Sarvadevabhatla, R. K.; Surya, S.; Mittal, T.; and Babu, R. V. 2018. Game of Sketches: Deep Recurrent Models of Pictionary-Style Word Guessing. In Thirty-Second AAAI Conference on Artificial Intelligence.
Schön, D. A. 1983. The Reflective Practitioner: How Professionals Think in Action. Basic Books, Inc.
Stiny, G. 2006. Shape: Talking about Seeing and Doing. MIT Press.
Su, G.; Qi, Y.; Pang, K.; Yang, J.; Song, Y.-Z.; and SketchX, C. 2020. SketchHealer: A Graph-to-Sequence Network for Recreating Partial Human Sketches. In BMVC.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, Ł.; and Polosukhin, I. 2017. Attention Is All You Need. In Advances in Neural Information Processing Systems, 5998–6008.
Wallis, M. 1960. The Origin and Foundations of Non-Objective Painting. The Journal of Aesthetics and Art Criticism 19(1):61–71.
Winston, L., and Magerko, B. 2017. Turn-Taking with Improvisational Co-Creative Agents. In Thirteenth Artificial Intelligence and Interactive Digital Entertainment Conference.
Zheng, N.; Jiang, Y.; and Huang, D. 2018. StrokeNet: A Neural Painting Environment. In International Conference on Learning Representations.
Zhuo, J.; Fan, L.; and Wang, H. J. 2020. A Framework and Dataset for Abstract Art Generation via CalligraphyGAN. arXiv:2012.00744 [cs].