AI as an Active Writer: Interaction strategies with generated
text in human-AI collaborative fiction writing
Daijin Yang¹, Yanpeng Zhou², Zhiyuan Zhang³, Toby Jia-Jun Li⁴ and Ray LC³
¹ Northeastern University, 360 Huntington Ave, Boston, MA 02115, USA
² Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore
³ City University of Hong Kong, Tat Chee Avenue, Kowloon Tong, 518057, Hong Kong
⁴ University of Notre Dame, Notre Dame, IN 46556, USA


                                             Abstract
                                             Machine Learning (ML) has become an important part of the creative process for human fiction writers, allowing them to
                                             utilize various sources of information and be inspired by strategies and data previously seldom explored. To investigate how
                                             human writers collaborate with ML systems in fiction writing, we prototyped a web-based human-AI collaborative writing
                                             tool that allows writers to shorten, edit, summarize, and regenerate text produced by AI. To investigate the dynamics of
                                             human-AI interaction in fiction co-writing, we used a "finish each other’s story" approach where humans and machines took
                                             turns writing collaborative fiction. In results from a preliminary study with 9 users, we found that users took inspiration from
                                             unexpected text generated by the machine, that users expected reduced fluency and coherence in the machine text when
                                             allowed to edit the output, and that they perceived a mental model of the AI as an active writer in the collaborative process
                                             rather than simply as a passive AI writing assistant. This study provides design implications on supporting co-creative writing
                                             of humans and machines.

                                             Keywords
                                             Applications of intelligent user interfaces, Collaborative interfaces, User Modelling for Intelligent Interfaces, Evaluations of
                                             intelligent user interfaces - Reproducibility


1. Introduction

The rapid development of machine learning has made it possible for artificial intelligence (AI) to collaborate with humans to generate creative content [1, 2, 3, 4, 5, 6]. Human-AI collaborative creative systems based on machine learning have been gradually entering people’s creative artistic life, in areas such as music composition [6, 7, 8], creative illustration [1, 9], and co-writing [10, 11]. These human-AI collaborative creation systems can assist experienced creators by inspiring them with new ideas and providing suggestions [12, 13]. They can also bring a novel creative experience to users who have little or no creative experience, such as completing the drawing that the user has started or automatically filling in the user’s unfinished sentence [1, 10]. In this article, we focus on the needs of users when they collaborate with AI for creative writing.

Recent work has focused on improving the algorithmic performance of natural language generation models, such as improving the logic of generated text [14, 15] or making the generated text closer to natural language [16, 17]. However, little work focuses on exploring how users perceive the AI used for text generation and how users interact with AI in the creative writing process [18, 19, 20]. Most designs treat the AI in collaborative creative writing systems as the user’s assistant, for example by completing the user’s unfinished sentences or providing users with suggestions for writing [10, 11, 13]. We seek to explore how an AI system may play a more active role in co-creative writing. Specifically, we explore what interactive capabilities users actually need when co-writing creatively with AI, and how these capabilities affect the co-creation experience.

To ground our study, we prototyped a collaborative writing system with a web interface for human-AI co-writing. In this system, users and the machine take turns writing paragraphs for each other to continue with. The system has two different modes, the "Edit Mode" and the "No Edit Mode". In our preliminary study with 9 users, each user was first asked to write the beginning of a sci-fi story about human beings finding new homes. A GPT-2-based language model fine-tuned to a sci-fi theme generates follow-up paragraphs of the story based on what users have written. Before continuing writing, users could choose to regenerate or select from multiple versions of machine-generated texts. The machine would take changes made by users into account for its next generation. In each study session, the user and the AI tool finished a 5-paragraph sci-fi story together, with 2 paragraphs generated by the AI and 3 paragraphs written by the user.

Joint Proceedings of the ACM IUI Workshops 2022, March 2022, Helsinki, Finland
yang.dai@northeastern.edu (D. Yang); zydstd@gmail.com (Y. Zhou); zzhang452-c@my.cityu.edu.hk (Z. Zhang); toby.j.li@nd.edu (T. J. Li); LC@raylc.org (R. LC)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

Figure 1: The architecture of the prototype human-AI co-writing tool used in this study. First, the story head is written by
humans and entered into the fine-tuned GPT-2. The user then judges the text generated by the machine and decides whether
to regenerate it. After that, the text generated by the machine is modified by the user as the final machine text. The user
follows the story development of machine text to write. Finally, both machine text and human text are used as input for the
next machine generation.
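
The flow in Figure 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `stub_generate` is a hypothetical stand-in for the fine-tuned GPT-2 model, and the names `machine_turn`, `accept`, and `edit` are invented for this example. Regeneration is modeled as drawing a fresh random seed, and passing `edit=None` models the "No Edit Mode".

```python
import random
from typing import Callable, List, Optional


def stub_generate(context: str, seed: int) -> str:
    """Hypothetical stand-in for the fine-tuned GPT-2 model (placeholder text only)."""
    rng = random.Random(seed)
    placeholder_paragraphs = [
        "A man stood in the corridor.",
        "The tunnel opened into a dimly lit room.",
        "Something moved across the uncertain terrain.",
    ]
    return rng.choice(placeholder_paragraphs)


def machine_turn(
    history: List[str],
    generate: Callable[[str, int], str],
    accept: Callable[[str], bool],
    edit: Optional[Callable[[str], str]] = None,  # None models "No Edit Mode"
    max_tries: int = 10,
    rng: Optional[random.Random] = None,
) -> str:
    """Run one machine turn: generate, let the user regenerate, then optionally edit."""
    rng = rng or random.Random()
    # Both human and (possibly edited) machine text form the input for generation.
    context = "\n".join(history)
    draft = ""
    for _ in range(max_tries):
        seed = rng.randrange(2**32)  # each regeneration draws a new random seed
        draft = generate(context, seed)
        if accept(draft):  # the user judges the draft and may ask to regenerate
            break
    if edit is not None:
        draft = edit(draft)  # the user's modification becomes the final machine text
    history.append(draft)
    return draft


if __name__ == "__main__":
    story = ["We are approaching a new solar system."]  # story head written by the user
    machine_turn(story, stub_generate, accept=lambda d: True, edit=str.strip)
    story.append("The others look at me inquisitively.")  # the user's next paragraph
    print("\n".join(story))
```

Because the edited draft is appended to `history` before the next call, later generations are conditioned on the user's changes rather than on the raw machine output.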


By observing 9 users’ writing process in two modes, interviewing them about their experiences in the co-writing process, and analyzing their written stories, we summarize our main findings as follows:

    1) We find patterns in the texts of human-AI collaboratively written stories: the AI-enabled tool served as a good provider of unexpected twists but not as a fully competent writer.
    2) We discover that users with different writing intentions and in different interaction modes (allowing editing versus not allowing editing) had different expectations of text coherence and fluency.
    3) We describe users’ perceptions of the machine’s role in the co-writing process and discuss future possibilities for writing machines.

Taken together, these findings guide the design of future human-AI co-writing interfaces.

2. Related Work

Recent developments in AI have enabled a wide range of applications that explore cooperative creation between humans and AI, including drawing [1, 9, 2], creative writing [10, 11, 3], dance [21] and other fields [22, 23]. For example, Clark et al. conducted a study that explored the use of AI to complete sentences and provide suggestions [13], and Louie et al. built an AI-enabled tool for creating music [6]. In this line of work, the AI acts as the user’s collaborator. It can adjust its output according to the goals and actions proposed by the user and then make corresponding recommendations. In these systems, the AI takes the user’s input as a condition for its output or predicts the user’s true intention based on the user’s feedback. The output of our human-machine collaborative creative writing system is influenced by users’ input, which is consistent with previous work. However, our system comes with its own consideration of the plot, while following the user’s writing wishes.

Whereas humans find it difficult to innovate in story writing, machines cannot fully understand the intentions behind human writing, so they are more likely to create unexpected plots and drive the development of the story [24]. Since current language models rely mainly on the self-attention mechanism, the model sometimes attends to its own earlier output and thus falls into a loop of repeated text [10, 13, 24], especially when generating long text. In addition, the training dataset of the machine is much larger than the related knowledge in a human brain, so it has the potential to generate interesting story text [25, 26, 27, 24]. Humans’ logical reasoning is stronger than machines’, and this is necessary for coherence in creative writing [24]. In particular, humans have a much better common-sense understanding of the world than language models [20]. Hence, combining text generation language models and human writing for creative writing, including text interactive games, writing assistants and so on, might be a promising form of human-computer interaction. In a text interactive game [28, 29, 30, 31], the user controls the character through natural language, and the AI agent recognizes the user’s input, intelligently manipulates the character’s actions in a text-described environment, and feeds the results back to the user. AI writing assistants are also an important research area in human-AI creative writing. An AI writing assistant can correct users’ spelling and grammatical errors [10], complete users’ unfinished sentences or supplement full-text paragraphs [32], and provide inspiration and suggestions for users’ creative writing [13].

3. The Collaborative Writing Tool

For our study, we prototyped a web-based collaborative writing tool where the user can co-write a short sci-fi story with a GPT-2-based text generation model. The tool uses a “turn-taking” approach (Figure 1) where the user starts by writing the beginning of the story. The model then continues the story by generating a section that follows the user’s previous one. The user and the model continue to write one section each in turns until the end of the story. The user may also edit the AI-generated section or regenerate a section when they are not satisfied with the AI-generated result.

3.1. The Web Interface

The web interface of our tool was implemented using the Django framework. As shown in Figure 2, two different interfaces are designed for the two modes. "Edit Mode" and "No Edit Mode" both have a "Submit" button for the human user to submit their written text, a "Regenerate" button for the AI model to regenerate sentences, and an "End" button to end the story. There is also an "Edit" button for the user to edit the text generated by the AI model (the Edit button was disabled in the “No Edit” condition in the study). All previous texts are shown at the top of the page, with human-written texts in black and machine-generated texts in red. The back button allows the user to go back to their last operation.

When the user clicks on the “Regenerate” button, the tool selects a new random seed and uses the model to generate its last paragraph again. This feature allows the user to quickly get a new AI-generated paragraph when the previous one was not desirable, such as when the model fails to generate readable text, generates repetitive text, or switches topics abruptly.

Figure 2: The user interface of our collaborative creative writing tool used in the study. Top Left: The initial interface includes writing prompts, mode selection, theme selection, input box, and submit button. Top Right: In the upper interface, the black font represents the text entered by the user, while the red font represents the text generated by the AI model. After the AI model generates the text, the user can choose to modify, to regenerate, to skip the modification to continue generating, or to end the interaction. Bottom: The user can modify the text generated by the AI model.

3.2. The Text-Generation Model

Our prototype tool uses a GPT-2 language model that was fine-tuned to a sci-fi theme. GPT-2 is a large-scale language model proposed by OpenAI in 2019 [33, 34]. We used the “medium” version of GPT-2 with 355M parameters. To adapt the style of the generated text to the sci-fi domain used in the study, we fine-tuned GPT-2-medium on the Sci-Fi Stories Text Corpus [12] collected by Robin Sloan. We called GPT-2’s finetune function with the number of steps and the learning rate set to 1500 and 1e-5, respectively. Most of the science fiction stories in the corpus come from the Pulp Magazine Archive.

4. Preliminary User Study

In this study, we ask the three research questions (RQs) below:

 RQ1: What patterns of interactions are taken up by humans when they interact with machines in collaborative writing?
 RQ2: How does the ability to select, edit, and cut out machine-generated text affect the human-machine co-writing process?
 RQ3: How do humans perceive the role of the machine in the editable vs. non-editable interaction modes?

4.1. Procedure

To answer these RQs, we conducted a user study with the tool described in Section 3 to investigate the dynamics of human-AI interaction in fiction co-writing. The study uses a within-subject design, where each user had to use both the "Edit Mode" (where editing the AI-generated text was enabled) and the "No Edit Mode" (where editing the AI-generated text was disabled) when writing with the tool. The order of the two conditions was random. In each study session, following a short demonstration of the user interface and the theme that they would write about, the user was asked to write the beginning of a sci-fi story about humans finding a new home in space. Using this beginning, the user then wrote a 5-paragraph story with our tool in the first condition. After this, the user filled out a usability questionnaire and had a 5–10-min semi-structured interview about their experience in the first condition. Similarly, the user then wrote another story, filled out the questionnaire, and had an interview in the second condition.

We posted an advertisement on the university’s bulletin and recruited nine participants (n=9) for our study, referred to as U1 to U9 in this paper. Participants were all graduate students who were interested in human-machine co-writing. Five of them were 18–25 years old, and four were 26–35 years old. Five of them used English as their first language, and four used Chinese as their first language. Eight of them were male, and one was female. All users had some creative writing experience: two were novices, three had an intermediate level of experience, and four were experienced fiction writers.

Each user’s writing process (via screen recording), questionnaire, and interview were recorded and transcribed. Users were asked to think aloud while writing. One of the experimenters conducted open coding analysis [35] of the written contents, think-aloud, and interview transcripts for the qualitative results.

5. Descriptive Statistics of the Stories

In the “Edit Mode”, users usually spent 4–10 minutes writing a beginning of a sci-fi story that had an average length of 117 words [M=117, SD=56]. After 20–30 minutes of writing, the users would finish the story, which had an average length of 622 words [M=622, SD=109], of which 320 words were written by the user [M=320, SD=106] and 302 words were written by the machine [M=302, SD=28]. In the “No Edit Mode”, it took 4–10 minutes for users to write another beginning of a sci-fi story that had an average length of 117 words [M=106, SD=51]. After that, users spent 20–40 minutes completing the story with an average length of 599 words [M=599, SD=127], of which 282 words were written by users [M=282, SD=125] and 317 words were written by the machine [M=317, SD=31].
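
As a quick consistency check on the reported story lengths, the short Python sketch below recomputes the totals and the machine-written share from the published means. It uses only the means stated in this section; the variable and function names are illustrative, and the share is a ratio of means, which is not necessarily equal to the mean of per-story ratios.

```python
# Mean word counts as reported in Section 5 (per-story data are not available here).
reported_means = {
    "Edit Mode": {"total": 622, "user": 320, "machine": 302},
    "No Edit Mode": {"total": 599, "user": 282, "machine": 317},
}


def machine_share(means: dict) -> float:
    """Fraction of the mean story length that was written by the machine."""
    return means["machine"] / means["total"]


for mode, means in reported_means.items():
    # The user and machine means should add up to the mean story length.
    assert means["user"] + means["machine"] == means["total"]
    print(f"{mode}: machine share of mean length = {machine_share(means):.1%}")
```

On these means the machine contributes roughly half of each story in both modes, slightly less than half in the "Edit Mode" and slightly more than half in the "No Edit Mode".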
Figure 3: The task flow of our study. The user and the AI take turns to co-write a short science-fiction story. The user starts by writing the beginning (Paragraph 1) of a sci-fi story about human beings finding new homes. The AI model generates Paragraph 2, which the user can regenerate, select, or (in the “Edit Mode” condition) edit. The user and the AI repeat this process to write Paragraphs 3 and 4, and finally the user ends the story with Paragraph 5.
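
The task flow in Figure 3 amounts to a fixed alternation of authors plus a mode-dependent set of actions on machine paragraphs. A minimal Python sketch, assuming hypothetical names (`paragraph_authors`, `allowed_actions`, and the mode labels `"edit"` and `"no_edit"`) that do not come from the tool itself:

```python
from typing import List, Set


def paragraph_authors(n_paragraphs: int = 5) -> List[str]:
    """Alternate authors, starting and (for an odd count) ending with the user."""
    return ["user" if i % 2 == 0 else "ai" for i in range(n_paragraphs)]


def allowed_actions(author: str, mode: str) -> Set[str]:
    """Actions the user can take on a paragraph, depending on its author and the mode."""
    if author == "user":
        return {"write"}
    actions = {"regenerate", "select"}
    if mode == "edit":  # editing machine text is only enabled in the "Edit Mode"
        actions.add("edit")
    return actions


if __name__ == "__main__":
    for number, author in enumerate(paragraph_authors(), start=1):
        print(number, author, sorted(allowed_actions(author, "no_edit")))
```

With the default of five paragraphs this yields three user-written and two machine-generated paragraphs, matching the session structure described in the study.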


Figure 4: Screenshot of the experiment. Left: The user shares his screen. At this time the user is editing the text generated by
the machine. Right: The user is in the upper left corner and the other three are researchers.


6. Qualitative Findings

In this section, we describe the properties of the co-written stories and users’ strategies for co-writing with the AI-enabled tool. Specifically, we investigated how users reacted to the AI-generated texts in the two modes, how AI-generated texts affected their creative writing process, and how they perceived the partnership between them and the AI-enabled tool in the two modes of writing.

6.1. Story Content

By coding the stories, we found that new twists, including new characters, new scenes, and new events, were more frequently found in AI-generated texts (after regeneration, if any) than in user-written texts. For example, in U4’s "Edit Mode" story, U4 only mentioned "We" as a new character, "uncertain terrain" as a new scene, and "We are currently approaching a new solar system with a planet that seems inhabitable." as a new event in paragraph 1. However, the machine added "a man" and "Icter" as additional characters, "winding corridor", "a tunnel", and "a dimly lit room" as new scenes, and "walking down", "a man stood in front of me", and "should walk back and tell the others" as new events. Surprisingly, users considered the unexpectedness the core inspiration or reason for continuing their writing, and made good use of the new elements to continue the story. For example, U4 wrote "The others look at me inquisitively, wondering what was in the structure, and glad that I had made it out alright. I said ’there was a man.’" in paragraph 3 after reading the machine text "should walk back and tell the others" in paragraph 2.

Notably, the AI model would sometimes suddenly change the positive atmosphere in previous paragraphs into a negative atmosphere, or the other way around. For example, U2 wrote an optimistic beginning: "As Commander Barone’s shuttle hummed along, he couldn’t help but feel a sense of optimism about humanity’s future. He had successfully surveyed Planet T74 and was returning back to Space Station Endurance with a cargo hold full of samples of rocks, plants, and even some animal life.", but the machine suddenly turned the story toward a negative description: "He had been told that there were no known diseases or parasites on the planet." (U2 "Edit Mode", user and machine wrote in paragraph 1 and 2).

Despite having many unexpected elements, user-written texts and the selected AI-generated texts were coherent with each other. The selected AI-generated texts often used events, characters, and scenes that were mentioned in user-written paragraphs. For instance, since U2 had written "However, one day an accident at the factory would force AB67 to do something extraordinary.", the machine continued the plot with "And the result is this: a new robot, the first fully-autonomous, self-repairing, self-replenishing, fully-reactive, self-repairing robot.", and mentioned the user-made entity "AB67" in "AB67 had been the first fully self-repairing robot." (U2 "No Edit Mode", user and machine wrote in paragraph 1 and 2).

6.2. Strategies for Interaction with AI-generated Texts

By coding the think-aloud scripts and interview transcripts, we found that users’ reactions to the text generated by the AI model and their strategies for using it can be classified into two groups by their expectation of the story: those having clear and explicit intent about what they wanted to write (referred to as the DI group) and those having only implicit intent about what they wanted to write (referred to as the II group).

Most users had concrete ideas about the story, saying things like "As in my mind. Earth is destroyed." (U2 in "Edit Mode"). By contrast, the users who had only implicit intent would say things like "I don’t think about the ending" (U1 in "Edit Mode"). Users who had different expectations performed differently in the same condition, and users who had similar expectations also performed differently in the two conditions.

6.2.1. Reaction to the Coherence of AI-generated Texts

Users in the II group had lower expectations of the coherence of the AI-generated texts than users in the DI group. Users in both groups had higher expectations of coherence when they were in the "No Edit Mode".

Users in the II group preferred the model to generate texts that contain some new entities that they could work on. For example, any new characters, events, or locations could be good for them: "I don’t think I said anything about a name so I guess it named somebody, which is cool." (U4 in "Edit Mode").

However, users in the DI group were trying to find something in the AI-generated texts that logically fitted into their story. For example, some expected subjects to have logical continuations, such as "I guess it depends first on what I wrote and then if I think it’s a logical continuation." (U1 "No Edit Mode"), and rejected illogical characters, such as "Machine starts spitting out more and more characters that were not mentioned in the scene which made it really wonky later on." (U5 "Edit Mode"). Additionally, they wanted the AI model to continue the story as they expected, such as "Well I expected the machine to basically take, you know, to see what I wrote and can expand upon it or relate to it in some way that’s what I expected." (U5 "Edit Mode"). However, they were more excited when the generated text contained unexpected items that logically fitted their story, such as an unexpected plot: "And they took it even one step further with like, Okay, what if you peel off his skin." (U2 "No Edit Mode").

In the "Edit Mode", users in both groups would accept text that contained parts that they could use, like "And if there are some sentences can use, you will definitely work, work, work on it." (U1 "Edit Mode"), or "But I can work with these first three sentences." (U2 "Edit Mode"). However, in the "No Edit Mode", users would expect the text to fully meet their expectations on coherence, like "I think I definitely wanted something that flowed a bit better with a story, but with the first one, I was more okay with giving me something that perhaps added new ideas." (U3).

6.2.2. Reaction to the Unexpectedness of AI-generated Texts

Users were amused when unexpected texts appeared, even if they presented random events or characters that had no relationship to what users had written. For example, U2 laughed when he saw his story turned into a Christmas story, but regenerated it by saying "This is
not a Christmas story." (U2 "No Edit Mode"). The unex-              U4 and U8 preferred edits that do not affect the contin-
pectedly redundant texts also amused users. For example,         uation of the story. In the interaction flow, they preferred
U4 laughed when encountering sentences like "I was a             to refine on the fluency of the writing after the whole
human with a human face." in the story, and U5 laughed           story has been generated. U4 said, "I wouldn’t really
when saying "That’s a very odd sentence ’the man in the          change the story it comes up with but I would just change
open suit it wasn’t a woman’, very weird." (U5 "Edit Mode").     or delete a few sentences or something." (U4 "Edit Mode").
   Meeting with unexpectedness, users applied some of               Although all users agreed that editing was essential,
the AI-generated texts that were easy to work with. In           U6 preferred not to edit the text because editing was a
most situations, redundant texts were too hard to work           burden: "In the first mode (’Edit Mode’), I must understand
with: "I’m trying to like get some notes that fit a little bit   the machine texts and then edit them. But in the second
more and gives the idea about how to drive the plot forward      mode (’No Edit Mode’), I don’t need to understand them
but it seems to like to be redundant." (U5 "Edit Mode"). But     and just choose one of my favorite and continue the story."
there was one exception in U4’s writing: "I guess like           (U6).
the only way to make that sentence makes sense (’I was a human with a human face’), is if it
wasn’t redundant, the story could be that he didn’t always have a human face." (U4 "Edit Mode").
Even when the text was not redundant, it could still be hard to work on when the plot was being
driven forward too quickly: "I’m going to regenerate it because it focuses so much on death and
yet I don’t want it to be like at the start of the story." (U2 "Edit Mode"). However, this
problem was less severe, and the fast pacing could even be useful, when the machine wrote the
ending: "I feel like it wrote a decent ending on its own and didn’t really want to add anything
to it." (U5 "No Edit Mode", in a delighted tone).

6.2.3. Reaction to the Fluency of AI-generated Texts
Fluency of machine texts was more important in the "No Edit Mode" than in the "Edit Mode". In
the "Edit Mode", partial readability was sufficient for most users because "if you’re able to
edit it and then it’s less important because you can just fix it up a little bit of it." (U1).
In the "No Edit Mode", fluency became as important as coherence for most users: "But if you
can’t (edit), then it’s kind of more important that it is fluent." (U1).

6.2.4. Reaction to Editing
All users agreed that at least some basic edits of AI-generated texts should be allowed to make
them more useful. The most common reason was along the lines of: "This one is definitely harder
because oftentimes there would be a good amount of it that would be useful and like I would want
to keep writing off of. But then there’s also be sections like a piece of sentences that were
not greatly helpful." (U3 "No Edit Mode"). Even when some of the texts in the "No Edit Mode"
were of high quality and met users’ basic expectations, most users still felt editing was
necessary: "I think some editing would be required because, You know, there’s still some
consistencies but not as glaring as that in first mode texts." (U5 "No Edit Mode").

7. Discussion
The language model’s limitations made it unpredictable. It sometimes produced low-quality text
full of words that hardly made sense. At other times, it provided high-quality inspiration that
moved the plot forward beyond the human writers’ expectations. Such unexpectedness accounted for
the unique interaction pattern of human-AI co-writing in this study. Consistent with previous
quantitative findings [36], the qualitative results in this paper showed that users treated the
coherence of machine-generated text as a priority. The users’ attitudes towards unexpected but
coherent elements generated by the AI model further suggested that users expected the model to
provide them with surprising inspiration. However, because of the repetition caused by the
model’s over-confidence problem [24], users got such paragraphs only occasionally, by chance.
The generation process was not transparent and offered little user control, so users could not
expect the next batch of generated text to be better than the previous one. The low probability
of getting useful pieces from the model frustrated users and led them to settle for incoherent
and tenuous text that offered inspiration and little else.
   Nevertheless, the unexpectedness of machine-generated texts should be highlighted in an ideal
human-AI co-writing tool. After being selected and refined by human writers, such unexpected but
logical elements could make the story more exciting than the writers originally intended. Our
findings suggest that such elements could serve not only as a dramatic contradiction in the
story but also as motivation for users to keep writing. In the design of the tool, it would be
useful to let users make use of the unexpected elements as they wish (e.g., by providing options
for editing machine-generated texts in the system). This could help to reduce the frustration
brought by unpredictable repetitions and occasional poor fluency.
However, even if the design of the interaction modes could
mitigate such frustration, the algorithm should also be improved. Higher-quality AI-generated
text would let users focus on the ideas it conveys rather than spending most of their time
regenerating text and fixing its coherence and fluency.
   The users’ different perceptions of the AI’s role in the co-writing process suggested
different interaction patterns. In the study, most users perceived the machine as an active idea
generator. They preferred the "Edit Mode" since they could pick what they liked regardless of
the fluency and coherence of the texts. Thus, it was important to let them edit both the
machine’s texts and their own at any time in the writing process. Furthermore, more
user-controllable variables could be added to the tool to allow finer-grained control of the
generation process. For example, the tool could allow users to control the desired length of the
generated text, the scenes, the atmosphere of the plot, or weights that help the model focus
more on certain important parts of the user’s text. Some users, on the other hand, wished the
machine to act like a human writing assistant. In this case, the machine should be able to
accept both previous paragraphs and following paragraphs as inputs to connect the user-defined
milestones in the story for them. Several works have focused on short sentence infilling
[37, 38], but long-paragraph infilling remains to be explored.
   In addition, some users regarded the AI-enabled tool as an active co-writer or a writing
exerciser. They tried to keep the initial output of the machine regardless of its coherence and
fluency. They enjoyed all the unexpectedness of the machine-generated texts and wished not to
intervene in the generation process. In this case, the texts were expected to be uncontrollable.
However, what counts as good-quality text can be vague in this mode of interaction, since even
redundancy could be interpreted as metaphor. More research should be conducted to develop a good
co-writing or writing exercise machine.
   In summary, future active AI writing tools should strengthen the AI’s ability to produce
high-quality unexpectedness and should allow users to make use of it efficiently. The
"Regenerate" and "Edit" functions described in this paper should be at the core. The goal of
"Regenerate" is to help users find what they want as fast as possible (reducing the
compromises). To accomplish this goal, for example, a future interface could display multiple
outputs simultaneously [13, 10] and enable users to control more parameters when regenerating
the texts. It could also ask users to grade the outputs and learn from their writing (using
users’ inputs as fine-tuning datasets) [36]. For the "Edit" function, the tool should be
integrated into a text editor such as Microsoft Word so that users can both edit and save their
work in a professional way. The AI should act as an on-call writing assistant with a customized
avatar that eventually becomes essential in users’ writing process.

8. Conclusion
In this paper, we reported preliminary findings on how 9 users interacted with a “turn-taking”
style human-AI co-writing tool to write short science-fiction stories. We found that users’
different mental expectations could affect their strategies and their perception of the
machine’s role in the co-writing process. The AI-enabled tool was used as an active idea
generator, a co-writer, or a writing assistant in different scenarios by different users. We
discussed the challenges in managing the trade-offs in the desired level of unexpectedness in
generated story plots, the coherence and fluency of AI-generated texts, the appropriate level of
user control, and the future interface design.

Acknowledgements
This research was supported in part by a Google Cloud Research Credit Grant, a Hong Kong Arts
Development Grant, and an Asia Research Collaboration Grant from Notre Dame International.

References
 [1] C. Oh, J. Song, J. Choi, S. Kim, S. Lee, B. Suh, I lead, you help but only with enough
     details: Understanding user experience of co-creation with artificial intelligence, in:
     Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 2018,
     pp. 1–13.
 [2] N. Davis, C.-P. Hsiao, K. Yashraj Singh, L. Li, B. Magerko, Empirically studying
     participatory sense-making in abstract drawing with a co-creative cognitive agent, in:
     Proceedings of the 21st International Conference on Intelligent User Interfaces, 2016,
     pp. 196–207.
 [3] K. I. Gero, L. B. Chilton, Metaphoria: An algorithmic companion for metaphor creation, in:
     Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 2019,
     pp. 1–12.
 [4] M. Guzdial, N. Liao, J. Chen, S.-Y. Chen, S. Shah, V. Shah, J. Reno, G. Smith, M. O. Riedl,
     Friend, collaborator, student, manager: How design of an ai-driven game level editor
     affects creators, in: Proceedings of the 2019 CHI Conference on Human Factors in Computing
     Systems, 2019, pp. 1–13.
 [5] J. Koch, A. Lucero, L. Hegemann, A. Oulasvirta, May ai? design ideation with cooperative
     contextual bandits, in: Proceedings of the 2019 CHI Conference on Human Factors in
     Computing Systems, 2019, pp. 1–12.
 [6] R. Louie, A. Coenen, C. Z. Huang, M. Terry, C. J. Cai, Novice-ai music co-creation via
     ai-steering tools for deep generative models, in: Proceedings of the 2020 CHI Conference on
     Human Factors in Computing Systems, 2020, pp. 1–13.
 [7] M. Suh, E. Youngblom, M. Terry, C. J. Cai, Ai as social glue: Uncovering the roles of deep
     generative ai during social music composition, in: Proceedings of the 2021 CHI Conference
     on Human Factors in Computing Systems, 2021, pp. 1–11.
 [8] C.-Z. A. Huang, H. V. Koops, E. Newton-Rex, M. Dinculescu, C. J. Cai, Ai song contest:
     Human-ai co-creation in songwriting, arXiv preprint arXiv:2010.05388 (2020).
 [9] P. Karimi, J. Rezwana, S. Siddiqui, M. L. Maher, N. Dehbozorgi, Creative sketching partner:
     an analysis of human-ai co-creativity, in: Proceedings of the 25th International Conference
     on Intelligent User Interfaces, 2020, pp. 221–230.
[10] A. Coenen, L. Davis, D. Ippolito, E. Reif, A. Yuan, Wordcraft: a human-ai collaborative
     editor for story writing, arXiv preprint arXiv:2107.07430 (2021).
[11] M. Kreminski, M. Dickinson, M. Mateas, N. Wardrip-Fruin, Why are we like this?: The ai
     architecture of a co-creative storytelling game, in: International Conference on the
     Foundations of Digital Games, 2020, pp. 1–4.
[12] R. Sloan, Writing with the machine,
     https://www.robinsloan.com/notes/writing-with-the-machine/ (2016).
[13] E. Clark, A. S. Ross, C. Tan, Y. Ji, N. A. Smith, Creative writing with a machine in the
     loop: Case studies on slogans and stories, in: 23rd International Conference on Intelligent
     User Interfaces, 2018, pp. 329–340.
[14] C. Shu, Y. Zhang, X. Dong, P. Shi, T. Yu, R. Zhang, Logic-consistency text generation from
     semantic parses, arXiv preprint arXiv:2108.00577 (2021).
[15] A. Krishna, S. Riedel, A. Vlachos, Proofver: Natural logic theorem proving for fact
     verification, arXiv preprint arXiv:2108.11357 (2021).
[16] W. Fedus, I. Goodfellow, A. M. Dai, Maskgan: better text generation via filling in the_,
     arXiv preprint arXiv:1801.07736 (2018).
[17] Y. Zhang, Z. Gan, K. Fan, Z. Chen, R. Henao, D. Shen, L. Carin, Adversarial feature
     matching for text generation, in: International Conference on Machine Learning, PMLR, 2017,
     pp. 4006–4015.
[18] D. Buschek, M. Zürn, M. Eiband, The impact of multiple parallel phrase suggestions on email
     input and composition behaviour of native and non-native english writers, in: Proceedings
     of the 2021 CHI Conference on Human Factors in Computing Systems, 2021, pp. 1–13.
[19] O. Schmitt, D. Buschek, Characterchat: Supporting the creation of fictional characters
     through conversation and progressive manifestation with a chatbot, in: Creativity and
     Cognition, 2021, pp. 1–10.
[20] M. Guzdial, M. Riedl, An interaction framework for studying co-creative ai, arXiv preprint
     arXiv:1903.09709 (2019).
[21] M. Jacob, B. Magerko, Interaction-based authoring for scalable co-creative agents, in:
     ICCC, 2015, pp. 236–243.
[22] Y. Lin, J. Guo, Y. Chen, C. Yao, F. Ying, It is your turn: collaborative ideation with a
     co-creative robot through sketch, in: Proceedings of the 2020 CHI Conference on Human
     Factors in Computing Systems, 2020, pp. 1–14.
[23] A. Elton-Pym, Principles for ai co-creative game design assistants, in: Proceedings of the
     AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment,
     volume 16, 2020, pp. 335–336.
[24] A. See, A. Pappu, R. Saxena, A. Yerukola, C. D. Manning, Do massively pretrained language
     models make better storytellers?, 2019. arXiv:1909.10705.
[25] T. Liu, K. Wang, L. Sha, B. Chang, Z. Sui, Table-to-text generation by structure-aware
     seq2seq learning, in: Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
[26] A. Fan, M. Lewis, Y. Dauphin, Hierarchical neural story generation, arXiv preprint
     arXiv:1805.04833 (2018).
[27] M. Rose, Rigid rules, inflexible plans, and the stifling of language: A cognitivist
     analysis of writer’s block, College composition and communication 31 (1980) 389–401.
[28] B. Kostka, J. Kwiecień, J. Kowalski, P. Rychlikowski, Text-based adventures of the golovin
     ai agent, in: 2017 IEEE Conference on Computational Intelligence and Games (CIG), IEEE,
     2017, pp. 181–188.
[29] T. Atkinson, H. Baier, T. Copplestone, S. Devlin, J. Swan, The text-based adventure ai
     competition, IEEE Transactions on Games 11 (2019) 260–266.
[30] N. Fulda, D. Ricks, B. Murdoch, D. Wingate, What can you do with a rock? affordance
     extraction via word embeddings, arXiv preprint arXiv:1703.03429 (2017).
[31] M. Hausknecht, R. Loynd, G. Yang, A. Swaminathan, J. D. Williams, Nail: A general
     interactive fiction agent, arXiv preprint arXiv:1902.04259 (2019).
[32] A. Calderwood, V. Qiu, K. I. Gero, L. B. Chilton, How novelists use generative language
     models: An exploratory user study, in: HAI-GEN+ user2agent@IUI, 2020.
[33] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, et al., Language models are
     unsupervised multitask learners, OpenAI blog 1 (2019) 9.
[34] S. Dathathri, A. Madotto, J. Lan, J. Hung, E. Frank, P. Molino, J. Yosinski, R. Liu, Plug
     and play language models: A simple approach to controlled text generation, arXiv preprint
     arXiv:1912.02164 (2019).
[35] S. H. Khandkar, Open coding, University of Calgary 23 (2009) 2009.
[36] N. Akoury, S. Wang, J. Whiting, S. Hood, N. Peng, M. Iyyer, Storium: A dataset and
     evaluation platform for machine-in-the-loop story generation, 2020. arXiv:2010.01717.
[37] C. Donahue, M. Lee, P. Liang, Enabling language models to fill in the blanks, 2020.
     arXiv:2005.05339.
[38] D. Ippolito, D. Grangier, C. Callison-Burch, D. Eck, Unsupervised hierarchical story
     infilling, in: Proceedings of the First Workshop on Narrative Understanding, 2019,
     pp. 37–43.