=Paper=
{{Paper
|id=Vol-2903/IUI21WS-HAIGEN-11
|storemode=property
|title=Multiversal views on language models
|pdfUrl=https://ceur-ws.org/Vol-2903/IUI21WS-HAIGEN-11.pdf
|volume=Vol-2903
|authors=Laria Reynolds,Kyle McDonell
|dblpUrl=https://dblp.org/rec/conf/iui/ReynoldsM21
}}
==Multiversal views on language models==
Laria Reynolds (knc.ai, USA), Kyle McDonell (knc.ai, USA)

Abstract

The virtuosity of language models like GPT-3 opens a new world of possibility for human-AI collaboration in writing. In this paper, we present a framework in which generative language models are conceptualized as multiverse generators. This framework also applies to human imagination and is core to how we read and write fiction. We call for exploration into this commonality through new forms of interfaces which allow humans to couple their imagination to AI to write, explore, and understand non-linear fiction. We discuss the early insights we have gained from actively pursuing this approach by developing and testing a novel multiversal GPT-3-assisted writing interface.

Keywords: writing assistant, hypertext narratives, multiverse writing, GPT-3

Joint Proceedings of the ACM IUI 2021 Workshops, April 13-17, 2021, College Station, USA. moire@knc.ai (L. Reynolds); kyle@knc.ai (K. McDonell). © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

1. Introduction

GPT-3 [1], OpenAI's new generative language model, has astonished with its ability to generate coherent, varied, and often beautiful continuations to natural language passages of any style. To creative writers and those who wish themselves writers, such a system opens a new world of possibilities. Some rightfully worry whether human writing will become deprecated or worthless in a world shared with such generative models, and others are excited for a renaissance in which the creative powers of human writers are raised to unprecedented heights in collaboration with AI. In order to achieve the latter outcome, we must figure out how to engineer human-machine interfaces that allow humans to couple their imaginations to machines and feel freed rather than replaced.

We will present the still-evolving approach we have learned over several months of testing and designing interfaces for writing with the aid of GPT-3, beginning by introducing the framework of language models as multiverse generators.

2. Language models are multiverse generators

Autoregressive language models such as GPT-3 take natural language input and output a vector of probabilities representing predictions for the next word or token. Such language models can be used to generate a passage of text by repeating the following procedure: a single token is sampled from the probability distribution and then appended to the prompt, which then serves as the next input.

As the sampling method can be stochastic, running this process multiple times on the same input will yield diverging continuations. Instead of creating a single linear continuation, these continuations can be kept and each continued themselves. This yields a branching structure, which we will call a multiverse, or the "subtree" downstream of a prompt, as shown in Figure 1.

Figure 1: The process of generating a multiverse story with a language model. The probability distribution is sampled multiple times, and each sampled token starts a separate branch. Branching is repeated at the next token (or per set interval, or adaptively), resulting in a branching tree structure as shown in Figure 2.

Figure 2: A narrative tree with initial prompt "In the beginning, GPT-3 created the root node of the"
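To make the procedure described in Section 2 concrete, the sketch below generates a multiverse by repeatedly sampling continuations and keeping each branch. It is a minimal illustration, not the authors' implementation; `next_token_distribution` is a hypothetical stand-in for any autoregressive LM API (e.g. a GPT-3 call returning candidate tokens and their probabilities).

```python
# Minimal sketch of multiverse generation by repeated sampling.
# `next_token_distribution` is a hypothetical stand-in for an LM API
# returning (tokens, probabilities) for the next position.
import random
from dataclasses import dataclass, field

@dataclass
class Node:
    text: str                          # full text accumulated on this branch
    children: list = field(default_factory=list)

def sample_continuation(prompt, next_token_distribution, length=20):
    """One linear rollout: sample a token, append it, repeat."""
    text = prompt
    for _ in range(length):
        tokens, probs = next_token_distribution(text)
        text += random.choices(tokens, weights=probs, k=1)[0]
    return text

def grow_multiverse(node, next_token_distribution, branches=3, length=20, depth=2):
    """Keep several diverging continuations and continue each in turn,
    yielding the branching 'subtree' downstream of the prompt."""
    if depth == 0:
        return node
    for _ in range(branches):
        child = Node(sample_continuation(node.text, next_token_distribution, length))
        grow_multiverse(child, next_token_distribution, branches, length, depth - 1)
        node.children.append(child)
    return node
```

Because sampling is stochastic, each call to `sample_continuation` on the same prompt may diverge, which is exactly what populates the tree.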
2.1. Analogy to Everettian quantum physics

Quantum mechanics tells us that the future is fundamentally indeterminate. We can calculate probabilities of future outcomes, but we cannot know with certainty what we will observe until we actually measure it. The problem is not merely epistemic; the future truly has not yet been written, except in probabilities. However, when we do finally venture to measure it, the ambiguous future seems to us to become a concrete, singular present.

The Everettian or many-worlds interpretation of quantum mechanics, which has become increasingly popular among quantum physicists in recent years, views the situation differently [2]. It claims that we, as observers, live in indeterminacy like the world around us. When we make a measurement, rather than collapsing the probabilistic world around us into a single present, we join it in ambiguity. "We" (in a greater sense than we normally use the word) experience all of the possible futures, each in a separate branch of a great multiverse. Other branches quickly become decoherent and evolve separately, no longer observable or able to influence our subjective slice of the multiverse.

This is the universe an autoregressive language model like GPT-3 can generate. From any given present it creates a functionally infinite multitude of possible futures, each unique and fractally branching.

David Deutsch, one of the founders of quantum computing, draws a connection between the concept of a state and its quantum evolution with virtual reality generation [3]. He imagines a theoretical machine which simulates environments and models the possible responses of all interactions between objects. Deutsch further posits that it will one day be possible to build such a universal virtual reality generator, whose repertoire includes every possible physical environment.

Language models, of course, still fall well short of this dream. But their recent, dramatic increase in coherence and fluency allows them to serve as our first approximation of such a virtual reality generator. When given a natural-language description of objects, they can propagate the multiverse of consequences that result from a vast number of possible interactions.
2.2. Dynamic and interpretational multiplicity

Deutsch's view emphasizes that from any given state there is a multiplicity of possible future single-world dynamics; stories unfold differently in different rollouts of an identical initial state. There is another dimension of multiplicity that we must also consider, especially when we are talking about states defined by natural language.

Natural language descriptions invariably contain ambiguities. In the case of a narrative, we may say that the natural language description defines a certain present – but it is impossible to describe every variable that may have an effect on the future. In any scene there are implicitly many objects present which are not specified but which may conceivably play a role in some future or be entirely absent in another. The multiverse generated by a language model downstream of a prompt will contain outcomes consistent with the ambiguous variables taking on separate values which are mutually inconsistent.

So we define two levels of uncertainty, which can both be explored by a language model:

1. An uncertainty/multiplicity of present states
2. An uncertainty/multiplicity of futures consistent with the same "underlying" present

We will call the first form of multiplicity interpretational multiplicity, and the second form dynamic multiplicity.

3. Human imaginations are multiverse generators

Humans exist in a constant state of epistemological uncertainty regarding what will happen in the future and even what happened in the past and the state of the present [4]. We are then, by virtue of being adapted to our uncertain environments, natural multiverse reasoners.

David Deutsch also points out that our imaginations, which seek to model the world, mimic reality as virtual reality generators: we model environments and imagine how they could play out in different branches.

3.1. Reading as a multiversal act

When a piece of literature is read, the underlying multiverse shapes the reader's interpretations and expectations. The structure which determines the meaning of a piece as experienced by a reader is not the linear-in-time story but the implicit, counterfactual past/present/future plexus surrounding each point in the text, given by the reader's projective and interpretive imagination.

More concretely stated, at each moment in a story there is uncertainty about how dynamics will play out (will the hero think of a way out of their dilemma?) as well as uncertainty about the hidden state of the present (is the mysterious mentor good or evil?). Each world in the superposition not only exerts an independent effect on the reader's imagination but interacts with counterfactuals (the hero is aware of the uncertainty of their mentor's moral alignment, and this influences their actions). The reader simulates the minds of the characters and experiences the multiverses evoked by the story.

3.2. Writing as a multiversal act

A writer may have a predetermined interpretation and future in mind or may write as a means of exploring the interpretative and/or dynamic multiverse. Regardless, a writer must be aware of the multiplicity which defines the readers' and characters' subjective experiences as the shaper of the meaning and dynamics of the work. The writer thus seeks to simulate and manipulate that multiplicity.

We propose that generative language models in their multiversal modality can serve as an augmentation to and be augmented by the writer's inherently multiversal imagination.
3.3. Writing multiverses

So far we've implicitly assumed that, despite the multiversal forces at work, the writer's objective is to eventually compose a single history. However, language models naturally encourage writing explicitly multiversal works.

In the same way that hypertext transcended the limitations of the linear order in which physical books are read, exciting a surge of multiversal fiction [5], language models introduce new possibilities for writing nonlinear narratives. After all, it's only a small leap from incorporating multiverses in the brainstorming process to including them in the narrative. Counterfactual branches often occur in traditional fiction in the form of imaginary constructs, and our minds are naturally drawn to their infinite possibility [6].

4. Interfaces

We propose the creation of new tools to allow writers to work alongside language models to explore and be inspired by the multiverses already hiding in their writing.

Research into hypertext writing tools has been ongoing for more than two decades and has produced notable tools like Storyspace [7]. However, the issue of hypertext interfaces assisted by language models is a newer development, as only very recently have language models become advanced enough to be useful in the writing process [8]. Likewise, there has been significant research into interactive narratives, including in branching, multiversal settings [9, 10], but never one in which the human and the language assistant can act together as such high-bandwidth partners.

As has been shown in past hypertext interface design studies [11], the primary concern in the creation of an interface for writing a multiverse story is the massive amount of information that could be shown to the writer. If intuitive user experience is not central to the design of the program, this information will feel overwhelming and functionally prevent the user from leveraging the power offered by multiverse access at all.

An effective multiversal interface should allow the writer, with the aid of a generative language model, to expose, explore, and exploit the interpretational and dynamic multiplicity of a passage. Not only will such a tool allow the user to explore the ways in which a scenario might play out, it will also expose previously unnoticed ambiguities in the text (and their consequences).

Depending on the design of the interface and the way the user approaches it, many different human-AI collaborative workflows are possible. Ideally, the interface should give the user a sense of creative superpowers, providing endless inspiration combined with executive control over the narrative, as well as allowing and encouraging the user to intervene to any degree.

4.1. Progress so far

Over the past several months, we have prototyped and tested several iterations of multiversal writing tools using GPT-3 as the generation function.

The demand for a multiversal writing application grew from use of GPT-3 as a more standard linear writing assistant. It became increasingly clear, as users sought greater interaction bandwidth and more efficient ways to structure and leverage the model's ideas, that an interface which organizes the model's outputs in a branching tree would be more effective.

The early results we have seen leave no doubt about the power of language models as writing assistants. Our small cohort of five beta users have, alongside GPT-3, co-written linear and nonlinear stories spanning the equivalent of thousands of pages, of often astonishing ingenuity and beauty and surprisingly long-range coherence. Three users have reported a sense of previously unimagined creative freedom and expressive power.

However, it has also become evident that much more research and development is necessary. In our beta tests, we've found that flaws in interface design can easily overwhelm the user or damage their feeling of ownership over the work produced. Below we will share some of our findings, which represent only the first step in creating a true interface between the creative mind and the machine.

4.2. Multiple visualizations

We have found that a visual representation of the branching structure of the narrative helps users conceptualize and navigate fractal narratives. This view (called visualize) displays the flow of pasts and futures surrounding each node (Figure 3), and zooming out displays the global structure of the multiverse (Figure 4). The visualize view allows users to expand and collapse nodes and subtrees, as well as "hoist" any node so that it acts as the root of the tree. Altering the topology of the tree (e.g. reassigning children to different parents, splitting and merging nodes) is more intuitive for users in the visualize view than in the linear view.

In addition to tree-based multiverse visualization, the read view displays the text of a node and its ancestry in a single-history format (Figure 5).

Figure 3: Visualize view

Figure 4: Zoomed-out visualization of a nonlinear story

Figure 5: Read view
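As a rough illustration of the data model these views imply, here is a minimal sketch of a story node supporting the read view's single-history rendering and the chapter indexing described in the next subsection. This is our guess at a representative structure under the paper's description, not the authors' actual implementation.

```python
# Minimal sketch of a multiverse story node; illustrative only,
# assuming a simple parent/child tree as described in §4.2 and §4.3.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class StoryNode:
    text: str
    parent: Optional["StoryNode"] = None
    children: List["StoryNode"] = field(default_factory=list)
    chapter: Optional[str] = None   # set only on chapter-root nodes

    def ancestry(self) -> List["StoryNode"]:
        """Path from the root of the tree down to this node."""
        path, node = [], self
        while node is not None:
            path.append(node)
            node = node.parent
        return list(reversed(path))

    def read_view(self) -> str:
        """Single-history text: concatenate this node's ancestry."""
        return "".join(n.text for n in self.ancestry())

    def chapter_of(self) -> Optional[str]:
        """Chapter of the closest ancestor that is a chapter root,
        so chapters naturally take the shape of subtrees (§4.3)."""
        for node in reversed(self.ancestry()):
            if node.chapter is not None:
                return node.chapter
        return None
```

Under this structure, "hoisting" a node amounts to re-rooting the displayed tree at that node, with no change to the underlying data.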
4.3. Multiverse navigation

With a generative language model, story multiverses can quickly become too large to navigate through node connections alone. To assist navigation, we have implemented the following features:

• Search: all text, text in a subtree, and/or text in a node's ancestry

• Indexing by chapters: Chapters are assigned to individual nodes, and all nodes belong to the chapter of the closest ancestor that is the root of a chapter. As a consequence, chapters have the shape of subtrees.

• Bookmarks and tags: Bookmarks create a named pointer to a node without enforcing chapter membership. Tags are similar to bookmarks, but can be applied to multiple nodes.

4.4. Adaptive branching

A naive way to automatically generate a multiverse using a language model might involve branching every fixed n tokens. However, this is not the most meaningful way to branch in a story. In some situations, there is essentially one correct answer for what a language model should output next. In such a case, the language model will assign a very high confidence (often >99%) to the top token, and branching at this point would introduce incoherent continuations. Conversely, when the language model distributes transition probabilities over multiple tokens, branching is more likely to uncover a rich diversity of coherent continuations.

One algorithm to branch dynamically is to sample distinct tokens until a cumulative probability threshold is met. Adaptive branching allows visualization of the dynamics of the multiverse: stretches of relative determinism alternating with divergent junctures (Figure 6).

Figure 6: A subtree generated with adaptive branching
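The cumulative-probability rule could look like the sketch below: take distinct tokens in order of probability until their combined mass crosses a threshold, and branch only when more than one token survives. The threshold value and the `next_token_distribution` helper are assumptions carried over from the earlier sketch, not specifics from the paper.

```python
# Sketch of adaptive branching: collect distinct tokens in descending
# probability order until a cumulative threshold is met. One surviving
# token means a deterministic stretch (don't branch); several surviving
# tokens mean a divergent juncture worth branching on.

def branch_tokens(prompt, next_token_distribution, threshold=0.9):
    """Return the distinct tokens to branch on at this position."""
    tokens, probs = next_token_distribution(prompt)
    ranked = sorted(zip(tokens, probs), key=lambda tp: tp[1], reverse=True)
    chosen, cumulative = [], 0.0
    for token, p in ranked:
        chosen.append(token)
        cumulative += p
        if cumulative >= threshold:
            break
    return chosen
```

When the model is nearly certain (a top token above the threshold, e.g. >99% confident), the function returns a single token and the branch stays linear; when probability is spread out, it returns several tokens and the tree forks.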
4.5. Reciprocal workflow

Humans retain an advantage over current language models in our ability to edit writing and perform topological modifications on the multiverse, such as merging interesting aspects of two separate branches into one. The interface should ideally allow the human to perform all desired operations with maximal ease. Because GPT-3 is so capable of producing high-quality text, some interface designs make it feasible for the human to cultivate coherent and interesting passages through curation alone. We have found that an interface which makes it easy to generate continuations but relatively difficult to modify the content and topology of the resulting multiverse encourages a passive workflow, where the user relies almost exclusively on the language model's outputs and the branching topology determined by the process of generation.

While such a passive mode can be fun, resembling an open-ended text adventure game, as well as useful for efficiently exploring counterfactuals, the goal of a writing interface is to facilitate two-way interaction: the outputs of the language model should augment and inspire the user's imagination and vice versa. Thus, we are developing features to encourage meaningful and unrestrained human contribution, such as:

• Easy ways to edit, move text, and change tree topology

• Support for nonstandard topologies that are not automatically generated by language models and require human arrangement, such as cycles and multiple parents (§4.7)

• Floating notes to allow saving passages and ideas independent from the tree structure (§4.6)

• Fine-grained control over language model memory (§4.8)

• Interactive writing tools that offer influence over the narrative in ways other than direct intervention (§4.9)

• Program modes which encourage manual synthesis of content from multiverse exploration into a single history, for instance by distinguishing between exploratory and canonical branches

4.6. Floating notes

Floating notes are text files which, rather than being associated with a particular node, are accessible either globally or anywhere in a subtree. We decided to implement this feature because users would often have a separate text file open in order to copy and paste interesting outputs and keep notes without being constrained by the tree structure. Floating notes make it easier for the user to exert greater agency over the narrative.

4.7. Nonstandard topologies

The interface supports nodes with multiple parents and allows cyclic graphs (Figure 7). Opportunities to arrange convergent and cyclic topologies, which do not occur if the language model is used passively, encourage human cowriters to play a more active role, for instance in arranging for separate branches to converge to a single outcome. Multiversal stories naturally invite plots about time travel and weaving timelines, and we have found this feature to unlock many creative possibilities.

Figure 7: Nodes can have multiple parents, allowing for cyclic story components

4.8. Memory management

GPT-3 has a limited context window, which might seem to imply limited usefulness for composing longform works like novels, but our users have found that long-range coherence is surprisingly easy to maintain. Often, the broad unseen past events of the narrative are contained in the interpretational multiplicity of the present and thus exposed through generations, and consistent narratives are easily achieved through curation. In order to reference past information more specifically, often all that is needed is minimal external suggestion, introduced either by the author-curator or by a built-in memory system. We are developing such a system, which automatically saves and indexes story information from which memory can be keyed based on narrative content.
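The paper does not specify how memory entries are keyed, so the sketch below is only one plausible reading of "keyed based on narrative content": saved snippets are scored by naive lexical overlap with the current passage and the best matches are prepended to the prompt. All names here are hypothetical stand-ins for whatever indexing the authors' system uses.

```python
# Rough, speculative sketch of content-keyed memory: recall the saved
# story snippets most similar to the current context and prepend them
# to the prompt so the model can reference past events.

def score(entry: str, context: str) -> int:
    """Naive relevance score: count of words shared between an entry
    and the current narrative context."""
    return len(set(entry.lower().split()) & set(context.lower().split()))

def recall(memory: list, context: str, k: int = 3) -> list:
    """Return the k saved entries most relevant to the current context."""
    return sorted(memory, key=lambda entry: score(entry, context), reverse=True)[:k]

def build_prompt(memory: list, context: str) -> str:
    """Prepend recalled memories as the 'minimal external suggestion'
    the section describes, then append the current passage."""
    recalled = "\n".join(recall(memory, context))
    return f"{recalled}\n\n{context}"
```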
4.9. Writing tools

Beyond direct continuations of the body of the story, a language model controlled by engineered prompts can contribute in an open-ended range of modalities. Sudowrite [12] has pioneered the use of GPT-3-powered functions that, for instance, generate sensory descriptions of a given object, or prompt for a twist ending given a story summary.

The ability to generate high-quality summaries has great utility for memory and as input to helper prompts, and forms an exciting direction for our future research. We are exploring summarization pipelines for GPT-3 that incorporate contextual information and examples of successful summarizations of similar content.

5. Conclusion

The problem of designing good interfaces for AI systems to interact with humans in novel ways will become increasingly important as the systems increase in capability. We can imagine a bifurcation in humankind's future: one path in which we are left behind once the machines we create exceed our natural capabilities, and another in which we are uplifted along with them. We hope that this paper can further inspire the HCI community to contribute to this exciting problem of building the infrastructure for our changing future.

Acknowledgments

We are grateful to Lav Varshney for his valuable discussions and helpful feedback and to Michael Ivanitskiy and John Balis for their feedback and help compiling this article. In addition we would like to thank Miles Brundage and OpenAI for providing access to GPT-3.

References

[1] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., Language models are few-shot learners, arXiv preprint arXiv:2005.14165 (2020).

[2] B. S. DeWitt, N. Graham, The Many Worlds Interpretation of Quantum Mechanics, volume 63, Princeton University Press, 2015.

[3] D. Deutsch, The Fabric of Reality, Penguin UK, 1998.

[4] P. A. Roth, The pasts, History and Theory 51 (2012) 313–339.

[5] K. Amaral, Hypertext and writing: An overview of the hypertext medium, 1995. Retrieved August 16, 2004.

[6] E. J. Aarseth, Nonlinearity and literary theory, Hyper/Text/Theory 52 (1994) 761–780.

[7] M. Bernstein, Storyspace 1, in: Proceedings of the Thirteenth ACM Conference on Hypertext and Hypermedia, 2002, pp. 172–181.

[8] L. Lagerkvist, M. Ghajargar, Multiverse: Exploring human machine learning interaction through cybertextual generative literature, in: 10th International Conference on the Internet of Things Companion, 2020, pp. 1–6.

[9] M. O. Riedl, R. M. Young, From linear story generation to branching story graphs, IEEE Computer Graphics and Applications 26 (2006) 23–31. doi:10.1109/MCG.2006.56.

[10] M. O. Riedl, V. Bulitko, Interactive narrative: An intelligent systems approach, AI Magazine 34 (2013) 67–67.

[11] S. Jordan, 'An infinitude of possible worlds': Towards a research method for hypertext fiction, New Writing 11 (2014) 324–334.

[12] A. Gupta, J. Yu, Sudowrite, 2021. URL: https://www.sudowrite.com/.