The Dynamic Creativity of Proto-artifacts in Generative Computational Co-creation

Juan Salamanca³, Ph.D., Daniel Gómez-Marín¹,², Ph.D. and Sergi Jordà¹, Ph.D.

¹ Universitat Pompeu Fabra, Music Technology Group. Barcelona, Spain
² Universidad ICESI, Facultad de Ingeniería. Cali, Colombia
³ University of Illinois, School of Art and Design. Urbana-Champaign, United States

Joint Proceedings of the ACM IUI Workshops 2023, March 2023, Sydney, Australia
Contact: jsal@illinois.edu (J. Salamanca); daniel.gomez@upf.edu (D. Gómez-Marín); sergi.jorda@upf.edu (S. Jordà)

Abstract

This paper explores the attributes necessary to determine the creative merit of intermediate artifacts produced during a computational co-creative process (CCC) in which a human and an artificial intelligence system collaborate in the generative phase of a creative project. In an active listening experiment, subjects with diverse musical training (N=43) judged unfinished pieces composed by the New Electronic Assistant (NEA). The results revealed that a two-attribute definition based on the value and novelty of an artifact (e.g., Corazza's effectiveness and novelty) suffices to assess unfinished work leading to innovative products, instead of Boden's classic three-attribute definition of creativity (value, novelty, and surprise). These findings reduce the creativity metrics needed in CCC processes and simplify the evaluation of the numerous unfinished artifacts generated by computational creative assistants.

Keywords

Computational Co-Creativity, creativity assessment, dynamic creativity

1. Introduction

The increasing learning and predictive capabilities of computational agents consistently find new ways of participating in the arts, design, and humanities, such as in architecture [1], creative writing [2], music composition [3], and video game design [4], speeding the generation of alternatives, suggesting concepts, generating variants, or automating arrangements. Computational co-creativity (CCC) is the field within the domain of computational creativity (CC) that deals with the collaborative process between humans and computational agents aiming at producing creative artifacts. Such a collaborative open process is often modeled in terms of sequences of subprocesses, rather than automated generative pipelines [5, 6], and many such models characterize it as iterations over a generative phase and an evaluative phase [7].

As this type of human-agent collaborative partnership takes hold, the interests of CCC researchers have shifted from studying the creative quality of a system's output (the classic CC approach) to examining how creativity evolves during the ongoing stream of co-creative subprocesses, raising new questions such as: what dimensions should be used to assess a creative process, how should they be interpreted, and what information do these dimensions provide about the creative quality of intermediate products, named proto-artifacts. Traces of these dimensions can be found in practitioners' accounts of their experiences with CCC in the arts, spontaneously alluding to concepts traditionally discussed in the creativity literature, such as surprise: "working with an AI is not dissimilar as working with a human being, given its capacity to surprise you, because that is where the art comes in. That is where the magic comes in, in any kind of performance or working with anybody or anything. [Once surprise comes in] you are able to intervene in what has been generated" [8].
We argue that the proto-artifacts in human-agent co-creation processes are essential factors of CCC not accounted for in traditional CC. Thus, we propose the adoption of Corazza's definition of dynamic creativity, as it addresses both the process and the potential of proto-artifacts in hybrid collaborative structures. This paper explores the validity of using a two-dimensional definition of creativity (based on originality and effectiveness) [9, 10], instead of Boden's three-dimensional one (based on value, novelty and surprise) [11], to assess proto-artifacts generated by computational agents.

The method used is a subject-based empirical evaluation of a musical CCC system that exemplifies the generic CCC process observed in creative disciplines. Subjects with different levels of musical training were integrated into the co-creative workflow of producing an album and appraised the creative quality of intermediate pieces (proto-artifacts). The experimental method takes into account CCC practitioners' reflections on the value of creating with AI systems and relates Boden's [11] and Corazza's [10] attributes of creativity. The following sections briefly introduce the field of CCC assessment, describe the apparatus used to generate musical pieces, discuss the experiment, and present the results. The paper concludes with reflections about why a two-dimensional assessment of creativity might suffice to characterize CCC processes, as well as its applications and limitations.

2. Computational Co-Creativity

CCC encompasses creative processes where two or more participants actively collaborate, and at least one of them is a computational agent [12]. While humans adjust the parameters of conceptual spaces, computational agents effectively explore such conceptual spaces, reflecting the human ability to select and refine the best ideas [13]. As a result, humans and agents make creative, mutually influential contributions to an artefact [14]. This process coincides with the TOTE behavioral model (test, operate, test, exit), which accounts for how humans execute plans to pursue goals, recursively evaluating the incongruity between the state of machine-generated intermediate artifacts and the intended goal [15]. An interesting observation is that positive affect is often a cue that a person is moving toward the goal, while negative affect signals the opposite [16]. The open-ended nature of a CCC process entails that such evaluations are carried out by a human estimator who tries to maximize the chances of achieving something valuable with no certainty of success [10]. In this paper, we are interested in the evaluative dynamics that determine how creative proto-artifacts are in a CCC process.
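To make this iterative structure concrete, the following is a minimal sketch of a TOTE-style generate/evaluate loop. It is an illustration of the model described above, not NEA's implementation: `generate_variant` and `estimate` are hypothetical callables standing in for an agent's generative step and an estimator's appraisal of a proto-artifact.

```python
# A minimal sketch of the TOTE-style generate/evaluate loop described above.
# `generate_variant` and `estimate` are hypothetical callables standing in for
# an agent's generative step and an estimator's appraisal of a proto-artifact.

def tote_loop(seed, generate_variant, estimate, goal_score, max_iterations=50):
    """Test-Operate-Test-Exit: operate on the proto-artifact until the
    estimator's appraisal meets the intended goal, or the budget runs out."""
    proto_artifact = seed
    for _ in range(max_iterations):
        if estimate(proto_artifact) >= goal_score:   # Test: congruent with goal?
            return proto_artifact                    # Exit
        proto_artifact = generate_variant(proto_artifact)  # Operate
    return proto_artifact  # no certainty of success: return the last attempt
```

Note that in this reading the exit condition belongs to the estimator, not the generator: the loop ends only when an appraisal, not the production step, declares the incongruity with the goal resolved.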
In traditional CC, generative processes are commonly described as iterative structures [17, 18, 19, 20, 21, 22] and can be roughly characterized as two primary subprocesses: producing artifacts and judging them. However, it is not clear how or what to measure in each subprocess, even more so when they entail collaborative work in the arts and humanities [23]. Such indetermination has been a known challenge for CC evaluation frameworks such as FACE [6] and SPECS [5], because subjects fail to understand the concepts of creativity used in the assessment, or because the assessment relies on subjects' knowledge of the inner workings of the systems being evaluated [7]. Nowadays, the question of how to evaluate computational creativity remains open and continues to mature as technical and social applications of systems evolve.

2.1. Dynamic Assessment of Computational Co-Creativity

A key concern of CCC assessment is the definition of a framework to judge uncompleted work leading to a creative product. Corazza's definition of dynamic creativity is a good starting point: "[c]reativity requires potential originality and effectiveness" [10, p. 262]. The word 'potential' conveys the openness of the co-creative process and the latent merit of its intermediate products. A dynamic creative process has a mutable focus, incorporates the intermediate assessment of proto-artifacts, and provides feedback in response to contextual conditions. Corazza suggests that assessing a collaborative creative process implies a dynamic evaluation of an agent's production of unfinished artifacts by another agent that takes the role of an estimator. The intermediate outcome of the process is filtered as estimators foresee the consequences of adopting or rejecting such proto-artifacts. Furthermore, Corazza suggests the provocative idea that "[d]iscrepancies between [multiple] estimators' assessments are a sign of potentially disruptive novelties, generating the necessary energy for transformation of a domain" [10, p. 265].
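Corazza's discrepancy idea lends itself to a simple operationalization: treat wide disagreement among estimators as the signal. The sketch below does exactly that; the 1.5 spread threshold on a 1-6 rating scale is our illustrative assumption, not Corazza's.

```python
import statistics

# Illustrative operationalization of Corazza's discrepancy idea: wide
# disagreement among estimators signals a potentially disruptive novelty.
# The 1.5 threshold on a 1-6 rating scale is an assumption for illustration.

def potentially_disruptive(estimator_scores: list[float],
                           threshold: float = 1.5) -> bool:
    """True when the spread of estimators' scores exceeds the threshold."""
    return statistics.stdev(estimator_scores) > threshold

print(potentially_disruptive([2, 6, 3]))  # True: estimators disagree strongly
print(potentially_disruptive([4, 4, 5]))  # False: consensual appraisal
```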
In CC and CCC, estimators (the humans assessing proto-artifacts) are usually required to use two-attribute or three-attribute definitions of creativity as inputs to assess and curate the results of a co-creative process. To mention some of the most prevalent definitions, the standard definition of creativity [9, 24] and the dynamic definition of creativity [10] use the originality and effectiveness attributes. Boden defines creativity as the capacity to obtain surprising, valuable, and novel ideas [25]. Her definition has been further expanded and formalized mathematically [e.g. 26], remaining the prevalent model in the CC literature as it applies to humans or generative devices without preference for either. The following definitions intend to clarify some of these attributes and how they overlap, but are not an exhaustive account of the literature.

Originality. The originality of an artifact accounts for its authenticity in terms of its independence from precedent realizations, or for the ingenious repositioning of an existing work in a new domain, as does ready-made art. Originality is closely related to novelty because it derives its appraisal from being the first to occupy an unclaimed space in a domain or being the first exemplar of a new domain.

Effectiveness. This term describes the ability to produce a result. It is used in the standard definition of creativity as a criterion for eliminating trivial instances that may qualify as original or novel. Some definitions of creativity use usefulness, fit, or appropriateness to convey the same meaning. For Runco and Jaeger [9], it also takes the form of value when creative pieces are appreciated in a market. Corazza uses effectiveness and value interchangeably to convey meaningfulness.

Novelty. The novelty of an artifact can be fully appreciated by domain experts, as they know a vast portion of the cognitive space of the domain. Therefore, they are responsible for identifying whether an artifact extends the domain's boundaries or, even more, transforms the domain itself. For Boden, novelty has two meanings: when something is new to its creator (psychological creativity), and when something comes to life for the first time in human history (historical creativity) [25]. Thus, historical creativity corresponds to originality.

The concept of novelty has been approached scientifically in the AI and technology literature, perhaps more than others such as value and surprise [27]. In CC, the idea of domain knowledge is implicit in the selection of a training set. That is, novelty is often measured against the training corpus of the AI system, a convenient method for self-assessment of the quality of the generated artifacts (see the sketch at the end of this subsection).

Surprise. Assessing surprise during a CCC process gives clues to the level of fulfillment or intensity of the creative process experienced by a subject. This attribute has been used to measure creativity in finished artifacts (e.g., [28]), as a synonym of non-obviousness in patent evaluation [20], and as a proxy of the quality of the CCC process from a practitioner's perspective. Although the subjective nature of surprise could be a shortcoming when judging a finished artifact, it might serve to assess the performance of a computational agent's creativity, especially when the estimator is not a domain expert. Boden specifies three causes of surprise: when unlikely things happen, when unexpected ideas fit known concepts, and when new ideas break the boundaries of established conceptual spaces.

Value. Creativity researchers generally agree that domain experts classify an artifact as creative when they positively evaluate its value within a field and its subsequent domain. Boden argues that the concept of value, unlike novelty, is elusive [25]. The usefulness of value in defining creativity is not straightforward, as social judgments of value change over time, as in the case of artworks that prove to be valuable years after they were first presented, or artists who are considered creative after they have died [29]. Both value and novelty are subject to the scrutiny of appraisers, who look for virtue in the former dimension and originality in the latter. The value of a potentially creative artifact cannot be measured until it is deployed and evaluated by estimators. Only domain experts or gatekeepers ultimately judge the value of an artifact [19, 30].

While novelty and surprise are subjects of the cognitive sciences and neuroscience, value is a social construction [31, 32]. Corazza claims that "It is, however, evident that novelty and surprise are not disjointed dimensions, because if an item is expected, both surprise and conceptual novelty are denied." [10, p. 259]. Moreover, recent research claims that novelty and surprise are human reactions, independent but close to each other, that can be observed and dissociated in electroencephalogram signals. It is suggested that "humans use surprise as a signal to decide when to adapt their behavior, while they use novelty to decide where and what to explore—to eventually develop an improved world-model [...] novelty is more related to memory-recall and surprise is more related to predictions." [33, p. 1].

In summary, Corazza and Boden propose attribute assessment frameworks with different numbers of dimensions. Originality and novelty are overlapping concepts, especially in relation to Boden's historical creativity. Effectiveness and value refer to the creative purpose from two different viewpoints: for Corazza, effectiveness is defined in terms of practical applications, while Boden sees it as a social construction. Finally, Corazza acknowledges that they are separate concepts, but argues that surprise and novelty could be part of a mental process of joint feeling-appreciation. None of these attributes can be measured in absolute terms, as they are sensitive to how, when, and by whom the assessment is made. To the best of our knowledge, these frameworks have not been evaluated in practice, and this study attempts to shed light on the interplay between them, specifically in the assessment of CCC processes. The following sections elaborate and operationalize a generative musical assistant and assess the co-creative process, judging the value, novelty and surprise of proto-artifacts in a musical setting.
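As a closing illustration of the corpus-based novelty measure mentioned under Novelty above, the sketch below scores a generated fragment by its distance to the nearest item in the system's training set. The 12-dimensional feature vectors (e.g., pitch-class histograms of a melody) are an assumed representation for illustration, not NEA's actual encoding.

```python
import numpy as np

# Novelty scored against the training corpus: the distance from a generated
# fragment to its nearest neighbor in the corpus. The 12-dimensional vectors
# (e.g., pitch-class histograms) are an assumed feature representation.

def corpus_novelty(fragment: np.ndarray, corpus: np.ndarray) -> float:
    """Euclidean distance to the nearest training example; 0.0 means the
    fragment duplicates a corpus item, larger values mean more novel."""
    return float(np.linalg.norm(corpus - fragment, axis=1).min())

corpus = np.random.rand(100, 12)   # stand-in for 100 encoded training melodies
fragment = np.random.rand(12)      # stand-in for one generated fragment
print(f"novelty: {corpus_novelty(fragment, corpus):.3f}")
```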
3. Computational Co-Creation With The New Electronic Assistant (NEA)

We use the domain of music to exemplify a CCC context in which a computational agent generates real artifacts to be evaluated by human subjects. The complete experiment is presented in Section 4, while this section describes the CCC generative system and its inner workings.

The New Electronic Assistant (NEA) is a music system capable of analyzing a musical style from a symbolic corpus and generating short musical fragments in such style (melody and chord accompaniment). Once a melody and its accompaniment are generated, NEA allows a user to apply real-time transformations at four different levels: rhythmic, dynamic, pitch and density¹.

NEA is designed as a loop generator for music composition and performance. A single NEA instance is suited to complement pre-recorded or real-time performed material, or even to form an ensemble of multiple NEA instances. This latter configuration can be used to achieve rich polyphonic musical arrangements, especially when different NEAs generate complementary melodic styles (e.g., bass lines, main melodies, vocal melodies, etc.) and share information among them (i.e., the chord progression of a generated melody can be transferred to other instances, unifying the whole set of generated melodies). Therefore, provided the performer makes a correct selection of training styles and real-time settings, this parallel propagation of information allows for highly creative mixes of musical material with low effort.

¹ Basic functionality videos can be found at this link: https://youtube.com/playlist?list=PLD3SOdFCvDNkOBA9Gh3FTAncou-L6xLw1
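NEA's programming interface is not documented in this paper, so the following sketch is purely hypothetical: an invented `NEAInstance` class illustrating how a lead instance's chord progression could be propagated to an ensemble of complementary instances, as described above.

```python
# Hypothetical sketch only: NEA's real interface is not documented here, so
# the class and method names below are invented for illustration.

class NEAInstance:
    def __init__(self, style: str):
        self.style = style                 # e.g. "bass line", "vocal melody"
        self.chords: list[str] = []

    def generate_loop(self) -> None:
        # Placeholder for NEA's corpus-based melody and chord generation.
        self.chords = ["Am", "F", "C", "G"]

lead = NEAInstance("main melody")
lead.generate_loop()

# Share the lead's chord progression with the rest of the ensemble so all
# generated loops stay harmonically unified, as described above.
ensemble = [NEAInstance("bass line"), NEAInstance("vocal melody")]
for nea in ensemble:
    nea.chords = lead.chords
```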
The nature of recent generative interactive systems such as NEA suggests a shift in the traditional workflow of composition and production. Traditionally, a music performance process has required precise embodied coordination in a constant listening/performing cognitive loop, where music is listened to by a performer as she concurrently plays the exact movements on a gesture-to-note instrument contributing to the composition. But interacting with a generative system such as NEA requires a different set of skills. Clicking on a graphical interface of knobs and sliders replaces the motor skills needed to execute the instrument. The user of NEA seeks to generate a variety of unfinished new melodies in real time and judge their potential to become something greater until the "right one" is supplied and then transformed. The classic procedure of playing note-by-note on an instrument is replaced by fast critical filtering and real-time parameter tweaking to obtain high-level transformations of intermediate artifacts. There is a shift from motor reaction to fast acoustic discrimination: agile, coordinated embodiment gives way to collected, prospective assessment.

To fulfill the purpose of NEA as a musical generative system, it uses synthesizers and a mixer to convert notes to sound so that the music is perceived. The synthesizers stand out for their ability to model sound in flexible and resourceful ways, especially the sound property known as timbre. Timbre is the character or identity of the source reproducing the notes (i.e., the timbre of a guitar is different than that of a trumpet even though both play the same notes). The mixer, on the other hand, allows control of the intensity of a sound from mute to very loud. At both the synthesizing and mixing stages, two complementary systems have been developed allowing for easy prototyping of new material using multiple instances of NEA. Describing these systems is beyond the scope of this paper, but in general terms they complete the experience of a NEA user seeking to automate certain aspects of real-time music creation.
4. Experiment: Evaluating Creativity In A Computational Co-Creative (CCC) Process

As presented in Section 2, a creative activity can be roughly simplified as a two-stage process: generation of multiple ideas and refinement of the best ones. In typical CC experiments, humans or computational agents measure the result of the process, reflecting on how creative the generated output is. Instead, this study is interested in how creative the proto-artifacts produced during the idea-generation phase are. To that aim, an active listening experiment was designed to examine to what extent subjects apprehend the concepts of novelty, surprise, and value and apply them to assess CCC-generated proto-artifacts.

Subjects with diverse musical training levels were recruited and contextualized as participants in the production of a musical album. Their task was to listen to musical stimuli, score them, and decide which ones progress to the next iteration of the creative process. The stimuli were a sample of autonomously generated musical pieces produced by a multi-NEA system.

4.1. Method

Materials. Eighteen iterations of the multi-NEA system yielded a constant stream of evolving electronic music pieces scattered throughout the conceptual space of ambient music. One-minute fragments were selected from each of them and used as stimuli. In addition, two randomly selected control stimuli were duplicated to evaluate the consistency of subjects' responses. Specifically, stimuli 1 and 8 are the same, as are 11 and 19. In total, twenty stimuli were arranged in four different sequences to prevent order biases; each participant listened to one sequence (see the sketch at the end of this subsection). The multi-NEA system used to generate the pieces was trained with classical and pop music melodic styles, while timbre and structure were managed by the systems briefly explained in Section 3.

Participants. The experiment had 43 participants: 37.2% (16) identified as female, 46.5% (20) identified as male, and 16.3% (7) did not declare their gender. Their musical training is homogeneously distributed, ranging from amateur music producers and performers to professional musicians. The average musical training is 3.3 (sd = 1.7) on a scale of 1 to 6, with 1 being no training and 6 being a professional musician. Among participants, 81.4% (41) have a medium to high knowledge of electronic music, and only 4.6% (2) reported no knowledge of electronic music.

Procedure. Participants were primed with the following script: For this listening session you are going to play the role of a music producer who is part of a creative group working on a new album. Your task is to listen to several pieces and assess each so that it continues in the production process or not. The music production process will continue, but the essence of the piece will remain close to what you are listening to. Before starting to listen to each of the pieces, the following text was presented: After listening carefully to this piece please answer: how surprising do you find it? How valuable is it to be published in the album? How novel does it seem to you? Do you have any comments on the piece? Participants answered the same questions for each of the twenty pieces. For the first three questions, a six-step Likert scale was offered with the following ranges: from "It is not surprising" to "It is completely surprising"; from "It is not valuable" to "This piece is very valuable and should be part of the album"; and from "It is not a novel piece" to "It is a revolutionary piece". To give a precise sense of the process, subjects were contextualized in an ongoing activity. They were unaware that the stimuli were made by a machine.
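The stimulus arrangement described under Materials can be sketched as follows. The actual control positions (1/8 and 11/19) and the four orderings used in the study are not reproduced here; the sampling and shuffling below are illustrative assumptions.

```python
import random

# Sketch of the stimulus arrangement: eighteen fragments plus two duplicated
# controls give twenty stimuli, arranged in four sequences to counterbalance
# order effects. The actual control positions and orders used in the study
# are not reproduced; sampling and shuffling here are illustrative.

fragments = [f"piece_{i:02d}" for i in range(1, 19)]   # 18 one-minute fragments
rng = random.Random(0)                                 # reproducible example
controls = rng.sample(fragments, k=2)                  # duplicated control stimuli
stimuli = fragments + controls                         # 20 stimuli in total

sequences = [rng.sample(stimuli, k=len(stimuli)) for _ in range(4)]

# Each participant hears exactly one of the four sequences.
participant_id = 7
playlist = sequences[participant_id % 4]
```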
High and low trained subjects have ary piece". To give a precise sense of the process, subjects significantly different value scores, while high and mid were contextualized in an on-going activity. They were trained groups have significantly different novelty scores unaware the stimuli were made by a machine. (see 2). The dispersion of novelty and surprise scores are very similar for all subject segments, but value scores are more dispersed than those of novelty and surprise in 4.2. Results high and mid trained subjects (sd= 1.679 vs 1.608, 1.607 Three subjects with high musical training consistently and sd= 1.242 vs 1.092, 1.081 respectively). In the case scored the same control stimuli with a difference of more of low-trained subjects the opposite effects is observed: than 3 points for all three attributes (value, surprise, and value scores are less disperse than those of novelty and novelty). Therefore, all their responses were discarded surprise (sd= 1.406 vs 1.449, 1.460). due to inconsistency.Figure 1 depicts the distribution of The effect of training in value scores has proven to responses. be statistically significant between mid and low training. Discernibility of Boden’s three dimensions. A The analysis reveals that the standard deviation of scores one-way analysis of variance (ANOVA) carried out to of highly trained subjects is significantly greater than the evaluate the difference between the three score sets re- ones of the rest of the subjects. veals a statistically significant difference between at least two groups (F(2) = 14.9, p < 0.005). A Tukey’s HSD Test for multiple comparisons found that the mean of value 5. Discussion scores (mean = 4.12) was significantly different than the The results from the statistical tests elucidate whether means of novelty and surprise scores (mean = 3.72, p = a three-attribute definition of creativity (value, surprise, 0.001, and mean = 3.86 p = 0.0018, respectively). However, and novelty) accounts for proto-artifacts creativity in the Table 1 Tukey HSD Post Hoc tests of significance for the difference between attribute ratings for three music training levels (low, mid, and high). Significant values marked with * Attribute pair value-novelty value-surprise novelty-surprise Training mean sd p mean sd p mean sd p high 4.24 1.68 0.00083* 3.61 1.63 0.01663* 3.76 1.62 0.64244 mid 4.34 1.23 0.00243* 3.99 1.07 0.08534 4.11 1.07 0.43779 low 3.87 1.44 0.0229137* 3.56 1.44 0.4016408 3.73 1.46 0.3700743 Figure 2: Novelty, surprise and value at three different musical training levels: high, mid and low (left, bottom, right respectively) generative phase of co-creative processes and how to in- process. terpret such metrics. While value scores are significantly Is potential creativity a two or three-attribute different from surprise and novelty scores, surprise and space? The empirical results obtained show that novelty novelty ones are close to each other. Further compara- and surprise responses are not statistically distinguish- tive analysis between pairs of scores reveals three clear able, suggesting that these attributes, although differ- insights: value stands out as a different concept from ent in meaning, have a joint appraisal in experimental novelty, novelty and surprise appear as non-discernible conditions. 
5. Discussion

The results from the statistical tests elucidate whether a three-attribute definition of creativity (value, surprise, and novelty) accounts for proto-artifacts' creativity in the generative phase of co-creative processes, and how to interpret such metrics. While value scores are significantly different from surprise and novelty scores, surprise and novelty scores are close to each other. Further comparative analysis between pairs of scores reveals three clear insights: value stands out as a concept distinct from novelty; novelty and surprise appear as non-discernible concepts²; and value and surprise appear discernible to highly trained subjects, but mid and low trained subjects have similar mental constructs for value and surprise. Consequently, one could cautiously argue that two-attribute models of creativity could suffice for expert estimators to assess unfinished artifacts during a co-creative process.

² This is compliant with [33], as they suggest surprise and novelty are cognitive processes that operate closely.

Is potential creativity a two- or three-attribute space? The empirical results obtained show that novelty and surprise responses are not statistically distinguishable, suggesting that these attributes, although different in meaning, have a joint appraisal in experimental conditions. This resonates with the two-dimensional co-creativity assessment models proposed by the standard definition of creativity and by Kantosalo et al. (value and novelty, plus the quality of user interaction) [35]. Consequently, in evaluating proto-artifacts during a co-creative process, a two-attribute model of value and originality could account for Boden's three-attribute model.

Reducing the dimension of the attribute space while preserving the assessment quality simplifies human or agent estimators' tasks. Such a reduction has practical implications in multiple real-life scenarios that require filtering large sets of artifacts created during CCC processes. Indeed, as CCC processes permeate human creative activities, the number and quality of potentially creative artifacts that need assessment will most likely grow exponentially, demanding effective and adequate metrics to carry out estimation tasks. To assess the creative potential of creating with such computational assistants, one would need to measure the creative quality of proto-artifacts produced during the idea generation phase.
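As a sketch of such a filtering task, the snippet below triages proto-artifacts with only the two attributes argued for here. The 1-6 scale echoes the experiment, but the acceptance threshold and the conjunctive rule are our illustrative assumptions, not a result of the study.

```python
from dataclasses import dataclass

# Sketch of a two-attribute triage of proto-artifacts. The 1-6 scale echoes
# the experiment; the threshold and conjunctive rule are assumptions made
# for illustration only.

@dataclass
class ProtoArtifact:
    name: str
    value: float        # estimator's value score (1-6)
    originality: float  # estimator's originality score (1-6)

def progresses(p: ProtoArtifact, floor: float = 4.0) -> bool:
    """A proto-artifact moves to the next iteration of the co-creative
    process when both attributes clear the minimum threshold."""
    return p.value >= floor and p.originality >= floor

batch = [ProtoArtifact("loop_a", 4.5, 5.0), ProtoArtifact("loop_b", 5.5, 2.0)]
kept = [p for p in batch if progresses(p)]   # keeps only loop_a
```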
However, one could argue that the observed proximity between the novelty and surprise concepts can result from the experimental conditions. On the relation between experiencing surprise and rating novelty, Xu et al. explain how "humans use surprise as a signal to decide when to adapt their behavior, while they use novelty to decide where and what to explore—to eventually develop an improved world-model." [33, p. 1]. This idea suggests that both attributes are used in conjunction to adjust expectations dynamically; they operate independently yet contribute to broader cognitive processing. It is necessary to investigate whether the closeness of these concepts stems from the unfinished nature of the stimuli, which confounds their subjective assessment, or from the training level of the estimators participating in the study.

The effect of domain training in assessing proto-artifacts. There is a plausible effect of domain knowledge on the scores of the three creativity attributes. The higher the training, the greater the significance of the differences between value and surprise, and between value and novelty (see Table 1, columns 4 and 7). But the inverse effect is observed between novelty and surprise: the higher the training, the lower the significance of the difference between novelty and surprise (see Table 1, column 10). This evidence shows that as training becomes more specialized, subjects are more confident gauging value, yet they learn that not every valuable artifact is surprising. In particular, highly trained subjects encounter more pieces with extreme value scores than mid or low trained subjects (see Figure 2). That is, experts used the whole semantic range of the evaluation scale, while non-experts concentrated their scores around the second third. A triangulation of Tukey post hoc tests of effects for experts reinforces the claim that novelty and surprise are not discernible, while value is the only attribute with a statistically significant difference between high and low trained subjects. This suggests that training has a higher positive effect on the ability to appreciate value than on the ability to appreciate novelty or to experience surprise. In other words, domain expertise is especially expressed when assessing value and not so much when assessing novelty or surprise. A potential explanation is that training builds a more nuanced domain-specific cognition and reinforces the estimator's capacity to determine the value of unfinished artifacts in terms of their foreseen potential to evolve into more refined pieces or branch out into novel variations worth exploring.
6. Conclusions

This paper argues for the adoption of a dynamic framework to judge uncompleted work (deemed proto-artifacts) leading to a creative product in the context of computational co-creation (CCC) processes. Such an approach, derived from Corazza's dynamic definition of creativity, recognizes that artists engaged in computational co-creation not only estimate the creative merit of their work once the piece is finished, but also assess the creative potential of intermediate proto-artifacts at each iteration of the generative process. Intermediate assessments depict how a CCC process may unfold and put forward the potential anticipation of creative outcomes from the early stages. Hence, a suitable computational assistant should maximize the creative potential of the process, either by enhancing the human's generative capacity or by facilitating recurrent proto-artifact assessments.

The findings of an active listening experiment conducted to determine the creative quality of unfinished musical pieces generated by NEA (New Electronic Assistant) suggest that, in an experimental setting, subjects' appraisals of novelty and surprise are not discernible. Thus, a two-attribute definition of creativity could account for Boden's three-attribute definition. Even though novelty and surprise represent different creative attributes, originality could account for both of them, because novelty and surprise tend to blend in subjective assessments of creativity, while value is clearly differentiable, especially for domain experts.

For the time being, a two-dimensional creativity assessment of proto-artifacts is not invalidated, and it may simplify assessment procedures with subjects. We suggest using the dimensions of value and originality (rather than Corazza's effectiveness and originality). Value is preferred to effectiveness because it conveys meaningfulness in a variety of fields, including the arts, better than the functional notion of effectiveness. On the other hand, the responses of subjects with three levels of expertise in the domain studied showed that novelty and surprise are two different but coupled mental operations: the former is related to memory and the ability to forget, and the latter is related to the stability of short-term predictions. This suggests that the assessment of one could be a proxy for the other. For practical research purposes, it makes more sense to use fewer dimensions to conduct large-scale experiments, especially with lay subjects for whom these concepts generally remain fuzzy.

Finally, as AI permeates human creative activities of all sorts, the generation of proto-creative material flourishes. That is, an unavoidable by-product of assisted creativity is the proliferation of unfinished artifacts that must be assessed not only by humans but also by AI agents. Such an increase in potentially creative outcomes calls for the implementation of assertive assessment methods. The results presented here might prove useful for defining further methodologies for effective human and agent-based assessment of creative artifacts in CCC scenarios.

References

[1] W. Huang, H. Zheng, Architectural drawings recognition and generation through machine learning, in: Proceedings of the 38th Annual Conference of the Association for Computer Aided Design in Architecture (ACADIA), CumInCad, 2018, pp. 156–165. doi:10.52842/conf.acadia.2018.156.
[2] H. Osone, J.-L. Lu, Y. Ochiai, Buncho: AI supported story co-creation via unsupervised multitask learning to increase writers' creativity in Japanese, in: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, 2021, pp. 1–10.
[3] M. Avdeeff, Artificial intelligence & popular music: SKYGGE, Flow Machines, and the audio uncanny valley, Arts 8 (2019) 130. doi:10.3390/arts8040130.
[4] V. Volz, J. Schrum, J. Liu, S. M. Lucas, A. Smith, S. Risi, Evolving Mario levels in the latent space of a deep convolutional generative adversarial network, in: Proceedings of the Genetic and Evolutionary Computation Conference, 2018, pp. 221–228.
[5] A. Jordanous, A standardised procedure for evaluating creative systems: Computational creativity evaluation based on what it is to be creative, Cognitive Computation 4 (2012) 246–279.
[6] S. Colton, J. W. Charnley, A. Pease, Computational creativity theory: The FACE and IDEA descriptive models, in: ICCC, Mexico City, 2011, pp. 90–95.
[7] C. Lamb, D. G. Brown, C. L. Clarke, Evaluating computational creativity: An interdisciplinary tutorial, ACM Computing Surveys (CSUR) 51 (2018) 1–34.
[8] H. Herndon, M. Dryhurst, Latent visions, promptism and the future of AI art with Adverb [audio podcast episode], NPR, 2021. URL: https://interdependence.fm/episodes/latent-visions-promptism-and-the-future-of-ai-art-with-adverb.
[9] M. A. Runco, G. J. Jaeger, The standard definition of creativity, Creativity Research Journal 24 (2012) 92–96.
[10] G. E. Corazza, Potential originality and effectiveness: The dynamic definition of creativity, Creativity Research Journal 28 (2016) 258–267.
[11] M. A. Boden, Creativity and Art: Three Roads to Surprise, Oxford University Press, 2010.
[12] A. Jordanous, Four PPPPerspectives on computational creativity in theory and in practice, Connection Science 28 (2016) 194–216.
[13] T. Lubart, How can computers be partners in the creative process: Classification and commentary on the special issue, International Journal of Human-Computer Studies 63 (2005) 365–369.
[14] N. M. Davis, Human-computer co-creativity: Blending human and computational creativity, in: Ninth Artificial Intelligence and Interactive Digital Entertainment Conference, 2013, pp. 9–12.
[15] G. Miller, E. Galanter, K. Pribram, Plans and the Structure of Behavior, Martino Publishing, USA, 1960.
[16] C. Carver, M. Scheier, Attention and Self-Regulation: A Control-Theory Approach to Human Behavior, Springer-Verlag, New York, 1981.
[17] G. Wallas, The Art of Thought, volume 10, Harcourt, Brace, 1926.
[18] E. Sadler-Smith, Wallas' four-stage model of the creative process: More than meets the eye?, Creativity Research Journal 27 (2015) 342–352.
[19] M. Csikszentmihalyi, Flow and the Psychology of Discovery and Invention, HarperPerennial, New York, 1997.
[20] D. K. Simonton, Creativity and discovery as blind variation: Campbell's (1960) BVSR model after the half-century mark, Review of General Psychology 15 (2011) 158–174.
[21] T. B. Ward, S. M. Smith, R. A. Finke, Creative cognition, in: R. J. Sternberg (Ed.), Handbook of Creativity, Cambridge University Press, 1998, pp. 189–212. doi:10.1017/CBO9780511807916.012.
[22] T. Amabile, Componential Theory of Creativity, Harvard Business School, Boston, MA, 2011.
[23] L.-C. Yang, A. Lerch, On the evaluation of generative models in music, Neural Computing and Applications 32 (2020) 4773–4784.
[24] M. I. Stein, Creativity and culture, The Journal of Psychology 36 (1953) 311–322.
[25] M. A. Boden, The Creative Mind: Myths and Mechanisms, Routledge, 2004.
[26] G. A. Wiggins, A preliminary framework for description, analysis and comparison of creative systems, Knowledge-Based Systems 19 (2006) 449–458. doi:10.1016/j.knosys.2006.04.009.
[27] K. Grace, M. L. Maher, Expectation-based models of novelty for evaluating computational creativity, in: Computational Creativity, Springer, 2019, pp. 195–209.
[28] M. E. Q. Gonzalez, et al., Creativity: Surprise and abductive reasoning, Semiotica 2005 (2005) 325–342.
[29] R. W. Weisberg, On the usefulness of "value" in the definition of creativity, Creativity Research Journal 27 (2015) 111–124. doi:10.1080/10400419.2015.1030320.
[30] V. P. Glăveanu, Creativity as a sociocultural act, The Journal of Creative Behavior 49 (2015) 165–180.
[31] N. Heinich, A pragmatic redefinition of value(s): Toward a general model of valuation, Theory, Culture & Society 37 (2020) 75–94.
[32] J. Dewey, Theory of valuation, International Encyclopedia of Unified Science (1939).
[33] H. A. Xu, A. Modirshanechi, M. P. Lehmann, W. Gerstner, M. H. Herzog, Novelty is not surprise: Human exploratory and adaptive behavior in sequential decision-making, PLOS Computational Biology 17 (2021) e1009070.
[34] R. Maguire, P. Maguire, M. T. Keane, Making sense of surprise: An investigation of the factors influencing surprise judgments, Journal of Experimental Psychology: Learning, Memory, and Cognition 37 (2011) 176.
[35] A. Kantosalo, P. T. Ravikumar, K. Grace, T. Takala, Modalities, styles and strategies: An interaction framework for human-computer co-creativity, in: ICCC, 2020, pp. 57–64.