=Paper= {{Paper |id=Vol-3359/paper11 |storemode=property |title=The Dynamic Creativity of Proto-artifacts in Generative Computational Co-Creation |pdfUrl=https://ceur-ws.org/Vol-3359/paper11.pdf |volume=Vol-3359 |authors=Juan Salamanca,Daniel Gómez-Marin,Sergi Jordà |dblpUrl=https://dblp.org/rec/conf/iui/SalamancaGJ23 }} ==The Dynamic Creativity of Proto-artifacts in Generative Computational Co-Creation== https://ceur-ws.org/Vol-3359/paper11.pdf
The Dynamic Creativity of Proto-artifacts in Generative
Computational Co-creation
Juan Salamanca³, Ph.D., Daniel Gómez-Marín¹,², Ph.D. and Sergi Jordà¹, Ph.D.
¹ Universitat Pompeu Fabra, Music Technology Group. Barcelona, Spain
² Universidad ICESI, Facultad de Ingeniería. Cali, Colombia
³ University of Illinois, School of Art and Design. Urbana-Champaign, United States


Abstract
This paper explores the attributes necessary to determine the creative merit of intermediate artifacts produced during a computational co-creative process (CCC) in which a human and an artificial intelligence system collaborate in the generative phase of a creative project. In an active listening experiment, subjects with diverse musical training (N=43) judged unfinished pieces composed by the New Electronic Assistant (NEA). The results revealed that a two-attribute definition based on the value and novelty of an artifact (e.g., Corazza’s effectiveness and novelty) suffices to assess unfinished work leading to innovative products, instead of Boden’s classic three-attribute definition of creativity (value, novelty, and surprise). These findings reduce the creativity metrics needed in CCC processes and simplify the evaluation of the numerous unfinished artifacts generated by computational creative assistants.

Keywords
Computational Co-Creativity, creativity assessment, dynamic creativity



Joint Proceedings of the ACM IUI Workshops 2023, March 2023, Sydney, Australia
jsal@illinois.edu (J. Salamanca); daniel.gomez@upf.edu (D. Gómez-Marín); sergi.jorda@upf.edu (S. Jordà)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

The increasing learning and predictive capabilities of computational agents consistently find new ways of participating in the arts, design, and humanities, such as in architecture [1], creative writing [2], music composition [3], and video game design [4], speeding the generation of alternatives, suggesting concepts, generating variants, or automating arrangements. Computational co-creativity (CCC) is the field within the domain of computational creativity (CC) which deals with the collaborative process between humans and computational agents aiming at producing creative artifacts. Such a collaborative open process is often modeled in terms of sequences of subprocesses, rather than automated generative pipelines [5, 6], and many such models characterize it as iterations over a generative phase and an evaluative phase [7].

As this type of human-agent collaborative partnership takes hold, the interests of CCC researchers have shifted from studying the creative quality of a system’s output (the classic CC approach) to examining how creativity evolves during the ongoing stream of co-creative subprocesses, raising new questions such as: what dimensions should be used to assess a creative process, how should they be interpreted, and what information do these dimensions provide about the creative quality of intermediate products, named proto-artifacts? Traces of these dimensions can be found in practitioners’ accounts of their experiences with CCC in the arts, spontaneously alluding to concepts traditionally discussed in creativity literature such as surprise: “working with an AI is not dissimilar as working with a human being, given its capacity to surprise you, because that is where the art comes in. That is where the magic comes in, in any kind of performance or working with anybody or anything. [Once surprise comes in] you are able to intervene in what has been generated” [8].

We argue that the proto-artifacts in human-agent co-creation processes are essential factors of CCC not accounted for in traditional CC. Thus, we propose the adoption of Corazza’s definition of dynamic creativity as it addresses both the process and the potential of proto-artifacts in hybrid collaborative structures. This paper explores the validity of using a two-dimensional definition of creativity (based on originality and effectiveness) [9, 10], instead of Boden’s three-dimensional one (based on value, novelty and surprise) [11], to assess proto-artifacts generated by computational agents.

The method used is a subject-based empirical evaluation of a musical CCC system that exemplifies the generic CCC process observed in creative disciplines. Subjects with different levels of musical training were integrated into the co-creative workflow of producing an album and appraised the creative quality of intermediate pieces (proto-artifacts). The experimental method takes into account CCC practitioners’ reflections on the value of creating with AI systems and relates Boden’s [11] and Corazza’s [10] attributes of creativity. The following sections briefly introduce the field of CCC assessment, describe the apparatus used to generate musical pieces, discuss the experiment, and present the results. The paper concludes with reflections about why a two-dimensional assessment of creativity might suffice to characterize CCC processes, as well as its applications and limitations.

2. Computational Co-Creativity

CCC encompasses creative processes where two or more participants actively collaborate, and at least one of them is a computational agent [12]. While humans adjust the parameters of conceptual spaces, computational agents effectively explore such conceptual spaces, reflecting the human ability to select and refine the best ideas [13]. As a result, humans and agents make creative, mutually influential contributions to an artifact [14]. This process coincides with the TOTE behavioral model (test, operate, test, exit) that accounts for how humans execute plans to pursue goals, recursively evaluating the incongruity between the state of machine-generated intermediate artifacts and the intended goal [15]. An interesting observation is that a positive affect is often a cue that a person is moving toward the goal and negative affect signals the opposite [16]. The open-ended nature of a CCC process entails that such evaluations are carried out by a human estimator who tries to maximize the chances of achieving something valuable with no certainty of success [10]. In this paper, we are interested in such evaluative dynamics that determine how creative proto-artifacts are in a CCC process.

In traditional CC, generative processes are commonly described as iterative structures [17, 18, 19, 20, 21, 22] and can be roughly characterized as two primary subprocesses: producing artifacts and judging them. However, it is not clear how or what to measure in each subprocess, even more so if they entail collaborative work in the arts and humanities [23]. Such indeterminacy has been a known challenge for CC evaluation frameworks such as FACE [6] and SPECS [5] because subjects fail to understand the concepts of creativity used in the assessment or because the assessment relies on subjects’ knowledge of the inner workings of the systems being evaluated [7]. Nowadays, the question of how to evaluate computational creativity remains open and continues to mature as technical and social applications of systems evolve.

2.1. Dynamic Assessment of Computational Co-Creativity

A key concern of CCC assessment is the definition of a framework to judge uncompleted work leading to a creative product. Corazza’s definition of dynamic creativity is a good starting point: “[c]reativity requires potential originality and effectiveness” [10, p. 262]. The word ‘potential’ conveys the openness of the co-creative process and the latent merit of its intermediate products. A dynamic creative process has a mutable focus, incorporates the intermediate assessment of proto-artifacts, and provides feedback in response to contextual conditions. Corazza suggests that assessing a collaborative creative process implies a dynamic evaluation of an agent’s production of unfinished artifacts by another agent that takes the role of an estimator. The intermediate outcome of the process is filtered as estimators foresee the consequences of adopting or rejecting such proto-artifacts. Furthermore, Corazza suggests the provocative idea that “[d]iscrepancies between [multiple] estimators’ assessments are a sign of potentially disruptive novelties, generating the necessary energy for transformation of a domain” [10, p. 265].

In CC and CCC, estimators (the humans assessing proto-artifacts) are usually required to use two-attribute or three-attribute definitions of creativity as inputs to assess and curate the results of a co-creative process. To mention some of the most prevalent definitions, the standard definition of creativity [9, 24] and the dynamic definition of creativity [10] use originality and effectiveness attributes. Boden defines creativity as the capacity to obtain surprising, valuable, and novel ideas [25]. Her definition has been further expanded and formalized mathematically [e.g. 26], remaining the prevalent model in CC literature as it applies to humans or generative devices without preference for either. The following definitions intend to clarify some of these attributes and how they overlap, but are not an exhaustive account of the literature.

Originality. The originality of an artifact accounts for its authenticity in terms of the independence from precedent realizations, or for the ingenious repositioning of an existing work in a new domain, as does ready-made art. Originality is closely related to novelty because it derives its appraisal from being the first to occupy an unclaimed space in a domain or being the first exemplar of a new domain.

Effectiveness. This term describes the ability to produce a result. It is used in the standard definition of creativity as a criterion for eliminating trivial instances that may qualify as original or novel. Some definitions of creativity use usefulness, fit, or appropriateness to convey the same meaning. For Runco and Jaeger [9], it also takes the form of value when creative pieces are appreciated in a market. Corazza uses effectiveness and value interchangeably to convey meaningfulness.

Novelty. The novelty of an artifact can be fully appreciated by domain experts as they know a vast portion
of the cognitive space of the domain. Therefore, they are responsible for identifying whether an artifact extends the domain’s boundaries or, even more, transforms the domain itself. For Boden, novelty has two meanings: when something is new to its creator (psychological creativity), and when something comes to life for the first time in human history (historical creativity) [25]. Thus, historical creativity corresponds to originality.

The concept of novelty has been approached scientifically in AI and technology literature, perhaps more than others such as value and surprise [27]. In CC the idea of domain knowledge is implicit in the selection of a training set. That is, novelty is often measured against the training corpus of the AI system, a convenient method for self-assessment of the quality of the generated artifacts.

Surprise. Assessing surprise during a CCC process gives clues to the level of fulfillment or intensity of the creative process experienced by a subject. This attribute has been used to measure creativity in finished artifacts (e.g., [28]), as a synonym of non-obviousness in patent evaluation [20], and as a proxy of the quality of the CCC process from a practitioner’s perspective. Although the subjective nature of surprise could be a shortcoming when judging a finished artifact, it might serve to assess the performance of a computational agent’s creativity, especially when the estimator is not a domain expert. Boden specifies three causes of surprise: when unlikely things happen, when unexpected ideas fit known concepts, and when new ideas break the boundaries of established conceptual spaces.

Value. Creativity researchers generally agree that domain experts classify an artifact as creative when they positively evaluate its value within a field and its subsequent domain. Boden argues that the concept of value, unlike novelty, is elusive [25]. The usefulness of value in defining creativity is not straightforward, as social judgments of value change over time, as in the case of artworks that prove to be valuable years after they were first presented, or artists who are considered creative after they have died [29]. Both value and novelty are subject to the scrutiny of appraisers. They look for virtue in the former dimension, while they look for originality in the latter. The value of a potentially creative artifact cannot be measured until it is deployed and evaluated by estimators. Only domain experts or gatekeepers ultimately judge the value of an artifact [19, 30].

While novelty and surprise are subjects of the cognitive sciences and neuroscience, value is a social construction [31, 32]. Corazza claims that “It is, however, evident that novelty and surprise are not disjointed dimensions, because if an item is expected, both surprise and conceptual novelty are denied.” [10, p. 259]. Moreover, recent research claims that novelty and surprise are independent but closely related human reactions that can be observed and dissociated in electroencephalogram signals. It is suggested that “humans use surprise as a signal to decide when to adapt their behavior, while they use novelty to decide where and what to explore—to eventually develop an improved world-model [...] novelty is more related to memory-recall and surprise is more related to predictions.” [33, p. 1].

In summary, Corazza and Boden propose attribute assessment frameworks with different numbers of dimensions. Originality and novelty are overlapping concepts, especially in relation to Boden’s historical creativity. Effectiveness and value refer to the creative purpose from two different viewpoints: for Corazza, effectiveness is defined in terms of practical applications, while Boden sees it as a social construction. Finally, Corazza acknowledges they are separate concepts but argues that surprise and novelty could be part of a mental process of joint feeling-appreciation.

None of them can be measured in absolute terms as they are sensitive to how, when, and by whom the assessment is made. To the best of our knowledge these frameworks have not been evaluated in practice, and this study attempts to shed light on the interplay between them specifically in the assessment of CCC processes. The following sections elaborate and operationalize a generative musical assistant and assess the co-creative process judging the value, novelty and surprise of proto-artifacts in a musical setting.

3. Computational Co-Creation With The New Electronic Assistant (NEA)

We use the domain of music to exemplify a CCC context in which a computational agent generates real artifacts to be evaluated by human subjects. The complete experiment is presented in Section 4, while this section describes the CCC generative system and its inner workings.

The New Electronic Assistant (NEA) is a music system capable of analyzing a musical style from a symbolic corpus and generating short musical fragments in such style (melody and chord accompaniment). Once a melody and its accompaniment are generated, NEA allows a user to apply real-time transformations at four different levels: rhythmic, dynamic, pitch and density¹.

¹ Basic functionality videos can be found in this link: https://youtube.com/playlist?list=PLD3SOdFCvDNkOBA9Gh3FTAncou-L6xLw1

NEA is designed as a loop generator for music composition and performance. A single NEA instance is suited to complement pre-recorded or real-time performed material or even to form an ensemble of multiple NEA
instances. This latter configuration of instances can be used to achieve rich polyphonic musical arrangements, especially when different NEAs generate complementary melodic styles (e.g., bass lines, main melodies, vocal melodies, etc.) and share information among them (i.e., the chord progression of a generated melody can be transferred to other instances, unifying the whole set of generated melodies). Therefore, provided the performer makes a correct selection of training styles and real-time settings, this parallel propagation of information allows for highly creative mixes of musical material with low effort.

The nature of recent generative interactive systems such as NEA suggests a shift in the traditional workflow of composition and production. Traditionally, a music performance process has required precise embodied coordination in a constant listening/performing cognitive loop, where music is listened to by a performer as she concurrently plays the exact movements on a gesture-to-note instrument contributing to the composition. But interacting with a generative system such as NEA requires a different set of skills. Clicking on a graphical interface of knobs and sliders replaces the motor skills needed to execute the instrument. The user of NEA seeks to generate a variety of unfinished new melodies in real-time and judge their potential to become something greater until the “right one” is supplied and then transformed. The classic procedure of playing note-by-note on an instrument is replaced by fast critical filtering and real-time parameter tweaking to obtain high-level transformations of intermediate artifacts. There is a shift from motor reaction to fast acoustic discrimination: agile, coordinated embodiment gives way to collected, prospective assessment.

To fulfill the purpose of NEA as a musical generative system, it uses synthesizers and a mixer to convert notes to sound so that the music is perceived. The synthesizers stand out for their ability to model sound in flexible and resourceful ways, especially the sound property known as timbre. Timbre is the character or identity of the source reproducing the notes (e.g., the timbre of a guitar is different from that of a trumpet even though both play the same notes). The mixer, on the other hand, allows control of the intensity of a sound from mute to very loud. At both the synthesizing and mixing stages two complementary systems have been developed allowing for easy prototyping of new material using multiple instances of NEA. Describing these systems is beyond the scope of this paper, but in general terms they complete the experience of an NEA user seeking to automate certain aspects of real-time music creation.

4. Experiment: Evaluating Creativity In A Computational Co-Creative (CCC) Process

As presented in Section 2, a creative activity can be roughly simplified as a two-stage process: generation of multiple ideas and refinement of the best ones. In typical CC experiments, humans or computational agents measure the result of the process, reflecting on how creative the generated output is. Instead, this study is interested in how creative the proto-artifacts produced during the idea-generation phase are. To that aim, an active listening experiment was designed to examine to what extent subjects apprehend the concepts of novelty, surprise, and value and apply them to assess CCC-generated proto-artifacts.

Subjects with diverse musical training levels were recruited and contextualized as participants in a musical album production. Their task was to listen to musical stimuli, score them, and decide which ones progress to the next iteration of the creative process. The stimuli were a sample of autonomously generated musical pieces produced by a multi-NEA system.

4.1. Method

Materials. Eighteen iterations of the multi-NEA system yielded a constant stream of evolving electronic music pieces scattered throughout the conceptual space of ambient music. One-minute fragments were selected from each of them and used as stimuli. In addition, two randomly selected control stimuli were duplicated to evaluate the consistency of subjects’ responses. Specifically, stimuli 1 and 8 are the same, as well as 11 and 19. In total, twenty stimuli were arranged in four different sequences to prevent biases. Each participant listened to one sequence.

The multi-NEA system used to generate the pieces was trained with classical and pop music melodic styles while timbre and structure were managed by the systems briefly explained in Section 3.

Participants. The experiment had 43 participants: 37.2% (16) identified as female, 46.5% (20) identified as male, and 16.3% (7) did not declare their gender. Their musical training is evenly distributed across the range from amateur music producers and performers to professional musicians. The average musical training is 3.3 (sd = 1.7) on a scale of 1 to 6, with 1 being no training and 6 being a professional musician. Among participants, 81.4% (41) have a medium to high knowledge of electronic music, and only 4.6% (2) reported no knowledge of electronic music.

Procedure. Participants were primed with the following script: For this listening session you are going to
                                                                 there was no statistically significant difference between
                                                                 the means of novelty and surprise scores (p = 0.119).
                                                                    Effect of musical training on creativity scores.
                                                                 Subjects were segmented in three levels –low, mid, and
                                                                 high– according to their reported musical training to
                                                                 observe if the musical training has a significant effect on
                                                                 creativity scores. For this analysis creativity attributes
                                                                 are considered as treatments and training groups are
                                                                 analyzed as independent categories.
                                                                    A one-way ANOVA followed by the corresponding
                                                                 Post-Hoc Tukey tests for multiple comparisons revealed
                                                                 that all training levels gave significantly different scores
Figure 1: Distribution of novelty, surprise and value scores     to value and novelty dimensions; only highly trained
among all participants and all pieces                            subjects gave significantly different scores to value and
                                                                 surprise dimensions; and all training levels gave no signif-
                                                                 icant different scores to novelty and surprise dimensions.
play the role of a music producer part of a creative group       Moreover, the scores of novelty surprise and value for all
working on a new album. Your task is to listen to several        pieces by highly trained subjects have higher standard
pieces and assess each so that it continues in the produc-       deviations than those of low trained subjects, and those
tion process or not. The music production process will           of mid trained subjects (see Table 1). Echoing [34], re-
continue but the essence of the piece will remain close          sults reveal that a highly surprising artifact for an expert
to what you are listening to. Before starting to listen          might pass as routinary to a novice .
to each of the pieces, the following text was presented:            Similarity of creativity scores across training
After listening carefully to this piece please answer: how       groups. A complementary analysis was carried out
surprising do you find it? How valuable is it to be published    having musical training as treatments and creativity at-
in the album? How novel does it seem to you? Do you have         tributes as categories (see Figure 2 and Table 2). This
any comments on the piece? Participants answered the             serves to discern to what extent the training level refines
same questions for each of the twenty pieces. For the            estimator’s creativity appraisal.
first three questions, a six-step Likert scale was offered          A series of one-way ANOVAs, one for each creativity
with the following ranges: from “It is not surprising” to        attribute, followed by corresponding Tukey’s HSD Test
“It is completely surprising”; from "It is not valuable" to      for multiple comparisons showed that the scores of mid
"This piece is very valuable and should be part of the           and low trained subjects are significantly different for
album"; from "It is not a novel piece" to "It is a revolution-   all three attributes. High and low trained subjects have
ary piece". To give a precise sense of the process, subjects     significantly different value scores, while high and mid
were contextualized in an on-going activity. They were           trained groups have significantly different novelty scores
unaware the stimuli were made by a machine.                      (see 2). The dispersion of novelty and surprise scores are
                                                                 very similar for all subject segments, but value scores
                                                                 are more dispersed than those of novelty and surprise in
4.2. Results
                                                                 high and mid trained subjects (sd= 1.679 vs 1.608, 1.607
Three subjects with high musical training consistently           and sd= 1.242 vs 1.092, 1.081 respectively). In the case
scored the same control stimuli with a difference of more        of low-trained subjects the opposite effects is observed:
than 3 points for all three attributes (value, surprise, and     value scores are less disperse than those of novelty and
novelty). Therefore, all their responses were discarded          surprise (sd= 1.406 vs 1.449, 1.460).
due to inconsistency. Figure 1 depicts the distribution of          The effect of training on value scores has proven to
responses.                                                       be statistically significant between mid and low training levels.
   Discernibility of Boden’s three dimensions. A                 The analysis reveals that the standard deviation of scores
one-way analysis of variance (ANOVA) carried out to              of highly trained subjects is significantly greater than
evaluate the difference between the three score sets re-         that of the rest of the subjects.
veals a statistically significant difference between at least
two groups (F(2) = 14.9, p < 0.005). A Tukey’s HSD Test
for multiple comparisons found that the mean of value
scores (mean = 4.12) was significantly different from the
means of novelty and surprise scores (mean = 3.72, p =
0.001, and mean = 3.86, p = 0.0018, respectively). However,

5. Discussion

The results from the statistical tests elucidate whether
a three-attribute definition of creativity (value, surprise,
and novelty) accounts for the creativity of proto-artifacts in the
Table 1
Tukey HSD Post Hoc tests of significance for the difference between attribute ratings for three music training levels (low, mid,
and high). Significant values marked with *
                                                                   Attribute pair
                                  value-novelty                    value-surprise               novelty-surprise
                Training     mean    sd         p             mean     sd         p         mean    sd          p

                high         4.24      1.68    0.00083*       3.61       1.63   0.01663*    3.76     1.62    0.64244
                mid          4.34      1.23    0.00243*       3.99       1.07   0.08534     4.11     1.07    0.43779
                low          3.87      1.44    0.0229137*     3.56       1.44   0.4016408   3.73     1.46    0.3700743
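
The pattern in Table 1 can be read off mechanically. The following is a minimal sketch, not the authors' analysis code: it transcribes the p-values from Table 1 and lists, for each training level, the attribute pairs whose scores differ significantly. The 0.05 threshold is an assumption (it is consistent with the asterisks in the table but is not stated explicitly), and `discernible_pairs` is a hypothetical helper name.

```python
# p-values transcribed from Table 1 (Tukey HSD post hoc tests).
TABLE_1_P_VALUES = {
    "high": {"value-novelty": 0.00083, "value-surprise": 0.01663, "novelty-surprise": 0.64244},
    "mid": {"value-novelty": 0.00243, "value-surprise": 0.08534, "novelty-surprise": 0.43779},
    "low": {"value-novelty": 0.0229137, "value-surprise": 0.4016408, "novelty-surprise": 0.3700743},
}

ALPHA = 0.05  # assumed significance threshold, not stated in the paper


def discernible_pairs(p_values, alpha=ALPHA):
    """Map each training level to the attribute pairs with p < alpha."""
    return {
        level: sorted(pair for pair, p in pairs.items() if p < alpha)
        for level, pairs in p_values.items()
    }


result = discernible_pairs(TABLE_1_P_VALUES)
for level, pairs in result.items():
    print(level, pairs)
```

Under this threshold, no training level separates novelty from surprise, and only the highly trained group separates value from surprise.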




Figure 2: Novelty, surprise and value at three different musical training levels: high, mid and low (left, bottom, right
respectively)



generative phase of co-creative processes and how to interpret such metrics. While value scores are
significantly different from surprise and novelty scores, surprise and novelty scores are close to each
other. Further comparative analysis between pairs of scores reveals three clear insights: value stands
out as a different concept from novelty; novelty and surprise appear as non-discernible concepts²; and
value and surprise appear discernible to highly trained subjects, whereas mid and low trained subjects
have similar mental constructs for value and surprise. Consequently, one could cautiously argue that
two-attribute models of creativity could suffice for expert estimators to assess unfinished artifacts
during a co-creative process.
   Is potential creativity a two or three-attribute space? The empirical results obtained show that
novelty and surprise responses are not statistically distinguishable, suggesting that these attributes,
although different in meaning, have a joint appraisal in experimental conditions. This resonates with
the two-dimensional co-creativity assessment models proposed by the standard definition of creativity
and by Kantosalo et al. (value and novelty, plus the quality of user interaction) [35]. Consequently, in
evaluating proto-artifacts during a co-creative process, a two-attribute model of value and originality
could account for Boden’s three-attribute model.
   Reducing the dimension of the attribute space while preserving the assessment quality simplifies human or

² This is consistent with [33], which suggests that surprise and novelty are cognitive processes that
  operate closely.



Table 2
Tukey HSD post hoc tests of significance for the difference between high, mid, and low levels of musical training. Significant
differences marked with *
                                                                       Training level
                                           Attribute     High-Mid        High-Low Mid-Low
                                           Novelty         0.04*            0.89      0.004*
                                           Surprise        0.101            0.722     0.005*
                                           Value           0.789           0.019*     0.001*
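
The comparisons in Table 2 follow one-way ANOVAs, one per creativity attribute. As a rough illustration of that omnibus step only, the sketch below computes the F statistic and its two degrees of freedom by hand. The scores are hypothetical six-step Likert ratings, not the study data, and `one_way_anova` is an illustrative helper name rather than anything used by the authors.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical value scores for three training levels (toy data).
toy_scores = {
    "high": [1, 2, 3],
    "mid": [2, 3, 4],
    "low": [4, 5, 6],
}

f_stat, df_between, df_within = one_way_anova(list(toy_scores.values()))
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A significant omnibus F only indicates that at least two groups differ; a post hoc procedure such as Tukey's HSD is still needed to locate which pairs differ, as in Table 2.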
agent estimator’s tasks. Such a reduction has practical implications in multiple real-life scenarios
that require filtering large sets of artifacts created during CCC processes. Indeed, as CCC processes
permeate human creative activities, the number and quality of potentially creative artifacts that need
assessment will most likely grow exponentially, demanding effective and adequate metrics to carry out
estimation tasks. To assess the creative potential of creating with such computational assistants, one
would need to measure the creative quality of proto-artifacts produced during the idea generation phase.
   However, one could argue that the observed proximity between the novelty and surprise concepts can
result from the experimental conditions. On the relation between experiencing surprise and rating
novelty, Xu et al. explain how “humans use surprise as a signal to decide when to adapt their behavior,
while they use novelty to decide where and what to explore – to eventually develop an improved
world-model.” [33, p.1] This idea suggests that both attributes are used in conjunction to adjust
expectations dynamically. They operate independently yet contribute to broader cognitive processing. It
is necessary to investigate whether the closeness of these concepts stems from the unfinished nature of
stimuli that confounds their subjective assessment or from the training level of estimators
participating in the study.
   The effect of domain training in assessing proto-artifacts. There is a plausible effect of domain
knowledge on the scores of the three creativity attributes. The higher the training, the greater the
significance of the differences between value and surprise, and value and novelty (see Table 1 columns 4
and 7). But the inverse effect is observed between novelty and surprise: the higher the training, the
lower the significance of the difference between novelty and surprise (see Table 1 column 10). This
evidence shows that as training becomes more specialized, subjects are more confident gauging value, yet
they learn that not every valuable artifact is surprising. In particular, highly trained subjects
encounter more pieces with extreme value scores than mid or low trained subjects (see Figure 2). That
is, experts use the whole semantic range of the evaluation scale, while non-experts concentrate their
scores around the second third. A triangulation of the Tukey post hoc tests for experts reinforces the
claim that novelty and surprise are not discernible, while value is the only attribute with a
statistically significant difference between high and low trained subjects. This suggests that training
has a higher positive effect on the ability to appreciate value than on the ability to appreciate
novelty or to experience surprise. In other words, domain expertise is especially expressed when
assessing value and not so much when assessing novelty or surprise. A potential explanation is that
training builds a more nuanced domain-specific cognition and reinforces the estimator’s capacity to
determine the value of unfinished artifacts in terms of the foreseen potential to evolve into more
refined pieces or branch out into novel variations worth exploring.

6. Conclusions

This paper argues for the adoption of a dynamic framework to judge uncompleted work (deemed
proto-artifacts) leading to a creative product in the context of computational co-creation (CCC)
processes. Such an approach, derived from Corazza’s dynamic definition of creativity, recognizes that
artists engaged in computational co-creation not only estimate the creative merit of their work once
the piece is finished, but also assess the creative potential of intermediate proto-artifacts at each
iteration of the generative process. Intermediate assessments depict how a CCC process may unfold and
make it possible to anticipate creative outcomes from the early stages. Hence, a suitable computational
assistant should maximize the creative potential of the process, either by enhancing the human’s
generative capacity or by facilitating recurrent proto-artifact assessments.
   The findings of an active listening experiment conducted to determine the creative quality of
unfinished musical pieces generated by NEA (New Electronic Assistant) suggest that in an experimental
setting subjects’ appraisal of novelty and surprise is not discernible. Thus, a two-attribute definition
of creativity could account for Boden’s three-attribute definition. Even though novelty and surprise
represent different creative attributes, originality could account for both of them because novelty and
surprise tend to blend in subjective assessments of creativity, while value is certainly differentiable,
especially for domain experts.
   For the time being, a two-dimensional creativity assessment of proto-artifacts is not invalidated,
and may simplify assessment procedures with subjects. We suggest using the dimensions of value and
originality (rather than Corazza’s effectiveness and originality). Value is preferred to effectiveness
because it conveys meaningfulness in a variety of fields, including the arts, better than the functional
notion of effectiveness. On the other hand, the responses of subjects with three levels of expertise in
the domain studied showed that novelty and surprise are two different but coupled mental operations. The
former is related to memory and the ability to forget and the latter is related to the stability of
short-term predictions. This suggests that the assessment of one could be a proxy for the other. For
practical research purposes, it makes more sense to use fewer dimensions to conduct large-scale
experiments, especially with lay subjects for whom these concepts generally remain fuzzy.
   Finally, as AI permeates human creative activities of all sorts, the generation of proto-creative
material flourishes.
That is, an unavoidable by-product of assisted creativity is the proliferation of unfinished artifacts
that must be assessed not only by humans but also by AI agents. Such an increase in potentially creative
outcomes calls for the implementation of assertive assessment methods. The results presented here might
prove useful to define further methodologies for effective human and agent-based assessment of creative
artifacts in CCC scenarios.

References

 [1] W. Huang, H. Zheng, Architectural drawings recognition and generation through machine learning, in:
     Proceedings of the 38th Annual Conference of the Association for Computer Aided Design in
     Architecture (ACADIA), CumInCad, 2018, pp. 156–165. doi:10.52842/conf.acadia.2018.156.
 [2] H. Osone, J.-L. Lu, Y. Ochiai, Buncho: ai supported story co-creation via unsupervised multitask
     learning to increase writers’ creativity in japanese, in: Extended Abstracts of the 2021 CHI
     Conference on Human Factors in Computing Systems, 2021, pp. 1–10.
 [3] M. Avdeeff, Artificial intelligence & popular music: Skygge, flow machines, and the audio uncanny
     valley, Arts 8 (2019) 130. doi:10.3390/arts8040130.
 [4] V. Volz, J. Schrum, J. Liu, S. M. Lucas, A. Smith, S. Risi, Evolving mario levels in the latent
     space of a deep convolutional generative adversarial network, in: Proceedings of the genetic and
     evolutionary computation conference, 2018, pp. 221–228.
 [5] A. Jordanous, A standardised procedure for evaluating creative systems: Computational creativity
     evaluation based on what it is to be creative, Cognitive Computation 4 (2012) 246–279.
 [6] S. Colton, J. W. Charnley, A. Pease, Computational creativity theory: The face and idea descriptive
     models., in: ICCC, Mexico City, 2011, pp. 90–95.
 [7] C. Lamb, D. G. Brown, C. L. Clarke, Evaluating computational creativity: An interdisciplinary
     tutorial, ACM Computing Surveys (CSUR) 51 (2018) 1–34.
 [8] H. Herndon, M. Dryhurst, Latent visions, promptism and the future of ai art with adverb [audio
     podcast episode], NPR, 2021. URL:
     https://interdependence.fm/episodes/latent-visions-promptism-and-the-future-of-ai-art-with-adverb.
 [9] M. A. Runco, G. J. Jaeger, The standard definition of creativity, Creativity research journal 24
     (2012) 92–96.
[10] G. E. Corazza, Potential originality and effectiveness: The dynamic definition of creativity,
     Creativity research journal 28 (2016) 258–267.
[11] M. A. Boden, Creativity and art: Three roads to surprise, Oxford University Press, 2010.
[12] A. Jordanous, Four pppperspectives on computational creativity in theory and in practice,
     Connection Science 28 (2016) 194–216.
[13] T. Lubart, How can computers be partners in the creative process: classification and commentary on
     the special issue, International Journal of Human-Computer Studies 63 (2005) 365–369.
[14] N. M. Davis, Human-computer co-creativity: Blending human and computational creativity, in: Ninth
     Artificial Intelligence and Interactive Digital Entertainment Conference, 2013, pp. 9–12.
[15] G. Miller, E. Galanter, K. Pribram, Plans and the Structure of Behavior, Martino Publishing, USA,
     1960.
[16] C. Carver, M. Scheier, Attention and Self-Regulation: A Control-Theory Approach to Human Behavior,
     New York: Springer-Verlag, 1981.
[17] G. Wallas, The art of thought, volume 10, Harcourt, Brace, 1926.
[18] E. Sadler-Smith, Wallas’ four-stage model of the creative process: More than meets the eye?,
     Creativity Research Journal 27 (2015) 342–352.
[19] M. Csikszentmihalyi, Flow and the psychology of discovery and invention, HarperPerennial, New York
     39 (1997).
[20] D. K. Simonton, Creativity and discovery as blind variation: Campbell’s (1960) bvsr model after the
     half-century mark, Review of General Psychology 15 (2011) 158–174.
[21] T. B. Ward, S. M. Smith, R. A. Finke, Creative cognition, in: R. J. Sternberg (Ed.), Handbook of
     Creativity, Cambridge University Press, 1998, p. 189–212. doi:10.1017/CBO9780511807916.012.
[22] T. Amabile, Componential theory of creativity, Harvard Business School Boston, MA, 2011.
[23] L.-C. Yang, A. Lerch, On the evaluation of generative models in music, Neural Computing and
     Applications 32 (2020) 4773–4784.
[24] M. I. Stein, Creativity and culture, The journal of psychology 36 (1953) 311–322.
[25] M. A. Boden, The creative mind: Myths and mechanisms, Routledge, 2004.
[26] G. A. Wiggins, A preliminary framework for description, analysis and comparison of creative
     systems, Knowledge-Based Systems 19 (2006) 449–458. doi:10.1016/j.knosys.2006.04.009, creative
     Systems.
[27] K. Grace, M. L. Maher, Expectation-based models of novelty for evaluating computational creativity,
     in: Computational Creativity, Springer, 2019, pp. 195–209.
[28] M. E. Q. Gonzalez, et al., Creativity: Surprise and abductive reasoning, Semiotica 2005 (2005)
     325–342.
[29] R. W. Weisberg, On the usefulness of “value” in the
     definition of creativity, Creativity Research Journal
     27 (2015) 111–124. doi:10.1080/10400419.2015.
     1030320.
[30] V. P. Glăveanu, Creativity as a sociocultural act,
     The Journal of Creative Behavior 49 (2015) 165–180.
[31] N. Heinich, A pragmatic redefinition of value (s):
     Toward a general model of valuation, Theory, Cul-
     ture & Society 37 (2020) 75–94.
[32] J. Dewey, Theory of valuation., International ency-
     clopedia of unified science (1939).
[33] H. A. Xu, A. Modirshanechi, M. P. Lehmann, W. Ger-
     stner, M. H. Herzog, Novelty is not surprise: Human
     exploratory and adaptive behavior in sequential
     decision-making, PLOS Computational Biology 17
     (2021) e1009070.
[34] R. Maguire, P. Maguire, M. T. Keane, Making sense
     of surprise: an investigation of the factors influenc-
     ing surprise judgments., Journal of Experimental
     Psychology: Learning, Memory, and Cognition 37
     (2011) 176.
[35] A. Kantosalo, P. T. Ravikumar, K. Grace, T. Takala,
     Modalities, styles and strategies: An interaction
     framework for human-computer co-creativity., in:
     ICCC, 2020, pp. 57–64.