=Paper=
{{Paper
|id=Vol-2090/AIC17_paper3
|storemode=property
|title=Towards Computationally Creating Multi-answer Queries for the Remote Associates Test
|pdfUrl=https://ceur-ws.org/Vol-2090/paper3.pdf
|volume=Vol-2090
|authors=Ana-Maria Olteteanu,Kunkanit Yoopoo
|dblpUrl=https://dblp.org/rec/conf/aic/OlteteanuY17
}}
==Towards Computationally Creating Multi-answer Queries for the Remote Associates Test==
Ana-Maria Olteţeanu¹ and Kunkanit Yoopoo²

¹ Cognitive Systems, Bremen Spatial Cognition Center, Universität Bremen, Germany
² Faculty of Information and Communication Technology, Mahidol University, Thailand

Abstract. The Remote Associates Test is a creativity test used to assess human participants’ ability to make associations. Small normative datasets of queries exist for this test; however, such datasets do not address the issue of potential multiple answers to the same test query. In this work we create a large dataset of queries to which multiple answers are possible. The computational work to create such a dataset is presented, together with the metrics relating to this dataset. The applications of this tool for the investigation and modeling of the creative processes of association in human cognition are also discussed.

1 Introduction

Imagine that, as a cognitive psychologist, you wanted to investigate an aspect of creativity and the creative problem solving process in humans, or that you attempted to computationally model such a process. Various forms of tests exist to measure creativity and creative problem solving performance in human participants [4, 6, 3, 5]. However, some of these tests are old and do not provide normative data. Furthermore, such tests do not make it possible to control for and parametrize their variables. Much more insight into the creative process could be obtained if cognitive psychologists and computational modelers had access to large datasets of test items whose variables they could control. Moreover, despite aiming to measure creativity, some such tests allow only one “correct” answer, ignoring the fact that multiple answers might be possible, and are thus unable to explore how the cognitive process functions in the context of multiple solutions.

This paper starts from the premise that the investigation and testing of creative performance can benefit from computational methods in establishing (i) new ways of assessing creative problem solving; (ii) better controlled, parametrized stimuli sets for existing creativity and creative problem solving tasks; and (iii) ways of allowing and accounting for multiple possible solutions. The current work focuses on the last two: using computational methods to establish a set of controlled, parametrized stimuli for a classical creativity test – the Remote Associates Test [7]. Specifically, we focus on computationally building and extracting a set of Remote Associates Test queries for which multiple answers are possible.

The rest of the paper is organized as follows: the Remote Associates Test is briefly described in section 2, together with previous work on a computational solver for this test. An approach to creating stimuli subsets for multi-answer queries is described in section 3. The obtained dataset and the multi-answer query metrics are described in section 4. In closing, the applications of this work in cognitive psychology are discussed and future work is proposed.

2 The Remote Associates Test, comRAT-C and comRAT-G

The Remote Associates Test (RAT) [7] is a creativity test often used in the literature [2, 1]. In this test, three-word queries are given to participants, like the query Cottage, Swiss, Cake. The participants are asked to come up with a fourth word which is connected to each of the query words. A potential answer in this case would be Cheese.
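To make the task concrete before describing the comRAT-C solver, here is a toy sketch of checking whether a candidate answer connects to all three query words, assuming that “connected” means forming a known two-word expression; the tiny expression set and function names are illustrative assumptions, not part of any released tool:

```python
# Toy set of known two-word expressions (an illustrative stand-in for a
# bigram corpus; not the actual comRAT-C data described below).
BIGRAMS = {
    ("cottage", "cheese"), ("swiss", "cheese"), ("cheese", "cake"),
    ("swiss", "chocolate"), ("chocolate", "cake"), ("cake", "flour"),
}

def forms_expression(word_a, word_b):
    """A pair counts if it appears as a known expression in either order."""
    return (word_a, word_b) in BIGRAMS or (word_b, word_a) in BIGRAMS

def is_valid_answer(query, candidate):
    """An answer is valid if it forms a known expression with every query word."""
    return all(forms_expression(w, candidate) for w in query)

print(is_valid_answer(("cottage", "swiss", "cake"), "cheese"))     # True
print(is_valid_answer(("cottage", "swiss", "cake"), "chocolate"))  # False: no cottage-chocolate
```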
According to its creators, the RAT aims to measure creativity as the ability to make associations. In previous work, [9] implemented a computational solver of the RAT called comRAT-C. This solver used language data (bigrams) from the Corpus of Contemporary American English, and a type of knowledge organization which supports the solving process [8, 11].

The solving of a RAT query can be visually represented as depicted in Figure 1. The initially given query words trigger word associates that have been previously encountered in conjunction with the query words. The query words, shown in green in Figure 1, trigger the associates shown in blue. For example, the word Cake triggers the words Flour and Layer, because the cognitive agent has previously encountered expressions like Cake Flour and Layer Cake. Amongst the associates activated by each of the query words, some overlaps might occur. For example, Chocolate is such an overlap, triggered as an associate of both the query words Swiss and Cake. The activation process started by presenting the query words will converge on such overlaps. A convergence of the associates of all three initial query words can result in an answer – like, for example, Cheese in Fig. 1.

[Fig. 1. A visual depiction of the associative process used by comRAT-C to solve the Remote Associates Test, at the concept level. Only a small subset of associates is depicted, in order to maintain visibility.]

Besides solving the RAT computationally and correlating with human performance data, comRAT-C [9] has shown that multiple possible answers may exist for RAT queries, by sometimes providing different answers than the unique answer considered “correct” in the normative data. For example, for the query Change, Circuit, Cake, the answer considered correct in the normative data was Short, while comRAT-C provided the equally plausible answer Design. For the query High, District, House, the answer considered correct in the normative data was School, but comRAT-C provided other answers as well, like State, Court, etc.

However, no dataset of queries with multiple answers was yet available. A researcher administering the RAT thus has no way of knowing whether her queries might have correct answers other than the ones she is expecting. She might thus judge an answer as “wrong” just because it is not the answer designated as correct by the normative data. Meanwhile, this answer might not be wrong, but plausible, and merely different from the recognized correct answer. In comRAT-C computational terms, the participant might simply have found a different convergence term, because their knowledge base is structured or weighted slightly differently than that of other participants. As no account of multiple answers exists in the literature, however, such a participant might end up with lower creativity scores because her answers do not match the “correct” answers, and this would affect the results of the empirical investigation.

Such plausible but different answers could also be used to investigate the process of solving this task at a deeper level. For example, why would one answer be preferred by a participant over another potential answer? Is this a function of that particular participant’s set of association strengths in their memory/knowledge base? Or would certain associations be generally preferred over others? How would the parameters of such associations need to be modified in order to change the preferred answer?
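The convergence process described above, and the way slightly different association strengths can change which answer is preferred, can be sketched as follows. This is a minimal illustration with invented associate strengths, not the actual comRAT-C implementation; comRAT-C derives its associates and their weights from COCA bigram data:

```python
# Minimal sketch of convergence over weighted associates. The associate
# lists and strengths below are invented for illustration only.
ASSOCIATES = {
    "cottage": {"cheese": 0.6, "industry": 0.3},
    "swiss":   {"cheese": 0.5, "chocolate": 0.4, "army": 0.2},
    "cake":    {"cheese": 0.3, "chocolate": 0.5, "flour": 0.4, "layer": 0.4},
}

def converge(query):
    """Return candidate answers associated with ALL query words,
    scored by their summed association strength."""
    common = set.intersection(*(set(ASSOCIATES[w]) for w in query))
    scores = {c: sum(ASSOCIATES[w][c] for w in query) for c in common}
    # Highest-scoring convergence term first; a knowledge base with
    # slightly different weights may rank another answer on top.
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(converge(("cottage", "swiss", "cake")))  # cheese converges from all three query words
```

For multi-answer queries, several convergence terms survive the intersection, and the weights determine which one is preferred.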
Manipulating various setups of queries with multiple answers could shed more light on the process of remote association. However, no hypothesis testing for queries with multiple answers is possible until a dataset of such queries is created.

3 Creating a Set of Multi-answer Queries

A set of 17 million RAT queries was created by reverse engineering the comRAT-C solving process with comRAT-G [10]. In short, this system considers each word as a potential answer, and uses its knowledge and knowledge organization to combinatorially generate queries which converge on that word as an answer. Though very rich, this dataset is too large to explore manually, and requires the application of computational methods for extracting valuable subsets and their metrics. In this work, we focus on the RAT queries which allow for multiple answers: we apply computational methods to find all the multi-answer query sets, clean up this data computationally, and build a multi-answer query dataset. We extract metrics regarding this dataset, so as to prepare it for evaluation with human participants and distribution to the research community.

First, all multiple-answer subsets are extracted. This step involves searching for query subsets of the form (w1, w2, w3, ans1), (w1, w2, w3, ans2), ..., (w1, w2, w3, ansx), where wk, k ∈ {1, 2, 3}, stand for the various query words, and ansx for the various potential answers. As shown in Table 1, this step results in ordered subsets of queries which have multiple answers. For example, the query Access, Back, Side is shown with both its answers, Panel and Road.

To offer the possibility of parametrising queries, the dataset we build also provides the following information for each query (a computational sketch of these parameters is given after Table 1):
– the frequency of each of the query words – fr(w1), fr(w2), fr(w3);
– the frequency of the answer word, which might help differentiate between different answers to the same query – fr(wans);
– the frequency of each query word and the answer word together as an expression – fr(w1, wans), fr(w2, wans), fr(w3, wans);
– the conditional probability of reaching the answer given each of the query words – P[wans|w1], P[wans|w2] and P[wans|w3];
– the probability of finding a particular answer if all query words are equally weighted.
All parameters are calculated based on the frequencies provided with the Corpus of Contemporary American English bigram dataset.

In the second step, we build a dataset in which each query with multiple answers is uniquely represented, together with the number of answers we found for that query, and the following metrics: (i) the lowest, highest and mean conditional probability of the different answers to the query, if each of the query words equally influenced the answer; (ii) the lowest, highest and mean conditional probability given each of the query words, across the different answers; and (iii) the lowest, highest and mean frequency of the query words. The dataset and metrics thus constructed look as depicted in Table 2. These metrics are provided in order to help cognitive psychologists and other users decide which query subsets to use, and thus tailor the subset to their research question or problem.

Table 1. Multi-answer query subsets, example data extract. The [. . .] symbol stands for columns in the table which describe parameters and have not been shown here because of table size constraints.

w1        w2          w3    answer      [. . .]
Access    Back        Side  Panel
Access    Back        Side  Road
Industry  Management  Tax   Consultant
Industry  Management  Tax   Estate
Industry  Management  Tax   Expert
Industry  Management  Tax   Hotel
Industry  Management  Tax   Officials
Industry  Management  Tax   Waste
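As a concrete illustration of the extraction step and the parameters above, here is a minimal sketch. It assumes the comRAT-G output is available as (w1, w2, w3, answer) tuples and the corpus data as frequency counts; the toy inputs, variable names, and the estimate P[wans|wk] = fr(wk, wans)/fr(wk) are our illustrative assumptions, not a description of the released tool:

```python
from collections import defaultdict

# Illustrative inputs: generated queries as (w1, w2, w3, answer) tuples,
# plus unigram and bigram frequency counts (stand-ins for the COCA data).
queries = [
    ("access", "back", "side", "panel"),
    ("access", "back", "side", "road"),
    ("access", "back", "door", "code"),
]
unigram_freq = {"access": 900, "back": 5000, "side": 3000, "panel": 400, "road": 800}
bigram_freq = {("access", "panel"): 12, ("back", "panel"): 30, ("side", "panel"): 25,
               ("access", "road"): 40, ("back", "road"): 90, ("side", "road"): 15}

# Step one: group queries by their word triple; keep triples with >= 2 answers.
by_triple = defaultdict(list)
for w1, w2, w3, ans in queries:
    by_triple[(w1, w2, w3)].append(ans)
multi_answer = {t: a for t, a in by_triple.items() if len(a) >= 2}

def parameters(triple, ans):
    """Per-(query, answer) parameters; P(ans|w) is estimated here as
    fr(w, ans) / fr(w) - an assumed estimate, for illustration only."""
    cond = [bigram_freq.get((w, ans), 0) / unigram_freq[w] for w in triple]
    return {
        "fr_words": [unigram_freq[w] for w in triple],
        "fr_ans": unigram_freq.get(ans, 0),
        "fr_pairs": [bigram_freq.get((w, ans), 0) for w in triple],
        "cond_prob": cond,
        # probability of this answer if all query words weigh equally:
        "p_equal": sum(cond) / len(cond),
    }

for triple, answers in multi_answer.items():
    for ans in answers:
        print(triple, ans, parameters(triple, ans))
```

Under these assumptions, the step-two metrics shown in Table 2 are then simple lowest/highest/mean aggregations of these per-answer parameters over each query's answer set.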
Table 2. Data and metrics on each query subset. Please note that at least four decimals are provided in the dataset; these, together with other columns, were compressed here for the sake of visual depiction.

Query                         | Answers | P(all) Low/High/Mean      | P(wx) Low/High/Mean       | F(wx) Low/High/Mean
Ability Education Skills      | 2       | 0.003 / 0.0345 / 0.0324   | 0.0016 / 0.0709 / 0.00324 | 41 / 639 / 226.333
Graduate University Programs  | 4       | 0.0045 / 0.0019 / 0.011   | 0.0013 / 0.0295 / 0.011   | 31 / 355 / 118.417
Youth Team World              | 3       | 0.025 / 0.028 / 0.027     | 0.0009 / 0.0711 / 0.02694 | 24 / 241 / 104.778
Business Company Management   | 9       | 0.0019 / 0.009 / 0.0042   | 0.0006 / 0.0245 / 0.0042  | 24 / 563 / 109.556

4 Results – metrics of the dataset

A dataset of 1,206,622 queries with multiple answers was obtained in step one. Out of these, 403,341 queries were unique, as observed after agglomerating the data in step two. The mean number of answers over the entire dataset was 2.27 (SD = 0.77). Most of the queries obtained were two-answer queries (332,974), while a few queries had between 17 and 30 answers (6 queries). The metrics pertaining to the number of queries, split into nine bands based on their number of answers, are shown in Table 3.

Table 3. Dataset metrics based on number of answers

No. of answers   Number of queries
2                332,974
3-4              61,259
5-6              7,045
7-8              1,461
9-10             401
11-12            132
13-14            44
15-16            19
17-30            6

5 Discussion and Future work

This paper briefly presented our initial efforts in computationally constructing a set of queries with multiple answers for the Remote Associates Test.

One of the challenges of creating this dataset related to the presence of plurals among the multiple answers to a query. Our task was to search for subsets of the form (w1, w2, w3, ans1), (w1, w2, w3, ans2), [. . .], (w1, w2, w3, ansx). However, subsets of queries with two answers were encountered where the two answers were of the form (w1, w2, w3, ans1), (w1, w2, w3, pl(ans1)), where pl(ans1) is the plural of the other answer. For example, we encountered the query Draft, Membership, Punch with both answers Card and Cards. We used a set of plural rules for English to find such queries. We then compressed the plural and singular forms of such queries into one data item, maintaining the singular form and calculating the mean of the probability and frequency metrics.
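This plural clean-up step can be sketched as follows. The averaging of metric values follows the description above, while the function names and the rule set itself are illustrative assumptions; the paper's actual plural rules may differ:

```python
# Minimal sketch of merging singular/plural answer pairs. The rules below
# cover only common English plural patterns and are illustrative.
def plural_forms(word):
    """Candidate plural spellings of a word under simple English rules."""
    forms = {word + "s"}
    if word.endswith(("s", "x", "z", "ch", "sh")):
        forms.add(word + "es")
    if word.endswith("y") and word[-2:-1] not in "aeiou":
        forms.add(word[:-1] + "ies")
    return forms

def merge_plurals(answers):
    """answers: dict mapping answer word -> metric value (e.g. a probability).
    Collapses (singular, plural) pairs into the singular form, averaging
    their metric values, as described in the text."""
    merged = dict(answers)
    for word in list(merged):
        for pl in plural_forms(word):
            if pl in merged and word in merged:
                merged[word] = (merged[word] + merged.pop(pl)) / 2
    return merged

print(merge_plurals({"card": 0.04, "cards": 0.02, "line": 0.01}))
# {'card': 0.03, 'line': 0.01}
```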
As we have now created a dataset of multi-answer queries, the next steps are as follows:
– to evaluate the dataset with human participants;
– to create a set of normative data – expressing accuracy, answer times and preferred answers for a subset of multi-answer queries;
– to use the dataset (and support the use of the dataset) in various cognitive science applications.

The dataset can be evaluated with human participants by checking (i) whether participants consider the multiple answers to indeed be viable answers and (ii) whether empirical relations hold between the propensity of people to choose a particular answer and the probability, frequency or other factors associated with the various answers. As part of future work we also intend to show participants multiple possible answers and have them choose the one they find most “appropriate”, in conditions in which the answer choices are similar or different in probability, frequency or other factors. This will help us investigate whether such factors have an impact on the perceived appropriateness of answers, and whether similarity or difference in a particular factor influences the difficulty of the choice, affecting response times.

The creation of a normative dataset for multi-answer queries requires gathering data from human participants regarding response times and the number of times the various answers are given. Whether human answers to such queries cover all the potential multiple answers or only a very small subset of them, and for which queries and answers this occurs, is also an interesting future empirical question.

Various applications of such a dataset exist for cognitive psychologists. This tool and dataset can be used to design experiments that capture which answers are preferred in various multi-answer conditions – for example in cases in which the frequency, probability, beginning letter, or other parameters are varied. This dataset can thus be used as a means to establish and falsify various theoretical hypotheses about the creative process and the process of association. After evaluating this dataset with human participants, we intend to provide it for scientific use via an online interface.

Acknowledgements

Ana-Maria Olteţeanu acknowledges the support of the German Research Foundation (DFG) for the Creative Cognitive Systems Project OL 518/1-1 (CreaCogs).

References

1. Bourke, P., Shaw, H.: Spontaneous lucid dreaming frequency and waking insight. Dreaming 24(2), 152 (2014)
2. Cunningham, J.B., MacGregor, J.N., Gibb, J., Haar, J.: Categories of insight and their correlates: An exploration of relationships among classic-type insight problems, rebus puzzles, remote associates and esoteric analogies. The Journal of Creative Behavior 43(4), 262–280 (2009)
3. Duncker, K.: On problem solving. Psychological Monographs 58(5, Whole No. 270) (1945)
4. Guilford, J.P.: The nature of human intelligence. McGraw-Hill, New York (1967)
5. Kim, K.H.: Can we trust creativity tests? A review of the Torrance Tests of Creative Thinking (TTCT). Creativity Research Journal 18(1), 3–14 (2006)
6. Maier, N.R.: Reasoning in humans. II. The solution of a problem and its appearance in consciousness. Journal of Comparative Psychology 12(2), 181 (1931)
7. Mednick, S.A., Mednick, M.: Remote associates test: Examiner’s manual. Houghton Mifflin (1971)
8. Olteţeanu, A.M.: Two general classes in creative problem-solving? An account based on the cognitive processes involved in the problem structure – representation structure relationship. In: Publications of the Institute of Cognitive Science, vol. 01-2014. Institute of Cognitive Science, Osnabrück (2014)
9. Olteţeanu, A.M., Falomir, Z.: comRAT-C: A computational compound remote associates test solver based on language data and its comparison to human performance. Pattern Recognition Letters 67, 81–90 (2015)
10. Olteţeanu, A.M., Schultheis, H., Dyer, J.B.: Constructing a repository of compound Remote Associates Test items in American English with comRAT-G. Behavior Research Methods, Instruments, & Computers (accepted)
11. Olteţeanu, A.M.: From simple machines to Eureka in four not-so-easy steps. Towards creative visuospatial intelligence. In: Müller, V. (ed.) Fundamental Issues of Artificial Intelligence, Synthese Library, vol. 376, pp. 159–180. Springer (2016)