=Paper=
{{Paper
|id=Vol-2699/paper22
|storemode=property
|title=Visualizing and Quantifying Vocabulary Learning During Search
|pdfUrl=https://ceur-ws.org/Vol-2699/paper22.pdf
|volume=Vol-2699
|authors=Nilavra Bhattacharya,Jacek Gwizdka
|dblpUrl=https://dblp.org/rec/conf/cikm/BhattacharyaG20
}}
==Visualizing and Quantifying Vocabulary Learning During Search==
Nilavra Bhattacharya, Jacek Gwizdka
School of Information, The University of Texas at Austin, USA
Abstract
We report work in progress on visualizing and quantifying learning during search. Users begin a search session in a Pre-Search Knowledge state, undergo a change in knowledge while searching, and conclude in a Post-Search Knowledge state. We attempt to measure this dynamic knowledge change from a stationary reference point: Expert Knowledge on the search topic. Using word embeddings of searchers' written summaries, we show that, with respect to Expert Knowledge, there is an observable and quantifiable difference between Pre-Search knowledge (Pre-Exp distance) and Post-Search knowledge (Post-Exp distance).
Keywords
search as learning, quantifying learning, expert knowledge, word embedding
Figure 1: Conceptual framework of Search-as-Learning. [Diagram: a searcher moves from a Pre-Search Knowledge state, through Searching, to a Post-Search Knowledge state; Pre‒Exp and Post‒Exp distances are measured against a stationary Expert Knowledge state.]

1. Introduction

An important aspect of understanding learning during web search is to measure and quantify learning, possibly in an automated fashion. Recent literature adopts three broad approaches for this purpose. The first approach asks searchers to rate their self-perceived pre-search and post-search knowledge levels [1, 2]. This approach is the easiest to construct, and can be generalized over any search topic. However, self-perceptions may not objectively represent true learning. The second approach tests searchers' knowledge using factual multiple choice questions (MCQs). The answer options can be a mixture of fact-based responses (TRUE, FALSE, or I DON'T KNOW) [3, 4] or recall-based responses (I remember / don't remember seeing this information) [5, 6]. Constructing topic-dependent MCQs may take time and effort, which may be aided by automated question generation techniques [7]. For evaluation, this approach is the easiest, and often automated. However, MCQs allow respondents to answer correctly by guesswork. The third approach lets searchers write natural language summaries or short answers, before and after the search [8, 2]. Depending on experimental design, prompts for writing such responses can be generic (least effort) [9] or topic-specific (some effort) [7]. While this approach can provide the richest information about the searcher's knowledge state, evaluating such responses is the most challenging, and requires extensive human intervention.

We report progress on extending work by [9], and take the third approach mentioned above. We attempt to visualize and quantify vocabulary learning during search, using natural language Pre-Search and Post-Search responses. The previous authors used sentence embedding models, and reported not finding strong associations between search interactions and knowledge-change measures. A possible reason is that sentence embedding approaches are yet to attain maturity, and typically employ an average-pooling operation to generate sentence vectors from individual word vectors. Devising effective strategies to obtain vectors for compound units (phrases / sentences) from individual word vectors is always a challenge [10]. Differently from [9], we use word embedding vectors and max-pooling operations (taking the element-wise maximum of individual word vectors to form sentence vectors), which experimentally showed better results than average-pooling.

Proceedings of the CIKM 2020 Workshops, October 19-20, 2020, Galway, Ireland
email: nilavra@ieee.org (N. Bhattacharya); iwilds2020@gwizdka.com (J. Gwizdka)
url: https://nilavra.in (N. Bhattacharya); http://gwizdka.com (J. Gwizdka)
orcid: 0000-0001-7864-7726 (N. Bhattacharya); 0000-0003-2273-3996 (J. Gwizdka)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)
Figure 2: Example of Pre-Search and Post-Search knowledge assessment responses from a participant, for Task T3 (Vitamin A), alongside Expert Knowledge.

Pre-Search Prompt: Think of what you already know on the topic of this search and list as many phrases or words as you can that come to your mind. For example, if you know about side effects, please do not just type the phrase “side effects”, but rather type “side effects” and then list the specific side effects you know about. Please list only one word or phrase per line and end each line with a comma.

Post-Search Prompt: Now that you have completed this search task, think of the information that you found and list as many words or phrases as you can on the topic of the search task. This will be short ANSWERS to the search questions. For example, if you were searching for side effects, please do not just type the phrase “side effects”, but rather type “side effects” and then list the specific side effects you found. Please list only one word (or phrase) per line and end each line with a comma.

Example Pre-Search Knowledge (Participant P03): health benefits vitamin consumption; is highly debated if over consumed; I know nothing about Vitamin A specifically

Example Post-Search Knowledge (Participant P03): Vitamin A deficiency can led to blindness; Vitamin A is not toxic if over ingested; vitamin A can decrease vitamin B absorption and increase likelihood of hip fractures; Vitamin A can be found in leafy green vegetables, organ meats and broccoli; Vitamin A contents can be found on nutritional labels

Expert Knowledge (Excerpt): Health benefits of using vitamin A: Vision, Breast cancer, Catarats, measles, Malaria, Diahrrea related to hiv, lower risk of complications during and after pregnancy, Retinitis pigmentosa, Ensures Healthy Eyes, soft skin, strong bones and teeth, acne, prevents muscular dystrophy, slow the aging process, lower risk of leukemia, good vision, Can prevent cancer, antioxidant, protects cells, maintain healthy skin, healthy immune system, healthy skeletal and soft tissue...
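The comma-terminated, one-item-per-line response format shown in Figure 2 can be split into a phrase list with a short helper. This is a sketch, not the study's actual pipeline: the function name and example strings are illustrative.

```python
def parse_response(text: str) -> list[str]:
    """Split a free-recall response into its phrases.

    Participants were asked to list one word or phrase per line,
    ending each line with a comma (see the prompts in Figure 2).
    """
    phrases = []
    for line in text.splitlines():
        # Drop the trailing comma and surrounding whitespace.
        phrase = line.strip().rstrip(",").strip()
        if phrase:
            phrases.append(phrase.lower())
    return phrases

example = "health benefits vitamin consumption,\nis highly debated if over consumed,\n"
print(parse_response(example))
```

The lowercasing step reflects a common normalization before embedding lookup; whether the original analysis lowercased tokens is not stated in the paper.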
2. Experimental Design

We analyze data from the user-study reported in [8, 9]. Participants (𝑁 = 30, 16 females, mean age 24.5 years) searched for health-related information on the web, over two search-tasks, T3 (topic: Vitamin A) and T4 (topic: Hypotension). Each search task began (Pre-Search) and ended (Post-Search) with a knowledge assessment, to gauge the participants' initial and final knowledge states. Participants entered natural language responses from free-recall, as answers. A vocabulary of Expert Knowledge was also created for each topic, in consultation with a medical doctor. Example participant responses, and an excerpt from the Expert Knowledge, are shown in Fig. 2. After data cleaning, we obtained data from 49 participant-task pairs (𝑁_T3 = 26; 𝑁_T4 = 23). Due to space limitations, please see [9] for more details about the study.

3. Data Analysis & Preliminary Results

We hypothesize that participants' learning during search can be assessed from the 'difference' in their Pre-Search and Post-Search responses. Since different participants may have different initial and final knowledge states, we measured it from a stationary reference-point: the expert knowledge. Calculating such differences between pieces of natural language text is challenging, and is an active research topic. Word embedding is a popular method of computing semantic similarity (or distance) between two pieces of natural language text. A word embedding algorithm produces a numeric, high-dimensional vector for each word, which is assumed to encapsulate the 'meaning' of the word. In this work, we leverage two popular pre-trained word-embedding models, word2vec [11] and GloVe [12], to compute 'differences' or 'distances' between Pre-Search, Post-Search, and Expert Knowledge (Fig. 1). word2vec contains 300-dimensional vectors trained on about 100 billion words (tokens) from the Google News dataset, and is claimed to be the most stable word embedding [13]. GloVe offers multiple pre-trained word embeddings; we ran experiments with the 50, 100, and 300 dimensional versions.

Word embedding algorithms produce vectors for individual words. To obtain vectors for phrases and sentences, the individual word vectors are usually pooled or aggregated. As discussed in Sec. 1, we performed max pooling, to produce a single high-dimensional vector for a participant response (or expert knowledge). We employed two distance metrics – euclidean and angular (cosine) – to compute distances between vectors of Pre-Search responses, Post-Search responses, and Expert Knowledge (Fig. 1). The euclidean distance is unbounded, while the angular distance (Eqn. 1) ranges from 0 (no distance) to 1 (maximum distance).

    angular distance(𝐮, 𝐯) = arccos( 𝐮⋅𝐯 / (‖𝐮‖ ‖𝐯‖) ) / 𝜋        (1)

We manually set the angular distance to 1 (i.e., maximum) if one of the input vectors was a zero vector. This makes sense because zero vectors are obtained only if participants' responses do not contain any signs of knowledge (e.g., “none” or “i dont know”).

To visualize the high-dimensional vectors of the various knowledge states, we employed the t-SNE algorithm. This algorithm projects a set of high-dimensional objects onto a 2D plane in such a way that similar objects are modelled by nearby points, and dissimilar objects are modelled by distant points. Using this algorithm, we obtained 2D representations of the Pre-Search, Post-Search, and Expert Knowledge (Fig. 3, left column). The visualization shows an almost clear separation between the Pre-Search (red circle) and Post-Search (green square) knowledge states, with Expert Knowledge (blue star) residing near the Post-Search knowledge states. This visually confirms and supports the hypothesis that participants gain knowledge during search, and move 'closer' to the Expert Knowledge state by the end of a search.
Figure 3: Results using word2vec 300d word embeddings, across tasks T3 and T4 combined. A clear separation can be observed between the majority of Pre-Search and Post-Search knowledge states (left column: 2D t-SNE visualization of the Pre-Search, Post-Search, and Expert Knowledge embeddings), as well as between Pre-Exp and Post-Exp distances per participant-task pair (middle column: Euclidean distance metric; right column: angular distance metric, where 0 = min distance and 1 = max distance).
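A 2D projection like the left column of Fig. 3 can be produced, in outline, with scikit-learn's t-SNE. This is a sketch under stated assumptions: the random matrix stands in for the actual pooled 300-d knowledge vectors, and the labels and perplexity value are illustrative, not the paper's settings.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Placeholder for the pooled knowledge vectors: one row per
# Pre-Search response, Post-Search response, or Expert vocabulary.
vectors = rng.normal(size=(99, 300))
labels = ["pre"] * 49 + ["post"] * 49 + ["expert"]

# Project to 2D so that similar high-dimensional vectors
# end up as nearby points and dissimilar ones far apart.
xy = TSNE(n_components=2, perplexity=20, random_state=0).fit_transform(vectors)
assert xy.shape == (99, 2)
```

The `xy` coordinates would then be scattered with a different marker per label (red circle, green square, blue star) to reproduce the figure's layout.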
The Euclidean and Angular distances between Pre-Search and Expert (Pre-Exp distance), and Post-Search and Expert (Post-Exp distance), are shown in the middle and right columns, respectively, of Fig. 3. For both distance metrics, the majority of the participants have lower Post-Exp distances than Pre-Exp distances (i.e. their Post-Search response is less distant from, or more similar to, Expert Knowledge). These metrics were calculated between the high-dimensional embedding vectors, which supports the claim that the clear separation between Pre- and Post-Search knowledge levels in the 2D visualizations (left column) is not merely due to random chance. Interestingly, for a few participants, the Post-Exp distance was higher than the Pre-Exp distance. This possibly demonstrates a 'loss' in knowledge level: these users were closer to Expert Knowledge before the search, and moved away from Expert Knowledge after the search.

We further tested whether these visual differences between Pre-Exp and Post-Exp distances were statistically significant. Since the distance values were not normally distributed, we employed the non-parametric Wilcoxon Signed-Rank test, which is used for comparing paired or related samples. The results are presented in Table 1. We can see that across different choices of word embeddings, there were significant differences between the Pre-Exp and Post-Exp distances. Thus, the results are not due to the choice of a particular word embedding model. The directionalities of the differences in the Wilcoxon Signed-Rank test are expressed using the sum of the positive difference ranks (Σ𝑅+) and the sum of the negative difference ranks (Σ𝑅−). Since Σ𝑅− was greater than Σ𝑅+ in all the tests, the difference between Pre-Exp and Post-Exp distances is negative. This means that the majority of participants had a lower Post-Exp distance than Pre-Exp distance (i.e. they moved closer to expert knowledge by the end of the task). The magnitude of a phenomenon is measured by effect size, which ranges from 0 (no effect) to 1 (maximum effect). All the tests had effect sizes greater than 0.8, signifying that searching online had a strong effect on minimizing the distance between participants' knowledge level and expert knowledge.

4. Conclusion and Future Work

We showed that word embeddings have promise for visualizing and quantifying vocabulary-based learning during search. Clear separation between users' Pre-Search and Post-Search knowledge states was seen, and measured using simple distance metrics. Possible future directions include predicting these learning metrics from search-interaction measures. Another direction is to experiment with contextual embeddings (e.g., BERT). We also plan to investigate individual differences in learning during search.

4.0.1. Acknowledgements

We thank Sudipto Mukherjee, for technical and conceptual mentoring; Dr. Andrzej Kahl, our medical doctor consultant, for expert-vocabulary creation; and Yinglong Zhang, for contributing to experimental data collection. The research was partially funded by IMLS Award #RE-04-11-0062-11 to Jacek Gwizdka.
Table 1
Descriptive values of Pre-Exp and Post-Exp distances, and results of statistical significance tests, using different word embeddings to model knowledge. As evident from Fig. 3, Pre-Exp and Post-Exp distances are significantly different for all the tested choices of word embedding models. All Wilcoxon Signed-Rank (SR) tests are significant at p < .05. The angular distance metric is normalized: 0 = least distance, 1 = max distance. Each distance column shows mean (±SD) and median.

| Word Embedding | Euclidean Pre–Exp | Euclidean Post–Exp | Euclidean Wilcoxon SR Test | Angular Pre–Exp | Angular Post–Exp | Angular Wilcoxon SR Test |
|---|---|---|---|---|---|---|
| word2vec | 6.30 (±1.52), 6.12 | 3.90 (±0.87), 3.68 | ΣR+ = 20.0, ΣR− = 1205.0; 95% CI: −2.76 to −1.82; Effect Size: 0.84 | 0.30 (±0.28), 0.18 | 0.11 (±0.03), 0.10 | ΣR+ = 28.0, ΣR− = 1197.0; 95% CI: −0.13 to −0.06; Effect Size: 0.83 |
| GloVe 6B 50d | 8.67 (±2.39), 8.26 | 5.12 (±1.29), 4.68 | ΣR+ = 37.0, ΣR− = 1188.0; 95% CI: −4.03 to −2.48; Effect Size: 0.82 | 0.27 (±0.28), 0.17 | 0.10 (±0.03), 0.09 | ΣR+ = 43.0, ΣR− = 1182.0; 95% CI: −0.12 to −0.06; Effect Size: 0.81 |
| GloVe 6B 100d | 9.34 (±2.55), 8.96 | 5.46 (±1.42), 5.17 | ΣR+ = 30.0, ΣR− = 1195.0; 95% CI: −4.46 to −2.79; Effect Size: 0.83 | 0.30 (±0.28), 0.19 | 0.11 (±0.03), 0.10 | ΣR+ = 32.0, ΣR− = 1193.0; 95% CI: −0.15 to −0.07; Effect Size: 0.82 |
| GloVe 6B 300d | 12.15 (±3.18), 11.97 | 7.20 (±1.72), 6.81 | ΣR+ = 29.0, ΣR− = 1196.0; 95% CI: −5.79 to −3.65; Effect Size: 0.83 | 0.30 (±0.27), 0.20 | 0.11 (±0.03), 0.10 | ΣR+ = 35.0, ΣR− = 1190.0; 95% CI: −0.14 to −0.07; Effect Size: 0.82 |
| GloVe 42B 300d | 12.17 (±3.10), 11.74 | 7.09 (±1.80), 6.66 | ΣR+ = 29.0, ΣR− = 1196.0; 95% CI: −5.92 to −3.79; Effect Size: 0.83 | 0.31 (±0.27), 0.21 | 0.11 (±0.03), 0.10 | ΣR+ = 38.0, ΣR− = 1187.0; 95% CI: −0.16 to −0.08; Effect Size: 0.82 |
| GloVe 840B 300d | 13.24 (±3.16), 12.69 | 8.36 (±1.79), 7.71 | ΣR+ = 28.0, ΣR− = 1197.0; 95% CI: −5.67 to −3.48; Effect Size: 0.83 | 0.30 (±0.27), 0.20 | 0.12 (±0.03), 0.11 | ΣR+ = 38.0, ΣR− = 1187.0; 95% CI: −0.13 to −0.06; Effect Size: 0.82 |
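The statistics reported in Table 1 (rank sums, significance, effect size) can be computed along the following lines with SciPy. This is a sketch: the paired distances here are synthetic stand-ins for the 49 actual Pre-Exp / Post-Exp pairs, and the effect size uses the common normal-approximation formula r = |Z| / √N, which may differ from the exact formula the authors used.

```python
import numpy as np
from scipy.stats import wilcoxon, rankdata

rng = np.random.default_rng(1)
pre_exp = rng.normal(6.3, 1.5, size=49)              # synthetic Pre-Exp distances
post_exp = pre_exp - rng.normal(2.4, 0.5, size=49)   # mostly smaller, as in Table 1

diff = post_exp - pre_exp            # paired differences
ranks = rankdata(np.abs(diff))       # rank the absolute differences
sum_r_pos = ranks[diff > 0].sum()    # sum of positive-difference ranks (ΣR+)
sum_r_neg = ranks[diff < 0].sum()    # sum of negative-difference ranks (ΣR−)

stat, p = wilcoxon(post_exp, pre_exp)   # Wilcoxon Signed-Rank test (two-sided)
# For a two-sided test, SciPy's statistic is the smaller rank sum.
assert np.isclose(stat, min(sum_r_pos, sum_r_neg))

# Effect size r = |Z| / sqrt(N), via the normal approximation of W.
n = len(diff)
mu = n * (n + 1) / 4
sigma = np.sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (stat - mu) / sigma
effect_size = abs(z) / np.sqrt(n)
```

With ΣR− ≫ ΣR+ and a small p-value, this mirrors the pattern Table 1 reports: a significant negative shift of Post-Exp relative to Pre-Exp distances.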
References

[1] S. Ghosh, M. Rath, C. Shah, Searching as learning: Exploring search behavior and learning outcomes in learning-related tasks, in: Conference on Human Information Interaction & Retrieval (CHIIR), 2018.
[2] H. L. O'Brien, A. Kampen, A. W. Cole, K. Brennan, The role of domain knowledge in search as learning, in: Conference on Human Information Interaction and Retrieval (CHIIR), 2020.
[3] L. Xu, X. Zhou, U. Gadiraju, How does team composition affect knowledge gain of users in collaborative web search?, in: Conference on Hypertext and Social Media (HT), 2020.
[4] U. Gadiraju, R. Yu, S. Dietze, P. Holtz, Analyzing knowledge gain of users in informational search sessions on the web, in: Conference on Human Information Interaction & Retrieval (CHIIR), 2018.
[5] S. Kruikemeier, S. Lecheler, M. M. Boyer, Learning from news on different media platforms: An eye-tracking experiment, Political Communication 35 (2018) 75–96.
[6] N. Roy, F. Moraes, C. Hauff, Exploring users' learning gains within search sessions, in: Conference on Human Information Interaction and Retrieval (CHIIR), 2020.
[7] R. Syed, K. Collins-Thompson, P. N. Bennett, M. Teng, S. Williams, D. W. W. Tay, S. Iqbal, Improving learning outcomes with gaze tracking and automatic question generation, in: The Web Conference (WWW), 2020.
[8] N. Bhattacharya, J. Gwizdka, Relating eye-tracking measures with changes in knowledge on search tasks, in: Symposium on Eye Tracking Research & Applications (ETRA), 2018.
[9] N. Bhattacharya, J. Gwizdka, Measuring learning during search: differences in interactions, eye-gaze, and semantic similarity to expert knowledge, in: Conference on Human Information Interaction and Retrieval (CHIIR), 2019.
[10] D. Roy, D. Ganguly, M. Mitra, G. J. Jones, Representing documents and queries as sets of word embedded vectors for information retrieval, in: ACM SIGIR Workshop on Neural Information Retrieval (Neu-IR), 2016.
[11] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, in: Advances in Neural Information Processing Systems, 2013, pp. 3111–3119.
[12] J. Pennington, R. Socher, C. D. Manning, GloVe: Global vectors for word representation, in: Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014, pp. 1532–1543.
[13] L. Burdick, J. K. Kummerfeld, R. Mihalcea, Factors influencing the surprising instability of word embeddings, in: Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), 2018, pp. 2092–2102.