<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Personalized Query Auto-Completion Through a Lightweight Representation of the User Context</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Manojkumar Rangasamy Kannadasan</string-name>
          <email>mkannadasan@ebay.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Grigor Aslanyan</string-name>
          <email>gaslanyan@ebay.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>eBay Inc.</institution>
          ,
          <addr-line>San Jose, CA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <abstract>
<p>Query Auto-Completion (QAC) is a widely used feature in many domains, including web and eCommerce search. This feature suggests full queries based on a prefix of a few characters typed by the user. QAC has been extensively studied in the literature in recent years, and it has been consistently shown that adding personalization features can significantly improve the performance of the QAC model. In this work we propose a novel method for personalized QAC that uses lightweight embeddings learnt through fastText [2, 14]. We construct an embedding for the user context queries, which are the last few queries issued by the user. We also use the same model to get the embeddings for the candidate queries to be ranked. We introduce ranking features that compute the distance between the candidate queries and the context queries in the embedding space. These features are then combined with other commonly used QAC ranking features to learn a ranking model using the state-of-the-art LambdaMART ranker [3]. We apply our method to a large eCommerce search engine (eBay) and show that the ranker with our proposed features significantly outperforms the baselines on all of the offline metrics measured, which include Mean Reciprocal Rank (MRR), Success Rate (SR), Mean Average Precision (MAP), and Normalized Discounted Cumulative Gain (NDCG). Our baselines include the Most Popular Completion (MPC) model, a commonly used baseline in the QAC literature, as well as a ranking model without our proposed features. The ranking model with the proposed features results in a 20-30% improvement over the MPC model on all metrics. We obtain up to a 5% improvement over the baseline ranking model for all sessions, which goes up to about 10% when we restrict to sessions that contain the user context. Moreover, our proposed features also significantly outperform text based personalization features studied in the literature before, and adding text based features on top of our proposed embedding based features results in only minor improvements.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Copyright © 2019 by the paper’s authors. Copying permitted for private and academic
purposes.</p>
      <p>In: J. Degenhardt, S. Kallumadi, U. Porwal, A. Trotman (eds.):
Proceedings of the SIGIR 2019 eCom workshop, July 2019, Paris, France, published at
http://ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1 INTRODUCTION</title>
      <p>
        Query Auto-Completion (QAC) is a common feature of most
modern search engines. It refers to the task of suggesting full queries
after the user has typed a prefix of a few characters [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. QAC can
significantly reduce the number of characters typed [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], which is
especially helpful to users on mobile devices. QAC can also help
reduce the number of spelling errors in queries, and in cases where the
user is not sure how to formulate the query, QAC can be of
great help. It has been shown that QAC can greatly improve user
satisfaction [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. Moreover, it can reduce the overall search duration,
resulting in a lower load on the search engine [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Today QAC
has a wide range of applications, including search (web,
eCommerce, email), databases, operating systems, and development
environments [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        Query Auto-Completion has been extensively studied in the
literature in recent years. A detailed survey of the work prior to
2016 can be found in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which broadly classifies QAC approaches
into two main categories - heuristic models and learning based
models. Heuristic models use a few different sources for each
possible query completion and compute a final score. These approaches
do not use a large variety of features. In contrast, learning based
approaches treat the problem as a ranking problem and rely on the
extensive research in the literature in the learning-to-rank (LTR)
field [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Learning based approaches rely on a large number of
features and generally outperform heuristic models [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The
features for both of these approaches can be broadly characterized
into three groups - time-sensitive, context-based, and
demographic-based. Time-sensitive features model the query popularity and its
changes over time, such as weekly patterns. Demographic-based
features, such as gender and age, are typically limited and may
be hard to access. In contrast, context-based features rely on the
user’s previous search activity (short term as well as long term) to
suggest new query completions. This data is essentially free,
making context-based features an attractive approach for personalizing
QAC. Context-based features for LTR models will be the focus of
this work.
      </p>
      <p>
        In this paper we propose a novel method to learn query
embeddings [
        <xref ref-type="bibr" rid="ref14 ref2">2, 14</xref>
        ] using a simple and scalable technique and use it
to measure similarity between user context queries and candidate
queries to personalize QAC. We learn the embeddings in a
semi-supervised fashion using fastText by treating all the queries in a
session as a single document. We design features that measure the
similarity between the context and candidate queries, which are
then incorporated into a learning-to-rank model. We use the
state-of-the-art LambdaMART model [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] for ranking candidate queries
for QAC. Even though embedding based features have been studied
for QAC in the literature before, as discussed in Section 2, our work
makes the following novel contributions:
• A lightweight and scalable way to represent the user’s
context in the embedding space.
• Simple and robust ranking features based on such
embeddings for QAC, which can be used in any heuristic or LTR
model.
• Training and evaluation of a pairwise LambdaMART ranker
for QAC using the proposed features. We show that our
proposed features result in significant improvements in offline
metrics compared with state-of-the-art baselines.
• A comparison and combination of text based features with
embedding based features, showing that embedding based
features result in larger improvements in offline metrics.
      </p>
      <p>The rest of the paper is organized as follows. Section 2 discusses
some of the related work in the literature. In Section 3 we describe
our methodology. In Section 4 we describe our datasets and
experiments. We summarize our work and discuss possible future
research in Section 5.</p>
    </sec>
    <sec id="sec-3">
      <title>2 RELATED WORK</title>
      <p>
        The user’s previously entered text is used for personalized QAC by
Bar-Yossef and Kraus [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Their method, called NearestCompletion,
computes the similarity of query completion candidates to the
context queries (the user’s previously entered queries), using
term-weighted vectors for queries and contexts and applying cosine
similarity. This method results in significant improvements in MRR.
In addition, the authors proposed the MPC approach, which is
based on the overall popularity of the queries matching the given
prefix. MPC is a straightforward heuristic approach with good
performance and is typically used as a baseline for more complex
approaches. We use MPC as one of the baselines in this work as
well.
      </p>
      </p>
      <p>
        The user’s long term search history is used in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] to selectively
personalize QAC, where a trade-off between query popularity and
search context is used to encode the ranking signal. Jiang et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]
study user reformulation behavior using textual features. Shokouhi
[
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] studies QAC personalization using a combination of context
based textual features and demographic features, and shows that the
user’s long term search history and location are the most effective
for QAC personalization. Su et al. [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] propose the framework
EXOS for personalizing QAC, which also relies on textual features
(token level). Jiang et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] use history-level, session-level, and
query-level textual features for personalized QAC. Fei et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
study features based on the observed and predicted search popularity
over both longer and shorter time periods for learning personalized
QAC. Diversification of personalized query suggestions is studied in
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        Recurrent Neural Networks (RNNs) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] have also been studied
for QAC. Three RNN models - session-based, personalized, and
attention-based - have been proposed in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Fiorini and Lu [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
use user history based features as well as time features as input
to an RNN model. [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] uses RNNs to specifically improve QAC
performance on previously unseen queries. An adaptable language
model is proposed in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] for personalized QAC.
      </p>
      <p>
        Word embeddings, such as word2vec [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], glove [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], and fastText
[
        <xref ref-type="bibr" rid="ref14 ref2">2, 14</xref>
        ], have become increasingly popular in recent years for
a large variety of tasks, including computing similarity between
words. Embeddings have also been studied in the context of QAC.
Specifically, Mitra [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] studies a Convolutional Latent Semantic
Model for distributed representations of queries. Query similarity
based on word2vec embeddings is studied in [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] where the features
are combined with the MPC model. In Section 3 we explain our
approach for learning embeddings for the user context in a simple
and scalable fashion and the usage of these embeddings and text
based features to personalize QAC.
      </p>
    </sec>
    <sec id="sec-4">
      <title>3 PERSONALIZED QUERY AUTO-COMPLETION WITH REFORMULATION</title>
      <p>A search session is defined as a sequence of queries ⟨q1, q2, . . . , qT ⟩
issued by a user within a particular time frame. A query consists of
a set of tokens. If the user types a prefix pT and ends up issuing the
query qT , then the user’s context is the sequence of queries issued before
time step T , ⟨q1, q2, . . . , qT −1⟩. For example, if the queries issued in
a session are ⟨nike, adidas, shoes⟩ and the prefix used to issue the query
shoes is sh, then ⟨q1, q2, . . . , qT −1⟩ = ⟨nike, adidas⟩, pT = sh, qT =
shoes. Given a prefix pT , the user context ⟨q1, q2, . . . , qT −1⟩, and
candidate queries QT matching the prefix, our goal is to provide a ranking
of the queries q ∈ QT such that qT is ranked as high as possible.
The ranking score can be considered as P (QT |⟨q1, q2, . . . , qT −1⟩).
This can be solved using the learning-to-rank framework.</p>
      <p>
        The influence of user context features on prediction
accuracy has already been studied in [
        <xref ref-type="bibr" rid="ref13 ref18">13, 18</xref>
        ]. In this paper we
propose a simple and scalable way to understand the user’s context
using query embeddings and use multiple distance related features
to compare the user’s context to the candidate queries QT .
      </p>
    </sec>
    <sec id="sec-7">
      <title>3.1 Learning Query Representation for Reformulation</title>
      <p>
        Continuous text representations and embeddings for a text can
be learnt through both supervised [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] and semi-supervised
approaches [
        <xref ref-type="bibr" rid="ref14 ref17 ref2 ref20">2, 14, 17, 20</xref>
        ]. In this paper we learn the query
representations via semi-supervised techniques, using the publicly
available fastText library [
        <xref ref-type="bibr" rid="ref14 ref2">2, 14</xref>
        ] for efficient representation learning.
The fastText model learns subword
representations while taking into account morphology. The model
considers subword units and represents a word by the sum of its
character n-grams. The word iphone with character n-grams (n = 3)
is represented as:
      </p>
      <p>“⟨ip”, “iph”, “pho”, “hon”, “one”, “ne⟩”</p>
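The boundary-marked decomposition above can be reproduced with a short function (a sketch for illustration; fastText uses the ⟨ and ⟩ boundary symbols, written here as &lt; and &gt;):

```python
def char_ngrams(word, n=3):
    # fastText-style subword units: add boundary markers,
    # then slide an n-character window over the marked word
    marked = "<" + word + ">"
    return [marked[i:i + n] for i in range(len(marked) - n + 1)]

print(char_ngrams("iphone"))  # the six trigrams listed above
```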
      <p>
        Some of the previous work learns distinct vector representations
for words, thereby ignoring the internal structure of the words [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
If we have a dictionary of n-grams of size G, then the set of n-grams
in a word w is denoted as Gw ⊂ {1, 2, . . . , G }. We use the skipgram
model, where the goal is to independently predict the presence or
absence of the context words. The problem is framed as a binary
classification task. For the word at position t we consider all context
words as positive examples and sample negatives at random from
the dictionary, as described in [
        <xref ref-type="bibr" rid="ref14 ref2">2, 14</xref>
        ]. For a context word wc we
use the negative log likelihood l : x ↦ log(1 + e−x ), the binary
logistic loss. The objective function is defined as:
      </p>
      <p>∑_{t=1}^{T} [ ∑_{c ∈ Ct} l (s(wt , wc )) + ∑_{n ∈ Nt,c} l (−s(wt , n)) ] (1)
where wt is the target word, wc is a context word, Ct is the set of
context words for position t, and Nt,c is a set
of negative examples sampled from the vocabulary. The scoring
function s(w, c) is defined as
s(w, c) = ∑_{g ∈ Gw} zg⊤ vc (2)
where zg is the vector representation of each n-gram g of a word
w and vc is the vector representation of the context. Our goal is
to learn scalable and lightweight embeddings for queries based
on their reformulation behavior across different users. Here we
represent all the queries issued in a session ⟨q1, q2, . . . , qT ⟩ as one
document in the training data. By learning subword
representations using the probability of words in the context of other words
present in the queries issued in the same session, we are able to
provide a simple and scalable way to encode the query
reformulation behavior in the embedding space. We are also able to learn
syntactic and semantic similarity across the vocabulary.</p>
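The scoring function in Eq. (2) and the logistic loss l can be sketched in a few lines of plain Python (toy vectors for illustration; a real implementation sums over learned n-gram embedding rows):

```python
import math

def dot(u, v):
    # inner product of two equal-length vectors
    return sum(a * b for a, b in zip(u, v))

def score(ngram_vectors, context_vector):
    # Eq. (2): s(w, c) = sum over n-grams g of w of z_g . v_c
    return sum(dot(z_g, context_vector) for z_g in ngram_vectors)

def logistic_loss(x):
    # l(x) = log(1 + e^{-x}); Eq. (1) applies it to s(w_t, w_c) for
    # positive context words and to -s(w_t, n) for sampled negatives
    return math.log(1.0 + math.exp(-x))
```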
      <p>
        We learn query representations by mining 3 days of eBay’s search
logs to get the query reformulations issued by users. The query
log is segmented into sessions with a 30 minute session boundary,
as in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Based on this definition of a session boundary,
we collect the different queries issued by each user within a session.
We remove special characters in the queries and convert them to
lowercase. We also filter out sessions with only one query.
If the user issues q1, q2, . . . , qT in a session,
then all of these queries ⟨q1, q2, . . . , qT ⟩ together, separated by
whitespace, are considered as one user context. For example, if a
session contains the 2 queries “iphone” and “iphone xs
case”, then the corresponding training document is
“iphone iphone xs case”.
      </p>
      <p>For training unsupervised character n-gram representations
we consider each user context as one document sample. We fix
the dimension of the subword
representations to 50, the minimum occurrence of words in the
vocabulary to 20, and use hierarchical softmax as the loss function.
The other hyperparameters of the model are tuned based on the
Embedding_Features model described in Section 4.3. The number of
unique words in the vocabulary used to train the model is 189,138.
The user context ⟨q1, q2, . . . , qT ⟩ is then converted to multiple
vector representations. Similar vector representations are also created
for all candidate target queries in the dataset.</p>
    </sec>
    <sec id="sec-9">
      <title>3.2 User Context Features</title>
      <p>
        In this section we propose different user context features based on
the queries issued in the session. Vector representations are created
both for the individual queries and for the entire context taking
all queries in the session together. vC represents the user context vector
and vqT represents the vector for a single query at time step T . We
develop four features based on the query representations learned
in the previous section, which we denote as Embedding_Features.
One embedding feature is based on all the queries in the
context. Since the median number of searches in a session is
approximately 3, we considered up to 3 queries previously issued by the user
for generating the remaining embedding features. The Embedding_Features
are computed as a distance between 2 vectors using cosine
similarity [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ].
      </p>
      <p>• user_context_cosine: Cosine distance between the user
context vector vC and the current target query vqT .
• prev_query1_cosine: Cosine distance between the query
vector vqT −1 and the current target query vqT .
• prev_query2_cosine: Cosine distance between the query
vector vqT −2 and the current target query vqT .
• prev_query3_cosine: Cosine distance between the query
vector vqT −3 and the current target query vqT .</p>
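A sketch of how these four cosine-distance features could be computed (the default value of 1.0 for sessions with fewer than three previous queries is our assumption for illustration, not necessarily the production choice):

```python
import math

def cosine_distance(u, v):
    # 1 - cosine similarity between two embedding vectors
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def embedding_features(context_vec, prev_query_vecs, target_vec):
    # prev_query_vecs holds vectors for q_{T-1}, q_{T-2}, q_{T-3},
    # most recent first; missing entries fall back to 1.0 (orthogonal)
    feats = {"user_context_cosine": cosine_distance(context_vec, target_vec)}
    for i in range(3):
        vec = prev_query_vecs[i] if i < len(prev_query_vecs) else None
        feats[f"prev_query{i + 1}_cosine"] = (
            cosine_distance(vec, target_vec) if vec is not None else 1.0
        )
    return feats
```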
      <p>
        In addition to the Embedding_Features, we also developed
various Textual_Features comparing the user context and the current
target query to be ranked, as defined in Table 1. We categorize them
into three categories, namely Token, Query, and Session. There is
a large overlap between the features defined in Table 1 and the
features defined in [
        <xref ref-type="bibr" rid="ref13 ref22">13, 22</xref>
        ]. A query qT can contain multiple tokens, and
users may add and/or remove tokens between 2 consecutive queries in
a session. This token reformulation behavior can be encoded via 16 features,
described in Table 1, representing the effectiveness of the tokens
in the context C and the target query qT . Similarly, Query level
features represent how users reformulate the queries in a session
through repetition and textual similarity between qT and qT −1. The
Session level features represent how users reformulate their queries
without taking into account the relationship to the target query qT .
      </p>
      <p>
        We conduct our ranking experiments on a large scale search query
dataset sampled from the logs of the eBay search engine. The query
log is segmented into sessions with a 30 minute session boundary
      </p>
      <sec id="sec-9-1">
        <title>Dataset</title>
        <p>[Figure 1: End-to-end architecture of the Textual_Embedding model: the prefix is sent to MPC to retrieve the top N candidates; Baseline_Features, Textual_Features, and Embedding_Features (from fastText context embeddings) are computed for the candidates and fed to the ranking model, which produces the final ranked list.]</p>
        <p>
          as described in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. For the ranking experiments, we do not filter out
sessions containing a single query. This ensures that we
have a single learning-to-rank model powering sessions both with and
without user context. The dataset obtained in this way
consists of about 53% sessions with user context, giving
us good coverage of user context features to learn a global model.
        </p>
        <p>
          The labeling strategy used in [
          <xref ref-type="bibr" rid="ref13 ref22">13, 22</xref>
          ] assumes there is at least
one query in the context and removes target queries qT not matching
the prefix pT , setting the first character of the prefix pT based on
qT . In our method, we use a slightly different labeling strategy for
building the personalized QAC. We sample a set of impressions
from search logs. For each issued query qT , we capture the prefix
pT that led to the search page; this pair is marked as a positive label. For
the same prefix pT we identify the rest of the candidate queries QT \ qT
that were shown to the user and did not lead to the impression;
they are marked as negative labels. We also retain sessions without
user context.
        </p>
        <p>
          The above training data now consists of labeled prefix-query
pairs. To evaluate the performance of the lightweight query
representation of reformulations, we use LambdaMART [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], a boosted tree version of LambdaRank, as our
learning-to-rank algorithm.
LambdaMART is considered one of the state-of-the-art learning-to-rank
algorithms and won the Yahoo! Learning to Rank
Challenge (2010) [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. We use a pairwise ranking model and fine-tune
our parameters based on the Baseline_Ranker defined in Section
4.2. We fix these parameters to train and evaluate our models across
all of our experiments.
        </p>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>4.2 Baseline System</title>
      <p>To evaluate our new personalized QAC ranker we establish two
baseline ranking algorithms.</p>
      <p>
        • MPC: The Most Popular Completion model [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] provides
users with candidate queries ranked by
query popularity, where the popularity of a query is defined
as the number of times the query has been issued by all
users in the past.
• Baseline_Ranker: The baseline ranker is a learning-to-rank
model built using the same methodology for creating and
labeling the dataset. The features used in building the model
are prefix features, target query features, and prefix-query
features. We refer to these features as Baseline_Features. The
hyperparameters used for the LambdaMART model are
exactly the same as in all the experiments for the personalized
ranker.
      </p>
    </sec>
    <sec id="sec-11">
      <title>4.3 Personalized Ranking Models</title>
      <p>We have developed three personalized ranking models with
different combinations of user context features, as described in Section
3.2. These ranking models are compared against the two baseline
rankers by experimentally evaluating the improvements on eBay
datasets. The results are presented in Section 4.5.</p>
      <p>• Textual: Ranker with Baseline_Features and Textual_Features
representing the user context.
• Embedding: Ranker with Baseline_Features, as well as
Embedding_Features representing the user context.
• Textual_Embedding: Ranker with Baseline_Features,
Textual_Features, and Embedding_Features representing the user
context.</p>
      <p>For all of the ranking models we first get the top N candidate
queries from the MPC model and re-rank them with the
ranking model. We show the full end-to-end architecture for the
Textual_Embedding model in Figure 1. The architecture for the other
models is similar, except that they only include a subset of the
features.</p>
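This retrieve-then-re-rank flow can be sketched as follows (`score_fn` stands in for the trained LambdaMART model's per-candidate score; the function and names are ours, for illustration):

```python
def rerank(candidates, context, score_fn):
    # Re-rank MPC's top-N candidates by the learned model's score, highest first
    return sorted(candidates, key=lambda q: score_fn(q, context), reverse=True)
```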
    </sec>
    <sec id="sec-12">
      <title>4.4 Evaluation Metrics</title>
      <p>The quality of our predictions can be measured using the following
metrics:
• Mean Reciprocal Rank (MRR) - the average of the reciprocal
ranks of the target queries in the QAC results. Given a test
dataset S, the MRR for algorithm A is computed as</p>
      <p>MRR(A) = (1/|S |) ∑_{(CT ,qT ) ∈ S} 1 / hitRank(A, CT , qT ) (3)
where CT represents the user context at time step T , qT
represents the relevant target query, and the function hitRank
computes the rank of the relevant query based on the order
created by algorithm A. The relevant query is the clicked
query in QAC.
• Success Rate at Top-K (SR@K ) - the average percentage of
relevant queries ranked at or above position K in the QAC
results.
• Normalized Discounted Cumulative Gain (nDCG) - based on
the discounted cumulative gain
DCGq = ∑_{i=1}^{P} (2^{reli} − 1) / log2(i + 1) (4)
where i denotes the rank and reli is the relevance of the query
at rank i. For our purposes reli takes values 0 or 1.
nDCG is defined as normalized DCG. Namely, it is the ratio
of DCG to IDCG (ideal DCG):
nDCGq = DCGq / IDCGq (5)
where IDCGq is the maximum possible value of DCGq for
any ranker. The overall performance of the ranking algorithm
A is measured by the average nDCG across all queries in the
dataset:
nDCG = (∑_{q=1}^{Q} nDCGq) / Q (6)
• Mean Average Precision (MAP) - the mean of the average
precision scores for each query across the entire set of queries:
MAP = (∑_{q=1}^{Q} AvgPrecision(q)) / Q (7)</p>
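MRR as defined in Eq. (3) can be computed directly (a sketch; here a target missing from the returned list contributes 0 to the sum, which is a common convention but our assumption):

```python
def mean_reciprocal_rank(test_set, rank_fn):
    # test_set: iterable of (context, target_query) pairs
    # rank_fn(context): ordered list of suggested queries for that context
    total = 0.0
    for context, target in test_set:
        ranking = rank_fn(context)
        if target in ranking:
            total += 1.0 / (ranking.index(target) + 1)  # hitRank is 1-based
    return total / len(test_set)
```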
    </sec>
    <sec id="sec-13">
      <title>4.5 Results</title>
      <p>
        We perform our evaluation in two phases. Firstly, we evaluate the
quality of our query representations. Secondly, we evaluate the user
context embeddings against the user context based textual features
using a Learning to Rank framework [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        To evaluate our query representations we sample a few words
across different verticals like fashion, electronics, and home and garden,
to evaluate whether the embeddings represent the syntactic and
semantic knowledge of the queries learnt from the query
reformulation behavior. We use t-SNE [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] to visualize the embeddings for
these sampled queries and show that words like samsung, galaxy, and tv
are close to each other in the embedding space and far from queries
like adidas and iphone. This verifies that our query embeddings
carry good subword information to represent the user context in
the embedding space. The t-SNE plot for a small sample of queries
is shown in Figure 2.
      </p>
      <p>Offline metrics MRR, SR@K, nDCG, MAP, and MAP@K are shown
in Table 2, where we have normalized the metrics with respect to
the MPC model. We show results for the whole test dataset, which
includes queries with and without user context, as well as for the
dataset with user context only. To assess statistical significance
we have performed 1,000 bootstrap samples over the test queries
and computed 95% confidence intervals using those samples. The
metrics, together with the 95% confidence intervals, are plotted in
Figure 3, where the plots on the left are for the whole dataset and
the plots on the right are for the context only dataset. We have only
plotted one variant of each metric since the others are very similar.</p>
      <p>[Figure 3: Metrics ratio to MPC for the whole dataset (left) and the user context only dataset (right). Error bars are computed using 1,000 bootstrap samples of the test queries.]</p>
      <p>Our results show that all of the LTR models result in 20-30%
improvements over the MPC model. All three models with
contextualization features outperform the Baseline_Ranker on all
metrics with statistical significance. For example, for MAP@3 the
Embedding model outperforms Baseline_Ranker by 5% on the whole dataset
and 10% on the context only dataset. The Embedding model also
outperforms Textual, with an improvement of 1.5% for MAP@3
on the whole dataset and 3% on the context only dataset. The
Textual_Embedding model performs very similarly to
Embedding, which implies that the embedding based features proposed
in this work capture all of the information in the textual features
(from the perspective of the ranking model) and provide additional
significant improvements.</p>
    </sec>
    <sec id="sec-14">
      <title>4.6 Feature Analysis</title>
      <p>In this section we analyze the user context embedding features
through the partial dependence plots shown in Figure 4. The partial
dependence plot for the user_context_cosine feature clearly indicates
that the cosine similarity between the user context ⟨q1, q2, . . . , qT −1⟩
and the target query qT has a linear relationship with the target. The
embedding features based on individual time steps (prev_query1_cosine,
prev_query2_cosine, prev_query3_cosine) also show a clear
monotonic relationship.</p>
    </sec>
    <sec id="sec-15">
      <title>5 SUMMARY AND FUTURE WORK</title>
      <p>In this work we have presented a simple and scalable approach
to learn lightweight vector representations (embeddings) for the
query reformulations in a user session. These query representations
exhibit both syntactic and semantic similarities between different
queries and enable us to model the user context seamlessly. We
have leveraged these lightweight embeddings to represent the user
context in our personalized ranker for Query Auto-Completion.
Different combinations of user context features were created, including
textual and embedding features on top of our baseline ranker. We
have applied these personalization features to a large scale
commercial search engine (eBay) and experimentally verified significant
improvements on all the offline ranking metrics. We have
evaluated our personalized ranker on the entire dataset and on a dataset
restricted to sessions containing the user context. We see the biggest
improvements on the user context filtered dataset. Furthermore,
we show that the ranking model with embedding features
outperforms the model with the textual features, whereas the model with
combined textual and embedding features results in only minor
improvements on top of the model with embedding features alone.
The minor improvements from the textual features are likely due to
the session level features, which are agnostic of the queries in the
context. As future work, we would like to explore different
representation learning techniques like sent2vec, doc2vec, and sequence
models, to understand the user context better and incorporate them
in the personalized ranker. We also plan to explore the trade-offs
between short term and long term user contexts in QAC. Lastly,
the user context vectors provide a simple and scalable way to
understand user sessions, which can be utilized for personalizing
different parts of search and recommender systems.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Ziv</given-names>
            <surname>Bar-Yossef</surname>
          </string-name>
          and
          <string-name>
            <given-names>Naama</given-names>
            <surname>Kraus</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Context-sensitive Query Auto-completion</article-title>
          .
          <source>In Proceedings of the 20th International Conference on World Wide Web (WWW '11)</source>
          . ACM, New York, NY, USA,
          <fpage>107</fpage>
          -
          <lpage>116</lpage>
          . https://doi.org/10.1145/1963405.1963424
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Piotr</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          , Edouard Grave, Armand Joulin, and
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Enriching word vectors with subword information</article-title>
          .
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>5</volume>
          (
          <year>2017</year>
          ),
          <fpage>135</fpage>
          -
          <lpage>146</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Christopher J.C.</given-names>
            <surname>Burges</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>From ranknet to lambdarank to lambdamart: An overview</article-title>
          .
          <source>Learning</source>
          <volume>11</volume>
          ,
          <fpage>23</fpage>
          -
          <lpage>581</lpage>
          (
          <year>2010</year>
          ),
          <fpage>81</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Fei</given-names>
            <surname>Cai</surname>
          </string-name>
          , Wanyu Chen, and
          <string-name>
            <given-names>Xinliang</given-names>
            <surname>Ou</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Learning search popularity for personalized query completion in information retrieval</article-title>
          .
          <source>Journal of Intelligent &amp; Fuzzy Systems</source>
          <volume>33</volume>
          ,
          <issue>4</issue>
          (
          <year>2017</year>
          ),
          <fpage>2427</fpage>
          -
          <lpage>2435</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Fei</given-names>
            <surname>Cai</surname>
          </string-name>
          and Maarten de Rijke.
          <year>2016</year>
          .
          <article-title>Selectively Personalizing Query AutoCompletion</article-title>
          .
          <source>In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '16)</source>
          . ACM, New York, NY, USA,
          <fpage>993</fpage>
          -
          <lpage>996</lpage>
          . https://doi.org/10.1145/2911451.2914686
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Fei</given-names>
            <surname>Cai</surname>
          </string-name>
          and Maarten de Rijke.
          <year>2016</year>
          .
          <article-title>A Survey of Query Auto Completion in Information Retrieval</article-title>
          .
          <source>Foundations and Trends® in Information Retrieval</source>
          <volume>10</volume>
          ,
          <issue>4</issue>
          (
          <year>2016</year>
          ),
          <fpage>273</fpage>
          -
          <lpage>363</lpage>
          . https://doi.org/10.1561/1500000055
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Wanyu</given-names>
            <surname>Chen</surname>
          </string-name>
          , Fei Cai, Honghui Chen, and Maarten de Rijke.
          <year>2017</year>
          .
          <article-title>Personalized query suggestion diversification</article-title>
          .
          <source>In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          . ACM,
          <fpage>817</fpage>
          -
          <lpage>820</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Danyang</given-names>
            <surname>Jiang</surname>
          </string-name>
          , Fei Cai, and
          <string-name>
            <given-names>Honghui</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Personalizing Query AutoCompletion for Multi-Session Tasks</article-title>
          .
          <fpage>203</fpage>
          -
          <lpage>207</lpage>
          . https://doi.org/10.1109/CCET.2018.8542201
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Nicolas</given-names>
            <surname>Fiorini</surname>
          </string-name>
          and
          <string-name>
            <given-names>Zhiyong</given-names>
            <surname>Lu</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Personalized neural language models for real-world query auto completion</article-title>
          . arXiv preprint arXiv:1804.06439
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Aaron</given-names>
            <surname>Jaech</surname>
          </string-name>
          and
          <string-name>
            <given-names>Mari</given-names>
            <surname>Ostendorf</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Personalized language model for query auto-completion</article-title>
          . arXiv preprint arXiv:1804.09661
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Jain</surname>
          </string-name>
          and
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Medsker</surname>
          </string-name>
          .
          <year>1999</year>
          .
          <article-title>Recurrent Neural Networks: Design and Applications (1st ed.)</article-title>
          . CRC Press, Inc., Boca Raton, FL, USA.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Cai</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Neural Attentive Personalization Model for Query Auto-Completion</article-title>
          .
          <source>In 2018 IEEE 3rd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC)</source>
          .
          <fpage>725</fpage>
          -
          <lpage>730</lpage>
          . https://doi.org/10.1109/IAEAC.2018.8577694
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Jyun-Yu</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yen-Yu</given-names>
            <surname>Ke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Pao-Yu</given-names>
            <surname>Chien</surname>
          </string-name>
          , and Pu-Jen Cheng.
          <year>2014</year>
          .
          <article-title>Learning User Reformulation Behavior for Query Auto-completion</article-title>
          .
          <source>In Proceedings of the 37th International ACM SIGIR Conference on Research &amp; Development in Information Retrieval (SIGIR '14)</source>
          . ACM, New York, NY, USA,
          <fpage>445</fpage>
          -
          <lpage>454</lpage>
          . https://doi.org/10.1145/2600428.2609614
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Armand</given-names>
            <surname>Joulin</surname>
          </string-name>
          , Edouard Grave, Piotr Bojanowski, and
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Bag of tricks for efficient text classification</article-title>
          .
          <source>arXiv preprint arXiv:1607.01759</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Tie-Yan</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Learning to Rank for Information Retrieval</article-title>
          .
          <source>Foundations and Trends® in Information Retrieval</source>
          <volume>3</volume>
          ,
          <issue>3</issue>
          (
          <year>2009</year>
          ),
          <fpage>225</fpage>
          -
          <lpage>331</lpage>
          . https://doi.org/10.1561/1500000016
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Laurens</given-names>
            <surname>van der Maaten</surname>
          </string-name>
          and
          <string-name>
            <given-names>Geoffrey</given-names>
            <surname>Hinton</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Visualizing data using t-SNE</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>9</volume>
          , Nov
          (
          <year>2008</year>
          ),
          <fpage>2579</fpage>
          -
          <lpage>2605</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Ilya Sutskever, Kai Chen, Greg Corrado, and
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Distributed Representations of Words and Phrases and Their Compositionality</article-title>
          .
          <source>In Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2 (NIPS'13)</source>
          . Curran Associates Inc., USA,
          <fpage>3111</fpage>
          -
          <lpage>3119</lpage>
          . http://dl.acm.org/citation.cfm?id=2999792.2999959
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Bhaskar</given-names>
            <surname>Mitra</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Exploring session context using distributed representations of queries and reformulations</article-title>
          .
          <source>In Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval</source>
          . ACM,
          <fpage>3</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Dae Hoon</given-names>
            <surname>Park</surname>
          </string-name>
          and
          <string-name>
            <given-names>Rikio</given-names>
            <surname>Chiba</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>A neural language model for query autocompletion</article-title>
          .
          <source>In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          . ACM,
          <fpage>1189</fpage>
          -
          <lpage>1192</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Pennington</surname>
          </string-name>
          , Richard Socher, and
          <string-name>
            <given-names>Christopher D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>GloVe: Global Vectors for Word Representation</article-title>
          .
          <source>In Empirical Methods in Natural Language Processing (EMNLP)</source>
          .
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          . http://www.aclweb.org/anthology/D14-1162
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Taihua</given-names>
            <surname>Shao</surname>
          </string-name>
          , Honghui Chen, and
          <string-name>
            <given-names>Wanyu</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Query Auto-Completion Based on Word2vec Semantic Similarity</article-title>
          .
          <source>In Journal of Physics: Conference Series</source>
          , Vol.
          <volume>1004</volume>
          . IOP Publishing,
          <fpage>012018</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Milad</given-names>
            <surname>Shokouhi</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Learning to Personalize Query Auto-completion</article-title>
          .
          <source>In Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '13)</source>
          . ACM, New York, NY, USA,
          <fpage>103</fpage>
          -
          <lpage>112</lpage>
          . https://doi.org/10.1145/2484028.2484076
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Amit</given-names>
            <surname>Singhal</surname>
          </string-name>
          et al.
          <year>2001</year>
          .
          <article-title>Modern information retrieval: A brief overview</article-title>
          .
          <source>IEEE Data Eng. Bull.</source>
          <volume>24</volume>
          ,
          <issue>4</issue>
          (
          <year>2001</year>
          ),
          <fpage>35</fpage>
          -
          <lpage>43</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Yang</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Dengyong</given-names>
            <surname>Zhou</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Li-wei</given-names>
            <surname>He</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Post-ranking Query Suggestion by Diversifying Search Results</article-title>
          .
          <source>In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '11)</source>
          . ACM, New York, NY, USA,
          <fpage>815</fpage>
          -
          <lpage>824</lpage>
          . https://doi.org/10.1145/2009916.2010025
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>F.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Somaiya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mishra</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>EXOS: Expansion on session for enhancing effectiveness of query auto-completion</article-title>
          .
          <source>In 2015 IEEE International Conference on Big Data (Big Data)</source>
          .
          <fpage>1154</fpage>
          -
          <lpage>1163</lpage>
          . https://doi.org/10.1109/BigData.2015.7363869
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Aston</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Amit Goyal, Weize Kong, Hongbo Deng, Anlei Dong,
          <string-name>
            <given-names>Yi</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Carl A.</given-names>
            <surname>Gunter</surname>
          </string-name>
          , and Jiawei Han.
          <year>2015</year>
          .
          <article-title>adaQAC: Adaptive Query Auto-Completion via Implicit Negative Feedback</article-title>
          .
          <source>In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '15)</source>
          . ACM, New York, NY, USA,
          <fpage>143</fpage>
          -
          <lpage>152</lpage>
          . https://doi.org/10.1145/2766462.2767697
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>