<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Query Rewriting using Automatic Synonym Extraction for E-commerce Search</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aritra Mandal*</string-name>
          <email>arimandal@ebay.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ishita Khan*</string-name>
          <email>ishikhan@ebay.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Prathyusha Senthil Kumar</string-name>
          <email>prathykumar@ebay.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>eBay Inc</institution>
          ,
          <addr-line>San Jose, CA</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <abstract>
        <p>Query rewriting is a critical component in modern search engines. It is the process of altering and enhancing raw user queries, using synonymous keywords or structured metadata, to improve search recall and relevance; the synonyms are mined with data mining techniques applied to textual data and user behavioral signals. For example, the query bicycle is rewritten to match (bicycle OR bike) - i.e. all items that contain either the word bicycle or the word bike in their title are returned for this query. Choosing the right set of synonymous terms for a given query term is essential to ensure the quality of search results, especially in the context of e-commerce, where buyer needs can be very specific. As an example, shoe is a good synonym for shoes, whereas sandals, while related, is not a good synonym for shoes. In this work, we describe one version of the approach to query rewriting taken at eBay search. At a high level, we use a two-step process to generate and apply synonyms for query expansion: 1. offline token-level synonym generation and 2. runtime search query rewriting. In the offline phase, we first generate a large pool of candidate synonyms for query tokens using various techniques that leverage user behavioral data, inventory and taxonomy information, and open source knowledge bases; we then use a machine learned binary classifier, trained on human judged binary relevance labels, to filter the candidate synonyms down to those that are truly useful as query expansions without compromising result set precision. This classifier allows us to leverage a wide variety of sources and techniques for generating synonym candidates by providing a scientific and scalable method to evaluate their effectiveness for query rewriting. The filtered set of token-level synonyms is stored in a dictionary for runtime query rewriting. In the online phase, we rewrite user search queries by combining the token-level synonyms in the dictionary into a boolean recall expression.
We empirically demonstrate the value of this approach in enhancing e-commerce search recall and relevance.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>One of the aims of query rewriting in information retrieval is to
bridge the vocabulary gap between queries and documents. Other
direct applications of query rewriting exist as well, such as correcting
misspellings and normalizing the different ways of describing the
same entity across levels of formality. In the context of e-commerce search,
user queries are much shorter and often colloquial, compared to
product listing titles that are more verbose and contain formal
terms. As an example, buyers may search for noise absorbing
blankets when they are looking for soundproof blankets. In order to find
all relevant inventory that matches the buyer’s search intent, query
terms are expanded or rewritten to synonymous terms which are
also used to retrieve documents. In this example, we may rewrite
the user’s original query to acoustic blankets, soundproof blankets
or soundproof blanket etc.</p>
      <p>
        Several techniques have been proposed in the literature to
automatically identify query expansions, including those based on a.
pseudo-relevance feedback [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] where expansion terms are selected from
the top results retrieved from a first round of retrieval, b. leveraging
user search logs [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] which capitalize on the large
volumes of readily available user behavioral logs to extract query
substitutions using a variety of innovative techniques, c. ontology
or thesaurus based [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], that employ either
externally available knowledge bases or create one from the specific
corpora to expand search queries. Another interesting approach,
proposed by Gao et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ][
        <xref ref-type="bibr" rid="ref9">9</xref>
        ][
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], uses a ranking mechanism to score
rewrite lexicons specific to a query and boost the rewrites that are
more relevant to the query.
      </p>
      <p>
        The paper by Andrew et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] gives an overall view of the
search architecture at eBay. In our work we focus only on the query
rewrite part of search at eBay. Our approach combines the
effectiveness of several of the above mentioned techniques to extract
synonymous words and phrases for query terms. Contrary to these
techniques, we use a boolean recall basis to determine the items
returned for a query, which are then ranked based on the user’s
specific ranking criteria (relevance, price, time on site, etc). The
automatically generated query term synonyms are used to augment
the query in the boolean recall expression. As an example, the query
mens bikes is rewritten to ((men OR mens) AND (bike OR bikes OR
bicycle OR bicycles)), where the query term mens is expanded to men
and the term bikes is expanded to bike, bicycle and bicycles. These
synonyms are mined using a number of techniques as described in
3.1 relying on the presence of these synonyms in behavioral logs
and item index. These techniques result in a large pool of synonym
candidates with varying quality. A novel second step involving a
machine learned classifier is used to differentiate the truly valuable
synonyms from semantically related but not synonymous terms. As
illustrated earlier, shoe and shoes are truly synonymous, whereas
shoe and sandals are related but not synonymous. This classifier
is trained using a variety of behavioral and lexical features on a
human judged relevance label.
      </p>
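      <p>The AND/OR construction described above can be sketched as follows. This is a minimal illustration: the SYNONYMS dictionary contents and the function name rewrite_query are hypothetical stand-ins, not eBay's production dictionary or rewrite logic.</p>
      <preformat>
```python
# Hypothetical token-level synonym dictionary (illustrative values only).
SYNONYMS = {
    "mens": ["men"],
    "bikes": ["bike", "bicycle", "bicycles"],
}

def rewrite_query(query: str) -> str:
    """Expand each query token with its synonyms (OR) and join the
    per-token clauses with AND to form the boolean recall expression."""
    clauses = []
    for token in query.split():
        variants = [token] + SYNONYMS.get(token, [])
        clauses.append("(" + " OR ".join(variants) + ")")
    return "(" + " AND ".join(clauses) + ")"

print(rewrite_query("mens bikes"))
# ((mens OR men) AND (bikes OR bike OR bicycle OR bicycles))
```
      </preformat>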
      <p>The rest of the paper is organized as follows: in Section 2 we
provide an overview of our end-to-end process for e-commerce query
rewriting; in Section 3 we describe the offline synonym dictionary
generation process in detail, followed by evaluation and results for
the synonym classifier and its effectiveness for query rewriting in
Section 4; we conclude with future work in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>END-TO-END ARCHITECTURE FOR QUERY REWRITING</title>
      <p>In this section we provide an overview of the synonym candidate
generation and evaluation pipeline and the runtime use of keyword
synonyms for query rewriting. Subsections 2.1 and 2.2 outline the
offline and online components in our search query rewriting system,
respectively.</p>
    </sec>
    <sec id="sec-4">
      <title>Offline Synonym Candidate Generation and Selection Pipeline</title>
      <p>The overall pipeline of the synonym candidate generation and
evaluation framework for query rewriting is shown in Figure 1. We
mine user behavioral data to collect search queries and item titles
from search result pages (SRPs). From there, we use a semantic
learning function that models three types of candidates: a.
query-query transitions generated from pairs of queries within a single
user session; b. query-item pairs of query and clicked item titles
in the same SRP; c. item-item pairs of items shown in the same
SRP. For each of these types of candidates, the semantic learning
function essentially considers pairs with a certain minimum
number of occurrences across search sessions, and then generates all
possible token n-grams (n up to a fixed limit) from these pairs as
potential synonym candidates. Next, we pass this exponentially
large set of candidate pairs to a set of filtering components, one
for each synonym generation technique (e.g. stemming, acronyms),
that identifies if the candidate pair of n-grams qualifies as a
synonym pair according to that technique. Following are some of the
filtering components / synonym generation techniques that we will
use in the scope of this paper: stemming equivalents, compounding
equivalents, phrasing, open-source dictionary based synonyms and
acronyms-full form equivalents (We are unable to disclose the full
set of filtering components).</p>
      <p>As mentioned earlier, we developed a machine learned
classification model to evaluate the above generated synonym candidates,
and filter out the synonym pairs that are truly valuable for query
expansions. This classifier is optimized for a human judged binary
relevance target (label=1 for candidates that are true synonyms,
label=0 for candidates that are irrelevant or related but not
synonymous). A number of behavioral and lexical features are extracted
for the list of candidate synonyms being considered. In
particular, for the behavioral features, we use clicks and other associated
engagement signals from historic user sessions where the query
token (trigger) and the corresponding synonym candidate occurred.
Additionally, we also derive ratio-based features from these
count-based features. We construct lexical features based on semantics
such as presence of numeric characters, lengths of the trigger and
synonym etc. After the classifier has predicted a score for each
candidate, we perform a score thresholding step to filter out the final
set of synonyms. The thresholding is based on analyzing the score
distribution on the training dataset in positive and negative classes
and selecting a cutoff that balances classifier performance
and false negative rate. The final filtered query n-gram - synonym
pairs are stored in a dictionary and used in runtime query rewriting
to enhance the recall set for queries.</p>
    </sec>
    <sec id="sec-5">
      <title>Runtime Query Rewriting using N-gram Based Synonyms</title>
      <p>As mentioned above, once the final filtered synonym rewrites are
generated for the n-gram query tokens, we store them in an offline
rewrite dictionary for runtime query rewriting. At run-time, we
have a two-phase approach to constructing boolean query rewrite
expressions using the n-gram rewrite dictionaries: query
segmentation and query rewriting. In the query segmentation phase, the
raw user query is split into non-overlapping n-gram segments to
maximize coverage in the rewrite dictionary. The segmentation
heuristic used in this particular application is to take the longest
matching segment from left among the query tokens, and then split
the rest of the query tokens recursively in the same manner. In the
query rewriting phase, we look up the synonyms for the n-gram
segments in the synonym dictionary, and then combine them into
a boolean expression. The different synonyms for a given segment
are combined with an OR clause, while the different segments for a
given query are combined with an AND clause. For example, for
the query ps 4 games, the first phase will segment the query as ps 4
and games and the second phase might create a rewrite expression
like ((ps 4 OR playstation 4) AND (games OR game)), based on the
following n-gram synonym rewrites: ps 4 -&gt; playstation 4 and games
-&gt; game.</p>
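      <p>The two-phase runtime procedure above can be sketched as follows. The REWRITES dictionary, the MAX_N limit and the function names are hypothetical; only the greedy longest-match-from-left segmentation heuristic described in the text is shown.</p>
      <preformat>
```python
# Sketch of the two-phase runtime rewrite (dictionary contents are illustrative).
REWRITES = {
    "ps 4": ["playstation 4"],
    "games": ["game"],
}
MAX_N = 3  # assumed maximum n-gram length

def segment(tokens):
    """Greedy left-to-right segmentation: take the longest n-gram found in
    the rewrite dictionary, then recurse on the remaining tokens."""
    if not tokens:
        return []
    for n in range(min(MAX_N, len(tokens)), 0, -1):
        candidate = " ".join(tokens[:n])
        if candidate in REWRITES or n == 1:
            return [candidate] + segment(tokens[n:])

def rewrite(query):
    """Look up synonyms per segment (OR) and join segments with AND."""
    clauses = []
    for seg in segment(query.split()):
        variants = [seg] + REWRITES.get(seg, [])
        clauses.append("(" + " OR ".join(variants) + ")")
    return "(" + " AND ".join(clauses) + ")"

print(rewrite("ps 4 games"))
# ((ps 4 OR playstation 4) AND (games OR game))
```
      </preformat>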
    </sec>
    <sec id="sec-7">
      <title>SYNONYM DICTIONARY GENERATION FOR QUERY REWRITING</title>
      <p>In this section, we deep dive into the steps involved in the offline
synonym rewrite dictionary generation, as outlined in Figure 1.</p>
    </sec>
    <sec id="sec-9">
      <title>Mining Synonym Candidates</title>
      <p>For generating individual synonym candidates, we process user
behavioral data to collect query and item titles from SRPs. We use
this data in a semantic learning function that models three types of
transitions to generate potential synonym candidates. For each of
these transitions, we create several pairs of n-grams (length up to a
fixed n) as potential candidates, as described below.</p>
      <p>• Query - Query Transitions: generated from pairs of queries
within the same user session. For example, consider a search
session with the following queries: womens dresses and ladies
gown. Using the pairs of queries within this set, we generate
n-gram pairs as possible synonym candidates, where each
member of the pair comes from a different query. With n=2
(bigrams), we generate the following candidates: womens - ladies,
womens - gown, dresses - ladies, dresses - gown, womens
dresses - ladies, womens dresses - gown, womens dresses - ladies
gown, ladies gown - womens and ladies gown - dresses.
• Query - Item Transitions: pairs of query and clicked item
titles in the same SRP. Consider the query womens dresses
- let’s say we observe the following items being clicked in
an SRP for this query - womens dress/gown size 4, summer
dress for women. Similar to the query-query transition case,
all pairs of n-grams between each pair of query and
clicked item are considered as potential candidates.
• Item - Item Transitions: pairs of items shown in the same
SRP (same query). Using our example with query-item
transitions, for pairs of items in the same SRP - womens dress/gown
size 4, summer dress for women, we generate n-gram
candidate pairs, one from each item.</p>
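      <p>The n-gram pair generation for a single query-query transition can be sketched as below; the function names are hypothetical, and the minimum-occurrence filter across sessions is omitted for brevity.</p>
      <preformat>
```python
def ngrams(tokens, max_n):
    """All contiguous n-grams of length 1..max_n."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]

def candidate_pairs(query_a, query_b, max_n=2):
    """Cross-product of n-grams, one member from each query; call again
    with the arguments swapped for the reverse direction."""
    a = ngrams(query_a.split(), max_n)
    b = ngrams(query_b.split(), max_n)
    return {(x, y) for x in a for y in b if x != y}

pairs = candidate_pairs("womens dresses", "ladies gown")
```
      </preformat>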
      <p>
        For each of these types of candidates, we only consider pairs with
a certain minimum number of occurrences across search sessions.
Once generated, we pass this exponentially large set of candidate
pairs to a set of filtering components, one for each synonym
generation technique, that identifies if the candidate pair of n-grams
qualifies as a synonym pair according to that technique.
Following are some of the filtering components we currently have in our
pipeline:
• Stemming equivalents Words that resolve to the same stem
root, using Snowball stemmer [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. E.g. boat and boats, both
resolve to the same root boat.
• Compounding equivalents These are n-grams that when
written as a compound word mean the same as the n-gram.
These are more popular in some languages like German.
E.g.: sail boat and sailboat, granitpflastersteine and granit
pflastersteine or granit pflaster steine.
• External dictionary based synonyms We leverage open
source knowledge bases and dictionaries to extract
synonyms. One such example is WordNet [
        <xref ref-type="bibr" rid="ref21 ref4 ref5">4, 5, 21</xref>
        ] which is
a large lexical database of English, where nouns, verbs,
adjectives and adverbs are grouped into sets of cognitive
synonyms, each expressing a distinct concept. The synonyms within
a group are interlinked by means of conceptual-semantic
and lexical relations. From our n-gram candidates, we filter
out synonyms as identified by WordNet. E.g.: sunblock and
sunscreen, tomahawk and hatchet.
• Acronyms-full form equivalents We implement the Schwartz-Hearst
algorithm [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] for extracting acronym based
synonyms that are identified using pattern matching on item
titles and query item pairs. At a high level, the algorithm
consists of two tasks: extraction of (short form, long form)
pair candidates from text, and identification of the correct
long form among the candidates in the sentence that
surrounds the short form. E.g.: hp and Hewlett Packard, NY&amp;Co
and New York &amp; Company.
• Phrasing By default, all tokens in a query are matched as a
bag of words, meaning any item that contains those tokens
in any order is returned for the query. For example, for the
query womens dress we may return items like womens
summer dress. However, in some cases we might want to enforce
strict word ordering while matching terms in items for user
queries. This applies to named entities like brands, models,
book names etc where the query intent is preserved only
when matched exactly. Phrasing is a mechanism that enables
us to do this enforcement. These are mined by understanding
user preferences from query - item n-gram pairs. (Kumar et
al. [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]). E.g.: If a user searches for new balance shoes, since
new balance is a single entity, we enforce that only the items
containing the words new balance contiguously in that order
be returned for this query.
      </p>
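      <p>Two of the filtering components above can be sketched as predicates over a candidate pair. The naive_stem function below is a toy suffix-stripper standing in for the Snowball stemmer used in our pipeline, and compounding_equivalent implements the space-removal check directly; both names are hypothetical.</p>
      <preformat>
```python
def naive_stem(word):
    """Toy stand-in for the Snowball stemmer: strips a few common
    English suffixes, for illustration only."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def stemming_equivalent(a, b):
    """A candidate pair qualifies if both members resolve to the same stem."""
    return naive_stem(a) == naive_stem(b)

def compounding_equivalent(a, b):
    """A candidate pair qualifies if its members differ only in spacing,
    e.g. sail boat vs. sailboat."""
    return a != b and a.replace(" ", "") == b.replace(" ", "")
```
      </preformat>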
    </sec>
    <sec id="sec-10">
      <title>Synonym Selection with ML classifier</title>
      <p>
        The different synonym mining techniques described in Section 3.1
yield a large pool of synonyms with varying degrees of quality
and coverage. However, not all of these are useful, and some could
even be harmful when used for e-commerce search query
expansion. For example, the words tub and bathtub may be used
interchangeably in some contexts, but using one as an expansion
for the other may result in some unexpected results from other
product categories. We developed a Random Forest classifier [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] in
order to filter out the truly useful synonyms from candidates that
are related but not always synonymous.
      </p>
      <p>
        3.2.1 Training Dataset and Classifier Information. Our
random forest classifier for filtering useful synonym candidates is trained
on human judged binary labels indicating the applicability of the
synonyms to query expansions (label=1 for candidates that are
true synonyms, label=0 for candidates that are irrelevant or related
but not synonymous). For example, the pair shoe - shoes will be
labeled as 1, while the pair shoes - sandals will be labeled as 0. The
training dataset itself is a weighted sample of synonym candidates
from each synonym mining technique, where the weights are
determined based on the quality of synonyms from each source - i.e. we
oversample synonym types that have a larger fraction of negatives.
For example, through our internal judgment of data quality, we
empirically identified that synonyms based on external knowledge
sources have a higher false positive rate than those from stemming,
and hence we give a higher weight to synonyms from external
sources compared to the ones from stemming. A training example
pair of trigger n-gram and synonym expansion is labeled as relevant
if and only if adding the synonym as an expansion to the original
terms in the query n-gram brings in additional relevant inventory.
The classifier has 100 trees with a tree depth of 20. We use the
scikit-learn (sklearn) package for Python [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] for the classifier operations.
      </p>
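      <p>A minimal sketch of this classifier setup, assuming synthetic placeholder features and labels (the production model is trained on the human judged dataset with the feature set described below):</p>
      <preformat>
```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic placeholder data: 500 candidates, 10 stand-in features.
rng = np.random.default_rng(0)
X = rng.random((500, 10))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)  # placeholder binary labels

# 100 trees with a maximum depth of 20, as described in the text.
clf = RandomForestClassifier(n_estimators=100, max_depth=20, random_state=0)
clf.fit(X, y)
scores = clf.predict_proba(X)[:, 1]  # per-candidate synonym scores
```
      </preformat>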
      <p>3.2.2 Feature Engineering. A number of behavioral and
lexical features (133 in total) are extracted for the list of candidate
synonyms to be evaluated by the classifier. For behavioral features,
we use clicks, sales and other associated engagement signals from
historic user sessions (inclusive of SRP-level data as well, for
query-item and item-item transitions) where the query terms (trigger)
and the corresponding synonym candidate occurred. For this we
sample queries and engaged items in a 2 week window. We are
unable to reveal the full set of behavioral features used beyond the
top 10. Below are examples of two of the behavioral features that
appear in the top 10:</p>
      <p>ExpansionInQuerySaleTriggerInTitle = Σ_{i=1}^{N} [ExpansionInQuerySale_i | TriggerInTitle_i]</p>
      <p>For a given trigger and its expansion, the above equation
calculates the number of times a sale event happens when the trigger
appears in an item title, and the expansion appears in the search
query.</p>
      <p>TriggerInQueryClickBothTriggerExpansionInTitle = Σ_{i=1}^{N} [TriggerInQueryClick_i | BothTriggerExpansionInTitle_i]</p>
      <p>Similarly, for a given trigger and its expansion, the above
equation calculates the number of times a click event happens when the
trigger appears in the query and both the trigger and the expansion
appear in an item title. Here N is the number of query-title
transitions considered. Additionally we also derive ratio-based features
from these count-based features.</p>
      <p>We also construct lexical features based on word semantics such
as presence of numeric characters, string lengths, and lengths of
matched and non-matched parts in trigger terms and expansions.
To separate out the most predictive features, we also added three
random number features. For the final model we used all features
with a feature importance greater than the first random feature,
and ended up with 50 features in total.</p>
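      <p>The random-feature trick for feature selection can be sketched as follows, with synthetic data: append random columns, fit the model, and keep only real features whose importance exceeds that of the first random feature. Shapes and names here are illustrative.</p>
      <preformat>
```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X_real = rng.random((400, 8))                 # 8 stand-in real features
y = (X_real[:, 0] > 0.5).astype(int)          # label depends on feature 0 only
X = np.hstack([X_real, rng.random((400, 3))]) # append 3 random features

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
imp = clf.feature_importances_
cutoff = imp[8]  # importance of the first random feature
kept = [i for i in range(8) if imp[i] > cutoff]
```
      </preformat>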
      <p>3.2.3 Threshold tuning for classifier scores. After the ML
classifier has predicted a score for each synonym candidate mined
through some of the processes detailed in Section 3.1, we perform
a threshold tuning step to filter out the final set of good synonyms.
The threshold selection is based on analyzing the score
distribution on the training dataset in positive and negative classes and
selecting a cutoff that balances classifier performance and
false negative rate. Figure 2 shows the analysis on the individual
mining processes discussed in Section 3.1. With 1276 human labeled
training samples in the English sites, we look at the distribution
of random forest classifier scores for the negative (Figure 2A) and
positive (Figure 2B) labels, sub-sampled across space synonyms (27
positive labels, 240 negative labels), stem synonyms (70 positive labels,
153 negative labels), wordnet synonyms (200 positive labels, 131
negative labels), stop word synonyms (3 positive labels, 55 negative
labels), and phrases (129 positive labels, 50 negative labels) and
select a threshold of 0.3 where 85% of the negative labels and only
5% of the positive training labels are filtered out. The final filtered
synonyms are stored in a dictionary that is used for runtime query
rewriting.</p>
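      <p>The thresholding step can be sketched with synthetic score distributions; the beta-distributed scores below are stand-ins for the classifier's scores on negative- and positive-labeled candidates, not our actual distributions.</p>
      <preformat>
```python
import numpy as np

def filtered_fraction(scores, threshold):
    """Fraction of candidates scoring below the threshold (filtered out)."""
    return float(np.mean(threshold > np.asarray(scores)))

# Synthetic score distributions: negatives cluster low, positives high.
rng = np.random.default_rng(2)
neg = rng.beta(2, 8, size=1000)
pos = rng.beta(8, 2, size=1000)

threshold = 0.3
neg_out = filtered_fraction(neg, threshold)  # large: most negatives removed
pos_out = filtered_fraction(pos, threshold)  # small: few positives lost
```
      </preformat>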
    </sec>
    <sec id="sec-11">
      <title>EVALUATION RESULTS</title>
      <p>In this section we discuss the experimental results for our approach.
In Section 4.1, we share details about the synonym datasets, before
and after filtering with the synonym classifier. In Section 4.2 we
report metrics for the synonym classifier described in Section
3.2; we cover the impact of synonym based query expansions on the
number of search results returned across three of our major markets
in Section 4.3, present results from our A/B tests in Section 4.4,
and finally share some anecdotal examples in Section 4.5.</p>
    </sec>
    <sec id="sec-12">
      <title>Synonym Datasets</title>
      <p>We filter the synonym candidates generated with various
techniques described in Section 3.1 using a classifier trained on human
judged labels, where a pair of query n-gram and its expansion
is judged based on the utility of the expansion to bring in
additional relevant items matching the query n-gram. For example,
(box, boxing) → irrelevant(label = 0) because adding boxing as a
synonym for box could change the intent from box as a container
to boxing the sport, although they do qualify as stem synonyms,
whereas (rug, carpet) → relevant(label = 1) because adding
carpet as an alternate expansion to rug brings in additional relevant
items for the query. We trained the synonym classifier using 4K
training records from our three biggest markets (US, UK, and
Germany) with 10 fold cross validation. Within this dataset, 1.8K are
positive labels and 2.2K are negative labels. We also use 300 records
as holdout set for final model testing. For generating final
synonym candidates we sampled 2M synonym pairs, from which we
filtered 940K synonym pairs using the ML classifier based synonym
selection process as discussed in Section 3.2.</p>
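      <p>The evaluation protocol (10-fold cross-validation plus a holdout set for final model testing) can be sketched as follows with synthetic data; the shapes and the holdout fraction are illustrative, not the exact 4K/300 splits above.</p>
      <preformat>
```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(3)
X = rng.random((600, 10))              # stand-in feature matrix
y = (X[:, 0] > 0.5).astype(int)        # stand-in labels

X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.1, random_state=0)

clf = RandomForestClassifier(n_estimators=100, max_depth=20, random_state=0)
cv_auc = cross_val_score(clf, X_train, y_train, cv=10, scoring="roc_auc")
holdout_acc = clf.fit(X_train, y_train).score(X_hold, y_hold)
```
      </preformat>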
    </sec>
    <sec id="sec-13">
      <title>ML Classifier Results</title>
      <p>In this section we discuss in detail the model's performance on the
holdout set. Figure 3 shows the AUC and AUPRC performance of
the random forest synonym classification model on the holdout
data. We achieved an AUC of 0.87 and an AUPRC of 0.84 on the
holdout data.</p>
      <p>Figure 4 shows the top 10 important features picked up by the
model. We are unable to disclose the full list of features considered
and the specifics of the different engagement signals. We indicate
different engagement signals (denoted ef in the figure) with
numbers when they are used as standalone features, and as Combined
when all the different signals are combined to capture a certain
feature. We observe that most of the top features picked up by the
model are a combination of behavioral and semantic features. For
example, the most important feature is an engagement signal
capturing the fraction of that engagement signal for queries where the
trigger n-gram is present in both the query and title, and the user
behavioral signals are combined together. Another important signal
is the number of items engaged with where the trigger n-gram was
in the query and both the expansion and the trigger n-grams were
present in the title of the engaged items.</p>
      <p>Figure 5 shows the variation of prediction scores across different
types of synonyms and the different transition types discussed
in Section 3.1. For instance, we observe that for stemming based
synonyms (stemsyn) the median classifier score is lower than the
median score for compounding based synonyms (spacesyn). In
Figure 6 we show the absolute error of the predictions made by the
classifier across different rewrite types; consistent with the
previous observation, the absolute error for stemming based
synonyms overall seems higher than that of the space synonyms for
most subtypes.
Another interesting observation we made is that the change in
test error for the given model with increasing training data size
saturates after 900 records. This suggests that with the given model
complexity and feature set, a small training dataset has a reasonably
high discriminating capability. Figure 7 shows the change in test
error with increasing training data size.</p>
    </sec>
    <sec id="sec-14">
      <title>Recall Impact of Query Rewriting</title>
      <p>We sampled queries across the three major sites and ran an
experiment where we recorded the mean number of search results across
the query set (recall size), with and without including these
synonym based query expansions. The result is shown in Figure 8. This
analysis shows the direct recall change we make in eBay search
using synonym query expansions (on average 3x or more items are
returned with synonym query expansions) for the impacted query
set.</p>
    </sec>
    <sec id="sec-15">
      <title>4.4 A/B Test Results</title>
      <p>We ran online A/B tests to evaluate the value of the synonym mining
and filtering techniques described in this work. For the test we used
33 percent of the live traffic for test and control, run over a period
of 3 weeks. We observed improvements in both relevance of the
search result page (SRP) and user engagement across all markets,
but are unable to share more specifics on the business metrics.</p>
    </sec>
    <sec id="sec-16">
      <title>4.5 Anecdotal Examples</title>
      <p>In this section we show an example of how good rewrites for a given
term help retrieve more relevant results. The example shows the rewrite
airpods for the term air pods in Figure 9. As we can see, 6 out of
the 8 top listings for the query don’t contain the original query
term, and are present in the SRP due to the synonym expansion.
All these results are relevant to the original query.</p>
    </sec>
    <sec id="sec-17">
      <title>5 CONCLUSION</title>
      <p>In this work, we presented our system for e-commerce query
rewriting through token expansions. We described our end-to-end process
for generating token synonym candidates, evaluating them and
incorporating them for improving retrieval. We also shared our results
in terms of the quality of the query rewrites filtered with a machine
learned classifier when evaluated against a human labeled dataset
as well as its value in improving the quality of search results. For
future work, we plan to incorporate more techniques for mining
synonym candidates, bring more context into the synonym
selection process, and leverage other sources such as knowledge
graphs for query rewriting.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Ricardo</given-names>
            <surname>Baeza-Yates</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Tiberi</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Extracting semantic relations from query logs</article-title>
          .
          <source>In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM</source>
          ,
          <volume>76</volume>
          -
          <fpage>85</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Leo</given-names>
            <surname>Breiman</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>Random Forests</article-title>
          .
          <source>Mach. Learn</source>
          .
          <volume>45</volume>
          ,
          <issue>1</issue>
          (Oct.
          <year>2001</year>
          ),
          <fpage>5</fpage>
          -
          <lpage>32</lpage>
          . https://doi.org/10.1023/A:1010933404324
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Hang</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ji-Rong</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <string-name>
            <surname>Jian-Yun Nie</surname>
          </string-name>
          , and
          <string-name>
            <surname>Wei-Ying Ma</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Probabilistic query expansion using query logs</article-title>
          .
          <source>In Proceedings of the 11th international conference on World Wide Web. ACM</source>
          ,
          <fpage>325</fpage>
          -
          <lpage>332</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Ingo</given-names>
            <surname>Feinerer</surname>
          </string-name>
          and
          <string-name>
            <given-names>Kurt</given-names>
            <surname>Hornik</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>wordnet: WordNet Interface</article-title>
          . https://CRAN.R-project.org/package=wordnet R package version 0.1-14.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Christiane</given-names>
            <surname>Fellbaum</surname>
          </string-name>
          .
          <year>1998</year>
          .
          <article-title>WordNet: An Electronic Lexical Database</article-title>
          . Bradford Books.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Jianfeng</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xiaodong</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Shasha</given-names>
            <surname>Xie</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Alnur</given-names>
            <surname>Ali</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Learning lexicon models from search logs for query expansion</article-title>
          .
          <source>In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics</source>
          ,
          <fpage>666</fpage>
          -
          <lpage>676</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Jianfeng</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Gu</given-names>
            <surname>Xu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jinxi</given-names>
            <surname>Xu</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Query expansion using path-constrained random walks</article-title>
          .
          <source>In Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval. ACM</source>
          ,
          <fpage>563</fpage>
          -
          <lpage>572</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Julio</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          , Felisa Verdejo, Irina Chugur, and
          <string-name>
            <given-names>Juan</given-names>
            <surname>Cigarran</surname>
          </string-name>
          .
          <year>1998</year>
          .
          <article-title>Indexing with WordNet synsets can improve text retrieval</article-title>
          .
          <source>arXiv preprint cmp-lg/9808002</source>
          (
          <year>1998</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Yunlong</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jiliang</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Hua</given-names>
            <surname>Ouyang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Changsung</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Dawei</given-names>
            <surname>Yin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Yi</given-names>
            <surname>Chang</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Learning to rewrite queries</article-title>
          .
          <source>In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. ACM</source>
          ,
          <fpage>1443</fpage>
          -
          <lpage>1452</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Chien-Kang</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Lee-Feng</given-names>
            <surname>Chien</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Yen-Jen</given-names>
            <surname>Oyang</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Relevant term suggestion in interactive web search based on contextual information in query session logs</article-title>
          .
          <source>Journal of the American Society for Information Science and Technology 54</source>
          ,
          <issue>7</issue>
          (
          <year>2003</year>
          ),
          <fpage>638</fpage>
          -
          <lpage>649</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Rosie</given-names>
            <surname>Jones</surname>
          </string-name>
          , Benjamin Rey, Omid Madani, and Wiley Greiner.
          <year>2006</year>
          .
          <article-title>Generating query substitutions</article-title>
          .
          <source>In Proceedings of the 15th international conference on World Wide Web. ACM</source>
          ,
          <fpage>387</fpage>
          -
          <lpage>396</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Rila</given-names>
            <surname>Mandala</surname>
          </string-name>
          , Takenobu Tokunaga, and
          <string-name>
            <given-names>Hozumi</given-names>
            <surname>Tanaka</surname>
          </string-name>
          .
          <year>1999</year>
          .
          <article-title>Combining multiple evidence from different types of thesaurus for query expansion</article-title>
          .
          <source>In SIGIR</source>
          , Vol.
          <volume>99</volume>
          .
          <fpage>15</fpage>
          -
          <lpage>19</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Roberto</given-names>
            <surname>Navigli</surname>
          </string-name>
          and
          <string-name>
            <given-names>Paola</given-names>
            <surname>Velardi</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>An analysis of ontology-based query expansion strategies</article-title>
          .
          <source>In Proceedings of the 14th European Conference on Machine Learning, Workshop on Adaptive Text Extraction and Mining</source>
          , Cavtat-Dubrovnik, Croatia. Citeseer,
          <fpage>42</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Duchesnay</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Scikit-learn: Machine Learning in Python</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          (
          <year>2011</year>
          ),
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Martin F.</given-names>
            <surname>Porter</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>Snowball: A language for stemming algorithms</article-title>
          .
          <source>Published online (October 2001)</source>
          . http://snowball.tartarus.org/texts/introduction.html Accessed 11.03.2008, 15.00h.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Yonggang</given-names>
            <surname>Qiu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Hans-Peter</given-names>
            <surname>Frei</surname>
          </string-name>
          .
          <year>1993</year>
          .
          <article-title>Concept based query expansion</article-title>
          .
          <source>In Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval. ACM</source>
          ,
          <fpage>160</fpage>
          -
          <lpage>169</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Riezler</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yi</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Query rewriting using monolingual statistical machine translation</article-title>
          .
          <source>Computational Linguistics</source>
          <volume>36</volume>
          ,
          <issue>3</issue>
          (
          <year>2010</year>
          ),
          <fpage>569</fpage>
          -
          <lpage>582</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Ariel S.</given-names>
            <surname>Schwartz</surname>
          </string-name>
          and
          <string-name>
            <given-names>Marti A.</given-names>
            <surname>Hearst</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>A Simple Algorithm for Identifying Abbreviation Definitions in Biomedical Text</article-title>
          .
          <source>In Pacific Symposium on Biocomputing</source>
          .
          <fpage>451</fpage>
          -
          <lpage>462</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Prathyusha</given-names>
            <surname>Senthil Kumar</surname>
          </string-name>
          , Vamsi Salaka, Tracy Holloway King, and Brian Johnson.
          <year>2014</year>
          .
          <article-title>Mickey Mouse is not a Phrase: Improving Relevance in E-Commerce with Multiword Expressions</article-title>
          .
          <source>In Proceedings of the 10th Workshop on Multiword Expressions (MWE)</source>
          .
          Association for Computational Linguistics, Gothenburg, Sweden,
          <fpage>62</fpage>
          -
          <lpage>66</lpage>
          . https://doi.org/10.3115/v1/W14-0810
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Trotman</surname>
          </string-name>
          , Jon Degenhardt, and
          <string-name>
            <given-names>Surya</given-names>
            <surname>Kallumadi</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>The architecture of ebay search</article-title>
          .
          <source>In ACM SIGIR Forum. ACM.</source>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Mike</given-names>
            <surname>Wallace</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Jawbone Java WordNet API</article-title>
          . http://mfwallace.googlepages.com/jawbone
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Jinxi</given-names>
            <surname>Xu</surname>
          </string-name>
          and
          <string-name>
            <given-names>W. Bruce</given-names>
            <surname>Croft</surname>
          </string-name>
          .
          <year>1996</year>
          .
          <article-title>Query Expansion Using Local and Global Document Analysis</article-title>
          .
          <source>In Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '96)</source>
          . ACM, New York, NY, USA,
          <fpage>4</fpage>
          -
          <lpage>11</lpage>
          . https://doi.org/10.1145/243199.243202
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>