<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>SIGIR Workshop on eCommerce, Jul</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Enhancement by Early Product Categorization</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gregory Goren</string-name>
          <email>ggoren@ebay.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ido Guy</string-name>
          <email>idoguy@acm.org</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Slava Novgorodov</string-name>
          <email>slavanov@post.tau.ac.il</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>eBay Research</institution>
          ,
          <addr-line>Netanya</addr-line>
          ,
          <country country="IL">Israel</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>17</volume>
      <issue>2025</issue>
      <abstract>
        <p>Product categorization in e-commerce platforms is the task of placing and organizing products into their respective classes. It has attracted a lot of research interest and attention as one of the most fundamental tasks in e-commerce. Categorization has high importance for both buyers and sellers, and plays a significant role for various downstream tasks such as product search, price comparison, and complementary product recommendation. In this work, we study product categorization based solely on the product's title and, specifically, prefixes of the title. This aims at identifying the category at an early stage of the selling flow, the process in which a seller uploads an item offered for sale to the e-commerce platform. Once the item's category is identified, the rest of the selling process can be adapted accordingly and expedited towards a smooth conclusion. We perform an extensive analysis of title prefix categorization, inspecting to what degree the product categorization task could be effectively accomplished while only using the beginning of the title. To this end, we propose BERT-Attrs, an extension of BERT that considers, in addition to the prefix's representation, also the association of its tokens with attributes, such as brand, color, or material. Evaluation, conducted over datasets from two of the world's largest e-commerce platforms, with hundreds of categories, considers the prefix-based categorization task from both classification and recommendation points of view. To the best of our knowledge, we are the first to introduce and study the task of product categorization based on title prefixes.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        E-commerce platforms have demonstrated tremendous growth over the past decade, with the
number of products available for online shopping rapidly increasing year over year. Maintaining
these products in an organized manner requires much effort both from the sellers who upload
their inventory for sale and from the e-commerce platforms hosting these products. Sellers are
expected to provide as accurate product data as possible, which in turn has a major effect on
buyers’ purchase likelihood and the marketplace’s success as a whole [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. E-commerce platforms
need to match products offered for sale across multiple sellers, present them to buyers in an
engaging manner, and, ultimately, facilitate the transaction between sellers and buyers.
      </p>
      <p>
        When sellers upload a new product for sale on an e-commerce platform, they need to provide
much information, so that it can be correctly and attractively presented to potential buyers.
Often, the upload process involves coming up with a product title, selecting the most relevant
category in the platform’s product taxonomy, deciding which product attributes to provide,
uploading images, writing a product description, and providing information pertaining to price,
shipping, and available quantity in stock. We refer to this process, supported by a dedicated
user interface in leading e-commerce platforms, as the selling flow (also sometimes referred to
as the listing flow) [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. This process usually consists of many steps and is known to often be
cumbersome, requiring a substantial time investment from sellers and sometimes forming an
entry barrier for the selling process as a whole [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. In particular, sellers often struggle to adapt
to the specific taxonomy and terminology of the e-commerce platform, come up with the best
title for their product, and verify that the most important attributes are provided, to ensure that
the product is easy to find, highly ranked by recommendation algorithms, attracts the buyer’s
attention, and is perceived as a quality purchase [
        <xref ref-type="bibr" rid="ref1 ref6">1, 6</xref>
        ]. Research has therefore been devoted to
optimizing various phases of the selling flow, in forms such as price guidance (e.g., [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]) and
title optimization (e.g., [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ]).
      </p>
      <p>
        A first step that can play a key role in making the selling flow smoother is the identification of
the correct category of the product offered for sale. Once the offered product can be associated
with a category or type (e.g., a wristwatch, a juicer, or a backpack), the selling flow can be
adapted to facilitate the remainder of the process for the seller and enable a swift and productive
conclusion. Specifically, once the category is known, the seller can be directed to provide the
most relevant information for that category. For instance, the platform can explicitly ask the
seller to provide the name of the network carrier and the capacity of the internal storage if it
categorizes the product as a smartphone. In other cases, the platform can redirect the seller to
a separate flow if it detects the specific product category. For example, a platform for selling
consumer electronics may not support personal computers and therefore redirect sellers to an
affiliated website when it detects that they try to sell a laptop. Furthermore, domain-specific
tools can be applied on top of seller-provided information once the product’s category is known.
For instance, computer vision models for identifying product attributes that specialize in specific
business verticals (e.g., Jewelry, Toys) can be applied to seller-provided images [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        Automatic product categorization is one of the most fundamental tasks in e-commerce and
has received attention in different studies over the last two decades. Many works addressed this task
under varying assumptions and product facets as the sources of data. Several works use product
descriptions and reviews (e.g., [
        <xref ref-type="bibr" rid="ref12 ref13 ref14">12, 13, 14</xref>
        ]), while others use a combination of titles and images
(e.g., [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]). Since the title is the most essential and typically the first facet of the product provided
by sellers, multiple works have focused on the task of product categorization based on the title
only [
        <xref ref-type="bibr" rid="ref16 ref17 ref18">16, 17, 18</xref>
        ]. Yet, coming up with a good title can be a demanding task [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In this work,
we therefore suggest that categorization can be effectively performed before a complete title is
typed in. Such early categorization, based on the first few typed tokens of the first draft of the
title, can help enhance the selling flow. Devising a method that can address this task based on
as few tokens as possible is the focus of our study.
      </p>
      <p>
        It should be noted that manual product categorization (i.e., when sellers explicitly assign a
category to the product at the beginning of the selling flow) is quite difficult and sometimes even
unfeasible. First, the seller should be familiar with the platform’s set of categories (which may
consist of hundreds and sometimes thousands of categories). Moreover, the category
structure and hierarchy may differ from one platform to another, making the process even more
cumbersome and confusing. It is therefore not surprising that all large e-commerce platforms
apply an automatic approach for product categorization [
        <xref ref-type="bibr" rid="ref17 ref19">17, 19</xref>
        ].
      </p>
      <p>
        Figure 1 shows an example of the potential operation of early categorization as part of the
selling flow on eBay, one of the world’s largest e-commerce platforms. This example considers
a seller who intends to upload a camera and comes up with the title Sony Black CyberShot
W830 20MP 8x Zoom Compact Digital Camera. This title contains 10 tokens; however, the
product category (digital camera) is mentioned only at its end. As the seller starts to type the
title, it is difficult to identify an accurate category based on the first two tokens, since Sony is a
manufacturer of many types of electronic goods (e.g., TVs, Cameras, Smartphones) and Black
is a common color across a wide variety of categories within the Electronics business vertical.
However, the third token (CyberShot) is a well-known series of digital cameras, therefore already
disclosing the product’s type without an explicit mention. This early category recognition allows
for applying an attribute extraction model [
        <xref ref-type="bibr" rid="ref20 ref21 ref22">20, 21, 22</xref>
        ] specialized in the cameras domain, which
can identify the existing attributes from the title’s prefix (brand, color, and series, in our example)
and prompt the seller to fill in the remaining key attributes for this type of product (model,
zoom, and resolution), after they have typed only 3 tokens rather than 10. Once the seller has
provided these, they may never need to type the whole title. The platform can automatically
suggest a title based on the already-provided attributes (potentially reusing titles of existing
products uploaded by other sellers) and the seller’s eforts may focus on uploading some images
or providing price and stock information.
      </p>
      <p>
        To our knowledge, all prior efforts on product categorization considered the product title as a
whole [
        <xref ref-type="bibr" rid="ref16 ref17 ref18">16, 17, 18</xref>
        ], based on a user-provided indication that the title has been completed (e.g.,
clicking on the next requested field). Since product titles in e-commerce are typically long – our
analysis indicates that the median title length on eBay is 12 tokens – early categorization can
save substantial effort. Specifically, we consider early categorization based on at most half
of the title’s tokens, aiming to save at least half of the title typing effort. No less important,
however, is the ability to suggest key attributes to the seller before having to complete a full
title (see last step in Figure 1). This spares the need to apply attribute extraction techniques
over the rest of the title [
        <xref ref-type="bibr" rid="ref20 ref23">20, 23</xref>
        ], saves the seller from having to familiarize themselves with the taxonomy
and terminology of the e-commerce platform, and allows the platform to automatically suggest a full title
that the seller can use.
      </p>
      <p>
        The task of early categorization based on only very few tokens is challenging, since the input
is short and partial. Our solution is based on state-of-the-art natural language processing
methods, which involve contextual embeddings of the prefix tokens produced by language
models. To allow more efficient category inference, we extend BERT [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] in a novel manner, by
encoding attribute-value information identified in the prefix. While we examine the traditional
classification approach, aiming to identify the one specific category matching the prefix, we also
consider a recommendation approach, which allows involving the seller in the process, asking
them to select one out of few (3 or 5) options as part of the selling flow, after typing the first few
title tokens. Our experiments over large datasets from two of the world’s largest e-commerce
platforms, eBay and Amazon, show the superiority of our novel approach relative to baselines.
      </p>
      <p>Our work’s main contributions can be summarized as follows:
• To the best of our knowledge, we are the first to introduce and study the task of product
categorization based on title prefixes, motivated by the need to enhance the selling flow
experience on e-commerce platforms.
• We provide an extensive analysis of the effect of title prefix length on product categorization
quality, also tying it to the length of the original title.
• We propose a novel method for product categorization that extends BERT to include product
attributes, which we empirically show to consistently improve performance.
• We address the product categorization task as a recommendation problem, which can be an
integral part of the selling flow, and show that this approach yields high performance gains,
making it applicable for short prefixes, with as few as 3 to 4 tokens.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        E-commerce platforms provide a space for buyers and sellers to connect and engage in online
transactions. While much research has been conducted to improve the experience of buyers,
for instance in product search [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], personalized recommendations [
        <xref ref-type="bibr" rid="ref26 ref27">26, 27</xref>
        ], and product review
summarization [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ], less attention has been dedicated to improving the end-to-end selling
experience. Nevertheless, in addition to studies on product categorization (e.g., [
        <xref ref-type="bibr" rid="ref12 ref17 ref29">12, 29, 17</xref>
        ],
discussed in detail below), several works have focused on different phases of the selling process.
Specifically, recent work explored title optimization to help sellers come up with the most effective
and attractive title [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ]; attribute enhancement to ensure that the product information most
important to consumers is provided [
        <xref ref-type="bibr" rid="ref22 ref30 ref31">30, 31, 22</xref>
        ]; and description generation to equip sellers with
automatically-produced descriptions of their products [
        <xref ref-type="bibr" rid="ref32 ref33">32, 33</xref>
        ]. Additionally, research has been
dedicated to the task of price guidance (e.g., [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]), which aims to provide sellers with price
suggestions, typically based on the rate of similar products recently purchased.
      </p>
      <p>
        Our work examines the task of product categorization, motivated by the desire to enhance
the selling flow as early as possible, by adapting it to the type of the product offered for sale.
To this end, we take a novel approach by using only the first few tokens of a title to predict
the product’s category, rather than relying on the whole title or additional information such
as description, attributes or images, as has been done in previous work (e.g., [
        <xref ref-type="bibr" rid="ref14 ref15 ref34">14, 34, 15</xref>
        ]). To
the best of our knowledge, we are the first to examine such early categorization based on title
prefixes.
      </p>
      <p>
        In general, the task of product categorization in e-commerce has received a lot of attention
in the last two decades. Several works used product descriptions to identify a product’s
category [
        <xref ref-type="bibr" rid="ref12 ref13 ref14 ref29 ref35">29, 12, 14, 13, 35</xref>
        ]. For instance, Liu and Wangperawong [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] fine-tuned a BERT model
for product categorization via descriptions. Their work highlights the effectiveness of BERT with
respect to XLNet [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ] over that task. Other works performed categorization based on additional
facets of the product, such as reviews [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] or images [
        <xref ref-type="bibr" rid="ref37 ref38">37, 38</xref>
        ]. Using images for categorization in
e-commerce can be challenging, as they are often of poor quality or missing altogether [
        <xref ref-type="bibr" rid="ref39 ref40">39, 40</xref>
        ].
      </p>
      <p>
        A few works used solely the title for product categorization, since it is typically available
for all products. Shen et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] proposed a hierarchical approach that decomposed the
categorization problem into a coarse-level task and a fine-level task for categorizing the title.
Paulucio et al. [
        <xref ref-type="bibr" rid="ref41">41</xref>
        ] used a fine-tuned BERT model’s title embedding as an input to additional
machine learning models for product categorization. Their work showed that BERT was able to
produce effective representations of product titles for the categorization task. Other works used
the title for hierarchical categorization and highlighted the differences between title and text
classification [
        <xref ref-type="bibr" rid="ref17 ref18">17, 18</xref>
        ]. For instance, stemming and stopword removal can both be beneficial for
text classification, but are not suggested for title classification. As mentioned above, our work
differs from prior efforts by its focus on incomplete information, i.e., studying the categorization
task over title prefixes.
      </p>
      <p>
        A body of research related to our work focuses on short text classification. The core challenge
of this task is to address the lack of available information in short texts. A number of works
proposed methods that employ external information, such as latent topical representation and
taxonomy-based semantic features, to enhance the representation of the short texts [
        <xref ref-type="bibr" rid="ref42 ref43">42, 43</xref>
        ].
Moreover, query classification can be viewed as a short text classification problem. The average
length of a typed query in web search ranges between 1.9 and 3.2 tokens [
        <xref ref-type="bibr" rid="ref44 ref45">44, 45</xref>
        ]. Several
studies tackled query classification by exploiting different types of external information to
overcome the ambiguity manifested by the shortness of queries [
        <xref ref-type="bibr" rid="ref46 ref47 ref48">46, 47, 48</xref>
        ]. Other works used
deep neural models for the task of query classification for event categorization and extreme
classification [
        <xref ref-type="bibr" rid="ref49 ref50">49, 50</xref>
        ]. Our work has two primary differences from short text classification. First,
short text classification typically addresses short textual snippets that are complete, while in our
case the prefix only covers the beginning of the title and may miss key information that appears
later. Second, to our knowledge, none of the existing works on short text classification specifically
focuses on product titles, which pose unique characteristics and loose language structure, with a
high number of nouns (attribute values) and low grammaticality [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Prefix-Based Categorization</title>
      <p>In this section, we describe the product categorization task we examine, as well as the
models and methods we use to address it.</p>
      <sec id="sec-3-1">
        <title>3.1. Categorization approaches</title>
        <p>
The task of product categorization aims to identify the category of the product given its
characteristics. In our work, the source of information taken into account for conducting the
task is only the product’s title, and specifically a prefix of the title. We examine categorization
in its classic approach, as a classification problem. To further improve performance in a way that
would be applicable early in the selling flow, we also suggest and evaluate a recommendation
approach, which entails the suggestion of the top-k most probable categories. Both approaches
require a learned model and the recommendation approach also requires the cooperation of the
seller in selecting the correct category out of a short list of suggested categories during the selling
flow. The benefit of the recommendation approach is that it allows more room for error, since
identifying the correct category among the top-k is sufficient. On the other hand, classification
can facilitate an automatic flow without any seller involvement (e.g., via an API that allows
batch listing of multiple items). It is therefore interesting to examine the trade-off between these
approaches.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Models</title>
        <p>
          Our experimental setting focuses on analyzing the performance of models based on the
pretrained BERT architecture [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ], which has been previously shown to represent product titles
effectively for categorization and outperformed other pre-trained language models [
          <xref ref-type="bibr" rid="ref36 ref41">36, 41</xref>
          ]. The
pre-training corpora are the BookCorpus dataset [
          <xref ref-type="bibr" rid="ref51">51</xref>
          ] and English Wikipedia. We fine-tune our
pre-trained models with an additional classification layer W ∈ ℝ^(K×H) by optimizing the cross-entropy
loss:
L = − log ( exp((Wh)_c) / Σ_{j=1}^{K} exp((Wh)_j) ),
where K is the number of categories, H is the dimension of BERT’s hidden state representation,
c is the index of the correct category, and h ∈ ℝ^H is the final hidden state representation of the
special [CLS] token. We estimate the category association probability of category i via
p_i = exp((Wh)_i) / Σ_{j=1}^{K} exp((Wh)_j), and we
use the ordering induced from these probabilities for both the classification and recommendation
approaches. We examined multiple training techniques over title prefixes, along with different
input enhancement methods, and compared them to learning based on complete titles.
        </p>
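The scoring described above can be sketched as follows. This is an illustrative NumPy computation with toy dimensions and random weights standing in for the fine-tuned classification layer, not the paper's implementation; it shows how a single softmax ordering serves both the classification (argmax) and recommendation (top-k) approaches.

```python
import numpy as np

def category_scores(h, W):
    """Softmax over the classification layer applied to the [CLS] state h.

    h: (H,) final hidden state of the [CLS] token; W: (K, H) classification layer.
    Returns a (K,) vector of category association probabilities.
    """
    logits = W @ h                      # (K,) one score per category: (Wh)_i
    logits = logits - logits.max()      # stabilize the softmax numerically
    probs = np.exp(logits)
    return probs / probs.sum()

# Toy example: H = 8 hidden dimensions, K = 5 categories.
rng = np.random.default_rng(0)
h = rng.normal(size=8)
W = rng.normal(size=(5, 8))
p = category_scores(h, W)
predicted = int(np.argmax(p))           # classification approach: the single top category
top3 = np.argsort(p)[::-1][:3]          # recommendation approach: top-k (k = 3) suggestions
```

The same probability vector induces both outputs, which is what makes the classification/recommendation trade-off a pure evaluation choice rather than a modeling change.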
        <p>In addition to different methods utilizing BERT, we also inspect the results of an LSTM, a
recurrent neural network based on the long short-term memory architecture [52], with pre-trained
word2vec embeddings using continuous bag of words (CBOW) with negative sampling [53].</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Title prefix training methods</title>
        <p>Throughout this work, we considered prefixes of length p ∈ [1, ⌊n/2⌋], where n is the
title length in tokens. For instance, for titles of length 12, we examined prefixes of length 1 to 6
tokens. We inspected two different learning methods for prefix categorization. The first trains
only on complete product titles and the second trains on prefixes. Specifically for the latter, for
each title in the training set, in each iteration of the training process (epoch), a prefix length is
drawn uniformly at random out of the [1, ⌊n/2⌋] range and the corresponding prefix is used for
training.1</p>
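The prefix-drawing scheme can be sketched as follows. This is a minimal illustration; whitespace tokenization and the guard for one-token titles are our assumptions, not details stated in the paper.

```python
import random

def sample_prefix(title_tokens, rng=random):
    """Draw a training prefix whose length is uniform in [1, floor(n/2)],
    where n is the title length in tokens."""
    n = len(title_tokens)
    max_len = max(1, n // 2)           # guard: very short titles still yield one token
    k = rng.randint(1, max_len)        # inclusive on both ends
    return title_tokens[:k]

# The 10-token camera title from the introduction yields prefixes of 1 to 5 tokens.
title = "Sony Black CyberShot W830 20MP 8x Zoom Compact Digital Camera".split()
prefix = sample_prefix(title)
```

A fresh prefix length is drawn for each title in each epoch, so across training the model sees every title at many truncation points.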
        <p>As we will later show, prefix-based training substantially outperformed training based on
complete titles. We therefore proceeded to build our models based on title prefix training. Since
title prefixes consist of very few tokens, we sought to extend their representation with additional
information to allow more effective learning, as described in the next section.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Attribute-enhanced BERT</title>
        <p>In addition to the standard BERT-based approach (training using randomized prefixes), we
devised an extension of BERT, termed BERT-Attrs, which makes use of novel input enhancement
methods. As will be shown in Section 5, BERT consistently outperformed LSTM when trained
on title prefixes, and we therefore opted to focus on a BERT-based extension. Concretely,
BERT-Attrs leverages additional information regarding the attributes of a product, extracted using a
state-of-the-art named entity recognition (NER) method specialized and trained for attribute
extraction from product titles [54]. This NER method aims at associating each token in the title
prefix with an attribute name such as brand, color, or size.2 Overall, 63.68% of the tokens in the
training set could be associated with an attribute name. The extracted attribute information
is then injected into the BERT input as an additional sentence of the form: “[AttrName1]
value1 [AttrName2] value2 … [AttrNameN] valueN”, where each [AttrNameI] is defined
as a special token. For example, for the prefix:</p>
        <p>“green adidas cotton”
the derived “attribute tokens” sentence would be:</p>
        <p>“[Color] green [Brand] adidas [Material] cotton”.</p>
        <p>In cases where prefix tokens could not be associated with an attribute name, BERT-Attrs
injects them along with the corresponding special token [UNKNOWN] as their attribute name.
Considering the example above, suppose the last token in “green adidas cotton” could not be
classified into an attribute name; BERT-Attrs would then inject the sentence:</p>
        <p>
          “[Color] green [Brand] adidas [UNKNOWN] cotton”
Based on this injection, the whole input to BERT is of the form:
“[CLS] &lt;prefix tokens&gt; [SEP] &lt;attribute tokens&gt; [SEP]”
where [CLS] and [SEP] are standard tokens, used as in the original BERT [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ], to denote the start
of the input and a separation between two input sentences, respectively. To examine whether
the first part of this input, i.e., the prefix in its original form, is necessary to include in addition
to the attribute tokens, we also experimented with a variant that only considers the attribute
tokens as the prefix’s representation input to BERT. We refer to this as BERT-AttrsOnly.
        </p>
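The BERT-Attrs input construction can be illustrated as follows. This is a minimal sketch under the assumption that the NER model's output is available as (token, attribute) pairs, with None marking tokens it could not resolve; the helper name is ours.

```python
def build_bert_attrs_input(tagged_prefix):
    """Form the BERT-Attrs input from (token, attribute) pairs.

    Unresolved tokens (attribute None) are injected with the special
    [UNKNOWN] attribute token, as described in the text.
    """
    prefix = " ".join(tok for tok, _ in tagged_prefix)
    attrs = " ".join(
        f"[{attr if attr is not None else 'UNKNOWN'}] {tok}"
        for tok, attr in tagged_prefix
    )
    # [CLS] starts the input; [SEP] separates the raw prefix from the attribute sentence.
    return f"[CLS] {prefix} [SEP] {attrs} [SEP]"

inp = build_bert_attrs_input([("green", "Color"), ("adidas", "Brand"), ("cotton", None)])
# → "[CLS] green adidas cotton [SEP] [Color] green [Brand] adidas [UNKNOWN] cotton [SEP]"
```

Dropping the first segment (the raw prefix) from this string yields the BERT-AttrsOnly variant.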
        <p>In addition to the methods mentioned above, we examined several other representations that
consider attribute information as part of the input to BERT. First, we considered a variant of
BERT-Attrs that excludes tokens that cannot be associated with an attribute from the attribute
injection altogether. In other words, only tokens that can be associated with an attribute name
are included in the attribute-based representation that follows the original prefix. Second, we
experimented with a variation that aims to exploit BERT’s pre-trained language understanding
by adding the attribute information as natural language to the prefix. For instance, the attribute
1We also experimented with a variation of this approach that trains over all prefixes in the range [1, ⌊n/2⌋] for each
title in each training iteration. This variation, however, performed similarly and often slightly more
poorly compared to drawing a prefix at random, while bearing substantially higher computational costs.
2We used an in-house NER model trained on eBay’s product titles. The quality of the model is reported in [54].
injection of our example prefix would be in the form: “The prefix green adidas cotton contains
green as [Color], adidas as [Brand] and cotton as [Material]”. Finally, we considered an approach
that operates directly on BERT’s architecture. This is done by adding the word embedding of
the corresponding attribute to each token’s representation. Each token’s embedding is therefore
derived by:</p>
        <p>
          WordEmb ⊕ TokenTypeIDEmb ⊕ PositionEmb ⊕ AttributeEmb,
where ⊕ marks the tensor addition operation; WordEmb, TokenTypeIDEmb, and PositionEmb
are the standard components in the BERT architecture [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]; and AttributeEmb is the added
attribute representation. For tokens without any attribute association, we experimented both
with a variant that adds the representation of the special token [UNKNOWN] and a variant
that does not add any attribute representation.
        </p>
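The architecture-level variant can be sketched as follows. This is a toy illustration with random embedding tables standing in for BERT's learned tables; it shows only the elementwise summation of the four components, with the optional attribute term.

```python
import numpy as np

H = 8                                   # toy embedding width (BERT-base uses 768)
_rng = np.random.default_rng(0)
_tables = {}

def embed(kind, key):
    """Toy embedding lookup: one cached random vector per (table, key)."""
    if (kind, key) not in _tables:
        _tables[(kind, key)] = _rng.normal(size=H)
    return _tables[(kind, key)]

def token_embedding(token, segment, position, attribute):
    """WordEmb + TokenTypeIDEmb + PositionEmb (+ AttributeEmb), summed elementwise."""
    vec = embed("word", token) + embed("type", segment) + embed("pos", position)
    if attribute is not None:           # the other variant adds embed("attr", "[UNKNOWN]")
        vec = vec + embed("attr", attribute)
    return vec

v = token_embedding("green", 0, 0, "Color")
```

Because the attribute vector is simply added to the existing sum, this variant changes no input text, only the embedding layer.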
        <p>All methods described in the previous paragraph consistently underperformed both
BERT-Attrs and BERT-AttrsOnly. We therefore exclude them from our reported results, for clarity
of presentation, and focus on comparing LSTM, BERT, BERT-Attrs, and BERT-AttrsOnly,
trained over title prefixes as described in Section 3.3.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Setup</title>
      <p>In this section, we provide details about our setup for experimentation, including the two datasets
and their basic characteristics and evaluation metrics. We publicly release our implementation
for reproducibility purposes.3</p>
      <sec id="sec-4-1">
        <title>4.1. Datasets</title>
        <p>We experiment with two diferent datasets. Our main dataset is from eBay, one of the world’s
largest e-commerce platforms.4 For reproducibility purposes, we also include results over a
3https://github.com/titleprefixes/prefixes_code
4A small sample of eBay’s data can be found at https://github.com/titleprefixes/prefixes_code/blob/main/data/ebay_
example_data.tsv
public dataset – the Amazon products dataset. In both datasets, the categories are defined by a
taxonomy, where verticals are at the highest level and the categories we consider for the task
are at the lowest level.</p>
        <sec id="sec-4-1-1">
          <title>4.1.1. eBay dataset</title>
          <p>This dataset includes a sample of 17 million listing titles and their corresponding categories
from eBay’s logs. The sample contains listed items (listings) that were offered for sale on the
United States site during December 2020. In this dataset, the listings stem from 6 different
verticals and 704 corresponding categories. We randomly split the data into training, validation,
and test sets with a ratio of 60%, 20%, and 20%, respectively. Table 1 presents several examples
of titles and their corresponding verticals and categories in the dataset. In some cases, the
differences between categories are subtle, for example, Watches Parts and Watches Accessories
or Golf Equipment and Golf Parts Repair. These minor differences and the overall large number of
categories entail difficulty in the categorization task.</p>
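The random 60%/20%/20% split used for both datasets can be sketched as follows; the helper name, seed, and item representation are illustrative assumptions.

```python
import random

def split_dataset(items, seed=0):
    """Shuffle and split items into 60% training, 20% validation, 20% test."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed for a reproducible split
    n = len(items)
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

train, val, test = split_dataset(range(100))   # stand-in for (title, category) pairs
```

The same scheme is applied to the Amazon dataset described next, so results on the two platforms are directly comparable.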
        </sec>
        <sec id="sec-4-1-2">
          <title>4.1.2. Amazon dataset</title>
          <p>We consider the Clothing, Shoes and Jewelry vertical (also referred to as “Fashion”) of the publicly
available product dataset from Amazon [55], another one of the world’s largest e-commerce
platforms. Overall, this portion of the dataset contains 464,745 product titles spanning 147
categories. As in the eBay dataset, some categories are rather similar to one another, for instance
Men’s Shirts and Women’s Shirts or Men’s Wrist Watches and Men’s Pocket Watches. We use
the same splitting scheme of 60%-20%-20% for training, validation, and test sets, respectively, as
for the eBay dataset. The products in this dataset span the years 1996-2014. Table 1 presents
several examples of titles and their corresponding categories in the dataset. Basic characteristics
of both the eBay and Amazon datasets are summarized in Table 2.</p>
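The 60%-20%-20% random splits used for both datasets can be sketched as follows; this is a minimal illustration, with the seed and function name chosen for the example rather than taken from the paper's code.

```python
import random

def split_dataset(items, ratios=(0.6, 0.2, 0.2), seed=42):
    """Shuffle items and split them into train/validation/test portions."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]
```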
        </sec>
        <sec id="sec-4-1-3">
          <title>4.1.3. Dataset characteristics</title>
          <p>We first examine the distribution of category sizes (i.e., the number of titles in the dataset
associated with each category). These statistics convey the extreme imbalance with respect to
class size in both datasets. In the eBay dataset, 10.79% of the categories (up to 100 titles per
category) cover only 0.02% of all titles, while in the Amazon dataset, 29.25% of the categories
(also up to 100 titles per category) cover 0.2% of all titles. On the other hand, on eBay, 4.54% of
the categories (more than 100 titles per category) account for 62.6% of all titles, while in Amazon
6.12% of the categories (bin “10K-100K”) cover 42.71% of the titles. Naturally, this imbalance
stems from the skewed distribution of e-commerce items across categories, reflecting large
differences in their popularity.</p>
          <p>We also examine the distribution of title lengths (in tokens). In the eBay dataset, the average
length is 12.24 tokens, with a standard deviation of 3.59 and a median of 12. In the Amazon
dataset, the average length is 11.06 tokens, with a standard deviation of 4.24, and a median of 11.
Titles on the eBay platform are restricted to 80 characters, whereas on Amazon the restriction is
200 characters. Thus, in rare cases, Amazon’s titles can become substantially longer than eBay’s.
This phenomenon at the tail of the distribution can be observed in Figure 2.</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Evaluation</title>
        <sec id="sec-4-2-1">
          <title>4.2.1. Classification</title>
          <p>For the category classification task, we report accuracy, macro precision, and macro recall [56].
We focus on these metrics to understand the model’s performance in the scenario of extreme
imbalance. The macro metrics consider the average precision and recall across all classes, while
assigning each class with an equal weight, regardless of the number of its associated instances.
We omit micro precision and recall as they are both equal to accuracy when each data point is
assigned to exactly one class. Formally, the metrics are defined as:</p>
          <p>Accuracy = (1/n) ∑_{i=1}^{n} 1[ŷ_i = y_i],</p>
          <p>Macro_m = (1/C) ∑_{c=1}^{C} m_c,</p>
          <p>where n is the number of instances, ŷ_i is the predicted label of instance i, y_i is its true label,
C is the number of classes, m ∈ {Precision, Recall}, and m_c denotes metric m computed for class c.</p>
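Assuming per-instance label lists, the three metrics can be computed directly from these definitions; the sketch below treats a class with no predicted (or no true) instances as contributing zero to the macro average, which is one common convention.

```python
from collections import defaultdict

def accuracy(y_true, y_pred):
    # fraction of instances whose predicted label equals the true label
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_precision_recall(y_true, y_pred):
    # per-class true positives, false positives, and false negatives
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    classes = set(y_true) | set(y_pred)
    # every class gets equal weight, regardless of its number of instances
    prec = sum(tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0 for c in classes)
    rec = sum(tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0 for c in classes)
    return prec / len(classes), rec / len(classes)
```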
        </sec>
        <sec id="sec-4-2-2">
          <title>4.2.2. Recommendation</title>
          <p>For the category recommendation task, we focus on the Hits@k metric [57, 58], defined as the
fraction of examples where the correct class is among the top-k ranks. It is worth noting that
Hits@1 is equivalent to the accuracy metric. Formally, it is defined as:</p>
          <p>Hits@k = (1/n) ∑_{i=1}^{n} 1[y_i ∈ Top_k(i)],</p>
          <p>where Top_k(i) is the set of k highest-ranked classes according to the model’s predictions for instance i.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>
        In this section, we report results for classification and recommendation, using the methods
described in Section 3. We start by showing that the results produced by training using complete
titles are not satisfactory, and thus motivate our approach of training over prefixes. We then
compare different approaches for the prefix-based product categorization task using prefixes for
training, for the problem of classification, followed by analogous results for recommendation.
Finally, we examine the contribution of different attributes to the categorization performance,
using ablation tests. The majority of the results are reported over a test set of titles with an
original length of 12 tokens, which is the most common across our two datasets combined, as
described in Section 4.1. We report results for all prefix lengths in the range of [1, 6]. Evaluation
using other title lengths yielded very similar results and is hence excluded here.
      </p>
      <sec id="sec-5-1">
        <title>5.1. Title-trained classification</title>
        <p>We first examine the use of complete titles as training examples for prefix-based categorization.
To this end, we apply the BERT and LSTM models trained over complete titles on our
varying-length prefixes. The classification results for titles of length 12 are summarized in Table 4.
Overall, we observe low values with respect to all reported metrics and datasets. Even for
prefixes of length 6 (half of the title), the accuracy did not exceed 74% and 60% over the eBay
and Amazon datasets, respectively. However, when applied on complete titles, the accuracy of
both models exceeds 90% on both datasets. This indicates that the patterns learned over
complete titles do not generalize well when used to categorize prefixes. Our proposed training
setup is intended to address this issue, and these observations provide empirical support for it.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Prefix-trained classification</title>
        <p>In this section, we evaluate our ability to classify title prefixes using our proposed methods,
discussed in Section 3, and explore the trade-offs between prefix length and categorization
performance. Tables 5 and 6 present the classification performance results when training on
prefixes using the LSTM, BERT, BERT-Attrs and BERT-AttrsOnly methods over the eBay and
Amazon datasets, respectively.</p>
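Training over prefixes relies on generating prefix examples from full titles; the sketch below draws a fresh random prefix length for each title, in the spirit of the random-length selection described for our prefix-trained models (whitespace tokenization, the function name, and the length bounds are illustrative assumptions):

```python
import random

def prefix_training_examples(titles, labels, min_len=1, max_len=6, seed=None):
    """Yield (prefix, label) pairs with a random prefix length per title."""
    rng = random.Random(seed)
    for title, label in zip(titles, labels):
        tokens = title.split()
        # draw a new length each pass, so the model sees many prefix lengths
        n = rng.randint(min_len, min(max_len, len(tokens)))
        yield " ".join(tokens[:n]), label
```

Re-invoking the generator in each training epoch exposes the model to different prefixes of the same title.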
        <p>As expected, classification performance degrades as prefix length decreases. For prefixes of up to
3 tokens, we observe poor classification performance across both the eBay and Amazon datasets.
This reflects the difficulty of identifying the correct category based on very few tokens. Consider,
for example, the titles presented in Table 1 in the Fashion vertical. In both datasets, we observe
titles that share the same prefix when considering the first 3 tokens (i.e., Calvin Klein Womens
or The North Face), but are from different categories. Therefore, even a perfect classifier would
not distinguish between them based on three-token prefixes. On the other hand, for prefixes of
length 6, i.e., 50% of the title’s tokens, we can observe much improved performance. In particular,
over the eBay dataset, all BERT-based methods (BERT, BERT-Attrs and BERT-AttrsOnly)
reach an accuracy of over 80%.</p>
        <p>Examining Table 5, it can be observed that the BERT-Attrs method outperforms all other
methods in terms of accuracy across all prefix lengths. It also outperforms the other
methods in the vast majority of the cases in terms of macro precision and macro recall. The
consistent gap in performance between BERT-Attrs and BERT indicates that modeling the
attribute information as part of the input prefixes helps the categorization process. The uplift
of BERT-Attrs on top of BERT is more substantial for short prefixes (e.g., +5.7% for prefix
length 1 compared to +0.8% for prefix length 6), which are the most difficult to categorize and
therefore can benefit the most from the additional information used by the BERT-Attrs model.
The BERT-AttrsOnly model yields consistently lower performance than the BERT-Attrs model
(in accuracy), indicating that the inclusion of the title in its original form as part of the input
is not redundant. The results over the Amazon dataset in Table 6 show similar trends, with
BERT-Attrs outperforming the other methods across all prefix lengths except for length 1, where
BERT-AttrsOnly performs best. The gap between BERT-Attrs and BERT is consistent, while
largest for prefix lengths of 3 and 4, and substantially diminishing for the longer prefixes of 5
and 6 tokens.</p>
        <p>
          Figure 3 presents the classification accuracy as a function of the original title length (for
original length ∈ [9, 18]), for prefixes of length 4, 5, and 6, using the BERT-Attrs method. The
results indicate that the performance is rather invariant to the original length of the title. That
is, the accuracy remains stable across all the original title lengths, while the prefix length is the
main factor that affects the performance. We note that the results when using the BERT and
BERT-AttrsOnly methods show a similar trend.
        </p>
        <p>[Figure 3: classification accuracy (y-axis, 70–100%) by original title length, with one curve per prefix length (4, 5, and 6).]</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Prefix ranking</title>
        <p>In this section, we examine the performance of the learned models in the scenario of category
recommendation for title prefixes. That is, we use our models to recommend the top-k most
probable categories based on the input. This is achieved by considering the scores of our classifier
as ranking scores over categories and using the induced ranking for the recommendation.</p>
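Turning the classifier into a recommender then amounts to ranking categories by their classification scores; a minimal sketch, where the dict-based interface and function name are illustrative:

```python
def recommend_categories(class_scores, k=5):
    """Return the k categories with the highest classifier scores, best first."""
    return sorted(class_scores, key=class_scores.get, reverse=True)[:k]
```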
        <p>Table 7 presents the Hits@k results of all four methods over both the eBay and Amazon
datasets for titles of length 12. We focus on k ∈ {3, 5}, as we assume these values allow sellers to
traverse the candidate list and select the suitable category without substantial cognitive load.
We also focus on low values of k to allow the seller to view the entire list of categories on a single
screen.</p>
        <p>Observing the table, we can see that as in the classification results, performance improves
with the length of the prefix. Moreover, as in the classification case, the BERT-Attrs model
outperforms all other models across all prefix lengths for the eBay dataset, and in the vast
majority of the cases for the Amazon dataset (in a few cases BERT-AttrsOnly slightly outperforms
BERT-Attrs). The gap between BERT-Attrs and BERT is consistent for both Hits@3 and Hits@5
across all prefix lengths over both datasets, and is particularly large for the shorter prefixes,
where attribute information is most essential to allow narrowing down the list of potentially
matching categories. Overall, these results reinforce the added value of modeling attribute
information. The small but rather consistent gap between BERT-Attrs and BERT-AttrsOnly
attests again to the benefit of including the original title as part of the input representation.
Also consistent with previous findings is the lower performance of the LSTM model compared to
the BERT-based models.</p>
        <p>In contrast to the classification case, however, the performance using all four models reaches
high values starting already from very short prefixes. For example, considering the best
performing BERT-Attrs model, while its accuracy over the eBay dataset only exceeds 82% for a
prefix length of 6 (Table 5), it exceeds 82% with Hits@3 when the prefix length is only 3 tokens
(25% of the title) and exceeds 81% with Hits@5 when the prefix length is only 2 tokens. This
indicates that while short prefixes tend to be ambiguous and may fit multiple categories, the
cardinality of the overall set of potential matching categories is already low after 2-3 tokens, in
many cases. This enables the recommendation approach, which loops in sellers and allows them
to disambiguate at an early stage, to be highly productive. In Table 1, the provided examples in
the Electronics vertical demonstrate the above point. For example, Sony Playstation 5 (or PS5)
is a prefix that can be shared across different categories; however, the list of relevant categories
is limited to the video games domain, with only a few alternatives.</p>
        <p>Table 7 indicates that by the time a seller has typed 5 of the title’s tokens, the Hits@k values
using BERT-Attrs over both datasets are above 90%. For a prefix length of 6 tokens, Hits@3 on
both datasets is higher than 92% and Hits@5 exceeds 95.5%, reaching as high as 96.14% over the
eBay dataset.</p>
        <p>
          To conclude this section, we explore in more depth the potential gains in performance when
applying the recommendation method with different values of k. We focus on short prefixes of
length 2, 3, and 4, as these demonstrated rather low accuracy using the classification approach
(Tables 5 and 6). Figure 4 plots Hits@k as a function of k ∈ [1, 10] for these prefix lengths over the
eBay dataset, for titles of an original length of 12 tokens. It can be observed that a substantial
performance gain is achieved when moving from Hits@1 (accuracy) to Hits@2, for all three prefix
lengths. For instance, when the prefix length is 2, performance rises from 54.34 to 66.87, when
moving from Hits@1 to Hits@2. As the value of k increases, the performance gain naturally
becomes smaller, but the overall performance continues to increase. For k = 10, which entails a
rather cognitively heavy selection for the seller, a prefix length of 2 tokens is sufficient to yield a
hit in 88.73% of the cases.
        </p>
      </sec>
      <sec id="sec-5-4">
        <title>5.4. Attribute contribution</title>
        <p>Our BERT-Attrs model relies on attribute information to learn different patterns within the
titles to perform categorization. We therefore set out to explore the influence of key attributes
on the categorization task using ablation tests. In this experiment, we focus on three different
verticals within the eBay dataset – Fashion, Electronics, and Home &amp; Garden. For each vertical,
we consider the most popular category within the eBay test set (see Section 4.1), as measured
by the number of associated listings. The categories explored are Shirts, Video Games, and
Home Decor from the Fashion, Electronics, and Home &amp; Garden verticals, respectively. For
each category, we consider all its associated titles in our test set of at least 12 tokens and
their corresponding 6-token prefixes. The total number of such prefixes in our test set from
each category is 236,223, 39,757 and 47,623, respectively. For each category, we use the NER
model [54] to extract attribute values from the prefixes. We report results over the 7 most
frequent attributes in each category. For each such attribute, we consider all prefixes in the
corresponding category that include it. We measure the categorization accuracy across these
prefixes, and compare it to the accuracy over the same set of prefixes while removing all tokens
that correspond to values extracted for that attribute. This allows us to measure the relative
impact of removing different attribute values from the prefix on the categorization performance.</p>
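The token-removal step of the ablation can be sketched as follows; the span-based interface for the NER output is an assumption made for illustration:

```python
def ablate_attribute(prefix_tokens, attribute_spans):
    """Drop tokens covered by the extracted values of one attribute.

    attribute_spans: (start, end) token-index ranges for that attribute,
    end-exclusive, as a hypothetical NER-tagger output format.
    """
    drop = {i for start, end in attribute_spans for i in range(start, end)}
    return [tok for i, tok in enumerate(prefix_tokens) if i not in drop]
```

The categorization accuracy is then compared between the original prefixes and their ablated versions.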
        <p>Table 8 reports the accuracy difference between prefixes that include and exclude each of
the top 7 attributes in each of the categories. Alongside the accuracy difference, which attests
to an attribute’s importance to categorization performance, we present the average length (in
tokens) of each such attribute across all category prefixes that include it. Observing the table, it
can be seen that the type attribute is the most important for the Shirts and Home Decor
categories. In the Shirts category, the average length of the type values is only 1.38 tokens,
ranking only 5th out of the seven attributes in terms of length, indicating that the contribution
to performance does not necessarily coincide with length. In the Home Decor category, the
average length of type is 1.93 tokens, which is the third highest. The importance of type to the
categorization task is intuitive, as it is typically closely associated with the category. For example,
possible values for the attribute type in the Shirts and Home Decor categories are T-shirt and Wall Picture,
respectively. These values are strongly associated with their respective categories. We note that
the type attribute tends to appear towards the end of the title. For instance, in the Fashion vertical
of the eBay dataset, it appears in the second half of the title in 65.5% of its occurrences.</p>
        <p>Other than type, the style attribute (e.g., retro, loungewear) is found to be the most important
attribute for categorization in the Shirts category. In Home Decor, the model attribute is nearly as
important for categorization as type, with values such as Cierra by Uttermost and ST1216B by
FAIRFIELD.</p>
        <p>For the Video Games category, the most important attribute is game name. This is an example
of an attribute that is mostly unique to a specific category (e.g., Tony Hawk’s Pro Skater) and
therefore its occurrence is highly revealing. It also tends to be exceptionally long, at over 4
tokens on average. Therefore, its removal leaves very few tokens in the prefix, leading to the
very large performance gap in its ablation test. The type attribute, which is most important in
the other two categories, tends to be more generic in Video Games: its most common values are
Disc or Game. Accordingly, its importance as reflected in the ablation test is lower.</p>
        <p>The commonly used brand attribute is not found to be among the most important attributes
for the categorization task, ranking only fourth for both the Shirts and Home Decor categories.
The relatively low importance of the brand attribute in Shirts is intuitive, as the same brand can
span different categories. An example of such a phenomenon is the well-known brand Michael
Kors, which has products in various categories, including dresses, shoes, shirts, handbags, and
even watches. However, for the Home Decor category, the brand attribute can be more distinctive.
For example, it can be associated with the designer’s name (or an artist’s name), which is less
likely to be shared across different categories. The brand attribute is particularly common at the
beginning of the title: in the eBay dataset, it appears in the first half of the title in 88.6% and
83.2% of its occurrences in Fashion and Home &amp; Garden, respectively.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions and Implications</title>
      <p>We studied the task of product categorization based on title prefixes. To our knowledge, we are
the first to introduce and explore prefix-based categorization, which can play an important role
in e-commerce platforms’ selling flow, allowing the platform to adapt early to the unique
characteristics and attributes of a specific category, out of hundreds of available options. We first
demonstrated that using both BERT and LSTM models, trained on complete titles, for the prefix
categorization task results in poor performance, and hence suggested a simple approach for
training a model based on title prefixes, where for each title a prefix of random length is selected
in each iteration of the training process. This method accomplished substantial performance
improvements, with an accuracy of roughly 80% for prefixes of half the length of the title (e.g., 6
out of 12 tokens), and BERT consistently outperforming LSTM. We then suggested BERT-Attrs,
a novel yet simple extension of BERT for learning over title prefixes, which extends the
representation of the prefix with attribute information. This extended representation was shown
to yield consistent performance enhancements, especially over short prefixes, where the original
information provided in the prefix is more limited. Interestingly, our results indicated that
classification performance using our prefix-trained approach is highly dependent on the prefix
length, naturally increasing as the prefix gets longer, but is almost completely oblivious to the
length of the original title, for a given prefix length.</p>
      <p>We demonstrated the difficulty of prefix-based classification, which often involves ambiguity,
as the first few tokens do not uniquely identify the specific category. Our attribute ablation tests
showed that many attributes that commonly occur in titles, such as color, size, material, and
even brand, which often appear at the beginning of the title, do not contribute much to accurate
categorization. On the other hand, a revealing attribute such as type commonly appears toward
the end of the title. This ambiguity, however, typically narrows down to only a handful of
categories after as few as 2 or 3 tokens have been provided. We therefore suggest addressing the
task using a recommendation approach, which loops sellers into the process and allows them to
select the category out of a shortlist of candidates. We showed that this approach can
dramatically improve the performance, reaching a hit rate of over 95% over both datasets for
prefixes of 6 out of 12 tokens, when shortlisting to 5 candidates. Moreover, for prefixes of only 3
tokens, the hit rate within the top 5 already exceeds 88%. The category selection out of the
recommended shortlist is, thus, a key step in our envisioned selling flow. It enables quick
identification of the type of product offered for sale, and adaptation of the remainder of the
process towards a simplified and smooth conclusion.</p>
      <p>Our results quantitatively map the trade-offs between prefix length, the size of the candidate
shortlist, and categorization performance. These suggest a variety of alternatives for e-commerce
platforms to implement prefix-based categorization, considering how early they want to provide
a suggestion, whether they want to allow sellers to select the category out of a certain-sized list,
and how accurate they desire the results to be. All of our results generalize similarly across two
of the world’s largest e-commerce platforms, eBay and Amazon, and can be reproduced over the
public Amazon dataset.</p>
      <p>As the task of prefix-based categorization has not been previously studied, it offers many
opportunities for future research. Validating the results on additional e-commerce platforms,
with different types of category sets and granularity, can help to further generalize our findings.
Even more importantly, in-vivo experimentation with our proposed solution is necessary to
quantify its impact on the simplification and improvement of the selling flow. This
experimentation would also provide an opportunity to leverage otherwise unattainable user
feedback, such as whether and which suggested category was selected, at what stage, and how
the selection was reflected in the remainder of the selling flow. Using prefix-based categorization
in an online setting may also influence the way titles are formulated. Sellers may become
increasingly aware of this feature and adjust their titles to contain category-distinctive tokens,
such as type and model, at the beginning, to facilitate faster categorization. Finally,
experimentation with additional models, including the potential use of LLMs within the flow, is
an interesting future direction, both in terms of the potential to increase performance and the
challenge of applying large models in an online scenario.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
      <p>[52] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (1997)
1735–1780.
[53] T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in
vector space, arXiv:1301.3781 (2013).
[54] Y. Xin, E. Hart, V. Mahajan, J.-D. Ruvini, Learning better internal structure of words for
sequence labeling, arXiv:1810.12443 (2018).
[55] R. He, J. McAuley, Ups and downs: Modeling the visual evolution of fashion trends with
one-class collaborative filtering, in: Proc. of WWW, 2016, pp. 507–517.
[56] M. Grandini, E. Bagli, G. Visani, Metrics for multi-class classification: an overview,
arXiv:2008.05756 (2020).
[57] L. Yao, C. Mao, Y. Luo, KG-BERT: BERT for knowledge graph completion, arXiv:1909.03193
(2019).
[58] Y. Tay, V. Q. Tran, M. Dehghani, J. Ni, D. Bahri, H. Mehta, Z. Qin, K. Hui, Z. Zhao,
J. Gupta, et al., Transformer memory as a differentiable search index, arXiv:2202.06991
(2022).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Moraes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , V. Murdock,
          <article-title>The role of attributes in product quality comparisons</article-title>
          ,
          <source>in: Proc. of CHIIR</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>253</fpage>
          -
          <lpage>262</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] How to create a listing: the step-by-step guide,
          <year>2022</year>
          . URL: https://export.ebay.com/en/first-steps/how-create-listing/how-create-listing/.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3] How to start selling on Amazon,
          <year>2022</year>
          . URL: https://sell.amazon.com/sell.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Fuchs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Roitman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mandelbrod</surname>
          </string-name>
          ,
          <article-title>Automatic form filling with form-bert</article-title>
          ,
          <source>in: Proc. of SIGIR</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1850</fpage>
          -
          <lpage>1854</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>H.</given-names>
            <surname>Aragonda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Shaik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <article-title>Accurate and real time assisted cataloging in e-commerce using dual images</article-title>
          ,
          <source>in: Proc. of CODS-COMAD</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>265</fpage>
          -
          <lpage>269</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Niemir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mrugalska</surname>
          </string-name>
          ,
          <article-title>Product data quality in e-commerce: Key success factors and challenges</article-title>
          ,
          <source>Production Management and Process Control</source>
          <volume>36</volume>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Incorporating price into recommendation with graph convolutional networks</article-title>
          ,
          <source>TKDE</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          , Preorder price guarantee in e-commerce,
          <source>M&amp;SOM</source>
          <volume>23</volume>
          (
          <year>2021</year>
          )
          <fpage>123</fpage>
          -
          <lpage>138</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Qiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Si</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lan</surname>
          </string-name>
          ,
          <article-title>A multi-task learning approach for improving product title compression with user search log data</article-title>
          ,
          <source>in: Proc. of AAAI</source>
          , volume
          <volume>32</volume>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , W. Yan,
          <article-title>Toor: A novel product title optimization method based on online reviews in e-commerce</article-title>
          ,
          <source>Frontiers of Business Research in China</source>
          <volume>9</volume>
          (
          <year>2015</year>
          )
          <fpage>536</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Dagan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Guy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Novgorodov</surname>
          </string-name>
          ,
          <article-title>An image is worth a thousand terms? Analysis of visual e-commerce search</article-title>
          ,
          <source>in: Proc. of SIGIR</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>102</fpage>
          -
          <lpage>112</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Cevahir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Murakami</surname>
          </string-name>
          ,
          <article-title>Large-scale multi-class and hierarchical product categorization for an e-commerce giant</article-title>
          ,
          <source>in: Proc. of COLING</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>525</fpage>
          -
          <lpage>535</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Krishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Amarthaluri</surname>
          </string-name>
          ,
          <article-title>Large scale product categorization using structured and unstructured attributes</article-title>
          ,
          <source>arXiv:1903.04254</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kok</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <article-title>Don't classify, translate: Multi-level e-commerce product categorization via machine translation</article-title>
          ,
          <source>arXiv:1812.05774</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Eskesen</surname>
          </string-name>
          ,
          <article-title>Improving product categorization by combining image and title</article-title>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>D.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-D.</given-names>
            <surname>Ruvini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Sarwar</surname>
          </string-name>
          ,
          <article-title>Large-scale item categorization for e-commerce</article-title>
          ,
          <source>in: Proc. of CIKM</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>595</fpage>
          -
          <lpage>604</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>I.</given-names>
            <surname>Hasson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Novgorodov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Fuchs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Acriche</surname>
          </string-name>
          ,
          <article-title>Category recognition in e-commerce using sequence-to-sequence hierarchical classification</article-title>
          ,
          <source>in: Proc. of WSDM</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>902</fpage>
          -
          <lpage>905</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>H.-F.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-H.</given-names>
            <surname>Ho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Arunachalam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Somaiya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <article-title>Product title classification versus text classification</article-title>
          ,
          <source>csie.ntu.edu.tw</source>
          (
          <year>2012</year>
          )
          <fpage>1</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Kozareva</surname>
          </string-name>
          ,
          <article-title>Everyone likes shopping! multi-class product categorization for e-commerce</article-title>
          ,
          <source>in: Proc. of NAACL-HLT</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>1329</fpage>
          -
          <lpage>1333</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>A.</given-names>
            <surname>More</surname>
          </string-name>
          ,
          <article-title>Attribute extraction from product titles in ecommerce</article-title>
          ,
          <source>arXiv:1608.04670</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>D.</given-names>
            <surname>Putthividhya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <article-title>Bootstrapped named entity recognition for product attribute extraction</article-title>
          ,
          <source>in: Proc. of EMNLP</source>
          ,
          <year>2011</year>
          , pp.
          <fpage>1557</fpage>
          -
          <lpage>1567</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>L.</given-names>
            <surname>Bing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.-L.</given-names>
            <surname>Wong</surname>
          </string-name>
          , W. Lam,
          <article-title>Unsupervised extraction of popular product attributes from e-commerce web sites by considering customer reviews</article-title>
          ,
          <source>TOIT</source>
          <volume>16</volume>
          (
          <year>2016</year>
          )
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>G.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. L.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>OpenTag: Open attribute value extraction from product profiles</article-title>
          ,
          <source>in: Proc. of KDD</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>1049</fpage>
          -
          <lpage>1058</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>arXiv:1810.04805</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ahuja</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Katariya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Subbian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. K.</given-names>
            <surname>Reddy</surname>
          </string-name>
          ,
          <article-title>Language-agnostic representation learning for product search on e-commerce platforms</article-title>
          ,
          <source>in: Proc. of WSDM</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>7</fpage>
          -
          <lpage>15</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>H.</given-names>
            <surname>Hwangbo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. J.</given-names>
            <surname>Cha</surname>
          </string-name>
          ,
          <article-title>Recommendation system development for fashion retail e-commerce</article-title>
          ,
          <source>ECRA</source>
          <volume>28</volume>
          (
          <year>2018</year>
          )
          <fpage>94</fpage>
          -
          <lpage>101</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>L.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>A trust-based collaborative filtering algorithm for e-commerce recommendation system</article-title>
          ,
          <source>JAIHC</source>
          <volume>10</volume>
          (
          <year>2019</year>
          )
          <fpage>3023</fpage>
          -
          <lpage>3034</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mabrouk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. P. D.</given-names>
            <surname>Redondo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kayed</surname>
          </string-name>
          ,
          <article-title>Seopinion: Summarization and exploration of opinion from e-commerce websites</article-title>
          ,
          <source>Sensors</source>
          <volume>21</volume>
          (
          <year>2021</year>
          )
          <fpage>636</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Warren</surname>
          </string-name>
          ,
          <article-title>Cost-sensitive learning for large-scale hierarchical classification</article-title>
          ,
          <source>in: Proc. of CIKM</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>1351</fpage>
          -
          <lpage>1360</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Multimodal joint attribute prediction and value extraction for e-commerce product</article-title>
          ,
          <source>arXiv:2009.07162</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>I.</given-names>
            <surname>Guy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Milo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Novgorodov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Youngmann</surname>
          </string-name>
          ,
          <article-title>Improving constrained search results by data melioration</article-title>
          ,
          <source>in: Proc. of ICDE</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1667</fpage>
          -
          <lpage>1678</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>S.</given-names>
            <surname>Novgorodov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Guy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Elad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Radinsky</surname>
          </string-name>
          ,
          <article-title>Generating product descriptions from user reviews</article-title>
          ,
          <source>in: Proc. of WWW</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>1354</fpage>
          -
          <lpage>1364</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>M.-T.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.-T.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.-V.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.-M.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <article-title>Generating product description with generative pre-trained transformer 2</article-title>
          ,
          <source>in: Proc. of CITISIA</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>S.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Niu</surname>
          </string-name>
          ,
          <article-title>Fine-grained product features extraction and categorization in reviews opinion mining</article-title>
          ,
          <source>in: Proc. of ICDM Workshops</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>680</fpage>
          -
          <lpage>686</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Carbonell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Salakhutdinov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q. V.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <article-title>XLNet: Generalized autoregressive pretraining for language understanding</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>32</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wangperawong</surname>
          </string-name>
          ,
          <article-title>Transfer learning robustness in multi-class categorization by fine-tuning pre-trained contextualized language models</article-title>
          ,
          <source>arXiv:1909.03564</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ristoski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Petrovski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mika</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Paulheim</surname>
          </string-name>
          ,
          <article-title>A machine learning approach for product matching and categorization</article-title>
          ,
          <source>Semantic Web</source>
          <volume>9</volume>
          (
          <year>2018</year>
          )
          <fpage>707</fpage>
          -
          <lpage>728</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>P.</given-names>
            <surname>Wirojwatanakul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wangperawong</surname>
          </string-name>
          ,
          <article-title>Multi-label product categorization using multi-modal fusion models</article-title>
          ,
          <source>arXiv:1907.00420</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>F.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bubnov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Kiapour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Piramuthu</surname>
          </string-name>
          ,
          <article-title>Visual search at eBay</article-title>
          ,
          <source>in: Proc. of SIGKDD</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>2101</fpage>
          -
          <lpage>2110</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>A.</given-names>
            <surname>Goswami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Chittar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. H.</given-names>
            <surname>Sung</surname>
          </string-name>
          ,
          <article-title>A study on the impact of product images on user clicks for online shopping</article-title>
          ,
          <source>in: Proc. of WWW (Companion Volume)</source>
          ,
          <year>2011</year>
          , pp.
          <fpage>45</fpage>
          -
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>L. S.</given-names>
            <surname>Paulucio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. M.</given-names>
            <surname>Paixão</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. F.</given-names>
            <surname>Berriel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. F.</given-names>
            <surname>De Souza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Badue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Oliveira-Santos</surname>
          </string-name>
          ,
          <article-title>Product categorization by title using deep neural networks as feature extractor</article-title>
          ,
          <source>in: Proc. of IJCNN</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>H.</given-names>
            <surname>Linmei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Heterogeneous graph attention networks for semi-supervised short text classification</article-title>
          ,
          <source>in: Proc. of EMNLP-IJCNLP</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>4821</fpage>
          -
          <lpage>4830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>B.</given-names>
            <surname>Škrlj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Martinc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kralj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Lavrač</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pollak</surname>
          </string-name>
          ,
          <article-title>tax2vec: Constructing interpretable features from taxonomies for short text classification</article-title>
          ,
          <source>Computer Speech &amp; Language</source>
          <volume>65</volume>
          (
          <year>2021</year>
          )
          <fpage>101104</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>A.</given-names>
            <surname>Spink</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Jansen</surname>
          </string-name>
          ,
          <article-title>A study of web search trends</article-title>
          ,
          <source>Webology</source>
          <volume>1</volume>
          (
          <year>2004</year>
          )
          <fpage>4</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>I.</given-names>
            <surname>Guy</surname>
          </string-name>
          ,
          <article-title>Searching by talking: Analysis of voice queries on mobile web search</article-title>
          ,
          <source>in: Proc. of SIGIR</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>35</fpage>
          -
          <lpage>44</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>D.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-T.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Query enrichment for web-query classification</article-title>
          ,
          <source>TOIS</source>
          <volume>24</volume>
          (
          <year>2006</year>
          )
          <fpage>320</fpage>
          -
          <lpage>352</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>D.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-T.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Building bridges for web query classification</article-title>
          ,
          <source>in: Proc. of SIGIR</source>
          ,
          <year>2006</year>
          , pp.
          <fpage>131</fpage>
          -
          <lpage>138</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [48]
          <string-name>
            <given-names>H.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. H.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-T.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Context-aware query classification</article-title>
          ,
          <source>in: Proc. of SIGIR</source>
          ,
          <year>2009</year>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [49]
          <string-name>
            <given-names>S.</given-names>
            <surname>Gandhi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mansouri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Campos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jatowt</surname>
          </string-name>
          ,
          <article-title>Event-related query classification with deep neural networks</article-title>
          ,
          <source>in: Proc. of WWW (Companion Volume)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>324</fpage>
          -
          <lpage>330</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          [50]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kharbanda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Palrecha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Babbar</surname>
          </string-name>
          ,
          <article-title>Embedding convolutions for short text extreme classification with millions of labels</article-title>
          ,
          <source>arXiv:2109.07319</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          [51]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kiros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zemel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Salakhutdinov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Urtasun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Torralba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fidler</surname>
          </string-name>
          ,
          <article-title>Aligning books and movies: Towards story-like visual explanations by watching movies and reading books</article-title>
          ,
          <source>in: Proc. of ICCV</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>19</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>