<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Central Intention Identification for Natural Language Search Query in E-Commerce</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Xusheng</forename><surname>Luo</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Yu</forename><surname>Gong</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Xi</forename><surname>Chen</surname></persName>
						</author>
						<author>
							<affiliation key="aff0">
								<orgName type="laboratory">Alibaba Group</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="laboratory">Alibaba Group</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff2">
								<orgName type="laboratory">Alibaba Group</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff3">
								<address>
									<settlement>Ann Arbor</settlement>
									<region>Michigan</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Central Intention Identification for Natural Language Search Query in E-Commerce</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">9921F28177770BFBE43F2D2B83FA18F0</idno>
					<idno type="DOI">10.1145/nnnnnnn.nnnnnnn</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T06:31+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>Query Intent &amp; Understanding, Natural Language Query</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper is a preliminary work, which studies the problem of finding central intention of natural language queries with multiple intention terms in e-commerce search. We believe it is a new and interesting topic since natural language based e-commercial search is still very young currently. We propose a neural network model with bi-LSTM and attention mechanism, aiming to find the semantic relatedness between natural language context words and central intention term. Initial experimental result reports that our model outperforms baseline method and shows a positive and important gain brought by a deep network model, comparing to rule based approach.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">INTRODUCTION</head><p>As the AI technologies develop rapidly, the services provided by e-commerce companies become more and more intelligent. One inevitable tendency, different from earlier online shopping experiences, is that customers will be able to use natural language instead of key words when searching for the products they want to buy. For example, customers can ask the online shopping search engine: "I would like to buy a red fashionable short dress under 200 dollars." instead of type key words like "short dress, red, fashion, cheaper than 200". Comparing to key words, using natural language is a more comfortable way for people to go online shopping since it is the way we communicate with each other in daily life.</p><p>The very first step for search engine to understand user query is to identify the query intention. In the case of the previous query, that means to know it is a dress the customer want to buy. Here "short dress" is an intention term (a term can be a word or a phrase), which indicates the e-commercial category of a product. The recognition of intention term is usually performed by a module called I want to buy a pair of stockings for my new short dress. Query Tagging, which is similar to Named Entity Recognition <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8]</ref>.</p><p>Sometimes, there will be more than one intention term within a single natural language user query such as "I want to buy a pair of stockings for my new short dress." (Figure <ref type="figure" target="#fig_0">1</ref>), where "stockings" and "short dress" are both intention terms, which makes it more difficult for machines to identify the true intention of this query (stockings rather than short dress). Cases like this are not rare in natural language queries, as we found that there are around 20% of voice queries (voice query is more likely to be in natural language form since people tend to use natural language as they speak), which contains more than one intention term after query tagging. This motivates us to identify the central intention of a user query among all intention terms so that our machine can better understand search queries.</p><p>Multiple intention terms in one query is also common in nowadays key-word based e-commerce search. However, those queries are tend to be short and in fix-pattern such as "laptop backpack", where "laptop" and "backpack" are both intention terms and we all know the true intention is backpack. In general, we will analyze the query log and corresponding click log to find out what products the users are clicking and viewing after type the query in the search box and then we construct a multi-terms→ central-term map offline. Thus, next time when we see a query with multiple short intention terms, we can easily know the actual intention by looking up the map. However, this method is not helpful and limited when dealing with natural language queries, which are much longer and more complicated. With natural language interaction grows, there will be more and more new intention combinations.</p><p>We believe a deep model can work more effectively and hence we dig a little deeper towards this topic and make the following contributions:  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">APPROACH</head><p>The central intention identification task is defined as follows. The input query is a sequence of word terms q = (x 1 , x 2 , . . . , x n ), with at least two intention terms. A term x i can be a word or a phrase. Our task is to output only one intention term x i as the central intention, while other intention terms modify the central intention. Defined in this way, we actually make a hypothesis that each search query contains only one actual goal product. We do not consider queries where a user ask for two or more items at the same time. Now, we describe our neural network model and baseline method for query intention identification. Figure <ref type="figure" target="#fig_1">2</ref> gives a general view of the proposed neural network model. Given the context words of a query qc = (x 1 , x 2 , . . . , x i−1 , x i+1 , . . . , x n ), which is the terms left after taking the intention term way , together with the intention term qi = x i , our model will output a score score(qc, qi), measuring the compatibility between them.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Term Embedding</head><p>Typically, a term contains up to three words, thus we simply represent it as the average embedding of the words it contains. We train word embeddings and term embeddings on large text corpus. Embeddings are fed to model as input and will be updated during training.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Bi-LSTM with Attention</head><p>Recurrent neural networks (RNNs) are a powerful family of neural networks designed for sequential data and have shown great promise in many NLP tasks. RNNs take a sequence of vector (x 1 , x 2 , . . . , x n ) and return another sequence (h 1 , h 2 , . . . , h n ) that represents the hidden state information about the sequence at each time step in the input. In theory, RNNs can learn long dependencies but in practice they seem to be biased towards their most recent inputs of the sequence. Thus, LSTMs <ref type="bibr" target="#b2">[3]</ref> are proposed and they have shown great capabilities to capture long-range dependencies.</p><p>To encode the query context, we first look up an embedding matrix E x ∈ R d ×v to get the term embeddings q = (x 1 , x 2 , . . . ,</p><formula xml:id="formula_0">x i−1 , [X], x i+1 , . . . , x n ).</formula><p>Here, [X] is a wildcard embedding to indicate the position of intention term in the query. d denotes the dimension of the embeddings and v denotes the vocabulary size of natural language words. Then, the embeddings are fed into a bidirectional LSTM networks. If we use unidirectional LSTM, the outcome of current word is only based on the words before it so the information of the words after it is totally lost. To avoid this, we use bi-LSTM which consists a forward network handles the query from left to right and a backward network does in the reverse order. Therefore, we get two hidden state sequences,</p><formula xml:id="formula_1">( − → h 1 , − → h 2 , . . . , − → h n ) from forward network and ( ← − h 1 , ← − h 2 , . . . , ← − h n ) from backward network.</formula><p>We concatenate the forward hidden state of each word with corresponding backward hidden state, resulting in a representation</p><formula xml:id="formula_2">H i = [ − → h i ; ← − h i ] ∈ R k ×1</formula><p>. Thus, we obtain the representation of each word in the query context.</p><p>Attention mechanisms <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b3">4]</ref> have become an integral part of sequence modeling and transduction models in various NLP tasks, which allows better understanding sequential data. Based on our assumption, different intention terms should have different attention towards the same query, The extent of attention can be calculated by the relatedness between each word representation H i and an intention embedding qi, where qi = W T i x i and W i ∈ R k ×1 . We propose the following formulas to calculate the attention weights.</p><formula xml:id="formula_3">a i = exp(w i ) n i=1 exp(w i )<label>(1)</label></formula><formula xml:id="formula_4">w i = W T a (tanh[H i ; qi]) + b (2)</formula><p>Here, a i denotes the attention weight of the ith term in the query context, in terms of intention e, where qi is a hidden representation of one intention term. n is the length of the query. W a ∈ R 2k ×1 is an intermediate matrix and b is an offset value. These two parameters are randomly initialized and updated during training. 
Subsequently, the attention weights a (Figure <ref type="figure" target="#fig_1">2</ref>) are employed to calculate a weighted sum of the query terms, resulting in a semantic representation qc which represents the query context, according to the specific intention term.</p><formula xml:id="formula_5">qc = n i=1 a i H i<label>(3)</label></formula><p>Thus, the final output score which is regarded as a measurement of the compatibility of query context qc and intention term qi can be calculated as follows.</p><formula xml:id="formula_6">S(qc, qi) = qc • qi<label>(4)</label></formula><p>Therefore, we use intention term qi as attention query to guide the model weighting each context term differently, aiming to better justify compatibility between current intention term and the whole user query. When we consider an intention term, we will re-read the query to find out which part of the query should be more focused (handling attention). We believe that this attention mechanism is beneficial for the system to better understand the query with the help of the intention term, and leads to a performance improvement.</p></div>
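To make Equations (1)-(4) concrete, here is a minimal PyTorch sketch of one way the bi-LSTM-with-attention scorer could be implemented. It is our reading of Section 2.2, not the authors' code; the layer sizes and the linear projection used to obtain the intention representation are assumptions.

```python
import torch
import torch.nn as nn

# Minimal sketch of the bi-LSTM + attention scorer described in Section 2.2.
class IntentionScorer(nn.Module):
    def __init__(self, vocab_size, emb_dim=200, hidden=512):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)            # E_x in R^{d x v}
        self.lstm = nn.LSTM(emb_dim, hidden, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.proj = nn.Linear(emb_dim, 2 * hidden)              # plays the role of W_i
        self.attn = nn.Linear(4 * hidden, 1)                    # W_a and b of Eq. (2)

    def forward(self, context_ids, intention_ids):
        # context_ids: (batch, n) term ids with a wildcard [X] at the intention slot
        # intention_ids: (batch,) id of the candidate intention term
        H, _ = self.lstm(self.emb(context_ids))                 # H_i = [h_fwd; h_bwd]
        qi = self.proj(self.emb(intention_ids))                 # intention embedding
        qi_exp = qi.unsqueeze(1).expand_as(H)                   # align qi with each H_i
        w = self.attn(torch.tanh(torch.cat([H, qi_exp], dim=-1)))  # Eq. (2)
        a = torch.softmax(w, dim=1)                             # Eq. (1)
        qc = (a * H).sum(dim=1)                                 # Eq. (3)
        return (qc * qi).sum(dim=-1)                            # Eq. (4): dot product
```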
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Training and Prediction</head><p>Since there is no ground truth currently and it is extremely costly to annotate the central intention for user queries with multiple intention terms. Thus, we choose those natural language queries with only one intention term as our training data. We believe it is a reasonable degeneration since our goal is to dig the semantic relationship between natural language context words and some target intention term. This relatedness can be learned from single-intention queries and then apply to multi-intention queries. We use a dynamic programming max-matching algorithm to match terms in the query to an existing dictionary containing all the intention terms such as "连衣裙 (Dress)" and "丝袜 (Stocking)". We only keep queries with only one exactly matched intention term. After this "query tagging" step, we can identify the intention term and regard &lt;query context, intention term&gt; pair in each query as a positive sample. Then we randomly choose some unrelated intention terms as negative samples. We use hinge loss to train the model:</p><formula xml:id="formula_7">loss = qi ′ ∈N max(0, 1 − score(qc, qi) + score(qc, qi ′ ))<label>(5)</label></formula><p>Where qc is the query context, qi is the positive query intention and qi ′ is the corrupted query intention term from negative samples N . The function score represents the model output. We evaluate our model on a dataset labeled by human. Each query in our testing set contains more than one intention term. When testing a query with one intention term of it, we take away the intention term and feed the rest of query, i.e. query context into model. The intention term with highest output score is considered as the central intention.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">Baseline</head><p>We use a rule based method as our baseline method. We perform dependency parsing on the input user query. A dependency parser analyzes the grammatical structure of a sentence, establishing relationships between "head" words and words which modify those heads. Among all the intention terms, we choose the one at highest position in the parsing tree as the central intention. As shown in Figure <ref type="figure" target="#fig_2">3</ref>, we use an internal e-commercial query parser as our baseline method. In this example of query "I want to buy a pair of stockings for my new short dress (我 想要 一双 搭配 连衣裙 的 长 筒 丝袜)", "丝袜 (Stocking)" is at a higher position than "连衣裙 (Dress)" in the parsing tree. Thus we choose "丝袜 (Stocking)" as the central intention of this query. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">EXPERIMENTS 3.1 Dataset</head><p>We train our model on 10, 000 single intention Chinese voice search queries and test on two datasets. We filter out queries whose length is shorter than 10 words. One is single-intention query set. We construct it by corrupting the intention term of 10, 000 single-intention queries with randomly chosen intention terms. The other one is multiintention query set. It contains 300 multi-intention search queries, which consists of 150 2-intentions queries, 100 3-intentions queries and 50 4-intentions queries. The size of this dataset is limited since it need a lot of human labeling efforts. We use an e-commerce query tagging tool to preprocess all the training and testing queries.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Implement Details</head><p>We pre-train word and term embeddings on a large Chinese ecommerce corpus. This corpus comes from a module in Chinese e-commerce giant Taobao * named "有好货" † , which is written by online merchants. We use word2vec <ref type="bibr" target="#b4">[5]</ref> CBOW model with context window size 5, negative sampling size 5, iteration steps 5 and hierarchical softmax 1. The size of pre-trained word embeddings is set to 200. For Out-Of-Vocabulary (OOV) words, embeddings are initialized as zero. All embeddings are updated during training. We use an e-commerce Chinese word segmentation tool for word segmentation. For recurrent neural network component in our system, we used a two-layers LSTM network with unit size 512. All natural language queries are padded to a maximum sentence length of 30. We use Adam optimizer, and the learning rate is initialized with 0.01.</p><p>For baseline method, we use an internal e-commercial query parser to do dependency parsing. This parser is similar as the famous Stanford Dependency Parser <ref type="bibr" target="#b1">[2]</ref> but is optimized specially for ecommercial scenario.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">End-to-end Result</head><p>Now we report the experimental results as follows. First we show the accuracy on single-intention query set. The goal of this experiment is to evaluate the training quality explicitly. The model has to identify the correct intention terms from the corrupted ones. As shown in Table <ref type="table" target="#tab_2">2</ref>, it achieves 0.813 in accuracy. Considering the user queries always contain a lot of noises, this number shows power of our model at learning semantic relations between natural language query context and query intention. Besides, the result proves that attention mechanism is effective in this task.</p><p>In the experiment on multi-intention query set, we assigned three human annotators to judge whether the model output is correct, i.e. whether the intention term with the highest score is the central query intention. Based on majority voting, we calculate the accuracy in Table <ref type="table" target="#tab_3">3</ref>. Our model with attention mechanism outperforms baseline method and the one without attention mechanism by up to 13%. Baseline method based on dependency parsing suffers from bad performance on short sentence, since search queries in e-commerce tend to be short and less grammatical. On the other hand, deep neural network model shows potential to learn rich semantic relatedness between context words and intention terms regardless of sentence size.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Case Study &amp; Error Analysis</head><p>In Table <ref type="table" target="#tab_1">1</ref>, we show some real cases of intention identification of search queries. In each case, the underlined terms are the intention terms recognized by query tagging and the red-colored term is the central intention identified by model. Take the first query "我 想 要 穿着 显瘦 only 牌子 的 连衣裙 最好 是 能 搭配 耳坠 的。" as example, the baseline method using e-commercial dependency parsing regards "耳坠 (Earring)" as root thus discards terms including "连衣裙 (Dress)" which is actually the true central intention. Our model can output the correct intention after seeing enough semantic information in training data and believes "穿着","显瘦", "only" are more likely to describe "连衣裙 (Dress)" rather than "耳 坠(Earring)".</p><p>Since this work is in the preliminary stage, we actually find several problems in our experiments. First, the quality of queries are not as high as what we expect. Currently the main interactive way between a customer and online e-commerce search engine is still based on key words. Thus, at current stage, it is hard to get enough high-quality natural language query log. That is why we choose voice queries as the source of natural language queries. However, the precision of speech recognition becomes a problem, especially when people say something very domain-specific.</p><p>Second, the habit of using key words to do online shopping can not be easily changed. Within voice queries, there still exists quite a few queries which are some combination of several similar key words which actually mean the same product. However, the goal of our model is to dig the semantic relatedness between query words and intention terms. This idea can not hold if the terms of a query are not in natural order or the query is not even a natural language sentence.</p><p>Besides, we also find some cases where simple rule or patterns may works better than deep models. For example, the central intention of "连衣裙上面的绿色纽扣(Green buttons of dress)" is "纽扣 (button)" but it becomes "连衣裙(dress)" if we change only one word to "连衣裙上面有绿色纽扣 (Dress with green buttons)". Although these cases are rare and extreme, it is indeed a challenge for our model. Maybe some syntactic and rule based features should be fed to model somehow to help it deal with this problem.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">FUTURE WORK</head><p>In this paper, we explore the area where e-commerce search queries are in natural language form and multiple intention terms are appearing together in the same query. We proposed a deep neural network to identify the true intention and made some delighted progress comparing to rule based method. In the future, we will try to construct a larger and cleaner dataset for both training and testing and make it public. This work is a preliminary attempt currently and it need to be further improved such as adding syntactical and rule based features to the model in the future.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Example query with multiple intention terms</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Overview of proposed model</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Dependency parsing example of query with multiple intention terms</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1 :</head><label>1</label><figDesc>Real cases of central intention identification#1我 想要 穿着 显瘦 only 牌子 的 连衣裙 最好 是 能 搭配 耳坠 的。 I want to buy an ONLY-brand thin-looking dress which is suitable for earrings. 足球 都 可以 穿 的 nike 鞋，没有 鞋带。 Nike shoes without shoelace, for both basketball and soccer.</figDesc><table><row><cell>#2</cell><cell>汽车 上面 用 的 那个 小的 吸尘器 有没有 的？ Do you have small vacuum cleaner for cars?</cell></row><row><cell>#3</cell><cell>黄色 T恤衫 前面 就是 有 2个 耳坠 那种。 Yellow T-shit with a pair of earrings in the front.</cell></row><row><cell>#4</cell><cell>打 篮球 踢</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2 :</head><label>2</label><figDesc>Accuracies on Single-Intention Queries</figDesc><table><row><cell>Approach</cell><cell>Acc</cell></row><row><cell cols="2">Model (-attention) 0.803</cell></row><row><cell cols="2">Model (+ attention) 0.813</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3 :</head><label>3</label><figDesc>Accuracies on Multi-Intention Queries</figDesc><table><row><cell>Approach</cell><cell cols="3">2-intents 3-intents 4-intents</cell></row><row><cell>Baseline</cell><cell>0.60</cell><cell>0.54</cell><cell>0.32</cell></row><row><cell>Model (-att)</cell><cell>0.67</cell><cell>0.66</cell><cell>0.40</cell></row><row><cell>Model (+ att)</cell><cell>0.68</cell><cell>0.67</cell><cell>0.46</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0">* https://www.taobao.com/ † https://h5.m.taobao.com/lanlan/index.html</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Neural machine translation by jointly learning to align and translate</title>
		<author>
			<persName><forename type="first">Dzmitry</forename><surname>Bahdanau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kyunghyun</forename><surname>Cho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yoshua</forename><surname>Bengio</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1409.0473</idno>
		<imprint>
			<date type="published" when="2014">2014. 2014</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A fast and accurate dependency parser using neural networks</title>
		<author>
			<persName><forename type="first">Danqi</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christopher</forename><surname>Manning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)</title>
				<meeting>the 2014 conference on empirical methods in natural language processing (EMNLP)</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="740" to="750" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Long short-term memory</title>
		<author>
			<persName><forename type="first">Sepp</forename><surname>Hochreiter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jürgen</forename><surname>Schmidhuber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural computation</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="1735" to="1780" />
			<date type="published" when="1997">1997. 1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Effective approaches to attention-based neural machine translation</title>
		<author>
			<persName><forename type="first">Minh-Thang</forename><surname>Luong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hieu</forename><surname>Pham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christopher</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1508.04025</idno>
		<imprint>
			<date type="published" when="2015">2015. 2015</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Efficient estimation of word representations in vector space</title>
		<author>
			<persName><forename type="first">Tomas</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kai</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Greg</forename><surname>Corrado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jeffrey</forename><surname>Dean</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1301.3781</idno>
		<imprint>
			<date type="published" when="2013">2013. 2013</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Recurrent neural network based language model</title>
		<author>
			<persName><forename type="first">Tomáš</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Martin</forename><surname>Karafiát</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lukáš</forename><surname>Burget</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jan</forename><surname>Černockỳ</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sanjeev</forename><surname>Khudanpur</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Eleventh Annual Conference of the International Speech Communication Association</title>
				<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A survey of named entity recognition and classification</title>
		<author>
			<persName><forename type="first">David</forename><surname>Nadeau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Satoshi</forename><surname>Sekine</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Lingvisticae Investigationes</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="3" to="26" />
			<date type="published" when="2007">2007. 2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition</title>
		<author>
			<persName><forename type="first">Erik</forename><forename type="middle">F</forename><surname>Tjong Kim Sang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fien</forename><surname>De Meulder</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003-Volume 4</title>
				<meeting>the seventh conference on Natural language learning at HLT-NAACL 2003-Volume 4</meeting>
		<imprint>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="142" to="147" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
