<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Augmentation-based Answer Type Classification of the SMART dataset</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Aleksandr</forename><surname>Perevalov</surname></persName>
							<email>aleksandr.perevalov@hs-anhalt.de</email>
							<affiliation key="aff0">
								<orgName type="institution">Anhalt University of Applied Sciences</orgName>
								<address>
									<settlement>Köthen (Anhalt)</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Andreas</forename><surname>Both</surname></persName>
							<email>andreas.both@hs-anhalt.de</email>
							<affiliation key="aff0">
								<orgName type="institution">Anhalt University of Applied Sciences</orgName>
								<address>
									<settlement>Köthen (Anhalt)</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Augmentation-based Answer Type Classification of the SMART dataset</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">AC16ACC88692E65DABFF075ED53BD088</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T21:16+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Answer type classification</term>
					<term>Text classification</term>
					<term>Text augmentation</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Recent progress in deep learning has enabled AI researchers and developers to achieve state-of-the-art results with minimal effort. Specifically, in tasks such as text classification, text preprocessing and feature generation no longer play a significant role thanks to landmark models such as BERT and related architectures. In this paper, we present our solution for the Semantic Answer Type prediction task (SMART task). The solution is based on the application of several data augmentation techniques: machine translation into popular languages, round-trip translation, and named entity annotation with linked data. The final submission was generated as a weighted combination of several successful system outputs.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Understanding a question's answer type is one of the significant steps in the question answering process <ref type="bibr" target="#b3">[4]</ref>. With the help of an answer type classifier, a Question Answering system (QA system) can narrow the answer search space and filter out inappropriate answer candidates <ref type="bibr" target="#b5">[6]</ref>.</p><p>In general, the answer type classification task can be interpreted as a multi-class text classification task. However, the SMART task <ref type="bibr" target="#b4">[5]</ref> defines a more complex data structure. There are two class levels: answer category (resource, literal, boolean) and answer type.</p><p>According to the official description of the data<ref type="foot" target="#foot_0">1</ref>: If the category is "resource", answer types are ontology classes from either the DBpedia ontology<ref type="foot" target="#foot_1">2</ref> or the Wikidata ontology<ref type="foot" target="#foot_2">3</ref>. If the category is "literal", answer types are either "number", "date", or "string". For the category "boolean", no additional specialization is defined. It is worth mentioning that in this work we concentrate only on the DBpedia dataset.</p><p>Each "resource" answer type contains a ranked list of DBpedia ontology types. All items contained in a list are part of one hierarchy, for example: ["dbo:Person", "dbo:Agent"] or ["dbo:Opera", "dbo:MusicalWork", "dbo:Work"]. The most general ontology type is at the end of a list.</p><p>The DBpedia dataset contains 21,964 questions (train: 17,571, test: 4,393). The evaluation metric for the answer category prediction task is accuracy; the metric for answer type prediction is lenient NDCG@k with a linear decay <ref type="bibr" target="#b1">[2]</ref>.</p><p>Our solution focuses on data augmentation techniques. In Section 2 we describe the dataset in detail. Section 3 describes the data augmentation methods we used, as well as the algorithm for merging answer type lists. In Section 4 we present our experimental results and describe the local evaluation pipeline. Finally, in Section 5 the conclusions are presented.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Dataset analysis and transformation</head><p>The original dataset is provided in the JSON format. To train a model on the data, it needs to be transformed into a feature-target form.</p><p>For predicting the answer category, the task is straightforward: there is exactly one target value per question, and it is treated as a multi-class classification task. Predicting the answer type is more complicated: we have to predict a list whose items are ordered by taxonomy level and must belong to one hierarchy (e.g., dbo:Opera, dbo:MusicalWork, and dbo:Work). The first constraint does not allow us to treat this task as a plain multi-class classification. That is why we decided to treat each item of a list as an individual target value, so that we can train a separate model for each of them. We kept only the 5 most general types for each question because 95% of the answer type lists are no longer than this. The head of the resulting dataset is presented in Figure <ref type="figure" target="#fig_0">1</ref>. Hence, we consider the solution for the SMART challenge task to be represented as a two-level architecture where the higher-level decisions activate lower-level classifiers: Level 1 The category is classified (Figure <ref type="figure" target="#fig_0">1</ref>, column: "category"). Thereafter, the classification system can decide which classifiers are required next. Level 2 The second-level decisions are considered to be two independent tasks:</p><p>-Classification of the literal type (Figure <ref type="figure" target="#fig_0">1</ref>, column: "type 1") -Classification of the resource types (Figure <ref type="figure" target="#fig_0">1</ref>, columns: "type 1", "type 2", "type 3", "type 4", "type 5") The training dataset had 43 questions with an empty textual representation. These questions were removed. 
The resulting dataset has the following characteristics:</p><p>-It contains 17,528 questions; -Distribution: 9,573 questions point to resources, 5,156 point to a literal datatype, and 2,799 are Boolean questions; -The 95th percentile of the answer type lists' length is 5; -The maximum number of tokens in a question is 60.</p><p>In Figure <ref type="figure">2</ref>, the top 10 most common resource answer types are presented. It shows that all of the top 10 resource types belong either to dbo:Agent or dbo:Place or their sub-classes.</p><p>3 Proposed solution</p></div>
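The transformation described above (one target column per answer type list item, truncated to the 5 most general types, with empty questions dropped) can be sketched in plain Python. This is a minimal sketch: the sample records and the `type_1`..`type_5` column names are stand-ins for the SMART JSON fields and the columns shown in Figure 1, not the authors' actual code.

```python
# Hedged sketch of the feature-target transformation of the SMART data.
MAX_TYPES = 5  # 95% of answer type lists are no longer than 5

def to_feature_target(records):
    rows = []
    for rec in records:
        if not rec["question"]:           # drop empty textual representations
            continue
        types = rec["type"][-MAX_TYPES:]  # keep the 5 most general types (list tail)
        row = {"question": rec["question"], "category": rec["category"]}
        for i in range(MAX_TYPES):        # one target column per list position
            row[f"type_{i + 1}"] = types[i] if i < len(types) else None
        rows.append(row)
    return rows

# Invented mini-sample in the SMART JSON shape, for illustration only.
sample = [
    {"question": "What is the opera Salome based on?", "category": "resource",
     "type": ["dbo:Opera", "dbo:MusicalWork", "dbo:Work"]},
    {"question": "", "category": "boolean", "type": ["boolean"]},
]
rows = to_feature_target(sample)
```

Each resulting row can then be fed to the category classifier and to the five per-position type classifiers.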
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Classifier Architecture</head><p>The classification pipeline has a tree-like structure with 7 classifiers in total (see Figure <ref type="figure" target="#fig_1">3</ref>). First, the category is classified. Then, depending on the category, the corresponding models are chosen.</p><p>For example, if the category is "resource", the pipeline classifies a question using 5 models reflecting the decisions for "type 1", "type 2", "type 3", "type 4", and "type 5" (cf. Figure <ref type="figure" target="#fig_0">1</ref>). Given the results of these classifiers, the answer type list is created from the computed results (obeying the correct order). As there are only 5 models (one model per list item), the answer type list will contain no more than 5 items; it may contain fewer when a prediction is None.</p></div>
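The two-level, tree-like dispatch can be sketched as follows. The classifier objects here are hypothetical stand-ins with a `predict(question) -> label` interface, not the actual fine-tuned BERT models:

```python
# Minimal sketch of the tree-like classification pipeline (Level 1: category,
# Level 2: literal type or the 5 per-position resource type models).
class ConstantClf:
    """Toy classifier that always returns the same label."""
    def __init__(self, label):
        self.label = label
    def predict(self, question):
        return self.label

def classify(question, category_clf, literal_clf, resource_clfs):
    category = category_clf.predict(question)          # Level 1
    if category == "boolean":                          # no further specialization
        return {"category": "boolean", "type": ["boolean"]}
    if category == "literal":                          # Level 2: one literal model
        return {"category": "literal", "type": [literal_clf.predict(question)]}
    # category == "resource": Level 2, run the 5 per-position models in order
    types = [clf.predict(question) for clf in resource_clfs]
    return {"category": "resource",
            "type": [t for t in types if t is not None]}  # drop None predictions
```

A real pipeline would replace `ConstantClf` with the fine-tuned models; the dispatch logic stays the same.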
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Data Augmentation</head><p>To extend the training data, we used several augmentation strategies for the given dataset: D 1 Machine translation into German, French, Spanish, and Russian is applied to each question. Hence, in total there are 5x more questions (across 5 different languages), resulting in 87,640 questions. As the dataset becomes multilingual, we use a multilingual model. There are two prediction modes for such a dataset: use the original English text, or use the predictions for all languages combined by a majority voting algorithm. D 2 Round-trip translation <ref type="bibr" target="#b0">[1]</ref> (English-German-English, English-Russian-English): in total, there are 3x more questions, and we use a single-language model. The dataset consists of 52,584 questions; D 3 Each question is annotated with its named entities pointing to DBpedia resources, and each named entity is replaced with one of its RDF types. The data is extracted from DBpedia with the help of DBpedia Spotlight<ref type="foot" target="#foot_3">4</ref>. The dataset consists of 163,488 questions.</p><p>Google Cloud Translation<ref type="foot" target="#foot_4">5</ref> was used to translate the data for D 1 and D 2 automatically. </p></div>
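The majority voting mode for D 1 can be sketched in a few lines. The tie-handling rule (first-seen label wins) is our assumption for illustration, as the text does not specify it:

```python
from collections import Counter

# Majority voting over per-language predictions of the multilingual model:
# the most frequent label across the 5 language variants of a question wins.
# Ties go to the label seen first (Counter preserves insertion order).
def majority_vote(labels):
    return Counter(labels).most_common(1)[0][0]
```

For example, if three of the five language variants of a question are classified as "resource", the voted category is "resource" regardless of the other two predictions.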
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Hence, in addition to the original dataset -we call it D 0 -we have created three more datasets (D 1 , D 2 , and D 3 ) that are used to spawn four independent classifier pipelines (C 0 , C 1 , C 2 , and C 3 ). Consequently, the results R Ci of all classifier pipelines C i need to be merged. Figure <ref type="figure" target="#fig_2">4</ref> shows an example of the merging process. The next section gives a detailed description of the process.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Results Merging</head><p>Each classification pipeline -C 0 , C 1 , C 2 , and C 3 -provides a list of classification results. It is reasonable to assume that they also differ in classification quality.</p><p>Hence, we merge the classification results -identified by R C0 , R C1 , R C2 , and R C3 -to establish a final result set R Final as shown in Figure <ref type="figure" target="#fig_3">5</ref>. The merging of R Ci with i ∈ {0, 1, 2, 3} is computed by numerically calculating a weighted rank for each answer type that was predicted by at least one classifier pipeline C i . The rank P An,R Final of an answer type A n in R Final is computed as follows:</p><formula xml:id="formula_1">P An,R Final = Σ m i=0 W i • P An,RC i , where P An,RC i = (rank of A n in R Ci ) if A n ∈ R Ci , and (fallback rank f ) otherwise,</formula><p>and m + 1 is the number of classification pipelines.</p><p>Typically, the quality of Level-1 decisions would be high. However, there also exists a special case where different answer categories are predicted by the classifier pipelines. In this case, we currently follow a static rule-based decision process that favors the more specific predictions: if one classifier pipeline predicted the category boolean, then all other results are discarded. Otherwise, if one classifier pipeline predicts the literal category, then all non-literal categories are discarded.</p></div>
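The weighted-rank merging rule above can be sketched directly in Python. The function signature and the convention that a lower weighted rank is better (rank 1 is the top position) are our assumptions; the formula itself follows the text:

```python
# Sketch of merging ranked answer type lists: P_{A_n,R_Final} = sum_i W_i * P_{A_n,R_Ci},
# where an answer type missing from R_Ci gets the fallback rank f.
def merge_ranked_lists(result_lists, weights, fallback=10):
    all_types = {t for lst in result_lists for t in lst}
    scores = {}
    for t in all_types:
        scores[t] = sum(
            w * ((lst.index(t) + 1) if t in lst else fallback)
            for lst, w in zip(result_lists, weights)
        )
    # lower weighted rank = better, so sort ascending
    return sorted(all_types, key=lambda t: scores[t])
```

With two pipelines ranking "A" and "B" in opposite orders, the pipeline with the larger weight dominates the final ordering, which matches the intuition behind weighting pipelines by their quality.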
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Experiments</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Evaluation</head><p>We used the BERT-base-cased and BERT-base-multilingual-cased models <ref type="bibr" target="#b2">[3]</ref> in our classification pipeline. The training data was split into a train set and a validation set.</p><p>The validation set was created by randomly choosing 4,400 questions, and the test set consists of 4,381 questions. The models were fine-tuned on the training set with the following hyperparameters: EPOCHS=2, MAX_LEN=60, BATCH_SIZE=16.</p><p>The training process was performed on GPU resources provided by the Kaggle.com platform (NVIDIA Tesla P100 GPU, 16 GB RAM). The results shown in Table <ref type="table" target="#tab_1">1</ref> enable us to compare the effectiveness of each augmentation technique. The results were obtained on the validation set locally (MV corresponds to the majority voting algorithm, see Section 3.2): </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><table><row><cell></cell><cell>D0</cell><cell>D1</cell><cell>D1+MV</cell><cell>D2</cell><cell>D3</cell></row><row><cell>Accuracy</cell><cell>0.969</cell><cell>0.968</cell><cell>0.962</cell><cell>0.357</cell><cell>0.959</cell></row><row><cell>NDCG@5</cell><cell>0.533</cell><cell>0.704</cell><cell>0.708</cell><cell>0.165</cell><cell>0.363</cell></row><row><cell>NDCG@10</cell><cell>0.499</cell><cell>0.661</cell><cell>0.665</cell><cell>0.140</cell><cell>0.317</cell></row></table><p>The best performing dataset is the multilingual one (D 1 ). The round-trip translation (D 2 ) approach caused overfitting because of the small differences in question forms. The same situation occurred with the named entity annotation approach (D 3 ). The original dataset (D 0 ) showed comparable performance. A detailed analysis of the errors is given in Section 4.2.</p><p>For the final submission, we took only the predictions from the models trained on the original (D 0 ) and the multilingual dataset (D 1 ) into account. We used both prediction modes for the multilingual data: using the multilingual model to predict the answer type of the English questions, and using the same model while retrieving predictions for all 5 languages and taking the majority vote result. The predictions were merged using the algorithm described at the end of the previous section; we evaluated several weight combinations to achieve the highest quality. The evaluation results for the final submission are presented in Table <ref type="table" target="#tab_2">2</ref>. The highest score on the test dataset was achieved with a merged combination of 3 predictions (see the second column of Table <ref type="table" target="#tab_2">2</ref>). We evaluated weight combinations where each weight w i was chosen between 0.0 and 1.0, s.t. the sum of all used weights equals 1.0. The best weight combination found by this process is: 30% D 0 , 30% D 1 , and 40% D 1+MV . The fallback rank f for the merging algorithm was set to 10 (see Subsection 3.3). This combination was submitted as the final solution for the task. As the weights were obtained manually and intuitively, we cannot make a statement about their applicability to other datasets. 
Moreover, these weights may be overfitted to the test set, because the final scores were provided by the organizers on the whole test dataset, without a private/public test split. Hence, the weights were selected according to the public test set results. This is a limitation of our merging approach.</p></div>
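The manual weight search described above can be approximated by a simple grid enumeration over weight triples that sum to 1.0. The 0.1 step size and the scoring callback are assumptions for illustration; the authors state their weights were chosen manually and intuitively:

```python
import itertools

# Hedged sketch of a weight search: enumerate triples (w0, w1, w2) on a
# 0.1 grid with sum 1.0 and keep the combination with the best validation
# score. `score_fn` stands in for the NDCG evaluation on the validation set.
def best_weights(score_fn, step=0.1, n=3):
    grid = [round(step * k, 1) for k in range(int(1 / step) + 1)]
    best, best_score = None, float("-inf")
    for combo in itertools.product(grid, repeat=n):
        if abs(sum(combo) - 1.0) > 1e-9:   # keep only weights summing to 1.0
            continue
        score = score_fn(combo)
        if score > best_score:
            best, best_score = combo, score
    return best
```

On a toy score function peaked at (0.3, 0.3, 0.4), this enumeration recovers the paper's final weight combination.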
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Error analysis</head><p>As we reported in the previous subsection, the approach D 1 outperformed D 2 and D 3 , where the models overfit because of the nearly identical surface forms of the generated questions. The corresponding example of D 2 is given below. We have to recognize that the questions generated using round-trip translation do not differ significantly: the En-De-En variant differs in one word, the absence of one definite article, and a non-capitalized "T" in the last definite article; almost the same is true for the En-Ru-En translation.</p><p>We can assume that round-trip translation via languages that are less popular or more distant from English could possibly resolve this issue.</p><p>For D 3 , each named entity was replaced with one of its RDF types from DBpedia. As a resource in DBpedia may have several types, a question may yield up to several thousand variants corresponding to the combinations of types. There are two major limitations of this approach: a DBpedia resource may contain errors w.r.t. its types, and the Named Entity Linking tool may extract and link entities incorrectly. In the given example, "the Chief Justice of The United States" should be replaced with a single type, while it was replaced with two different types, which is incorrect.</p><p>Among the augmentation approaches, D 1 showed the best performance; an example of its fragment is given below:</p><p>Original: Who replaced Charles Evans Hughes as the Chief Justice of The United States? German: Wer hat Charles Evans Hughes als Oberster Richter der Vereinigten Staaten abgelöst? French: Qui a remplacé Charles Evans Hughes en tant que juge en chef des États-Unis?</p><p>However, despite the augmentation approaches, there is one significant limitation of our prediction approach: each element of the answer type list is predicted independently, and therefore the elements may not be from the same hierarchy. For example, for the question "What is the horse characters of Madame Sans-Gêne play?" 
the predicted answer type list is ["dbo:Person", "dbo:Work"] while the true value is ["dbo:Animal", "dbo:Eukaryote", "dbo:Species"]. Not only is the prediction completely incorrect, it also contains the items "dbo:Person" and "dbo:Work", which are located in different ontology branches (hierarchies).</p><p>Consequently, a mechanism for checking the consistency of the hierarchy should be created. One possible solution is to predict only the most specific answer type and to derive the remaining list items from the actual hierarchy.</p></div>
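The proposed hierarchy-aware fix can be sketched as follows. The hard-coded subclass fragment is a toy stand-in for the DBpedia class tree, used here only to illustrate the idea of expanding the most specific predicted type upward:

```python
# Hedged sketch: predict only the most specific answer type and derive the
# rest of the list by walking up the subclass hierarchy, so all list items
# lie on one branch by construction.
SUPERCLASS = {  # toy fragment of the DBpedia class tree (illustrative only)
    "dbo:Opera": "dbo:MusicalWork",
    "dbo:MusicalWork": "dbo:Work",
    "dbo:Animal": "dbo:Eukaryote",
    "dbo:Eukaryote": "dbo:Species",
}

def expand_to_hierarchy(most_specific, max_len=5):
    chain, current = [], most_specific
    while current is not None and len(chain) < max_len:
        chain.append(current)                 # add the type, then step up
        current = SUPERCLASS.get(current)     # None once the root is reached
    return chain
```

In a real system, the `SUPERCLASS` map would be derived from the DBpedia ontology (e.g., from its rdfs:subClassOf statements) rather than hard-coded.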
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion</head><p>In this work, we described our solution for the Semantic Answer Type prediction task. The goal was to predict the corresponding answer category and answer types. To solve the task, we created a tree-like classification pipeline and implemented several text augmentation methods described in Section 3.</p><p>The results of our experiments show that the multilingual dataset yields the highest performance compared to the other augmented data. To prepare the final submission, we used the weighted merging algorithm on top of our best predictions (see Section 4).</p><p>Obviously, there is room for improvement. In future work, we would use an ensemble learning approach to merge the results instead of the current static approach. Additionally, we would consider each language classifier independently, assuming that differing translation quality leads to different classification quality. Also, a hierarchy conformance and hierarchy-level validation mechanism might be integrated into the prediction process.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Tabular representation of the training dataset.</figDesc><graphic coords="2,134.77,491.90,345.83,90.25" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Tree-like classification pipeline C</figDesc><graphic coords="4,177.99,200.09,259.37,141.37" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 4 .</head><label>4</label><figDesc>Fig. 4. Example of merging 3 lists with specified weights</figDesc><graphic coords="5,134.77,115.84,345.82,125.49" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 5 .</head><label>5</label><figDesc>Fig. 5. Overview of the final process.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Original:</head><label></label><figDesc>Who replaced Charles Evans Hughes as the Chief Justice of The United States? En-De-En: Who succeeded Charles Evans Hughes as Chief Justice of the United States? En-Ru-En: Who replaced Charles Evans Hughes as Chief Justice of the United States?</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>The example of D 3</head><label>3</label><figDesc>is given below: Original: Who replaced Charles Evans Hughes as the Chief Justice of The United States? Variant 1: Who replaced DBpedia:Athlete as the DBpedia:Person of The DBpedia:PopulatedPlace? Variant 2: Who replaced DBpedia:Person as the DBpedia:Person of The DBpedia:Country?</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1 .</head><label>1</label><figDesc>Local validation results</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2 .</head><label>2</label><figDesc>Final evaluation results</figDesc><table><row><cell></cell><cell cols="4">.3D0+.3D1+.4D1+MV .5D1+.5D1+MV .3D1+.7D1+MV .7D1+.3D1+MV</cell></row><row><cell>Accuracy</cell><cell>0.976</cell><cell>0.965</cell><cell>0.965</cell><cell>0.972</cell></row><row><cell>NDCG@5</cell><cell>0.762</cell><cell>0.752</cell><cell>0.752</cell><cell>0.759</cell></row><row><cell>NDCG@10</cell><cell>0.725</cell><cell>0.714</cell><cell>0.716</cell><cell>0.722</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://smart-task.github.io/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://mappings.dbpedia.org/server/ontology/classes/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://www.wikidata.org/wiki/Wikidata:WikiProject_Ontology</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://www.dbpedia-spotlight.org/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://cloud.google.com/translate</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">The efficacy of round-trip translation for MT evaluation</title>
		<author>
			<persName><forename type="first">M</forename><surname>Aiken</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Park</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Translation Journal</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="10" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Hierarchical target type identification for entityoriented queries</title>
		<author>
			<persName><forename type="first">K</forename><surname>Balog</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Neumayer</surname></persName>
		</author>
		<idno type="DOI">10.1145/2396761.2398648</idno>
		<ptr target="https://doi.org/10.1145/2396761.2398648" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 21st ACM international conference on Information and knowledge management</title>
				<meeting>the 21st ACM international conference on Information and knowledge management</meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="2391" to="2394" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">N</forename><surname>Toutanova</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">ArXiv e-prints</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Leveraging question target word features through semantic relation expansion for answer type classification</title>
		<author>
			<persName><forename type="first">T</forename><surname>Hao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Weng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Qu</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.knosys.2017.06.030</idno>
		<ptr target="https://doi.org/10.1016/j.knosys.2017.06.030" />
	</analytic>
	<monogr>
		<title level="j">Knowledge-Based Systems</title>
		<imprint>
			<biblScope unit="volume">133</biblScope>
			<biblScope unit="page" from="43" to="52" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">SeMantic AnsweR Type prediction task (SMART) at ISWC</title>
		<author>
			<persName><forename type="first">N</forename><surname>Mihindukulasooriya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dubey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gliozzo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lehmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C N</forename><surname>Ngomo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Usbeck</surname></persName>
		</author>
		<idno>CoRR/arXiv abs/2012.00555</idno>
		<ptr target="https://arxiv.org/abs/2012.00555" />
	</analytic>
	<monogr>
		<title level="m">2020 Semantic Web Challenge</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Multi-class hierarchical question classification for multiple choice science exams</title>
		<author>
			<persName><forename type="first">D</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Jansen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Martin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Yadav</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">T</forename><surname>Madabushi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Tafjord</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Clark</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of The 12th Language Resources and Evaluation Conference</title>
				<meeting>The 12th Language Resources and Evaluation Conference</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="5370" to="5382" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
