<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Can models learned from a dataset reflect acquisition of procedural knowledge? An experiment with automatic measurement of online review quality</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Martina</forename><forename type="middle">Megasari</forename><surname>Pandu</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Tours</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Wicaksono</forename><surname>Chiao</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Tours</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Yun</forename><surname>Li</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Tours</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Clément</forename><surname>Chaussade</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Tours</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Shibo</forename><surname>Cheng</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Tours</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Nicolas</forename><surname>Labroche</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">University of Tours</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Patrick</forename><surname>Marcel</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">University of Tours</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Verónika</forename><surname>Peralta</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">University of Tours</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Can models learned from a dataset reflect acquisition of procedural knowledge? An experiment with automatic measurement of online review quality</title>
					</analytic>
					<monogr>
<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">E13C25C1268C5A3FE4CE62C3D6685101</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T03:33+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Can models learned from a dataset reflect how good humans are at mastering a particular skill? This paper studies this question in the context of online review writing, where the skill corresponds to the procedural knowledge needed to write helpful reviews. To this end, we model the quality of a review by a combination of various metrics stemming from text analysis (such as readability, polarity, spelling errors or length) and we use customer-declared helpfulness as the ground truth for constructing the model. We use Knowledge Tracing, a popular model of skill acquisition, to measure the evolution of the ability to write reviews of good quality over a period of time. While recent studies have tried to measure the quality of a review and correlate it to helpfulness, to the best of our knowledge, our work is the first to address this question as the exercise of a reviewer's skill over a sequence of reviews. Our experiments on a set of 41,681 Amazon book reviews show that it is possible to accurately assess the individual skill acquisition of writing a helpful review, based on a statistical model of the procedural knowledge at hand rather than on human evaluations prone to subjectivity and variation over time.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">INTRODUCTION</head><p>In today's era of big and open data, plenty of datasets are analyzed to derive models mimicking humans by using machine learning techniques. The representation and assessment of user knowledge opens new possibilities for big data analytics, such as differentiating between novice and expert users, taking advantage of user experience for recommending (e.g. products or actions), calculating advanced scores (e.g. credibility), or assessing the quality of users' analyses. In this paper we focus on the assessment of procedural knowledge from large data collections.</p><p>Procedural knowledge is the knowledge about how to do something. Unlike declarative knowledge, which can often be verbalized, the application of procedural knowledge may not be easily explained <ref type="bibr" target="#b2">[3]</ref>. Models exist for evaluating procedural knowledge acquisition, such as the popular Bayesian Knowledge Tracing <ref type="bibr" target="#b5">[6]</ref>.</p><p>Many open datasets illustrate the application of procedural knowledge. For instance, Amazon review datasets like those provided by He and McAuley <ref type="bibr" target="#b11">[12]</ref> contain customer-written reviews, where the skill of writing helpful reviews is an example of the application of procedural knowledge. However, this skill is difficult to define and assess. Reviews can be voted helpful or not by customers, but this assessment is subjective and as such subject to variations over time, and it is difficult to construct a model that accurately predicts the helpfulness of a review <ref type="bibr" target="#b15">[16]</ref>.</p><p>In this paper, we show that it is possible to benefit from such very large datasets to learn an individual model of procedural knowledge acquisition. 
The resulting model of knowledge has several nice properties: (1) it is not prone to the usual bias caused by a single small set of evaluators that might be non-representative or produce a subjective evaluation; (2) it avoids explicitly defining the procedural knowledge at hand, which is replaced by a statistical model learned over the large dataset. As a consequence, the larger the dataset, the more accurate the modeling of the procedural knowledge, and the better the evaluation of a user's skill.</p><p>To illustrate this, we experiment with a use case based on a dataset of the aforementioned Amazon online product reviews. We chose this use case because it is prototypical of how procedural knowledge influences decision making. For instance, Mayzlin and Chevalier studied the effects of online book reviews on Amazon.com and Barnesandnoble.com and found a positive correlation between the reviews and the transactions of the books <ref type="bibr" target="#b3">[4]</ref>. This means that reviewers' opinions play an important role in users' purchase decisions. Automatically measuring a reviewer's skill may therefore help predict how helpful a review is. A skillful writer can be assumed to write good reviews, which help customers make better purchase decisions.</p><p>To motivate our approach, suppose that we want to determine whether a reviewer can be assumed to master the skill of writing helpful reviews. This is preferable to trying to predict the helpfulness of the reviews, because of the high variability of reviewer profiles, reviews and votes received by reviews. However, this skill corresponds to procedural knowledge and is difficult to define. Therefore, to evaluate the skill of each reviewer, we use the classical Knowledge Tracing model. 
But instead of applying Knowledge Tracing directly to the votes received by reviews, we apply it to a model of helpfulness learned from each review. Our research question is: can this model of helpfulness be used to assess the skill accurately? Consider the four curves displayed in Figure <ref type="figure" target="#fig_0">1</ref>. These curves relate to the evolution over time of the skill of writing helpful reviews for a particular reviewer (randomly extracted from the Amazon book review dataset). The helpfulness curve is the normalized helpfulness score received by the 20 reviews written by this reviewer. The model curve is the helpfulness score as predicted for this reviewer by a model learned over the entire dataset. The KT helpfulness curve predicts the probability that this reviewer has acquired the skill of writing helpful reviews, computed with the helpfulness score. The KT model curve is the same probability computed with the model. In this example, it is clear that even though the skill can be considered acquired, the helpfulness score is difficult to predict due to the subjectivity of the voters. On the other hand, a model of helpfulness can be learned to predict whether the skill has been acquired.</p><p>The contributions of this paper are the following: (1) assuming that writing helpful reviews is a hard-to-define skill, we propose a model for it. We use low-level features of the online review, such as rating, spelling error ratio or readability score, to build a model that infers a high-level, human-related feature: helpfulness. This model is learned over the entire dataset and can be used to predict the helpfulness of future reviews for one particular reviewer. (2) Using Knowledge Tracing, we show that this model can be used to assess skill acquisition without relying on human-entered votes. 
In particular, we show that this model, although learned over the entire dataset, is accurate enough to predict whether the skill has been acquired by each individual reviewer. To the best of our knowledge, this work is the first to evaluate a reviewer's skill over a sequence of reviews with Knowledge Tracing.</p><p>The remainder of the paper is organized as follows. Section 2 discusses related work. Section 3 defines the features used to build the model of helpfulness. Section 4 details our approach. Section 5 explains how the experiment is performed to build the model and presents the results. Finally, Section 6 concludes the paper and discusses possible future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">RELATED WORKS AND BACKGROUND</head><p>We first review recent works on online review evaluation and then describe the Bayesian Knowledge Tracing model and some of its extensions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Online review evaluation</head><p>Readability tests play an important role in online review evaluation. Various indexes have been proposed to quantify the readability of an English text. Most of these indexes relate to the level of studies a person needs to understand the text at first reading, according to American standards. They are computed from components such as the number of words, sentences, syllables or characters. The Gunning-Fog Index (FOG) <ref type="bibr" target="#b9">[10]</ref> aims to estimate the years of formal education a person needs to understand the text at first reading. The Flesch Reading Ease (FK) <ref type="bibr" target="#b14">[15]</ref> indicates the difficulty of a text using the number of words, sentences and syllables; higher values indicate better readability. The Automated Readability Index (ARI) <ref type="bibr" target="#b22">[23]</ref> approximates the US grade level needed to understand the text. The Coleman-Liau Index (CLI) <ref type="bibr" target="#b4">[5]</ref> is also an approximation of the US grade level needed to understand the text. More background on readability tests can be found in <ref type="bibr" target="#b15">[16]</ref>.</p><p>Given the popularity of online marketing, previous works have studied the evaluation of online reviews, often paying attention to their influence on helpfulness. Korfiatis et al. investigated the interplay between helpfulness, rating score and qualitative characteristics of the review text of 37,221 online reviews collected from Amazon UK between March and April 2008 <ref type="bibr" target="#b15">[16]</ref>. 
The authors theorize that helpfulness relates to a model with three aspects: conformity (the relation between the review text and the rating), understandability (the readability of the review text) and expressiveness (the length of the review text). The authors formulate several hypotheses and perform linear regression to validate the relationship between the metrics derived from reviews and the helpfulness of the reviews. Regarding understandability, four common readability scores, indicating the education level readers need in order to understand the content, are computed: FOG, FK, ARI and CLI. Their results indicate that the helpfulness of a review is directionally affected by its qualitative characteristics, and in particular by the readability of the review text. Precisely, the relationship between reviews of average length and their readability scores holds for both moderate and extreme reviews, and readability has more impact on longer reviews. In their work, metrics related to polarity, the summary text of reviews and rating deviation (between the average rating and the reviewer's one) are not considered. Moreover, due to the purpose of their work, books with special offers are excluded to avoid the price effect. In our work, such books are included because of the large number of reviews resulting from this price effect.</p><p>Based on 7,659 book reviews on Amazon UK, Wu et al. explored whether a negative bias exists when evaluating helpfulness <ref type="bibr" target="#b26">[27]</ref>. The assumption was that negative reviews may be more helpful than positive ones. 
After applying a regression model controlling for factors such as the readability and length of the reviews, the results show that this assumption is not readily applicable to online reviews.</p><p>Mudambi and Schuff analyzed 1,587 reviews from Amazon.com <ref type="bibr" target="#b18">[19]</ref> to understand how review extremity, review depth and product type affect the perceived helpfulness of a review. Their helpfulness model is based on the features rating, review text word count, total votes and product type. Product type is either experience goods or search goods, where experience goods are products that require sampling or purchase in order to evaluate product quality; books are an example of experience goods. They found that for experience goods, moderate reviews are more helpful than extreme reviews (whether strongly positive or negative). In contrast, it has been observed that reviews closer to the general opinion of people (the average rating score) may be considered more helpful by potential buyers <ref type="bibr" target="#b13">[14]</ref>.</p><p>McAuley and Leskovec <ref type="bibr" target="#b17">[18]</ref> propose a latent-factor model for recommending products that may be preferred by users according to their experience level at the moment. The model evaluates the evolution of users' experience and is based on the ratings that users give to products. Unlike other works on temporal dynamics, which rely on the hypothesis that two users rating a product at the same time will provide the same rating, McAuley and Leskovec's model takes users' personal development into consideration in order to evaluate the expertise degree of reviewers. Experiments showed, for example, that experts' ratings are easier to predict and are more similar to each other. 
While close to our work in the idea of taking the evolution of the user into account, this work focuses on ratings rather than helpfulness, and therefore does not consider the linguistic aspect of the review text.</p><p>Liu et al. considered a complex model learned using non-linear regression, which combines the reviewer's expertise (based on the number of similar reviews written in the past), the writing style of the review (characterized with part-of-speech tagging and counting the number of words in each tag), and the timeliness of the review <ref type="bibr" target="#b16">[17]</ref>. They showed that the three factors accurately predict helpfulness over a dataset of 22,819 reviews collected from IMDB.</p><p>In <ref type="bibr" target="#b25">[26]</ref>, review helpfulness is considered through five features, including user profile aspects (age, verified purchase) together with rating, text length and the rank of the review in the webpage. A model learned on 12,756 reviews was shown to be reasonably robust.</p><p>Agnihotri and Bhattacharya explored how the helpfulness of online reviews is affected by content readability (FK index), sentiment analysis and the number of reviews written by a reviewer <ref type="bibr" target="#b0">[1]</ref>. It was observed on 1,608 Amazon reviews that the content readability and text sentiment of reviews follow a curvilinear relationship with review helpfulness: reviews with a very high readability score or a very positive sentiment are perceived as less helpful.</p><p>Hong and Xu analyze the impact of the review message and the reviewer profile on the helpfulness of 2,997 online reviews collected from Douban.com <ref type="bibr" target="#b12">[13]</ref>. 
Using negative binomial regression, the authors show that reader participation is positively related to online review helpfulness; reader participation fully mediates the effect of reviewer expertise history on online review helpfulness and partially mediates the effects of three other metrics: average rating, title depth and reviewer network centrality.</p><p>To the best of our knowledge, no previous work has focused on the evolution of review text quality from the angle of skill acquisition, with a model learned only from the review content.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Knowledge Tracing Models</head><p>The Bayesian Knowledge Tracing model was proposed by Corbett and Anderson, using a Bayesian network to assess people's procedural knowledge acquisition, or simply put, their "skill level" <ref type="bibr" target="#b5">[6]</ref>. An individual's grasp of the procedural knowledge is expressed as a binary variable indicating whether the corresponding skill has been mastered or not. The knowledge of an individual cannot be directly observed, but it can be inferred by observing the individual's answers to a series of questions (or opportunities to exercise the skill), in order to estimate the probability distribution of knowledge mastery. Observation variables are also binary: the answer to a question is either correct or wrong.</p><p>Specifically, the Knowledge Tracing model has four parameters, namely two learning parameters, P(L_0) and P(T), and two performance parameters, P(G) and P(S). P(L_0) is the probability that the skill was mastered before answering the questions. P(T) is the knowledge transition probability: the probability that the skill will be learned at each opportunity to use it (i.e., the transition from not mastered to mastered). P(G) is the probability of guess: the probability that the individual answers correctly although the knowledge is not mastered. P(S) is the probability of slip, i.e. of failing while the skill is already mastered. The model uses these parameters to update the learning probability after each question, monitoring the individual's knowledge state and predicting their future probability of knowledge acquisition using a Bayesian network.</p><p>The probability that a skill L is mastered at question i + 1, denoted P(L_{i+1}), is the sum of two probabilities: (1) the posterior probability that the skill was already learned, contingent on the evidence at time i, i.e. 
the i-th opportunity to evaluate the skill, which can be either Correct or Incorrect, and (2) the probability that the knowledge changes from not mastered to mastered at the i-th opportunity. This is expressed in the following formula:</p><formula xml:id="formula_0">P(L_{i+1}) = P(L_i | Evidence_i) + (1 − P(L_i | Evidence_i)) × P(T) (1)</formula><p>where:</p><formula xml:id="formula_1">P(L_i | Evidence_i = Correct) = P(L_i) × P(¬S) / (P(L_i) × P(¬S) + P(¬L_i) × P(G))
P(L_i | Evidence_i = Incorrect) = P(L_i) × P(S) / (P(L_i) × P(S) + P(¬L_i) × P(¬G))</formula><p>Due to its predictive accuracy, Corbett and Anderson's Bayesian Knowledge Tracing is one of the most popular models. However, several challenges remain, including local minima, degenerate parameters and the computational cost of fitting. Hawkins et al. proposed a fitting method avoiding these problems while achieving similar predictive accuracy, and evaluated it against one of the most popular fitting methods, Expectation-Maximization <ref type="bibr" target="#b10">[11]</ref>. In this extension, the parameters are fitted by estimating the most likely opportunity at which each individual learned the skill. The learner's performance is thus annotated with an estimate of when the skill was learned, assuming that a known state can never be followed by an unknown state. This annotation is used to construct knowledge sequences which, when compared with the actual performance sequence, allow the model's four parameters to be derived empirically.</p><p>As mentioned above, the performance of an individual is traditionally represented as a binary value, correct or wrong, which does not account for all skill learning situations. Wang et al. proposed to extend the Knowledge Tracing model by replacing the discrete binary performance node with a continuous partial-credit node <ref type="bibr" target="#b24">[25]</ref>. 
In this extension, it is assumed that P(G) and P(S) follow two Gaussian distributions, described respectively by their means and standard deviations. The prediction of the performance node also follows a Gaussian distribution, whose mean value is used for the prediction. Notably, the standard deviation conveys how confident the prediction is. Experiments with this extension show that by relaxing the assumption of binary correctness, the prediction of an individual's performance can be improved.</p><p>These two improvements of the Knowledge Tracing model (the fitting method and the use of partial credits) were successfully used for sequencing educational content for students <ref type="bibr" target="#b6">[7]</ref>. We conclude this section by noting that other models exist for predicting a learner's skill. Specifically, Performance Factor Analysis <ref type="bibr" target="#b19">[20]</ref> uses standard logistic regression with the student performance as the dependent variable. Interestingly, it is shown in <ref type="bibr" target="#b8">[9]</ref> that Knowledge Tracing can achieve predictive accuracy comparable to Performance Factor Analysis. Finally, Deep Knowledge Tracing <ref type="bibr" target="#b21">[22]</ref> uses recurrent neural networks to model student learning, with the advantage of not having to set explicit probabilities for slip and guess. However, these models need very large datasets to learn the latent state from sequences and, most importantly, the encoding of the input vectors depends on an upper bound on the number of exercises, which does not directly fit our context.</p></div>
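As an illustration, the classical binary Knowledge Tracing update of formula (1) can be written in a few lines of Python. The sketch below is a minimal, self-contained implementation; the parameter values in the example run are made up for illustration and are not fitted values from the paper.

```python
def bkt_update(p_L, correct, p_T, p_G, p_S):
    """One step of binary Bayesian Knowledge Tracing (Corbett & Anderson).

    p_L is P(L_i), the probability that the skill is mastered before this
    opportunity; the function returns P(L_{i+1}) after observing the answer.
    """
    if correct:
        # Posterior that the skill was already mastered, given a correct answer.
        posterior = p_L * (1 - p_S) / (p_L * (1 - p_S) + (1 - p_L) * p_G)
    else:
        # Posterior given an incorrect answer (a slip, or genuinely not mastered).
        posterior = p_L * p_S / (p_L * p_S + (1 - p_L) * (1 - p_G))
    # Either the skill was already mastered, or it is learned at this opportunity.
    return posterior + (1 - posterior) * p_T

# Illustrative run over a short answer sequence (parameter values are made up):
p_L = 0.2  # P(L_0)
for answer in [True, False, True, True]:
    p_L = bkt_update(p_L, answer, p_T=0.1, p_G=0.2, p_S=0.1)
```

Each correct answer pulls the mastery estimate up through the posterior, while P(T) adds the chance of learning at every opportunity, so the estimate rises over a run of correct answers.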
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">FEATURES AND METRICS</head><p>Consistent with the previous work of Korfiatis et al. <ref type="bibr" target="#b15">[16]</ref>, our model of helpfulness is based on features grouped in three categories: Conformity, Understandability and Extensiveness, with additional features compared to <ref type="bibr" target="#b15">[16]</ref>. From these features we derive metrics, i.e., numerical attributes to be used in the definition of our model. Conformity expresses the internal consistency of a review. In addition to the classical rating, we add two metrics in this category: Polarity and Deviation. Understandability measures the quality of the written text in terms of readability. We derive five metrics for it: the Spelling Error Ratio and four readability metrics (FOG, FK, ARI, and CLI). Finally, Extensiveness refers to the length of the review. In total, 16 metrics are defined, since length and readability metrics apply to both the review text and the summary. We detail them below; a summary of the features used in the experiments, with their name, category, and theoretical and empirical ranges, is provided in Tables <ref type="table" target="#tab_1">1 and 2</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Conformity</head><p>Metrics in this category relate to the consistency of the review. As the content of a review consists of a rating and a written text, we can derive a relation between them. The rating should correspond to the written review and vice versa; hence a difference between these two contents might indicate that the review is inconsistent. For example, a review with a 5-star rating but a very negatively written text is inconsistent. Inconsistent reviews may receive lower helpfulness scores due to the confusion they cause. From this perspective, we consider the Polarity of the text, which indicates the positiveness or negativeness of a review, as a metric. Besides, the extremity of the rating given by the reviewer may indicate that the reviewer is biased and has a subjective point of view on the product being reviewed. Extremely high and low ratings are associated with lower levels of helpfulness than moderate ratings <ref type="bibr" target="#b18">[19]</ref>. In contrast, reviews closer to the general opinion of people (the average rating score) may be considered more helpful by potential buyers <ref type="bibr" target="#b13">[14]</ref>. From this perspective, we derive the Deviation score, quantifying how different the rating given by the reviewer is from the average rating.</p><p>Rating. The Rating of a review is the quantitative indicator of the quality of the reviewed item entered by the user (e.g., from 1 to 5 for Amazon book reviews).</p><p>Polarity. The Polarity of a text is measured using a word list that indicates the positivity, negativity and objectivity of each synset. The polarity score of a word, given its part of speech, is calculated as its positivity score minus its negativity score. 
Polarity values range from -1 to 1, where -1 indicates that the written text is very negative and 1 that it is very positive.</p><p>Deviation. Deviation is calculated as the absolute difference between the rating of a review and the average rating of the reviewed item.</p></div>
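A minimal sketch of these two conformity metrics, using a toy word list in place of a real per-synset lexical resource (the words and scores in `LEXICON` are invented for illustration, and part-of-speech handling is omitted):

```python
# Toy lexicon mapping a word to (positivity, negativity) scores; a real
# system would use a per-synset resource (these entries are invented).
LEXICON = {"great": (0.8, 0.0), "boring": (0.0, 0.7), "book": (0.0, 0.0)}

def polarity(text):
    """Mean per-word polarity (positivity minus negativity), in [-1, 1]."""
    scores = [p - n for word in text.lower().split()
              for (p, n) in [LEXICON.get(word, (0.0, 0.0))]]
    return sum(scores) / len(scores) if scores else 0.0

def deviation(rating, avg_rating):
    """Absolute difference between the review's rating and the item's average."""
    return abs(rating - avg_rating)
```

For instance, a 5-star review of an item whose average rating is 3.5 has a deviation of 1.5, and a text dominated by positive words yields a polarity close to 1.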
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Readability</head><p>Metrics in this category relate to the effort needed to understand the text of the review. This is measured with the number of spelling errors in the written text, which is expected to be negatively correlated with helpfulness <ref type="bibr" target="#b7">[8]</ref>, and with various readability measures.</p><p>Spelling Error Ratio (SER). The Spelling Error Ratio is the number of spelling errors divided by the text length.</p><p>Gunning-Fog Index (FOG). The FOG <ref type="bibr" target="#b9">[10]</ref> aims to estimate the years of formal education (according to the American system) a person needs to understand the text at first reading. This index uses the number of words, the number of sentences and the number of complex words. A word is considered complex if it has more than two syllables.</p><formula xml:id="formula_2">FOG = 0.4 [ (nbWords / nbSentences) + 100 (nbComplexWords / nbWords) ]<label>(2)</label></formula><p>Flesch Reading Ease (FK). The FK index <ref type="bibr" target="#b14">[15]</ref> indicates the difficulty of a text using the number of words, the number of sentences and the number of syllables.</p><formula xml:id="formula_3">FK = 206.835 − 1.015 (nbWords / nbSentences) − 84.6 (nbSyllables / nbWords)<label>(3)</label></formula><p>Automated Readability Index (ARI). The ARI <ref type="bibr" target="#b22">[23]</ref> approximates the US grade level needed to understand the text. This index uses the number of characters, the number of words and the number of sentences. Coleman-Liau Index (CLI). The CLI <ref type="bibr" target="#b4">[5]</ref>, like ARI, approximates the US grade level needed to understand the text. This index also uses the number of characters, words and sentences as components.</p></div>
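The formula blocks numbered (4) and (5), for ARI and CLI, did not survive extraction. Their standard textbook definitions, consistent with the components listed above (though not necessarily the exact variants used by the authors), are:

```latex
\begin{align}
ARI &= 4.71\left(\frac{nbChars}{nbWords}\right)
     + 0.5\left(\frac{nbWords}{nbSentences}\right) - 21.43 \tag{4}\\
CLI &= 0.0588\,L - 0.296\,S - 15.8 \tag{5}
\end{align}
```

where, for CLI, $L$ is the average number of letters per 100 words and $S$ the average number of sentences per 100 words.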
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Extensiveness</head><p>The textual part of the review consists of a text and a summary of this text. For both we measure the length in characters, respectively called Review Text Length and Summary Length.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">METHODOLOGY</head><p>Our approach is divided into three phases: metric extraction, model construction and skill evaluation. These phases are detailed below.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Metric extraction and feature selection</head><p>In the first phase, we calculate, for each review, the scores of the metrics presented in Section 3, which we use to build the model of helpfulness. We then apply feature selection to reduce the set of metrics by removing redundant ones, while avoiding losing too much information about the dataset. We use a greedy heuristic based on all pairwise correlations between metrics: among highly correlated metrics, only those most correlated with the helpfulness score are kept; the others are discarded. Finally, we normalize the scores in order to be independent of attribute ranges and units and to highlight the actual importance of each attribute. We use the Min-Max scaling normalization strategy.</p></div>
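The greedy selection and normalization described above can be sketched as follows; the correlation threshold of 0.8 and the `helpfulness` column name are assumptions for illustration, not values taken from the paper.

```python
import pandas as pd

def select_features(df, target="helpfulness", threshold=0.8):
    """Greedy correlation-based feature selection followed by min-max scaling.

    For each pair of metrics whose absolute pairwise correlation exceeds
    `threshold`, discard the one less correlated with the target; then
    rescale the kept metrics to [0, 1].
    """
    corr = df.corr().abs()
    features = [c for c in df.columns if c != target]
    dropped = set()
    for i, a in enumerate(features):
        for b in features[i + 1:]:
            if a in dropped or b in dropped:
                continue
            if corr.loc[a, b] > threshold:
                # Keep whichever of the redundant pair better tracks helpfulness.
                dropped.add(a if corr.loc[a, target] < corr.loc[b, target] else b)
    kept = [f for f in features if f not in dropped]
    # Min-max scaling: makes scores independent of attribute ranges and units.
    scaled = (df[kept] - df[kept].min()) / (df[kept].max() - df[kept].min())
    return scaled, kept
```

Given a DataFrame with one column per metric plus the helpfulness column, this returns the reduced, normalized metric matrix along with the names of the kept metrics.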
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Model construction</head><p>We build our model to measure the quality of a review, where quality is defined by the helpfulness ratio of the review:</p><formula xml:id="formula_4">helpfulness = nbHelpfulVotes / nbVotes<label>(6)</label></formula><p>where nbHelpfulVotes is the number of positive votes received by the review and nbVotes is the total number of votes received by the review. This ratio constitutes the class attribute of a supervised machine learning method used to build our simple model of helpfulness as a linear combination of the metrics. Thus, our predicted output variable y ∈ R is expressed as a weighted sum of input features x_i, ∀i ∈ [1, m], m being the number of features:</p><formula xml:id="formula_5">y = Σ_{i=1}^{m} ω_i × x_i + b<label>(7)</label></formula><p>where ω_i ∈ R is the weight reflecting the contribution of feature i to the overall decision and b ∈ R stands for the bias.</p><p>We restrict our study to linear models for two main reasons. First, these models are simpler and can be computed more efficiently. Second, they allow for a direct interpretation of the contribution of each feature to the final helpfulness decision. To this end, we try a variety of methods and keep the one best fitting the dataset.</p><p>In our tests, error measurement is done using the classical correlation coefficient, Efron's R², MAE and RMSE scores.</p></div>
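A minimal sketch of fitting the linear model of Eq. (7) to the helpfulness ratio of Eq. (6) by ordinary least squares; this is one of several linear methods the paper could have tried, not necessarily the one the authors kept.

```python
import numpy as np

def fit_helpfulness_model(X, votes_helpful, votes_total):
    """Fit y = sum_i w_i * x_i + b by least squares.

    X: (n_reviews, n_features) matrix of normalized metric scores.
    The target is the helpfulness ratio of Eq. (6), in [0, 1].
    """
    y = votes_helpful / votes_total               # helpfulness ratio
    A = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs[:-1], coeffs[-1]                # weights w_i and bias b

def predict(X, w, b):
    """Predicted helpfulness for new reviews' metric vectors."""
    return X @ w + b
```

Because the model is linear, each weight w_i can be read directly as the contribution of metric i to the predicted helpfulness.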
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Skill evaluation</head><p>In this last phase, we apply Knowledge Tracing (KT) to sequences of reviews in order to estimate reviewers' skills. We proceed as follows: we group the reviews by reviewer, obtaining one sequence of reviews per reviewer. Each review is considered an opportunity to learn the skill (i.e., being able to write useful reviews) and is graded with a score representing the reviewer's performance (i.e., how useful the review is). We compute two KT scores: (i) directly from helpfulness ratings, and (ii) from the learned helpfulness model. In the former, the reviewer's performance is calculated as the helpfulness score of the review. In the latter, it is predicted by the helpfulness model. In both cases, the final score, output by the KT model, expresses the probability that the skill is mastered by the reviewer.</p><p>We use the continuous version of KT described in <ref type="bibr" target="#b24">[25]</ref> since the scores we consider are continuous. In this extension of KT, P (G) and P (S ) are assumed to follow a Gaussian distribution and, as such, are represented by a mean value and a standard deviation. As a consequence, and as opposed to binary KT, the prediction P (L n ) also follows a Gaussian distribution, whose mean expresses the value of the prediction and whose standard deviation expresses the confidence attached to this prediction. To learn the 6 parameters of continuous KT, we extend the approach proposed by Hawkins et al. <ref type="bibr" target="#b10">[11]</ref> so that it outputs estimates of P (G) and P (S ) described by a mean and a standard deviation. Then, based on these 6 parameters, the estimation of each skill acquisition P (L n ) is performed by running 100 tests, with randomly generated values for P (G) and P (S ) following their respective distributions. 
From these 100 P (L n ) estimates, we compute a mean and a standard deviation following the normal hypothesis.</p><p>However, KT efficiency is known to depend on the granularity of the skills fed to the model: generally, the more focused the skills, the better the prediction of skill acquisition. In this respect, each of the features that feed our linear predictive model of helpfulness can be considered a sub-skill related to helpfulness. For this reason, we define two distinct tests to evaluate the learned model of helpfulness. In the first, we simply use the output of the linear regression model as the predicted helpfulness for a review. In the second, we consider each feature metric as a possible sub-skill evaluation of the reviewer. We then learn as many KT models as there are features. In the end, we have the probabilities that the sub-skills corresponding to each feature are acquired. These sub-skill scores are then aggregated into one single skill acquisition probability.</p><p>The global validation of our proposal is given by measuring the error between the KT based on real ratings, the KT based on the general linear model and the KT based on aggregated feature-based models. This error is evaluated by RMSE, which has been shown to be the strongest performance indicator for binary KT, with significantly higher correlation than Log Likelihood and Area Under Curve <ref type="bibr" target="#b20">[21]</ref>.</p></div>
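The Monte Carlo estimation of P (L n ) can be sketched as below. This Python sketch uses the standard binary BKT update as a stand-in for the exact continuous update of [25], which is not reproduced here; the clipping bounds, parameter values and function names are all assumptions.

```python
import random
import statistics

def bkt_update(L, correct, T, G, S):
    """One standard (binary) BKT step: Bayesian posterior on mastery
    given the observation, then the learning transition P(T). Used as
    a stand-in for the continuous update of [25]."""
    if correct:
        post = L * (1 - S) / (L * (1 - S) + (1 - L) * G)
    else:
        post = L * S / (L * S + (1 - L) * (1 - G))
    return post + (1 - post) * T

def monte_carlo_Ln(outcomes, L0, T, g_mean, g_std, s_mean, s_std,
                   runs=100, seed=0):
    """Estimate P(L_n) by running 100 tests with P(G) and P(S) drawn
    from their Gaussian distributions, as in Section 4.3."""
    rng = random.Random(seed)
    finals = []
    for _ in range(runs):
        # draw P(G) and P(S), clipped to a valid probability range
        G = min(max(rng.gauss(g_mean, g_std), 1e-3), 0.499)
        S = min(max(rng.gauss(s_mean, s_std), 1e-3), 0.499)
        L = L0
        for correct in outcomes:
            L = bkt_update(L, correct, T, G, S)
        finals.append(L)
    # normal hypothesis: summarize the 100 estimates by mean and std dev
    return statistics.mean(finals), statistics.stdev(finals)
```

A sequence of consistently good reviews drives the estimated mastery probability upward, while the spread of the 100 runs quantifies the confidence in that estimate.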
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">EXPERIMENTS</head><p>Our implementation is done in Java 8, with Weka 3.8 for model learning. We used our own implementation of knowledge tracing, whose code has been made available on GitHub 1 as one contribution of this paper. For polarity extraction, we use SentiWordNet <ref type="bibr" target="#b1">[2]</ref>, which lists the positivity, negativity and objectivity of each synset (set of synonyms). SentiWordNet provides the score of each word together with its part of speech, hence we perform POS tagging for each word using the Stanford POS tagging library <ref type="bibr" target="#b23">[24]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Dataset description</head><p>The dataset we use for experiments is the Amazon Book Review Data provided by Julian McAuley from UCSD <ref type="bibr" target="#b11">[12]</ref>. We select the book category of this dataset, resulting in 22,507,155 reviews.</p><p>As one of our goals is to measure the evolution of the ability to write reviews of good quality, we need to obtain for each reviewer a sequence of reviews long enough to observe that evolution. Therefore, we consider reviewers with fewer than 30 reviews as not active enough and filter them out. In addition, we only consider the reviews that have been scored by customers by means of votes (helpful review or not).</p><p>To confirm the hypothesis that few reviewers have written many reviews and that many reviewers have written few reviews, we plotted in Figure <ref type="figure" target="#fig_3">2</ref> the number of reviewers (on a logarithmic scale) by number of reviews, for reviewers with more than 30 reviews. Each point (x, y) in this figure indicates that x reviewers have written y reviews. Furthermore, we found reviewers writing so many reviews that their activity is dubious and possibly biases their reviews. For instance, the reviewer with ID A14OJS0VWMOSWO wrote 43,201 reviews with an average score of 4.9991 out of 5. This reviewer received 240,262 votes, of which 199,573 are helpful. In our opinion, such reviewers introduce a bias in the dataset. Hence we limited our experiment to reviewers that have 30 to 50 reviews.</p><p>We calculate the score of each feature from the dataset and calculate their standard deviations, reported in the last column of Table <ref type="table" target="#tab_1">2</ref>. 
The standard deviation of the helpfulness score, which varies in [0,1], is 0.32, which indicates that the score is quite spread out and that the dataset has a wide enough variety, ranging from helpful to unhelpful reviews. Moreover, the standard deviations of the features indicate that creating a model from this dataset is difficult. (1: https://github.com/Cubiccl/Continuous-Knowledge-Tracing/releases/tag/1.0)</p></div>
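The dataset filtering described in this section can be sketched as follows. This is a hypothetical Python sketch; the field names ('reviewer_id', 'nb_votes') are illustrative and not the dataset's actual schema.

```python
from collections import defaultdict

def filter_reviews(reviews, lo=30, hi=50):
    """Keep only voted-on reviews from reviewers having between lo
    and hi such reviews, as in Section 5.1."""
    voted = [r for r in reviews if r["nb_votes"] > 0]   # voted reviews only
    by_reviewer = defaultdict(list)
    for r in voted:
        by_reviewer[r["reviewer_id"]].append(r)
    # drop inactive (< lo) and suspiciously prolific (> hi) reviewers
    return {rid: rs for rid, rs in by_reviewer.items() if lo <= len(rs) <= hi}
```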
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Model construction</head><p>We now describe how the model of helpfulness is learned from the dataset. Consistently with <ref type="bibr" target="#b15">[16]</ref>, our model of helpfulness is constructed as a linear combination of the metrics extracted from the review text and summary. More precisely, as explained in Section 4, we use a linear classifier to learn a weight for each of the features introduced in the previous section, in order to understand its contribution to the helpfulness score. We tested three different approaches to learn the feature weights: Linear Regression, Perceptron and Support Vector Machine with a linear kernel. We used off-the-shelf Weka algorithms with 10-fold cross-validation. Table <ref type="table" target="#tab_6">5</ref> summarizes the results of those tests, for various dataset sizes selected according to the minimum number of votes for the reviews (from 918 reviews with at least 200 votes, up to 522804 reviews with at least 1 vote). Results for Perceptron and SVM are not reported for the largest dataset due to excessive computation time. The results show that linear regression achieves a good compromise between accuracy and computation time, with better accuracy on smaller datasets while handling larger datasets with no significant drop in accuracy. We therefore chose to work with linear regression in what follows.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.1">Preprocessing.</head><p>We recall that our definition of helpfulness is the number of helpful votes divided by the total number of votes; hence, a review with a large number of votes is a genuine representation of helpfulness from a customer's point of view. But a review with only one vote, if that vote is helpful, still obtains the maximum helpfulness score, which is not desirable. Filtering the dataset by number of votes therefore becomes necessary. In order to find the appropriate minimum number of votes per review, we iterated this parameter from 1 to 25 for the most important features of our model (i.e., after feature selection), and checked the results in terms of correlation and expressiveness (contribution of each metric), reported in Table <ref type="table" target="#tab_3">3</ref>. We decided to choose 2 datasets among those tested, based on, first, expressiveness (determined by the non-zero values of the coefficients in the linear model) and, second, correlation coefficient (which indicates to what extent the model matches the dataset), for more than 10,000 reviews. The best expressiveness and correlation coefficients were obtained for at least 12 votes and at least 23 votes, respectively. At this stage, we cannot anticipate the effect of this parameter on the knowledge tracing model. Therefore, we keep both datasets, to see which gives a better result with the knowledge tracing model. In what follows, the first dataset is called minVotes = 12 and consists of 41,681 reviews, while the second is called minVotes = 23 and consists of 11,083 reviews.</p><p>Using linear regression on the two datasets minVotes = 12 and minVotes = 23 results in the models described in Tables <ref type="table" target="#tab_8">6 and 7</ref> respectively. The models constructed are evaluated with the correlation coefficient, Efron's R 2 , MAE and RMSE scores, reported in Table <ref type="table" target="#tab_9">8</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.2">Feature selection impact.</head><p>We then proceed to feature selection, as described in Section 4.1. As shown in Table 8, our models before and after feature selection achieve very similar accuracy results. If efficiency in learning the model is an issue, or if the model should remain as simple as possible, one can then safely decide to use the model learned on only the selected features. In what follows, we report the results for both sets of features.</p><p>A second lesson learned from our feature selection step is that, interestingly, for both datasets, the selected features include features that were not present in <ref type="bibr" target="#b15">[16]</ref>, namely Spelling Error Ratio, polarity and deviation. With the notable exception of Summary Spelling Error Ratio, these features' weights remain steady, and in some cases relatively important, after feature selection. Quite surprisingly, reviewTextSER has no impact on helpfulness, while, as expected, deviation contributes highly negatively to it.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.3">Comparison with the state-of-the-art.</head><p>As to model accuracy, Table <ref type="table" target="#tab_9">8</ref> shows that the results we obtained are comparable to, and in some cases slightly better than, those reported in <ref type="bibr" target="#b15">[16]</ref> on datasets of similar size (37,221 Amazon UK reviews were analyzed in that work). In that work, 3 models were constructed, and their fitness to the dataset was reported in terms of Efron's R 2 scores. Their three models obtained respectively 0.316, 0.354 and 0.451, while ours score 0.3697 for minVotes = 12 and 0.4651 for minVotes = 23 (the higher the better for Efron's R 2 ). Importantly, their models incorporate the features number of votes and number of helpful votes, which we have deliberately not included in ours, since we aim at predicting helpfulness when no such scores are available.</p><p>Finally, the two datasets minVotes = 12 and minVotes = 23 achieve comparable MAE and RMSE, even though minVotes = 23 shows a better correlation coefficient and Efron's R 2 . This illustrates the robustness of our model construction approach to larger but more skewed datasets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3">Skill evaluation</head><p>In this section, we show that the model obtained can be used to accurately predict the learning of the skill of writing helpful reviews.</p><p>After training the Knowledge Tracing (KT) model as explained in Section 4.3 using 10-fold cross-validation, we obtain the average of the six parameters and the KT model RMSE scores. We also learn one KT model per sub-skill and aggregate them to obtain a single probability, as explained in Section 4.3. To be consistent with the learning of the linear regression model, this aggregation is done with the weights learned for this model. The results are reported in Table <ref type="table" target="#tab_10">9</ref> and Table <ref type="table" target="#tab_0">10</ref>, which give the KT parameters, predictions and predictive accuracy before and after feature selection, respectively. Each table shows the average skill acquisition probability (mean(L n )) for the actual helpfulness skill, the helpfulness model and the aggregation of the sub-skills. We also report the parameters learned for the KT of the model. Before commenting on the results, note that the average value of the helpfulness skill acquisition probability (i.e., the value to be predicted) is high. We conjecture that this is due to the importance of the filtering, in terms of number of reviews per reviewer and number of votes, applied over the dataset.</p></div>
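The aggregation of sub-skill probabilities into one skill acquisition probability might look as follows. This Python sketch assumes a weighted average under normalized absolute weights, which is our reading of "weights learned for this model"; the paper normalizes the learned weights with the bias included, a detail omitted here.

```python
def aggregate_subskills(probs, weights):
    """Combine per-feature sub-skill acquisition probabilities into a
    single probability, weighting each sub-skill by the (normalized,
    absolute) coefficient of its feature in the linear model.
    Assumption: the paper's exact normalization is not reproduced."""
    total = sum(abs(w) for w in weights)
    return sum(abs(w) / total * p for w, p in zip(weights, probs))
```

With this scheme, a sub-skill tied to a heavily weighted feature (e.g., rating or review length) dominates the aggregated acquisition probability.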
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.1">Accuracy of the two KT models.</head><p>The key observation is that switching to KT achieves very good to excellent RMSE scores, whichever dataset is considered. Notably, predicting the skill of writing helpful reviews is done much more accurately than predicting helpfulness itself. This allows us to answer positively the question expressed at the beginning of this paper: a model constructed on a large dataset can be used to assess procedural knowledge acquisition. Interestingly, predicting each sub-skill (corresponding to each feature) and combining these predictions to infer the global skill of writing helpful reviews is significantly better than predicting the skill at the coarse level of the model. In our tests, this combination was done naively with the weights learned using the linear regression algorithm to build the model of helpfulness, normalized, bias included. Determining more sophisticated weight combinations is left for future work. The small RMSE indicates that the KT model is good at predicting the learning of the reviewers' writing skill. However, in order to validate the hypothesis that these good results do not stem from an intrinsic smoothing behavior of the KT model, we ran the model on random sequences of helpfulness scores. To this end, we generated as many sequences as the original dataset has and replaced the helpfulness scores with random numbers between 0 and 1. The results, reported in Table <ref type="table" target="#tab_11">11</ref>, confirm that for both datasets the RMSE values are poor. This shows that for random sequences of helpfulness scores, the model fails to predict the skill of the reviewers (which, in this case, is expectedly close to 0.5).</p></div>
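The random-baseline check can be illustrated numerically: uniform random helpfulness scores average about 0.5, and even the best constant predictor (0.5) incurs an RMSE near sqrt(1/12) ≈ 0.29 against them, so high error on random sequences is expected. A hypothetical sketch (function names and parameters are illustrative):

```python
import random

def random_sequences(n_seqs, seq_len, seed=7):
    """Fake helpfulness sequences: uniform random scores in [0, 1)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(seq_len)] for _ in range(n_seqs)]

def rmse_to_constant(seqs, c=0.5):
    """RMSE of the best constant predictor against the fake scores;
    a floor on what any smooth predictor can achieve here."""
    vals = [v for s in seqs for v in s]
    return (sum((v - c) ** 2 for v in vals) / len(vals)) ** 0.5
```

Any KT trajectory tracking such sequences cannot do better than this floor, which is consistent with the poor RMSE values of Table 11.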
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">CONCLUSION</head><p>In this paper, we experimented with a large dataset of Amazon book reviews to show that a model of review helpfulness can be used to assess the acquisition of the skill of writing helpful reviews. Learning such an individual model of procedural knowledge acquisition has the advantage of being less prone to human variation and subjectivity (e.g., in judging the helpfulness of a review) and of not requiring a precise definition of a hard-to-define skill, which is replaced by a model learned over the dataset. In our experiments, we modeled the quality of a review as a linear combination of metrics stemming from text analysis (such as readability, polarity, spelling errors or length), and we used customer-declared helpfulness as ground truth for constructing the model. This model achieves comparable to slightly better accuracy results when compared to a state-of-the-art approach. We used Bayesian Knowledge Tracing (KT), a popular model of skill acquisition, to measure the evolution of the ability to write reviews of good quality over a period of time. Our tests validated our hypothesis, showing that the model of skill acquisition achieves a very good to near-perfect accuracy score.</p><p>Our short-term future work includes the revision of both the helpfulness model and the skill acquisition model. In particular, the helpfulness model can be extended with advanced features like sentiment analysis or reviewer profile features, while Deep Knowledge Tracing could be used instead of classical Knowledge Tracing. We also want to better understand the relation between the linear coefficients learned for the helpfulness model and the KT parameters of the corresponding sub-skills. Long-term goals include the generalization of our approach to other datasets and skills. 
We are particularly interested in better understanding the contexts in which skill acquisition with model building is more relevant than building the model alone.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Evolution of helpfulness for a reviewer and different models of it</figDesc><graphic coords="2,127.56,85.60,340.16,170.08" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>ARI = 4</head><label>4</label><figDesc></figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>CLI = 5</head><label>5</label><figDesc></figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Number of reviews by number of reviewers</figDesc><graphic coords="7,127.55,85.61,340.18,170.07" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Summary of the main features</figDesc><table><row><cell>Feature name</cell><cell cols="2">Category</cell><cell cols="3">Applies to Range</cell></row><row><cell>rating</cell><cell cols="2">Conformity</cell><cell></cell><cell>all</cell><cell>[1, 5]</cell></row><row><cell>polarityReviewText</cell><cell cols="2">Conformity</cell><cell></cell><cell>text</cell><cell>[-1,1]</cell></row><row><cell>polaritySummary</cell><cell cols="2">Conformity</cell><cell></cell><cell>summary</cell><cell>[-1,1]</cell></row><row><cell>deviation</cell><cell cols="2">Conformity</cell><cell></cell><cell>all</cell><cell>[0,5]</cell></row><row><cell>reviewTextSER</cell><cell cols="2">Readability</cell><cell></cell><cell>text</cell><cell>[0,1]</cell></row><row><cell>summarySER</cell><cell cols="2">Readability</cell><cell></cell><cell>summary</cell><cell>[0,1]</cell></row><row><cell>reviewTextFOG</cell><cell cols="2">Readability</cell><cell></cell><cell>text</cell><cell>R +</cell></row><row><cell>summaryFOG</cell><cell cols="2">Readability</cell><cell></cell><cell>summary</cell><cell>R +</cell></row><row><cell>reviewTextFK</cell><cell cols="2">Readability</cell><cell></cell><cell>text</cell><cell>R</cell></row><row><cell>summaryFK</cell><cell cols="2">Readability</cell><cell></cell><cell>summary</cell><cell>R</cell></row><row><cell>reviewTextARI</cell><cell cols="2">Readability</cell><cell></cell><cell>text</cell><cell>R</cell></row><row><cell>summaryARI</cell><cell cols="2">Readability</cell><cell></cell><cell>summary</cell><cell>R</cell></row><row><cell>reviewTextCLI</cell><cell cols="2">Readability</cell><cell></cell><cell>text</cell><cell>R</cell></row><row><cell>summaryCLI</cell><cell cols="2">Readability</cell><cell></cell><cell>summary</cell><cell>R</cell></row><row><cell>reviewTextLength</cell><cell cols="2">Extensiveness</cell><cell></cell><cell>text</cell><cell>N 
+</cell></row><row><cell>summaryLength</cell><cell cols="2">Extensiveness</cell><cell></cell><cell>summary</cell><cell>N +</cell></row><row><cell>Feature name</cell><cell>Min</cell><cell>Max</cell><cell></cell><cell cols="2">Mean Std Dev.</cell></row><row><cell>rating</cell><cell>1</cell><cell cols="2">5</cell><cell>4.112</cell><cell>1.183</cell></row><row><cell>polarityReviewText</cell><cell>-0.875</cell><cell cols="2">0.875</cell><cell>0.027</cell><cell>0.052</cell></row><row><cell>polaritySummary</cell><cell>-0.875</cell><cell cols="2">1</cell><cell>0.029</cell><cell>0.137</cell></row><row><cell>deviation</cell><cell>0</cell><cell cols="2">3.786</cell><cell>0.452</cell><cell>0.615</cell></row><row><cell>reviewTextSER</cell><cell>0</cell><cell cols="2">0.5</cell><cell>0.009</cell><cell>0.008</cell></row><row><cell>summarySER</cell><cell>0</cell><cell cols="2">1</cell><cell>0.014</cell><cell>0.038</cell></row><row><cell>reviewTextFOG</cell><cell>0</cell><cell cols="2">740.8</cell><cell>13.983</cell><cell>8.45</cell></row><row><cell>summaryFOG</cell><cell>0</cell><cell cols="2">42.4</cell><cell>9.524</cell><cell>10.038</cell></row><row><cell>reviewTextFK</cell><cell>-1788.235</cell><cell cols="2">121.22</cell><cell>58.96</cell><cell>24.407</cell></row><row><cell>summaryFK</cell><cell cols="3">-1824.58 121.728</cell><cell>59.537</cell><cell>51.228</cell></row><row><cell>reviewTextARI</cell><cell cols="3">-6.837 919.088</cell><cell>11.41</cell><cell>10.374</cell></row><row><cell>summaryARI</cell><cell>-16.22</cell><cell cols="2">261.67</cell><cell>5.162</cell><cell>7.769</cell></row><row><cell>reviewTextCLI</cell><cell>-22.24</cell><cell cols="2">39.133</cell><cell>8.64</cell><cell>2.549</cell></row><row><cell>summaryCLI</cell><cell>-58.13</cell><cell cols="2">307.6</cell><cell>5.387</cell><cell>9.417</cell></row><row><cell>reviewTextLength</cell><cell>0</cell><cell cols="4">32669 1152.094 1261.787</cell></row><row><cell>summaryLength</cell><cell>1</cell><cell 
cols="2">257</cell><cell>28.875</cell><cell>16.786</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>Empirical Values of Metrics</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3 :</head><label>3</label><figDesc>Correlation Coefficient for various minVotes</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 4 :</head><label>4</label><figDesc>Coefficient of Linear Regression Model for minV otes = 12 and minV otes = 23</figDesc><table><row><cell>Algorithm</cell><cell cols="2">Dataset Exec.</cell><cell cols="2">Correlation RMSE</cell></row><row><cell></cell><cell>size</cell><cell>time</cell><cell>coefficient</cell><cell>score</cell></row><row><cell cols="2">Linear Regression 918</cell><cell>0.01</cell><cell>0.6455</cell><cell>0.2005</cell></row><row><cell>Perceptron</cell><cell>918</cell><cell>0.12</cell><cell>0.5071</cell><cell>0.2635</cell></row><row><cell>SVM</cell><cell>918</cell><cell>0.25</cell><cell>0.6352</cell><cell>0.218</cell></row><row><cell cols="2">Linear Regression 3414</cell><cell>0.02</cell><cell>0.7226</cell><cell>0.1957</cell></row><row><cell>Perceptron</cell><cell>3414</cell><cell>0.44</cell><cell>0.5135</cell><cell>0.2569</cell></row><row><cell>SVM</cell><cell>3414</cell><cell>6.39</cell><cell>0.7199</cell><cell>0.1992</cell></row><row><cell cols="2">Linear Regression 10971</cell><cell>0.02</cell><cell>0.6888</cell><cell>0.2023</cell></row><row><cell>Perceptron</cell><cell>10971</cell><cell>1.39</cell><cell>0.5349</cell><cell>0.2658</cell></row><row><cell>SVM</cell><cell>10971</cell><cell cols="2">101.87 0.6846</cell><cell>0.2062</cell></row><row><cell cols="2">Linear Regression 29808</cell><cell>0.04</cell><cell>0.6303</cell><cell>0.2064</cell></row><row><cell>Perceptron</cell><cell>29808</cell><cell>3.81</cell><cell>0.499</cell><cell>0.2401</cell></row><row><cell>SVM</cell><cell>29808</cell><cell cols="2">829.65 0.627</cell><cell>0.2119</cell></row><row><cell cols="3">Linear Regression 522801 0.67</cell><cell>0.3352</cell><cell>0.3028</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 5 :</head><label>5</label><figDesc>Test of 3 linear model algorithms on various datasets</figDesc><table><row><cell>minV ot es = 12</cell><cell>Before</cell><cell>After</cell></row><row><cell>rating</cell><cell>0.31117594</cell><cell>0.31312877</cell></row><row><cell cols="2">polarityReviewText 0.36708846</cell><cell>0.3655654</cell></row><row><cell>polaritySummary</cell><cell>0.05166703</cell><cell>0.05351795</cell></row><row><cell>deviation</cell><cell cols="2">-0.20847153 -0.20951913</cell></row><row><cell>reviewTextSER</cell><cell>0</cell><cell>-0.03361242</cell></row><row><cell>summarySER</cell><cell cols="2">-0.28603436 -0.31027976</cell></row><row><cell>reviewTextFOG</cell><cell cols="2">-1.10263702 N.A</cell></row><row><cell>summaryFOG</cell><cell>0</cell><cell>-0.04014441</cell></row><row><cell>reviewTextFK</cell><cell>4.37638627</cell><cell>0.4228708</cell></row><row><cell>summaryFK</cell><cell>0.12251469</cell><cell>N.A</cell></row><row><cell>reviewTextARI</cell><cell>5.01873535</cell><cell>N.A</cell></row><row><cell>summaryARI</cell><cell>-0.4099729</cell><cell>N.A</cell></row><row><cell>reviewTextCLI</cell><cell>0.31215745</cell><cell>0.04970302</cell></row><row><cell>summaryCLI</cell><cell>0.79694206</cell><cell>0.40990694</cell></row><row><cell>reviewTextLength</cell><cell>0.30807426</cell><cell>0.3077809</cell></row><row><cell>summaryLength</cell><cell>0</cell><cell>0.03922442</cell></row><row><cell>bias</cell><cell cols="2">-4.26391009 -0.12802418</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_7"><head>Table 6 :</head><label>6</label><figDesc>Models of helpfulness before and after feature selection for minV otes = 12</figDesc><table><row><cell>minV ot es = 23</cell><cell>Before</cell><cell>After</cell></row><row><cell>rating</cell><cell>0.37121056</cell><cell>0.37369313</cell></row><row><cell cols="2">polarityReviewText 0.27873667</cell><cell>0.28253483</cell></row><row><cell>polaritySummary</cell><cell>0.08006764</cell><cell>0.0821465</cell></row><row><cell>deviation</cell><cell>-0.1951008</cell><cell>-0.19656865</cell></row><row><cell>reviewTextSER</cell><cell>0</cell><cell>0</cell></row><row><cell>summarySER</cell><cell cols="2">-0.25242002 -0.29930955</cell></row><row><cell>reviewTextFOG</cell><cell cols="2">-0.40678142 N.A</cell></row><row><cell>summaryFOG</cell><cell>0.02506183</cell><cell>-0.021072</cell></row><row><cell>reviewTextFK</cell><cell>2.3020136</cell><cell>0.13767929</cell></row><row><cell>summaryFK</cell><cell>0.13780373</cell><cell>N.A</cell></row><row><cell>reviewTextARI</cell><cell>2.47411051</cell><cell>N.A</cell></row><row><cell>summaryARI</cell><cell cols="2">-0.10677126 N.A</cell></row><row><cell>reviewTextCLI</cell><cell>0.21371702</cell><cell>0</cell></row><row><cell>summaryCLI</cell><cell>0.28470061</cell><cell>0.13525824</cell></row><row><cell>reviewTextLength</cell><cell>0.3431656</cell><cell>0.34368253</cell></row><row><cell>summaryLength</cell><cell>0.03837902</cell><cell>0.07231388</cell></row><row><cell>bias</cell><cell cols="2">-2.21159487 0.13145667</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_8"><head>Table 7 :</head><label>7</label><figDesc>Models of helpfulness before and after feature selection for minV otes = 23</figDesc><table><row><cell>Evaluation Metrics</cell><cell cols="2">minV otes = 12 minV otes = 23</cell></row><row><cell>Total Number of Reviews</cell><cell>41681</cell><cell>11083</cell></row><row><cell cols="2">|Before feature selection|</cell><cell></cell></row><row><cell>Correlation Coefficient</cell><cell>0.608</cell><cell>0.682</cell></row><row><cell>Efron's R 2</cell><cell>0.3697</cell><cell>0.4651</cell></row><row><cell>Mean Absolute Error</cell><cell>0.1521</cell><cell>0.1494</cell></row><row><cell>Root Mean Squared Error</cell><cell>0.2014</cell><cell>0.201</cell></row><row><cell cols="2">|After feature selection|</cell><cell></cell></row><row><cell>Correlation Coefficient</cell><cell>0.6065</cell><cell>0.6804</cell></row><row><cell>Efron's R 2</cell><cell>0.3678</cell><cell>0.4629</cell></row><row><cell>Mean Absolute Error</cell><cell>0.1526</cell><cell>0.15</cell></row><row><cell>Root Mean Squared Error</cell><cell>0.2017</cell><cell>0.2014</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_9"><head>Table 8 :</head><label>8</label><figDesc>Evaluation of the models in</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_10"><head>Table 9 :</head><label>9</label><figDesc>KT parameters, prediction and predictive accuracy before feature selectionFor the sake of readability, we recall that RMSE scores are generated in three ways:• RMSE as reported in table 8 represents the error between the helpfulness model scores and the actual helpfulness scores, without KT involved at that point. • actual-model Knowledge RMSE (a-mKRMSE) represents the error between the KT of the actual helpfulness scores and the KT of the helpfulness as computed with the model. • actual-Aggregated Knowledge RMSE (a-AggKRMSE) represents the error between the KT of the actual helpfulness scores and the aggregation of the KT scores of each feature taken independently (i.e., each sub-skill). Before commenting the results of the tests, it is important to note that the average value of the helpfulness skill acquisition</figDesc><table><row><cell></cell><cell></cell><cell cols="2">minVotes = 12 minVotes = 23</cell></row><row><cell cols="2">Actual mean(L n )</cell><cell>0.968337</cell><cell>0.960511</cell></row><row><cell>skill</cell><cell>variation(L n )</cell><cell>0.025213</cell><cell>0.033238</cell></row><row><cell></cell><cell>P (L 0 )</cell><cell>0.007504</cell><cell>0.033457</cell></row><row><cell></cell><cell>P (T )</cell><cell>0.030262</cell><cell>0.077669</cell></row><row><cell></cell><cell>mean(P (G))</cell><cell>0.349992</cell><cell>0.369982</cell></row><row><cell cols="3">Model variation(P (G)) 0.007067</cell><cell>0.0147</cell></row><row><cell></cell><cell>mean(P (S ))</cell><cell>0.412574</cell><cell>0.412882</cell></row><row><cell></cell><cell cols="2">variation(P (S )) 0.016212</cell><cell>0.025905</cell></row><row><cell></cell><cell>mean(L n )</cell><cell>0.783885</cell><cell>0.800687</cell></row><row><cell></cell><cell>variation(L n )</cell><cell>0.090915</cell><cell>0.090820</cell></row><row><cell cols="2">Aggre-mean(L n 
)</cell><cell>0.999943</cell><cell>0.999991</cell></row><row><cell>gated</cell><cell>variation(L n )</cell><cell>0.000584</cell><cell>0.000122</cell></row><row><cell></cell><cell>a-mKRMSE</cell><cell>0.164619</cell><cell>0.156373</cell></row><row><cell></cell><cell>a-AggKRMSE</cell><cell>0.064818</cell><cell>0.081964</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_11"><head>Table 11 :</head><label>11</label><figDesc>KT parameters, prediction and predictive accuracy for random sequences of helpfulness 5.3.2 Comparison with random sequences of helpfulness scores.</figDesc><table><row><cell></cell><cell cols="2">minVotes = 12 minVotes = 23</cell></row><row><cell>P (L 0 )</cell><cell>0</cell><cell>0</cell></row><row><cell>P (T )</cell><cell>0.014637</cell><cell>0.016329</cell></row><row><cell>mean(P (G))</cell><cell>0.346532</cell><cell>0.345426</cell></row><row><cell cols="2">variation(P (G)) 0.007084</cell><cell>0.014245</cell></row><row><cell>mean(P ((S ))</cell><cell>0.409464</cell><cell>0.409721</cell></row><row><cell cols="2">variation(P (S )) 0.016246</cell><cell>0.029494</cell></row><row><cell>mean(L n )</cell><cell>0.518769</cell><cell>0.518702</cell></row><row><cell>variation(L n )</cell><cell>0.110546</cell><cell>0.106100</cell></row><row><cell>a-mKRMSE</cell><cell>0.673532</cell><cell>0.669856</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Online Review Helpfulness: Role of Qualitative Factors</title>
		<author>
			<persName><forename type="first">Arpita</forename><surname>Agnihotri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Saurabh</forename><surname>Bhattacharya</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Psychology &amp; Marketing</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="1006" to="1017" />
			<date type="published" when="2016-12">Dec. 2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining</title>
		<author>
			<persName><forename type="first">Stefano</forename><surname>Baccianella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrea</forename><surname>Esuli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fabrizio</forename><surname>Sebastiani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">LREC</title>
				<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Studying Knowledge Acquisition: Distinctions among Procedural, Conceptual and Logical Knowledge</title>
		<author>
			<persName><forename type="first">Kathleen</forename><forename type="middle">M</forename><surname>Cauley</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">67th Annual Meeting of the American Educational Research Association</title>
				<imprint>
			<date type="published" when="1986">1986</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">The effect of word of mouth on sales: Online book reviews</title>
		<author>
			<persName><forename type="first">Judith</forename><forename type="middle">A</forename><surname>Chevalier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dina</forename><surname>Mayzlin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of marketing research</title>
		<imprint>
			<biblScope unit="volume">43</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="345" to="354" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A computer readability formula designed for machine scoring</title>
		<author>
			<persName><forename type="first">Meri</forename><surname>Coleman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">L</forename><surname>Liau</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Applied Psychology</title>
		<imprint>
			<biblScope unit="volume">60</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page">283</biblScope>
			<date type="published" when="1975">1975</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Knowledge tracing: Modeling the acquisition of procedural knowledge</title>
		<author>
			<persName><forename type="first">Albert</forename><forename type="middle">T</forename><surname>Corbett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">John</forename><forename type="middle">R</forename><surname>Anderson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">User modeling and user-adapted interaction</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="253" to="278" />
			<date type="published" when="1994">1994</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Sequencing educational content in classrooms using Bayesian knowledge tracing</title>
		<author>
			<persName><forename type="first">Yossi</forename><surname>Ben David</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Avi</forename><surname>Segal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ya'akov</forename><surname>Gal</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">LAK</title>
		<imprint>
			<biblScope unit="page" from="354" to="363" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The EconoMining project at NYU: Studying the economic value of user-generated content on the internet</title>
		<author>
			<persName><forename type="first">Anindya</forename><surname>Ghose</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Panagiotis</forename><surname>Ipeirotis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Revenue and Pricing Management</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="241" to="246" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Comparing Knowledge Tracing and Performance Factor Analysis by Using Multiple Model Fitting Procedures</title>
		<author>
			<persName><forename type="first">Yue</forename><surname>Gong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joseph</forename><forename type="middle">E</forename><surname>Beck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Neil</forename><forename type="middle">T</forename><surname>Heffernan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ITS</title>
		<imprint>
			<biblScope unit="page" from="35" to="44" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">The technique of clear writing</title>
		<author>
			<persName><forename type="first">Robert</forename><surname>Gunning</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1952">1952</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Learning Bayesian Knowledge Tracing Parameters with a Knowledge Heuristic and Empirical Probabilities</title>
		<author>
			<persName><forename type="first">William</forename><forename type="middle">J</forename><surname>Hawkins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Neil</forename><forename type="middle">T</forename><surname>Heffernan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ryan Shaun Joazeiro De</forename><surname>Baker</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ITS</title>
		<imprint>
			<biblScope unit="page" from="150" to="155" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering</title>
		<author>
			<persName><forename type="first">Ruining</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Julian</forename><surname>McAuley</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">WWW</title>
		<imprint>
			<biblScope unit="page" from="507" to="517" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Research of online review helpfulness based on negative binomial regression model: the mediator role of reader participation</title>
		<author>
			<persName><forename type="first">Hong</forename><surname>Hong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Di</forename><surname>Xu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">12th International Conference on Service Systems and Service Management (ICSSSM)</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="1" to="5" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Do Negative Experiences Always Lead to Dissatisfaction? -Testing Attribution Theory in the Context of Online Travel Reviews</title>
		<author>
			<persName><forename type="first">Jingxian</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ulrike</forename><surname>Gretzel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rob</forename><surname>Law</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ENTER</title>
		<imprint>
			<biblScope unit="page" from="297" to="308" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">Derivation of new readability formulas (automated readability index, fog count and flesch reading ease formula) for navy enlisted personnel</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">Peter</forename><surname>Kincaid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Robert</forename><forename type="middle">P</forename><surname>Fishburne</surname><genName>Jr</genName></persName>
		</author>
		<author>
			<persName><forename type="first">Richard</forename><forename type="middle">L</forename><surname>Rogers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Brad</forename><forename type="middle">S</forename><surname>Chissom</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1975">1975</date>
		</imprint>
		<respStmt>
			<orgName>Naval Technical Training Command Millington TN Research Branch</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Technical Report</note>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Evaluating content quality and helpfulness of online product reviews: The interplay of review helpfulness vs. review content</title>
		<author>
			<persName><forename type="first">Nikolaos</forename><surname>Korfiatis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Elena</forename><surname>García-Bariocanal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Salvador</forename><surname>Sánchez-Alonso</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Electronic Commerce Research and Applications</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="205" to="217" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Modeling and Predicting the Helpfulness of Online Reviews</title>
		<author>
			<persName><forename type="first">Yang</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xiangji</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Aijun</forename><surname>An</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xiaohui</forename><surname>Yu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ICDM</title>
		<imprint>
			<biblScope unit="page" from="443" to="452" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews</title>
		<author>
			<persName><forename type="first">Julian</forename><forename type="middle">John</forename><surname>McAuley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jure</forename><surname>Leskovec</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">WWW</title>
		<imprint>
			<biblScope unit="page" from="897" to="908" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">What Makes a Helpful Online Review? A Study of Customer Reviews on Amazon.com</title>
		<author>
			<persName><forename type="first">Susan</forename><forename type="middle">M</forename><surname>Mudambi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><surname>Schuff</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">MIS Quarterly</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="185" to="200" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Performance Factors Analysis -A New Alternative to Knowledge Tracing</title>
		<author>
			<persName><forename type="first">Philip</forename><forename type="middle">I</forename><surname>Pavlik</surname><genName>Jr</genName></persName>
		</author>
		<author>
			<persName><forename type="first">Hao</forename><surname>Cen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kenneth</forename><forename type="middle">R</forename><surname>Koedinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AIED</title>
				<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="531" to="538" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Metrics for Evaluation of Student Models</title>
		<author>
			<persName><forename type="first">Radek</forename><surname>Pelánek</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">EDM</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Deep Knowledge Tracing</title>
		<author>
			<persName><forename type="first">Chris</forename><surname>Piech</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jonathan</forename><surname>Bassen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jonathan</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Surya</forename><surname>Ganguli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mehran</forename><surname>Sahami</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Leonidas</forename><forename type="middle">J</forename><surname>Guibas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jascha</forename><surname>Sohl-Dickstein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">NIPS</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="505" to="513" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Automated readability index</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">J</forename><surname>Senter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Edgar</forename><forename type="middle">A</forename><surname>Smith</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1967">1967</date>
		</imprint>
		<respStmt>
			<orgName>Univ. Cincinnati</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Technical Report</note>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network</title>
		<author>
			<persName><forename type="first">Kristina</forename><surname>Toutanova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dan</forename><surname>Klein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christopher</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yoram</forename><surname>Singer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">HLT-NAACL</title>
				<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Extending Knowledge Tracing to Allow Partial Credit: Using Continuous versus Binary Nodes</title>
		<author>
			<persName><forename type="first">Yutao</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Neil</forename><forename type="middle">T</forename><surname>Heffernan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AIED</title>
				<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="181" to="188" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Review popularity and review helpfulness: A model for user review effectiveness</title>
		<author>
			<persName><forename type="first">Jianan</forename><surname>Wu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Decision Support Systems</title>
		<imprint>
			<biblScope unit="volume">97</biblScope>
			<biblScope unit="page" from="92" to="103" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<title level="m" type="main">The Influences of Negativity and Review Quality on the Helpfulness of Online Reviews</title>
		<author>
			<persName><forename type="first">Philip Fei</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hans</forename><surname>Van Der Heijden</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nikolaos</forename><surname>Korfiatis</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2011">2011</date>
			<publisher>ICIS</publisher>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
