=Paper=
{{Paper
|id=Vol-3290/short_paper1885
|storemode=property
|title=Reviewer Preferences and Gender Disparities in Aesthetic
Judgments
|pdfUrl=https://ceur-ws.org/Vol-3290/short_paper1885.pdf
|volume=Vol-3290
|authors=Ida Marie S. Lassen,Yuri Bizzoni,Telma Peura,Mads Rosendahl Thomsen,Kristoffer Nielbo
|dblpUrl=https://dblp.org/rec/conf/chr/LassenBPTN22
}}
==Reviewer Preferences and Gender Disparities in Aesthetic
Judgments==
Reviewer Preferences and Gender Disparities in Aesthetic Judgments Ida Marie S. Lassen1 , Yuri Bizzoni1 , Telma Peura1 , Mads Rosendahl Thomsen2 and Kristo昀昀er Nielbo1 1 Center for Humanities Computing Aarhus, Aarhus University, Jens Chr. Skous Vej 4, Building 1483,DK-8000 Aarhus C 2 School of Communication and Culture - Comparative Literature, Aarhus University, Langelandsgade 139, Building 1580, DK-8000 Aarhus C Abstract Aesthetic preferences are considered highly subjective resulting in inherently noisy judgments of aes- thetic objects, yet certain aspects of aesthetic judgment display convergent trends over time. This paper presents a study that uses literary reviews as a proxy for aesthetic judgment in order to identify system- atic components that can be attributed to bias. Speci昀椀cally, we 昀椀nd that judgments of literary quality di昀昀er across media types and display a gender bias. In newspapers, male reviewers have a same-gender preference while female reviewers show an opposite–gender preference. On the other hand, in the bl- ogosphere female reviewers prefer female authors. While alternative accounts exist of this apparent gender disparity, we argue that it re昀氀ects a cultural gender antagonism that is necessary to take into account when doing computational assessment of aesthetics. Keywords aesthetic judgement, gender, bias analysis, literary review 1. Introduction Aesthetic judgments are notoriously complex and subject to considerable variation because aesthetic objects are complex (ex. literature is a complex linguistic phenomenon that con- veys information indirectly), aesthetic preferences are subjective (ex. readers have di昀昀erent aesthetic preferences), and there is a general lack of a shared measurement (ex. there is no de昀椀nitive metric to measure aesthetics or aesthetic judgments). Literary quality, for instance, can be considered one of the most subjective 昀椀elds of evaluation, and variation is mostly at- tributable to noise introduced by individual preferences. Yet the perception of literary quality from large amounts of readers over time does show convergent trends: communities tend to establish and update canons [12]; speci昀椀c texts and narratives manage to remain popular [22] despite the changing of fashions and political phases and certain author names become epony- mous of literary quality in di昀昀erent countries and throughout the social spectrum [2]. Some CHR 2022: Computational Humanities Research Conference, December 12 – 14, 2022, Antwerp, Belgium £ idamarie@cas.au.dk (I. M. S. Lassen); yuri.bizzoni@cc.au.dk (Y. Bizzoni); tpeura@cc.au.dk (T. Peura); madsrt@cc.au.dk (M. R. Thomsen); kln@cas.au.dk (K. Nielbo) ȉ 0000-0001-6905-5665 (I. M. S. Lassen); 0000-0002-6981-7903 (Y. Bizzoni); 0000-0001-8896-8603 (T. Peura); 0000-0002-4975-6752 (M. R. Thomsen); 0000-0002-5116-5070 (K. Nielbo) © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) 280 facets of literary quality can be explained in terms of the literary content (ex. predictability of content, coherence of the narrative), while others depend on socio-cultural priors that intro- duce systematic variation in aesthetic judgments. It is the latter that are the object of this study, speci昀椀cally the possible e昀昀ects of gender on the assessment of literary quality as an example of how aesthetic judgments can be biased by contextual factors. There are two important caveats to consider. First, we are not claiming that variability in aesthetic judgment is undesirable, on the contrary, it facilitates expressive variation and coun- ters aesthetic standardization as has been the norm under some authoritarian regimes [4, 10, 13, 17]. De昀椀ning bias as a deviation from statistical parity, we are only interested in the systematic components of aesthetic judgment that can be attributed to such bias, speci昀椀cally gender bias, and approach this problem from the perspective of fairness challenges in the classi昀椀cation of real-world data [20]. Second, it is not our intention to ‘point 昀椀ngers’ or address speci昀椀c indi- viduals (ex. speci昀椀c reviewers) or institutional levels of biases (ex. speci昀椀c outlets). Fairness challenges 昀椀rst and foremost concern a systemic level of biases, that is, macro-relations that are systematic and disadvantaged groups of people based on their identity (gender, race, class, sexual orientation), while at the same time advantaging members of a dominant group. While at the individual level a bias e昀昀ect may seem small or trivial, it is important to emphasize that systems of bias can result in rampant injustice [15]. The problem of literary quality’s subjective status becomes even more intriguing when we turn to the challenge of its computational assessment. Most studies assume the possibility of one one-dimensional ground truth by modeling literary quality as a single rating or class asso- ciated with a text [9, 26, 25]. These ground truths are retrieved from various sources: literary critics, book sale numbers, bestseller lists, or crowd-sourced reader opinions. Such approaches have several limitations: relying only on experts’ judgment (ex. awards, prestigious reviews) biases the model to re昀氀ect their preferences, but striving for representativity by crowd-sourcing opinions ends up ignoring important di昀昀erences in the readers’ population. To properly under- stand the scienti昀椀c value of these ground truths and develop standardized measures of quality, it is necessary to model possible sources of bias. Recent studies have analyzed the impact of the gender of authors as well as of reviewers in literary reviews. [24] investigates di昀昀erences in sentiments in Norwegian book reviews and how literary reviewers are describing authors of the same and opposite gender. Their 昀椀ndings show di昀昀erences in how female and male book authors are positively or negatively described and that the gender of the critics in昀氀uences this di昀昀erence. In line with the 昀椀ndings in [23] on Goodreads reviews, the authors point out, that male critics deem crime novels written by female authors and sentimental romance novel by male authors as negative and suggest that this indicate that book reviews contain the social hierarchies tending to ascribe emotional traits to women. In the Goodreads reviews, di昀昀erences are both found in preferences of types of books as well as within genres, meaning that when reviewers of both genders read and review books of the same genre, di昀昀erences in grading are found between male and female reviewers. In addition, the results show that within the majority of genres, readers prefer books written by an author of their own gender. Similarly, when Dutch readers were asked to rate both read and unread novels on a scale of 1–7 for their literary and overall quality [16] show that female authors receive signi昀椀cantly lower ratings than their male counterparts. In the greater context of circulation and reception of books, [21] and [6] address the role 281 of both review and reviewer in the broader Anglophone literary 昀椀eld. The former point to an imbalance found in the British and Australian review scenes: Most book reviewers are men, and books reviewed are o昀琀en written by men, resulting in books written by female authors being treated like a niche. The latter o昀昀ers a historical account of the gendered structure of the literary 昀椀eld and maps out how authors build their reputations and accumulate prestige in contemporary book publishing. By taking the historical perspectives of the literary 昀椀eld into account, it might not be surprising that gender disparities still exist. As the reception and judgment of books exist within a greater societal context where structural oppression occurs, signs of systemic inequality call for further investigation. In other areas, studies have examined the role of gender in assessment situations. [18] shows how students’ ratings of instructors are biased towards a more positive assessment of male instructors compared to female instructors. By conducting their study in an online learning situation, the authors were able to disguise the gender identity of the instructors. The found bias was not dependent on the actual gender of the instructor, but on the perceived gender of the instructor. That allows for a conclusion that points out that the gender bias is not a result of the gendered behavior of the instructors, but actual bias in the students, suggesting that a female instructor would have to work harder than a male to receive comparable ratings. In an academic context, a study from 2020 [14] shows how female applicants were less likely than male applicants to receive access to resources (in terms of telescope time) when the review process was single–, rather than dual–anonymized. In particular, the 昀椀ndings indicate that male reviewers rated female applicants signi昀椀cantly worse than they rated male applicants before dual–anonymization was adopted, and a昀琀er applying dual–anonymization, the gender bias was reduced. Similar results are shown in the hiring process of orchestra musicians [11] and several studies have shed light upon the e昀昀ect of gender in hiring processes [3, 5]. Evidence of gender bias across domains may indicate that similar structural dynamics are at play, and hence, not a unique gender bias evolving in the 昀椀eld of literature. In addition to gender disparities, other social markers might also play a role in aesthetic judgments and requires some awareness in the following analysis. [19] investigates di昀昀erences between theater reviews written in blogs and newspapers and concludes that even though such reviews are highly similar, di昀昀erences are found at a subtle level: whereas bloggers tend to focus on categories related to a昀昀ect and audience relations, reviews written by journalists rely on descriptive approaches to the play at hand. With a focus on book blogs, the analysis in [8] shows how the blog media enables mass participation in reader culture. However, even within the blogosphere, a hierarchy of ’reader capital’ exist, and some bloggers obtain status as ’tastemakers.’ These 昀椀ndings indicate a need for clustering of reader types when modeling reader preferences [1]. 2. Methods 2.1. Data The data set covers book reviews published in Danish media in the years 2010-2021. The data are retrieved from the online platform bog.nu’s API which collects book reviews published in Danish media. This includes reviews written in national newspapers, literary magazines, 282 online media as well as in personal blogs. See table 1 for a brief overview of the data set. 2.2. Grade Transformation and Estimation As di昀昀erent media use di昀昀erent grading scales, the grades on bog.nu are transformed to a 100- point grading scale. This approach, however, results in a sparse distribution of grades as the use of the original grading scales maps onto di昀昀erent intervals on the 100-point scale. Instead of this naive approach, we have used the original grade and applied a linear transformation to map all grades to a shared 6-point scale.1 Mapping from an �㕎-�㕏–point scale to a 1-6–point scale: (ā − �㕎) (ā − �㕎) 5(ā − �㕎) �㕌 = ⌊(þ − ý) + ý⌋ = ⌊(6 − 1) + 1⌋ = ⌊ + 1⌋ (1) (�㕏 − �㕎) (�㕏 − �㕎) �㕏 − �㕎 Figure 1 shows the distributions of grades in Danish Newspapers transformed into a shared 6–point scale. Some media do not provide a grade in a given review, but only a qualitative review. Bog.nu does however provide a quanti昀椀cation of the review, which is estimated by a human editor. For reviews written in Danish newspapers, this estimation procedure is used in less than 25% of the cases. Two important clari昀椀cations are needed: 昀椀rst, these estimates are made for both genders of both reviewers and authors. Secondly, to test the robustness of these estimates, the analysis below was performed both on the full data set and on the subset with original quantitative grades given in the reviews. We see that the same trends occur when excluding the reviews with estimated grades. Figure 1: Histogram over grades, in newspapers and blogs, a昀琀er grades are transformed into shared 6-point Likert scale. 2.3. Feature Distributions The original data set from bog.nu does not contain gender for all authors and reviewers. We used a gendered name list to retrieve the missing gender variables. We are working with a 1 It should be noted that most six-point scales have become the standard in many review outlets. 283 Table 1 An overview of the dataset presented in this paper. The category of Online media includes (literary) sites that fall between online newspapers and personal blogs. Data set overview No. of reviews 57 369 No. of di昀昀erent titles 14 647 No. of reviews by media type Male reviewers 19 119 Male authors 8 056 Newspapers 22 131 Female reviewers 29 084 Female authors 6 591 Blogs 16 791 Unknown 9 166 Online media 10 635 Blog-like websites 3 456 Weekly magazines 1 566 Professional magazines 168 binary understanding of gender, and we have used the API genderize.io that returns the prob- ability of a name being either male or female, based on a data set of 250,000 names.2 We are aware of the problems with this method and how it rules out other gender identities [7]. How- ever, a binary understanding of gender is necessary for our analysis to understand the existing structures between men and women in contemporary society – and the literary review scene. Looking at feature distributions in our data set, we see that both gender variables and grades di昀昀er across media types. As shown in Figure 2, we see a highly skewed gender distribution across media types: at the number of reviews, male reviewers reviewing male authors are the dominant group in newspapers, whereas female reviewers reviewing female authors are the dominant group in the blogosphere. Focusing on the number of reviewers the gender distribution re昀氀ects the one shown in the number of reviews. In the data set, we have 621 unique reviewers writing in Danish newspapers. Out of these, 239 are women, 378 are men, and 4 are unknown according to the gender retrieval method described above. As the bog.nu data set does not contain reviewer id for blog reviewers, a similar calculation cannot be made for blogs. Besides the distribution of gender, we have furthermore identi昀椀ed di昀昀erent ‘grading behav- iors’ in newspapers and in blogs (see right-hand side of Figure 2). Hence, due to di昀昀erent distributions of gender as well as grades given across media types, we have in the rest of this study divided our analysis in two: newspapers and blogs. 2.4. Model In order to estimate the relative e昀昀ect of author and reviewer gender on reviewer assigned grade (six-point scale), we 昀椀t the following linear model: �㕔�㕟�㕎�㕑�㕒 Ă�㕖 = ā�㕖 �㗽�㕎þýℎ�㕜�㕟 ∗ ā�㕖 �㗽�㕟�㕒ÿ�㕖�㕒Ā�㕒�㕟 + �㔖�㕖 (2) with the null model that �㗽�㕎þýℎ�㕜�㕟 = �㗽�㕟�㕒ÿ�㕖�㕒Ā�㕒�㕟 = 0 2 Testing the accuracy of genderize on a gendered name list from Statistic Denmark: ACC = 0.93 for n = 10,000 284 Figure 2: The histogram shows the distribution of reviewer- and author gender across the media types blog, online media, and newspapers. The line plots to the right show the average grade in newspapers and blogs respectively. Where Ă�㕖 is the grade of review �㕖, ā�㕖 is the predictor value (gender) of review �㕖, �㗽 represents unknown parameters and �㔖 is the error terms. A linear model is 昀椀tted for both blogs and newspapers respectively. We test 3. Results For newspaper reviews, 昀椀tting Ă�㕖 (grade of review) in the model above with ordinary least squares (OLS), we get the results shown in Table 2. The model and all contrasts are statisti- cally signi昀椀cant (�㕝 < .0001). Conceptually, same gender (male reviewer–male author, female reviewer–female author) reviews span the extreme values, while the opposite gender (female reviewer–male author, male reviewer–female author) represents the middle of the distribution, see le昀琀-hand side of Figure 3. Female reviewers reviewing female authors account for the on average lowest grade. Male reviewers reviewing male authors results in the highest grade, with a 0.2 average grade increase. Opposite gender reviews are statistically speaking indistinguish- able, but they di昀昀er on average by 0.1-grade point from the same gender scoring. Table 2 Newspapers: Results for an OLS predicting grade based on the gender combination: reviewer and author. The ý-test shows that all results are statistically significant. Gender combination (reviewer * author) Average grade �㕆Ā ý �㕝 < |ý| ÿ�㔼 95% Intercept [female * female] 3.9830 0.015 268.616 0.0001 [3.954, 4.012] female * male 0.0881 0.022 3.946 0.0001 [0.044, 0.132] male * female 0.1093 0.021 5.179 0.0001 [0.068, 0.151] male * male 0.2079 0.018 11.544 0.0001 [0.173, 0.243] For the data set on the blogs, 昀椀tting Ă�㕖 (grade of review) in the model above with ordinary least squares (OLS), we get the results shown in Table 3. The model and all contrasts besides 285 the male-male combination are statistically signi昀椀cant (�㕝 < .0001 for female-female and female- male, and �㕝 < .05 for male-female) and we see that in contrast to the results in newspapers, female reviewer–female author account for the on average highest grade while the combination female reviewer–male author result in the, on average, lowest grade with a di昀昀erence of 0.1- grade point between those two. The large standard deviation for male reviewer–female author is due to the low number of reviews with this combination (n=881). See right-hand side of Figure 3 Table 3 Blogs: Results for an OLS predicting grade based on the gender combination: reviewer and author. The ý-test shows that all results besides the male-male combination are statistically significant. Gender combination (reviewer * author) Average grade �㕆Ā ý �㕝 < |ý| ÿ�㔼 95% Intercept [female * female] 4.6737 0.012 401.865 0.0001 [4.651, 4.697] female * male -0.1212 0.019 -6.402 0.0001 [-0.158, -0.084] male * female -0.0931 0.038 -2.440 0.015 [-0.168, -0.018] male * male -0.0586 0.030 -1.983 0.047 [-0.117, -0.001] Figure 3: Point plots showing the average grades in each of the four gender combinations. The lines indicate the standard deviations with a confidence interval of 95%. The le昀琀-hand figure shows grades given in newspapers, the right-hand figure shows grades given in blogs. Finally, as mentioned in section 2 and shown in Figure 2, men account for the majority of reviews in the newspapers. Men actually dominate in the number of reviews (63% are written by men and out of these reviews 69% are reviews of male authors). Figure 4 shows the develop- ment over the years 2010-2021. Here we see that the fraction of female authors being reviewed is slightly increasing, but the fraction of male reviewers is relatively stable through the years. For blogs, we see an extreme overweight of female reviewers reviewing female authors (85% are written by women and out of these reviews 60% are reviews of female authors) with little to almost no changes in the years 2016-2021. Be aware that the gender distribution shown in Fig- ure 4 are made on the number of reviews and not on the unique reviewers. As some bloggers are highly productive, the picture might look di昀昀erent if we looked at unique reviewers. How- 286 ever, as mentioned in section 2 this is not possible for blogs as reviewer id for blog reviewers is lacking from the bog.nu data set. Nevertheless, looking at the number of published reviews show the gender distribution in media coverage. Figure 4: Stacked areas plots showing the percentage distribution of the four di昀昀erent gender combi- nations over the years (2010-2021 for newspapers and 2016-2021 for blogs) on the number of published reviews. The le昀琀-hand figure shows gender balance in newspapers, the right-hand figure shows gender balance in blogs. 4. Discussion In line with the results in [23, 24, 16], we show that the gender of authors as well as of reviewers play a role in literary reviews. In particular, the results above show that • in blogs, which women strongly dominate, women review same-gender authors more positively than opposite-gender authors. • in newspapers, which men dominate, men review same gender authors more positively than opposite gender authors, and women show the reverse pattern, that is, same gender authors are reviewed more negatively than opposite gender authors. From this, we can conclude that female grading behavior di昀昀ers in media type. We see a preference for female authors in blogs and an opposite preference in newspapers. Still, we also note that this di昀昀erence correlates with the gender majority in the media type – female reviewers prefer female authors in blogs where women dominate, and like male authors in newspapers where men dominate several reviewers and authors. Where the blogosphere is a new medium, the newspaper outlet is a well-established form of traditional media, which historically has excluded women and minority people, potentially in昀氀uencing gender distribution today. A partial explanation of the grading behavior is that if males display a same-gender preference and male reviewers make up the majority of news- paper reviewers, the gender minority adapts to this preference and develops a same–gender antagonism. At a more general level, the same gender preference of males in aesthetic judg- ment may re昀氀ect a cultural gender antagonism that follows a long historical trajectory. The 287 female opposite gender preference in newspapers is likely to follow the same cultural gender antagonism. As the number of male blog reviewers reviewing male authors is too low to show statistically signi昀椀cant results, a similar conclusion cannot be drawn for the blogs. There are caveats to this interpretation. First, the gender bias may be confounded with expertise bias, that is, that speci昀椀c literary language leads to higher literary appreciation. If women, in general, write more genre literature, then the observed di昀昀erence may stem from a di昀昀erence in the complexity of linguistic features. To resolve this, we would need genre classi昀椀cation for all reviewed books and a model of the distribution of genres across media types. Second, although the average di昀昀erences in grades are highly signi昀椀cant, the e昀昀ect size is not considerable (ex., 0.2 points on a six-point scale for the same gender in newspaper reviews). This, however, begs the question, how large is a systematic di昀昀erence supposed to be before it counts as a bias? We would argue that whenever we 昀椀nd the systematic variation that co-insides with demographic variables, we will likely see an indication of a relevant bias irrespective of the e昀昀ect size. Conversely, if only a di昀昀erence of a large magnitude (ex. two to three points on a six-point scale) were to count, then biases would only re昀氀ect common- sense propositions that most of us would share irrespective of their truth (ex. if women were, on average reviewed two to three points lower, most of us would agree that they were worse writers). The last caveat points to an important issue; we are not arguing that a speci昀椀c newspaper, or all newspapers for that matter, follow an explicit exclusionary strategy formulated by male re- viewers and editors – nor that bloggers purposely exclude male authors. There are two sources of error whenever we make a judgment: bias and noise. While noise is randomly distributed and lacks a systematic explanatory mechanism, biases are systematic and can be explained in terms of a mechanism. Demographic biases o昀琀en originate in the systemic oppression of mi- nority groups. For the speci昀椀c review cases, the results are likely to mirror existing societal oppressive structures such as those found in [6, 18, 14, 11, 3, 5]. We expect that majority groups, in general, will de昀椀ne norms and values that result in biased judgments irrespective of societal domain. References [1] Y. Bizzoni, I. Lassen, T. Peura, M. Thomsen, and K. Nielbo. “Predicting Literary Quality How Perspectivist Should We Be?” In: 1st Workshop on Perspectivist Approaches to NLP (NLPerspectives). Marseille, France: European Language Resources Association (ELRA), 2022, pp. 20–25. [2] H. Bloom. The western canon: The books and school of the ages. Houghton Mi昀툀in Harcourt, 2014. [3] A. Booth and A. Leigh. “Do employers discriminate by gender? A 昀椀eld experiment in female-dominated occupations”. In: Economics Letters 107.2 (2010), pp. 236–238. [4] E. Buch, I. C. Zubillaga, and M. D. Silva. Composing for the State: Music in Twentieth- Century Dictatorships. Routledge, 2016. 288 [5] M. S. Cole, H. S. Feild, and W. F. Giles. “Interaction of recruiter and applicant gender in resume evaluation: a 昀椀eld study”. In: Sex Roles 51.9 (2004), pp. 597–608. [6] A. Dane. Gender and Prestige in Literature. Springer, 2020. [7] S. Dev, M. Monajatipoor, A. Ovalle, A. Subramonian, J. M. Phillips, and K. Chang. “Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Tech- nologies”. In: CoRR abs/2108.12084 (2021). [8] E. Driscoll. “Book blogs as tastemakers”. In: Participations. Journal of Audience & Recep- tion Studies 16 (2019), pp. 280–305. [9] C. Ferrer. “Canonical values vs. the Law of large numbers: The Canadian Literary Canon in the Age of Big Data”. In: Rupkatha Journal on Interdisciplinary Studies in Humanities 5.3 (2013), pp. 81–90. [10] V. Frajese. Nascita dell’Indice: la censura ecclesiastica dal Rinascimento alla Controriforma. Vol. 13. Morcelliana, 2006. [11] C. Goldin and C. Rouse. “Orchestrating impartiality: The impact of” blind” auditions on female musicians”. In: American economic review 90.4 (2000), pp. 715–741. [12] J. Guillory. Cultural capital: The problem of literary canon formation. University of Chicago Press, 1993. [13] A. Herrero-Olaizola. The censorship 昀椀les: Latin American writers and Franco’s Spain. SUNY Press, 2012. [14] S. K. Johnson and J. F. Kirk. “Dual-anonymization yields promising results for reducing gender bias: A naturalistic 昀椀eld experiment of applications for Hubble Space Telescope time”. In: Publications of the Astronomical Society of the Paci昀椀c 132.1009 (2020), p. 034503. [15] D. Kahneman, O. Sibony, and C. R. Sunstein. Noise: A Flaw in Human Judgment. New York: Little, Brown Spark, 2021. 464 pp. [16] C. Koolen, K. van Dalen-Oskam, A. van Cranenburgh, and E. Nagelhout. “Literary qual- ity in the eye of the Dutch reader: The National Reader Survey”. In: Poetics 79 (2020), p. 101439. [17] Y. Kwon and J. Wood. “Literature and art in North Korea: Theory and policy”. In: Korea Journal 31.2 (1991), pp. 56–70. [18] L. MacNell, A. Driscoll, and A. N. Hunt. “What’s in a name: Exposing gender bias in student ratings of teaching”. In: Innovative Higher Education 40.4 (2015), pp. 291–303. [19] M. Maignant, G. Brison, and T. Poibeau. “Text Zoning of Theater Reviews: How Di昀昀erent are Journalistic from Blogger Reviews?” In: Workshop on Natural Language Processing for Digital Humanities. 2021, pp. 138–143. [20] T. Miconi. “The impossibility of” fairness”: a generalized impossibility result for deci- sions”. In: arXiv preprint arXiv:1707.01195 (2017). [21] C. Squires. “The Review and the Reviewer”. In: Routledge, 2020, pp. 117–132. [22] J. Stephens and R. McCallum. Retelling stories, framing culture: traditional story and meta- narratives in children’s literature. Routledge, 2013. 289 [23] M. Thelwall. “Reader and author gender and genre in Goodreads”. In: Journal of Librari- anship and Information Science 51.2 (2019), pp. 403–430. [24] S. Touileb, L. Øvrelid, and E. Velldal. “Gender and sentiment, critics and authors: a dataset of Norwegian book reviews”. In: Proceedings of the Second Workshop on Gender Bias in Natural Language Processing. 2020, pp. 125–138. [25] M. Walsh and M. Antoniak. “The Goodreads ‘Classics’: A Computational Study of Read- ers, Amazon, and Crowdsourced Amateur Criticism”. In: Journal of Cultural Analytics 4 (2021), pp. 243–287. [26] X. Wang, B. Yucesoy, O. Varol, T. Eliassi-Rad, and A.-L. Barabási. “Success in books: pre- dicting book sales before publication”. In: EPJ Data Science 8.1 (2019), pp. 1–20. 5. Online Resources See https://zenodo.org/record/7050235 for code. 290