Hits or Misses? A Linguistically Explainable Formula for Fanfiction Success

Giulio Leonardi 1,*,†, Dominique Brunato 2,† and Felice Dell'Orletta 2,†

1 University of Pisa
2 Istituto di Linguistica Computazionale "Antonio Zampolli", ItaliaNLP Lab, Pisa

Abstract
This study presents a computational analysis of Italian fanfiction, aiming to construct an interpretable model of successful writing within this emerging literary domain. Leveraging explicit features that capture both linguistic style and semantic content, we demonstrate the feasibility of automatically predicting successful writing in fanfiction, and we identify a set of robust linguistic predictors that maintain their predictive power across diverse topics and time periods, offering insights into the universal aspects of engaging storytelling. This approach not only enhances our understanding of fanfiction as a genre but also offers potential applications in broader literary analysis and content creation.

Keywords
fanfiction, Italian corpus, success prediction, linguistic features, Explainable Boosting Machine

CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04-06, 2024, Pisa, Italy
* Corresponding author. † These authors contributed equally.
g.leonardi5@studenti.unipi.it (G. Leonardi); dominique.brunato@ilc.cnr.it (D. Brunato); felice.dellorletta@ilc.cnr.it (F. Dell'Orletta)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. Introduction and Motivation

The growing proliferation of online literary content has led to the emergence of new genres and storytelling forms, with fanfiction being particularly popular among teens and young adults. Fanfiction consists of stories created by fans (mostly hobby authors) that extend or alter the narrative of existing popular media such as books, movies, comics, or games, and it represents a significant portion of user-generated content on the web [1]. In recent years, the widespread popularity of this genre has prompted research into the linguistic and stylistic elements that contribute to its success, mirroring studies conducted on more traditional literary genres [2, 3, 4], among others.

Understanding the elements that contribute to narrative success is a fascinating area of research with implications across various fields, from literary analysis to digital humanities. From a socio-linguistic perspective, it can offer deeper insights into people and culture; it also has significant applications in areas such as personalized content recommendation and educational technology [5, 6]. While personal interests undoubtedly play a crucial role in predicting a reader's engagement with literary content, the way information is presented can also evoke different reactions and levels of interaction, ultimately influencing a narrative's success. In this regard, recent advancements in Natural Language Processing (NLP) and machine learning offer a powerful lens for making explicit the patterns that may explain the complex interplay between reader engagement and content success.

This paper contributes to this line of research and presents a computational analysis focused on Italian fanfiction, addressing the following research questions: i.) Can the success of Italian fanfiction be automatically predicted using stylistic and lexical features of the texts? ii.) Which types of features demonstrate the highest predictive capability, and how consistent are these features across different time periods and thematic domains? iii.) To what extent can these features be explained in terms of their contribution to predicting success?

Our contributions. i.) We collect a corpus of Italian fanfiction stories enriched with metadata considered as proxies of their success; ii.) we investigate the relationship between the stylistic and lexical features of stories and their success from a modeling perspective; iii.) we identify the most influential features in success prediction, showing the key role played by form- and style-related features across time and across the thematic domains of fanfictions.

The paper is structured as follows: Section 2 briefly contextualizes our study within the relevant literature; Section 3 presents the reference corpus of Italian fanfiction stories that we collected; Section 4 provides an overview of the approach we devised, including a description of the features used for classification and of the classifiers employed; Section 5 discusses the main findings and offers a fine-grained analysis of the classification results in terms of feature explainability. In Section 6 we summarize the key findings and outline promising directions for future research in this field.

2. Related Work

The exploration of online content and its engagement levels has increasingly benefited from advancements in NLP and machine learning. Different perspectives have been explored across textual domains, typologies of linguistic features, and quantitative metrics used to operationalize a highly subjective concept such as success. The study by Toubia and colleagues [7] explores how the structure of narratives, particularly the internal semantic progression measured by features derived from dense word representations, affects the success of stories across different text typologies (movies, TV shows, and academic papers). Berger and colleagues [8] examine how the linguistic structure of online content affects user engagement, specifically by modeling sustained attention. This concept goes beyond attracting a reader with a catchy headline or advertisement; it also encompasses the likelihood that a reader will continue viewing or reading the content. In their analysis of more than 35,000 online content items from heterogeneous sources, they emphasize the role of features related to processing ease and emotional language.

In the realm of literary works, Ashok et al. [2] first leveraged stylometric analysis and machine learning techniques to predict the success of popular English novels from the Gutenberg Project, demonstrating the potential of these techniques for assessing literary success. Extending these findings, Maharjan et al. [9] proposed a multi-task approach to simultaneously evaluate success and genre prediction. Using deep learning representations, in addition to hand-crafted features related to topic, sentiment, writing style, and readability of books, they obtained better performance than a single-task success prediction approach. Focusing on contemporary English-language literature, the study by Bizzoni and colleagues [10] investigates how perceived novel quality is influenced by a broad spectrum of textual features, such as those related to readability and sentiment, and how these perceptions vary depending on the reader's level of expertise.

The growing volume of online fanfiction has also been the subject of numerous studies, either from the perspective of text mining using NLP or through a qualitative lens via manual examination. A comprehensive survey of analyses in this direction has recently been provided by [11]. For example, Milli and Bamman [12] explore the relationship between fanfiction and its original canon, offering one of the first empirical analyses of this genre. Similarly, Sourati et al. [13] find that the similarity between fanfictions and their original stories, particularly in terms of emotional arcs and character dynamics, correlates significantly with fanfiction's popularity.

In the context of Italian fanfiction, research using NLP techniques is still limited. Mattei et al. [14] employ linguistic profiling to analyze a corpus of Italian fanfiction inspired by the Harry Potter series, with the purpose of identifying linguistic patterns associated with success. Inspired by this previous study, our research aims to extend these findings through a computational modeling approach, investigating the power of linguistic features for predicting fanfiction success and their generalization across different experimental settings.

3. Corpus Construction

As a first step, we compiled a reference corpus of Italian fanfiction. To this end, we searched available texts on efpfanfic.net, one of the largest Italian websites dedicated to publishing and reading amateur stories, focusing specifically on stories labeled with the fanfiction genre. Using a web scraping system, we extracted fanfictions based on the Harry Potter series, a highly popular fandom on the site, boasting 57,196 stories published between 2003 and 2023. Figure 1 presents the temporal distribution of these fanfictions up to 2020.

Figure 1: Distribution of all fanfictions from the Harry Potter corpus by year of publication (up to 2020).

Additionally, we gathered a secondary corpus consisting of 2,441 stories based on The Lord of the Rings series. This secondary corpus served as a test set to assess the influence of thematic domains on the analysis of story success.

For this study, we focused on the first chapter of each fanfiction to ensure a consistent analysis. While it is widely recognized that thematic units within stories, particularly the beginnings and endings, often differ from the middle sections due to their distinct narrative roles, we observed that the majority of stories (69%) consist of only a single chapter, making them effectively self-contained. The efpfanfic portal allows users to review each chapter with ratings marked as negative, neutral, or positive. Consistent with prior research such as [9], we used the absolute number of reviews to define the success of a story, which we consider broadly as popularity. This approach is based on the assumption that a high number of interactions, regardless of their sentiment, reflects strong reader engagement; the assumption is further supported by the fact that negative reviews represent less than 1% of the total in our dataset.

To formulate our success prediction task, we established a review threshold to classify each story as either a success or a failure. After analyzing the distribution of reviews for Harry Potter texts (Figure 2), we decided to exclude stories that fell in the middle of the distribution, i.e., those that could not be clearly defined as successes or failures. Consequently, stories with fewer than two reviews (25th percentile) were classified as failures, and those with more than six reviews (75th percentile) as successes. Stories within the interquartile range were excluded from the analysis. We also excluded texts published after 2020, considering them too recent for meaningful comparison. As summarized in Table 1, the final corpora, hereafter abbreviated as HP (Harry Potter) and LOTR (The Lord of the Rings), consist of 26,032 and 932 texts, respectively.

Figure 2: Distribution of published fanfiction from the Harry Potter corpus by number of reviews in the first chapter.

Table 1
Descriptive Statistics for the Harry Potter (HP) and Lord of The Rings (LOTR) Corpora

Corpus   #texts   #negatives   #positives   avg. #tok
HP       26,032   13,058       12,974       1911
LOTR     932      526          406          1946

4. Methodology

Based on the newly collected dataset and its internal distinction, we formulated the task of success prediction as a binary classification problem: given a story, the model is asked to predict whether it belongs to the successful or unsuccessful class, where the two classes were defined according to the metric based on the number of reviews received from readers. In line with our main purpose of constructing a model of success grounded in interpretable factors, we decided to leverage explicit features modelling both style-related and lexical aspects of the text as input for the classification system.

To evaluate the effectiveness and robustness of these features, we conducted experiments across three conceptually distinct scenarios, assessing the ability to discriminate success in different contexts. The first scenario is in-domain: the classifier is evaluated on texts within the same thematic domain as the training set, using 10-fold cross-validation on the HP corpus. The second scenario is out-domain: the classifier is evaluated on texts from a different thematic domain than the training set; in this case, the HP corpus is used as the training set, while the LOTR corpus serves as the test set. Finally, in the cross-time scenario, the temporal impact on classification is considered: the classifier is trained solely on texts from the HP corpus published in 2011 and sequentially tested on texts from each of the other years from 2003 to 2020. The 2011 texts were chosen for training because this year has the largest amount of data (3,755 texts), is approximately central within the temporal range [2003, 2020], and is particularly significant for fanfiction production due to the release of the final film in the Harry Potter saga. The main components of our approach are detailed in the following sections.

4.1. Success Predictors

A comprehensive set of features was extracted for each story in the corpus. These features were categorized into two primary groups: linguistic features, reflecting the text's linguistic style and structure, and lexical features, representing the semantic content of the text.

4.1.1. Linguistic Features

To model a text's linguistic style and structure, we drew inspiration from the linguistic profiling framework, an NLP-based methodology in which a large set of linguistically motivated features, automatically extracted from annotated texts, is used to obtain a vector-based representation of each text. Such representations can then be compared across texts representative of different textual genres and varieties to identify the peculiarities of each [15]. For our study, we relied on Profiling-UD (http://linguistic-profiling.italianlp.it/), a multilingual tool inspired by this framework, which extracts over 130 linguistic features from texts using the Universal Dependencies (UD) annotation formalism. As described in Brunato et al. [16], these features encompass a range of linguistic phenomena that can be classified into distinct groups covering, e.g., shallow text features (such as document and sentence length, and average word length), the distribution of grammatical categories, inflectional morphology, and syntactic properties related to local and global parse tree depth structure.

These features have proven effective in tasks related to modeling text form, such as assessing text complexity and identifying the stylistic traits of authors or author groups. Building on previous research on a similar corpus of fanfiction [14], we hypothesize that these features can also distinguish between successful and unsuccessful fanfictions from a modeling perspective.

Table 2
Classification Accuracy (%) of the Models. 'Ling.' and 'Lex.' refer respectively to models trained on linguistic and lexical features. The baseline corresponds to the majority class label.

Scenario          SVM Ling.   EBM Ling.   SVM Lex.   Baseline
in-domain         65.03       66.15       69.95      50.16
out-domain        59.22       64.70       43.45      56.43
avg. cross-time   62.02       62.81       49.31      49.20
average           62.09       64.55       54.24      51.93

4.1.2. Lexical Features

The second representation employed is based on lexical information and leverages the relative frequency of n-grams in each document. The choice of n-grams, in contrast to more powerful semantic representations derived from embeddings, is deliberately motivated by the desire to use lexical features that remain completely explicit. The model, henceforth referred to as the Lexical Model, consists of the following features:

• Forms: unigrams, bigrams, and trigrams of tokens.
• Lemmas: unigrams, bigrams, and trigrams of lemmas.
• Characters: sequences of characters at the beginning or end of words, ranging from 1 to 4 characters in length.
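The feature types above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes pre-tokenized input, covers only surface forms and character affixes (lemma n-grams would require a lemmatizer), and normalizes counts to relative frequencies as described in the text.

```python
from collections import Counter

def lexical_features(tokens, max_n=3, max_affix=4):
    """Relative frequencies of token n-grams plus word-initial and
    word-final character sequences of 1 to max_affix characters."""
    feats = Counter()
    # Token n-grams over surface forms (the paper also uses lemma n-grams).
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats["form:" + " ".join(tokens[i:i + n])] += 1
    # Character sequences at the beginning or end of each word.
    for tok in tokens:
        for k in range(1, min(max_affix, len(tok)) + 1):
            feats["prefix:" + tok[:k]] += 1
            feats["suffix:" + tok[-k:]] += 1
    total = sum(feats.values())
    return {f: c / total for f, c in feats.items()}

feats = lexical_features(["la", "storia", "inizia"])
```

Keeping every n-gram explicit is what makes this representation inspectable, but it also produces the very high dimensionality discussed in Section 4.2.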
4.2. Classifiers

In line with our research questions, the explainability of the classification is crucial to evaluate the impact of linguistic and lexical features on the prediction of success. Therefore, two classification algorithms that allow for a precise global explanation of the predictions were selected.

The first classifier employed is a linear Support Vector Machine (SVM). By fitting a decision hyperplane in the feature space, this method enables the examination of the hyperplane's coefficients to assess the importance of the features.

The second algorithm employed is the Explainable Boosting Machine (EBM), which belongs to the family of Generalized Additive Models (GAMs). As explained in [17], a GAM is a model of the form:

    g(y) = β0 + Σ_n f_n(x_n)    (1)

where g(.) is called the link function, used to model the output (e.g., the logistic function for classification). Each f_n(.) is referred to as a shape function, a univariate function modeling the relationship between feature n and the target. The prediction is thus a sum of n non-linear and arbitrarily complex shape functions, generally resulting in better accuracy compared to linear models. Additionally, with a reasonable number of features, the model remains explainable: each shape function can be visualized as a two-dimensional plot, with the feature value on the x-axis and the score assigned by the shape function on the y-axis. A score greater than 0 indicates a contribution towards the positive class, whereas a score less than 0 indicates a contribution towards the negative class. The final prediction value for a record is simply the sum of the scores obtained from each shape function, potentially transformed by the link function. Beyond analyzing individual shape functions, the average contribution of each feature can be evaluated by taking the mean of the absolute values of the assigned scores.

There are various algorithms within the family of GAMs, primarily distinguished by the method used to fit the shape functions. In the case of the EBM, standard gradient boosting is used; however, in each boosting iteration, the algorithm sequentially cycles through each feature, constructing each univariate shape function through bagged boosted trees. This method has proven to be one of the most effective for training a GAM.

For our study, the EBM was employed exclusively for the experiments based on linguistic features, due to the excessive dimensionality of the lexical model: this high dimensionality would have rendered the GAM too complex to interpret and too time-expensive to train.

5. Results and Discussion

The classification results are summarized in Table 2 for each model and scenario under evaluation.

For models using linguistic features, in the in-domain scenario both the SVM and the EBM outperform the majority-class baseline, with accuracies of 65.03% and 66.15% respectively, compared to 50.16% for the baseline. This indicates that both classifiers effectively capture the linguistic patterns associated with success within the same thematic domain.

Still considering the linguistic models, in the out-domain scenario the performance of the SVM drops significantly, with an accuracy of 59.22%, whereas the EBM experiences a less drastic decline, achieving an accuracy of 64.70%. However, both classifiers still perform better than the baseline, suggesting some degree of ability of the linguistic features to generalize across different thematic domains.

Figure 3: Classification Accuracy in the Cross-Time Setting.
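As a concrete illustration of the additive scoring scheme described for the EBM in Section 4.2, the following sketch uses two toy step-function shape functions with made-up thresholds (not the fitted ones): per-feature scores are summed with an intercept, passed through the logistic link, and a global importance is obtained as the mean absolute score. In practice a fitted EBM (e.g., from the InterpretML library) learns these shape functions from data.

```python
import math

# Toy shape functions: each maps one feature value to a score, where a
# positive score pushes towards the "success" class. Thresholds here are
# illustrative only, not the values learned in the paper's experiments.
shape_functions = {
    "char_per_tok": lambda v: 0.5 if v > 4.55 else -0.5,
    "verbal_head_per_sent": lambda v: -0.3 if v > 2.0 else 0.3,
}
intercept = 0.0

def predict_proba(record):
    """Sum the per-feature scores, then apply the logistic link function."""
    score = intercept + sum(f(record[name]) for name, f in shape_functions.items())
    return 1.0 / (1.0 + math.exp(-score))

def importance(records):
    """Global importance of a feature: mean absolute score over a dataset."""
    return {
        name: sum(abs(f(r[name])) for r in records) / len(records)
        for name, f in shape_functions.items()
    }

p = predict_proba({"char_per_tok": 5.1, "verbal_head_per_sent": 1.2})
```

Because the final logit is a plain sum, each feature's contribution to any single prediction can be read off directly, which is what makes the per-feature analysis in Section 5.1 possible.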
The lexical model, in the in-domain scenario, achieves an accuracy of 69.56%, outperforming all models with linguistic features and suggesting that lexical features provide a more powerful representation for in-domain success prediction. Nevertheless, in the out-domain scenario, the lexical model does not surpass the baseline, indicating a complete lack of predictive ability. This suggests that lexical features, which are primarily based on the content of the specific fanfiction's narrative universe, perform well within the same thematic domain but lose all significance outside of it. Conversely, linguistic features, which focus on the form of the text, appear to be more adaptable regardless of the theme.

Figure 3 presents the performance over time for classifiers trained with linguistic features. Additionally, two baselines are shown: "Random Choice", which randomly selects between the two classes, and "Maj. Class", which always assigns the majority class of the corresponding training set (the 2011 stories), i.e., the positive one. The results of the lexical model in the cross-time scenario were insignificant, as they were very similar to the "Maj. Class" baseline; the classifier defaults to assigning the negative class, demonstrating no predictive capability. To avoid confusion, the lexical model results are not included in this figure. In contrast, the cross-time results for models using linguistic features are more meaningful: they remain stable around an average of 62%, regardless of the dominant class in the tested year and of the classifier used (avg. cross-time in Table 2).

The cross-time scenario further suggests that linguistic features possess greater adaptability beyond their own domain, maintaining a considerable degree of generalization over time. Conversely, lexical features seem functional only within the specific domain of the training set, losing all predictive power on texts from different domains. Overall, the model that performed best on average across the three scenarios, and with the least variance in performance, is the EBM trained with linguistic features. We provide an in-depth analysis of this model in the following section.

5.1. The Model of Success

To gain a better understanding of the classification results and identify the most influential features for predicting success, we ranked the features according to the absolute value of their weight in the EBM model trained on the entire training set. Table 3 presents an extract of the top 15 features. The analysis reveals that, in addition to basic text features such as the average document length (measured in tokens, rank #1) and the average word length (in characters, rank #2), more complex linguistic properties play a crucial role. Among these, features related to verbal predicates and verbal morphology emerge as particularly influential. This suggests that the syntactic and morphological characteristics of verbs, such as tense, mood, and person, provide valuable information for the classifier's prediction, highlighting the importance of deeper linguistic structures in building a model of successful writing.

While this ranking highlights the 'global' importance of features, it does not explain their effect on classification. For a more detailed analysis, Figure 4 in Appendix A highlights the threshold values for each of the top 15 ranked features, indicating the point at which the expected classification shifts from one class to the other; additionally, it provides the number of instances in the training set for each feature value. Interestingly, some features split the data almost exactly into two subsets. For example, the feature representing word length (char_per_tok) has a discriminant threshold of 4.55 characters, which distinguishes successful stories, typically with longer words, from unsuccessful ones, usually with shorter words. Features related to the (morpho-)syntactic profile of the text, such as the percentage of conjunctions (dep_dist_conj) and of finite verb forms (verbs_form_dist_Fin), show a similar pattern: for these features, values lower than the discriminant threshold contribute to predicting the negative class, effectively splitting the data into two groups with comparable densities.

Regarding verb presence (verbal_head_per_sent), an increased use of verbs correlates with the unsuccessful class. This finding contradicts the idea that higher readability, typically conveyed by a predominantly verbal rather than nominal prose, is a good indicator of writing quality. However, it aligns with the observations of Ashok et al. [2], who identified similar patterns in canonical literary novels.
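The discriminant thresholds just discussed correspond to the points where a shape function crosses zero. A hypothetical sketch of recovering such a threshold from a piecewise-constant shape function, represented here as parallel lists of left bin edges and scores (illustrative values, not the fitted model):

```python
def zero_crossing(bin_edges, scores):
    """Return the first feature value at which a piecewise-constant shape
    function switches sign, i.e. the discriminant threshold between the
    two class contributions; None if the sign never changes."""
    for i in range(1, len(scores)):
        if scores[i - 1] < 0 <= scores[i] or scores[i - 1] >= 0 > scores[i]:
            return bin_edges[i]
    return None

# Illustrative bins for a word-length feature: scores turn positive
# (success-leaning) once average word length exceeds about 4.55 characters.
edges = [3.5, 4.0, 4.55, 5.0, 5.5]
scores = [-0.4, -0.1, 0.2, 0.5, 0.6]
threshold = zero_crossing(edges, scores)
```

A shape function may in principle cross zero more than once; in that case a single threshold is only a summary of its behavior, which is why Figure 4 shows the full curves.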
tional only within the specific domain of the training set, Features related to verbal morphology also show a losing all predictive power for texts from different do- peculiar trend. For instance, a complementary perspec- mains. Overall the model that performed best on average tive emerges concerning the use of person morphology. across the three scenarios, and with the least variance Increasing the use of second person plural beyond a rela- in performance, is the EBM trained with linguistic fea- tively low threshold (0.4) positively affects the prediction of success, which may indicate an alignment with the impact of the author’s popularity and productivity on Reader-Insert2 format, a specific type of fanfiction where the success of their fanfiction. the reader assumes the role of the protagonist, heavily relying on second-person narration. In contrast, an ex- cessive use of the first person plural is associated with References the negative class. [1] K. Hellekson, K. Busse, Fan fiction and fan com- munities in the age of the internet: new essays, Table 3 McFarland, 2014. Top 15 Scores of the EBM Trained with Linguistic Features [2] V. G. Ashok, S. Feng, Y. Choi, Success with style: # feature score Using writing style to predict the success of novels, in: Proceedings of the 2013 conference on empirical #1 n_tokens 0.121 methods in natural language processing, 2013, pp. #2 char_per_tok 0.098 #3 verbal_root_perc 0.095 1753–1764. #4 verbs_num_pers_dist_Plur+2 0.090 [3] J. Brottrager, A. Stahl, A. Arslan, U. Brandes, #5 verbs_num_pers_dist_Plur+1 0.088 T. Weitin, Modeling and predicting literary recep- #6 upos_dist_SYM 0.080 tion. a data-rich approach to literary historical re- #7 n_sentences 0.077 ception, Journal of Computational Literary Studies #8 aux_tense_dist_Imp 0.077 1 (2022). URL: https://doi.org/10.48694/jcls.95. #9 verbs_tense_dist_Imp 0.072 [4] M. Algee-Hewitt, S. Allison, M. Gemma, R. Heuser, #10 aux_tense_dist_Pres 0.067 F. Moretti, H. 
Walser, Canon/archive : large-scale #11 verbal_head_per_sent 0.066 dynamics in the literary field, 2018. URL: https:// #12 dep_dist_conj 0.065 litlab.stanford.edu/LiteraryLabPamphlet11.pdf. #13 tokens_per_sent 0.064 [5] Reviews matter: How distributed mentoring #14 verbs_form_dist_Fin 0.053 #15 n_prepositional_chains 0.052 predicts lexical diversity on fanfiction.net, 2018. URL: https://api.semanticscholar.org/CorpusID: 265096028. [6] S. Sauro, Fan fiction and informal language learning, 6. Conclusion The handbook of informal language learning (2019) 139–151. Understanding success factors in literary writing is an [7] O. Toubia, J. A. Berger, J. Eliashberg, How quan- evolving area of cross-disciplinary research. This study tifying the shape of stories predicts their success, on Italian fanfiction demonstrated the feasibility of pre- Proceedings of the National Academy of Sciences of dicting success using computational methods and ex- the United States of America 118 (2021). URL: https: plainability techniques. Notably, we found that features //api.semanticscholar.org/CorpusID:235648521. related to style and structure of texts show greater ro- [8] J. A. Berger, W. W. Moe, D. A. Schweidel, What bustness than lexical ones across different domains and holds attention? linguistic drivers of engagement, time periods. This suggests that the way a story is crafted Journal of Marketing 87 (2023) 793 – 809. URL: https: may be more universally appealing than specific word //api.semanticscholar.org/CorpusID:255250393. choices or thematic elements. [9] S. Maharjan, J. Arevalo, M. Montes, F. A. González, We believe that the implications of this study extend T. Solorio, A multi-task approach to predict lika- far beyond fanfiction research. 
On the one hand, it pro- bility of books, in: Proceedings of the 15th Confer- vides new methodologies for analyzing online literary ence of the European Chapter of the Association for phenomena offering potential contributions to digital hu- Computational Linguistics: Volume 1, Long Papers, manities. From the NLP perspective, it could inform text 2017, pp. 1217–1227. generation models, potentially guiding the creation of [10] Y. Bizzoni, P. F. Moreira, I. M. S. Lassen, M. R. Thom- content that resonates more effectively with readers. sen, K. Nielbo, A matter of perspective: Build- Future research could explore the generalizability of ing a multi-perspective annotated dataset for the these findings to other languages and genres, as well study of literary quality, in: N. Calzolari, M.-Y. as the investigation on the dynamics of evolving reader Kan, V. Hoste, A. Lenci, S. Sakti, N. Xue (Eds.), preferences over time by also considering alternative Proceedings of the 2024 Joint International Con- measures to gauge success. Additionally, this study does ference on Computational Linguistics, Language not take into account the importance of the author; a Resources and Evaluation (LREC-COLING 2024), potential future development would be to consider the ELRA and ICCL, Torino, Italia, 2024, pp. 789–800. 2 URL: https://aclanthology.org/2024.lrec-main.71. https://fanlore.org/wiki/Reader-Insert [11] D. Nguyen, S. Zigmond, S. Glassco, B. Tran, P. J. A. Top 15 Features of the EBM Giabbanelli, Big data meets storytelling: using ma- chine learning to predict popular fanfiction, Social Network Analysis and Mining 14 (2024) 58. [12] S. Milli, D. Bamman, Beyond canonical texts: A com- putational analysis of fanfiction, in: J. Su, K. Duh, X. Carreras (Eds.), Proceedings of the 2016 Confer- ence on Empirical Methods in Natural Language Processing, Association for Computational Lin- guistics, Austin, Texas, 2016, pp. 2048–2053. URL: https://aclanthology.org/D16-1218. 
doi:10.18653/ v1/D16-1218. [13] Z. Sourati Hassan Zadeh, N. Sabri, H. Chamani, B. Bahrak, Quantitative analysis of fanfictions’ pop- ularity, Social Network Analysis and Mining 12 (2022) 42. [14] A. Mattei, D. Brunato, F. Dell’Orletta, The style of a successful story: a computational study on the fanfiction genre, in: J. Monti, F. Dell’Orletta, F. Tam- burini (Eds.), Proceedings of the Seventh Italian Conference on Computational Linguistics, CLiC-it 2020, Bologna, Italy, March 1-3, 2021, volume 2769 of CEUR Workshop Proceedings, CEUR-WS.org, 2020. URL: https://ceur-ws.org/Vol-2769/paper_52.pdf. [15] H. van Halteren, Linguistic profiling for authorship recognition and verification, in: Proceedings of the 42nd Annual Meeting of the Association for Com- putational Linguistics (ACL-04), Barcelona, Spain, 2004, pp. 199–206. URL: https://aclanthology.org/ P04-1026. doi:10.3115/1218955.1218981. [16] D. Brunato, A. Cimino, F. Dell’Orletta, G. Venturi, S. Montemagni, Profiling-UD: a tool for linguis- tic profiling of texts, in: N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis (Eds.), Proceedings of the Twelfth Language Resources and Evaluation Conference, European Language Resources Associ- ation, Marseille, France, 2020, pp. 7145–7151. URL: https://aclanthology.org/2020.lrec-1.883. [17] Y. Lou, R. Caruana, J. Gehrke, Intelligible mod- els for classification and regression, Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2012). doi:10.1145/2339530.2339556. Figure 4: Visualization of the Shape Functions of the Top 15 Linguistic Features of the EBM. In each graph pair, the x-axis represents the feature value, the y-axis of the line plot indicates the score assigned by the shape function, and the marked threshold value denotes the feature value at the zero score point. 
For the features represented by absolute numbers (i.e. n_tokens, char_per_tok, n_sentences, and n_prepositional_chains), the values are displayed as raw counts. For the remaining features, which are expressed as percentage distributions, the values are shown accordingly. More details about how these features are calculated are reported in [16].