<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Measuring Narrative Fluency by Analyzing Dynamic Interaction Networks in Textual Narratives</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>O-Joun Lee</string-name>
          <email>ojlee112358@postech.ac.kr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jin-Taek Kim</string-name>
          <email>jintaek@postech.ac.kr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Future IT Innovation Laboratory, Pohang Univ. of Science and Technology</institution>
          ,
          <addr-line>Pohang-si</addr-line>
          ,
          <country>Republic of Korea</country>
          <addr-line>37673</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>This study aims to assess the fluency of narratives in textual multimedia (e.g., news articles, academic publications, novels, etc.). We measure the narrative fluency based on whether relationships between entities in the narrative (i.e., the subjects and objects of the events that compose the narrative) are described consistently and with adequate rapidity. The relationships are represented by a dynamic interaction network (called an 'entity network'), which has entities as nodes and co-occurrences between the entities as edges. A lack of consistency leaves readers confused about what the textual narratives intend to present. If a narrative consistently concentrates on a topic or subject, its entity network will have a few entities with high node centrality. Using the consistency of these high-centrality entities, we assess the fluency with three criteria: (i) consistency in each paragraph, (ii) consistency in the overall narrative, and (iii) consistency between the title and body. The rapidity of narrative development has to be appropriate for the expected readers of the textual narratives. Too low a rapidity causes redundancy, and too high a rapidity hinders the understandability of the narratives. We assume that structural changes in the entity network reflect the narrative rapidity. The structural change is measured by embedding structures of the entity network. Finally, we evaluated the effectiveness of the proposed methods using editorials of the New York Times and human evaluators.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Recently, various studies have attempted to quantitatively measure what had been qualitatively assessed based on human intuition and experience, such as story similarity [LJJ18, LJ19b, LJ20], creativity [ES15], and trustworthiness [LNJ+17]. These studies have mainly been conducted in interdisciplinary areas between computer science and the humanities/social sciences. They attempt to make subjects of the humanities and social sciences computationally tractable.</p>
      <p>As one of these attempts, this study aims to quantify the fluency of narratives, which is one of the significant factors in evaluating writing [Huc15]. Narratives are the most fundamental media for exchanging information between human beings. According to Lakoff and Narayanan [LN10], “Narratives structure our understanding of the world and of ourselves.” Therefore, assessing the quality of narratives is significant not only for multimedia content analysis and its applications but also for human-computer interaction. Taghipour and Ng [TN16] attempted to score essays automatically by using a convolutional recurrent neural network. However, their method cannot be used as a standard indicator of narrative quality, which should always assign the same score to a given narrative.</p>
      <p>Various studies [BGL+19, LB19, LJ19a, WCW09] have applied interaction networks between characters (character networks) to analyzing fictional narratives (stories). We extend this model, which has only been applied to fictional and artistic narratives, to cover general narratives, including news articles and academic publications. The character network rests on the assumption that interactions between characters compose fictional narratives: a narrative is a series of events, and an event consists of interactions between characters [McK97, McK16]. However, general narratives do not depict only relationships between personified entities (characters).</p>
      <p>For example, a history book can describe interactions between nations or other social organizations, and research articles depict relationships among abstract concepts. To interpret accurately what the narratives attempt to describe, we have to analyze the meanings of each interaction and relationship. However, even when we do not know the meanings of the relationships, we can analyze how the full set of relationships between entities is gradually presented or explained. This is the same approach with which our previous studies [LJJ18, LJ19b] analyzed narrative development by only using the frequencies of interactions between characters. This approach also enables us to apply the proposed methods to various kinds of media without significant modification, whereas the existing studies [SLE15, SMS15] measured the fluency based on domain knowledge.</p>
      <p>First, we have to define the entities and their interactions. Similar to existing narrative models [CR17, MAW+18], we define entities as the subjects and objects of each interaction. In video or audio, finding interactions and the entities involved in them is difficult. Thus, as a preliminary study, we restrict our research subjects to textual narratives, e.g., news articles, academic publications, non-fiction books, novels, and essays. Each sentence in the text is used as a unit of interaction, and entities correspond to nouns (or noun phrases) that can be subjects or objects of the sentence. The entity and the interaction can be defined as follows:
Definition 1 (Entity and Interaction) Suppose that S is the set of sentences in a textual narrative, T. When s_i ∈ S is the i-th sentence in S, s_i also corresponds to the i-th interaction between entities. E, the set of entities in T, consists of the nouns and noun phrases that appear in sentences within S. If two entities (e_a and e_b) co-occur in s_i, we can assume that s_i describes a relationship between e_a and e_b, even though we do not know the meaning of s_i.</p>
      <p>The narrative is time-sequential. Thus, existing studies [Bos16, LJ19a] segmented narratives into logical and regular units, such as scenes. A scene is defined as a period that does not contain changes in spatiotemporal backgrounds [McK97]. Each scene describes a concluded event within a background. However, general narratives are far more diverse than fictional ones. Interactions in a general narrative can be segmented into events, but the events do not always have distinguishable backgrounds. Thus, we employ paragraphs as the unit of events, since paragraphs in well-written texts usually have topical coherence and completion. By using the paragraph as a time window, we define a dynamic interaction network between the entities that appear in a narrative as follows:
Definition 2 (Entity Network) Suppose that |E| is the number of entities that appear in a narrative, T. When N(T) indicates an entity network of T, N(T) can be defined as a matrix ∈ ℝ^{|E|×|E|}. Each component of N(T) represents the relationship between two entities. By defining N(·) on each paragraph, we can observe the development of the relationships. When P is the set of paragraphs in T, and p_l indicates the l-th paragraph, N(p_l) indicates the entity network of p_l. This can be formulated as:</p>
      <p>N(T) = Σ_{l=1}^{|P|} N(p_l) = [ f_{1,1} ⋯ f_{1,N} ; ⋮ ⋱ ⋮ ; f_{N,1} ⋯ f_{N,N} ],  (1)
where f_{i,j} indicates the frequency of interactions between e_i and e_j. We measure f_{i,j} by the number of sentences in which e_i and e_j co-occur.</p>
      <p>We measure the narrative fluency based on the entity network and the following two assumptions. First, the topical coherence of a paragraph will be exposed by the centrality of its keywords on the entity network. Thus, within a paragraph, there should be a few entities with significantly higher centrality than the other entities. If a narrative consistently focuses on a topic, the keywords will also be consistent over the overall narrative (RQ 1). Second, the relationships between entities have to be described at an appropriate rapidity to deliver them to users understandably. If we use too few interactions or events to depict content, there can be logical leaps. On the other hand, if we describe the content too slowly, there may be meaningless redundancy. Therefore, the narrative should develop at an adequate rapidity with regard to its purposes and expected readers. Narrative development is accompanied by new entities and new relationships between the entities. Thus, we assume that the rapidity of narrative development can be measured by structural changes in the entity networks (RQ 2).</p>
    </sec>
    <sec id="sec-2">
      <title>Measuring Narrative Fluency</title>
      <p>This section first briefly describes how we composed the entity networks. Then, we present the proposed methods for measuring narrative fluency with two criteria: (i) the narrative consistency and (ii) the rapidity of the narrative development.</p>
      <sec id="sec-2-1">
        <title>Composing Entity Networks</title>
        <p>We collected 20 recent editorials published in the New York Times (https://www.nytimes.com/section/opinion/editorials). Titles, headlines, and bodies of the editorials were collected and preprocessed by using the NLTK library of Python (http://www.nltk.org/). We conducted tokenization, stemming, and part-of-speech (POS) tagging on the collected texts. Then, we annotated occurrences and co-occurrences of only the nouns and pronouns tagged as ‘NN,’ ‘NNS,’ ‘NNP,’ or ‘NNPS’ by the POS tagger in the NLTK library. These nouns and pronouns are the entities. We composed entity networks based on occurrences and co-occurrences of the entities in each sentence and paragraph. To segment sentences, we used capitalization, punctuation marks, and a dictionary of frequently used abbreviations (e.g., Mr., Ms., etc.). Also, the entity network includes cyclic edges (e.g., f_{a,a} in Eq. 1) to represent the occurrence frequencies of entities.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Measuring Narrative Consistency of Paragraphs</title>
        <p>We measure the narrative consistency from two viewpoints. First, each paragraph has to focus on one topic. Thus, there should be a few entities that have far higher centrality than the other entities. These entities will be keywords that represent the topic. Therefore, we compute three well-known node centrality measurements (i.e., degree, betweenness, and closeness centrality) for each entity on the entity networks. The centrality measurements are normalized into [0, 1] and aggregated by the arithmetic mean, which we use as the centrality of each entity. However, this approach cannot consider which kinds of centrality are more significant for the narrative fluency. In future studies, we will compare the significance of the centrality measurements by applying weighting factors to them. We assess the narrative consistency in each paragraph by using the entropy of the centrality of its entities. This can be formulated as:</p>
        <p>C_all(p_l) = 1/|E_l| × Σ_{∀e_a ∈ E_l} log C_l(e_a),  (2)
where E_l ⊂ E indicates the set of entities that appeared in p_l, and C_l(e_a) refers to the centrality of e_a on N(p_l). C_all(p_l) measures the consistency of a paragraph. For the entire textual narrative, we aggregate the consistency of the paragraphs as: C_all(T) = 1/|P| × Σ_{∀p_l ∈ P} C_all(p_l).</p>
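        <p>A minimal sketch of the per-paragraph consistency score, assuming networkx graphs as entity networks; the epsilon guard against log(0) for isolated entities is our addition, and function names are ours.</p>
```python
import math
import networkx as nx

def centrality(G):
    """Arithmetic mean of degree, betweenness, and closeness centrality
    per node; each networkx measure is already normalized into [0, 1]."""
    deg = nx.degree_centrality(G)
    bet = nx.betweenness_centrality(G)
    clo = nx.closeness_centrality(G)
    return {v: (deg[v] + bet[v] + clo[v]) / 3.0 for v in G}

def c_all(G, eps=1e-9):
    """C_all(p_l): mean over entities of log C_l(e_a) (Eq. 2)."""
    C = centrality(G)
    return sum(math.log(c + eps) for c in C.values()) / len(C)

# A star-shaped paragraph network: node 0 is a dominant keyword with
# maximal aggregated centrality 1.0.
G = nx.star_graph(4)
```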
        <p>Second, keywords of a narrative should have high centrality on overall paragraphs in the narrative. Similar
to the previous one, we assess whether keywords have consistently high centrality, based on the entropy. This
can be formulated as:</p>
        <p>C_key(T) = [ 1 + 1/(|P|·|K|) × Σ_{∀p_l ∈ P} Σ_{∀e_a ∈ K} log C_l(e_a) ]^{−1},  (3)
where K ⊂ E is the set of keywords of T. We compose K by clustering the entities into two clusters according to their centrality, using k-means clustering with two initial centroids: the maximum and the minimum centrality. Among the two clusters, we assume that the elements of the cluster with the higher centrality are keywords. Although using a threshold would be much simpler than the clustering, it could not deal with the diversity of textual narratives.</p>
        <p>Third, users infer the topics of textual narratives from their titles and from keywords annotated by their creators. Since news articles are our experimental subjects, entities in their titles and headlines have to match the keywords discovered by using the entity network. Thus, we measure their concurrence based on the Jaccard index. This can be formulated as:</p>
        <p>C_title(T) = |(E_t ∪ E_h) ∩ K| / |E_t ∪ E_h ∪ K|,  (4)
where E_t and E_h are the sets of entities that appeared in the title and headline of T, respectively. We aggregate the three proposed measurements (C_all, C_key, and C_title) by the arithmetic mean, after normalizing them into [0, 1].</p>
      </sec>
      <sec id="sec-2-3">
        <title>Measuring Rapidity of Narrative Development</title>
        <p>We measure the rapidity of the narrative development by using structural changes in the entity networks. To compare the structures of entity networks, we represent each entity network as a vector by using a graph embedding technique. We employ the Graph2Vec model [NCV+17], which embeds the structures of graphs rather than the characteristics of nodes or edges. Graph2Vec first extracts subgraphs from the entity networks by using the WL (Weisfeiler-Lehman) relabeling process [SSvL+11]. Then, the PV-DBOW model of Doc2Vec [LM14] is applied to learn representations of the entity networks based on their composition of subgraphs. We denote the vector representation of N(T) as φ(N(T)).</p>
        <p>We measure the rapidity by estimating how significant a change in the entity network is caused by each paragraph. Thus, we have to compare the entity networks before and after the paragraph. The rapidity of narrative development on p_l is measured by the Euclidean distance between φ(Σ_{i=1}^{l} N(p_i)) and φ(Σ_{i=1}^{l−1} N(p_i)). This can be formulated as:</p>
        <p>R(p_l) = ‖ φ(Σ_{i=1}^{l} N(p_i)) − φ(Σ_{i=1}^{l−1} N(p_i)) ‖_2.  (5)</p>
        <p>This study focuses on validating that we can measure the narrative fluency by analyzing the interaction network. We will tune the proposed measurements more elaborately in future studies (e.g., by comparing the effectiveness of various distance metrics for measuring the narrative rapidity).</p>
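        <p>The structural-change measurement can be illustrated with a simplified stand-in for Graph2Vec: one round of WL relabeling summarizes each node's neighborhood, and the cumulative networks before and after a paragraph are compared by the Euclidean distance between their label-count vectors. The real model learns PV-DBOW embeddings over such subgraph labels; this sketch, with names of our choosing, only illustrates R(p_l).</p>
```python
import math
from collections import Counter

def wl_features(adj):
    """One WL relabeling round over adj (dict node -> set of neighbours):
    each node's label encodes its degree plus its neighbours' degrees."""
    base = {v: str(len(nb)) for v, nb in adj.items()}
    return Counter(
        base[v] + "|" + ",".join(sorted(base[u] for u in adj[v]))
        for v in adj
    )

def rapidity(adj_before, adj_after):
    """R(p_l): Euclidean distance between the feature vectors of the
    cumulative entity networks before and after paragraph p_l."""
    a, b = wl_features(adj_before), wl_features(adj_after)
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in set(a) | set(b)))

# Adding a paragraph that extends an edge into a path changes the
# structure, so the rapidity is positive.
edge = {0: {1}, 1: {0}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
```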
        <p>We assume that too slow or too fast changes in the entity networks hinder the readability of the narrative. Therefore, we assess whether the rapidity is appropriate and consistent. After normalizing the rapidity of the paragraphs into [0, 1], we aggregate the difference between the optimal rapidity and the rapidity on each paragraph. This can be formulated as:</p>
        <p>R(T) = 1/|P| × Σ_{∀p_l ∈ P} |R(p_l) − Θ_R|,  (6)
where Θ_R indicates the optimal rapidity. We attempt to find Θ_R through empirical experiments in the following section.</p>
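        <p>The aggregation of Eq. 6 can be sketched as follows, with min-max normalization standing in for the unspecified normalization scheme and the default Θ_R = 0.45 taken from the experiments reported below; names are ours.</p>
```python
def narrative_rapidity(rapidities, theta_r=0.45):
    """R(T): after min-max normalizing the paragraph rapidities R(p_l)
    into [0, 1], average the absolute deviation from the optimal Theta_R."""
    lo, hi = min(rapidities), max(rapidities)
    span = (hi - lo) or 1.0            # guard for constant rapidity
    normed = [(r - lo) / span for r in rapidities]
    return sum(abs(r - theta_r) for r in normed) / len(normed)
```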
        <p>Additionally, the proposed measurements for the narrative consistency can be manipulated by splitting paragraphs more finely than normal. However, we expect that too-short paragraphs make R(p_l) small. Thus, the narrative consistency and rapidity will have a trade-off relationship.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Evaluation</title>
      <p>To validate the research questions, we evaluated the accuracy of the measurements by estimating the fluency of the editorials in the New York Times. We compared the results of the proposed measurements with the responses of 37 human evaluators. The evaluator group consists of students and faculty members of Chung-Ang University and Pohang University of Science and Technology. We asked the evaluators to read three editorials that they chose from our corpus and to answer the following questionnaire.</p>
      <p>Q1. What are the keywords of this editorial?</p>
      <p>Q2. Is this editorial consistently describing its topic? Please answer in five degrees: very inconsistent, inconsistent, normal, consistent, and very consistent.</p>
      <p>Q3. If this editorial is inconsistent, please check the paragraphs causing the inconsistency.</p>
      <p>Q4. How rapid is the narrative development of this editorial? Please answer in five degrees: prolonged, slow, appropriate, fast, and very fast.</p>
      <p>Q5. If the rapidity of narrative development in this editorial is inadequate, please check the paragraphs that cause the inappropriateness. Also, please annotate whether those paragraphs are redundant or unexpected.</p>
      <p>Q6. Is this editorial fluent? Please answer in five degrees: very non-fluent, non-fluent, normal, fluent, and very fluent.</p>
      <p>Q7. If this editorial is not fluent, please check the paragraphs that cause the non-fluency.</p>
      <p>For normalization, the five choices of Q2 and Q6 were replaced with 0.2, 0.4, 0.6, 0.8, and 1.0, respectively. Also,
the choices of Q4 were transformed into 0.0, 0.5, 1.0, 0.5, and 0.0, respectively.</p>
      <p>Based on the questionnaire, we conducted three experiments. First, if the entity network model is reasonable, entities with high centrality will be keywords of each textual narrative. Thus, we compared keywords annotated by the evaluators (Q1) with the automatically discovered ones. As a baseline method, we measured the TF-IDF (Term Frequency-Inverse Document Frequency) of entities and clustered them according to the TF-IDF scores, in the same manner as the proposed method. Accuracy of the keywords was assessed by precision, recall, and F1 measure.</p>
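      <p>The keyword discovery compared above can be sketched as a one-dimensional 2-means over centrality (or TF-IDF) scores, seeded with the minimum and maximum score as described in Section 2.2, together with the title-concurrence measure of Eq. 4. This is our minimal reading of the method, and all names are ours.</p>
```python
def keywords(centrality):
    """K: split entities into two clusters by 1-D k-means over their
    centrality scores, seeded with the minimum and maximum score, and
    keep the high-centrality cluster."""
    scores = list(centrality.values())
    lo, hi = min(scores), max(scores)
    for _ in range(100):
        low_c = [s for s in scores if abs(s - lo) <= abs(s - hi)]
        high_c = [s for s in scores if abs(s - lo) > abs(s - hi)]
        new_lo = sum(low_c) / len(low_c) if low_c else lo
        new_hi = sum(high_c) / len(high_c) if high_c else hi
        if (new_lo, new_hi) == (lo, hi):   # converged
            break
        lo, hi = new_lo, new_hi
    return {e for e, s in centrality.items() if abs(s - hi) < abs(s - lo)}

def c_title(title_entities, headline_entities, K):
    """C_title(T): Jaccard index between title/headline entities and K (Eq. 4)."""
    th = set(title_entities) | set(headline_entities)
    return len(th & K) / len(th | K)
```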
      <p>The second and third rows of Table 1 show that both methods have high precision and low recall. To find the reason, we examined keywords that were not discovered by the centrality or TF-IDF. Most of the omitted keywords were referred to by various expressions, including pronouns and synonyms. For example, the following phrases can be used with similar meanings: U.S. government, American government, Federal government, Trump administration, Presidency of Donald Trump, Washington D.C., etc. This variety of expressions disperses the co-occurrence frequencies of entities. Vocabulary diversity makes texts smooth and fluent, but it is a challenging issue for composing accurate entity networks. Also, the centrality exhibited higher accuracy for discovering keywords than TF-IDF. However, the amount of improvement was insignificant. Even though the entity network is independent of the kind of media, its performance has to be improved, considering the simplicity of TF-IDF.</p>
      <p>Second, we validated RQ 1 and assessed the effectiveness of the narrative consistency measurements, based on Q2, Q3, Q6, and Q7. We examined correlations between (i) fluency annotated by the evaluators (Q6; FH), (ii) annotated consistency (Q2; CH), and (iii) automatically measured consistency (CA), using the PCC (Pearson Correlation Coefficient). FH-CH and FH-CA verified RQ 1, and CH-CA and FH-CA exhibited the effectiveness of the measurements. Table 3 (a) presents the correlation coefficients.</p>
      <p>In the experimental results, FH-CH was 0.91. Most of the evaluators gave the same scores for fluency and consistency. FH-CA (0.71) was lower than FH-CH but still significant. Thus, consistency was an essential factor of the narrative fluency. CH-CA (0.73) was lower than FH-CH but higher than FH-CA. This indicates that the proposed measurement adequately reflected the consistency of the editorials.</p>
      <p>Then, we compared the inconsistent paragraphs annotated by the evaluators with the ones detected by the proposed method. By modifying Eq. 3, we measured the inconsistency of each paragraph as: Σ_{∀e_a ∈ K} log C_l(e_a). According to this metric, we sorted the paragraphs in each editorial in descending order. Paragraphs in the first quartile of the order were determined to be the inconsistent ones. Accuracy for detecting the inconsistent paragraphs was assessed by precision, recall, and F1 measure.</p>
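      <p>This detection procedure can be sketched as follows, assuming the per-paragraph keyword centralities are already computed; the first-quartile rule follows the text, and all names are ours.</p>
```python
import math

def inconsistent_paragraphs(keyword_centralities):
    """Flag the first quartile of paragraphs after sorting them in
    descending order of the inconsistency score sum(log C_l(e_a)).

    keyword_centralities: one list per paragraph, holding the keyword
    centralities C_l(e_a) on that paragraph's entity network."""
    scores = [
        (sum(math.log(c) for c in cents), i)
        for i, cents in enumerate(keyword_centralities)
    ]
    scores.sort(reverse=True)
    cutoff = max(1, len(scores) // 4)
    return sorted(i for _, i in scores[:cutoff])
```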
      <p>As shown in the second row of Table 2, the proposed method exhibited high recall but low precision. Since we use keywords to measure the inconsistency, recognizing synonyms as individual entities might inflate the measured inconsistency of paragraphs. Although the consistency showed reasonable performance overall, we have to find a better way of composing the entity network.</p>
      <p>Finally, we validated RQ 2 and verified the effectiveness of the proposed measurement for the rapidity of narrative development, based on Q4 to Q7. As before, we examined correlations between (i) fluency annotated by the evaluators (Q6; FH), (ii) annotated rapidity (Q4; RH), and (iii) automatically measured rapidity (RA). FH-RH and FH-RA verified RQ 2, and RH-RA and FH-RA exhibited the effectiveness of the rapidity measurement. Table 3 (b) presents the correlation coefficients.</p>
      <p>FH-RH (0.66) was relatively lower than FH-CH. Also, FH-RA (0.74) was lower than FH-CA. These results mean that the rapidity was less significant than the consistency in estimating the narrative fluency. One interesting point was that FH-RA was higher than RH-RA (0.62). The rapidity measurement was correlated with the narrative fluency but not highly proportional to the rapidity of narrative development that the evaluators felt. The following experiment also showed this problem. Additionally, the proposed measurement exhibited the highest PCC for RH-RA at Θ_R = 0.45. We searched for the optimal Θ_R in [0, 1] with a step size of 0.05.</p>
      <p>Also, we compared the too-fast and too-slow paragraphs annotated by the evaluators with the ones detected using the rapidity measurement. Using Eq. 5, we sorted the paragraphs in each editorial in descending order. Then, paragraphs in the first and fourth quartiles of the order were determined to be too-fast and too-slow paragraphs, respectively. Their accuracy was assessed by using precision, recall, and F1 measure.</p>
      <p>Different from the consistency, the precision and recall of the rapidity measurement were similar. However, as displayed in the third row of Table 2, accuracy for detecting abnormality in the rapidity was significantly lower than for the consistency. To find the reason, we examined false positives and false negatives of the proposed method. Interestingly, the false positives mostly appeared in the beginning and ending parts of the editorials (possibly introductions and conclusions), and most of the false negatives appeared in the middle parts of the editorials. These results indicate that the optimal rapidity of narrative development can differ according to the locations of paragraphs (or narrative time). The low PCC score for RH-RA could be affected by this problem as well.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>We have proposed two kinds of measurements for assessing the fluency of textual narratives. Their effectiveness was evaluated based on editorials of the New York Times. However, this study has a few limitations. First, we could not conduct experiments on various kinds of textual narratives. Also, we assumed the optimal rapidity to be a static value. Our further research will focus on resolving these two problems.</p>
      <sec id="sec-4-1">
        <title>Acknowledgements</title>
        <p>This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ICT Consilience Creative program (IITP-2019-2011-1-00783) supervised by the IITP (Institute for Information &amp; communications Technology Planning &amp; Evaluation).</p>
        <p>[Bos16] Xavier Bost. A storytelling machine?: Automatic video summarization: the case of TV series. PhD thesis, University of Avignon, France, November 2016.</p>
        <p>[CR17] Emmanouil Theofanis Chourdakis and Joshua Reiss. Constructing narrative using a generative model and continuous action policies. In Proceedings of the Workshop on Computational Creativity in Natural Language Generation (CC-NLG@INLG 2017), pages 38–43, Santiago de Compostela, Spain, September 2017. Association for Computational Linguistics (ACL).</p>
        <p>[ES15] Ahmed M. Elgammal and Babak Saleh. Quantifying creativity in art networks. In Hannu Toivonen, Simon Colton, Michael Cook, and Dan Ventura, editors, Proceedings of the 6th International Conference on Computational Creativity (ICCC 2015), pages 39–46, Park City, Utah, USA, June 2015. computationalcreativity.net.</p>
        <p>[Huc15] Geoffrey J. Huck. What Is Good Writing?, chapter Narrative Fluency, pages 102–124. Oxford University Press, September 2015.</p>
        <p>[LB19] Vincent Labatut and Xavier Bost. Extraction and analysis of fictional character networks: A survey. ACM Computing Surveys, 2019. To appear.</p>
        <p>[LJ19a] O-Joun Lee and Jason J. Jung. Integrating character networks for extracting narratives from multimodal data. Information Processing and Management, 56(5):1894–1923, September 2019.</p>
        <p>[LJ19b] O-Joun Lee and Jason J. Jung. Modeling affective character network for story analytics. Future Generation Computer Systems, 92:458–478, March 2019.</p>
        <p>[LJ20] O-Joun Lee and Jason J. Jung. Story embedding: Learning distributed representations of stories based on character networks. Artificial Intelligence, 281:103235, April 2020.</p>
        <p>[LJJ18] O-Joun Lee, Nayoung Jo, and Jason J. Jung. Measuring character-based story similarity by analyzing movie scripts. In Alípio Mário Jorge, Ricardo Campos, Adam Jatowt, and Sérgio Nunes, editors, Proceedings of the 1st Workshop on Narrative Extraction From Text (Text2Story 2018) co-located with the 40th European Conference on Information Retrieval (ECIR 2018), volume 2077 of CEUR Workshop Proceedings, pages 41–45, Grenoble, France, March 2018. CEUR-WS.org.</p>
        <p>[LM14] Quoc V. Le and Tomas Mikolov. Distributed representations of sentences and documents. In Eric P. Xing and Tony Jebara, editors, Proceedings of the 31st International Conference on Machine Learning (ICML 2014), volume 32 of JMLR Workshop and Conference Proceedings, pages 1188–1196, Beijing, China, June 2014. JMLR.org.</p>
        <p>[LN10] George Lakoff and Srini Narayanan. Toward a computational model of narrative. In Proceedings of the 2010 AAAI Fall Symposium: Computational Models of Narrative, volume FS-10-04 of AAAI Technical Report, pages 21–28, Arlington, VA, US, November 2010. AAAI.</p>
        <p>[LNJ+17] O-Joun Lee, Hoang Long Nguyen, Jai E. Jung, Tai-Won Um, and Hyun-Woo Lee. Towards ontological approach on trust-aware ambient services. IEEE Access, 5:1589–1599, February 2017.</p>
        <p>[MAW+18] Lara J. Martin, Prithviraj Ammanabrolu, Xinyu Wang, William Hancock, Shruti Singh, Brent Harrison, and Mark O. Riedl. Event representations for automated story generation with deep neural nets. In Sheila A. McIlraith and Kilian Q. Weinberger, editors, Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018), the 30th Innovative Applications of Artificial Intelligence (IAAI 2018), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI 2018), pages 868–875, New Orleans, Louisiana, USA, February 2018. AAAI Press.</p>
        <p>[McK97] Robert McKee. Story: Substance, Structure, Style and the Principles of Screenwriting. HarperCollins, New York, NY, USA, November 1997.</p>
        <p>[McK16] Robert McKee. Dialogue: The Art of Verbal Action for Page, Stage, and Screen. Twelve, July 2016.</p>
        <p>[NCV+17] Annamalai Narayanan, Mahinthan Chandramohan, Rajasekar Venkatesan, Lihui Chen, Yang Liu, and Shantanu Jaiswal. graph2vec: Learning distributed representations of graphs. Computing Research Repository (CoRR), abs/1707.05005, July 2017.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [BGL+19]
          <string-name>
            <surname>Xavier</surname>
            <given-names>Bost</given-names>
          </string-name>
          , Serigne Gueye, Vincent Labatut, Martha Larson, Georges Linarès, Damien Malinas, and Raphaël Roth.
          <article-title>Remembering winter was coming</article-title>
          .
          <source>Multimedia Tools and Applications</source>
          ,
          <year>September 2019</year>
          . To Appear.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [SLE15]
          <string-name>
            <given-names>Oscar</given-names>
            <surname>Saz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yibin</given-names>
            <surname>Lin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Maxine</given-names>
            <surname>Eskenazi</surname>
          </string-name>
          .
          <article-title>Measuring the impact of translation on the accuracy and fluency of vocabulary acquisition of english</article-title>
          .
          <source>Computer Speech &amp; Language</source>
          ,
          <volume>31</volume>
          (
          <issue>1</issue>
          ):
          <fpage>49</fpage>
          -
          <lpage>64</lpage>
          , May
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [SMS15]
          <string-name>
            <given-names>Maryam</given-names>
            <surname>Soleimani</surname>
          </string-name>
          , Sima Modirkhamene, and
          <string-name>
            <given-names>Karim</given-names>
            <surname>Sadeghi</surname>
          </string-name>
          .
          <article-title>Peer-mediated vs. individual writing: measuring fluency, complexity, and accuracy in writing</article-title>
          .
          <source>Innovation in Language Learning and Teaching</source>
          ,
          <volume>11</volume>
          (
          <issue>1</issue>
          ):
          <fpage>86</fpage>
          -
          <lpage>100</lpage>
          ,
          <year>June 2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [WCW09]
          <string-name>
            <given-names>Chung-Yi</given-names>
            <surname>Weng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Wei-Ta</given-names>
            <surname>Chu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ja-Ling</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>RoleNet: Movie analysis from the perspective of social networks</article-title>
          .
          <source>IEEE Transactions on Multimedia</source>
          ,
          <volume>11</volume>
          (
          <issue>2</issue>
          ):
          <fpage>256</fpage>
          -
          <lpage>271</lpage>
          ,
          <year>February 2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [SSvL+11]
          <string-name>
            <given-names>Nino</given-names>
            <surname>Shervashidze</surname>
          </string-name>
          , Pascal Schweitzer, Erik Jan van Leeuwen,
          <string-name>
            <given-names>Kurt</given-names>
            <surname>Mehlhorn</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Karsten M.</given-names>
            <surname>Borgwardt</surname>
          </string-name>
          .
          <article-title>Weisfeiler-Lehman graph kernels</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <volume>12</volume>
          :
          <fpage>2539</fpage>
          -
          <lpage>2561</lpage>
          ,
          <year>September 2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [TN16]
          <string-name>
            <given-names>Kaveh</given-names>
            <surname>Taghipour</surname>
          </string-name>
          and
          <string-name>
            <given-names>Hwee Tou</given-names>
            <surname>Ng</surname>
          </string-name>
          .
          <article-title>A neural approach to automated essay scoring</article-title>
          . In Jian Su, Xavier Carreras, and Kevin Duh, editors,
          <source>Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP</source>
          <year>2016</year>
          ), pages
          <fpage>1882</fpage>
          -
          <lpage>1891</lpage>
          , Austin, Texas, USA,
          <year>November 2016</year>
          .
          Association for Computational Linguistics
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>