<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Interactive Text Mining Suite: Data Visualization for Literary Studies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Olga Scrivner</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jefferson Davis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indiana University</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <fpage>29</fpage>
      <lpage>38</lpage>
      <abstract>
        <p>In recent years, there has been growing interest in visualization methods for literary text analysis. While text mining and visualization tools have evolved into mainstream research methods in many fields (e.g. social sciences, machine learning), their application to literary studies still remains infrequent. In addition to technological challenges, the use of these tools requires a methodological shift from traditional close reading to distant reading approaches. This transition also aligns digital humanities with corpus linguistics, which still “remains obscure” and not fully embraced by digital humanists [16]. To address some of these challenges, we introduce Interactive Text Mining Suite, a user-friendly toolkit developed both for digital humanists and corpus linguists. We further demonstrate that the integration of visual analytics and corpus linguistics methods helps unveil language patterns, otherwise hidden from a human eye. Making use of both linguistically annotated data and natural language processing techniques, we are able to discern patterns of part-of-speech uses in Medieval Occitan manuscript Romance de Flamenca and its English translation. Furthermore, visual analysis not only detects stylistic differences at a word level, but also at sentential and document levels. While preserving traditional close reading techniques, this toolkit makes it possible to apply an interactive control over documents, thus allowing for a “synthesis of computational and humanistic modes of inquiry” [18].</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        In the past three decades, the digital humanities has evolved from the Text
Encoding Initiative and large-scale digital projects into a field in its own right [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. This
change has also entailed a shift in conceptual and methodological foundations. As
Schnapp and Presner state in the Digital Humanities Manifesto 2.0, “the first wave of
digital humanities work was quantitative, mobilizing the search and retrieval
powers of the database”. With the second wave, the focus has shifted to “qualitative,
interpretive and emotive” aspects, concentrating on “digital toolkits in the service
of the Humanities’ core methodological strengths” [
        <xref ref-type="bibr" rid="ref2 ref23">23, 2</xref>
        ]. As the volumes of
digital collections continue to grow, we are moving into the third wave, where the
emphasis is placed on search, retrieval, and analysis, focusing on “the underlying
computationality of the forms held within a computational medium” [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. With this
shift, traditional methods become increasingly ineffective, leading to a transition
from traditional close reading to distant reading analyses [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. As Matthew
Jockers affirms in his Macroanalysis: Digital Methods and Literary History, "massive
digital corpora offer us unprecedented access to the literary record and invite, even
demand, a new type of evidence gathering and meaning making" [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Built from
quantitative models and evolutionary theories, distant reading methods encouraged
the use of graphs and maps to interpret textual data [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. With recent advances
in computing, these methods have further evolved into more sophisticated
models involving machine learning algorithms for topic modeling and cluster analysis.
Despite these advances, the computational methods most commonly used in literary
studies remain basic and are limited to word frequencies, concordances,
and keyword-in-context displays [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Two obstacles contribute to this. First, many text processing tools require some
programming skills, which take time to learn and are often challenging for literary
scholars. Second, while some visualization tools (e.g. Voyant, Weka, and
PaperMachine) provide graphical user interfaces, social science and humanities researchers
seek more interactive and dynamic control of modeling, which can serve as
“holistic support for exploratory analysis” [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
      <p>
        In this paper, we propose to address these issues by integrating micro- and
macroanalyses with a dynamic interactive interface in which a researcher has
control over text analysis and visualization. To illustrate the application of such
techniques for digital humanities, we analyze the Medieval Occitan Romance of Flamenca,
translated into English by Blodgett [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>The remainder of this paper is organized as follows. In Section 2, we review
existing close reading and distant reading visualization tools. In Section 3, we
introduce Interactive Text Mining Suite and its functionalities. Section 4 describes
our visualization analysis of Romance of Flamenca. Our conclusion and future
development directions are presented in Section 5.</p>
      <p>2 Close Reading and Distant Reading: Visualization Tools</p>
      <p>
        The tradition of close reading is associated with American New Criticism
developed in the 1930s [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. The close textual analysis of individual texts was thought
of as a principle of order, demonstrating that literature was “an autonomous mode
of discourse with its own special ‘mode of existence’, distinct from that of
philosophy, politics, and history” [
        <xref ref-type="bibr" rid="ref9">9, 145</xref>
        ]. In contrast, distant reading, introduced
by Moretti [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], refers to “the construction of abstract models” [
        <xref ref-type="bibr" rid="ref21">21, 67</xref>
        ]. These
two terms, close and distant reading, are also denoted as micro- and macroanalysis
[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>2.1 Close Reading</p>
      <p>
        According to Jasinski [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], close reading helps unveil “words, verbal images,
elements of style, sentences, argument patterns, and entire paragraphs” [
        <xref ref-type="bibr" rid="ref14">14, 93</xref>
        ]. In
this textual analysis, scholars make use of color-coding, underlining and marginal
comments. To render close textual analysis digitally, several recent projects have
worked with color-coding, font size, glyphs, and connections, for example, Poem
Viewer [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], PRISM [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], Juxta [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] and eMargin [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Figure 1 offers a close
reading of Shakespeare’s Julius Caesar produced with eMargin, where words are
colored, tagged, and commented.
      </p>
      <p>
        Distant reading takes a reader from the exhaustive interpretation of individual
passages toward the global visualization of text collections. Drawing from quantitative
history and geography, Moretti [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] uses graphs, maps, and trees to analyze
historical novels. Since the publication of his work, a number of other visual methods
have been put forward in literary studies: tag clouds, heat maps, timelines, network
graphs as well as geographical maps [
        <xref ref-type="bibr" rid="ref13 ref7 ref8 ref9">13, 7-9</xref>
        ]. For example, word clouds have been
used to analyze the style of The Making of Americans [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and Federal Budget Speech
of Australia [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], whereas heat maps and network graphs were used to look at the
distribution and relationship of literary characters in novels [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
      </p>
      <p>
        Furthermore, advances in technology have made it possible to apply more
complex quantitative and visual analyses to literary studies: topic modeling,
summarization, and cluster classification, among others. Topic modeling identifies short
and informative descriptions of each text in a large collection. The main idea of this
model is that text collections are “represented as random mixtures over latent
topics, where each topic is characterized by a distribution over words” [
        <xref ref-type="bibr" rid="ref4">4, 996</xref>
        ]. Topic
modeling has been successfully applied to various text genres, e.g. news articles,
scientific abstracts, scientific papers, digital libraries, and Twitter [
        <xref ref-type="bibr" rid="ref10 ref12 ref4">4, 10, 12</xref>
        ]. The
common visualization of topics is a list of words associated with each topic and the
correlation between topics and documents (see Figure 2). While there exist many
tools and environments with topic algorithms, most of them require programming
skills. As Blei [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] points out, developing interactive user interfaces with
topic visualization is a future direction in the topic modeling field. Scholars in the social
sciences and humanities have also expressed a need for topic modeling in exploratory
literary analysis [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
      <p>(a) List of topics and words</p>
      <p>(b) Topic-document correlation</p>
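      <p>The mixture idea behind topic models can be made concrete with a small worked example. The sketch below is only illustrative: the vocabulary, the topic-word probabilities (PHI), and the document-topic proportions (THETA) are hypothetical toy values, not output of ITMS or of any fitted model. It computes the probability of each word in a document as the topic-weighted sum of the per-topic word probabilities, and lists the most probable words per topic, which is the kind of word list typically shown in topic visualizations.</p>
      <preformat>
```python
# Hypothetical toy values; not taken from the paper or from a fitted model.
VOCAB = ["love", "jealousy", "tower", "prayer"]

# PHI[k][w]: probability of word w under topic k (each row sums to 1).
PHI = [
    [0.6, 0.3, 0.05, 0.05],   # topic 0
    [0.05, 0.05, 0.5, 0.4],   # topic 1
]

# THETA[k]: proportion of topic k in one document (sums to 1).
THETA = [0.7, 0.3]


def word_distribution(theta, phi):
    """Return p(w | d) = sum over k of theta[k] * phi[k][w], for every word w."""
    n_words = len(phi[0])
    return [
        sum(theta[k] * phi[k][w] for k in range(len(theta)))
        for w in range(n_words)
    ]


def top_words(phi_row, vocab, n=2):
    """List the n most probable words of one topic."""
    ranked = sorted(zip(phi_row, vocab), reverse=True)
    return [word for _, word in ranked[:n]]


if __name__ == "__main__":
    for word, p in zip(VOCAB, word_distribution(THETA, PHI)):
        print(f"{word}: {p:.3f}")
    for k, row in enumerate(PHI):
        print(f"topic {k}: {top_words(row, VOCAB)}")
```
      </preformat>
      <p>In a real analysis these two tables would be estimated from the text collection by a topic modeling algorithm rather than written by hand.</p>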
      <p>
        The second technique, cluster classification, refers to an automatic algorithm
that groups documents into subgroups. These subgroups, or clusters, “are coherent
internally, but clearly different from each other” [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. A common visualization of
clusters is a dendrogram, in which individual texts are merged agglomeratively
according to a distance measure of their similarity, as illustrated in Figure 3.
      </p>
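      <p>As a minimal sketch of how such a dendrogram is built, the following code performs average-linkage agglomerative clustering over bag-of-words vectors with cosine distance. The choice of cosine distance and average linkage, the function names, and the three toy passages in SECTIONS are assumptions made for illustration; the paper does not state which measures ITMS uses. The returned merge history is the information a dendrogram plots.</p>
      <preformat>
```python
import math
from collections import Counter

# Hypothetical toy passages, invented for illustration only.
SECTIONS = {
    "arrival": "william arrives and seeks a way to meet flamenca",
    "solution": "william seeks a way and finds a way to meet flamenca",
    "jealousy": "archambaut locks his wife in the tower out of jealousy",
}


def cosine_distance(a, b):
    """1 - cosine similarity between two word-count Counters."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def agglomerate(texts):
    """Average-linkage agglomerative clustering.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    vectors = {name: Counter(t.lower().split()) for name, t in texts.items()}
    clusters = [(name,) for name in texts]
    merges = []

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        pairs = [(x, y) for x in c1 for y in c2]
        total = sum(cosine_distance(vectors[x], vectors[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge it.
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], round(dist, 3)))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


if __name__ == "__main__":
    for left, right, dist in agglomerate(SECTIONS):
        print(left, "+", right, "at distance", dist)
```
      </preformat>
      <p>With these toy passages, the two passages that share vocabulary are merged first, and the unrelated passage joins last at the maximum distance.</p>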
      <p>
        In a recent survey of close and distant reading visualization tools, Jänicke et al.
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] observe a large body of work that combines both types of analysis. Despite
the increasing interest in macroanalysis, close interaction between a reader and a
text remains essential to humanities scholars. As Coles and Lein [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] state, there is “an urgent
desire in the literary community to embrace and explore the power of
computation while at the same time prioritizing and protecting the relationship between
literature and human readers”.
      </p>
      <p>3 Interactive Text Mining Suite</p>
      <p>
        The purpose of the Interactive Text Mining Suite<sup>2</sup> (henceforth, ITMS) is to provide
a dynamic exploration of text collections, while maintaining interaction between
scholars and literary passages. The ITMS is built with R as the back end and a Shiny
app as the front end. The Shiny app consists of two R scripts, namely
server.R and ui.R: server.R hosts all functions, whereas ui.R provides the
graphical user interface. The use of the Shiny web framework for text analysis has several
advantages. First, as a web application, ITMS is platform-independent and does
not require installation, unlike many other text mining tools. Second, as an
R application, ITMS has access to a range of state-of-the-art text analytical,
statistical, and graphical packages (e.g., lda, topicmodels, ggplot2, wordcloud, tm).
Third, Shiny is designed for building highly interactive and user-friendly
interfaces. Finally, the performance of the application does not depend on the
performance and memory of the user's local system, thus providing a more suitable environment for
data analysis.
      </p>
      <p>The ITMS aims to bridge the gap between close reading and distant reading.
The user has a dynamic bottom-up control of text selection and choices of
exploratory analyses. In this approach, researchers can select a specific section of a
text or extract certain segments based on keyword-in-context (KWIC) term selection. In addition, the
ITMS allows users to upload or extract metadata (e.g. timestamp, location, language),
which can be used for a chronological analysis.</p>
      <p>At present, several interactive text processing functions are built into the ITMS,
namely stemming, stopword removal, and tokenization. At each step, the reader is able to
access selected passages in order to decide which processing techniques to use.
Finally, the reader can apply various text mining and visualization methods. For
example, users can analyze word distribution, generate word frequency graphs, and perform
cluster and topic analyses.
      </p>
      <p>4 Case Study: Visualization of Medieval Romance Flamenca</p>
      <p>
        For our study, we have selected 1000 lines from the annotated corpus of the Medieval
Occitan Romance of Flamenca [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] and its English translation [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. While
traditional visualization tools are unable to perform text analysis using annotated corpora, our
goal is to combine rich linguistic knowledge from an annotated corpus with
macroanalysis. First, we can compare both documents at the word
level using the word cloud method. From this analysis, it becomes apparent that verb
forms (VJ) dominate in the original text, whereas pronouns (PRP) prevail in its
translation. Surprisingly, despite the nature of this novel, common nouns and
proper nouns do not seem to be prevalent (see Figure 4).
      </p>
      <p>2 https://languagevariationsuite.shinyapps.io/TextMining/</p>
      <p>(a) Occitan corpus</p>
      <p>(b) English translation</p>
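      <p>The part-of-speech comparison described above amounts to counting tags in an annotated text and using the relative frequencies as word-cloud weights. The sketch below assumes a simple whitespace-separated word/TAG input format, which is an assumption made for illustration and not necessarily the format of the annotated Flamenca corpus. The sample string is invented, and its tag labels are placeholders borrowed from the tags mentioned in the text.</p>
      <preformat>
```python
from collections import Counter

# Invented sample in an assumed word/TAG format; tags are placeholders.
SAMPLE = "she/PRP loves/VJ him/PRP there/EX is/VJ not/NEG"


def tag_frequencies(annotated_text):
    """Count POS tags in whitespace-separated word/TAG tokens."""
    counts = Counter()
    for token in annotated_text.split():
        # rpartition splits on the last slash, so slashes inside a word survive.
        word, sep, tag = token.rpartition("/")
        if sep and word and tag:
            counts[tag] += 1
    return counts


def relative_frequencies(counts):
    """Turn raw counts into proportions, usable as word-cloud weights."""
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}


if __name__ == "__main__":
    counts = tag_frequencies(SAMPLE)
    for tag, share in sorted(relative_frequencies(counts).items()):
        print(f"{tag}: {counts[tag]} ({share:.2f})")
```
      </preformat>
      <p>Running the same count on the original and on the translation gives two tag profiles that can be compared side by side.</p>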
      <p>Second, we can examine the use of certain POS tags by using a keyword-in-context
search. To illustrate the potential of this method, we have queried our English
text for existential (EX) and negation (NEG) POS tags. For example, the use of
existential (there is) is concentrated in the second part of the novel and corresponds
to the nuptial preparation (Figure 5a). The close examination of the context shows
that this section provides many existential constructions describing the bounty of
count Archambaut. On the other hand, the negation (not) is present across the
entire selection; however, it seems to be more concentrated toward the end, which
corresponds to the growing jealousy of Flamenca’s husband (Figure 5b).</p>
      <p>(a) Existential pos</p>
      <p>(b) Negation pos</p>
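      <p>A keyword-in-context search over POS tags can be sketched as follows. The function name kwic_by_tag, the (word, tag) pair representation, and the toy token list are hypothetical and chosen for illustration; they are not the ITMS implementation. Each hit is returned with its relative position in the text, which is what allows hits to be located in an earlier or later part of the narrative.</p>
      <preformat>
```python
# Invented toy tokens as (word, tag) pairs; tags other than EX and NEG are placeholders.
TOKENS = [
    ("there", "EX"), ("is", "VB"), ("gold", "NN"), ("and", "CC"),
    ("he", "PRP"), ("does", "VB"), ("not", "NEG"), ("trust", "VB"),
    ("her", "PRP"),
]


def kwic_by_tag(tokens, target_tag, window=3):
    """Return (position, left context, keyword, right context) for each hit.

    position is the relative location of the hit (0.0 = start, 1.0 = end).
    """
    hits = []
    last = max(len(tokens) - 1, 1)
    for i, (word, tag) in enumerate(tokens):
        if tag == target_tag:
            left = " ".join(w for w, _ in tokens[max(0, i - window):i])
            right = " ".join(w for w, _ in tokens[i + 1:i + 1 + window])
            hits.append((round(i / last, 2), left, word, right))
    return hits


if __name__ == "__main__":
    for tag in ("EX", "NEG"):
        for position, left, keyword, right in kwic_by_tag(TOKENS, tag, window=2):
            print(f"{tag} at {position}: {left} [{keyword}] {right}")
```
      </preformat>
      <p>Plotting the position values of all hits along one axis gives a dispersion view of where a tag concentrates in the text.</p>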
      <p>Furthermore, we can examine stylistic similarities and differences at the sentence
level. First, the peak of sentence length in both documents seems to be concentrated
around 10 words and the overall distribution has a similar shape (see Figure 7). In
contrast, the frequencies with which these lengths occur differ between the two texts.
Similarly, the choice of punctuation differs noticeably between the original and its
translation. Inspired by Adam Calhoun’s punctuation heatmap, we
assigned colors to specific types of punctuation in order to detect usage patterns.
The heatmap analysis reveals that the original text contains more quotation marks,
hyphens, and parentheses than the translated text.</p>
      <p>(a) Occitan corpus</p>
      <p>(b) English translation</p>
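      <p>Both sentence-level measures can be derived from plain text with a few lines of code. The sketch below uses a naive sentence splitter on sentence-final punctuation, which is a simplifying assumption and will mis-split abbreviations and quoted speech; the function names, the punctuation inventory, and the sample text are likewise invented for illustration. It returns the distribution of sentence lengths and the ordered sequence of punctuation marks, which is the input a color-coded punctuation heatmap needs.</p>
      <preformat>
```python
import re
from collections import Counter

# Assumed inventory of punctuation marks to track.
PUNCTUATION = set(".,;:!?\"'()-")

# Invented sample text.
SAMPLE = "He came. She stayed at home, alone. He left (at dawn)!"


def sentence_lengths(text):
    """Split on sentence-final punctuation and count words per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def length_distribution(text):
    """Map sentence length (in words) to the number of such sentences."""
    return Counter(sentence_lengths(text))


def punctuation_sequence(text):
    """Keep only punctuation marks, in their order of occurrence."""
    return [ch for ch in text if ch in PUNCTUATION]


if __name__ == "__main__":
    print(sorted(length_distribution(SAMPLE).items()))
    print("".join(punctuation_sequence(SAMPLE)))
```
      </preformat>
      <p>Assigning one color per punctuation mark and laying the sequence out on a grid yields a heatmap in which the original and the translation can be compared visually.</p>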
      <p>
        Finally, we can examine the entire novel at the document level by using cluster
analysis and topic modeling methods. In order to visually detect similarities in
story development, we have split the novel into seven sections, based on the story plot,
namely marriage, jealousy, William’s arrival, planning how to meet Flamenca,
finding a solution, first meeting with Flamenca, and escape from the tower. Cluster analysis
demonstrates the similarities between William’s arrival and William’s search for
an escape solution, as well as between their first conversation and Flamenca’s escape.
Furthermore, topic visualization by means of word clouds helps unveil several
underlying themes: love, jealousy, Archambaut and William, prayer, and Flamenca.
      </p>
      <p>5 Conclusion</p>
      <p>
        In recent years, we have seen growing interest in the construction of global features
and visual abstract models of text collections. Many scholars, however, have
expressed the need for a more integrated approach: the “synthesis of computational
and humanistic modes of inquiry” [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. To incorporate this approach, we
have proposed a bottom-up application for textual
analysis and visualization. The current project, Interactive Text Mining Suite, aims to
provide interactive control over text preprocessing and analysis. This method supports
a more meaningful and fine-grained exploration of a corpus. Given the
multifaceted nature of literary research, we have also designed our
graphical user interface to reflect the type of material studied: scholarly articles, literary genres,
bibliographical metadata, and annotated corpora. Finally, the accessibility of our web
application facilitates data analysis, as researchers are not constrained by memory
limitations or platform dependency.
      </p>
      <p>There are several developments that we foresee for our project. Given
its design flexibility and back-end structure written in R, this toolkit can be easily
augmented with additional features. For example, our exploratory analysis can
be enhanced with dynamic network graphs and dynamic diachronic mapping (e.g.
the igraph and googleVis packages). Another development could be stylometric analysis,
such as genre and authorship identification, provided by the recent R package stylo.<sup>4</sup></p>
      <p>4 The authors would like to thank an anonymous reviewer for this suggestion.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Abdul-Rahman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Coles</surname>
          </string-name>
          , E. Maguire, M. Meyer, M. Wynne,
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Johnson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Trefethen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <article-title>Rule-based Visual Mappings - With a Case Study on Poetry Visualization</article-title>
          .
          <source>Computer Graphics Forum</source>
          ,
          <volume>32</volume>
          (
          <issue>3</issue>
          PART4):
          <fpage>381</fpage>
          -
          <lpage>390</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>David M.</given-names>
            <surname>Berry</surname>
          </string-name>
          .
          <article-title>The computational turn: Thinking about the digital humanities</article-title>
          .
          <source>Culture Machine</source>
          ,
          <volume>12</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>David</given-names>
            <surname>Blei</surname>
          </string-name>
          .
          <article-title>Probabilistic topic models</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>55</volume>
          (
          <issue>4</issue>
          ):
          <fpage>77</fpage>
          -
          <lpage>84</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>David</given-names>
            <surname>Blei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Ng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Jordan</surname>
          </string-name>
          .
          <article-title>Latent Dirichlet Allocation</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          , pages
          <fpage>993</fpage>
          -
          <lpage>1022</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E.D.</given-names>
            <surname>Blodgett</surname>
          </string-name>
          .
          <source>The Romance of Flamenca</source>
          . Garland, New York,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Tanya</given-names>
            <surname>Clement</surname>
          </string-name>
          , Catherine Plaisant, and
          <string-name>
            <given-names>Romain</given-names>
            <surname>Vuillemot</surname>
          </string-name>
          .
          <article-title>The Story of One: Humanity scholarship with visualization and text analysis</article-title>
          .
          <source>Relation</source>
          ,
          <volume>10</volume>
          (
          <issue>1</issue>
          .43):
          <fpage>84</fpage>
          -
          <lpage>85</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Katherine</given-names>
            <surname>Coles</surname>
          </string-name>
          and
          <string-name>
            <given-names>Julie Gonnering</given-names>
            <surname>Lein</surname>
          </string-name>
          .
          <article-title>Solitary mind, collaborative mind: Close reading and interdisciplinary research</article-title>
          .
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Stephen</given-names>
            <surname>Dann</surname>
          </string-name>
          .
          <article-title>Analysis of the 2008 federal budget speech: Policy, politicking and marketing messages</article-title>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Gerald</given-names>
            <surname>Graff</surname>
          </string-name>
          .
          <source>Professing Literature: An Institutional History</source>
          . University of Chicago Press,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Thomas L.</given-names>
            <surname>Griffiths</surname>
          </string-name>
          and
          <string-name>
            <given-names>Mark</given-names>
            <surname>Steyvers</surname>
          </string-name>
          .
          <article-title>Finding Scientific Topics</article-title>
          .
          <source>Proceedings of the National Academy of Sciences of the United States of America</source>
          ,
          <volume>101</volume>
          pages
          <fpage>5228</fpage>
          -
          <lpage>35</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Susan</given-names>
            <surname>Hockey</surname>
          </string-name>
          .
          <article-title>The history of humanities computing</article-title>
          . In Susan Schreibman, Ray Siemens, and John Unsworth, editors, A companion to Digital Humanities, pages
          <fpage>3</fpage>
          -
          <lpage>19</lpage>
          . Blackwell Publishing, Oxford,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Liangjie</given-names>
            <surname>Hong</surname>
          </string-name>
          and
          <string-name>
            <given-names>Brian D.</given-names>
            <surname>Davison</surname>
          </string-name>
          .
          <article-title>Empirical Study of Topic Modeling in Twitter</article-title>
          .
          <source>Proceedings of the First Workshop on Social Media Analytics</source>
          , pages
          <fpage>80</fpage>
          -
          <lpage>88</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Jänicke</surname>
          </string-name>
          , Greta Franzini,
          <string-name>
            <given-names>Muhammad F.</given-names>
            <surname>Cheema</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Gerik</given-names>
            <surname>Scheuermann</surname>
          </string-name>
          .
          <article-title>On Close and Distant Reading in Digital Humanities: A Survey and Future Challenges</article-title>
          .
          <source>Eurographics Conference on Visualization (EuroVis)</source>
          (
          <year>2015</year>
          ), pages
          <fpage>1</fpage>
          -
          <lpage>21</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>James</given-names>
            <surname>Jasinski</surname>
          </string-name>
          .
          <source>Sourcebook on Rhetoric</source>
          . SAGE Publications
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Paul</given-names>
            <surname>Jay</surname>
          </string-name>
          .
          <source>The Humanities "Crisis" and the Future of Literary Studies</source>
          . Palgrave Macmillan US
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Kim</given-names>
            <surname>Jensen</surname>
          </string-name>
          .
          <article-title>Linguistics in the Digital Humanities: (Computational) Corpus Linguistics</article-title>
          .
          <source>MedieKultur: Journal of media and communication research</source>
          ,
          <volume>30</volume>
          (
          <issue>57</issue>
          ),
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Matthew L.</given-names>
            <surname>Jockers</surname>
          </string-name>
          .
          <source>Macroanalysis: Digital Methods and Literary History</source>
          . Topics in the Digital Humanities
          . University of Illinois Press, Urbana, IL, USA,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Lauren</given-names>
            <surname>Klein</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Eisenstein</surname>
          </string-name>
          .
          <article-title>Reading Thomas Jefferson with TopicViz: Towards a Thematic Method for Exploring Large Cultural Archives</article-title>
          .
          <source>Scholarly and Research Communication</source>
          ,
          <volume>4</volume>
          (
          <issue>3</issue>
          ),
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Lauren</given-names>
            <surname>Klein</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Eisenstein</surname>
          </string-name>
          .
          <article-title>Reading Thomas Jefferson with TopicViz: Towards a Thematic Method for Exploring Large Cultural Archives</article-title>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <source>An Introduction to Information Retrieval</source>
          .
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Franco</given-names>
            <surname>Moretti</surname>
          </string-name>
          .
          <source>Graphs, Maps, Trees: Abstract Models for a Literary History</source>
          . Verso
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Daniela</given-names>
            <surname>Oelke</surname>
          </string-name>
          , Dimitrios Kokkinakis, and
          <string-name>
            <given-names>Mats</given-names>
            <surname>Malm</surname>
          </string-name>
          .
          <article-title>Advanced visual analytics methods for literature analysis</article-title>
          .
          <source>Proceedings of the 6th EACL Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities</source>
          , pages
          <fpage>35</fpage>
          -
          <lpage>44</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Schnapp</surname>
          </string-name>
          and
          <string-name>
            <given-names>Peter</given-names>
            <surname>Presner</surname>
          </string-name>
          .
          <source>Digital Humanities Manifesto 2.0</source>
          .
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Olga</given-names>
            <surname>Scrivner</surname>
          </string-name>
          , Sandra Kübler, Barbara Vance, and
          <string-name>
            <given-names>Eric</given-names>
            <surname>Beuerlein</surname>
          </string-name>
          .
          <article-title>Le Roman de Flamenca: An Annotated Corpus of Old Occitan</article-title>
          .
          <source>In the 3rd Workshop on Annotation of Corpora for Research in the Humanities (ACRH-3)</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Brandon</given-names>
            <surname>Walsh</surname>
          </string-name>
          , Claire Maiers, Gwen Nally, Jeremy Boggs, and
          <string-name>
            <given-names>P.P.</given-names>
            <surname>Team</surname>
          </string-name>
          .
          <article-title>Crowdsourcing individual interpretations: Between microtasking and macrotasking</article-title>
          .
          <source>Literary and Linguistic Computing</source>
          ,
          <volume>29</volume>
          (
          <issue>3</issue>
          ):
          <fpage>379</fpage>
          -
          <lpage>386</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Dana</given-names>
            <surname>Wheeles</surname>
          </string-name>
          and
          <string-name>
            <given-names>Kristin</given-names>
            <surname>Jensen</surname>
          </string-name>
          .
          <article-title>Juxta commons</article-title>
          .
          <source>In the Digital Humanities</source>
          <year>2013</year>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>