<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Automated Korean Poetry Generation Using LSTM Autoencoder</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eun-Soon You</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Soohwan Kang</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Su-Yeon O</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Cultural Contents and Management, Inha University</institution>
          ,
          <addr-line>100 Inha-ro, Michuhol-gu, Incheon, 22212</addr-line>
          <country>Republic of Korea</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of French Language and Culture, Inha University</institution>
          ,
          <addr-line>100 Inha-ro, Michuhol-gu, Incheon, 22212</addr-line>
          <country>Republic of Korea</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Korean Studies, Inha University</institution>
          ,
          <addr-line>100 Inha-ro, Michuhol-gu, Incheon, 22212</addr-line>
          <country>Republic of Korea</country>
        </aff>
      </contrib-group>
      <fpage>3</fpage>
      <lpage>10</lpage>
      <abstract>
        <p>Automatically composing poems is considered a highly challenging task, and it has received increasing attention across various fields. The computer has to ensure readability, meaningfulness, and vocabulary adequacy as well as the semantics of a poet's work, which are otherwise realized through the poet's imagination and inspiration. In this study, a model for Korean poem generation based on long short-term memory (LSTM) is proposed with the aim of creating poems that imitate the writing style of four poets who represent Korea's modern poetry. To do this, 1000 poems, including 400 by the target poets, were collected, and the poets' styles were defined using natural language processing (NLP). Following this, each sentence of the poems was preprocessed, and training was performed using LSTM. When a user selects the desired poet and enters a keyword, the model automatically generates a poem based on that poet's style. The poems produced showed some errors in syntactic structure and semantic delivery, but they successfully reproduced the characteristic vocabulary and emotions of the poet.</p>
      </abstract>
      <kwd-group>
        <kwd>Poem Generation</kwd>
        <kwd>Natural Language Generation</kwd>
        <kwd>Long Short-Term Memory (LSTM)</kwd>
        <kwd>Writing Style</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>The question of whether a computer is capable of writing text with creative features such
as poetry (and if so, how it will differ from human creation) has become a popular one
among researchers in the fields of natural language generation (NLG), computational
creativity, and, broadly, artificial intelligence (AI). While a significant number of studies
have long attempted to achieve automatic responses to a series of questions, automatic
poetry generation has been receiving increasing attention in recent times. This is a
challenging research area, because it requires a very high level of skill to satisfy the
formal conditions and content of the poem.</p>
      <p>
        Automated poetry creation not only indicates technological progress but also represents
a new creative approach that is entirely different from the existing concepts and
principles of poetry creation. The shift in the perception of poetry creation methods dates
back to 1920, before computers were available. It can be observed in a poem by
Tristan Tzara, a poet who participated in Dadaism, a new European art movement
of the early 20th century. His poem [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] is as follows:
      </p>
      <sec id="sec-1-1">
        <title>To make a Dadaist poem:</title>
        <p>Take a newspaper.</p>
        <p>Take a pair of scissors.</p>
        <p>Choose an article as long as you are planning to make your poem.</p>
        <p>Cut out the article.</p>
        <p>Then cut out each of the words that make up this article and put them in a bag.
Shake it gently.</p>
        <p>Then take out the scraps one after the other in the order in which they left the
bag.</p>
        <p>Copy conscientiously.</p>
        <p>The poem will be like you.</p>
        <p>And here are you a writer, infinitely original and endowed with a sensibility that
is charming though beyond the understanding of the vulgar.</p>
        <p>Tzara’s poem, published in 1920, proposed a complete departure from traditional
poetry. In it, the poet gathers words and combines them according to fixed
rules; the poet's intentions, feelings, and any sense of causality are absent. This recalls the definition
of an algorithm: a procedure for solving a problem, or a sequence of steps for
performing a task.</p>
        <p>
          With the advent of computers, experimental attempts were made to generate poems.
Bailey [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] suggested semi-automatic poem generation, emphasizing the potential of
computer use in poem creation. The French Atelier of Literature Assisted by Maths
and Computers (ALAMO) group [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] proposed ‘rimbaudelaires,’ a method to combine
existing poems in order to create new ones, in which the structure of a poem by Rimbaud
was filled with the vocabulary of Baudelaire’s poems.
        </p>
        <p>The use of deep learning algorithms such as recurrent neural networks (RNNs)
and long short-term memory (LSTM) in computational creativity has advanced automatic poetry
generation. In this context, the present study proposes a Korean poetry generation model
based on deep learning that aims to imitate the writing style of a particular poet.</p>
        <p>Four poets who represent Korea’s modern poetry and remain popular among Koreans
were selected: Kim So Wol, Yoon Dong-Joo, Baek Seok, and Jeong Ji-yong.
These poets wrote noteworthy free verse and lyric poetry
during the Japanese occupation.</p>
        <p>The rest of the paper is organized as follows. Section 2 reviews previous work.
Section 3 describes the approach adopted in our experiments. Section 4 presents the
evaluation. Section 5 concludes the paper and outlines future work.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2 Related Work</title>
      <p>
        While Bailey [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] demonstrated the possibility of computer-generated poetry, Gervás [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
marked the beginning of automatic poetry generation, and various studies have been
conducted since. Wu et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] proposed a poem generation system based on a haiku phrase
corpus. When the user enters a word or phrase, the system finds expressions containing
it in the corpus and creates a poem by combining them. Manurung et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
developed a system using genetic algorithms. This poetry generation system, called
McGonagall, uses stochastic search to find, among several candidate poems, one that is
grammatical and conveys a clear meaning. Das and Gambäck [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] presented a
syllable-based poetry generator. When a user enters a sentence, the syllabification engine
analyzes its rhythm and generates an appropriate sentence to follow it.
      </p>
      <p>
        Recently, along with advances in machine learning, poetry generation using deep
learning has emerged. Wang et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] proposed a machine poetry generator based
on LSTM that imitates the writing style of the Chinese poet Du Fu. Given the first character,
this model produces a poem reflecting the tone and rhythm of Du Fu’s poems. Zugarini
et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] also suggested a system using LSTM to generate tercets, a verse form characteristic of
Dante Alighieri.
      </p>
      <p>
        Poetry generation is being actively researched in various languages such as English, Japanese, Chinese,
and Italian. Regarding Korean poem generation, Park et al.
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] introduced a model for generating poems using Sequence Generative Adversarial
Networks (SeqGAN). Research on Korean poem generation, however, is just beginning. In this context, we
present a Korean poetry generation model based on deep learning that aims to imitate the
style of a particular poet.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3 Approaches</title>
      <sec id="sec-3-1">
        <title>3.1 Experimental Workflow</title>
        <p>The overall experimental workflow is shown in Figure 1. At the beginning of the experiment,
we collected 400 poems written by the four poets. The body of work of the target authors alone
is usually not enough to successfully train deep neural networks, so we assembled a
total of 1000 pieces by adding other poems written during the Japanese colonial period.
Before training the LSTM, we removed archaic language, Chinese characters, and similar elements
from the poems during preprocessing and then numbered each line of each poem.
After training, when a user enters a keyword and clicks a poet, a poem is generated.</p>
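        <p>The preprocessing step can be illustrated with a minimal sketch. It is not the authors' code: the function name <monospace>preprocess_poem</monospace>, the Unicode ranges used to detect Chinese characters, and the sample input with a parenthesized gloss are all assumptions made for illustration. Removal of archaic language is not attempted here, because the text does not say how it was done.</p>
        <preformat>
```python
import re

# Assumed definition of "Chinese character" for this sketch: CJK Unified
# Ideographs (Extension A and the basic block) plus Compatibility Ideographs.
HANJA = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff\uf900-\ufaff]")


def preprocess_poem(text):
    """Strip Chinese characters and number the remaining non-empty lines."""
    numbered = []
    for raw in text.splitlines():
        line = HANJA.sub("", raw)
        # Drop parentheses that became empty once their contents were removed.
        line = re.sub(r"\(\s*\)", "", line)
        # Collapse runs of whitespace and trim both ends.
        line = " ".join(line.split())
        if line:
            # Line numbers start at 1 and skip blank lines.
            numbered.append((len(numbered) + 1, line))
    return numbered


# Illustrative input only: two lines with a made-up parenthesized gloss.
sample = "산(山)에는 꽃 피네\n\n꽃이 피네"
for number, line in preprocess_poem(sample):
    print(number, line)
```
        </preformat>
        <p>In this sketch, blank lines are skipped before numbering, so line numbers stay contiguous within a poem. Whether the original pipeline numbered lines this way is not stated.</p>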
      </sec>
      <sec id="sec-3-2">
        <title>3.2 Stylistic Analysis</title>
        <p>We carried out a quantitative analysis of the poetry texts to define the style of each poet. To
do this, we extracted high-frequency vocabulary using part-of-speech (POS) tagging
and built a word cloud of the most frequent words, as shown in Figure 2. In addition, we
analyzed the co-occurrence patterns of words using bigrams.</p>
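        <p>As a rough illustration of this kind of analysis, the sketch below counts frequent words per POS tag and adjacent word pairs (bigrams). It assumes the lines have already been POS-tagged; the sample tokens, the tag names, and the helper names <monospace>top_words</monospace> and <monospace>bigram_counts</monospace> are hypothetical, and the tagger actually used is not named in the text.</p>
        <preformat>
```python
from collections import Counter

# Hypothetical, already POS-tagged poem lines as (word, tag) pairs.
# Glosses: 밤 night, 하늘 sky, 별 star, 슬픈 sad, 나 I.
tagged_lines = [
    [("밤", "Noun"), ("하늘", "Noun"), ("별", "Noun")],
    [("슬픈", "Adjective"), ("밤", "Noun"), ("하늘", "Noun")],
    [("나", "Pronoun"), ("별", "Noun"), ("밤", "Noun"), ("하늘", "Noun")],
]


def top_words(lines, tag, n):
    """Return the n most frequent words that carry the given POS tag."""
    counts = Counter(word for line in lines for word, t in line if t == tag)
    return counts.most_common(n)


def bigram_counts(lines):
    """Count adjacent word pairs within each line, never across lines."""
    counts = Counter()
    for line in lines:
        words = [word for word, _ in line]
        counts.update(zip(words, words[1:]))
    return counts


print(top_words(tagged_lines, "Noun", 2))
print(bigram_counts(tagged_lines).most_common(1))
```
        </preformat>
        <p>The frequency counts are the kind of input a word cloud is drawn from, and the bigram counts give a simple view of which words tend to co-occur.</p>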
        <p>The stylistic analysis revealed several characteristics.
First, vocabulary related to nature appears frequently in the poems of the selected
poets; second, the primary emotion in their poetry is sorrow; and third, sensory
expressions representing nature are often used. Finally, the first-person pronoun
‘I’ and determiners such as ‘this’ occur with high frequency. Table 1 shows the results of the stylistic
analysis.</p>
        <table-wrap id="tab1">
          <label>Table 1</label>
          <caption>
            <p>The results of the stylistic analysis.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Categories</th><th>Authors</th><th>Examples</th></tr>
            </thead>
            <tbody>
              <tr><td rowspan="4">Nature</td><td>윤동주 (Yoon Dong-Joo)</td><td>밤 (night), 하늘 (sky), 별 (star), etc.</td></tr>
              <tr><td>정지용 (Jeong Ji-yong)</td><td>바다 (sea), 물 (water), etc.</td></tr>
              <tr><td>김소월 (Kim So Wol)</td><td>산 (mountain), 나무 (tree), etc.</td></tr>
              <tr><td>백석 (Baek Seok)</td><td>새 (bird), 개구리 (frog), etc.</td></tr>
              <tr><td>Emotion</td><td></td><td>슬프 (sad), 외롭 (alone), 괴롭 (painful), 서럽 (sorrow), etc.</td></tr>
              <tr><td>Sense</td><td></td><td>붉은 (red), 푸른 (blue), 밝은 (bright), 검은 (black), 높은 (high), 뜨거운 (hot), etc.</td></tr>
              <tr><td>Pronoun</td><td></td><td>나 (I)</td></tr>
              <tr><td>Determiner</td><td></td><td>이 (this), 그 (that), etc.</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <sec id="sec-3-2-4">
          <p>We present an approach based on LSTM to generate Korean poems with a specific
style. As shown in Figures 3 and 4, we constructed a word-level encoder-decoder LSTM
network, fed it the author’s name, the poem title, and the line number within the poem, and trained it
to minimize the difference between the target sentence and the sentence generated by
the network.</p>
          <p>When the sequence (author name, poem title, line number) enters the encoder
network, the network performs word embedding and feeds the result to the LSTM to
extract the features of the input sequence. The decoder network applies an attention
mechanism to learn the relationship between input sequences and output sentences.
The hidden size of each network is 256, and each network has 3 layers. The maximum
sequence length used in training is 30 words. All LSTM networks
were initialized to zero before training.</p>
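          <p>To make the input and output of the network concrete, the following sketch builds (encoder input, decoder target) pairs from one poem and records the stated hyperparameters. It covers data preparation only and is not the authors' implementation: the names <monospace>ModelConfig</monospace> and <monospace>build_pairs</monospace>, the whitespace tokenization, and the choice to truncate only the target to the maximum length are assumptions.</p>
          <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hyperparameters stated in the text; the field names are assumptions."""
    hidden_size: int = 256   # hidden size of each LSTM network
    num_layers: int = 3      # depth of each LSTM network
    max_len: int = 30        # maximum sequence length in words


def build_pairs(author, title, lines, config):
    """Turn one poem into (encoder_input, decoder_target) token lists."""
    pairs = []
    for number, line in enumerate(lines, start=1):
        # Encoder input: author name, poem title words, line number.
        source = [author] + title.split() + [str(number)]
        # Decoder target: the words of that line, cut to the maximum length.
        # Whitespace splitting stands in for real word-level tokenization.
        target = line.split()[: config.max_len]
        pairs.append((source, target))
    return pairs


config = ModelConfig()
poem_lines = ["산에는 꽃 피네", "꽃이 피네"]
for source, target in build_pairs("김소월", "산유화", poem_lines, config):
    print(source, target)
```
          </preformat>
          <p>Each pair corresponds to one training example: the encoder reads the conditioning sequence and the decoder is trained to reproduce the matching line of the poem.</p>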
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4 Evaluation</title>
      <sec id="sec-4-1">
        <title>4.1 Syntactic and semantic errors</title>
        <p>
          Some researchers have suggested criteria for evaluating automatically composed texts.
For example, Manurung et al. [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] introduced grammaticality, meaningfulness, and
poeticness, while Stent et al. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] proposed adequacy, fluency, readability, and variation.
        </p>
        <p>In the present study, readability, meaningfulness, and grammaticality were adopted,
and five evaluators chose 60 poems out of 1000 based on these three criteria. However,
some errors were found even in the chosen poems, the most prominent of which were syntactic
and semantic errors. In Korean, adjectives are generally placed before nouns, yet
sentences that violate this syntactic rule were found. Additionally, in some cases,
the meaning of a sentence could not be understood despite the absence of syntactic errors.
Addressing these problems will require a significant amount of further study.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2 Imitating the style of a poet</title>
        <p>When a user clicks on one of the four poets and enters the desired keyword, a new poem
reflecting the poet’s style is created. Even if the same keyword is entered repeatedly,
a new result is generated each time. The reproduction of the poets’ vocabulary
and emotions was analyzed in the 60 poems. Words related to nature, such as rivers, skies,
mountains, and the sea, appeared in the titles and content of the poems, and emotional
words representing sorrow and loneliness were used. However, some meaninglessly
repeated words reduced the readability of the poems.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5 Conclusion</title>
      <p>Automatic poetry composition is a highly challenging problem because poetry is a
genre of literature that expresses human imagination and creativity. Several studies have
been conducted to generate poems in various languages such as English, Chinese, and
Japanese. In this paper, an LSTM-based approach to producing Korean poems with a
specific style is presented. When a user enters a keyword and clicks on a poet, the model
creates a poem that reflects the poet’s writing style. Korean poem generation research is
just beginning; this work is expected to contribute to Korean text generation research.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgment</title>
      <p>This research was supported by the Korea Creative Content Agency, under the Ministry
of Culture, Sports and Tourism.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bailey</surname>
          </string-name>
          , R.W.:
          <article-title>Computer-assisted poetry: the writing machine is for everybody</article-title>
          . Computers and the Humanities pp.
          <fpage>283</fpage>
          -
          <lpage>295</lpage>
          (
          <year>1974</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Collectifs</surname>
          </string-name>
          , G.: Atlas de Littérature Potentiel. Folio essais, Gallimard, Paris, France (
          <year>Jan 1988</year>
          ), https://www.ebook.de/de/product/10458470/gall_collectifs_atlas_de_litt_potentiel.html
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gambäck</surname>
            ,
            <given-names>B.:</given-names>
          </string-name>
          <article-title>Poetic machine: Computational creativity for automatic poetry generation in Bengali</article-title>
          . In: Colton,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Ventura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Lavrac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Cook</surname>
          </string-name>
          , M. (eds.)
          <source>Proceedings of the 5th International Conference on Computational Creativity (ICCC</source>
          <year>2014</year>
          ). pp.
          <fpage>230</fpage>
          -
          <lpage>238</lpage>
          . computationalcreativity.net, Ljubljana, Slovenia (Jun
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Gervás</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>WASP: Evaluation of different strategies for the automatic generation of Spanish verse</article-title>
          .
          <source>In: Time for AI and Society - Proceedings of the AISB Symposium on Creative &amp; Cultural Aspects and Applications of AI &amp; Cognitive Science</source>
          . pp.
          <fpage>93</fpage>
          -
          <lpage>100</lpage>
          . Birmingham, UK (Apr
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          : The Cambridge Introduction to Modernism. Cambridge University Press, Cambridge, UK (
          <year>2007</year>
          ). https://doi.org/10.1017/cbo9780511803055
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Manurung</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ritchie</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thompson</surname>
          </string-name>
          , H.:
          <article-title>Using genetic algorithms to create meaningful poetic text</article-title>
          .
          <source>Journal of Experimental &amp; Theoretical Artificial Intelligence</source>
          <volume>24</volume>
          (
          <issue>1</issue>
          ),
          <fpage>43</fpage>
          -
          <lpage>64</lpage>
          (
          <year>Mar 2012</year>
          ). https://doi.org/10.1080/0952813x.2010.539029
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Park</surname>
            ,
            <given-names>Y.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jeong</surname>
            ,
            <given-names>H.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kang</surname>
            ,
            <given-names>I.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Park</surname>
          </string-name>
          , C.Y.,
          <string-name>
            <surname>Choi</surname>
            ,
            <given-names>Y.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.J.:</given-names>
          </string-name>
          <article-title>Automatic generation of Korean poetry using sequence generative adversarial networks</article-title>
          .
          <source>In: Proceedings of the 2018 Annual Conference on Human and Language Technology, Human and Language Technology</source>
          . pp.
          <fpage>580</fpage>
          -
          <lpage>583</lpage>
          (
          <year>Oct 2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Stent</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marge</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singhai</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Evaluating evaluation methods for generation in the presence of variation</article-title>
          . In: Gelbukh,
          <string-name>
            <surname>A.F</surname>
          </string-name>
          . (ed.)
          <source>Proceedings of the 6th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing</source>
          <year>2005</year>
          ).
          <source>Lecture Notes in Computer Science</source>
          , vol.
          <volume>3406</volume>
          , pp.
          <fpage>341</fpage>
          -
          <lpage>351</lpage>
          . Springer Berlin Heidelberg, Mexico City, Mexico (Feb
          <year>2005</year>
          ). https://doi.org/10.1007/978-3-540-30586-6_38
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tian</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gao</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yao</surname>
            ,
            <given-names>C.:</given-names>
          </string-name>
          <article-title>The machine poetry generator imitating Du Fu's styles</article-title>
          .
          <source>In: Proceedings of the 2018 International Conference on Artificial Intelligence and Big Data (ICAIBD</source>
          <year>2018</year>
          ). pp.
          <fpage>261</fpage>
          -
          <lpage>265</lpage>
          . IEEE, Chengdu, China (May
          <year>2018</year>
          ). https://doi.org/10.1109/icaibd.2018.8396206
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tosa</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nakatsu</surname>
          </string-name>
          , R.:
          <article-title>New hitch haiku: An interactive renku poem composition supporting tool applied for sightseeing navigation system</article-title>
          . In: Natkin,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Dupire</surname>
          </string-name>
          ,
          <string-name>
            <surname>J</surname>
          </string-name>
          . (eds.)
          <source>Proceedings of the 8th International Conference on Entertainment Computing (ICEC</source>
          <year>2009</year>
          ).
          <source>Lecture Notes in Computer Science</source>
          , vol.
          <volume>5709</volume>
          , pp.
          <fpage>191</fpage>
          -
          <lpage>196</lpage>
          . Springer Berlin Heidelberg, Paris, France (
          <year>Sep 2009</year>
          ). https://doi.org/10.1007/978-3-642-04052-8_19
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Zugarini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Melacci</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maggini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Neural poetry: Learning to generate poems using syllables</article-title>
          . In: Tetko,
          <string-name>
            <given-names>I.V.</given-names>
            ,
            <surname>Kurková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Karpov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Theis</surname>
          </string-name>
          ,
          <string-name>
            <surname>F.J</surname>
          </string-name>
          . (eds.)
          <source>Artificial Neural Networks and Machine Learning - ICANN 2019: Text and Time Series - Proceedings of the 28th International Conference on Artificial Neural Networks. Lecture Notes in Computer Science</source>
          , vol.
          <volume>11730</volume>
          , pp.
          <fpage>313</fpage>
          -
          <lpage>325</lpage>
          . Springer International Publishing, Munich, Germany (Sep
          <year>2019</year>
          ). https://doi.org/10.1007/978-3-030-30490-4_26
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>