<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Storytelling AI: A Generative Approach to Story Narration</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sonali Fotedar</string-name>
          <email>s.fotedar@student.tue.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Koen Vannisselroij</string-name>
          <email>koen.vannisselroij@student.uva.nl</email>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shama Khalil</string-name>
          <email>s.n.khalil@student.tue.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bas Ploeg</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Eindhoven University of Technology</institution>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Greenhouse Group B.V.</institution>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>In: A. Jorge, R. Campos, A. Jatowt, A. Aizawa (eds.): Proceedings of the first AI4Narratives Workshop</institution>
          ,
          <addr-line>Yokohama</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Amsterdam</institution>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we demonstrate a Storytelling AI system, which is able to generate short stories and complementary illustrated images with minimal input from the user. The system makes use of a text generation model, a text-to-image synthesis network and a neural style transfer model. The final project is deployed as a web page where a user can build their stories.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Recent advancements in the field of Deep Learning have
brought us closer to the long-standing goal of replicating
human intelligence in machines. This has led to increasing
experimentation with neural networks as generative models, the
most prominent study being Generative Adversarial
Networks [Goodfellow et al., 2014]. The birth of GANs led
to several variations [Radford et al., 2015] and to
applications in diverse domains such as data augmentation
[Ratner et al., 2017], audio generation [Yang et al., 2017] and
medicine [Schlegl et al., 2017], among many others.</p>
      <p>Significant breakthroughs have also been seen recently
towards empowering computers to understand language just as
we do. Natural Language Processing (NLP), when combined
with representation learning and deep learning, saw a spurt
in results showing that these techniques can achieve
state-of-the-art results in many NLP tasks such as language modelling
[Jozefowicz et al., 2016], question answering [Seo et al.,
2017], parsing [Vinyals et al., 2014] and many more. 2017
saw a landmark breakthrough when the Transformer model
[Vaswani et al., 2017] was introduced. This
sequence-to-sequence model makes use of the attention mechanism, lends
itself to parallelisation, and introduces techniques such as
positional encoding that brought significant improvements over
the previous sequence-to-sequence models based on
Recurrent Neural Networks [Sutskever et al., 2014], especially
in terms of scalability. The Transformer model also opened
up a new way of working: transferring the information from a
pre-trained language model to downstream tasks, also known
as transfer learning. OpenAI released the OpenAI
Transformer [Radford, 2018], a pre-trained Transformer-decoder
language model that can be fine-tuned for downstream tasks.
The model improved on the state of the art for several tasks,
such as textual entailment, reading comprehension and
commonsense reasoning, to name a few.</p>
      <p>Our motivation to study generative models comes from
probing into the content creation process within Greenhouse:
we conducted interviews with various Greenhouse employees
working in content creation. Personalised content is a growing
expectation that puts pressure on professionals to create and
deliver novel content. We found that the pressure of creating
new and personalised content within a time crunch leads to
writers' block and a lack of inspiration.</p>
      <p>More and more industry professionals are benefiting from
using artificial intelligence (AI) to help them with their
processes. The success of these generative models raises an
important question: can AI sufficiently help us in our creative
processes? We try to answer this question by focusing on the
applications of generative models and how they can be used
in content creation. We limited our scope to writing and
storytelling content and created the concept of Storytelling AI as
a way to experiment with various generative models to
create text and image content. The idea of a Storytelling AI is
to generate short stories and illustrations using minimal user
input.</p>
    </sec>
    <sec id="sec-2">
      <title>System Architecture</title>
      <p>The idea of our Storytelling AI is to generate short stories
using generative models. This is achieved by accomplishing
the following three sub-goals:
1. First, the user inputs a text prompt as a seed for
generating a story.
2. To support the story with visuals, images are generated
that are based on the text of the story.
3. Lastly, for an all-rounded experience, the generated
pictures are made to look like illustrations using neural style
transfer.
Figure 1 gives a visual overview of the adopted methodology.
The goals of the project are achieved by using three different
generative models for the three tasks mentioned above. First,
a language model is trained that learns the representation of
texts in the story for the purpose of generation. Second, two
text-to-image models are assessed and the better approach is
adopted. Finally, a neural style transfer model is trained that
learns to transfer the style of illustrated images to the images
generated in the second task. The final contribution of the
project is a web interface that brings these three components
together, where the user can build a story by generating text
and images multiple times and export it as a Portable
Document Format (PDF) file. We deploy our project by creating an
interactive interface. Interested users can try the project at
https://github.com/shamanoor/final-grimm-prototype.</p>
      <sec id="sec-2-1">
        <title>Text Generation</title>
        <p>The main component of our system is the generation of
stories. For this purpose, we first need to model natural language
by training a Language Model. Given a vocabulary of words,
a language model learns the likelihood of the occurrence of
a word based on the previous sequence of words used in the
text. Long sequences of text can then be generated by
starting with an input seed and iteratively choosing the next likely
word.</p>
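        <p>To make the iterative decoding step concrete, the sketch below shows one way to sample a continuation word by word from a pre-trained GPT-2 language model with the HuggingFace transformers library. The prompt, temperature and output length are illustrative choices, not the exact settings used in our system.</p>
        <preformat preformat-type="code">
# Minimal autoregressive sampling sketch (settings are illustrative, not our exact configuration).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer.encode("Once upon a time", return_tensors="pt")
for _ in range(50):                           # generate 50 tokens, one at a time
    with torch.no_grad():
        logits = model(input_ids)[0]          # (batch, seq_len, vocab_size)
    next_logits = logits[:, -1, :] / 0.8      # temperature-scaled logits for the next token
    probs = torch.softmax(next_logits, dim=-1)
    next_id = torch.multinomial(probs, num_samples=1)
    input_ids = torch.cat([input_ids, next_id], dim=-1)

print(tokenizer.decode(input_ids[0]))
</preformat>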
        <p>Our system uses OpenAI’s GPT-2 for the purpose of
language modelling [Radford et al., 2019]. GPT-2 is a large
Transformer-based [Vaswani et al., 2017] language model
with 1.5 billion parameters, trained on a data set of 8 million
web pages called WebText. GPT-2 is built using the
Transformer decoder blocks with two modifications: first, the
self-attention layer in the decoder masks the future tokens,
blocking information from tokens that are to the right of the current
token, and second, it adopts the arrangement of the
Transformer decoder block proposed by [Liu et al., 2018].
Differently sized GPT-2 models have been introduced by OpenAI.
Due to the low compute capability of the available hardware,
the small GPT-2 model is used in our system. To achieve our
goal of story generation, we fine-tuned the pre-trained
GPT-2 language model on a data set of short stories. To
construct our data set, we collected 100 short stories written by
the Brothers Grimm from Project Gutenberg
(https://www.gutenberg.org/).</p>
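        <p>As an illustration of the fine-tuning step, the sketch below adapts the pre-trained GPT-2 small model to a plain-text story corpus with a standard causal language modelling objective. The corpus file name, block size, learning rate and number of epochs are assumptions for the example rather than the exact values we used.</p>
        <preformat preformat-type="code">
# Hedged sketch: fine-tuning GPT-2 small on a short-story corpus (hyperparameters are illustrative).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Tokenise the whole corpus and split it into fixed-length training blocks.
text = open("grimm_stories.txt", encoding="utf-8").read()   # assumed corpus file
ids = tokenizer.encode(text)
block_size = 512
blocks = [ids[i:i + block_size] for i in range(0, len(ids) - block_size, block_size)]

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(3):
    for block in blocks:
        batch = torch.tensor([block], device=device)
        loss = model(batch, labels=batch)[0]   # causal LM loss: predict each next token
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

model.save_pretrained("gpt2-grimm")
</preformat>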
      </sec>
      <sec id="sec-2-2">
        <title>Text-to-Image Synthesis</title>
        <p>The next step in our work is to generate images that
complement the generated story. A text-to-image synthesis technique
was adopted for this goal. We explore two ways to do this:
first, to make the process as automated as possible, we adopt
the StackGAN architecture to generate images given a
text description; second, as a less automated technique,
the BigGAN API is used to generate images conditioned on
a class.</p>
        <p>StackGAN. StackGAN was proposed by [Zhang et al.,
2016] to generate photo-realistic images conditioned on
text descriptions. The idea of StackGAN is to decompose a
hard problem into more manageable sub-problems through
a sketch-refinement process, using two models
stacked on one another. The original paper used several
datasets to evaluate the work, but for our system we only
use the model pre-trained on the MS COCO dataset [Lin et al.,
2014] (https://github.com/hanzhanggit/StackGAN-Pytorch),
since it is a more generalised dataset containing 80
common object categories and relates more to our problem.
BigGAN. The next technique adopted for text-to-image
synthesis is less automated and aims at more controlled and
realistic generations. For this purpose, the pre-trained BigGAN
model from HuggingFace (https://huggingface.co/) is used.
BigGAN, proposed by [Brock et al., 2018], is a class-conditional
image synthesis technique attempting large-scale GAN training
for high-fidelity natural image synthesis. The model is trained
on the ImageNet dataset [Deng et al., 2009] and can generate
high-fidelity images from 1000 classes. We use the pre-trained
BigGAN-deep-256, a 55.9M-parameter model generating
256x256-pixel images conditioned on a class
(https://github.com/huggingface/pytorch-pretrained-BigGAN).</p>
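        <p>The sketch below shows how a class-conditioned image can be obtained from the pre-trained BigGAN-deep-256 model using the pytorch-pretrained-BigGAN package; the class name and truncation value are example choices, not the exact settings of our system.</p>
        <preformat preformat-type="code">
# Example: class-conditional generation with the pre-trained BigGAN-deep-256 model.
import torch
from pytorch_pretrained_biggan import (BigGAN, one_hot_from_names,
                                       truncated_noise_sample, convert_to_images)

model = BigGAN.from_pretrained("biggan-deep-256")

# Condition the generator on an ImageNet class name ("castle" is just an example).
class_vector = torch.from_numpy(one_hot_from_names(["castle"], batch_size=1))
noise_vector = torch.from_numpy(truncated_noise_sample(truncation=0.4, batch_size=1))

with torch.no_grad():
    output = model(noise_vector, class_vector, truncation=0.4)

convert_to_images(output)[0].save("castle.png")   # 256x256-pixel image
</preformat>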
        <p>Figure 2 shows text-to-image generation using StackGAN
and BigGAN. It can be seen clearly that the generations
using StackGAN are vague and imprecise. Some images are
able to capture the setting of the description, for example
fields or beaches, but the overall quality of the generation is
very poor. The images generated by BigGAN conditioned on
a class are of far superior quality to the ones generated
using StackGAN. Therefore, based on qualitative analysis, we
see a clear trade-off between automation and fidelity in the
process of text-to-image synthesis. Since the aim is to have
image generations of higher quality, we compromise on
automation and use the BigGAN model to obtain better
quality class-conditioned image generations. Images generated
by BigGAN do not depict a whole description with multiple
objects, but we settle for a comparatively higher quality
generation of a single object.</p>
      </sec>
      <sec id="sec-2-3">
        <title>Neural Style Transfer</title>
        <p>We also aimed at generating images that look like
illustrations, for a more all-rounded storybook feel.
The last step in our system is therefore to obtain illustrated
images by using Neural Style Transfer to transfer the illustration
style to our generated images. We use the CycleGAN model
to perform neural style transfer on our generated images. The
model was proposed by [Zhu et al., 2017] as an approach for
learning to translate an image from a source domain to a
target domain in the absence of paired examples. If X denotes
images from the source domain and Y denotes images from
the target domain, then the goal is to learn a mapping from X
to Y.</p>
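        <p>As a sketch of the objective behind this mapping, the snippet below expresses CycleGAN's cycle-consistency term in PyTorch: an image translated from X to Y and back should reconstruct the original, and vice versa. The generator modules and the weight of the term are placeholders, not our exact training code.</p>
        <preformat preformat-type="code">
# Sketch of CycleGAN's cycle-consistency term (generators G: X -> Y and F: Y -> X are placeholders).
import torch
import torch.nn.functional as nnf

def cycle_consistency_loss(G, F, real_x, real_y, lam=10.0):
    rec_x = F(G(real_x))   # photo -> illustration -> photo, should match real_x
    rec_y = G(F(real_y))   # illustration -> photo -> illustration, should match real_y
    return lam * (nnf.l1_loss(rec_x, real_x) + nnf.l1_loss(rec_y, real_y))

# The full objective adds an adversarial loss for each generator/discriminator pair.
</preformat>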
        <p>To build our dataset, we randomly sample 6500 images
from the MS COCO data set [Lin et al., 2014] for training and
1000 images for testing. We further collect illustrated images
by crawling through Pinterest boards relating to illustration
art, fairy tales and story illustrations. We scraped 6,308
images from these web pages using BeautifulSoup4
(https://www.crummy.com/software/BeautifulSoup/bs4/doc/) and
Selenium (https://www.selenium.dev/). The images were manually
analysed, and noisy images such as non-illustrated images were
removed. Black-and-white images and images with text were also
removed. The remaining images were randomly cropped to a 1:1
aspect ratio. We then train the CycleGAN model from scratch on
this data set. Figure 3 illustrates some examples of style transfer
using the trained model.</p>
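        <p>The cropping step can be sketched as follows; the directory names and the 256x256 output size are assumptions for illustration, and the manual removal of noisy images is not shown.</p>
        <preformat preformat-type="code">
# Sketch: randomly crop scraped illustrations to a 1:1 aspect ratio (paths and sizes are assumed).
import random
from pathlib import Path
from PIL import Image

def random_square_crop(img):
    w, h = img.size
    side = min(w, h)
    left = random.randint(0, w - side)
    top = random.randint(0, h - side)
    return img.crop((left, top, left + side, top + side))

for path in Path("pinterest_raw").glob("*.jpg"):
    img = Image.open(path).convert("RGB")
    random_square_crop(img).resize((256, 256)).save(Path("illustrations_clean") / path.name)
</preformat>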
      </sec>
      <sec id="sec-2-4">
        <title>Front-end</title>
        <p>Now that we have the main building blocks for our
storytelling system, the final step is to create a pipeline of these
models using a user interface. In Figure 4, we share a snippet
of the user interface.</p>
        <p>The interface allows the user to input a text prompt that
the trained language model uses as a seed to generate chunks
of stories. Since imperfections in the generated text are
inevitable, the text can be edited to the liking of the user in
the text box. Simultaneously, the user can also generate
illustrated images by choosing an object class from a
drop-down menu that they think would best complement the text
generated. This process requires two background steps: first,
the selected object class is used as input to generate a
class-conditioned image using the pre-trained BigGAN model, and
second, the generated image is fed to the trained CycleGAN
model to generate the image with an illustration style. These
generations can be performed multiple times and added to the
final story, where the user can add a story title. When the user
is satisfied with the story, they can export it to a PDF file.</p>
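        <p>A minimal sketch of the export step is shown below; the paper does not name the PDF library, so the use of fpdf here, as well as the function and file names, are assumptions for illustration.</p>
        <preformat preformat-type="code">
# Hedged sketch of PDF export (fpdf is an assumed library choice; the system does not specify one).
from fpdf import FPDF

def export_story(title, blocks, out_path="story.pdf"):
    # blocks: list of (paragraph_text, image_path_or_None) pairs built in the interface
    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Helvetica", "B", 16)
    pdf.multi_cell(0, 10, title)
    pdf.set_font("Helvetica", size=12)
    for text, image_path in blocks:
        pdf.multi_cell(0, 8, text)
        if image_path:
            pdf.image(image_path, w=100)
    pdf.output(out_path)
</preformat>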
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Conclusion</title>
      <p>In this work, we demonstrate a Storytelling AI that uses
generative models to create stories with complementary
illustrations from minimal user input. Our aim with this project was
to study generative models and their competence in
generating original content. We believe that, given the advanced state
of the technology, AI techniques can generate human-like content,
but it requires human intervention and supervision to a great
extent. With research being conducted towards more
controllable generation, we believe that, with a well-curated data set,
generative models can help conceptors in creating novel and
personalised advertisement sketches, designs and images.</p>
    </sec>
    <sec id="sec-3">
      <title>Acknowledgments</title>
      <p>We would like to extend our greatest thanks to Dr.
Decebal Mocanu for his constant supervision and invaluable
guidance throughout the course of the internship. Moreover, we
would like to thank Mr. Ruben Mak and Mr. Thom Hopmans
for their coaching and counselling during our internship at
Greenhouse Group.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [Brock et al.,
          <year>2018</year>
          ]
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Brock</surname>
          </string-name>
          , Jeff Donahue, and
          <string-name>
            <given-names>Karen</given-names>
            <surname>Simonyan</surname>
          </string-name>
          .
          <article-title>Large scale GAN training for high fidelity natural image synthesis</article-title>
          .
          <source>CoRR, abs/1809.11096</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Deng et al.,
          <year>2009</year>
          ]
          <string-name>
            <given-names>J.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Fei-Fei</surname>
          </string-name>
          .
          <article-title>ImageNet: A Large-Scale Hierarchical Image Database</article-title>
          .
          <source>In CVPR09</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Goodfellow et al.,
          <year>2014</year>
          ]
          <string-name>
            <given-names>Ian J.</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jean</given-names>
            <surname>Pouget-Abadie</surname>
          </string-name>
          , Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and
          <string-name>
            <given-names>Yoshua</given-names>
            <surname>Bengio</surname>
          </string-name>
          .
          <source>Generative adversarial networks</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Jozefowicz et al.,
          <year>2016</year>
          ]
          <string-name>
            <given-names>Rafal</given-names>
            <surname>Jozefowicz</surname>
          </string-name>
          , Oriol Vinyals, Mike Schuster, Noam Shazeer, and
          <string-name>
            <given-names>Yonghui</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <source>Exploring the limits of language modeling</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>[Lin</surname>
          </string-name>
          et al.,
          <year>2014</year>
          ]
          <string-name>
            <surname>Tsung-Yi Lin</surname>
            ,
            <given-names>Michael</given-names>
          </string-name>
          <string-name>
            <surname>Maire</surname>
            ,
            <given-names>Serge J.</given-names>
          </string-name>
          <string-name>
            <surname>Belongie</surname>
            ,
            <given-names>Lubomir D.</given-names>
          </string-name>
          <string-name>
            <surname>Bourdev</surname>
          </string-name>
          ,
          <string-name>
            <surname>Ross B. Girshick</surname>
            , James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and
            <given-names>C. Lawrence</given-names>
          </string-name>
          <string-name>
            <surname>Zitnick</surname>
          </string-name>
          .
          <article-title>Microsoft COCO: common objects in context</article-title>
          .
          <source>CoRR, abs/1405.0312</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Liu et al.,
          <year>2018</year>
          ]
          <string-name>
            <given-names>Peter</given-names>
            <surname>Liu</surname>
          </string-name>
          , Mohammad Saleh, Etienne Pot, Ben Goodrich, Ryan Sepassi, Lukasz Kaiser, and
          <string-name>
            <given-names>Noam</given-names>
            <surname>Shazeer</surname>
          </string-name>
          .
          <article-title>Generating wikipedia by summarizing long sequences</article-title>
          .
          <source>01</source>
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [Radford et al.,
          <year>2015</year>
          ]
          <string-name>
            <given-names>Alec</given-names>
            <surname>Radford</surname>
          </string-name>
          , Luke Metz, and
          <string-name>
            <given-names>Soumith</given-names>
            <surname>Chintala</surname>
          </string-name>
          .
          <article-title>Unsupervised representation learning with deep convolutional generative adversarial networks</article-title>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [Radford et al.,
          <year>2019</year>
          ]
          <string-name>
            <given-names>Alec</given-names>
            <surname>Radford</surname>
          </string-name>
          , Jeff Wu, Rewon Child, David Luan,
          <string-name>
            <given-names>Dario</given-names>
            <surname>Amodei</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ilya</given-names>
            <surname>Sutskever</surname>
          </string-name>
          .
          <article-title>Language models are unsupervised multitask learners</article-title>
          .
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [Radford,
          <year>2018</year>
          ]
          <string-name>
            <given-names>Alec</given-names>
            <surname>Radford</surname>
          </string-name>
          .
          <article-title>Improving language understanding by generative pre-training</article-title>
          .
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [Ratner et al.,
          <year>2017</year>
          ]
          <string-name>
            <given-names>Alexander J.</given-names>
            <surname>Ratner</surname>
          </string-name>
          ,
          <string-name>
            <surname>Henry R. Ehrenberg</surname>
          </string-name>
          , Zeshan Hussain, Jared Dunnmon, and Christopher Ré.
          <article-title>Learning to compose domain-specific transformations for data augmentation</article-title>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [Schlegl et al.,
          <year>2017</year>
          ]
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Schlegl</surname>
          </string-name>
          , Philipp Seeböck,
          <string-name>
            <surname>Sebastian</surname>
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Waldstein</surname>
            , Ursula Schmidt-Erfurth, and
            <given-names>Georg</given-names>
          </string-name>
          <string-name>
            <surname>Langs</surname>
          </string-name>
          .
          <article-title>Unsupervised anomaly detection with generative adversarial networks to guide marker discovery</article-title>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [Seo et al.,
          <year>2017</year>
          ]
          <string-name>
            <given-names>Minjoon</given-names>
            <surname>Seo</surname>
          </string-name>
          , Aniruddha Kembhavi, Ali Farhadi, and
          <string-name>
            <given-names>Hannaneh</given-names>
            <surname>Hajishirzi</surname>
          </string-name>
          .
          <article-title>Bidirectional attention flow for machine comprehension</article-title>
          .
          <source>ArXiv, abs/1611.01603</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [Sutskever et al.,
          <year>2014</year>
          ]
          <string-name>
            <given-names>Ilya</given-names>
            <surname>Sutskever</surname>
          </string-name>
          , Oriol Vinyals, and
          <string-name>
            <surname>Quoc</surname>
            <given-names>V.</given-names>
          </string-name>
          <string-name>
            <surname>Le</surname>
          </string-name>
          .
          <article-title>Sequence to sequence learning with neural networks</article-title>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [Vaswani et al.,
          <year>2017</year>
          ]
          <string-name>
            <given-names>Ashish</given-names>
            <surname>Vaswani</surname>
          </string-name>
          , Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones,
          <string-name>
            <given-names>Aidan N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          , Lukasz Kaiser, and
          <string-name>
            <given-names>Illia</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          .
          <article-title>Attention is all you need</article-title>
          .
          <source>CoRR, abs/1706.03762</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [Vinyals et al.,
          <year>2014</year>
          ]
          <string-name>
            <given-names>Oriol</given-names>
            <surname>Vinyals</surname>
          </string-name>
          , Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, and
          <string-name>
            <given-names>Geoffrey</given-names>
            <surname>Hinton</surname>
          </string-name>
          .
          <source>Grammar as a foreign language</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [Yang et al.,
          <year>2017</year>
          ]
          <string-name>
            <given-names>Li-Chia</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <surname>Szu-Yu Chou</surname>
            , and
            <given-names>Yi-Hsuan</given-names>
          </string-name>
          <string-name>
            <surname>Yang</surname>
          </string-name>
          .
          <article-title>Midinet: A convolutional generative adversarial network for symbolic-domain music generation</article-title>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [Zhang et al.,
          <year>2016</year>
          ] Han Zhang, Tao Xu,
          <string-name>
            <given-names>Hongsheng</given-names>
            <surname>Li</surname>
          </string-name>
          , Shaoting Zhang, Xiaolei Huang,
          <string-name>
            <given-names>Xiaogang</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <surname>Dimitris</surname>
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Metaxas</surname>
          </string-name>
          . StackGAN:
          <article-title>Text to photo-realistic image synthesis with stacked generative adversarial networks</article-title>
          .
          <source>CoRR, abs/1612.03242</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [Zhu et al.,
          <year>2017</year>
          ]
          <string-name>
            <given-names>Jun-Yan</given-names>
            <surname>Zhu</surname>
          </string-name>
          , Taesung Park, Phillip Isola, and Alexei A. Efros.
          <article-title>Unpaired image-to-image translation using cycle-consistent adversarial networks</article-title>
          .
          <source>In Computer Vision (ICCV), 2017 IEEE International Conference on</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>