<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Wissem Inoubli, Sabeur Aridhi, Haithem Mezni, Mondher Maddouri, and Engelbert Me-
phu Nguifo. An experimental survey on big data frameworks. Future Generation
Computer Systems</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>A Comparative Study on Streaming Frameworks for Big Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Wissem Inoubli</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sabeur Aridhi</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Haithem Mezni</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mondher Maddouri</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Engelbert Mephu Nguifo</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
<institution>College of Business, University of Jeddah</institution>
          ,
          <addr-line>P.O.Box 80327, Jeddah 21589 KSA</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Clermont Auvergne, LIMOS</institution>
          ,
          <addr-line>Clermont-Ferrand</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Jendouba, SMART Lab</institution>
          ,
          <addr-line>Jendouba</addr-line>
          ,
          <country country="TN">Tunisia</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Lorraine</institution>
          ,
          <addr-line>CNRS, Inria, LORIA, F-54000 Nancy</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Tunis El Manar, Faculty of Sciences of Tunis</institution>
          ,
          <addr-line>LIPAH, Tunis</addr-line>
          ,
          <country country="TN">Tunisia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <volume>86</volume>
      <issue>546</issue>
      <fpage>137</fpage>
      <lpage>144</lpage>
      <abstract>
        <p>Recently, increasingly large amounts of data have been generated from a variety of sources. Existing data processing technologies are not suitable for coping with such huge amounts of generated data. Hence, many research works focus on streaming in Big Data, a task referring to the processing of massive volumes of structured/unstructured streaming data. Recently proposed streaming frameworks for Big Data applications help to store, analyze and process continuously captured data. In this paper, we discuss the challenges of Big Data and we survey existing streaming frameworks for Big Data. We also present an experimental evaluation and a comparative study of the most popular streaming platforms.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In recent decades, increasingly large amounts of data have been generated from a variety of sources. The amount of data generated per day on the Internet has already exceeded two exabytes [6]. Stream processing raises several research questions, such as (1) how to design scalable environments, (2) how to provide fault tolerance and (3) how to design efficient solutions. In this context, stream processing frameworks are mainly designed to process huge amounts of data streams and to make on-the-fly decisions. With the rise of big data, various organizations have started to employ streaming frameworks to solve major emerging big data problems related to smart ecosystems, healthcare services, social media, etc. For example, in smart cities, various sensors such as GPS receivers, weather-condition devices, public transportation smart cards and traffic cameras are installed in diverse locations (e.g., water lines, utility poles, buses, trains, traffic lights) [10]. From these sensors, very large quantities of data are collected. To understand such volumes of data, it is important to reveal the hidden and valuable information in the big stream/storage of data. Social media is another representative big data source that requires real-time processing and results [13]. In fact, a huge volume of data is instantly and continuously generated by a wide range of Internet applications and Web sites. Examples include online photo and video sharing services (e.g., Instagram, Youtube, Flickr), social networks (e.g., Facebook, Twitter), business-oriented networks (e.g., LinkedIn), etc. The adoption of in-stream frameworks that offer iterative processing and learning capabilities makes it possible to effectively perform specific tasks such as social network analysis, link prediction, etc. Given the importance of the real-world scenarios discussed above, finding the relevant framework for high-rate stream-oriented applications becomes a challenging problem.</p>
      <p>Several systems have been proposed in the literature. In this paper, we present a
comparative study of popular stream processing frameworks according to their key
features. The studied frameworks have been chosen based on their number of contributors.
Our contributions are summarized as follows:
• We present four popular streaming frameworks for big data, their architecture and
their internal behavior.
• We compare the presented frameworks according to their key features.
• We evaluate the performance of the presented frameworks in terms of resource
consumption.</p>
      <p>This paper is organized as follows. In Section 2, we present some well-known stream processing frameworks for big data. Section 3 is devoted to the comparison of the discussed frameworks. In Section 4, an experimental evaluation of the presented stream processing frameworks is provided. Section 5 concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Stream processing frameworks for big data</title>
      <p>Several streaming frameworks for big data have been proposed to allow real-time, large-scale stream processing. This section sheds light on the most popular big data stream processing frameworks and compares them according to their main features.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1. Apache Spark</title>
      <p>Apache Spark [11] is a powerful processing framework that provides easy-to-use tools for efficient analytics of heterogeneous data. It was originally developed at UC Berkeley in 2009 [14]. Spark has several advantages compared to other big data frameworks like Hadoop MapReduce [4] and Storm [12]. A key concept in Spark is the Resilient Distributed Dataset (RDD), an immutable collection of objects spread across a Spark cluster. Spark supports two types of operations on RDDs: (1) transformations, which create new RDDs from existing ones using functions like map, filter, union and join, and (2) actions, which return the final result of RDD computations. Spark Streaming is a Spark library that enables scalable and high-throughput stream processing of live data streams.</p>
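The distinction between lazy transformations and eager actions can be sketched in plain Python. This is a toy model of the idea, not the PySpark API; the class and method names are illustrative:

```python
class ToyRDD:
    """Toy model of an RDD: transformations are lazy, actions compute."""

    def __init__(self, data, ops=None):
        self.data = list(data)
        self.ops = ops or []  # pending transformations, not yet applied

    def map(self, f):
        # Transformation: records the operation and returns a new ToyRDD.
        return ToyRDD(self.data, self.ops + [("map", f)])

    def filter(self, pred):
        # Transformation: records the operation and returns a new ToyRDD.
        return ToyRDD(self.data, self.ops + [("filter", pred)])

    def collect(self):
        # Action: only here are the pending operations actually applied.
        out = self.data
        for kind, f in self.ops:
            out = [f(x) for x in out] if kind == "map" else [x for x in out if f(x)]
        return out

rdd = ToyRDD(range(10)).map(lambda x: x * x).filter(lambda x: x % 2 == 0)
print(rdd.collect())  # computation happens here -> [0, 4, 16, 36, 64]
```

Until `collect()` is called, nothing is computed; this deferred evaluation is what lets Spark plan and distribute the whole chain of operations at once.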
    </sec>
    <sec id="sec-4">
      <title>2.2. Apache Storm</title>
      <p>Storm [12] is an open source framework for processing large structured and unstructured data in real time. Storm is a fault-tolerant framework that is suitable for real-time data analysis, machine learning, and sequential and iterative computation. A Storm program is represented by a directed acyclic graph (DAG) whose edges represent data transfer. The nodes of the DAG are divided into two types: spouts and bolts. The spouts (or entry points) of a Storm program represent the data sources, while the bolts represent the functions to be performed on the data. Note that Storm distributes bolts across multiple nodes to process the data in parallel. Storm is based on two daemons called Nimbus (on the master node) and a supervisor on each slave node. Nimbus supervises the slave nodes and assigns tasks to them. If it detects a node failure in the cluster, it reassigns the task to another node. Each supervisor controls the execution of its tasks (assigned by Nimbus) and can stop or start the spouts following the instructions of Nimbus. Each topology submitted to the Storm cluster is divided into several tasks.</p>
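The spout/bolt pipeline described above can be illustrated with a minimal Python simulation. This is purely illustrative; real Storm topologies are built with Storm's own APIs:

```python
def word_spout():
    """Spout: a data source that emits a stream of sentence tuples."""
    for sentence in ["big data streams", "streams of big data"]:
        yield sentence

def split_bolt(stream):
    """Bolt: splits each sentence tuple into word tuples."""
    for sentence in stream:
        yield from sentence.split()

def count_bolt(stream):
    """Terminal bolt: aggregates word counts from the incoming stream."""
    counts = {}
    for word in stream:
        counts[word] = counts.get(word, 0) + 1
    return counts

# Wire the DAG: spout -> split bolt -> count bolt
print(count_bolt(split_bolt(word_spout())))
# {'big': 2, 'data': 2, 'streams': 2, 'of': 1}
```

In a real topology, Storm would run many parallel instances of each bolt on different nodes, with Nimbus assigning the tasks.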
    </sec>
    <sec id="sec-5">
      <title>2.3. Apache Flink</title>
      <p>Flink [5] is an open source framework for processing data in both real-time and batch modes. It provides several benefits such as fault tolerance and large-scale computation. The programming model of Flink is similar to that of MapReduce [4]. In contrast to MapReduce, Flink offers additional high-level functions such as join, filter and aggregation. Flink allows iterative processing and real-time computation on stream data collected by different tools such as Flume [3] and Kafka [7]. It offers several APIs at a more abstract level, allowing the user to launch distributed computations in a transparent and easy way.</p>
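The semantics of the high-level operators mentioned above (filter, join, aggregation) can be sketched in plain Python; this is a toy illustration of the operator semantics, not the Flink DataStream API, and the data is made up:

```python
clicks = [("alice", 3), ("bob", 1), ("alice", 2), ("carol", 5)]
users = {"alice": "FR", "bob": "TN"}  # user -> country

# filter: keep only events whose key is a known user
known = [(u, n) for (u, n) in clicks if u in users]

# join: enrich each event with the user's country
joined = [(u, users[u], n) for (u, n) in known]

# aggregation: sum the clicks per country
totals = {}
for _, country, n in joined:
    totals[country] = totals.get(country, 0) + n

print(totals)  # {'FR': 5, 'TN': 1}
```

A framework like Flink applies the same three logical steps, but distributed across the cluster and continuously over unbounded streams.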
    </sec>
    <sec id="sec-6">
      <title>2.4. Apache Samza</title>
      <p>Apache Samza [9] is an open source distributed processing framework created by LinkedIn to solve various kinds of stream processing requirements such as tracking data, service logging of data, and data ingestion pipelines for real-time services. Since then, it has been adopted and deployed in several projects. Samza is designed to handle large messages and to provide file system persistence for them. It uses Apache Kafka as a distributed broker for messaging, and Hadoop YARN for distributed resource allocation and scheduling. The YARN resource manager daemon is adopted by Samza to provide fault tolerance, processor isolation, security, and resource management in the cluster. Samza is based on three layers. The first one is devoted to streaming data and uses Apache Kafka to transport the data flow. The second layer relies on the YARN resource manager to handle the distributed execution of Samza processing and to manage CPU and memory usage across a multi-tenant cluster of machines. The processing capabilities are available in the third layer, which represents the Samza core and provides an API for creating and running stream tasks in the cluster [9]. In this layer, several abstract classes can be implemented by the user to perform specific processing tasks, following a MapReduce-like model to ensure distributed processing.</p>
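The shape of the user-implemented task contract can be sketched in Python. The names here are hypothetical; the real contract is Samza's Java `StreamTask` interface, invoked once per message pulled from a Kafka topic:

```python
from abc import ABC, abstractmethod

class StreamTask(ABC):
    """Sketch of a per-message processing contract (hypothetical names)."""

    @abstractmethod
    def process(self, message, collector):
        """Called once for each message pulled from the input topic."""

class FilterTask(StreamTask):
    def process(self, message, collector):
        # Forward only messages mentioning "error" to the output collector.
        if "error" in message:
            collector.append(message)

out = []  # stands in for the output topic
task = FilterTask()
for msg in ["ok", "error: disk full", "ok", "error: timeout"]:
    task.process(msg, out)
print(out)  # ['error: disk full', 'error: timeout']
```

The one-message-at-a-time callback is exactly the "low-level" model discussed in Section 3: flexible, but leaving batching and optimization to the user.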
    </sec>
    <sec id="sec-7">
      <title>3. Comparison of stream processing frameworks</title>
      <p>In this section, the frameworks presented above are compared according to several
features (see Table 1) including: data format, types of data sources, programming model,
cluster manager, supported programming languages, latency and messaging capacities.</p>
      <p>We notice that Spark's importance lies in its in-memory features and micro-batch processing capabilities, especially for iterative and incremental processing [2]. Although Spark is known to be a fast framework due to the concept of RDDs, it is characterized by a lower throughput than the other frameworks, while its micro-batch concept guarantees fault tolerance. Flink shares several characteristics with Spark. It offers good processing performance when dealing with complex big data structures such as graphs. Although there exist other solutions for large-scale graph processing, Flink and Spark are enriched with specific APIs and tools for machine learning, predictive analysis and graph stream analysis [1] [14].</p>
      <table-wrap id="table-1">
        <label>Table 1</label>
        <caption>
          <p>Comparison of the studied stream processing frameworks.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Feature</th><th>Spark</th><th>Storm</th><th>Flink</th><th>Samza</th></tr>
          </thead>
          <tbody>
            <tr><td>Data format</td><td>DStream</td><td>Tuples</td><td>DataStream</td><td>Message</td></tr>
            <tr><td>Data sources</td><td>HDFS and Kafka</td><td>DBMS, Spouts</td><td>HDFS, Kafka</td><td>DBMS and Kafka</td></tr>
            <tr><td>Programming model</td><td>Transformations and actions</td><td>Bolts</td><td>Actions and functions (map, groupBy, ...)</td><td>MapReduce job</td></tr>
            <tr><td>Programming languages</td><td>Java, Scala and Python</td><td>Java</td><td>Java</td><td>Java</td></tr>
            <tr><td>Cluster manager</td><td>Hadoop YARN, Apache Mesos</td><td>Zookeeper</td><td>Hadoop YARN</td><td>YARN, Apache Mesos</td></tr>
            <tr><td>Latency</td><td>Few seconds</td><td>Sub-second</td><td>Sub-second</td><td>Sub-second</td></tr>
            <tr><td>Messaging</td><td>Exactly once</td><td>At least once</td><td>Exactly once</td><td>Exactly once</td></tr>
            <tr><td>Machine learning compatibility</td><td>SparkMLlib</td><td>Compatible with SAMOA API</td><td>FlinkML</td><td>Compatible with SAMOA API</td></tr>
            <tr><td>Elasticity</td><td>Yes</td><td>Yes</td><td>No</td><td>No</td></tr>
            <tr><td>Sliding windows/Windowing</td><td>Time based</td><td>Time based and count based</td><td>Time based and count based</td><td>Time based</td></tr>
            <tr><td>Auto parallelization</td><td>On demand</td><td>No</td><td>On demand</td><td>No</td></tr>
            <tr><td>Streaming query</td><td>SparkSQL</td><td>No</td><td>No</td><td>Yes (Samza SQL API)</td></tr>
            <tr><td>Data partitioning</td><td>Yes</td><td>No</td><td>Yes</td><td>Yes</td></tr>
            <tr><td>API</td><td>Declarative</td><td>Compositional</td><td>Declarative</td><td>Compositional</td></tr>
            <tr><td>Data transport</td><td>RPC</td><td>RPC</td><td>Pipelined</td><td>Kafka</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>In contrast, resource allocation in Storm is ensured in a dynamic and transparent way. While existing stream processing frameworks implement their own message transport protocols, Samza jobs use a set of named Kafka topics as input/output. Although the low-level one-message-at-a-time model offers some flexibility to Samza, it presents limitations regarding the frequency of produced errors and automatic optimization. When a broker node fails, the messages located in its file system are lost and cannot be recovered.</p>
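The windowing feature compared above distinguishes time-based windows from count-based windows; the difference can be sketched in plain Python (a toy illustration of tumbling windows, not any framework's API):

```python
def count_windows(stream, size):
    """Count-based tumbling windows: emit every `size` events
    (a trailing partial window is dropped)."""
    win = []
    for event in stream:
        win.append(event)
        if len(win) == size:
            yield win
            win = []

def time_windows(stream, width):
    """Time-based tumbling windows over (timestamp, value) events."""
    win, end = [], None
    for ts, value in stream:
        if end is None:
            end = ts + width
        while ts >= end:  # close every window the event has moved past
            yield win
            win, end = [], end + width
        win.append(value)
    if win:
        yield win

events = [(0, "a"), (1, "b"), (5, "c"), (6, "d"), (11, "e")]
print(list(count_windows([v for _, v in events], 2)))  # [['a','b'], ['c','d']]
print(list(time_windows(events, 5)))                   # [['a','b'], ['c','d'], ['e']]
```

Count-based windows fire after a fixed number of events regardless of when they arrive, while time-based windows fire at fixed intervals regardless of how many events arrived.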
    </sec>
    <sec id="sec-8">
      <title>4. Experiments</title>
      <p>In this section, we first present our experimental environment and protocol. Then, we discuss the obtained results. More detailed experiments can be found in [8].</p>
    </sec>
    <sec id="sec-9">
      <title>4.1. Experimental environment and protocol</title>
      <p>All the experiments were performed on a real cluster called GALACTICA. The cluster is composed of 10 machines running Linux Ubuntu 16.04. Each machine is equipped with 4 CPUs, 8 GB of main memory and 500 GB of local storage. For our tests, we used Flink 1.3.2, Spark 1.6.0, Samza 0.10.3 and Storm 1.1.1. All the studied frameworks were deployed with YARN as the cluster manager. For our experimental protocol, we used the Twitter4J API to stream tweets containing the keyword "Big Data" in real time. Every tweet consists of a JSON file with a set of attributes such as the tweet creation date, the tweet identifier and user information. Our experimental protocol consists in executing an Extract, Transform and Load (ETL) routine that (1) extracts tweets using Kafka in order to ensure the same streaming rate while evaluating the studied frameworks, (2) transforms the tweets by keeping only attributes like the tweet identifier, tweet content, date, geo-coordinates and user information, and (3) loads the transformed tweets into ElasticSearch. In this work, we studied: (1) the number of messages processed by each framework in a given period, (2) the impact of the message size on the number of processed messages, and (3) the resource consumption of the studied frameworks.</p>
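The transform step of this ETL routine can be sketched in Python. The field names below are illustrative placeholders, not the exact attribute names delivered by the Twitter4J API:

```python
import json

def transform(raw_tweet: str) -> dict:
    """Keep only the attributes of interest from a raw JSON tweet."""
    tweet = json.loads(raw_tweet)
    kept = ("id", "text", "created_at", "coordinates", "user")
    return {k: tweet[k] for k in kept if k in tweet}

raw = json.dumps({
    "id": 42, "text": "Big Data", "created_at": "2018-01-01",
    "coordinates": None, "user": {"name": "alice"}, "retweet_count": 7,
})
print(transform(raw))  # 'retweet_count' is dropped, the rest is kept
```

Each framework under test runs the same projection on every incoming message before indexing the result into ElasticSearch, so throughput differences reflect the engines rather than the workload.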
    </sec>
    <sec id="sec-10">
      <title>4.2. Experimental results</title>
      <p>(Figure 1: number of processed events for each framework, for window times (s) between 600 and 900.)</p>
      <p>In the next experiment, we changed the sizes of the processed messages. We used
5 tweets per message (around 500 KB per message). The results presented in Figure 2
show that Samza and Flink are very efficient compared to Spark, especially for large
messages.</p>
    </sec>
    <sec id="sec-11">
      <title>CPU consumption</title>
      <p>As shown in Figure 3, Flink's CPU consumption is low compared to that of Spark, Samza and Storm. Flink exploits about 10% of the available CPU, whereas Storm's CPU usage varies between 15% and 18%. However, Flink may provide better results than Storm when more CPU resources are exploited. In the literature, Flink is designed to process large messages, unlike Storm which is only able to deal with small messages (e.g., messages coming from sensors). Unlike Flink, Samza and Storm, Spark collects events' data every second and performs its processing task afterwards. Hence, more than one message is processed at a time, which explains the high CPU usage of Spark. Because of Flink's pipelined nature, each message is associated with a thread and consumed at each window time; consequently, this low volume of processed data does not affect CPU resource usage. Samza exploits about 55% of the available CPU because it is based on the concept of virtual cores, and each job or partition is assigned to a number of virtual cores. In fact, it deploys several threads (one for each partition), which explains the intensive CPU usage of Samza compared to the other frameworks.</p>
      <p>(Figure 3: CPU usage (%) of Storm, Spark, Flink and Samza over runtime (s).)</p>
    </sec>
    <sec id="sec-12">
      <title>RAM consumption</title>
      <p>Figure 4 shows the cost of event stream processing in terms of RAM consumption. Spark reached 6 GB (75% of the available resources) due to its in-memory behavior and its ability to work in micro-batches (processing a group of messages at a time). Flink, Samza and Storm did not exceed 5 GB (around 61% of the available RAM), as their stream-mode behavior consists in processing only single messages. Regarding Spark, the number of processed messages is small.</p>
      <p>(Figure 4: RAM usage (MB) of Spark, Storm, Flink and Samza over runtime (s), on a 0-6,000 MB scale.)</p>
    </sec>
    <sec id="sec-13">
      <title>Disk R/W consumption</title>
      <p>Figure 5 depicts the amount of disk usage by the studied frameworks. The curves denote the amount of Read/Write operations. The amounts of Write operations in Flink and Storm are very close. Flink, Samza and Storm frequently access the disk and are faster than Spark in terms of the number of processed messages. As discussed in the sections above, Spark is an in-memory framework, which explains its lower disk usage.</p>
    </sec>
    <sec id="sec-14">
      <title>Bandwidth resource usage</title>
      <p>As shown in Figure 6, the amount of data exchanged per second varies between 375 KB/s and 385 KB/s in the case of Flink, and between 387 KB/s and 390 KB/s in the case of Storm. It is about 400 KB/s in the case of Samza. These amounts are high compared to Spark, whose bandwidth usage did not exceed 220 KB/s. This is due to the reduced frequency of serialization and migration operations between the cluster nodes, as Spark processes a group of messages at each operation. Consequently, the amount of exchanged data is reduced, while Storm, Samza and Flink are designed for stream processing and exchange messages continuously.</p>
    </sec>
    <sec id="sec-15">
      <title>5. Conclusion</title>
      <p>With the increasing amount of data generated by billions of devices around the world, stream processing has become a key requirement of big data frameworks. The main goal of the present work is to study and experimentally evaluate the most popular frameworks for large-scale stream data processing. Spark, Storm, Flink and Samza were presented and categorized according to their main features. We also evaluated the performance of the presented frameworks in terms of resource consumption. We mention that this work is part of our previously published paper [8]. In this work, we focus on the evaluation of streaming frameworks for Big Data, and we mainly added a categorization of the studied frameworks based on specific features of stream processing systems. In the future, we will address the velocity of data processing by conducting more experiments on the frequency and the size of incoming event data.</p>
    </sec>
    <sec id="sec-16">
      <title>Acknowledgements</title>
      <p>This research was partially supported by the General Direction of Scientific Research in
Tunisia (DGRST).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Alexandrov</surname>
          </string-name>
          , Rico Bergmann, Stephan Ewen,
          <string-name>
            <surname>Johann-Christoph</surname>
            <given-names>Freytag</given-names>
          </string-name>
          , Fabian Hueske, Arvid Heise, Odej Kao, Marcus Leich, Ulf Leser,
          <string-name>
            <given-names>Volker</given-names>
            <surname>Markl</surname>
          </string-name>
          , et al.
          <article-title>The stratosphere platform for big data analytics</article-title>
          .
          <source>The VLDB Journal</source>
          ,
          <volume>23</volume>
          (
          <issue>6</issue>
          ):
          <fpage>939</fpage>
          -
          <lpage>964</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Fuad</given-names>
            <surname>Bajaber</surname>
          </string-name>
          , Radwa Elshawi, Omar Batarfi, Abdulrahman Altalhi, Ahmed Barnawi, and
          <string-name>
            <given-names>Sherif</given-names>
            <surname>Sakr</surname>
          </string-name>
          .
          <article-title>Big data 2.0 processing systems: Taxonomy and open challenges</article-title>
          .
          <source>Journal of Grid Computing</source>
          ,
          <volume>14</volume>
          (
          <issue>3</issue>
          ):
          <fpage>379</fpage>
          -
          <lpage>405</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Craig</given-names>
            <surname>Chambers</surname>
          </string-name>
          , Ashish Raniwala, Frances Perry, Stephen Adams, Robert R Henry, Robert Bradshaw, and
          <string-name>
            <given-names>Nathan</given-names>
            <surname>Weizenbaum</surname>
          </string-name>
          .
          <article-title>Flumejava: easy, efficient data-parallel pipelines</article-title>
          .
          <source>In ACM Sigplan Notices</source>
          , volume
          <volume>45</volume>
          , pages
          <fpage>363</fpage>
          -
          <lpage>375</lpage>
          . ACM,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sanjay</given-names>
            <surname>Ghemawat</surname>
          </string-name>
          .
          <article-title>Mapreduce: simplified data processing on large clusters</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>51</volume>
          (
          <issue>1</issue>
          ):
          <fpage>107</fpage>
          -
          <lpage>113</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Apache</given-names>
            <surname>Flink</surname>
          </string-name>
          .
          <source>Scalable batch and stream data processing</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>