<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>SMART: Simple Monitoring enterprise Activities by RFID Tags</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fabrizio Angiulli</string-name>
          <email>fangiulli@deis.unical.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elio Masciari</string-name>
          <email>masciari@icar.cnr.it</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DEIS-UNICAL Via P. Bucci</institution>
          ,
          <addr-line>87036 Rende (CS)</addr-line>
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Italian National Research Council</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Datastreams are potentially infinite sources of data that flow continuously while monitoring a physical phenomenon, such as temperature levels, or human activities, such as clickstreams, telephone call records, and so on. In recent years RFID technology has led to the generation of huge streams of data. Moreover, RFID-based systems allow the effective management of items tagged by RFID tags, especially for supply chain management or object tracking. In this paper we introduce SMART (Simple Monitoring enterprise Activities by RFID Tags), a system based on outlier template definition for detecting anomalies in RFID streams. We describe the features of SMART and its application to a real-life scenario that shows the effectiveness of the proposed method for enterprise management.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>In this paper we focus on monitoring Radio Frequency Identification (RFID) data
streams, since RFID-based systems are emerging as key components
of systems devoted to complex activities such as object tracking and
supply chain management. RFID tags are sometimes referred to as electronic
bar codes. Indeed, RFID tags emit a signal that contains basic identification
information about a product. Such tags can be used to track a product from
manufacturing through distribution and then on to retailers. These features of RFID
tags open new perspectives for both hardware and data management. In fact,
RFID is going to create many new data management needs. In more detail,
RFID applications will generate large amounts of so-called “thin” data, i.e., data pertaining
to time and location. In addition to providing insight into shipment and other
supply chain process efficiencies, such data provide valuable information for
determining product seasonality and other trends, resulting in key information for
company management. Moreover, companies are exploring more advanced
uses for RFID. For instance, tire manufacturers plan to embed RFID chips in
tires to determine tire deterioration. Many pharmaceutical companies are
embedding RFID chips in drug containers to better track highly controlled drugs
and avert their theft. Airlines are considering RFID-enabling key onboard
parts and supplies to optimize aircraft maintenance and airport gate preparation
turnaround time.</p>
      <p>Such a wide variety of systems for monitoring data streams could benefit from
the definition of a suitable technique for detecting anomalies in the data flows
being analyzed. As a motivating example, consider a company that
would like to monitor the mean time its goods stay on the aisles. Items are tagged
with RFID tags, so the reader continuously produces readings that report the
electronic product code of the item being scanned, its location, and a timestamp.
This information can be used, for example, to signal that an item has been lying
too long on the shelf, since it is repeatedly scanned in the same position. It
could be the case that the package is damaged and consequently customers tend
to avoid purchasing it. If an item exhibits such a feature it deserves further
investigation. Such a problem is relevant to so large a number of application
scenarios that it is impossible to define an absolute notion of anomaly (in the
following we refer to anomalies as outliers). In this paper we propose a framework
for dealing with the outlier detection problem in massive datastreams generated
in a network environment for object tracking and management. The main idea
is to provide users with a simple but rather powerful framework for defining the notion
of outlier for almost all application scenarios at a higher level of abstraction,
separating the specification of the data being investigated from the specific outlier
characterization.</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>An RFID system consists of three components: the tag, the reader, and the
application that uses RFID data. Tags consist of an antenna and a silicon
chip encapsulated in glass or plastic. RFID readers, or receivers, are composed
of a radio frequency module, a control unit, and an antenna used to query electronic
tags via radio frequency (RF) communication. They also include an interface
that communicates with an application (e.g., the check-out counter in a store).
Readers can be hand-held or mounted in specific locations to ensure that
they are able to read the tags as they pass through a query zone, that is, the
area within which a reader can read the tag. The query zones are the locations
that must be monitored for application purposes. To explain the typical
features of an RFID application we consider a typical supply chain scenario.</p>
      <p>The chain from the farm to the customer has many stages. At each stage
goods are typically delivered to the next stage, but in some cases a stage can
be missing. The following three cases may occur: 1) the goods lifecycle begins
at a given point (i.e., a production stage, where goods are tagged and then move
through the chain) and thus the reader in that zone registers only departures of
goods; we refer to this reader as a source reader; 2) goods are scanned by the
reader both when they arrive at and when they leave the aisle; we refer to
such a reader as an intermediate reader; 3) goods are scanned and the tag is killed;
we refer to such a reader as a destination reader.</p>
      <p>An RFID stream is (basically) composed of an ordered set of n sources (i.e.,
tag readers) located at different positions, denoted by {r<sub>1</sub>, …, r<sub>n</sub>}, producing n
independent streams of data representing tag readings. Each RFID stream can
be viewed as a sequence of triplets ⟨id<sub>r</sub>, epc, τ<sub>s</sub>⟩, where: 1) id<sub>r</sub> ∈ {1, …, n}
is the tag reader identifier (observe that it implicitly carries information about
the spatial location of the reader); 2) epc is the product code read by the source
identified by id<sub>r</sub>; and 3) τ<sub>s</sub> is a timestamp, i.e., a value that indicates the time
when the reading epc was produced by the source id<sub>r</sub>.</p>
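      <p>As a minimal illustration of this stream model, the following Python sketch represents a reading as a triplet and checks that a stream is ordered by timestamp. The names Reading, is_time_ordered, and stream_r1 are illustrative assumptions and are not part of SMART.</p>
      <preformat>
```python
from typing import NamedTuple, Sequence


class Reading(NamedTuple):
    """One RFID reading: (reader identifier, product code, timestamp)."""
    idr: int    # reader identifier in {1, ..., n}; implies the reader location
    epc: str    # electronic product code read by the reader
    ts: float   # time at which the reading was produced


def is_time_ordered(stream: Sequence[Reading]) -> bool:
    """Return True when timestamps never decrease along the stream."""
    return all(b.ts >= a.ts for a, b in zip(stream, stream[1:]))


# Hypothetical stream produced by reader 1.
stream_r1 = [Reading(1, "p1", 1.0), Reading(1, "p1", 2.0), Reading(1, "p2", 2.5)]
print(is_time_ordered(stream_r1))
```
      </preformat>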
      <p>
        An outlier is an observation that differs so markedly from other observations as
to raise the suspicion that it was generated by a different mechanism [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. There
exist several approaches to the identification of outliers, namely statistical-based
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], distance-based [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], density-based [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and MDEF-based [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The problem has
been tackled from different viewpoints and in different scenarios, such as static
datasets, dynamic datasets, and very large datasets [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In our application scenario
we deal with massive datastreams, which can be viewed as a kind of very large
dynamic dataset. Based on the notion of RFID stream introduced so far, it is
easy to see that each RFID reading generated by an RFID tag could be an
outlier either because 1) the (product) features obtained from the epc (such as
price, weight, height, and so on) differ greatly from those of the other readings, or 2) the
latency time that the tagged item spent in a given location deviates significantly
from an expected value.
      </p>
      <p>In our system we assume either a distance-based or a
statistical-based outlier function to catch both sources of anomaly; since we
are interested in the problem formalization, we disregard here the actual outlier
function implementation. More formally, given a set of objects S, a positive
integer k, and a positive real number R, an object o ∈ S is a DB(k, R)-outlier, or a
distance-based outlier with respect to parameters k and R, if fewer than k objects
in S lie within distance R from o. This kind of function is exploited when
searching for outliers based on their product features. To deal with deviations in
time features we resort to a statistical-based outlier function. We point out that
a formal analysis of the possible outlier detection methods is beyond the scope
of this paper; we mention here the main approaches used in the literature since
our system implementation allows any stream-oriented implementation of an
outlier function to be used. This guarantees high flexibility
in our system for dealing with a wide range of application scenarios.</p>
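      <p>The DB(k, R)-outlier definition can be sketched directly in Python. This is a naive quadratic check written only to make the definition concrete, not the detection algorithm used in SMART. It assumes Euclidean distance over numeric product features and assumes that an object is not counted as its own neighbour; the function name is_db_outlier and the sample data are hypothetical.</p>
      <preformat>
```python
import math
from typing import Sequence

Point = Sequence[float]


def is_db_outlier(o: Point, objects: Sequence[Point], k: int, radius: float) -> bool:
    """Return True if fewer than k objects of S lie within distance R of o."""
    neighbours = 0
    for other in objects:
        if other is o:
            continue  # assumption: o is not counted as its own neighbour
        if radius >= math.dist(o, other):  # Euclidean distance (assumption)
            neighbours += 1
            if neighbours >= k:
                return False  # enough neighbours found, o is not an outlier
    return True


# Hypothetical (price, weight) features of four items.
features = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (10.0, 10.0)]
flags = [is_db_outlier(o, features, k=2, radius=1.0) for o in features]
print(flags)
```
      </preformat>
      <p>With these sample values only the last item has fewer than two neighbours within distance 1.0, so it is the only one flagged.</p>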
    </sec>
    <sec id="sec-3">
      <title>Statement of the Problem</title>
      <p>In our model, epc is the identifier associated with a single unit being scanned
(this may be a pallet or a single item, depending on the level of granularity
chosen for tagging the goods being monitored).</p>
      <p>This basic schema is simple enough to be used in a data
stream environment. However, since more information is needed about the
outlier being detected, we can access additional information through some auxiliary
tables maintained at a Master site, as shown in Figure 2. In more detail, the
Master maintains an intermediate local warehouse of RFID data that stores
information about items, items' movements, product categories, and locations, and
is exploited to provide details about RFID data upon user request. The
information about items' movements is stored in the relation ItemMovement, and the
information about product categories and locations is stored in the relations
Product and Locations, respectively. These relations represent, respectively,
the Product and the Location hierarchy. Relation EPCProducts maintains the
association between epcs and product categories, that is, every epc is associated with
a tuple at the most specific level of the Product hierarchy. Finally, RFID readers
constitute the most specific level of the Location hierarchy.</p>
      <p>ItemMovements contains tuples of the form ⟨epc, DL⟩, where epc has the
usual meaning and DL is a string built as follows: each time an epc is read for the
first time at a node N<sub>i</sub>, a trigger fires and DL is updated by appending the node
identifier.</p>
      <p>
        In the following we define a framework for integrating DSMS (Data Stream Management System) technologies
and outlier detection in order to effectively manage outliers in RFID
datastreams. In particular we exploit the following features: a) the
definition of a template for specifying outlier queries on datastreams, which can
be implemented on top of a DSMS by mapping the template into a suitable set
of continuous queries expressed in an ESL-like continuous query language [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]; b) the template needs to be powerful enough to model all the interesting
surveillance scenarios. In this respect, it should allow the definition of four
components, namely: 1) the kind of objects (O) to be monitored (e.g., RFID data
concerning dairy products); 2) the reference population P (needed because of the infinite
nature of datastreams), which depends on the application context (e.g., a subset of
the items belonging to the dairy products category); 3) the attributes (A) of the
population used for singling out anomalies (e.g., time spent at a given location);
4) the outlier definition by means of a suitable function F(P, A, O) → {0, 1}
(e.g., deviation from the average time spent at a given location by an item); c)
a mapping function that, for a given template and DSMS schema, resolves the
template into a set of outlier continuous queries to be issued on the datastream
being monitored.
      </p>
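      <p>To make the role of the function F(P, A, O) → {0, 1} concrete, the sketch below gives one possible instance based on the deviation from the average value of an attribute. The function name deviation_outlier, the dictionary representation of items, and the threshold of three standard deviations are assumptions introduced for illustration; the template itself leaves the outlier function open.</p>
      <preformat>
```python
from statistics import mean, pstdev
from typing import Mapping, Sequence

Item = Mapping[str, float]


def deviation_outlier(population: Sequence[Item], attribute: str, obj: Item,
                      threshold: float = 3.0) -> int:
    """One instance of F(P, A, O): 1 if obj deviates from the population mean
    of the attribute by more than threshold standard deviations, else 0."""
    values = [item[attribute] for item in population]
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return int(obj[attribute] != mu)  # degenerate population
    return int(abs(obj[attribute] - mu) > threshold * sigma)


# Hypothetical reference population: time spent at a location by dairy items.
dairy = [{"latency": x} for x in (10, 12, 11, 9, 10)]
print(deviation_outlier(dairy, "latency", {"latency": 30}))
print(deviation_outlier(dairy, "latency", {"latency": 11}))
```
      </preformat>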
      <p>The basic intuition behind the template definition is that we want to run an
aggregate function that is raised by the Master (a central node that collects
the queries and the aggregate statistics along with the sample populations) and
then instantiated on a subset of nodes in the network. An incoming stream is
processed at each node where the template has been activated by the Master, which
issues the request for monitoring the stream. Once a possible outlier is detected,
it is signaled to the Master. The Master maintains management information
about the network and some additional information about the items using two
auxiliary tables, OutlierMovement and NewTrend. The OutlierMovement
table stores information about the outlying objects, in particular their
identifiers and the paths traveled so far, as explained above for ItemMovements.
The NewTrend table stores information about objects that are not outliers but
instead represent a new phenomenon in the data. It contains tuples of the
form ⟨epc, N, τ<sub>a</sub>, τ<sub>l</sub>⟩, where N is a node and τ<sub>a</sub> and τ<sub>l</sub> are, respectively, the arrival
time and the time interval spent at node N by the epc. The latter table is
particularly important since it is intended to deal with the concept drift that could
affect the data. Indeed, when items are marked as unusual but are not
anomalies, as in the case of varied selling rates, they are recorded for later use
in outlier definition. In particular, once the new trend has been consolidated,
new statistics for the node where the objects appeared are computed at
Master level and then forwarded to the pertaining node in order to update the
parameters of its population.</p>
      <p>As mentioned above, candidate outliers are signaled at node level but
are managed by the Master. In more detail, when a possible outlier is signaled by
a given node, the Master stores it in the OutlierMovement table along with its
path if it is recognized as an anomaly, or in the NewTrend table if the signaled
item could represent the symptom of a new trend in the data. To summarize, given
a signaled object o, two cases may occur: 1) o is an outlier and is then stored
in the OutlierMovement table; 2) o represents a new trend in the data distribution, so
it should not be considered an outlier and we store it in the NewTrend table.
To better understand this problem we describe three possible scenarios on a toy
example.</p>
      <p>Example 1. Consider a container (whose epc is p1) containing dangerous material
that has to be delivered through check points c1, c2, c3 in the given order, and
consider the following sequence of readings: SeqA = {(p1, c1, 1), (p1, c1, 2), (p1, c2, 3),
(p1, c2, 4), (p1, c2, 5), (p1, c2, 6), (p1, c2, 7), (p1, c2, 8), (p1, c2, 9), (p1, c2, 10),
(p1, c2, 11), (p1, c2, 12)}. Sequence A corresponds to the case in which the pallet
tag is read repeatedly at check point c2. This sequence may occur because:
i) the pallet (or its content) is damaged, so it can no longer be shipped until
some recovery operation has been performed, or ii) the shipment has been delayed.
Depending on which interpretation is correct, different recovery actions
need to be performed. To take this problem into account, in our prototype
implementation we maintain appropriate statistics on latency time at each node for
signaling possible outliers. Once the object has been forwarded to the Master,
a second check is performed in order to store it either in the OutlierMovement or
in the NewTrend table. In particular, it could happen that, due to a new shipping
policy, additional checks have to be performed on dangerous material. Obviously
this causes a delay in shipping operations, so the tuple has to be stored in
the NewTrend table.</p>
      <p>Consider now a different sequence of readings: SeqB = {(p1, c1, 1), (p1, c1, 2),
(p1, c1, 3), (p1, c1, 4), (p1, c3, 5), (p1, c3, 6), (p1, c3, 7), (p1, c3, 8), (p1, c3, 9), (p1, c3, 10),
(p1, c3, 11), (p1, c3, 12)}. Sequence B corresponds to a more interesting scenario:
the pallet tag is read at check point c1, is not
read at check point c2, but is read at check point c3. Again, two main explanations
could be considered: i) the original routing has been changed to improve the
shipment, or ii) someone changed the route for fraudulent reasons (e.g., in order to
steal the content or to modify it). Suppose that in this case the shipping plan
has not been changed. This means that we are dealing with an outlier, so we
store it in the OutlierMovement table along with its path.</p>
      <p>Finally, consider the following sequence of readings regarding products p1, p2, p3,
which are frozen foods, and product p4, which is a perishable, all generated
at a freezer warehouse c: SeqC = {(p1, c, 1), (p2, c, 2), (p3, c, 3), (p4, c, 4), (p1, c, 5),
(p2, c, 6), (p3, c, 7), (p4, c, 8), (p1, c, 9), (p2, c, 10), (p3, c, 11), (p4, c, 12)}. Obviously,
p4 is an outlier for that node of the supply chain, and this can be easily recognized
using a distance-based outlier function since its expiry date deviates greatly from
the expiry dates of the other goods.</p>
      <p><bold>The Template in short.</bold> In this section we describe the functionalities
and syntax of the Template introduced so far. A Template is an aggregate
function that takes a stream as input. Since the physical stream could contain several
attributes, as explained in the previous sections, we allow selection and projection
operations on the physical stream. As will be clear in the next section, we use a
syntax similar to ESL with some specific additional features pertaining to our
application scenario. This filtering step is intended to feed the reference
population P. In particular, when an object is selected at a given node it is included in
the reference population for that node using an Initialize operation, and it persists
in the reference population until a Remove operation is invoked (this can be seen as
an Initialize operation on the tuples exiting the node being monitored).</p>
      <p>We recall that an RFID-tagged object is scanned multiple times at a given node
N, and that when the reader no longer detects the RFID tag no reading is generated.
The first time an object is read, a Validate trigger fires and sends the information to
the Master, which eventually updates the ItemMovement table. In response to a
Validate trigger the Master performs a check on the item path, that is, it
tests the incoming reading to verify whether the path traveled so far by the
current item satisfies the shipping constraints.
This check is performed by the following operations: 1) select the path
for that kind of item stored in ItemMovement; 2) add the current node to the
path; 3) check the resulting path against an auxiliary table DeliveryPlans that stores
all the delivery plans (we refer to this check as DELIVERY CHECK). This step
is crucial for signaling path anomalies since, as explained in our toy examples, that
source of anomaly arises at this stage. If the item is not validated, the Master
stores the item information in order to solve the conflict. In particular, it could
be the case that delivery plans are changing (we refer to this check as NEW
PATH CHECK), so the information is stored in the NewTrend table for future analysis;
otherwise it is stored in the OutlierMovement table. To better understand this
behavior consider SeqB in Example 1. When the item is first detected
at node c3 the Validate trigger fires, and the path traveled so far by that object is
retrieved, obtaining path = c1. The current node is added, thus updating path =
c1.c3, but when it is checked against the actual path stored in DeliveryPlans an
anomaly is signaled since the path was supposed to be c1.c2.c3. In this case the item
is stored in the OutlierMovement table and the Master signals for a recovery
action. The check works analogously for SeqA, as explained in Example 1.</p>
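      <p>The three operations of the DELIVERY CHECK, followed by the NEW PATH CHECK, can be sketched as follows and replayed on SeqB. This is a simplified illustration under stated assumptions: tables are modelled as in-memory dictionaries, delivery plans are keyed by epc rather than by kind of item, and the function name validate and the parameter plans_under_revision are hypothetical.</p>
      <preformat>
```python
def validate(epc, node, item_movement, delivery_plans,
             plans_under_revision=frozenset()):
    """Return the outcome of the first reading of epc at node."""
    path = item_movement.get(epc, [])          # 1) path traveled so far
    candidate = path + [node]                  # 2) add the current node
    plan = delivery_plans[epc]
    if candidate == plan[:len(candidate)]:     # 3) DELIVERY CHECK
        item_movement[epc] = candidate
        return "validated"
    if epc in plans_under_revision:            # NEW PATH CHECK
        return "NewTrend"
    return "OutlierMovement"


# SeqB from Example 1: read at c1 for t = 1..4, then at c3 for t = 5..12.
seq_b = [("p1", "c1", t) for t in range(1, 5)]
seq_b += [("p1", "c3", t) for t in range(5, 13)]

item_movement = {}
delivery_plans = {"p1": ["c1", "c2", "c3"]}
seen = set()
outcomes = []
for epc, node, ts in seq_b:
    if (epc, node) in seen:
        continue  # Validate fires only on the first reading at a node
    seen.add((epc, node))
    outcomes.append((node, ts, validate(epc, node, item_movement, delivery_plans)))
print(outcomes)
```
      </preformat>
      <p>In this replay the reading at c1 is validated, while the first reading at c3 fails the check because c2 was skipped and, since no plan revision is declared, it is routed to OutlierMovement.</p>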
      <p>
        When an epc has been validated it is added to the reference population
for that node (P<sub>N</sub>); it then stays at the node and is continuously scanned. It
may happen that during its stay at a given node an epc cannot be read due
to a temporary field problem. We should distinguish this malfunction from the
“normal” behavior that arises when an item is moved for shipping or (in the case of
destination nodes) because it has been sold. To deal with this feature we provide
a trigger Forget that fires when an object is not read for a (context-dependent)
number of reading cycles (we refer to this in the following as TIMESTAMP CHECK).
We point out that this operation is not lossy, since at each node we
maintain (updated) statistics on items. When Forget runs, it removes the “old”
item from the actual population and updates the node statistics. The node statistics
(hereafter model<sub>N</sub>, where N is the node they refer to) that we
take into account for outlier detection are: the number of items grouped by product
category (count), the average time spent at the node by items belonging to a given
category (m), the variance for items belonging to a given category (v), the maximum
time spent at the current node by items belonging to a given category (maxt), and the
minimum time spent at the current node by items belonging to a given category
(mint). By means of the reference population P<sub>N</sub> and the node statistics model<sub>N</sub>,
the chosen outlier function checks for anomalies. In particular, we can search for
two kinds of anomalies: 1) item-based anomalies, i.e., anomalies regarding the item
features, in which case we run a distance-based outlier detection function; 2)
time-based anomalies, i.e., anomalies regarding arrival time or latency time, in which
case we run a statistical-based outlier detection function.
      </p>
      <p>
        In this section we formalize the syntax for template definition. For basic stream
operations we refer to an ESL-like syntax [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. We point out that even though in this
paper we focus on RFID data and the outlier detection task, the framework is rather
general and could be exploited in several application domains and for other tasks,
such as the evaluation of aggregate queries.
      </p>
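      <p>The Forget trigger and the per-category node statistics (count, m, v, maxt, mint) described above can be sketched as follows. The sketch updates mean and variance incrementally with Welford's online algorithm, which is a choice made here for illustration and is not stated in the paper. The names CategoryStats and run_forget, the tuple layout of the population, and the max_silence parameter are likewise assumptions.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass
class CategoryStats:
    """Node statistics for one product category."""
    count: int = 0
    m: float = 0.0                 # average time spent at the node
    m2: float = 0.0                # running sum of squared deviations
    maxt: float = float("-inf")    # maximum time spent at the node
    mint: float = float("inf")     # minimum time spent at the node

    def update(self, latency: float) -> None:
        # Welford's online update of mean and squared deviations.
        self.count += 1
        delta = latency - self.m
        self.m += delta / self.count
        self.m2 += delta * (latency - self.m)
        self.maxt = max(self.maxt, latency)
        self.mint = min(self.mint, latency)

    @property
    def v(self) -> float:
        """Population variance of the time spent at the node."""
        return self.m2 / self.count if self.count else 0.0


def run_forget(population, stats, now, max_silence):
    """Remove items not read for more than max_silence and update statistics.

    population maps epc to (category, arrival time, time of last reading).
    """
    expired = [epc for epc, (_, _, last_seen) in population.items()
               if now - last_seen > max_silence]
    for epc in expired:
        category, arrival, last_seen = population.pop(epc)
        stats.setdefault(category, CategoryStats()).update(last_seen - arrival)
    return expired


population = {"p1": ("dairy", 1.0, 5.0), "p2": ("dairy", 2.0, 10.0),
              "p3": ("dairy", 3.0, 19.0)}
stats = {}
print(run_forget(population, stats, now=20.0, max_silence=3.0))
print(stats["dairy"].count, stats["dairy"].m, stats["dairy"].v)
```
      </preformat>
      <p>Because the statistics are updated before an item leaves the population, removing the item does not lose the information needed by a statistical-based outlier function.</p>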
      <p>The first step is to create the streams related to the nodes being monitored. Once
the streams are created at each node, the Template definition has to be provided.</p>
      <p>The aggregate function can be any available SQL function applied to the
reference population, as shown in Fig. 5, where Return and Next have the same
interpretation as in SQL and &lt;Type&gt; can be any SQL aggregate function. An
empty TERMINATE clause refers to a non-blocking version of the aggregate.</p>
      <p>Once the template has been defined it must be instantiated on the nodes being
monitored. In particular, the triggers Validate and Forget are activated at each node.
As mentioned above, they continuously update the reference population and
the node and Master statistics. The syntax of these triggers is shown in Figure 6.</p>
      <p>We point out again that the Validate trigger has the important side effect of
signaling path outliers. We also point out that the definition presented above is
completely flexible: if the user needs a different outlier definition, she simply
needs to add it as a plug-in to our system.</p>
      <preformat>CREATE STREAM &lt;name&gt;
ORDER BY &lt;attribute&gt;
SOURCE &lt;systemnode&gt;</preformat>
      <preformat>DEFINE OUTLIER TEMPLATE &lt;name&gt;
ON STREAM &lt;streamname&gt;
REFERENCE POPULATION (&lt;definepopulation&gt;)
MONITORING (&lt;target&gt;)
USING &lt;outlierfunction&gt;

&lt;definepopulation&gt;   INSERT INTO &lt;PopulationName&gt;
                     SELECT &lt;attributelist&gt;
                     FROM &lt;streamname&gt;
                     WHERE &lt;conditions&gt;
&lt;target&gt;             &lt;attributelist&gt; | &lt;aggregatefunction&gt;
&lt;outlierfunction&gt;    &lt;distancebased&gt; | &lt;statisticalbased&gt;</preformat>
      <preformat>&lt;Function Name&gt; &lt;Type&gt;(Next Real) : Real
&lt;Table Name&gt; (&lt;attribute list&gt;);
{ INSERT INTO &lt;Table Name&gt; VALUES (Next, 1); }
{ UPDATE &lt;Population Name&gt; SET &lt;update condition&gt;;
  SELECT &lt;output attribute&gt; FROM &lt;Table Name&gt; }</preformat>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>F.</given-names>
            <surname>Angiulli</surname>
          </string-name>
          and
          <string-name>
            <given-names>F.</given-names>
            <surname>Fassetti</surname>
          </string-name>
          . DOLPHIN:
          <article-title>An Efficient Algorithm for Mining Distance-Based Outliers in Very Large Datasets</article-title>
          .
          <source>TKDD</source>
          ,
          <volume>8</volume>
          (
          <issue>1</issue>
          ),
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>V.</given-names>
            <surname>Barnett</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Lewis</surname>
          </string-name>
          .
          <article-title>Outliers in Statistical Data</article-title>
          . Wiley and Sons,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>E.</given-names>
            <surname>Knorr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Tucakov</surname>
          </string-name>
          .
          <article-title>Distance-based outlier: Algorithms and applications</article-title>
          .
          <source>VLDB J.</source>
          ,
          <volume>8</volume>
          (
          <issue>3-4</issue>
          ):
          <fpage>237</fpage>
          –
          <lpage>253</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>D.</given-names>
            <surname>Hawkins</surname>
          </string-name>
          .
          <source>Identification of Outliers. Monographs on Applied Probability and Statistics. Chapman and Hall</source>
          ,
          <year>1980</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
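          <string-name>
            <given-names>M. M.</given-names>
            <surname>Breunig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Kriegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Sander</surname>
          </string-name>
          . LOF:
          <article-title>Identifying density-based local outliers</article-title>
          .
          <source>In Proceedings of the International Conference on Management of Data (SIGMOD00).</source>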
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>S.</given-names>
            <surname>Papadimitriou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Kitagawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Gibbons</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Faloutsos</surname>
          </string-name>
          . LOCI:
          <article-title>Fast outlier detection using the local correlation integral</article-title>
          .
          <source>In Proceedings of the International Conference on Data Engineering (ICDE)</source>
          , pages
          <fpage>315</fpage>
          –
          <lpage>326</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
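          7.
          <string-name>
            <given-names>H.</given-names>
            <surname>Thakkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Luo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Zaniolo</surname>
          </string-name>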
          .
          <article-title>An introduction to the Expressive Stream Language (ESL)</article-title>
          .
          <source>Tech. Report.</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>