<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An agent-based WCET analysis for Top-View Person Re-Identification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marina Paolanti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valerio Placidi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michele Bernardini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Felicetti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rocco Pietrini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emanuele Frontoni</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Information Engineering, Università Politecnica delle Marche</institution>
          ,
          <addr-line>Via Brecce Bianche 12, 60131, Ancona</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Person re-identification is a challenging task for improving and personalising the shopping experience in an intelligent retail environment. A new Top View Person Re-Identification (TVPR) dataset of 100 persons was collected and described in a previous work. This work estimates the Worst Case Execution Time (WCET) for the feature extraction and classification steps. Such tasks should not exceed the WCET, in order to ensure the effectiveness of the proposed application. After feature extraction, the classification process is performed by selecting the first passage under the camera for training and using the others as the testing set. Furthermore, a gender classification is exploited for improving retail applications. We tested all feature sets using k-Nearest Neighbors, Support Vector Machine, Decision Tree and Random Forest classifiers. Experimental results prove the effectiveness of the proposed approach, achieving good performance in terms of Precision, Recall and F1-score.</p>
      </abstract>
      <kwd-group>
        <kwd>Real-time</kwd>
        <kwd>WCET</kwd>
        <kwd>Person re-identification</kwd>
        <kwd>RGB-D camera</kwd>
        <kwd>Retail</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Nowadays, cameras are largely deployed in several sectors, ranging from small
business and large retail applications to home surveillance, environment
monitoring and facility access control. Identification cameras are widely employed
in most public areas such as shopping centres, airports, stations, office buildings and
museums. In these settings, it is often necessary to determine whether different
instances or images of a person, captured at different times, belong to the same
subject. This process is commonly called "person re-identification" (re-id). Re-id
has great commercial value because of its wide range of potential applications and
benefits.</p>
      <p>
        In recent years, research on people behaviour analysis has largely centred on
person re-id, which draws on many paradigms and approaches of pattern
recognition [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In such conditions,
algorithms need to be robust to issues such as widely varying camera
viewpoints and orientations, rapid changes in the appearance of clothing, occlusions,
varied poses and different lighting conditions [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Person re-id involves modelling human appearance. Descriptors of
image content have been proposed in order to discriminate identities while
compensating for appearance variability due to changes in illumination, pose and
camera viewpoint. Re-id is also a learning problem, in which either metrics or
discriminative models are learned [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Labelled training data are
required for metric learning approaches, and new training data are needed
whenever a camera setting changes [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Recently, person re-id has emerged as a very challenging task for improving
and personalising the shopping experience in the intelligent retail environment.
It is becoming a useful tool to properly recognise consumers in a store, to study
returning consumers and to classify different shopper clusters and targets. Re-id
can provide useful information for customer services and shopping space
management. In fact, the growth and change in consumer purchase
behaviour have led retailers to adapt their businesses, the products and
services they provide, and also the way in which they communicate with
customers [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        RGB-D cameras are well suited to this purpose, because they
provide affordable, if rough, depth information coupled with visual
images, offering sufficient accuracy and resolution for indoor applications. In
retail, such cameras have already been successfully adopted to
univocally identify customers and analyse their interactions with shoppers [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
The usual choice is an RGB-D camera placed in a top-view configuration, because of
its greater suitability compared with the front-view configuration mostly adopted
for gesture recognition or video gaming. A top-view configuration reduces the
problem of occlusions and has the advantage of being privacy
preserving, since a person's face cannot be recorded by the camera [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        In a previous work, we built a new dataset for person re-id that uses
an RGB-D camera in a top-view configuration: the TVPR (Top View Person
Re-identification) dataset [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. We chose an Asus Xtion Pro Live RGB-D
camera because it allows the acquisition of colour and depth information in
an affordable and fast way [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The camera was installed on the ceiling above
the area to be analysed. The dataset collects the data of 100 people, acquired
across intervals of days and at different times.
      </p>
      <p>
        In this paper, the method applied within a real-time scenario is proposed. A
software agent is supposed to recognize a subject when she/he passes under a
camera more than once, in order to provide, at the same time, an instant and
customized service for the single consumer. In the retail sector, the capacity to
identify consumer characteristics is highly relevant for offering
personalized promotions, focused on the type of person (e.g., gender, age), the
history of his/her preferences and shopping habits (e.g., fidelity card). In a
supermarket with a varied offer, the goal is to identify the returning
consumer through an RGB-D camera placed at the entrance. After that,
suggestions and offers tailored to each consumer will be displayed on advertising
screens located immediately after the entrance, and notifications will be
instantaneously sent to their smartphones. Within this context, a worst-case execution
time (WCET) analysis for top-view person re-identification has been developed.
The correctness of real-time systems does not only depend on the accuracy of
the results, but also on the delivery of the results within established time
constraints [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. To ensure that all deadlines are met, real-time schedulers need
to estimate the WCET of each process. Classification results should be correct
not only in their accuracy but also in the time domain predefined by the user. A
real-time task is characterized by a deadline, which is the maximum time within
which it must complete its execution [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Depending on the consequences of
a missed deadline, a real-time task can be classified
as hard, firm or soft. A real-time task in the soft
category produces its results after the deadline, but these still have some utility for the
system, although causing a performance degradation. Soft tasks are typically
related to system-user interactions; tasks such as displaying ads on a screen
or sending alerts fall into this category. In addition, an agent-based system
that monitors the whole real-time re-id procedure can manage several features
such as:
- the shopping history of each consumer, connected with the personal fidelity
card;
- the selection of customized information to be shared with each consumer;
- the entire messaging process for sending personal offers to advertisement screens
or alerts to smartphones.
      </p>
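      <p>The soft-deadline behaviour described above can be made concrete with a small
sketch (a hypothetical illustration, not part of the proposed system): the task
measures its own completion time and still delivers a late result, at the cost of
degraded utility.</p>
      <preformat><![CDATA[
#include <chrono>
#include <iostream>

// Hypothetical soft real-time task: pushing a personalised ad to a screen.
// Missing the deadline degrades service quality, but the result is still useful.
void run_soft_task(std::chrono::milliseconds deadline) {
    using clock = std::chrono::steady_clock;
    auto start = clock::now();

    // ... task body: select and send the personalised offer ...

    auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        clock::now() - start);
    if (elapsed > deadline) {
        // Soft task: log the overrun and deliver the (late) result anyway.
        std::cerr << "deadline missed by "
                  << (elapsed - deadline).count() << " ms\n";
    }
}
]]></preformat>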
      <p>
        In any real-time control system, the algorithm of each task is known a priori
and can thus be used to estimate its characteristics in terms of computational
time [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Above all, it allows one to estimate the WCET parameter, used by the
operating system to assess schedulability within the specified timing deadlines.
The various agent activities can be seen as parts of a cooperating team. In a
real-time approach, a WCET analysis guarantees an efficient, instantaneous and
prompt customer service.
      </p>
      <p>
        Moreover, we introduce a method for person re-id based on a set of features
extracted from RGB-D images and used to perform a classification process: the first
passage under the camera is selected as the training set, while the return to the initial
position is used as the testing set. In addition, a gender classification, based on the
colour and length of the hair, is performed with the aim of improving retail
applications for clustering shoppers into different targets. In fact, recognising a customer
provides crucial information for retailers, who need to know who their potential
customers are in order to adapt the market to them more effectively. We tested all
feature sets using k-Nearest Neighbors (k-NN), Support Vector Machine (SVM),
Decision Tree (DT) and Random Forest (RF) classifiers, as previously done
in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. The performance evaluation demonstrates the effectiveness of
the proposed approach, achieving good results in terms of Precision, Recall and
F1-score.
      </p>
      <p>This paper is organized as follows: Section 2 provides a description of the
approaches in the context of re-id (Subsection 2.1), an overview of the existing
datasets (Subsection 2.2) and the characterization of the TVPR dataset
(Subsection 2.3). Section 3 gives details on the proposed methodology. It is followed
by the evaluation of our dataset, with some samples and key statistics of the dataset,
and the presentation of results (Section 4). The conclusions and future work in
this direction are elaborated in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>This section is an overview of the principal approaches to person re-id. In
particular, Subsection 2.1 presents a review of the works on person re-id,
Subsection 2.2 describes the available datasets that have been used to test re-id
models and Subsection 2.3 provides details on the TVPR dataset for person re-id in
a top-view configuration.</p>
      <sec id="sec-2-0">
        <title>Previous works on person re-identification</title>
        <p>
          In the field of pattern recognition, the re-id problem has gained considerable
attention, and several reviews and surveys are available, pointing out different
aspects of this topic [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. Four different strategies can be defined,
depending on the camera setup and environmental conditions: biometric, geometric,
appearance-based and learning approaches.
        </p>
        <p>
          In biometric approaches, person instances are matched together and
assigned to the same identity through the use of biometric features, such as
faces, gait, iris scans, fingerprints and so
on [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. These are effective and reliable solutions, but they require the
collaborative behaviour of the persons involved and suitable sensors. Thus, with the
low resolution and poor views typical of common surveillance-camera
settings, these techniques are not always applicable.
        </p>
        <p>
          Geometric approaches consider situations in which more than one sensor
or camera simultaneously collects information on the same area, so that geometric
relations among the fields of view (epipolar lines, homographies and so on)
can be adopted to match the different detection data [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ], [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. The
geometric relations, when available, guarantee strong matches or, at least, a stiff
candidate selection.
        </p>
        <p>
          In the general case, only the appearance of the different items can be adopted
[
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]. In these situations, appearance-based approaches are used. Re-id
can be correctly done only if the appearance is preserved among the views.
Exploiting dress colours and textures, perceived heights and other similar cues is
considered a soft-biometric approach. Occlusions, different sensor qualities,
illumination changes and different viewpoints are some of the issues that make
appearance-based re-id a difficult problem. Gray et al. first
considered the problem of appearance models for person recognition, reacquisition
and tracking in [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. They also claimed that these problems had been
evaluated independently and that there is a need for metrics that apply to complete
systems [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ], [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]. A standard protocol to compare results was described; it used the
Cumulative Matching Curve (CMC) and presented the VIPeR dataset for re-id.
In [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ], an algorithm that learns a domain-specific similarity function using an
ensemble of local features and the AdaBoost classifier is described. In [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], features
are raw colour channels in many colour spaces and texture information captured
by Schmid and Gabor filters. For person recognition, background
clutter strongly affects descriptors of visual appearance; accordingly, background
modelling is used in many person re-id approaches [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ], [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ].
        </p>
        <p>
          Re-id has also been considered as a learning problem. In [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ], the authors
proposed a discriminative model obtained with the use of Partial
Least Squares (PLS). A robust Mahalanobis metric for Large Margin Nearest
Neighbor classification with Rejection (LMNN-R) is created with the use of a
metric learning framework in [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ]. In [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ], the authors propose a supervised technique
in which pairs of similar and dissimilar images and a relaxed
RankSVM algorithm are used to rank probe images. The work described in [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ]
is another metric learning approach, which learns a Mahalanobis distance from
equivalence constraints derived from target labels.
        </p>
        <p>
          In [
          <xref ref-type="bibr" rid="ref34">34</xref>
          ], a comparison model based on the Probabilistic Relative Distance
Comparison (PRDC) approach is introduced. It aims at maximising the probability
that a pair of correctly matched images has a smaller distance than an incorrectly
matched pair. In [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ], the same authors model person re-id as a transfer ranking
problem, whose main goal is to transfer similarity observations from
a small gallery to a larger unlabelled probe set. Camera transfer approaches
have also been described; these use images of the same person captured
from different cameras to learn the associated metrics [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ], [
          <xref ref-type="bibr" rid="ref37">37</xref>
          ]. The
Multiple Component Dissimilarity (MCD) framework, which allows one to turn a given
appearance-based re-id method into a dissimilarity-based one, is described in [
          <xref ref-type="bibr" rid="ref38">38</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-1">
        <title>Publicly available datasets</title>
        <p>
          Different public datasets used to test re-id models are available. Currently,
VIPeR (https://vision.soe.ucsc.edu), iLIDS (http://www.eecs.qmul.ac.uk),
ETHZ (https://data.vision.ee.ethz.ch/cvl/aess/dataset) and CAVIAR4REID
(http://www.lorisbazzani.info/datasets) are the most commonly used for
re-id evaluations. Many aspects of the person re-id problem are covered by these
datasets, such as occlusions, shape deformation, very low resolution images,
illumination changes, image blurring, etc. [
          <xref ref-type="bibr" rid="ref39">39</xref>
          ]. The VIPeR dataset [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] consists
of images of people from two different camera views, with only one image
of each person per camera. The dataset was collected for testing
viewpoint-invariant pedestrian recognition and contains 632 pedestrian image
pairs, normalized to 48×128 pixels, taken from arbitrary viewpoints under varying
illumination conditions. iLIDS was acquired in crowded public spaces [
          <xref ref-type="bibr" rid="ref39">39</xref>
          ] and is
used for tracking evaluation. This dataset collects 479 images of 119 people
acquired from non-overlapping cameras. In [
          <xref ref-type="bibr" rid="ref40">40</xref>
          ], a modified version of the dataset
with 69 individuals, iLIDS 4, is introduced, because iLIDS does not fit well in
a multi-shot scenario: the average number of images per person is 4, and some
individuals have only two images. In iLIDS 4, a subset of individuals with at
least four images has been selected. The ETHZ dataset has images of people
taken by a moving camera [
          <xref ref-type="bibr" rid="ref41">41</xref>
          ] and contains three sequences, with multiple
images of each person per sequence. It collects three sub-datasets: ETHZ1 with 83
people and 4857 images, ETHZ2 with 35 people and 1936 images, and
ETHZ3 with 28 people and 1762 images. In [
          <xref ref-type="bibr" rid="ref42">42</xref>
          ], CAVIAR4REID is introduced,
which is extracted from another multi-camera tracking dataset captured at an
indoor shopping mall in Lisbon with two cameras with overlapping views. The
dataset described in [
          <xref ref-type="bibr" rid="ref42">42</xref>
          ] contains multiple images of pedestrians. The images
for each pedestrian were selected to maximize appearance variations due to
resolution changes, occlusions, light conditions and pose changes. 72 individuals
are identified (with image sizes varying from 17×39 to 72×144 pixels); 50 are captured
by both views and 22 by just one camera. In [
          <xref ref-type="bibr" rid="ref43">43</xref>
          ], another re-id
dataset is introduced, composed of 79 people and 4 groups.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>TVPR Dataset</title>
        <p>
          The proposed system has been experimentally validated on the TVPR (Top View
Person Re-identification) dataset (http://vrai.dii.univpm.it/re-id-dataset) for person re-id [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
        <p>TVPR collects videos of 100 individuals recorded over several days with an
RGB-D camera installed in a top-view configuration. The camera is positioned
on the ceiling of a laboratory, at 4 m above the floor, and covers an area of
14.66 m² (4.43 m × 3.31 m). The camera is above the surface to be
analysed (Figure 1).</p>
        <p>The 100 people in our dataset were acquired in 23 registration sessions. Each
of the 23 folders contains the video of one registration session. Acquisitions were
recorded over 8 days, and the total registration time is about 2000 seconds.</p>
        <p>Registrations were performed in an indoor scenario, with people passing under
the camera. A big issue is environmental illumination: in each recording session,
the illumination is not constant, because it varies as a function of the
time of day and also depends on the natural illumination due to
weather conditions.</p>
        <p>During a registration session, each person walked at an average gait through
the recording area in one direction, subsequently turned back and repeated
the same route in the opposite direction. This protocol allows a clean
split of TVPR into a training set (the first passage of the person under the
camera) and a testing set (the second passage of the person under the camera).</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Methodology and Framework</title>
      <p>In this paper, the main goal is to ensure processing while maintaining the
maximum frame rate of the camera. The camera captures depth and colour images,
both with dimensions of 640×480 pixels, at a rate of up to approximately 30 fps,
and illuminates the scene/objects with structured light based on infrared
patterns. In particular, in order to carry out the assigned task in real time, it
is necessary to keep the entire processing time below 33 ms, which is the time
between two consecutive frames. To estimate the computational
time, a TVPR video of four persons passing under the camera was taken into
account. The time that the program takes to extract the features is measured
using the functions of the C++ "chrono" library.</p>
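      <p>As an illustration, the following minimal sketch shows how the per-frame
extraction time can be measured with std::chrono and compared against the 33 ms
frame budget; extract_features is a hypothetical placeholder, since the paper does
not publish its implementation.</p>
      <preformat><![CDATA[
#include <chrono>
#include <iostream>

// Hypothetical placeholder for the per-frame feature extraction step
// (computes d1..d7 from the depth image plus the two colour features).
void extract_features() { /* ... */ }

int main() {
    using namespace std::chrono;
    constexpr milliseconds frame_budget{33};  // ~30 fps => 33 ms between frames

    auto t0 = steady_clock::now();
    extract_features();
    auto t1 = steady_clock::now();

    auto elapsed = duration_cast<microseconds>(t1 - t0);
    std::cout << "feature extraction: " << elapsed.count() / 1000.0 << " ms\n";
    if (elapsed > frame_budget) {
        std::cout << "frame budget exceeded\n";
    }
    return 0;
}
]]></preformat>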
      <p>
        The second step involves the processing of the data acquired by the RGB-D
camera. Seven of the nine selected features are anthropometric features
extracted from the depth image: distance between floor and head, d1; distance
between floor and shoulders, d2; area of head surface, d3; head circumference,
d4; shoulders circumference, d5; shoulders breadth, d6; thoracic anteroposterior
depth, d7. The remaining two colour-based features are acquired from the colour
image. In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], we also defined TVH, the colour descriptor; TVD, the depth
descriptor; and TVDH, the signature of a person.
      </p>
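      <p>As a concrete (assumed) representation of these descriptors, the nine features
could be laid out as follows; the exact encoding used in [9] may differ.</p>
      <preformat><![CDATA[
#include <array>

// Anthropometric depth features d1..d7 (the TVD descriptor), as listed above.
struct TVD {
    double d1;  // distance between floor and head
    double d2;  // distance between floor and shoulders
    double d3;  // area of the head surface
    double d4;  // head circumference
    double d5;  // shoulders circumference
    double d6;  // shoulders breadth
    double d7;  // thoracic anteroposterior depth
};

// Colour descriptor TVH: two colour-based features from the RGB image
// (their definition is given in [9]; two scalar values are assumed here).
struct TVH {
    std::array<double, 2> colour;
};

// TVDH: the signature of a person, combining depth and colour descriptors.
struct TVDH {
    TVD depth;
    TVH colour;
};
]]></preformat>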
      <p>
        For our experiments, we perform person re-id classification selecting the first
passage under the camera for training and using the return to the initial position
as the testing set. We tested all feature sets using the k-Nearest Neighbors (kNN)
classifier [
        <xref ref-type="bibr" rid="ref44">44</xref>
        ], Support Vector Machine [
        <xref ref-type="bibr" rid="ref45">45</xref>
        ], [
        <xref ref-type="bibr" rid="ref46">46</xref>
        ], [
        <xref ref-type="bibr" rid="ref47">47</xref>
        ], Decision Tree [
        <xref ref-type="bibr" rid="ref48">48</xref>
        ] and
Random Forest [
        <xref ref-type="bibr" rid="ref49">49</xref>
        ], and we evaluate performance in terms of Precision, Recall
and F1-score.
      </p>
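      <p>For reference, per-class Precision, Recall and F1-score follow the standard
definitions over a confusion matrix; the sketch below is a generic implementation
of those definitions, not code from the paper.</p>
      <preformat><![CDATA[
#include <cstdio>
#include <vector>

// Per-class precision, recall and F1 from a square confusion matrix where
// cm[i][j] = number of samples of true class i predicted as class j.
void report_metrics(const std::vector<std::vector<long>>& cm) {
    const size_t n = cm.size();
    for (size_t c = 0; c < n; ++c) {
        long tp = cm[c][c], fp = 0, fn = 0;
        for (size_t k = 0; k < n; ++k) {
            if (k != c) { fp += cm[k][c]; fn += cm[c][k]; }
        }
        double prec = (tp + fp) > 0 ? double(tp) / (tp + fp) : 0.0;
        double rec  = (tp + fn) > 0 ? double(tp) / (tp + fn) : 0.0;
        double f1   = (prec + rec) > 0 ? 2 * prec * rec / (prec + rec) : 0.0;
        std::printf("class %zu: P=%.3f R=%.3f F1=%.3f\n", c, prec, rec, f1);
    }
}
]]></preformat>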
      <p>Finally, a gender classification, based on colour and hair length, is carried out
with the aim of improving retail applications. This aspect could be particularly
useful in retail, where new customers are certainly important, but returning
customers should carry greater weight. Recognising a customer's gender is crucial
information for retailers, who need to know who their potential customers are in
order to adapt the market to them more effectively.</p>
    </sec>
    <sec id="sec-4">
      <title>Results and discussion</title>
      <p>The tests were performed on a notebook PC equipped with an Intel(R)
Core(TM) i7-4510U CPU @ 2.00 GHz and 12 GB of RAM, running the Ubuntu
14.04 operating system. Figure 2a shows eight peaks corresponding to the time
intervals in which the persons pass under the camera. During these intervals
the features are extracted, and the time spent on feature extraction is estimated
at around 15 ms per frame. Spurious spikes are due to operating system processes
running on the same machine.</p>
      <p>The next step is to identify the person who passes again under the
camera. The classification task is based on the predictor features extracted from
each frame in which the person passes through. In principle, it would be enough
to extract features from a single frame to identify the unique id of the
person, but the more frames are taken into account, the greater the accuracy of
the recognition of the correct person.</p>
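      <p>The paper does not specify how the per-frame predictions are aggregated; one
simple scheme consistent with this description, assumed here for illustration, is a
majority vote over frames.</p>
      <preformat><![CDATA[
#include <map>
#include <vector>

// Aggregate per-frame classifier outputs into a single person id by majority
// vote; more frames generally make the vote more reliable.
int majority_vote(const std::vector<int>& frame_predictions) {
    std::map<int, int> votes;
    for (int id : frame_predictions) ++votes[id];
    int best_id = -1, best_count = -1;
    for (const auto& [id, count] : votes) {
        if (count > best_count) { best_id = id; best_count = count; }
    }
    return best_id;
}
]]></preformat>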
      <p>Feature extraction and classification must both be performed within the time
interval between two consecutive frames. Since feature extraction takes about 15 ms
of the 33 ms budget, less than 33 - 15 = 18 ms remains for the execution time of
the classification step.</p>
      <p>
        To evaluate our dataset, the performance results are reported in terms of
recognition rate, using CMC curves, as previously described in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Figure 3
depicts a comparison between TVH and TVD in terms of CMC curves, to
compare the ranks returned by these different descriptors; the horizontal
axis is the rank of the matching score, and the vertical axis is the probability of
correct identification.
      </p>
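      <p>For clarity, CMC(k) is the fraction of probes whose correct identity appears
within the top k ranked gallery matches; a generic sketch of this computation
(illustrative names, not the paper's code) is given below.</p>
      <preformat><![CDATA[
#include <vector>

// ranks[i] = rank (1-based) at which probe i's correct identity appears
// after sorting the gallery by matching score.
// Returns cmc, where cmc[k-1] = fraction of probes matched within rank k.
std::vector<double> cmc_curve(const std::vector<int>& ranks, int max_rank) {
    std::vector<double> cmc(max_rank, 0.0);
    for (int r : ranks) {
        if (r >= 1 && r <= max_rank) cmc[r - 1] += 1.0;
    }
    // Cumulative sum, then normalise by the total number of probes.
    for (int k = 1; k < max_rank; ++k) cmc[k] += cmc[k - 1];
    for (double& v : cmc) v /= static_cast<double>(ranks.size());
    return cmc;
}
]]></preformat>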
      <p>In particular, Figure 3a represents the CMC obtained for TVH. Figure 3b
provides the CMC obtained for TVD. We compare these results with the average
obtained by TVH and TVD. The average CMC is displayed in Figure 3d.</p>
      <p>The best performance is achieved when the combination of descriptors is
used. This can be inferred from Figure 3d, where the combination of descriptors
improves the results obtained by each descriptor separately. This result is due
to the depth contribution, which may be more informative. In fact, depth
outperforms the colour measure, giving the best performance for rank values higher
than 15 (Figure 3b). Its better performance suggests the importance and potential
of this descriptor.</p>
      <p>The classification process is performed with the kNN, SVM, DT and RF
classifiers. We carried out two experiments: a classic training/testing experiment and
a gender classification, both based on the TVPR dataset.</p>
      <p>For the TVD descriptor, the task is solved with an SVM using a
polynomial kernel of quadratic degree, while for the other descriptors an SVM
with a polynomial kernel of cubic degree is used. For the kNN
classifier, the "minkowski" distance metric and n_neighbors = 5 were
chosen.</p>
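      <p>The paper does not name the classification library (the parameter names
"minkowski" and n_neighbors = 5 match scikit-learn's kNN). As one possible C++
sketch, the stated SVM configuration can be reproduced with OpenCV's ml module;
note that OpenCV's KNearest exposes only k and uses the Euclidean (L2) distance,
which is the Minkowski metric with p = 2.</p>
      <preformat><![CDATA[
#include <opencv2/ml.hpp>

using cv::Ptr;
using namespace cv::ml;

// SVM with a polynomial kernel: degree 2 for the TVD descriptor,
// degree 3 for TVH and TVDH (as reported in the text).
Ptr<SVM> make_svm(double poly_degree) {
    Ptr<SVM> svm = SVM::create();
    svm->setType(SVM::C_SVC);
    svm->setKernel(SVM::POLY);
    svm->setDegree(poly_degree);
    return svm;
}

// kNN with k = 5; OpenCV's KNearest uses the Euclidean (L2) distance.
Ptr<KNearest> make_knn() {
    Ptr<KNearest> knn = KNearest::create();
    knn->setDefaultK(5);
    return knn;
}
]]></preformat>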
      <p>For the first case, we consider the first passage under the camera as the
training set and the return to the initial position as the testing set. The dataset is
composed of 21685 instances, divided into 11683 for training and 10002 for testing.</p>
      <p>[Figure 3: CMC curves of the colour and depth descriptors under the L1 city
block, Euclidean and cosine distances; the horizontal axis is the rank (10-100).]</p>
      <p>Table 1 reports, for each person in TVPR, the recognition results for the kNN
classifier with the TVDH descriptor.</p>
      <p>The re-id classification performance on TVPR is summarized in Table 2, with
a comparison among the descriptors TVH, TVD and TVDH. Figure 4 shows
the best confusion matrices for the three descriptors: TVD with the SVM classifier
(Figure 4a), TVH with the kNN classifier (Figure 4b) and TVDH with the kNN
classifier (Figure 4c).</p>
      <p>In this case, we observe high performance for our proposed approach
to re-identifying people. This underlines the feasibility of using colour as an
effective cue in re-id scenarios. Moreover, the comparative study
of the two descriptors TVD and TVH shows the influence of colour
in the top-view re-id scenario. However, the TVD descriptor is also important for
re-id, because it improves the overall precision, as Figure 4c shows.</p>
      <p>In this experiment, we classify gender considering hair length and
colour. The results are summarized in Table 3. Figure 5 depicts the confusion
matrix for the kNN classifier.</p>
      <p>Results confirm the effectiveness and suitability of the proposed approach.
The class F SD ("female with dark and short hair") is often confused, because
females commonly have hair of considerable length. The same holds for class
M LD ("male with dark and long hair"), because short hair is the typical Italian
male hairstyle. For the other classes, the overall classification precision is over 76%.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and future work</title>
      <p>In this paper, a WCET analysis for top-view person re-identification has
been presented. The estimated execution time of the classification models is
below 18 ms, within the useful time boundaries for the effectiveness of the
proposed application. Person recognition is handled using the k-Nearest
Neighbors, Support Vector Machine, Decision Tree and Random Forest
classifiers, and performance is evaluated in terms of Precision, Recall and
F1-score in a classic training/testing experiment. In addition, a gender
classification, based on colour and hair length, is carried out with the aim
of improving retail applications. This approach is useful for different
purposes in retail, such as the identification of returning customers and
predictive analytics for personalised promotions. Customer analytics are also
the most useful instrument to address both consumer and enterprise needs. The
experimental results demonstrate the effectiveness and suitability of our
approach, which achieves high accuracy and performs better without having to
rely on the data annotation required by other existing approaches. Further
investigation will be devoted to improving our approach by extracting other
informative features and setting up a full neural network for the real-time
processing of video images. Future work also includes the evaluation of the
resources necessary for the design of CNN layers.</p>
      <p>In the field of retail, the long-term goal of this work is to integrate this
re-identification system with an audio framework, and to use other types of
RGB-D cameras, such as time of flight (TOF) ones. The system can additionally
be integrated as a source of high semantic level information in a networked
ambient intelligence scenario, to provide cues for different problems, such as
detecting abnormal speed and dimension outliers, that can alert one to a possible
uncontrolled circumstance. It would also be interesting to evaluate both colour
and depth images in a way that does not decrease the performance of the system
when the colour image is affected by changes in pose and/or illumination.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgement</title>
      <p>This work was supported by FIT - Fondo speciale rotativo per l'Innovazione
Tecnologica, Programme Title "Study, design and prototyping of an innovative
artificial vision system for human behaviour analysis in domestic and commercial
environments" (HBA 2.0 - Human Behaviour Analysis).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Vezzani</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baltieri</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cucchiara</surname>
          </string-name>
          , R.:
          <article-title>People reidentification in surveillance and forensics: A survey</article-title>
          .
          <source>ACM Computing Surveys (CSUR) 46(2)</source>
          (
          <year>2013</year>
          )
          <fpage>29</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Chahla</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Snoussi</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abdallah</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dornaika</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Discriminant quaternion local binary pattern embedding for person re-identification through prototype formation and color categorization</article-title>
          .
          <source>Engineering Applications of Artificial Intelligence</source>
          <volume>58</volume>
          (
          <year>2017</year>
          )
          <volume>27</volume>
          {
          <fpage>33</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Hariri</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tabia</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Farah</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Benouareth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Declercq</surname>
            ,
            <given-names>D.:</given-names>
          </string-name>
          <article-title>3d facial expression recognition using kernel methods on riemannian manifold</article-title>
          .
          <source>Engineering Applications of Artificial Intelligence</source>
          <volume>64</volume>
          (
          <year>2017</year>
          )
          <volume>25</volume>
          {
          <fpage>32</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Farou</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kouahla</surname>
            ,
            <given-names>M.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seridi</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Akdag</surname>
          </string-name>
          , H.:
          <article-title>Efficient local monitoring approach for the task of background subtraction</article-title>
          .
          <source>Engineering Applications of Artificial Intelligence</source>
          <volume>64</volume>
          (
          <year>2017</year>
          )
          <volume>1</volume>
          {
          <fpage>12</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Lisanti</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Masi</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bagdanov</surname>
            ,
            <given-names>A.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Del Bimbo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Person re-identification by iterative re-weighted sparse ranking</article-title>
          .
          <source>IEEE transactions on pattern analysis and machine intelligence</source>
          <volume>37</volume>
          (
          <issue>8</issue>
          ) (
          <year>2015</year>
          )
          <volume>1629</volume>
          {
          <fpage>1642</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Paolanti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liciotti</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pietrini</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mancini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frontoni</surname>
          </string-name>
          , E.:
          <article-title>Modelling and forecasting customer navigation in intelligent retail environments</article-title>
          .
          <source>Journal of Intelligent &amp; Robotic Systems</source>
          (
          <year>2017</year>
          )
          <volume>1</volume>
          {
          <fpage>16</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Liciotti</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Contigiani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frontoni</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mancini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zingaretti</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Placidi</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Shopper analytics: A customer activity recognition system using a distributed rgbd camera network</article-title>
          .
          <source>In: Video Analytics for Audience Measurement</source>
          . Springer (
          <year>2014</year>
          )
          <volume>146</volume>
          {
          <fpage>157</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Liciotti</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paolanti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frontoni</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zingaretti</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>People detection and tracking from an rgb-d camera in top-view configuration: Review of challenges and applications</article-title>
          .
          <source>In: International Conference on Image Analysis and Processing</source>
          , Springer (
          <year>2017</year>
          )
          <volume>207</volume>
          {
          <fpage>218</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Liciotti</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paolanti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frontoni</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mancini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zingaretti</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Person reidentification dataset with rgb-d camera in a top-view configuration</article-title>
          .
          <source>In: Video Analytics for Face, Face Expression Recognition, and Audience Measurement</source>
          . Springer (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Sturari</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liciotti</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pierdicca</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frontoni</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mancini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Contigiani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zingaretti</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Robust and affordable retail customer profiling by vision and radio beacon sensor fusion</article-title>
          .
          <source>Pattern Recognition Letters</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Calvaresi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cesarini</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sernani</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marinoni</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dragoni</surname>
            ,
            <given-names>A.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sturm</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Exploring the ambient assisted living domain: a systematic review</article-title>
          .
          <source>Journal of Ambient Intelligence and Humanized Computing</source>
          <volume>8</volume>
          (
          <issue>2</issue>
          ) (
          <year>2017</year>
          )
          <volume>239</volume>
          {
          <fpage>257</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Calvaresi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marinoni</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sturm</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schumacher</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buttazzo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>The challenge of real-time multi-agent systems for enabling iot and cps</article-title>
          .
          <source>In: Proceedings of the International Conference on Web Intelligence</source>
          ,
          <source>ACM</source>
          (
          <year>2017</year>
          )
          <volume>356</volume>
          {
          <fpage>364</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Sernani</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvaresi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvaresi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pierdicca</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morbidelli</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dragoni</surname>
            ,
            <given-names>A.F.</given-names>
          </string-name>
          :
          <article-title>Testing intelligent solutions for the ambient assisted living in a simulator</article-title>
          .
          <source>In: Proceedings of the 9th ACM International Conference on PErvasive Technologies</source>
          Related to Assistive Environments, ACM (
          <year>2016</year>
          )
          <fpage>71</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Paolanti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaiser</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schallner</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frontoni</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zingaretti</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Visual and textual sentiment analysis of brand-related social media pictures using deep convolutional neural networks</article-title>
          .
          <source>In: International Conference on Image Analysis and Processing</source>
          , Springer (
          <year>2017</year>
          )
          <volume>402</volume>
          {
          <fpage>413</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Paolanti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sturari</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mancini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zingaretti</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frontoni</surname>
          </string-name>
          , E.:
          <article-title>Mobile robot for retail surveying and inventory using visual and textual analysis of monocular pictures based on deep learning</article-title>
          .
          <source>In: Mobile Robots (ECMR)</source>
          ,
          <source>2017 European Conference on, IEEE</source>
          (
          <year>2017</year>
          ) 1{
          <fpage>6</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Sturari</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paolanti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frontoni</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mancini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zingaretti</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Robotic platform for deep change detection for rail safety and security</article-title>
          .
          <source>In: Mobile Robots (ECMR)</source>
          ,
          <source>2017 European Conference on, IEEE</source>
          (
          <year>2017</year>
          ) 1{
          <fpage>6</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Messelodi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Modena</surname>
            ,
            <given-names>C.M.:</given-names>
          </string-name>
          <article-title>Boosting fisher vector based scoring functions for person re-identification</article-title>
          .
          <source>Image and Vision Computing</source>
          <volume>44</volume>
          (
          <year>2015</year>
          )
          <volume>44</volume>
          {
          <fpage>58</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Havasi</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Szlavik</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sziranyi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Eigenwalks: Walk detection and biometrics from symmetry patterns</article-title>
          .
          <source>In: IEEE International Conference on Image Processing 2005</source>
          . Volume
          <volume>3</volume>
          ., IEEE (
          <year>2005</year>
          ) III{
          <fpage>289</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Fischer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ekenel</surname>
            ,
            <given-names>H.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stiefelhagen</surname>
          </string-name>
          , R.:
          <article-title>Interactive person re-identification in tv series</article-title>
          .
          <source>In: Content-Based Multimedia Indexing (CBMI)</source>
          , 2010 International Workshop on, IEEE (
          <year>2010</year>
          ) 1{
          <fpage>6</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Calderara</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prati</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cucchiara</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>HECOL: Homography and epipolar-based consistent labeling for outdoor park surveillance</article-title>
          .
          <source>Computer Vision and Image Understanding</source>
          <volume>111</volume>
          (
          <issue>1</issue>
          ) (
          <year>2008</year>
          )
          <fpage>21</fpage>–<lpage>42</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Javed</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shafique</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rasheed</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Modeling inter-camera space–time and appearance relationships for tracking across non-overlapping views</article-title>
          .
          <source>Computer Vision and Image Understanding</source>
          <volume>109</volume>
          (
          <issue>2</issue>
          ) (
          <year>2008</year>
          )
          <fpage>146</fpage>–<lpage>162</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brennan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tao</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Evaluating appearance models for recognition, reacquisition, and tracking</article-title>
          .
          <source>In: Proc. IEEE International Workshop on Performance Evaluation for Tracking and Surveillance (PETS)</source>
          . Volume
          <volume>3</volume>
          ., Citeseer
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Farenzena</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bazzani</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perina</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murino</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cristani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Person re-identification by symmetry-driven accumulation of local features</article-title>
          .
          <source>In: Computer Vision and Pattern Recognition (CVPR)</source>
          ,
          <source>2010 IEEE Conference on, IEEE</source>
          (
          <year>2010</year>
          )
          <fpage>2360</fpage>–<lpage>2367</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Alahi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vandergheynst</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bierlaire</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kunt</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Cascade of descriptors to detect and track objects across any network of cameras</article-title>
          .
          <source>Computer Vision and Image Understanding</source>
          <volume>114</volume>
          (
          <issue>6</issue>
          ) (
          <year>2010</year>
          )
          <fpage>624</fpage>–<lpage>640</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Gandhi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trivedi</surname>
            ,
            <given-names>M.M.</given-names>
          </string-name>
          :
          <article-title>Panoramic appearance map (PAM) for multi-camera based person re-identification</article-title>
          .
          <source>In: 2006 IEEE International Conference on Video and Signal Based Surveillance</source>
          , IEEE (
          <year>2006</year>
          )
          <fpage>78</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Gheissari</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sebastian</surname>
            ,
            <given-names>T.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hartley</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Person reidentification using spatiotemporal appearance</article-title>
          .
          <source>In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)</source>
          . Volume
          <volume>2</volume>
          ., IEEE (
          <year>2006</year>
          )
          <fpage>1528</fpage>–<lpage>1535</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tao</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Viewpoint invariant pedestrian recognition with an ensemble of localized features</article-title>
          .
          <source>In: European conference on computer vision</source>
          , Springer (
          <year>2008</year>
          )
          <fpage>262</fpage>–<lpage>275</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Bazzani</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cristani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perina</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Farenzena</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murino</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Multiple-shot person re-identification by HPE signature</article-title>
          .
          <source>In: Pattern Recognition (ICPR), 2010 20th International Conference on</source>
          , IEEE (
          <year>2010</year>
          )
          <fpage>1413</fpage>–<lpage>1416</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Bazzani</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cristani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perina</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murino</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Multiple-shot person re-identification by chromatic and epitomic analyses</article-title>
          .
          <source>Pattern Recognition Letters</source>
          <volume>33</volume>
          (
          <issue>7</issue>
          ) (
          <year>2012</year>
          )
          <fpage>898</fpage>–<lpage>903</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Schwartz</surname>
            ,
            <given-names>W.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>L.S.</given-names>
          </string-name>
          :
          <article-title>Learning discriminative appearance-based models using partial least squares</article-title>
          .
          <source>In: Computer Graphics and Image Processing (SIBGRAPI)</source>
          ,
          <source>2009 XXII Brazilian Symposium on, IEEE</source>
          (
          <year>2009</year>
          )
          <fpage>322</fpage>–<lpage>329</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Dikmen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Akbas</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>T.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ahuja</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Pedestrian recognition with a learned metric</article-title>
          .
          <source>In: Asian conference on Computer vision</source>
          , Springer (
          <year>2010</year>
          )
          <fpage>501</fpage>–<lpage>512</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Prosser</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zheng</surname>
            ,
            <given-names>W.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gong</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Person re-identification by support vector ranking</article-title>
          .
          <source>In: BMVC</source>
          . Volume
          <volume>2</volume>
          . (
          <year>2010</year>
          )
          <fpage>6</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33. Kostinger,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Hirzer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Wohlhart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Roth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.M.</given-names>
            ,
            <surname>Bischof</surname>
          </string-name>
          , H.:
          <article-title>Large scale metric learning from equivalence constraints</article-title>
          .
          <source>In: Computer Vision and Pattern Recognition (CVPR)</source>
          ,
          <source>2012 IEEE Conference on, IEEE</source>
          (
          <year>2012</year>
          )
          <fpage>2288</fpage>–<lpage>2295</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <surname>Zheng</surname>
            ,
            <given-names>W.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gong</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Reidentification by relative distance comparison</article-title>
          .
          <source>IEEE transactions on pattern analysis and machine intelligence</source>
          <volume>35</volume>
          (
          <issue>3</issue>
          ) (
          <year>2013</year>
          )
          <fpage>653</fpage>–<lpage>668</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          35.
          <string-name>
            <surname>Zheng</surname>
            ,
            <given-names>W.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gong</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Person re-identification by probabilistic relative distance comparison</article-title>
          .
          <source>In: Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on</source>
          , IEEE
          (
          <year>2011</year>
          )
          <fpage>649</fpage>–<lpage>656</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          36.
          <string-name>
            <surname>Avraham</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gurvich</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lindenbaum</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Markovitch</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Learning implicit transfer for person re-identification</article-title>
          .
          <source>In: European Conference on Computer Vision</source>
          , Springer (
          <year>2012</year>
          )
          <fpage>381</fpage>–<lpage>390</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          37.
          <string-name>
            <surname>Hirzer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roth</surname>
            ,
            <given-names>P.M.</given-names>
          </string-name>
          , Kostinger,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Bischof</surname>
          </string-name>
          , H.:
          <article-title>Relaxed pairwise learned metric for person re-identification</article-title>
          .
          <source>In: European Conference on Computer Vision</source>
          , Springer (
          <year>2012</year>
          )
          <fpage>780</fpage>–<lpage>793</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          38.
          <string-name>
            <surname>Satta</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fumera</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roli</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Fast person re-identification based on dissimilarity representations</article-title>
          .
          <source>Pattern Recognition Letters</source>
          <volume>33</volume>
          (
          <issue>14</issue>
          ) (
          <year>2012</year>
          )
          <fpage>1838</fpage>–<lpage>1848</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          39.
          <string-name>
            <surname>Gong</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cristani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Loy</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          :
          <article-title>Person re-identification</article-title>
          . Volume
          <volume>1</volume>
          . Springer (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          40.
          <string-name>
            <surname>Bazzani</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cristani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murino</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>SDALF: modeling human appearance with symmetry-driven accumulation of local features</article-title>
          . In: Person Re-Identification. Springer (
          <year>2014</year>
          )
          <fpage>43</fpage>–<lpage>69</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          41.
          <string-name>
            <surname>Ess</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leibe</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gool</surname>
            ,
            <given-names>L.V.</given-names>
          </string-name>
          :
          <article-title>Depth and appearance for mobile scene analysis</article-title>
          .
          <source>In: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on</source>
          , IEEE (
          <year>2007</year>
          )
          <fpage>1</fpage>–<lpage>8</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          42.
          <string-name>
            <surname>Cheng</surname>
            ,
            <given-names>D.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cristani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoppa</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bazzani</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murino</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Custom pictorial structures for re-identification</article-title>
          .
          <source>In: BMVC</source>
          . Volume
          <volume>1</volume>
          . (
          <year>2011</year>
          )
          <fpage>6</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          43.
          <string-name>
            <surname>Barbosa</surname>
            ,
            <given-names>I.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cristani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Del Bue</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bazzani</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murino</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Re-identification with RGB-D sensors</article-title>
          .
          <source>In: Computer Vision – ECCV 2012. Workshops and Demonstrations</source>
          , Springer (
          <year>2012</year>
          )
          <fpage>433</fpage>–<lpage>442</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          44.
          <string-name>
            <surname>Duda</surname>
            ,
            <given-names>R.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hart</surname>
            ,
            <given-names>P.E.</given-names>
          </string-name>
          , et al.:
          <article-title>Pattern classification and scene analysis</article-title>
          . Volume
          <volume>3</volume>
          . Wiley New York (
          <year>1973</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          45.
          <string-name>
            <surname>Cortes</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Support-vector networks</article-title>
          .
          <source>Machine Learning</source>
          <volume>20</volume>
          (
          <issue>3</issue>
          )
          (
          <year>1995</year>
          )
          <fpage>273</fpage>–<lpage>297</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          46.
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.N.</given-names>
          </string-name>
          :
          <article-title>The nature of statistical learning theory</article-title>
          (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          47.
          <string-name>
            <surname>Boser</surname>
            ,
            <given-names>B.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guyon</surname>
            ,
            <given-names>I.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.N.</given-names>
          </string-name>
          :
          <article-title>A training algorithm for optimal margin classifiers</article-title>
          .
          <source>In: Proceedings of the fifth annual workshop on Computational learning theory, ACM</source>
          (
          <year>1992</year>
          )
          <fpage>144</fpage>–<lpage>152</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          48.
          <string-name>
            <surname>Quinlan</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          :
          <article-title>C4.5: programs for machine learning</article-title>
          . Elsevier
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          49.
          <string-name>
            <surname>Breiman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Random forests</article-title>
          .
          <source>Machine Learning</source>
          <volume>45</volume>
          (
          <issue>1</issue>
          )
          (
          <year>2001</year>
          )
          <fpage>5</fpage>–<lpage>32</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>