=Paper=
{{Paper
|id=Vol-2872/paper06
|storemode=property
|title=Deep Learning for Law Enforcement: A Survey About Three Application Domains
|pdfUrl=https://ceur-ws.org/Vol-2872/paper06.pdf
|volume=Vol-2872
|authors=Paolo Contardo,Paolo Sernani,Nicola Falcionelli,Aldo Franco Dragoni
|dblpUrl=https://dblp.org/rec/conf/rtacsit/ContardoSFD21
}}
==Deep Learning for Law Enforcement: A Survey About Three Application Domains==
<pdf width="1500px">https://ceur-ws.org/Vol-2872/paper06.pdf</pdf>
<pre>
Deep learning for law enforcement: a survey
about three application domains
Paolo Contardo (a,b), Paolo Sernani (a), Nicola Falcionelli (a) and Aldo Franco Dragoni (a)
(a) Information Engineering Department, Università Politecnica delle Marche, Via Brecce Bianche 12, 60131 Ancona, Italy
(b) Gabinetto Interregionale di Polizia Scientifica per le Marche e l’Abruzzo, Via Gervasoni 19, Ancona 60129, Italy


Abstract
Deep learning is rapidly growing, obtaining groundbreaking results in speech recognition, image
processing, pattern recognition, and many other application domains. Following the success of deep
learning, many automatic data analysis techniques are becoming common also in law enforcement
agencies. To this end, we present a survey about the potential impact of deep learning on three
application domains, peculiar to law enforcement agencies. Specifically, we analyze the findings
about deep learning for Face Recognition, Fingerprint Recognition, and Violence Detection. In fact,
combining 1) data from the routine procedure of collecting a subject's frontal and profile pictures
and her/his fingerprints, 2) the pervasiveness of surveillance cameras, and 3) the capability of
learning from a huge amount of data, might support the next steps in crime prevention.

Keywords
Face Recognition, Fingerprint Identification, Fingerprint Verification, Violence Detection, Deep
Learning, Artificial Intelligence, Law Enforcement


RTA-CSIT 2021: 4th International Conference Recent Trends and Applications In Computer Science And
Information Technology, May 21–22, 2021, Tirana, Albania
p.contardo@pm.univpm.it (P. Contardo); p.sernani@univpm.it (P. Sernani);
n.falcionelli@pm.univpm.it (N. Falcionelli); a.f.dragoni@univpm.it (A.F. Dragoni)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

From its dawn as a discipline, Artificial Intelligence (AI) has aimed to understand whether we are
able to implement machines with the ability to think. During this unceasing exploration, symbolic
AI, also known as Good Old-Fashioned AI [1], tries to model the knowledge of the application
domains in a high-level human-readable formalism. As such, countless applications rely on symbolic
AI, ranging from personal health systems [2, 3] to police investigations [4], to the modeling of
automata [5] and autonomous agents [6, 7, 8], to smart home reasoning systems [9, 10, 11] and many
more. On the other side, machine learning tries to give machines the capability of autonomously
learning from examples. In this regard, we are witnessing the rapid growth of deep learning: it
aims to build computational models, composed of multiple processing layers, able to autonomously
learn the best representations of data to accomplish specific tasks, such as speech recognition,
visual object recognition, pattern recognition, and many others [12].
   Following the progress achieved by AI, several data analysis methods based on symbolic AI and/or
deep learning are becoming popular among law enforcement agencies [13]. To this end, we present a
survey about the impact of deep learning techniques on three application domains, which are common to
law enforcement agencies:

     • Face Recognition, in connection to the use of mugshots gathered during the routine
       procedure of collecting a person's frontal and profile pictures, her/his fingerprints,
       and personal information;
     • Fingerprint Recognition and, specifically, the extraction of minutiae, i.e. the
       distinctive features used for fingerprint matching;
     • Violence Detection, with the goal of unburdening law authorities from the need to
       manually check hours of video footage to identify short events.

While these domains seem different, the computerization of the related tasks has common roots in
Computer Vision and is rapidly evolving thanks to deep learning. Therefore, the goal of this paper
is to give a concise description of such evolution, showing the potential impact of deep learning
in security applications and crime prevention.
   The rest of the paper is divided into sections dedicated to each application domain, i.e. Face
Recognition (Section 2), Fingerprint Recognition (Section 3), and Violence Detection (Section 4).
Finally, Section 5 draws the conclusions of this survey, highlighting some aspects which we
consider worthy of further research.

2. Deep Learning and Face Recognition

National police forces routinely collect two pictures (commonly known as mugshots), fingerprints,
and personal information of a subject, for various purposes, ranging from releasing documents to
registering criminals. Hence, there is a clear connection between Face Recognition and this data
collection procedure, as far as face identification is concerned. In fact, Face Recognition is one
of the most natural biometric techniques used for identification [14]. It has a significant
advantage over other biometric techniques: it can be done passively, i.e. without explicit actions
by the subject to be identified [15]. Therefore, due to the wide range of possible security
applications, Face Recognition has attracted the interest of the Computer Vision community for
more than 40 years.
   Thus, early approaches to Face Recognition were based on pure Computer Vision methodologies.
Turk and Pentland [16] proposed Eigenfaces, i.e. the application of the Principal Component
Analysis (PCA) to extract a vector of features that maximize the variance in a set of training
images. By projecting a face image in the space obtained with the PCA, face identification can be
performed with a nearest neighbor method, computing the distance from training images. While
Eigenfaces maximizes the inter-class variance between face images of different subjects, it does
not take into account the intra-class variance between the face images of a single subject.
Instead, the Fisherfaces method [17] adds to the PCA the Linear Discriminant Analysis (LDA), in
order to minimize intra-class variance. Differently from Eigenfaces and Fisherfaces, Ahonen et
al. [18] proposed to compute Local Binary Patterns Histograms (LBPH) on face images, dividing each
image into regions to compute Local Binary Patterns (LBP). Similarly to Eigenfaces and
Fisherfaces, a distance function based on LBPHs can be used to perform the face identification.
   While these techniques (and those derived) obtained a good accuracy on datasets where some
parameters such as pose, lighting, and expression are fixed, they are insufficient to extract
stable identity features invariant to real-world changes [19], such as in images taken
from videos and surveillance cameras. Therefore, they are not suitable in law enforcement, when
comparing the two mugshots (a frontal and a profile picture) collected by police agencies in ideal
conditions, with images taken in the wild. On the contrary, deep learning-based techniques proved
capable of extracting features that are invariant to changing conditions of facial expression,
lighting, and pose. While there are some early methods which combined multiple Neural Networks and
Belief Revision [20, 21] before deep learning became popular, Convolutional Neural Networks (CNNs)
significantly improved the accuracy in Face Recognition under unconstrained conditions. To this
end, Taigman et al. [22] presented DeepFace, an 8-layer CNN to process 3-channel 152x152 face
images, capable of getting a 97.35% accuracy on the Labeled Faces in the Wild (LFW) dataset [23].
Similarly, Schroff et al. [24] proposed Facenet, a 22-layer CNN trained in several experiments
with a varying number of face images, between 100 and 200 million, belonging to 8 million
different subjects. They got 99.63% accuracy on LFW, using 220 x 220 input images. Cao et al. [25]
showed the effectiveness of the ResNet-50 [26], a 50-layer CNN based on residual learning able to
get a top-1 identification error of 3.9% on the VGGFace2 dataset (composed of over 3 million
images of more than 9 thousand subjects).
   The listed CNN-based techniques for Face Recognition are just a few examples among the many
which demonstrated their robustness to changing conditions and unconstrained face images (see Guo
and Zhang [27] for a detailed list of deep learning-based Face Recognition techniques). However,
to the best of our knowledge, there is a lack of research in understanding to which extent such
techniques are effective in identifying a known subject when only the two standard images of
police databases are available as training samples.

3. Deep Learning and Fingerprint Recognition

The patterns created by the epidermal ridges and furrows on our fingers, i.e. fingerprints, have
been used for identification for more than 2000 years [28]. As fingerprints are such a
discriminative biometric characteristic, the implementation of Automated Fingerprint
Identification Systems (AFIS) has been a prominent topic in Computer Vision in the last four
decades. Specifically, fingerprint matching to identify or verify a person’s identity is based on
the presence of singularities of epidermal ridges called minutiae [29]. In this regard, algorithms
to extract features and perform matching on fingerprint images focused on two basic types of
minutiae: bifurcations and terminations, i.e. the points where a ridge splits itself into two
ridges and where a ridge ends [30, 31, 32]. In addition to issues such as image noise,
distortions, rotations, and displacement, large variability in different impressions of the same
finger and similarity between two images from different fingers make fingerprint matching a very
challenging problem [33].
   Traditional Computer Vision-based algorithms demonstrated their effectiveness on fingerprint
matching, and specifically, on minutiae matching, evolving over the years. For example, in 1997,
Maio and Maltoni [30] proposed to perform ridge line following on grayscale fingerprint images to
identify terminations and bifurcations. Farina et al. [31] proposed to identify minutiae from
skeletonized binary images. Fronthaler et al. [32] exploited symmetry features (linear and
parabolic) to reduce noise and extract minutiae on grayscale images. Cappelli et al. [34] proposed
a new representation for minutiae, treating the minutiae extraction and the fingerprint
recognition as a 3D pattern matching problem instead of a 2D one, obtaining top-level accuracy
results.
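As a rough illustration of how terminations and bifurcations can be located once a fingerprint
image has been binarized and thinned to a one-pixel-wide skeleton, the sketch below uses the
crossing number, a commonly used criterion for this step. It is a minimal sketch under stated
assumptions, not the procedure of any work cited above: the function names, the 0/1 list-of-lists
image encoding, and the toy skeleton are hypothetical, and practical systems also filter out
spurious minutiae caused by noise.

```python
# Offsets of the 8 neighbours, listed clockwise starting from the pixel above.
NEIGHBOURS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def crossing_number(skel, r, c):
    """Half the number of 0/1 transitions met while circling pixel (r, c)."""
    ring = [skel[r + dr][c + dc] for dr, dc in NEIGHBOURS]
    return sum(abs(ring[i] - ring[(i + 1) % 8]) for i in range(8)) // 2


def extract_minutiae(skel):
    """Return (row, col, type) for each termination or bifurcation.

    Assumes `skel` is a rectangular list of lists of 0/1 values holding a
    one-pixel-wide ridge skeleton; border pixels are skipped.
    """
    minutiae = []
    for r in range(1, len(skel) - 1):
        for c in range(1, len(skel[0]) - 1):
            if skel[r][c] != 1:
                continue
            cn = crossing_number(skel, r, c)
            if cn == 1:      # the ridge ends here
                minutiae.append((r, c, "termination"))
            elif cn == 3:    # the ridge splits into two ridges here
                minutiae.append((r, c, "bifurcation"))
    return minutiae


# Hypothetical toy skeleton: one ridge entering from the left and splitting in two.
toy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

for minutia in extract_minutiae(toy):
    print(minutia)
```

On the toy skeleton, the split point at row 3, column 3 is reported as a bifurcation and the three
ridge ends as terminations. The coordinates and type produced this way, together with a ridge
angle, are the kind of features on which minutiae-based matching relies.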
   Of course, these are just a few examples of the many algorithms and techniques available in
fingerprint matching. In fact, as highlighted in the survey of Peralta et al. [33], even if the
best performing algorithms are different, they are based on common features such as minutiae
coordinates, angle, and type. What, then, is the role of deep learning in fingerprint recognition,
given the maturity of the field and the good performance of traditional Computer Vision-based
algorithms? In recent years, deep learning-based techniques have proven useful to overcome some of
the limitations of traditional techniques. While traditional algorithms, such as those presented,
perform well on rolled and plain fingerprints collected with dedicated sensors, they fail on
latent fingerprints, i.e. partial fingerprints unintentionally impressed on surfaces [35, 36, 37,
38]. To this end, Tang et al. [36] proposed to convert the traditional operations for minutiae
extraction into a CNN that can be trained end-to-end. Similarly, Cao et al. [38] presented a
latent fingerprint recognition system based on CNNs. Li et al. [37] also proposed a CNN-based
architecture, but with a different objective: enhancing latent fingerprint images to be used for
the fingerprint matching (performed with other applications).
   Latent and partial fingerprint recognition is not the only open challenge addressed with deep
learning in the field. In the use of fingerprints for authentication, Lin and Kumar [39] presented
a model based on CNNs to learn discriminative 3D representations of fingerprints in contactless
fingerprint recognition applications. With the availability of high resolution scanners, CNN-based
architectures have been developed to recognize sweat pores in high resolution fingerprints [40,
41]. Finally, deep learning techniques are being investigated to detect malicious attempts to
authenticate via artificial fingerprints, for the development of anti-spoofing methods [42, 43].

4. Deep Learning and Violence Detection

The increasing availability of technologies for video-surveillance, combined with the need of
unburdening authorities from the task of checking hours of video recordings, boosted the attention
of the research community towards the automatic detection of violence in videos. Violence and
fight detection is considered a task of human action recognition: specifically, it is a binary
problem which consists of recognizing the presence or the absence of violence [44].
   As violence detection is rooted in action recognition, the early works are based on Computer
Vision techniques originally implemented for action recognition and can be categorized into two
classes [45], using hand-crafted features to represent actions:

     • in local features-based techniques, the representation of an action is computed by using
       Points of Interest (POIs) across the frames of a video;
     • in global features-based techniques, the representation of an action is computed by
       evaluating characteristics from multiple frames as a whole.

   Among the techniques which are based on local features, Chen and Hauptmann [46] proposed
MoSIFT, a technique that combines the Scale-Invariant Feature Transform (SIFT) [47] with optical
flow to represent the movement of POIs. Xu et al. [45] evolved the use of MoSIFT by combining it
with a non-parametric Kernel Density Estimation (KDE) to remove redundant and irrelevant features.
They achieved good results on detecting person-to-person fights on videos, using sparse coding to
represent the extracted features. Instead, Deniz et al. [48] proposed to compute acceleration from
the power spectrum of adjacent frames
to detect a large variation of speed, obtaining results comparable to MoSIFT, but with a faster
algorithm.
   Concerning the techniques based on global features, Hassner et al. [49] proposed the
computation of the Violence Flows (VIF) descriptors, an evolution of optical flow which computes
the changes in the magnitude of flow vectors, obtaining promising results on the detection of
violence in crowds. Gao et al. [50] added to the VIF the orientation of the flow vector, proposing
OVIF, improving the performance on the detection of person-to-person fights, but with a lower
accuracy on crowd violence.
   Deep learning contributed to advance the violence detection field by overcoming some of the
limitations of the optical flow, such as discontinuities and camera motion, and by getting very
good performance in person-to-person fights and crowd violence with the same model. Specifically,
3D CNNs have proven capable of learning spatio-temporal information, i.e. features which represent
the motion information in a video, in addition to the spatial information in a single frame. For
example, Ding et al. [51] presented a 9-layer 3D CNN for violence detection, obtaining a 91%
accuracy on the Hockey Fight dataset [52]. Similarly, Li et al. [53], with a 10-layer 3D CNN
alternating dense and transitional layers after a convolutional layer, achieved 98.3% accuracy on
the Hockey Fight dataset, and 97.2% on the Crowd Violence dataset [49]. Transfer learning
approaches based on 3D CNNs also demonstrated good performance. For example, in our previous work
[44], we used C3D [54], a 3D CNN pre-trained to classify sport categories, as a feature extractor,
and a Support Vector Machine (SVM) classifier, with a 98.5% and a 99.2% accuracy on the Hockey
Fight and the Crowd Violence datasets respectively. Similarly, Ullah et al. [55] used C3D as a
feature extractor, but followed by fully connected layers for classification, with a good
performance in both the Hockey Fight (96% accuracy) and Crowd Violence (98%) datasets. In addition
to 3D CNNs, the ConvLSTM architecture [56] has also proven effective in violence detection. To
this end, Sudhakaran and Lanz [57] proposed to aggregate the spatial information extracted from
the frames by 2D CNNs with a ConvLSTM, achieving a 97.1% accuracy on the Hockey Fight dataset, and
94.5% on the Crowd Violence dataset.
   Therefore, deep learning-based techniques demonstrated their accuracy on datasets which are
traditional in literature such as the Hockey Fight and Crowd Violence. However, there is still
ongoing research to validate their robustness against false positives [58], and with real
surveillance camera footage [59].

5. Conclusions

We presented a short survey about deep learning applications for three application domains
connected to law enforcement: Face Recognition, Fingerprint Recognition, and Violence Detection.
These three domains have some common characteristics. In fact, early methods for the
computerization of related tasks are all rooted in Computer Vision, using techniques such as
Principal Component Analysis, Image Binarization and Thinning, Optical Flow, etc. However, the use
of deep learning techniques, such as Convolutional Neural Networks (2D and 3D) and ConvLSTMs,
significantly improved the accuracy of automatic applications dealing with Face Recognition,
Fingerprint Recognition, and Violence Detection.
   While some of these deep learning techniques are being integrated in production systems, at
least for Face and Fingerprint Recognition (see, for example, the Italian system SARI, an
extension of an Automated Fingerprint Identification System (AFIS) which supports Face Recognition
[60]), there is still the need to investigate
their impact in real world applications. For example, concerning Face Recognition, there is a lack
of research in understanding the effectiveness of face identification when only the two mugshots
per subject commonly stored in law enforcement databases are available for training. Concerning
Fingerprint Recognition, research is ongoing to get an effective extraction of minutiae from
latent fingerprint images, which are available in crime scenes. Concerning Violence Detection, the
accuracy of deep learning techniques with real surveillance cameras and their robustness to false
positives are among the objectives of current research.
   Moreover, to be effective in real applications, deep learning-based techniques, as Artificial
Intelligence in general, need to take into account concrete real-time performance. In fact, as
pointed out in [61], an intelligent answer preserves its importance only if given in time.
Finally, as the evidence collected using AI should be explainable to a judge in a court [13],
Explainable AI (XAI) methods, capable of providing human-understandable explanations of their
results [62], should also be investigated in the presented application domains, to avoid the use
of deep learning techniques as mere “black boxes”.

Acknowledgments

The presented research has been part of the Memorandum of Understanding between the Università
Politecnica delle Marche, Centro “CARMELO” and the Ministero dell’Interno, Dipartimento di
Pubblica Sicurezza, Direzione Centrale Anticrimine della Polizia di Stato.

References

 [1] J. Haugeland, Artificial intelligence: The very idea, MIT press, 1989.
 [2] N. Falcionelli, P. Sernani, A. Brugués, D. N. Mekuria, D. Calvaresi, M. Schumacher,
     A. F. Dragoni, S. Bromuri, Indexing the event calculus: Towards practical human-readable
     personal health systems, Artificial Intelligence in Medicine 96 (2019) 154–166.
     doi:10.1016/j.artmed.2018.10.003.
 [3] N. Falcionelli, P. Sernani, A. Brugués, D. N. Mekuria, D. Calvaresi, M. Schumacher,
     A. F. Dragoni, S. Bromuri, Event calculus agent minds applied to diabetes monitoring, in:
     Autonomous Agents and Multiagent Systems, Springer International Publishing, 2017,
     pp. 258–274. doi:10.1007/978-3-319-70887-4_3.
 [4] A. F. Dragoni, S. Animali, Maximal consistency, theory of evidence, and bayesian conditioning
     in the investigative domain, Cybernetics and Systems 34 (2003) 419–465.
     doi:10.1080/01969720302863.
 [5] N. Falcionelli, P. Sernani, D. Mekuria, A. F. Dragoni, An event calculus formalization of
     timed automata, in: Proceedings of the 1st International Workshop on Real-Time compliant
     Multi-Agent Systems co-located with the Federated Artificial Intelligence Meeting, volume
     2156 of CEUR Workshop Proceedings, 2018, pp. 60–76.
     URL: http://ceur-ws.org/Vol-2156/paper5.pdf.
 [6] A. F. Dragoni, P. Giorgini, L. Serafini, Mental states recognition from communication,
     Journal of Logic and Computation 12 (2002) 119–136. doi:10.1093/logcom/12.1.119.
 [7] P. Sernani, A. Claudi, A. F. Dragoni, Combining artificial intelligence and netmedicine for
     ambient assisted living: A distributed bdi-based expert system, International Journal of
     E-Health and Medical Communications 6 (2015) 62–76. doi:10.4018/IJEHMC.2015100105.
 [8] P. Sernani, M. Biagiola, N. Falcionelli, D. Mekuria, S. Cremonini, A. F. Dragoni, Time aware
     task delegation in agent interactions for video-surveillance, in: Proceedings of the 1st
     International Workshop on Real-Time compliant Multi-Agent Systems co-located with the
     Federated Artificial Intelligence Meeting, volume 2156 of CEUR Workshop Proceedings, 2018,
     pp. 16–30. URL: http://ceur-ws.org/Vol-2156/paper2.pdf.
 [9] D. N. Mekuria, P. Sernani, N. Falcionelli, A. F. Dragoni, Reasoning in multi-agent based
     smart homes: A systematic literature review, in: Ambient Assisted Living, Springer
     International Publishing, Cham, 2019, pp. 161–179. doi:10.1007/978-3-030-05921-7_13.
[10] D. N. Mekuria, P. Sernani, N. Falcionelli, A. F. Dragoni, Smart home reasoning systems: a
     systematic literature review, Journal of Ambient Intelligence and Humanized Computing (2019)
     1–18. doi:10.1007/s12652-019-01572-z.
[11] E. Serral, P. Sernani, A. F. Dragoni, F. Dalpiaz, Contextual requirements prioritization and
     its application to smart homes, in: Ambient Intelligence, Springer International Publishing,
     Cham, 2017, pp. 94–109. doi:10.1007/978-3-319-56997-0_7.
[12] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (2015) 436–444.
[13] S. Raaijmakers, Artificial intelligence for law enforcement: Challenges and opportunities,
     IEEE Security Privacy 17 (2019) 74–77. doi:10.1109/MSEC.2019.2925649.
[14] A. Khairwa, K. Abhishek, S. Prakash, T. Pratap, A comprehensive study of various biometric
     identification techniques, in: 2012 Third International Conference on Computing,
     Communication and Networking Technologies (ICCCNT’12), 2012, pp. 1–6.
     doi:10.1109/ICCCNT.2012.6396051.
[15] R. Jafri, H. R. Arabnia, A survey of face recognition techniques, Journal of Information
     Processing Systems 5 (2009) 41–68. doi:10.3745/JIPS.2009.5.2.041.
[16] M. Turk, A. Pentland, Face recognition using eigenfaces, in: Computer Vision and Pattern
     Recognition, 1991. Proceedings CVPR ’91., IEEE Computer Society Conference on, 1991,
     pp. 586–591. doi:10.1109/CVPR.1991.139758.
[17] P. Belhumeur, J. Hespanha, D. Kriegman, Eigenfaces vs. fisherfaces: recognition using class
     specific linear projection, Pattern Analysis and Machine Intelligence, IEEE Transactions on
     19 (1997) 711–720. doi:10.1109/34.598228.
[18] T. Ahonen, A. Hadid, M. Pietikainen, Face description with local binary patterns: Application
     to face recognition, Pattern Analysis and Machine Intelligence, IEEE Transactions on 28
     (2006) 2037–2041. doi:10.1109/TPAMI.2006.244.
[19] I. Masi, Y. Wu, T. Hassner, P. Natarajan, Deep face recognition: A survey, in: 2018 31st
     SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), 2018, pp. 471–478.
     doi:10.1109/SIBGRAPI.2018.00067.
[20] A. Dragoni, G. Vallesi, P. Baldassarri, A continuos learning for a face recognition system,
     in: ICAART 2011 - Proceedings of the 3rd International Conference on Agents and Artificial
     Intelligence, volume 1, 2011, pp. 541–544.
[21] P. Sernani, A. Claudi, G. Dolcini, L. Palazzo, G. Biancucci, A. F. Dragoni, Subject-dependent
     degrees of reliability to solve a face recognition problem using multiple neural networks,
     in: Proceedings ELMAR-2013, 2013, pp. 11–14.
[22] Y. Taigman, M. Yang, M. Ranzato,
     L. Wolf, DeepFace: Closing the gap to human-level performance in face
     verification, in: 2014 IEEE Conference on Computer Vision and Pattern
     Recognition, 2014, pp. 1701–1708. doi:10.1109/CVPR.2014.220.
[23] E. Learned-Miller, G. B. Huang, A. Roy-Chowdhury, H. Li, G. Hua, Labeled
     Faces in the Wild: A Survey, Springer International Publishing, Cham, 2016,
     pp. 189–248. doi:10.1007/978-3-319-25958-1_8.
[24] F. Schroff, D. Kalenichenko, J. Philbin, Facenet: A unified embedding for face
     recognition and clustering, in: 2015 IEEE Conference on Computer Vision and
     Pattern Recognition, 2015, pp. 815–823. doi:10.1109/CVPR.2015.7298682.
[25] Q. Cao, L. Shen, W. Xie, O. M. Parkhi, A. Zisserman, Vggface2: A dataset for
     recognising faces across pose and age, in: 2018 13th IEEE International
     Conference on Automatic Face Gesture Recognition (FG 2018), 2018, pp. 67–74.
     doi:10.1109/FG.2018.00020.
[26] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition,
     in: Proceedings of the IEEE Conference on Computer Vision and Pattern
     Recognition (CVPR), volume 1, 2016, pp. 770–778. doi:10.1109/CVPR.2016.90.
[27] G. Guo, N. Zhang, A survey on deep learning based face recognition, Computer
     Vision and Image Understanding 189 (2019) 102805.
     doi:10.1016/j.cviu.2019.102805.
[28] M. Kücken, A. C. Newell, Fingerprint formation, Journal of Theoretical Biology
     235 (2005) 71–83. doi:10.1016/j.jtbi.2004.12.020.
[29] G. Bebis, T. Deaconu, M. Georgiopoulos, Fingerprint identification using
     delaunay triangulation, in: Proceedings 1999 International Conference on
     Information Intelligence and Systems (Cat. No.PR00446), 1999, pp. 452–459.
     doi:10.1109/ICIIS.1999.810315.
[30] D. Maio, D. Maltoni, Direct gray-scale minutiae detection in fingerprints, IEEE
     Transactions on Pattern Analysis and Machine Intelligence 19 (1997) 27–40.
     doi:10.1109/34.566808.
[31] A. Farina, Z. M. Kovács-Vajna, A. Leone, Fingerprint minutiae extraction from
     skeletonized binary images, Pattern Recognition 32 (1999) 877–889.
     doi:10.1016/S0031-3203(98)00107-1.
[32] H. Fronthaler, K. Kollreider, J. Bigun, Local features for enhancement and
     minutiae extraction in fingerprints, IEEE Transactions on Image Processing
     17 (2008) 354–363. doi:10.1109/TIP.2007.916155.
[33] D. Peralta, M. Galar, I. Triguero, D. Paternain, S. García, E. Barrenechea,
     J. M. Benítez, H. Bustince, F. Herrera, A survey on fingerprint minutiae-based
     local matching for verification and identification: Taxonomy and experimental
     evaluation, Information Sciences 315 (2015) 67–87.
     doi:10.1016/j.ins.2015.04.013.
[34] R. Cappelli, M. Ferrara, D. Maltoni, Minutia cylinder-code: A new
     representation and matching technique for fingerprint recognition, IEEE
     Transactions on Pattern Analysis and Machine Intelligence 32 (2010) 2128–2141.
     doi:10.1109/TPAMI.2010.52.
[35] K. Cao, E. Liu, A. K. Jain, Segmentation and enhancement of latent
     fingerprints: A coarse to fine ridge structure dictionary, IEEE Transactions
     on Pattern Analysis and Machine Intelligence 36 (2014) 1847–1859.
     doi:10.1109/TPAMI.2014.2302450.
[36] Y. Tang, F. Gao, J. Feng, Y. Liu, Fingernet: An unified deep network for
     fingerprint minutiae extraction, in: 2017 IEEE International Joint Conference on
     Biometrics (IJCB), 2017, pp. 108–116. doi:10.1109/BTAS.2017.8272688.
[37] J. Li, J. Feng, C.-C. J. Kuo, Deep convolutional neural network for latent
     fingerprint enhancement, Signal Processing: Image Communication 60 (2018)
     52–63. doi:10.1016/j.image.2017.08.010.
[38] K. Cao, A. K. Jain, Automated latent fingerprint recognition, IEEE Transactions
     on Pattern Analysis and Machine Intelligence 41 (2019) 788–800.
     doi:10.1109/TPAMI.2018.2818162.
[39] C. Lin, A. Kumar, Contactless and partial 3D fingerprint recognition using
     multi-view deep representation, Pattern Recognition 83 (2018) 314–327.
     doi:10.1016/j.patcog.2018.05.004.
[40] V. Anand, V. Kanhangad, Porenet: Cnn-based pore descriptor for high-resolution
     fingerprint recognition, IEEE Sensors Journal 20 (2020) 9305–9313.
     doi:10.1109/JSEN.2020.2987287.
[41] F. Liu, Y. Zhao, G. Liu, L. Shen, Fingerprint pore matching using deep
     features, Pattern Recognition 102 (2020) 107208.
     doi:10.1016/j.patcog.2020.107208.
[42] H.-U. Jang, H.-Y. Choi, D. Kim, J. Son, H.-K. Lee, Fingerprint spoof detection
     using contrast enhancement and convolutional neural networks, in: Information
     Science and Applications 2017, Springer Singapore, 2017, pp. 331–338.
     doi:10.1007/978-981-10-4154-9_39.
[43] D. M. Uliyan, S. Sadeghi, H. A. Jalab, Anti-spoofing method for fingerprint
     recognition using patch based deep learning machine, Engineering Science and
     Technology, an International Journal 23 (2020) 264–273.
     doi:10.1016/j.jestch.2019.06.005.
[44] S. Accattoli, P. Sernani, N. Falcionelli, D. N. Mekuria, A. F. Dragoni,
     Violence detection in videos by combining 3D convolutional neural networks
     and support vector machines, Applied Artificial Intelligence 34 (2020)
     329–344. doi:10.1080/08839514.2020.1723876.
[45] L. Xu, C. Gong, J. Yang, Q. Wu, L. Yao, Violent video detection based on
     mosift feature and sparse coding, in: 2014 IEEE International Conference on
     Acoustics, Speech and Signal Processing (ICASSP), 2014, pp. 3538–3542.
     doi:10.1109/ICASSP.2014.6854259.
[46] M. Y. Chen, A. Hauptmann, MoSIFT: Recognizing human actions in surveillance
     videos, Technical Report CMU-CS-09-161, Carnegie Mellon University, 2009.
     URL: http://ra.adm.cs.cmu.edu/anon/usr/anon/home/ftp/2009/CMU-CS-09-161.pdf.
[47] D. G. Lowe, Object recognition from local scale-invariant features, in:
     Proceedings of the Seventh IEEE International Conference on Computer Vision,
     volume 2, 1999, pp. 1150–1157 vol.2. doi:10.1109/ICCV.1999.790410.
[48] O. Deniz, I. Serrano, G. Bueno, T. Kim, Fast violence detection in video, in:
     2014 International Conference on Computer Vision Theory and Applications
     (VISAPP), volume 2, 2014, pp. 478–485.
[49] T. Hassner, Y. Itcher, O. Kliper-Gross, Violent flows: Real-time detection of
     violent crowd behavior, in: 2012 IEEE Computer Society Conference on Computer
     Vision and Pattern Recognition Workshops, 2012, pp. 1–6.
     doi:10.1109/CVPRW.2012.6239348.
[50] Y. Gao, H. Liu, X. Sun, C. Wang, Y. Liu, Violence detection using oriented
     violent flows, Image and Vision Computing 48-49 (2016) 37–41.
     doi:10.1016/j.imavis.2016.01.006.
[51] C. Ding, S. Fan, M. Zhu, W. Feng, B. Jia, Violence detection in video by using
     3d convolutional neural networks, in: G. Bebis, R. Boyle, B. Parvin,
     D. Koracin, R. McMahan, J. Jerald, H. Zhang, S. M. Drucker, C. Kambhamettu,
     M. El Choubassi, Z. Deng, M. Carlson (Eds.), Advances in Visual Computing,
     Springer International Publishing, 2014, pp. 551–558.
     doi:10.1007/978-3-319-14364-4_53.
[52] E. Bermejo Nievas, O. Deniz Suarez, G. Bueno García, R. Sukthankar, Violence
     detection in video using computer vision techniques, in: P. Real,
     D. Diaz-Pernil, H. Molina-Abril, A. Berciano, W. Kropatsch (Eds.), Computer
     Analysis of Images and Patterns, Springer Berlin Heidelberg, Berlin,
     Heidelberg, 2011, pp. 332–339. doi:10.1007/978-3-642-23678-5_39.
[53] J. Li, X. Jiang, T. Sun, K. Xu, Efficient violence detection using 3d
     convolutional neural networks, in: 2019 16th IEEE International Conference on
     Advanced Video and Signal Based Surveillance (AVSS), 2019, pp. 1–8.
     doi:10.1109/AVSS.2019.8909883.
[54] D. Tran, L. Bourdev, R. Fergus, L. Torresani, M. Paluri, Learning
     spatiotemporal features with 3d convolutional networks, in: 2015 IEEE
     International Conference on Computer Vision (ICCV), 2015, pp. 4489–4497.
     doi:10.1109/ICCV.2015.510.
[55] F. U. M. Ullah, A. Ullah, K. Muhammad, I. U. Haq, S. W. Baik, Violence
     detection using spatiotemporal features with 3D convolutional neural network,
     Sensors 19 (2019) 2472. doi:10.3390/s19112472.
[56] X. Shi, Z. Chen, H. Wang, D. Yeung, W. Wong, W. Woo, Convolutional LSTM
     network: A machine learning approach for precipitation nowcasting, CoRR
     abs/1506.04214 (2015). URL: http://arxiv.org/abs/1506.04214.
[57] S. Sudhakaran, O. Lanz, Learning to detect violent videos using convolutional
     long short-term memory, in: 2017 14th IEEE International Conference on
     Advanced Video and Signal Based Surveillance (AVSS), 2017, pp. 1–6.
     doi:10.1109/AVSS.2017.8078468.
[58] M. Bianculli, N. Falcionelli, P. Sernani, S. Tomassini, P. Contardo,
     M. Lombardi, A. F. Dragoni, A dataset for automatic violence detection in
     videos, Data in Brief 33 (2020) 106587. doi:10.1016/j.dib.2020.106587.
[59] M. Cheng, K. Cai, M. Li, RWF-2000: an open large scale video database for
     violence detection, CoRR abs/1911.05913 (2019).
     URL: http://arxiv.org/abs/1911.05913.
[60] E. Sacchetto, Face to face: il complesso rapporto tra automated facial
     recognition technology e processo penale, La legislazione penale (2020) 1–14.
     URL: https://iris.unito.it/retrieve/handle/2318/1758754/668686/Sacchetto-finale.pdf.
[61] A. F. Dragoni, P. Sernani, D. Calvaresi, When rationality entered time and
     became real agent in a cyber-society, in: Proceedings of the 3rd International
     Conference on Recent Trends and Applications in Computer Science and
     Information Technology, volume 2280 of CEUR Workshop Proceedings, 2018,
     pp. 167–171. URL: http://ceur-ws.org/Vol-2280/paper-24.pdf.
[62] D. Doran, S. Schulz, T. R. Besold, What does explainable AI really mean? a new
     conceptualization of perspectives, in: Proceedings of the First International
     Workshop on Comprehensibility and Explanation in AI and ML 2017, volume 2071
     of CEUR Workshop Proceedings, 2017, pp. 15–22.
     URL: http://ceur-ws.org/Vol-2071/CExAIIA_2017_paper_2.pdf.

</pre>