<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Cutting edge video analytics solutions: from the research to the market</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mattia Marseglia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Domenico Rocco</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefano Saldutti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bruno Vento</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>A.I. Tech srl</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>A.I. Tech was born as a spin-off company of the University of Salerno; it designs and develops cutting edge video analytics solutions based on deep learning, able to run on board of smart cameras and/or on devices with limited resources. A.I. Tech solutions are designed to serve various vertical markets: retail, business intelligence, security and safety, smart parking, smart city and smart roads. In this paper we present all these solutions, which are the product of years of research transferred to the market.</p>
      </abstract>
      <kwd-group>
        <kwd>A.I. Tech</kwd>
        <kwd>video analytics</kwd>
        <kwd>cutting edge</kwd>
        <kwd>computer vision</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>A.I. Tech designs and develops cutting edge video analytics solutions based on the most advanced artificial intelligence and deep learning algorithms, also running directly on board of smart cameras and therefore optimized for low-performance hardware. A.I. Tech has partnerships with world leaders in their reference fields, including (the list is not exhaustive) NVIDIA, Panasonic, Samsung, Hanwha Techwin, Mobotix, Axis, Hikvision and Dahua. In particular, Hanwha Techwin, Panasonic and Mobotix resell the video analytics solutions from A.I. Tech on a global scale. In 2017 A.I. Tech was selected among the Top 25 international companies in the field of Artificial Intelligence by CIO Applications Magazine. In 2018 it entered the Top 10 Most Innovative AI Solution Providers. Its technology was selected among the finalists of the Benchmark Innovation Award in 2018, 2019, 2020, 2021 and 2022. In 2018 it won the award in the Business Intelligence category with the AI-RETAIL video analytics solution. In 2020 A.I. Tech won the Corporate LiveWire award in the “Most Innovative in Video Analytics” category. In the same year its solutions were finalists in the Security and Fire Excellence Award, with the AI-CROWD-DEEP product (in the Security Software Product Innovation of the Year category) and with the WOW project (in the Security Project of the Year category). The AI-TRAFFIC solution for traffic monitoring is also the winner of the IoMOBILITY AWARD 2020, in the Mobility Analytics category. Corporate LiveWire awarded A.I. Tech the “Innovation &amp; Excellence Awards” for the year 2022, renewing the award also for the year 2023 and considering the company the most innovative in the field of “AI Technology”.</p>
      <p>The activities that A.I. Tech carries out, with a highly technological and scientific content, require specialized skills in the fields of Artificial Intelligence, Artificial Vision and Embedded Systems. For this reason, the company has a very close collaboration with the Department of Information and Electrical Engineering and Applied Mathematics (DIEM) of the University of Salerno. In particular, there is also an agreement for the activation of company internships as well as scientific collaborations for the coming years. These activities allow the transfer of the scientific skills of the DIEM research group in the fields of Artificial Vision and Artificial Intelligence, with a consequent technological transfer of research products that takes the form of a series of cutting edge artificial intelligence products, commercially available at an international level.</p>
      <p>Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy. † These authors contributed equally. mattia.marseglia@aitech.vision (M. Marseglia); domenico.rocco@aitech.vision (D. Rocco); stefano.saldutti@aitech.vision (S. Saldutti); br1.vento@gmail.com (B. Vento). © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p><bold>2. Overview of the solutions</bold></p>
      <p>Most of the deep learning based systems available nowadays on the market are realized on top of off-the-shelf detectors. However, designing software solutions engineered to be as accurate as the state of the art without the computational burden typically required by deep neural networks is definitely more challenging. Realizing computationally inexpensive solutions is a mandatory requirement in several real-world applications where the system is expected to process hundreds of video streams simultaneously in real time while keeping an affordable cost; smart cities are a noteworthy example. Moreover, in different contexts the processing is required to be performed on the edge due to environmental constraints; therefore the video analytics application has to run on board of smart cameras [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], with very limited hardware resources.
      </p>
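      <p>A common way to fit a heavy model into a per-frame compute budget on camera hardware is to run the expensive detector only every N-th frame and reuse (or cheaply propagate) its output in between. The following is a minimal illustrative sketch of that scheduling idea, not A.I. Tech's actual pipeline; the `EdgePipeline` class and its parameters are hypothetical.</p>

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # x, y, w, h in pixels

@dataclass
class EdgePipeline:
    """Run an expensive detector only every `stride` frames and reuse
    the last detections in between: a toy stand-in for the kind of
    compute budgeting needed to sustain real-time throughput on
    resource-constrained smart cameras."""
    detect: Callable[[object], List[Box]]  # heavy model, called sparingly
    stride: int = 5                        # detector cadence (in frames)

    def __post_init__(self) -> None:
        self._last: List[Box] = []
        self._frame_idx = 0

    def process(self, frame) -> List[Box]:
        """Return detections for this frame, paying the full model cost
        only on every `stride`-th frame."""
        if self._frame_idx % self.stride == 0:
            self._last = self.detect(frame)  # expensive call
        self._frame_idx += 1
        return self._last                    # cheap reuse otherwise
```

      <p>With stride 5, a 25 fps stream invokes the detector only 5 times per second; in a real system the in-between frames would typically be bridged by a lightweight tracker rather than by plain reuse.</p>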
      <p>Within this context, a common design choice of all the A.I. Tech applications is to preserve an accuracy comparable with state-of-the-art detectors and classifiers based on heavy neural networks, while achieving the lowest hardware requirements together with the highest processing throughput. Thanks to this, A.I. Tech plugins are able to run directly on board of a huge number of different smart cameras providing open platforms to specific partners (and in particular on board of specific models of the following camera manufacturers: Androvideo, Axis, Bosch, Dahua, Hanwha Techwin, Hikvision, Mobotix, Panasonic, Topview, Vivotek). A.I. Tech is confirmed to be the video analytics vendor supporting the highest number of camera platforms in the world.</p>
      <p><bold>3. Video analytics products</bold></p>
      <p>In this section we describe the 15 video analytics solutions currently available on the market.</p>
      <p>
        AI-BIO¹ performs face analysis with the purpose of extracting soft-biometric features like age, gender and emotion [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]. The application has a multitask architecture based on multiple deep neural networks engineered to be executed on board of embedded platforms and smart cameras. It can be used both for business intelligence and for digital signage applications [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In the latter case, the aim is to personalize advertisement contents on a monitor by taking into account the soft-biometric features extracted from the face of the person who is watching the monitor. An example is shown in Figure 1a.
      </p>
      <p>AI-CROWDCOUNTING² is a video analytics application tailored to estimate, for statistical or alerting purposes, the crowd density within specific very crowded areas of interest. Powered by a deep learning model and boosted by a distinctive training strategy [6], the system is not only able to detect people fully visible in the scene, but also to identify those that are heavily occluded, thanks to a point-based head detection algorithm. This makes the application particularly suited for very crowded environments, such as stadiums, concerts or trade fairs. Figure 1b shows an example of the solution in action.</p>
      <p>AI-CROWD-DEEP³ is the video analytics solution for people monitoring. Thanks to the combination of a proprietary deep learning based detector, a multi-object tracker [7] and a calibration mechanism, it is capable of: (i) estimating the number of people inside an area; (ii) generating an alarm in case of overcrowding or detected gatherings; (iii) generating an alarm if two or more persons do not respect the social distance for a given amount of time; (iv) counting the people that cross virtual lines; (v) counting the number of pedestrians crossing one area and arriving in another, building the origin-destination matrix. An example of the solution in action is shown in Figure 1c.</p>
      <p>AI-FIREPLUS⁴ is the solution focused on the early detection of fires. It combines the analysis of movement and appearance with a deep neural network to detect the presence of flame or smoke within a monitored area [8], and it can operate in both indoor and outdoor environments. The main benefit of this application is that it does not require thermal or thermographic sensors, but traditional optical ones instead. An example is shown in Figure 1d.</p>
      <p>AI-INTRUSION⁵ is the video analytics solution for the detection of intruders (people or vehicles). It is capable of detecting: (i) intrusions or loitering within an area of interest framed by the camera; (ii) the crossing of a virtual line; (iii) the crossing of multiple lines (not necessarily parallel) in sequence. In addition to the size and the aspect ratio of the object, it uses a deep neural network to filter objects according to their class. An example is reported in Figure 1e.</p>
      <p>AI-LOST⁶ is the video analysis application designed to detect removed or abandoned objects in restricted environments where constant surveillance cannot be guaranteed [9]. The application can use a deep neural network to recognize garbage or, alternatively, baggage. An example is reported in Figure 1f.</p>
      <p>AI-LPR is the solution for license plate detection and recognition. Unlike other products available on the market, it is fully based on deep learning for both plate detection and license character recognition. An example of the product is shown in Figure 1g.</p>
      <p>AI-PARKING⁷ is designed to monitor both indoor and outdoor parking, so as to verify whether a parking spot is free or occupied. Unlike other solutions based on vehicle detection, this is a very effective application, requiring that only a part of the vehicle be visible to monitor a spot. An example of AI-PARKING in action is available in Figure 1h.</p>
      <p>AI-PEOPLE-DEEP⁸ is the solution that exploits a deep neural network to count the people framed by a camera positioned in zenithal view. Inspired by [10], the application is designed to work both indoors and outdoors, wherever it is possible to ensure that the illumination conditions are controlled. An example is reported in Figure 1i.</p>
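      <p>Several of the solutions above (AI-CROWD-DEEP, AI-INTRUSION, AI-TRAFFIC-DEEP) count objects that cross user-configured virtual lines. Once a tracker provides per-object centroid trajectories, the geometric core of that feature reduces to checking on which side of the line each centroid lies and detecting sign changes. A simplified, illustrative sketch (tracking and calibration are assumed to be provided elsewhere; the function names are hypothetical):</p>

```python
def side(line, point):
    """Sign of the 2D cross product: > 0 on one side of the directed
    line, < 0 on the other, 0 exactly on it."""
    (x1, y1), (x2, y2) = line
    px, py = point
    return (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)

def count_crossings(line, track):
    """Count how many times one tracked centroid crosses the virtual
    line. `track` is the ordered sequence of (x, y) centroids of a
    single object; points lying exactly on the line are skipped."""
    crossings = 0
    prev = None
    for point in track:
        s = side(line, point)
        if prev is not None and s != 0 and (s > 0) != (prev > 0):
            crossings += 1  # the centroid changed side: one crossing
        if s != 0:
            prev = s
    return crossings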
      <p>AI-PPE⁹ is designed to detect people wearing personal protective equipment (PPE). The application is based on the architecture described in [11]. The PPE combinations that the application is able to detect are: “Helmet”, “Vest” and “Helmet and Vest”. This solution can be used both in access control systems and for the surveillance of construction sites or places where works are in progress. In the first case, the product is meant to verify that a worker is wearing the specified PPE, in order to authorize him to enter a work area. In the second, the product can be used for continuous monitoring of a work area with the aim of verifying that workers are wearing all the required PPE. An example of the product is reported in Figure 1j.</p>
      <p>AI-RAIL¹⁰ is a video analysis application designed to enhance railway safety. It combines traditional computer vision techniques with deep neural networks to identify and analyze the behavior of vehicles, pedestrians and obstacles within sensitive areas such as level crossings or along railway lines. The analysis can be activated depending on the barrier status, which can be obtained either from an external signal or through neural networks integrated into the system. An example is shown in Figure 1k.</p>
      <p>AI-SPILL¹¹ is designed to monitor a person walking in an unsupervised area, detect if the person falls and raise an alarm if that happens. The analysis is performed using a mathematical model that analyses the behavior of a person moving in the scenario of interest, especially walking and falling dynamics. An advanced neural network, trained with thousands of samples of fallen people and optimized for running on board the camera, is then used to confirm the initial outcome of that model. An example is reported in Figure 1l.</p>
      <p>AI-VIOLATION¹³ is a vertical solution able to detect traffic light violations (see Figure 1n), namely the presence of vehicles crossing the stop line while the traffic light is red. It is based on a deep learning vehicle detector [12] and on a classifier that allows surveillance cameras (which are commonly installed over the city) to read the traffic light status without the need to install external devices. The state of a traffic light includes the color of the active traffic light circle and whether it is blinking or not. In particular, the application can identify vehicles crossing the stop line while the traffic light status is red and send a notification to report the violation. This notification also contains information about the vehicle, such as the type (among motorcycle, bicycle, car, truck), the estimated average speed and all the information necessary to decide whether there are legal limits for a fine.</p>
      <p>AI-WEATHER¹⁴ is an innovative application that uses deep neural networks to monitor weather and road conditions. This application can recognize a wide range of weather states, including sunny, cloudy, rainy, snowy and foggy, as well as road surface conditions, which can vary between dry, non-dry and flooded. It is designed to operate effectively in outdoor environments and requires visibility of both the road surface and the sky at the same time (see Figure 1o). AI-WEATHER offers a variety of useful alerts to users, including periodic updates on weather and road conditions, as well as instant notifications when the status of one of the sensors changes.</p>
      <p>Figure 1: Examples of the solutions in action: (a) AI-BIO; (b) AI-CROWDCOUNTING; (c) AI-CROWD-DEEP; (d) AI-FIREPLUS; (e) AI-INTRUSION; (f) AI-LOST; (g) AI-LPR; (h) AI-PARKING; (i) AI-PEOPLE-DEEP; (j) AI-PPE; (k) AI-RAIL; (l) AI-SPILL; (m) AI-TRAFFIC-DEEP; (n) AI-VIOLATION; (o) AI-WEATHER.</p>
      <p>¹https://www.youtube.com/watch?v=awze1fHoQEE ²https://youtu.be/h0qDXkZkObU?si=Su6gStufv9NbUrK9 ³https://www.youtube.com/watch?v=BiCyon1KZco ⁴https://www.youtube.com/watch?v=U1SwnESua0g ⁵https://www.youtube.com/watch?v=3kUUOcofVow ⁶https://www.youtube.com/watch?v=gq24PrW6UwQ ⁷https://www.youtube.com/watch?v=VDQ82Di4fZs ⁸https://www.youtube.com/watch?v=x6N5g4Fs6_U ⁹https://www.youtube.com/watch?v=-fz25HYcFLo</p>
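      <p>The red-light violation logic described for AI-VIOLATION can be expressed compactly: a violation is a stop-line crossing whose timestamp falls inside a red phase of the light. A hypothetical sketch of that decision rule (the light-state classifier and the vehicle tracker are assumed to supply the two event streams):</p>

```python
def red_intervals(phases):
    """Collapse time-ordered (timestamp, color) light-state samples
    into half-open [start, end) intervals during which the light was
    red; an unterminated red phase extends to infinity."""
    intervals, start = [], None
    for t, color in phases:
        if color == "red" and start is None:
            start = t
        elif color != "red" and start is not None:
            intervals.append((start, t))
            start = None
    if start is not None:
        intervals.append((start, float("inf")))
    return intervals

def violations(crossings, phases):
    """Return the stop-line crossing events that happened on red.
    `crossings` is a list of (timestamp, vehicle_id) events produced
    by the line-crossing detector."""
    reds = red_intervals(phases)
    return [(t, vid) for t, vid in crossings
            if any(start <= t < end for start, end in reds)]
```

      <p>In a deployed system each flagged event would additionally carry the vehicle class, estimated speed and evidence frames, as described above.</p>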
      <p>AI-TRAFFIC-DEEP¹² is the video analysis solution for road monitoring, for both statistical and alerting purposes. Technically speaking, the application is based on a deep learning based vehicle and people detector [12] followed by a multi-object tracking module [7] and an advanced 3D scene reconstruction stage. It is capable of: (i) counting and classifying vehicles among cars, motorcycles and trucks; (ii) estimating the average speed and the color of each detected vehicle; (iii) evaluating the density of vehicles on a road branch and raising an alarm if congestion is detected; (iv) detecting vehicles travelling in the wrong direction or stopped in forbidden areas; (v) detecting the presence of pedestrians on the road; (vi) counting the number of vehicles and pedestrians crossing one area and arriving in another, building the origin-destination matrix; (vii) detecting lane changes and abnormal maneuvers (such as U-turns in prohibited areas) made by vehicles, based on the crossing of a set of user-configured virtual lines. An example is reported in Figure 1m.</p>
      <p>¹⁰https://youtu.be/cDh1epks3x0?si=TCZlm8QJOG_FJ6bk ¹¹https://www.youtube.com/watch?v=pCFBnWC8uPQ ¹²https://www.youtube.com/watch?v=6yQS6n_nTcI ¹³https://www.youtube.com/watch?v=gAVEHPCckbE ¹⁴https://www.youtube.com/watch?v=_gn-odtuWJo</p>
      <p><bold>References</bold></p>
      <p>[6] L. Fotia, G. Percannella, A. Saggese, M. Vento, Highly crowd detection and counting based on curriculum learning, in: International Conference on Computer Analysis of Images and Patterns, Springer, 2023, pp. 13–22.</p>
      <p>[7] P. Foggia, G. Percannella, A. Saggese, M. Vento, Real-time tracking of single people and groups simultaneously by contextual graph-based reasoning dealing complex occlusions, in: 2013 IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS), IEEE, 2013.</p>
      <p>[8] P. Foggia, A. Saggese, M. Vento, Real-time fire detection for video-surveillance applications using a combination of experts based on color, shape, and motion, IEEE Transactions on Circuits and Systems for Video Technology 25 (2015) 1545–1556. doi:10.1109/TCSVT.2015.2392531.</p>
      <p>[9] P. Foggia, A. Greco, A. Saggese, M. Vento, A method for detecting long term left baggage based on heat map, in: VISAPP (2), 2015, pp. 385–391.</p>
      <p>[10] A. Greco, A. Saggese, B. Vento, A robust and efficient overhead people counting system for retail applications, in: International Conference on Image Analysis and Processing, Springer, 2022, pp. 139–150.</p>
      <p>[11] A. Greco, S. Saldutti, B. Vento, Fast and effective detection of personal protective equipment on smart cameras, in: International Conference on Pattern Recognition, Springer, 2022, pp. 95–108.</p>
      <p>[12] A. Greco, A. Saggese, M. Vento, V. Vigilante, Vehicles detection for smart roads applications on board of smart cameras: A comparative analysis, IEEE Trans. Intell. Transp. Syst. (2021) 1–13.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V.</given-names>
            <surname>Carletti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Greco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saggese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vento</surname>
          </string-name>
          ,
          <article-title>An effective real time gender recognition system for smart cameras</article-title>
          ,
          <source>J. Ambient Intell. Humaniz. Comput</source>
          .
          <volume>11</volume>
          (
          <year>2020</year>
          )
          <fpage>2407</fpage>
          -
          <lpage>2419</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Greco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saggese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vento</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vigilante</surname>
          </string-name>
          ,
          <article-title>A convolutional neural network for gender recognition optimizing the accuracy/speed tradeoff</article-title>
          ,
          <source>IEEE Access 8</source>
          (
          <year>2020</year>
          )
          <fpage>130771</fpage>
          -
          <lpage>130781</lpage>
          . doi:10.1109/ACCESS.2020.3008793.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Greco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saggese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vento</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vigilante</surname>
          </string-name>
          ,
          <article-title>Gender recognition in the wild: a robustness evaluation over corrupted images</article-title>
          12 (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Greco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saggese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vento</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vigilante</surname>
          </string-name>
          ,
          <article-title>Effective training of convolutional neural networks for age estimation based on knowledge distillation</article-title>
          ,
          <source>Neural Comput. Appl</source>
          . (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Greco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saggese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vento</surname>
          </string-name>
          ,
          <article-title>Digital signage by real-time gender recognition from face images</article-title>
          , in: 2020 IEEE International Workshop on Metrology for Industry 4.0 &amp; IoT, 2020, pp. 309–313.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>