Cutting edge video analytics solutions: from the research to the market

Mattia Marseglia 1,†, Domenico Rocco 1,†, Stefano Saldutti 1,† and Bruno Vento 1,†

1 A.I. Tech srl - www.aitech.vision, Piazza Vittorio Emanuele 10, Penta (SA), 84123, Italy

Abstract
A.I. Tech was born as a spinoff company of the University of Salerno and designs and develops cutting edge video analytics solutions based on deep learning, able to run on board smart cameras and/or on devices with limited resource capabilities. A.I. Tech solutions are designed to serve various vertical markets: retail, business intelligence, security and safety, smart parking, smart city and smart roads. In this paper we present all these solutions, which are the products of years of research transferred to the market.

Keywords
A.I. Tech, video analytics, cutting edge, computer vision

1. Company presentation

A.I. Tech designs and develops cutting edge video analytics solutions based on the most advanced artificial intelligence and deep learning algorithms, which can also run directly on board smart cameras and are therefore optimized for low-performance hardware. A.I. Tech has partnerships with world leaders in their reference fields, including (the list is not exhaustive) NVIDIA, Panasonic, Samsung, Hanwha Techwin, Mobotix, Axis, Hikvision and Dahua. In particular, Hanwha Techwin, Panasonic and Mobotix resell the video analytics solutions from A.I. Tech on a global scale. In 2017 A.I. Tech was selected among the Top25 international companies in the field of Artificial Intelligence by CIO Applications Magazine. In 2018 it entered the Top10 Most Innovative AI Solution Providers. Its technology was selected among the finalists of the Benchmark Innovation Award in 2018, 2019, 2020, 2021 and 2022, winning the 2018 award in the Business Intelligence category with the AI-RETAIL video analytics solution. In 2020 A.I. Tech won the Corporate LiveWire award in the "Most Innovative in Video Analytics" category. In the same year its solutions were finalists in the Security and Fire Excellence Award, with the AI-CROWD-DEEP product (Security Software Product Innovation of the Year category) and the WOW project (Security Project of the Year category), and the AI-TRAFFIC solution for traffic monitoring won the IoMOBILITY AWARD 2020 in the Mobility Analytics category. Corporate LiveWire awarded A.I. Tech the "Innovation & Excellence Awards" for the year 2022, renewing the award for 2023 and naming the company the most innovative in the field of "AI Technology".

The activities that A.I. Tech carries out, with their highly technological and scientific content, require specialized skills in the fields of Artificial Intelligence, Artificial Vision and Embedded Systems. For this reason, the company has a very close collaboration with the Department of Information and Electrical Engineering and Applied Mathematics (DIEM) of the University of Salerno, including an agreement for the activation of company internships as well as scientific collaborations for the coming years. These activities transfer the scientific expertise of the DIEM research group in Artificial Vision and Artificial Intelligence to the company, with a consequent technological transfer of research products that takes the form of a series of cutting edge artificial intelligence products, commercially available at an international level.

2. Overview of the solutions

Most of the deep learning based systems available on the market nowadays are built on top of off-the-shelf detectors. However, designing software solutions engineered to be as accurate as the state of the art without the computational burden typically required by deep neural networks is definitively more challenging.
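To make this computational constraint concrete, a back-of-the-envelope calculation shows how little time is available per frame when a single server must keep up with many cameras. The numbers below (stream count, frame rate, worker count) are hypothetical illustrations, not A.I. Tech figures:

```python
# Back-of-the-envelope sizing for multi-stream video analytics.
# All numbers are illustrative assumptions, not measurements from any product.

def per_frame_budget_ms(n_streams: int, fps: float, n_workers: int) -> float:
    """Average time (ms) each parallel worker may spend on one frame
    while the whole system still keeps up with all streams in real time."""
    total_frames_per_second = n_streams * fps
    return 1000.0 * n_workers / total_frames_per_second

# A server with 8 parallel inference workers facing 200 streams at 10 FPS
# leaves only a few milliseconds of compute per frame:
print(per_frame_budget_ms(200, 10, 8))  # -> 4.0
```

Budgets of this order are far below what heavy off-the-shelf detectors typically need per frame, which is one way to see why lightweight, edge-oriented models matter in this setting.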
Realizing computationally inexpensive solutions is a mandatory requirement in several real-world applications where the system is expected to process hundreds of video streams simultaneously in real time at an affordable cost; smart cities are a noteworthy example. Moreover, in different contexts the processing is required to be performed on the edge due to environmental constraints; the video analytics application therefore has to run on board smart cameras [1], with very limited hardware resources.

(Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy. † These authors contributed equally. Contacts: mattia.marseglia@aitech.vision (M. Marseglia); domenico.rocco@aitech.vision (D. Rocco); stefano.saldutti@aitech.vision (S. Saldutti); br1.vento@gmail.com (B. Vento). © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings, ceur-ws.org, ISSN 1613-0073.)

Within this context, a common design choice across all the A.I. Tech applications is to preserve accuracy comparable with that of state-of-the-art detectors and classifiers based on heavy neural networks, while achieving the lowest hardware requirements together with the highest processing throughput. Thanks to this, A.I. Tech plugins are able to run directly on board a huge number of different smart cameras providing open platforms to specific partners (in particular, on board specific models from the following camera manufacturers: Androvideo, Axis, Bosch, Dahua, Hanwha Techwin, Hikvision, Mobotix, Panasonic, Topview, Vivotek). A.I. Tech is thereby the video analytics vendor supporting the highest number of camera platforms in the world.

3. Video analytics products

In this section we describe the 15 video analytics solutions currently available on the market.

AI-BIO 1 performs face analysis with the purpose of extracting soft-biometric features such as age, gender and emotion [2, 3, 4]. The application has a multitask architecture based on multiple deep neural networks engineered to be executed on board embedded platforms and smart cameras. It can be used both for business intelligence and for digital signage applications [5]. In the latter case, the aim is to personalize the advertisement contents shown on a monitor by taking into account the soft-biometric features extracted from the face of the person watching the monitor. An example is shown in Figure 1a.

AI-CROWDCOUNTING 2 is a video analytics application tailored to estimate, for statistical or alerting purposes, the crowd density within specific, very crowded areas of interest. Powered by a deep learning model and boosted by a distinctive training strategy [6], the system is able not only to detect people fully visible in the scene, but also to identify those that are heavily occluded, thanks to a point-based head detection algorithm. This makes the application particularly suited for very crowded environments, such as stadiums, concerts or trade fairs. Figure 1b shows an example of the solution in action.

AI-CROWD-DEEP 3 is the video analytics solution for people monitoring. Thanks to the combination of a proprietary deep learning based detector, a multi-object tracker [7] and a calibration mechanism, it is capable of: (i) estimating the number of people inside an area; (ii) generating an alarm in case of overcrowding or when a gathering is detected; (iii) generating an alarm if two or more persons do not respect the social distance for a given amount of time; (iv) counting the people that cross virtual lines; (v) counting the number of pedestrians crossing one area and arriving in another, building the origin-destination matrix. An example of the solution in action is shown in Figure 1c.

AI-FIREPLUS 4 is the solution focused on the early detection of fires. It combines the analysis of movement and appearance with a deep neural network to detect the presence of flame or smoke within a monitored area [8], and it can operate in both indoor and outdoor environments. The main benefit of this application is that it does not require thermal or thermographic sensors, working with traditional optical ones instead. An example is shown in Figure 1d.

AI-INTRUSION 5 is the video analytics solution for the detection of intruders (people or vehicles). It is capable of detecting: (i) intrusions or loitering within an area of interest framed by the camera; (ii) the crossing of a virtual line; (iii) the crossing of multiple lines (not necessarily parallel) in sequence. In addition to the size and the aspect ratio of the object, it uses a deep neural network to filter objects according to their class. An example is reported in Figure 1e.

AI-LOST 6 is the video analysis application designed to detect removed or abandoned objects in restricted environments where constant surveillance cannot be guaranteed [9]. The application can use a deep neural network to recognize garbage or, alternatively, baggage. An example is reported in Figure 1f.

AI-LPR is the solution for license plate detection and recognition. Unlike other products available on the market, it is fully based on deep learning for both plate detection and character recognition. An example of the product is shown in Figure 1g.

AI-PARKING 7 is designed to monitor both indoor and outdoor parking areas, so as to verify whether a parking spot is free or occupied. Unlike other solutions based on vehicle detection, it is effective even when only a part of the vehicle is visible in the monitored spot. An example of AI-PARKING in action is shown in Figure 1h.

AI-PEOPLE-DEEP 8 is the solution that exploits a deep neural network to count the people framed by a camera in zenithal view. Inspired by [10], the application is designed to work both indoors and outdoors, wherever it is possible to ensure that the illumination conditions are controlled. An example is reported in Figure 1i.

1 https://www.youtube.com/watch?v=VDQ82Di4fZs
2 https://www.youtube.com/watch?v=x6N5g4Fs6_U
3 https://www.youtube.com/watch?v=-fz25HYcFLo
4 https://www.youtube.com/watch?v=U1SwnESua0g
5 https://www.youtube.com/watch?v=3kUUOcofVow
6 https://www.youtube.com/watch?v=gq24PrW6UwQ
7 https://www.youtube.com/watch?v=awze1fHoQEE
8 https://youtu.be/h0qDXkZkObU?si=Su6gStufv9NbUrK9
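Several of the solutions above count objects that cross user-defined virtual lines (for instance AI-CROWD-DEEP and AI-PEOPLE-DEEP). The paper does not disclose how this is implemented; a minimal sketch of the standard geometric approach — comparing on which side of the line a tracked point lies in consecutive frames — could look as follows. The helper names (`side`, `crossed`) are hypothetical, the line is treated as infinitely long, and track smoothing is ignored:

```python
# Minimal sketch of virtual-line crossing detection over a tracked trajectory.
# Illustrative of the general technique only, not A.I. Tech's implementation.

def side(line, p):
    """Signed side of point p w.r.t. the oriented line through (a, b)."""
    (ax, ay), (bx, by) = line
    px, py = p
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def crossed(line, prev_pos, curr_pos):
    """+1 or -1 when the track moved across the line between two frames
    (the sign distinguishes the two crossing directions), else 0."""
    s0, s1 = side(line, prev_pos), side(line, curr_pos)
    if s0 < 0 <= s1:
        return +1
    if s1 < 0 <= s0:
        return -1
    return 0

# A horizontal counting line and a short track crossing it once:
line = ((0, 5), (10, 5))
track = [(4, 7), (4, 6), (5, 4), (6, 3)]
count = sum(crossed(line, p, q) for p, q in zip(track, track[1:]))
```

Summing signed crossings per track, as in the last line, also yields directional counts (entries minus exits) for free, which is the kind of information needed to build an origin-destination matrix.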
Figure 1: Some examples of A.I. Tech video analytics plugins in action. Fig. 1a AI-BIO: for each person, the rectangle around the face is shown in pink or in blue, depending on the gender of the person; moreover, the figure shows the soft-biometric features extracted by the software: the emotion and the age. Fig. 1b AI-CROWDCOUNTING: for each detected person, the application draws a red point, showing in real time the number of people present in the region of interest. Fig. 1c AI-CROWD-DEEP: the yellow area highlights the region where the analysis is performed; the dotted white-red bounding box emphasizes a cluster of people that are not respecting the social distance. Fig. 1d AI-FIREPLUS: in green the area of interest, while the red box calls attention to the detected flame; in the black grid, the detected smoke is highlighted in red. Fig. 1e AI-INTRUSION: the intrusion area is the red polygon and the multiple crossing lines are the numbered red lines below; in the example, a person has been detected in the intrusion area, and the P at the top left of the bounding box indicates that the object is a person (rather than V for vehicle). Fig. 1f AI-LOST: the area of interest is the polygon in blue; the red bounding box with the G string indicates that the detected object is garbage (instead of B for baggage). Fig. 1g AI-LPR: in green the license plate numbers recognized by the application. Fig. 1h AI-PARKING: the red boxes highlight occupied spots, while the green boxes highlight free ones. Fig. 1i AI-PEOPLE-DEEP: a red bounding box is drawn when a person crosses the virtual line. Fig. 1j AI-PPE: for each detected person, the application draws a bounding box and a string indicating the recognized equipment (W for no PPE, WH for helmet only, WV for vest only and WHV for both helmet and vest). Fig. 1k AI-RAIL: a red bounding box is drawn around detected objects if they are within a restricted area while the barrier blocks the road. Fig. 1l AI-SPILL: a red bounding box is drawn around a person fallen within the area of interest. Fig. 1m AI-TRAFFIC-DEEP: the area of interest where the evaluation is performed is in violet; a three-dimensional bounding box is associated with each vehicle, together with the three dimensions of each object (width, length, height) expressed in meters, the speed (s) expressed in km/h, and the category of the vehicle (Car in the example). Fig. 1n AI-VIOLATION: the status of the traffic light is shown in the box on the side (green in the example), the area where the analysis is performed is in violet, and the application lets the user draw the limit of the stopping line (red line). Fig. 1o AI-WEATHER: a sensor is placed near the road for monitoring, and another sensor covering the entire image is used to classify the weather conditions; after the observation time within the sensors has passed, the classification outputs are displayed.
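Many of the overlays described in the caption above (intrusion areas, areas of interest, regions where the analysis is performed) rest on one geometric primitive: deciding whether a detection's anchor point lies inside a user-drawn polygon. The products' actual geometry handling is not published; a standard ray-casting sketch of that primitive is:

```python
# Ray-casting point-in-polygon test, the usual primitive behind
# area-of-interest checks. Illustrative sketch, not product code.

def point_in_polygon(point, polygon):
    """True if `point` lies inside `polygon` (list of (x, y) vertices)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges crossed by the horizontal ray going right from (x, y):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

area = [(2, 2), (8, 2), (8, 8), (2, 8)]   # a user-drawn intrusion area
assert point_in_polygon((5, 5), area)      # detection inside -> alarm
assert not point_in_polygon((9, 5), area)  # detection outside -> ignored
```

Running this test on, for example, the bottom-center point of each bounding box at every frame is enough to drive intrusion alarms and per-area people counts.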
AI-PPE 9 is designed to detect people wearing personal protective equipment (PPE). The application is based on the architecture described in [11]. The PPE combinations that the application is able to detect are: "Helmet", "Vest" and "Helmet and Vest". This solution can be used both in access control systems and for the surveillance of construction sites or other places where works are in progress. In the first case, the product is meant to verify that a worker is wearing the specified PPE, in order to authorize him to enter a work area; in the second, it can be used for continuous monitoring of a work area, with the aim of verifying that workers are wearing all the required PPE. An example of the product is reported in Figure 1j.

AI-RAIL 10 is a video analysis application designed to enhance railway safety. It combines traditional computer vision techniques with deep neural networks to identify and analyze the behavior of vehicles, pedestrians and obstacles within sensitive areas such as level crossings or along railway lines. The analysis can be activated depending on the barrier status, which can be obtained either from an external signal or through neural networks integrated into the system. An example is shown in Figure 1k.

AI-SPILL 11 is designed to monitor a person walking in an unsupervised area and detect if the person falls, raising an alarm if that happens. The analysis is performed using a mathematical model of the behavior of a person moving in the scenario of interest, especially of walking and falling dynamics. An advanced neural network, trained with thousands of samples of fallen people and optimized to run on board the camera, is then used to confirm the initial outcome of that model. An example is reported in Figure 1l.

AI-TRAFFIC-DEEP 12 is the video analysis solution for road monitoring, for both statistical and alerting purposes. Technically speaking, the application is based on a deep learning based vehicle and people detector [12], followed by a multi-object tracking module [7] and an advanced 3D scene reconstruction stage. It is capable of: (i) counting and classifying vehicles among cars, motorcycles and trucks; (ii) estimating the average speed and the color of each detected vehicle; (iii) evaluating the density of vehicles on a road branch and raising an alarm if congestion is detected; (iv) detecting vehicles travelling in the wrong direction or stopped in forbidden areas; (v) detecting the presence of pedestrians on the road; (vi) counting the number of vehicles and pedestrians crossing one area and arriving in another, building the origin-destination matrix; (vii) detecting lane changes and abnormal maneuvers (such as U-turns in prohibited areas) made by vehicles, based on the crossing of a set of user-configured virtual lines. An example is reported in Figure 1m.

AI-VIOLATION 13 is a vertical solution able to detect traffic light violations (see Fig. 1n), namely the presence of vehicles crossing the stopping line while the traffic light is red. It is based on the above mentioned vehicle detector and on a classifier that allows surveillance cameras (which are commonly installed over the city) to read the traffic light status without the need to install external devices. The state of a traffic light includes the color of the active traffic light circle and whether it is blinking or not. In particular, the application can identify vehicles crossing the stop line while the traffic light status is red and send a notification to report the violation. This notification also contains information about the vehicle, such as the type (among motorcycle, bicycle, car and truck), the estimated average speed and all the information necessary to decide whether there are legal grounds for a fine.

AI-WEATHER 14 is an innovative application that uses deep neural networks to monitor weather and road conditions. It can recognize a wide range of weather states, including sunny, cloudy, rainy, snowy and foggy, as well as road surface conditions, which can vary among dry, non-dry and flooded. The application is designed to operate effectively in outdoor environments and requires visibility of both the road surface and the sky at the same time (see Fig. 1o). AI-WEATHER offers a variety of useful alerts to users, including periodic updates on weather and road conditions, as well as instant notifications when the status of one of the sensors changes.

9 https://www.youtube.com/watch?v=BiCyon1KZco
10 https://youtu.be/cDh1epks3x0?si=TCZlm8QJOG_FJ6bk
11 https://www.youtube.com/watch?v=gAVEHPCckbE
12 https://www.youtube.com/watch?v=_gn-odtuWJo
13 https://www.youtube.com/watch?v=pCFBnWC8uPQ
14 https://www.youtube.com/watch?v=6yQS6n_nTcI

References

[1] V. Carletti, A. Greco, A. Saggese, M. Vento, An effective real time gender recognition system for smart cameras, J. Ambient Intell. Humaniz. Comput. 11 (2020) 2407–2419.
[2] A. Greco, A. Saggese, M. Vento, V. Vigilante, A convolutional neural network for gender recognition optimizing the accuracy/speed tradeoff, IEEE Access 8 (2020) 130771–130781. doi:10.1109/ACCESS.2020.3008793.
[3] A. Greco, A. Saggese, M. Vento, V. Vigilante, Gender recognition in the wild: a robustness evaluation over corrupted images 12 (2021).
[4] A. Greco, A. Saggese, M. Vento, V. Vigilante, Effective training of convolutional neural networks for age estimation based on knowledge distillation, Neural Comput. Appl. (2021).
[5] A. Greco, A. Saggese, M. Vento, Digital signage by real-time gender recognition from face images, in: 2020 IEEE International Workshop on Metrology for Industry 4.0 & IoT, 2020, pp. 309–313.
[6] L. Fotia, G. Percannella, A. Saggese, M. Vento, Highly crowd detection and counting based on curriculum learning, in: International Conference on Computer Analysis of Images and Patterns, Springer, 2023, pp. 13–22.
[7] P. Foggia, G. Percannella, A. Saggese, M. Vento, Real-time tracking of single people and groups simultaneously by contextual graph-based reasoning dealing complex occlusions, in: 2013 IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS), IEEE, 2013.
[8] P. Foggia, A. Saggese, M. Vento, Real-time fire detection for video-surveillance applications using a combination of experts based on color, shape, and motion, IEEE Transactions on Circuits and Systems for Video Technology 25 (2015) 1545–1556. doi:10.1109/TCSVT.2015.2392531.
[9] P. Foggia, A. Greco, A. Saggese, M. Vento, A method for detecting long term left baggage based on heat map, in: VISAPP (2), 2015, pp. 385–391.
[10] A. Greco, A. Saggese, B. Vento, A robust and efficient overhead people counting system for retail applications, in: International Conference on Image Analysis and Processing, Springer, 2022, pp. 139–150.
[11] A. Greco, S. Saldutti, B. Vento, Fast and effective detection of personal protective equipment on smart cameras, in: International Conference on Pattern Recognition, Springer, 2022, pp. 95–108.
[12] A. Greco, A. Saggese, M. Vento, V. Vigilante, Vehicles detection for smart roads applications on board of smart cameras: A comparative analysis, IEEE Trans. Intell. Transp. Syst. (2021) 1–13.