<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">A Real-Time Machine Learning Based Solution for Privacy Enforcement in Video Recordings and Live Streaming</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Pietro</forename><forename type="middle">Manganelli</forename><surname>Conforti</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer, Control and Management Engineering</orgName>
								<orgName type="institution">Sapienza University of Rome</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Matteo</forename><surname>Emanuele</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer, Control and Management Engineering</orgName>
								<orgName type="institution">Sapienza University of Rome</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Lorenzo</forename><surname>Mandelli</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer, Control and Management Engineering</orgName>
								<orgName type="institution">Sapienza University of Rome</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">A Real-Time Machine Learning Based Solution for Privacy Enforcement in Video Recordings and Live Streaming</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">7A1B179BE14109EC8344BC457E659566</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:25+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Image segmentation</term>
					<term>Context Recognition</term>
					<term>Detectron2</term>
					<term>Privacy enforcement</term>
					<term>Covid-19</term>
					<term>Alexnet</term>
					<term>Transfer learning</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In recent years the world has had to deal with a whole new situation brought about by Covid-19. Everyone's routine changed, and people started spending far more time than before in virtual meetings, virtual chats, and similar settings. With this shift, many privacy problems arose from the video data generated by a single user. Google and Zoom introduced the possibility of blurring out the background when using a front-facing camera, but this did not solve many privacy concerns, ranging from showing people in videos without their permission to the leaking of sensitive data and information from videos uploaded online. We propose a solution built on computer vision techniques, namely image segmentation and classification for context recognition, yielding a privacy enforcement system capable of fitting the user's personal needs by selectively blurring out specific objects from a video based on the user's preferences for each room in which they are.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In the past years there has been a solid shift of the entire world population towards a more active presence online. Covid-19 has further pushed many activities to take place digitally. Virtual meeting applications like Zoom had 10 million daily meeting participants in December 2019; by April 2020, that number had increased to 300 million <ref type="bibr" target="#b0">[1]</ref>. It is estimated that in 2024 only 25% of business meetings will take place in person <ref type="bibr" target="#b1">[2]</ref>. Studies started during 2020 have demonstrated that people nowadays spend on average far more time in virtual meetings than before <ref type="bibr" target="#b2">[3]</ref>, leading to many concerns for the individual. Users have started experiencing stress related to not feeling competent in the use of the technology, but most importantly "Zoom fatigue" due to the technology being always "on" <ref type="bibr" target="#b3">[4]</ref>. Many privacy-related issues have been crippling the user experience ever since, such as exposing private and personal spaces on camera, unintentionally framing a person who did not give consent to be on video, or sharing sensitive information leaked through careless online posting. Many solutions have been promptly developed to prevent such things from happening, providing virtual meeting room services with privacy-safeguarding functionalities like background blurring and virtual backgrounds <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b7">8]</ref>. In this paper we present a novel computer vision based approach for privacy enforcement in video data, capable of filtering out from a video a list of objects that a user does not want to be shown.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related works</head><p>With the advancement of technology, people have been sharing a continuously growing amount of personal data online. In addition to life-logging devices <ref type="bibr" target="#b8">[9]</ref>, social media platforms have recently stepped in, quickly coming to dominate the landscape of mass-produced data with "visual data" (i.e. images and videos). For instance, in 2020, the first year of the pandemic, users generated and shared via Facebook a total of 10.5 million videos <ref type="bibr" target="#b9">[10]</ref>. This impressive amount of data brought many privacy-related issues to the attention of experts and users; studies started identifying and observing how easily privacy could be violated just by unintentionally sharing personal data contained inside images and videos, and subsequently began proposing privacy models to formally approach and tackle such scenarios <ref type="bibr" target="#b10">[11]</ref>. The scientific world went quickly from defining sub-fields like Privacy-Preserving Machine Learning (PPML) <ref type="bibr" target="#b11">[12]</ref> to adopting deep learning models for image disguising <ref type="bibr" target="#b12">[13]</ref>, context recognition <ref type="bibr" target="#b14">[14]</ref>, image-based localization <ref type="bibr" target="#b15">[15]</ref>, and computer vision based frameworks <ref type="bibr" target="#b16">[16]</ref> as novel solutions for preserving the privacy of first-person vision image sequences, establishing computer vision, artificial intelligence, and data-driven approaches as state-of-the-art techniques for preserving privacy online. Among the many available solutions for privacy preservation and safeguarding, what is currently missing is one that allows single users to selectively censor objects from visual data depending on their personal needs and preferences.</p><p>The proposed work therefore aims to provide inexperienced users with an intuitive, easy-to-use tool for privacy enforcement in video data based on computer vision techniques.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Implementation</head><p>Users' sensitive data must be kept private. The proposed software tracks such information by means of various modules, whose pipeline is shown in fig. <ref type="figure" target="#fig_1">1</ref>. Memory buffers are used between modules to guarantee flexibility towards input videos of any aspect ratio and fps, as well as to stabilize the output, overcoming the flickering commonly experienced in this kind of application. Thanks to such buffers it is possible to store past frames and reuse them to statistically smooth the final output; past frames are reused according to the level of confidence of the class recognition module's predictions. Input videos can be directly uploaded to the system or streamed from cameras (e.g. webcams). The proposed solution separates the overarching learning problem into two sub-problems, namely context recognition and image segmentation; this approach guarantees robustness through modularity and simplifies the overall functioning of the software.</p><p>Recognizing the users, their emotional state <ref type="bibr" target="#b17">[17,</ref><ref type="bibr" target="#b18">18,</ref><ref type="bibr" target="#b19">19]</ref>, their attentive state <ref type="bibr" target="#b20">[20,</ref><ref type="bibr" target="#b21">21,</ref><ref type="bibr" target="#b22">22]</ref>, and the surrounding context <ref type="bibr" target="#b23">[23,</ref><ref type="bibr" target="#b24">24,</ref><ref type="bibr" target="#b25">25]</ref> makes it possible to selectively obscure specific elements based on the preferences the user expressed at registration time; such data is stored in a database for later inference. Context recognition has been tackled with a neural network inspired by Alexnet <ref type="bibr" target="#b26">[26]</ref>, a well-known deep convolutional neural network designed for image classification.</p><p>Together with an RFID application <ref type="bibr" target="#b14">[14]</ref>, Detectron2, a powerful instance segmentation network published by Facebook in 2019 <ref type="bibr" target="#b27">[27]</ref>, is used to unambiguously identify the user's context-specific, privacy-related data within video frames; combining the output masks produced by Detectron2 with all the information retrieved before, a particular region of the frame is identified and filtered with a Gaussian transform. Disambiguation of similar or identical contexts can be tackled with the support of RFID technology: by introducing a beacon that sends a constant signal, it is possible to recognize and distinguish two apparently identical-looking environments. Such a discriminating step is essential, yet simple to apply, since it can be integrated into any environment with little effort or invasiveness. A similar RFID-based solution for context recognition was already presented in earlier research <ref type="bibr" target="#b14">[14]</ref>. Finally, the desired effect is obtained by processing and collecting all the frames of the video and setting the correct frame rate.</p></div>
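<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the masking-and-blur step concrete, the following is a minimal sketch, not the actual implementation: it assumes a boolean mask such as the ones produced by an instance segmentation network, and applies a Gaussian blur only to the masked region of the frame; function and variable names are ours.</p><code lang="python">
import cv2
import numpy as np

def blur_masked_region(frame: np.ndarray, mask: np.ndarray,
                       ksize: int = 51) -> np.ndarray:
    """Blur only the pixels selected by a boolean mask.

    frame: HxWx3 BGR image; mask: HxW boolean array marking the
    object the user wants concealed (illustrative names).
    """
    # Blur the whole frame once, then copy the blurred pixels back
    # only where the mask is set.
    blurred = cv2.GaussianBlur(frame, (ksize, ksize), 0)
    out = frame.copy()
    out[mask] = blurred[mask]
    return out
</code></div>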
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Dataset</head><p>Two distinct datasets have been used for the two learning tasks, namely image segmentation and context recognition. The choice of the Detectron2 network for the image segmentation task leaves little choice but to use the 2017 version of the COCO dataset <ref type="bibr" target="#b28">[28]</ref>, on which the network has been demonstrated to perform well. COCO is a dataset composed of two groups of elements: images and annotations. The images contain a vast variety of objects, for a total of 80 different categories. The network was capable of recognizing them all, and even apparently odd objects were left untouched and not removed. Together with the set of images, COCO provides a set of so-called "annotations" that contain information related to the position of the object masks, their bounding boxes, and their location in the image reference frame.</p><p>For the context recognition part, a slightly modified version of a dataset available on Kaggle <ref type="bibr" target="#b29">[29]</ref> has been used; such a dataset is composed of 5 different classes representing five different kinds of rooms, two of which, namely living room and dining room, have been merged together. Each element is originally an RGB picture of a fixed size of 224x224x3, which has been resized to 227x227x3 to better fit AlexNet's input. As part of the training and testing process, a defined set of image processing techniques has been organized into a pipeline. Such a transformation pipeline has been implemented using the Albumentations library <ref type="bibr" target="#b30">[30]</ref>, an easy-to-use and intuitive library for image processing; it consists of: ShiftScaleRotate, for shifting or rotating images; RGBShift, for randomly altering RGB channels' values; RandomBrightnessContrast, for randomly changing images' brightness and contrast; MultiplicativeNoise, for randomly adding noise; Normalize, for normalizing data; and HueSaturationValue, for randomly changing images' saturation.</p></div>
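<div xmlns="http://www.tei-c.org/ns/1.0"><p>A minimal sketch of such an augmentation pipeline is shown below; the exact parameters used for training are not reported here, so the probabilities and values are illustrative library defaults.</p><code lang="python">
import albumentations as A

# Illustrative reconstruction of the augmentation pipeline described
# in the text; parameter values are library defaults, not the paper's.
train_transform = A.Compose([
    A.ShiftScaleRotate(p=0.5),          # shift / scale / rotate
    A.RGBShift(p=0.5),                  # randomly alter RGB channels
    A.RandomBrightnessContrast(p=0.5),  # random brightness/contrast
    A.MultiplicativeNoise(p=0.5),       # random multiplicative noise
    A.HueSaturationValue(p=0.5),        # random hue/saturation change
    A.Resize(227, 227),                 # fit AlexNet's 227x227 input
    A.Normalize(),                      # ImageNet mean/std by default
])

# Usage: augmented = train_transform(image=rgb_image)["image"]
</code></div>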
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Image Segmentation Network</head><p>Detectron2 is Facebook AI Research's library <ref type="bibr" target="#b27">[27]</ref> providing state-of-the-art detection and segmentation algorithms. It is the successor of Detectron <ref type="bibr" target="#b31">[31]</ref>, which is in turn based on the maskrcnn-benchmark model <ref type="bibr" target="#b32">[32]</ref>. It supports a great number of computer vision research projects thanks to its flexibility, output capabilities, and available documentation. Among the available Detectron2 architectures, mask_rcnn_fpn has been chosen. Such an architecture is mainly built from three modules: a Backbone Network, a Region Proposal Network, and a ROI (Region of Interest) Head.</p><p>The Backbone Network, whose role is to extract multi-scale feature maps with different receptive fields from the input image, is based on the Feature Pyramid Network <ref type="bibr" target="#b33">[33]</ref> technique. In this way, areas of interest seen from different points of view are identified and passed on to the two subsequent modules. The Region Proposal Network detects object regions (the so-called "proposal boxes") based on the multi-scale features, which together with the feature maps serve as input for the ROI Head. This last module warps feature maps using the proposal boxes into multiple fixed-size features, and retrieves the fine-tuned box locations and classification results via fully-connected layers.</p></div>
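<div xmlns="http://www.tei-c.org/ns/1.0"><p>For illustration, the sketch below shows how a pretrained Mask R-CNN FPN model can be loaded and run through Detectron2's standard API; this is our minimal example, and the confidence threshold shown is a placeholder rather than the value tuned for the system.</p><code lang="python">
import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
# Mask R-CNN with a ResNet-50 FPN backbone, pretrained on COCO 2017.
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # placeholder threshold

predictor = DefaultPredictor(cfg)
frame = cv2.imread("frame.jpg")  # hypothetical input frame
outputs = predictor(frame)

instances = outputs["instances"]
masks = instances.pred_masks      # one boolean mask per detected instance
classes = instances.pred_classes  # COCO category ids
scores = instances.scores         # per-instance confidence scores
</code></div>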
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Context Recognition Network</head><p>The context recognition task is a classification problem, where each frame of the video is treated as an image to classify. For this task Alexnet has been fine-tuned on our reduced dataset by means of transfer learning.</p><p>The structure of the network is shown in figure <ref type="figure" target="#fig_2">2</ref>.</p><p>Here we report the performance with which we evaluated our model, giving particular importance to the accuracy and the F1-score of each class. The performance of the network is reported in table <ref type="table" target="#tab_1">1</ref>. As can be seen, we obtain high F1-score values for each class and an overall accuracy above 80%, making the results satisfactory by our standards. As a general evaluation metric we consider the macro F1-score, defined as the average of the per-class F1-scores:</p><formula xml:id="formula_1">\text{macro-}F1 = \frac{1}{N}\sum_{i=1}^{N} F1_i = 0.81</formula><p>In our case, the macro F1-score is 0.81.</p><p>The confusion matrix in figure <ref type="figure" target="#fig_3">3</ref> shows how the different samples of the test set were classified during the test phase.</p></div>
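<div xmlns="http://www.tei-c.org/ns/1.0"><p>The transfer-learning setup can be sketched with torchvision as follows. This is our minimal illustration, assuming the four room classes that remain after merging; whether the convolutional layers were frozen during fine-tuning is our assumption, not stated in the text.</p><code lang="python">
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 4  # Bathroom, LivingRoom (merged with dining room), Bedroom, Kitchen

# Start from AlexNet pretrained on ImageNet.
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)

# Assumption: freeze the convolutional feature extractor and train
# only the classifier head.
for param in model.features.parameters():
    param.requires_grad = False

# Replace the final fully-connected layer so the classifier outputs
# one score per room class.
model.classifier[6] = nn.Linear(4096, NUM_CLASSES)
</code></div>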
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Output stabilization techniques</head><p>Videos in daily-life scenarios are likely to contain temporary blank frames, as well as artifacts, due to user- or scene-related conditions. A context recognition network will therefore occasionally generate trivially assigned classification labels, leading to an instability problem. We dealt with this problem by introducing "memory buffers" capable of statistically stabilizing the result. This is achieved by endowing the system with two buffers: one for the context recognition network, which stores the predicted context classes, and one used to track the instance segmentation network output and store the classes predicted with sufficient confidence. Storing past data makes it possible to create a temporal relation between successive frames, thus reinforcing the output of each network and stabilizing the final one. This method correlates information across the video with minimal expenditure of resources. The two buffers are described below.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.1.">Context memory buffer</head><p>The first buffer, visible at the top of fig. <ref type="figure" target="#fig_1">1</ref>, is the one dedicated to the output of Alexnet. The assumption behind this approach is that context changes do not take place suddenly but instead follow a smooth trend. For instance, if frame n is recognized as a specific context, there is a high probability that frame n + 1 will carry similar information and represent the same context. Thus, averaging over the past frames increases the overall accuracy by smoothing the output trend. The length of the buffer is set dynamically in relation to the fps value retrieved from the video, and the information obtained from the frames of the last half second is stored within it. The trade-off of this method is a small delay in the context recognition module: the output context label has to remain stable for at least half of the buffer length before the output changes, and during this short period frames are wrongly classified with the previous stable label. This is strongly compensated by the stability gained, and the delay is short enough to be hard to notice.</p></div>
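<div xmlns="http://www.tei-c.org/ns/1.0"><p>A minimal sketch of such a smoothing buffer follows (our illustration, assuming a simple majority vote over the last half second of labels; the actual system also weighs the classifier's confidence).</p><code lang="python">
from collections import Counter, deque

class ContextBuffer:
    """Majority-vote smoothing over the last half second of labels."""

    def __init__(self, fps: float):
        # Buffer length follows the video's fps: half a second of frames.
        self.labels = deque(maxlen=max(1, int(fps // 2)))

    def update(self, label: str) -> str:
        self.labels.append(label)
        # The stable output only changes once a new label dominates the
        # buffer, i.e. after at least half the buffer length has elapsed.
        return Counter(self.labels).most_common(1)[0][0]

# Usage: buf = ContextBuffer(fps=30); stable = buf.update(predicted_label)
</code></div>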
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.2.">Instance segmentation memory buffer</head><p>The other output stabilization technique is the instance segmentation memory buffer, dedicated to the Detectron output. The rationale here concerns the threshold used by the model to decide whether an element belongs to a certain class. Since the privacy goal is to conceal as much of the sensitive information inside the frames as possible, accepting more false positives in exchange for a higher number of true positives is preferable. Therefore, two kinds of thresholds are considered: a basic one and an optimal one. The first is lower than the second and is the minimal value at which the output of the network is taken into account at all: if the output confidence for a specific instance inside the frame falls below this value, the detection is considered too unclear and is discarded. The second threshold represents the optimal confidence value used by the system to properly recognize an element with sufficient accuracy. This information is used to track the elements that recently appeared to the network, appending them to the buffer. If the network finds an instance of a class already inside the buffer, even with a confidence value lower than the optimal threshold, it is still considered acceptable, and is therefore processed and eventually concealed. In this way it becomes easier for the network to handle moving objects, because this method allows them to be traced even under the uncertainty caused by movement. The buffer length is dynamically related to the fps value and stores information about the frames of the last three seconds. If an element is recognized by the network after this time interval, it needs to exceed the optimal confidence threshold again in order to be evaluated. The trade-off of this system, as mentioned above, is a higher frequency of false positives, which can mislead the final result and whose number is inversely proportional to the two threshold values. Overall, the accuracy following this approach improved by a fair margin, mainly in the more dynamic scenarios. A sketch of this two-threshold logic is given below.</p></div>
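<div xmlns="http://www.tei-c.org/ns/1.0"><p>The following is our illustration of the two-threshold gating; the three-second window follows the description above, while the threshold values and the data structure are assumptions.</p><code lang="python">
from collections import deque

class SegmentationBuffer:
    """Two-threshold gating for instance segmentation outputs.

    Classes detected above the optimal threshold are remembered for
    roughly three seconds; while remembered, a class only needs to clear
    the lower basic threshold to be processed and concealed.
    Threshold values here are illustrative placeholders.
    """

    def __init__(self, fps: float, basic: float = 0.5, optimal: float = 0.8):
        self.basic, self.optimal = basic, optimal
        self.recent = deque(maxlen=max(1, int(3 * fps)))  # one set per frame

    def accept_frame(self, detections):
        """detections: list of (class_id, score) pairs for one frame."""
        remembered = set().union(*self.recent) if self.recent else set()
        kept = [c for c, s in detections
                if s >= self.optimal or (s >= self.basic and c in remembered)]
        # Remember only confidently recognized classes for the next ~3 s;
        # after that, a class must exceed the optimal threshold again.
        self.recent.append({c for c, s in detections if s >= self.optimal})
        return kept
</code></div>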
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Results</head><p>The system is evaluated according to how consistently the full procedure works with respect to ground-truth information known for a set of test videos. Knowing in advance the setting shown in each test video as well as the list of elements inside it, we can measure the overall accuracy of the system. For instance, if a video displays a specific context with a certain number of known elements inside it, we can check how many times those elements are found by the two networks and by the two memory buffers in the output. To this end, an evaluation procedure was implemented that, given an input video, follows the same steps as the system while counting the number of frames for which the output is correct with respect to the total number of frames. Three test videos were used, covering the following scenarios:</p><p>• kitchen (fig. <ref type="figure" target="#fig_4">4</ref>.a), where we want to blur out a bowl on a table, provided that the system recognizes the context. The instance segmentation network can identify various objects such as a table, an oven, and bottles. Given the user's preferences, the stationary object to blur out from all the video frames is a single black bowl.</p><p>• bathroom (fig. <ref type="figure" target="#fig_4">4</ref>.b), where the user wants to blur out the WC. In this scenario, the instance segmentation network also recognizes the bidet as a WC, given the similarity of their structures.</p><p>• bedroom (fig. <ref type="figure" target="#fig_4">4</ref>.c), where a bowl is placed on a flat surface behind the bed and the user wants to blur it out. This scenario is potentially challenging since the framed portion of the room is very restricted, and the only object that can be considered a strong feature is the bed.</p><p>The results are shown in table <ref type="table" target="#tab_2">2</ref>. We report five evaluation values for each test:</p><p>• accuracy of the context recognition network with respect to the total number of frames (C.R.), indicating the percentage of success of the context recognition network applied to the frames of the video, without the improvements brought by the memory buffer.</p><p>• accuracy of the context recognition network plus memory buffer with respect to the total number of frames (C.R. \w B.), indicating the percentage of success of the context recognition network combined with the memory buffer for the context recognition task.</p><p>• accuracy of the instance segmentation network in finding the objects of interest in a frame, with respect to the total number of frames (I.S.), indicating the percentage of success of the instance segmentation task for the objects of interest to the user, without the improvements brought by the memory buffer.</p><p>• accuracy of the instance segmentation network plus memory buffer in finding the objects of interest in a frame, with respect to the total number of frames (I.S. \w B.), indicating the percentage of success of the instance segmentation task for the objects of interest to the user, including the improvements brought by the memory buffer.</p><p>• overall accuracy of the whole system, indicating, as the name states, the overall accuracy of the whole pipeline. 
This accuracy is computed as the product of the instance segmentation accuracy including its buffer and the context recognition accuracy including its buffer.</p><p>In table <ref type="table" target="#tab_2">2</ref> the tests' results are reported, confirming that the memory buffers contribute to increasing the accuracy of both tasks, translating into a better overall system accuracy.</p><p>It must be noted that the accuracy can be further improved by fine-tuning the thresholds required by the instance segmentation task. A general rule of thumb is that, if the accuracy is similar between the system using the buffers and the system not using them, the performance can still be improved through such fine-tuning.</p></div>
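<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a check of this combination rule, the small snippet below (our illustration) reproduces the overall column of table 2 as the product of the two buffered accuracies.</p><code lang="python">
# Overall accuracy as the product of the two buffered accuracies
# (C.R. \w B., I.S. \w B.), using the values reported in Table 2.
results = {
    "kitchen":  (1.00, 0.98),
    "bathroom": (0.81, 1.00),
    "bedroom":  (1.00, 0.78),
}
for video, (cr_wb, is_wb) in results.items():
    print(f"{video}: overall = {cr_wb * is_wb:.2f}")
# kitchen: 0.98, bathroom: 0.81, bedroom: 0.78 -- matching the table.
</code></div>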
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusions</head><p>In this paper we presented a machine learning powered solution for privacy enforcement in video data: a data-driven implementation to safeguard the privacy of any user who may have to spend many hours on video and/or in video meetings. Such a solution addresses a previously untouched problem that had not been formally faced by the field in past years, as our time has increasingly shifted towards being spent online. Our implementation shows good performance even in the presence of noisy or foggy videos, while performing almost perfectly in the most common scenarios of videos with common perspectives, such as general recordings made with mobile devices. The adaptability of the system to the needs of different users, both for the objects of interest and for the contexts of interest, makes the proposed solution a solid step forward in the field of privacy enforcement for video data.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>ICYRIME 2024: 9th International Conference of Yearly Reports on Informatics, Mathematics, and Engineering. Catania, July 29-August 1, 2024 mandelli@diag.uniroma1.it (L. Mandelli)</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: The pipeline of our system.</figDesc><graphic coords="2,89.32,85.18,416.64,137.34" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Alexnet architecture. All rights reserved to the owner of the picture <ref type="bibr" target="#b34">[34]</ref> </figDesc><graphic coords="4,160.61,84.19,274.05,124.25" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: The confusion matrix generated with the use of the scikit-learn library <ref type="bibr" target="#b35">[35]</ref> </figDesc><graphic coords="4,89.29,250.70,203.36,171.72" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: a: kitchen b: Bathroom c: Bedroom</figDesc><graphic coords="5,177.98,85.55,74.99,111.27" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1</head><label>1</label><figDesc>Performance of our classification task</figDesc><table><row><cell>Classes</cell><cell>precision</cell><cell>recall</cell><cell>F1-score</cell><cell>overall Accuracy</cell></row><row><cell>Bathroom</cell><cell>0.84</cell><cell>0.90</cell><cell>0.87</cell><cell></cell></row><row><cell>LivingRoom</cell><cell>0.92</cell><cell>0.79</cell><cell>0.85</cell><cell></cell></row><row><cell>Bedroom</cell><cell>0.69</cell><cell>0.76</cell><cell>0.76</cell><cell></cell></row><row><cell>Kitchen</cell><cell>0.75</cell><cell>0.76</cell><cell>0.76</cell><cell></cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell>0.83</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2</head><label>2</label><figDesc>Performance of the system</figDesc><table><row><cell>Input</cell><cell>C.R.</cell><cell>C.R. \w B.</cell><cell>I.S.</cell><cell>I.S. \w B.</cell><cell>Overall %</cell></row><row><cell>Kitchen</cell><cell>0.91</cell><cell>1.0</cell><cell>0.89</cell><cell>0.98</cell><cell>0.98</cell></row><row><cell>Bathroom</cell><cell>0.74</cell><cell>0.81</cell><cell>0.89</cell><cell>1.0</cell><cell>0.81</cell></row><row><cell>Bedroom</cell><cell>0.92</cell><cell>1.0</cell><cell>0.5</cell><cell>0.78</cell><cell>0.78</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><forename type="first">B</forename><surname>Evans</surname></persName>
		</author>
		<title level="m">The zoom revolution: 10 eye-popping stats from tech&apos;s new superstar</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<author>
			<persName><forename type="first">W</forename><surname>Standaert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Muylle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Basu</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.im.2020.103393</idno>
		<ptr target="https://doi.org/10.1016/j.im.2020.103393" />
		<title level="m">How shall we meet? understanding the importance of meeting mode capabilities for different meeting objectives</title>
				<imprint>
			<publisher>Information Management</publisher>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">The state of video conferencing</title>
		<author>
			<persName><forename type="first">D</forename><surname>Chew</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Azizi</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Virtual work meetings during the covid-19 pandemic: The good, bad, and ugly</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">A</forename><surname>Karl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">V</forename><surname>Peluchette</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Aghakhani</surname></persName>
		</author>
		<idno type="DOI">10.1177/10464964211015286</idno>
		<ptr target="https://doi.org/10.1177/10464964211015286" />
	</analytic>
	<monogr>
		<title level="j">Small Group Research</title>
		<imprint>
			<biblScope unit="volume">0</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A multiscale image compressor with rbfnn and discrete wavelet decomposition</title>
		<author>
			<persName><forename type="first">M</forename><surname>Wozniak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Tramontana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Capizzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Lo Sciuto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">K</forename><surname>Nowicki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">T</forename><surname>Starczewski</surname></persName>
		</author>
		<idno type="DOI">10.1109/IJCNN.2015.7280461</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Joint Conference on Neural Networks</title>
				<meeting>the International Joint Conference on Neural Networks</meeting>
		<imprint>
			<date type="published" when="2015-09">September 2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A new iterative fir filter design approach using a gaussian approximation</title>
		<author>
			<persName><forename type="first">G</forename><surname>Capizzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Coco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">L</forename><surname>Sciuto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoli</surname></persName>
		</author>
		<idno type="DOI">10.1109/LSP.2018.2866926</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Signal Processing Letters</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="page" from="1615" to="1619" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Is the colony of ants able to recognize graphic objects?</title>
		<author>
			<persName><forename type="first">D</forename><surname>Połap</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Woźniak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Tramontana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Damaševičius</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-319-24770-0_33</idno>
	</analytic>
	<monogr>
		<title level="j">Communications in Computer and Information Science</title>
		<imprint>
			<biblScope unit="volume">538</biblScope>
			<biblScope unit="page" from="376" to="387" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Can we process 2d images using artificial bee colony?</title>
		<author>
			<persName><forename type="first">M</forename><surname>Woźniak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Połap</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gabryel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">K</forename><surname>Nowicki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Tramontana</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-319-19324-3_59</idno>
	</analytic>
	<monogr>
		<title level="s">Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science</title>
		<imprint>
			<biblScope unit="volume">9119</biblScope>
			<biblScope unit="page" from="660" to="671" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Dredging up the past: Lifelogging, memory and surveillance</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">L</forename><surname>Allen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">University of Chicago Law Review</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="2825" to="2830" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">The most astonishing facebook statistics in 2022</title>
		<author>
			<persName><forename type="first">T</forename><surname>Dobrilova</surname></persName>
		</author>
		<ptr target="https://techjury.net/blog/facebook-statistics/" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Privacy issues for online personal photograph collections</title>
		<author>
			<persName><forename type="first">S</forename><surname>Cunningham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Masoodian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Adams</surname></persName>
		</author>
		<idno type="DOI">10.4067/S0718-18762010000200003</idno>
	</analytic>
	<monogr>
		<title level="j">Journal of Theoretical and Applied Electronic Commerce Research</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Baracaldo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Joshi</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2108.04417</idno>
		<title level="m">Privacy-preserving machine learning: Methods, challenges and directions</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Image disguising for privacypreserving deep learning</title>
		<author>
			<persName><forename type="first">S</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security</title>
				<meeting>the 2018 ACM SIGSAC Conference on Computer and Communications Security</meeting>
		<imprint>
			<biblScope unit="page" from="53" to="59" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title/>
		<idno type="DOI">10.1145/3243734.3278511</idno>
		<ptr target="https://doi.org/10.1145/3243734.3278511" />
		<imprint>
			<date type="published" when="2018">2018</date>
			<publisher>Association for Computing Machinery</publisher>
			<pubPlace>New York, NY, USA</pubPlace>
		</imprint>
		<respStmt>
			<orgName>Communications Security</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">A context-driven privacy enforcement system for autonomous media capture devices</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">M</forename><surname>Farinella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Nicotra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Riccobene</surname></persName>
		</author>
		<idno type="DOI">10.1007/s11042-019-7376-z</idno>
		<ptr target="https://doi.org/10.1007/s11042-019-7376-z" />
	</analytic>
	<monogr>
		<title level="j">Multimedia Tools and Applications</title>
		<imprint>
			<biblScope unit="volume">78</biblScope>
			<biblScope unit="page" from="14091" to="14108" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>Speciale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Schönberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">B</forename><surname>Kang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">N</forename><surname>Sinha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Pollefeys</surname></persName>
		</author>
		<idno>CoRR abs/1903.05572</idno>
		<title level="m">Privacy preserving image-based localization</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Trusting the computer in computer vision: A privacy-affirming framework</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">T.-Y</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Biglari-Abhari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">I.-K</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Enhancing sentiment analysis on seed-iv dataset with vision transformers: A comparative study</title>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">E</forename><surname>Tibermacine</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tibermacine</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Guettala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Russo</surname></persName>
		</author>
		<idno type="DOI">10.1145/3638985.3639024</idno>
	</analytic>
	<monogr>
		<title level="m">ACM International Conference Proceeding Series</title>
				<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="238" to="246" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Analysis pre and post covid-19 pandemic rorschach test data of using em algorithms and gmm models</title>
		<author>
			<persName><forename type="first">V</forename><surname>Ponzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Russo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Wajda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Brociek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">3360</biblScope>
			<biblScope unit="page" from="55" to="63" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">A novel convmixer transformer based architecture for violent behavior detection</title>
		<author>
			<persName><forename type="first">A</forename><surname>Alfarano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>De Magistris</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Mongelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Russo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Starczewski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoli</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-42508-0_1</idno>
	</analytic>
	<monogr>
		<title level="j">LNAI</title>
		<imprint>
			<biblScope unit="volume">14126</biblScope>
			<biblScope unit="page" from="3" to="16" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Eyetracking system with low-end hardware: Development and evaluation</title>
		<author>
			<persName><forename type="first">E</forename><surname>Iacobelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ponzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Russo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoli</surname></persName>
		</author>
		<idno type="DOI">10.3390/info14120644</idno>
	</analytic>
	<monogr>
		<title level="j">Information</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">An advanced solution based on machine learning for remote emdr therapy</title>
		<author>
			<persName><forename type="first">F</forename><surname>Fiani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Russo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoli</surname></persName>
		</author>
		<idno type="DOI">10.3390/technologies11060172</idno>
	</analytic>
	<monogr>
		<title level="j">Technologies</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Psychoeducative social robots for an healthier lifestyle using artificial intelligence: a case-study</title>
		<author>
			<persName><forename type="first">V</forename><surname>Ponzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Russo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Bianco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Wajda</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">3118</biblScope>
			<biblScope unit="page" from="26" to="33" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Contagion prevention of covid-19 by means of touch detection for retail stores</title>
		<author>
			<persName><forename type="first">R</forename><surname>Brociek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">D</forename><surname>Magistris</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Cardia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Coppa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Russo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">3092</biblScope>
			<biblScope unit="page" from="89" to="94" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Human attention assessment using a machine learning approach with gan-based data augmentation technique trained using a custom dataset</title>
		<author>
			<persName><forename type="first">S</forename><surname>Pepe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tedeschi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Brandizzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Russo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Iocchi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoli</surname></persName>
		</author>
		<idno type="DOI">10.21926/obm.neurobiol.2204139</idno>
	</analytic>
	<monogr>
		<title level="j">OBM Neurobiology</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">A machine learning based real-time application for engagement detection</title>
		<author>
			<persName><forename type="first">E</forename><surname>Iacobelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Russo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Napoli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">3695</biblScope>
			<biblScope unit="page" from="75" to="84" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Krizhevsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">E</forename><surname>Hinton</surname></persName>
		</author>
		<title level="m">Imagenet classification with deep convolutional neural networks</title>
				<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kirillov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Massa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-Y</forename><surname>Lo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Girshick</surname></persName>
		</author>
		<ptr target="https://github.com/facebookresearch/detectron2" />
		<title level="m">Detectron2</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<monogr>
		<author>
			<persName><forename type="first">T.-Y</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Maire</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Belongie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bourdev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Girshick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hays</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Perona</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ramanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L</forename><surname>Zitnick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dollár</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1405.0312</idno>
		<title level="m">Microsoft coco: Common objects in context</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<title level="m" type="main">House rooms image dataset</title>
		<author>
			<persName><surname>Robinreni</surname></persName>
		</author>
		<ptr target="https://www.kaggle.com/robinreni/house-rooms-image-dataset" />
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Albumentations: Fast and flexible image augmentations</title>
		<author>
			<persName><forename type="first">A</forename><surname>Buslaev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">I</forename><surname>Iglovikov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Khvedchenya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Parinov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Druzhinin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Kalinin</surname></persName>
		</author>
		<idno type="DOI">10.3390/info11020125</idno>
		<ptr target="http://dx.doi.org/10.3390/info11020125" />
	</analytic>
	<monogr>
		<title level="j">Information</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page">125</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<title level="m" type="main">Detectron</title>
		<author>
			<persName><forename type="first">R</forename><surname>Girshick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Radosavovic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gkioxari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dollár</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<ptr target="https://github.com/facebookresearch/detectron" />
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<monogr>
		<author>
			<persName><forename type="first">F</forename><surname>Massa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Girshick</surname></persName>
		</author>
		<ptr target="https://github.com/facebookresearch/maskrcnn-benchmark" />
		<title level="m">maskrcnn-benchmark: Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<monogr>
		<title level="m" type="main">Feature pyramid networks for object detection</title>
		<author>
			<persName><forename type="first">T.-Y</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dollár</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Girshick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Hariharan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Belongie</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1612.03144</idno>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Khvostikov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Aderghal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Benois-Pineau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Krylov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Catheline</surname></persName>
		</author>
		<title level="m">cnn-based classification using smri and md-dti images for alzheimer disease studies</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page">3</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">Scikit-learn: Machine learning in Python</title>
		<author>
			<persName><forename type="first">F</forename><surname>Pedregosa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Varoquaux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gramfort</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Michel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Thirion</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Grisel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Blondel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Prettenhofer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Weiss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Dubourg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Vanderplas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Passos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Cournapeau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Brucher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Perrot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Duchesnay</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="2825" to="2830" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
