<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
<title level="a" type="main">Streetseek - Understanding Public Space Engagement Using Deep Learning &amp; Thermal Imaging</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
<persName><forename type="first">Ciarán</forename><surname>O'Mara</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Limerick</orgName>
								<address>
									<postCode>V94 T9PX</postCode>
									<settlement>Limerick</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Eoghan</forename><surname>Mulcahy</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Limerick</orgName>
								<address>
									<postCode>V94 T9PX</postCode>
									<settlement>Limerick</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Pepijn</forename><surname>Van De Ven</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Limerick</orgName>
								<address>
									<postCode>V94 T9PX</postCode>
									<settlement>Limerick</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">John</forename><surname>Nelson</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Limerick</orgName>
								<address>
									<postCode>V94 T9PX</postCode>
									<settlement>Limerick</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
<title level="a" type="main">Streetseek - Understanding Public Space Engagement Using Deep Learning &amp; Thermal Imaging</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">C4F2775D6FAC257698760C939AA58299</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T02:38+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Smart Cities</term>
					<term>Computer Vision</term>
					<term>Data Engineering</term>
					<term>Machine Learning</term>
					<term>Object Detection</term>
					<term>Object Tracking</term>
					<term>Thermal Cameras</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this paper, a platform for analysing public space engagement is described. This research focused on better understanding the various ways people interact with the city environment: for example, the number of persons on a street, the average time spent there, and, topically due to Covid-19, the physical distance maintained between people. A novel data collection method was used to capture imagery from several streets in a low-cost, scalable, and privacy-preserving fashion. Insights were captured in real time over several months at a five-minute interval, nine hours a day and seven days a week, across multiple cameras. These insights were generated by a novel CNN trained on thermal camera imagery, which maintained each individual's right to privacy by ensuring that no person was identifiable in the captured dataset. Finally, a SORT-based tracking algorithm was used to measure interactions over time.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The Streetseek project was undertaken in response to an open call as part of the +CityxChange smart city program <ref type="bibr">[1]</ref>, funded by the European Union's Horizon 2020 research and innovation program. The goal of this program is closely aligned with the UN Sustainable Development Goals (SDGs), specifically SDG 11: "Making cities and human settlements inclusive, safe, resilient and sustainable". The development of Streetseek took place over the course of three months, from May to August 2020. The idea of data-driven decision making has been around for centuries; however, recent advancements in information and communication technology have changed the way in which we use data for policy-making and urban growth <ref type="bibr" target="#b1">[2]</ref>. It is essential to understand how we use our cities in order to innovate consistently in urban areas while ensuring they have the facilities and systems needed to cater for growing urban populations. In 2019, the United Nations estimated that more than half the world's population (4.2 billion people) now live in urban areas, and that by 2041 this figure will increase to 6 billion people <ref type="bibr" target="#b2">[3]</ref>. The need to capture large-scale actionable data has been highlighted further in recent times by the Covid-19 pandemic. Policy-makers adapt their guidelines and restrictions based on data on positive tests, hospital admissions, and deaths. Although the effectiveness of these guidelines can be inferred from such data, there is no data indicating in real time how well people are adhering to the measures that have been put in place.</p><p>A platform based on thermal imaging and deep learning has been built, capable of gathering real-time, actionable insights directly from streets and laneways. 
Although local governments are the immediate stakeholders for this type of system, the public and academic researchers will also have an interest in this data. Therefore, the insights capture platform is built upon a collaborative, open data platform; this design ensures the data is easily accessible and can therefore be easily communicated. The capabilities of computer vision applications have grown exponentially in recent years, driven by advances in deep learning algorithms which allow more accurate object detection within imagery across a wider range of environments. Such development has allowed camera systems to become less passive and evolve into real-world sensing tools. One of the major benefits of using a camera system as a sensor is the high resolution of the data collected: many different insights can be generated through various algorithms depending on the intended use case. Processing of the data can take place either at the edge (on the camera compute unit) or in the cloud. Cloud processing was chosen due to hardware limitations in the capture layer. The capture layer (see Fig. <ref type="figure" target="#fig_0">1</ref>) was therefore vastly simplified, requiring only a video encoder and streaming software, resulting in minimal computational requirements. This reduced complexity in the capture layer translates to increased complexity in the inference layer (see Fig. <ref type="figure" target="#fig_0">1</ref>). If the detection algorithms ran at the edge, the upstream packets would be much smaller, consisting simply of the detection data processed by the edge compute unit. Instead, video data was streamed from multiple cameras to the cloud at a resolution of 160x120 pixels at 7 frames per second. The decoding engine at the front of the inference layer was therefore required to scale on demand with the number of incoming streams; a cluster-based approach was used to handle this requirement.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">System Overview</head><p>The YOLOv3 <ref type="bibr" target="#b3">[4]</ref> single-shot detector algorithm (discussed in Section 3.2) ran within the detector block. An API endpoint was exposed to which decoded images from the decoder engine were forwarded in order to generate the detection data. The detections store (see Fig. <ref type="figure" target="#fig_0">1</ref>) used a NoSQL database comprising tables and items; primary keys uniquely identified each item in a table, and a secondary index provided additional querying flexibility. Having generated the relevant bounding boxes, this data was used as the input to the insights generator function, within which the SORT <ref type="bibr" target="#b4">[5]</ref> algorithm was used to derive the various insights required of the application. The insights generator ran on a 5-minute interval, querying a batch of data between two timestamps from the detections store and sending a request to an API to write the results to the insights store. A REST API was developed to interact with the insights store, exposing GET and POST requests to access the data contained within it.</p></div>
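As a sketch of the 5-minute batching described above, the snippet below aligns a query window to the interval boundary and filters a batch of detections by timestamp. The store layout (items keyed by a unix `ts` field) and the function names are illustrative assumptions, not the deployed API.

```python
import time

def five_minute_window(now=None):
    """Return (start, end) unix timestamps for the most recently completed
    5-minute interval, as the insights generator would use when batching."""
    now = int(now if now is not None else time.time())
    end = now - (now % 300)  # align to the 5-minute boundary (300 s)
    return end - 300, end

def query_detections(store, start, end):
    """Stand-in for the NoSQL range query on the timestamp secondary index."""
    return [item for item in store if start <= item["ts"] < end]
```

In the deployed system this range query would run against the NoSQL detections store rather than an in-memory list; the windowing logic is the same either way.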
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Thermal Person Detection</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Background</head><p>Person detection is a well-researched problem. However, there are significant privacy concerns surrounding cameras in public spaces. Thermal cameras measure temperature by capturing levels of infrared radiation invisible to the naked eye. As a result, they do not capture details which could be used to identify an individual. This poses a problem, as 'off the shelf' human detection models are trained on feature-rich RGB images. An algorithm developed to detect humans in thermal imagery must instead rely on foreground/background segmentation, which can vary in different conditions as shown in Fig. <ref type="figure" target="#fig_2">2</ref>. Furthermore, the cost of a thermal sensor is significantly higher than that of an RGB sensor. For this system to be financially feasible, a low-resolution (160x120), 9 fps (frames per second) thermal camera was used. The camera feed was streamed at 4.5 fps to reduce computational cost in the cloud.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Method</head><p>Initially, 'classical approaches' were used in an attempt to detect pedestrians. A series of thresholding techniques <ref type="bibr" target="#b5">[6]</ref> were applied. The first approach applied a fixed threshold value (as shown in Fig. <ref type="figure" target="#fig_4">3(b)</ref>), calculated through a trial-and-error process. The subsequent two methods tested were adaptive thresholding techniques. The first, the Otsu algorithm, exhaustively searches for the threshold that minimizes the intra-class variance, defined as a weighted sum of the variances of the two classes:</p><formula xml:id="formula_0">σ²w(t) = ω₀(t)σ²₀(t) + ω₁(t)σ²₁(t).</formula><p>The mean adaptive thresholding method examined the mean pixel intensity of the local neighbourhood of each pixel. Initial results showed that thresholding alone would not suffice, as shown in Fig. <ref type="figure">3</ref>. This became even more apparent as the scene grew more complex, with multiple pedestrians and varying environmental temperatures. Furthermore, thresholding is incapable of separating individual pedestrians when they are grouped together. After it became apparent that thresholding would only work in simple scenarios, efforts pivoted to background subtraction. This involves training a background model which can be subtracted from each video frame, leaving the foreground objects. The following algorithms <ref type="bibr" target="#b6">[7]</ref> were tested to assess their suitability for this use case: the Gaussian mixture-based background/foreground segmentation algorithms (MOG &amp; MOG2), the K-nearest neighbours background subtraction algorithm (KNN), the statistical background image estimation and per-pixel Bayesian segmentation algorithm (GMG), and the CouNT high-speed background subtraction algorithm (CNT). 
The results presented in Fig. <ref type="figure">4</ref> were marginally better across frames than those of the thresholding techniques. However, the tops and bottoms of people were often split into two separate detections.</p></div>
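The two classical baselines just discussed can be illustrated in a few lines of numpy. This is a minimal sketch, not the implementations used in the project: `otsu_threshold` performs the exhaustive intra-class-variance search from the formula above, and `RunningAverageBackground` is a deliberately simple stand-in for the MOG2/KNN-style background models.

```python
import numpy as np

def otsu_threshold(img):
    """Exhaustively search for the threshold t minimising the intra-class
    variance  sigma_w^2(t) = w0(t)*sigma0^2(t) + w1(t)*sigma1^2(t)."""
    pixels = img.ravel().astype(np.float64)
    best_t, best_var = 0, np.inf
    for t in range(1, 255):
        lo, hi = pixels[pixels < t], pixels[pixels >= t]
        if lo.size == 0 or hi.size == 0:
            continue
        w0, w1 = lo.size / pixels.size, hi.size / pixels.size
        var_w = w0 * lo.var() + w1 * hi.var()
        if var_w < best_var:
            best_t, best_var = t, var_w
    return best_t

class RunningAverageBackground:
    """Toy background model: exponential running average of past frames.
    Foreground = pixels deviating from the model by more than `thresh`."""
    def __init__(self, alpha=0.05, thresh=25):
        self.alpha, self.thresh, self.bg = alpha, thresh, None

    def apply(self, frame):
        f = frame.astype(np.float64)
        if self.bg is None:
            self.bg = f.copy()
        mask = (np.abs(f - self.bg) > self.thresh).astype(np.uint8) * 255
        # Update slowly so transient pedestrians are not absorbed into the model.
        self.bg = (1 - self.alpha) * self.bg + self.alpha * f
        return mask
```

As the text notes, both approaches degrade in complex scenes: a single global threshold cannot separate grouped pedestrians, and a background model can split one warm body into disconnected regions.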
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 4: Background Subtractor Algorithms Results</head><p>In an attempt to produce a more robust and reliable thermal pedestrian detector, a deep learning approach was adopted. A relatively lightweight CNN architecture was needed to achieve near real-time inference while keeping cloud computing costs low; the YOLOv3 architecture <ref type="bibr" target="#b3">[4]</ref>, presented in 2018, was selected. The out-of-the-box YOLOv3 model has been trained on the COCO dataset <ref type="bibr" target="#b7">[8]</ref>. In order to train a model capable of classifying pedestrians in a thermal image, a transfer learning approach <ref type="bibr" target="#b8">[9]</ref> was used: the final fully connected layers in the model were stripped and the output was reduced to 3 classes (person, car, bicycle). The FLIR thermal imagery driving dataset <ref type="bibr" target="#b9">[10]</ref>, which comprises bicycles, cars and people and comes pre-annotated, was used for the transfer learning. The original 640x512 images were first letter-boxed to a 4:3 aspect ratio (640x480). A 4x4 kernel was then run across each image, averaging pixels to reduce the resolution to 160x120, as shown in Fig. <ref type="figure">5</ref>. This was necessary for transfer learning, as the training images needed to be of the same resolution as the street cameras. Finally, any detection whose bounding box covered less than 20 px², which by inspection appeared to be noise, was discarded. Following down-sampling to 160x120, the model was trained on this dataset for 50 epochs. Performance increased slightly on this iteration as the model (v1) became more familiar with thermal data. However, the model was still not performing acceptably, as it had not seen data from the 160x120 street cameras. 
The model was then fine-tuned (initialised with the v1 weights) on an annotated dataset of 922 thermal images from the Street 1 and Street 2 cameras for a further 50 epochs, after which it achieved the performance metrics shown in Table <ref type="table" target="#tab_1">2</ref>. This model (v2) was chosen for deployment. </p></div>
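The 4x4 averaging kernel and the small-detection filter described above can be sketched as follows. The block-averaging step matches the stated 640x480 to 160x120 reduction; the interpretation of the 20 px² cut-off as a bounding-box area threshold is an assumption.

```python
import numpy as np

def downsample_4x4(img):
    """Average non-overlapping 4x4 pixel blocks: 640x480 -> 160x120."""
    h, w = img.shape
    # Reshape so axes 1 and 3 index positions inside each 4x4 block,
    # then average over those axes.
    return img.reshape(h // 4, 4, w // 4, 4).mean(axis=(1, 3))

def drop_noise_boxes(boxes, min_area=20):
    """Discard (x1, y1, x2, y2) detections covering < min_area px^2 (noise)."""
    return [b for b in boxes if (b[2] - b[0]) * (b[3] - b[1]) >= min_area]
```

Averaging (rather than decimating) the 4x4 blocks preserves the faint thermal gradients that the detector relies on at such a low resolution.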
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Insights Generation</head><p>After a pedestrian is detected in an image, they must be tracked across video frames in order to understand their behaviour and their interaction with the public space and with others. To track pedestrian centroids in 2D space, the SORT (Simple Online and Realtime Tracking) <ref type="bibr" target="#b4">[5]</ref> algorithm was implemented. SORT uses a Kalman filter, also known as a linear quadratic estimation (LQE) algorithm, which uses a series of detection centroids over time to produce an estimate of where the next centroid will be; an IOU (intersection over union) criterion is then used to accept or reject the estimate. The critical aspect of this algorithm is the association of objects between frames, and IOU is not a good approach for small objects, as there is naturally less overlap between their bounding boxes. DeepSORT <ref type="bibr" target="#b10">[11]</ref>, a newer version of the algorithm, addresses this issue by adding a pre-trained neural network to generate features for objects, so that the association can be made on feature similarity instead of overlap. Although DeepSORT improves overall accuracy compared with SORT, it comes at a computational cost that was not required for this use case; SORT was therefore chosen, which speeds up cloud processing time and contributes to keeping system costs down. The calculation of each metric is an extension of the SORT implementation. Measuring distance or speed from a camera feed can be difficult due to perspective, perceived closeness, and pixel-to-distance calibration. 
The concept of perspective is the idea that humans project the real (3D) world onto a 2D image in order to understand distance and depth. A camera sees the world in the same way, and thus the Euclidean distance in the 2D image plane is not a good approximation of the real-world 3D Euclidean distance. To solve this, the bird's-eye-view virtual camera transform presented in <ref type="bibr" target="#b11">[12]</ref> was implemented. This perspective transformation was originally developed for a camera mounted on a vehicle, addressing the pronounced perspective effect on the image caused by the camera's angle and height. The same issue is evident in this pedestrian camera feed, and the method therefore transfers directly to this application.</p><p>The scene is transformed as shown in Fig. <ref type="figure" target="#fig_8">7</ref> and some basic pixel-to-distance calibration is performed. This allows the distance between pedestrians to be calculated, as well as the average speed at which they move, the estimated time they spend in the frame (based on the SORT ID assigned to each person), and a generalised heat-map that can be used to understand how pedestrians use the public space. Furthermore, a count line can be positioned in the frame to count pedestrians and the direction in which they are walking. The whole tracking process is described in Fig. <ref type="figure" target="#fig_8">7</ref></p></div>
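The IOU criterion that SORT uses for frame-to-frame association can be written directly. This minimal version also illustrates why small boxes are problematic: a one-pixel offset between a prediction and a detection sharply reduces the overlap ratio.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes: the criterion
    used to accept a Kalman-predicted box against a new detection."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0
```

For the small pedestrian boxes in 160x120 imagery, shifting a 2x2 box by one pixel drops its IOU with itself from 1.0 to 1/7, which is the weakness DeepSORT's appearance features address.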
<div xmlns="http://www.tei-c.org/ns/1.0"><head>(b).</head><p>A total of seven pieces of data were calculated and stored. After a video frame is processed by the YOLOv3 detection model, the pedestrian bounding boxes are stored in a detections database with a unix timestamp linking the data to the frame. Every five minutes an insights generator program fetches the last five minutes of detection data and calculates the metrics. This process is described in Algorithm 1.</p></div>
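A bird's-eye transform of the kind described above can be sketched by solving for a homography that maps four ground-plane image points onto a rectangle in the top-down view, then projecting each pedestrian's feet through it. The direct linear solve below is an illustrative stand-in for the method of [12]; the point correspondences are hypothetical.

```python
import numpy as np

def homography_from_points(src, dst):
    """Solve for the 3x3 matrix H mapping 4 image points (a ground-plane
    quadrilateral) onto a bird's-eye rectangle, with h33 fixed to 1."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, pt):
    """Project a single (x, y) point, e.g. a pedestrian's feet, through H."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w
```

Once the destination rectangle is given real-world dimensions, Euclidean distances between warped feet positions approximate ground-plane distances, which is what the social-distancing and speed metrics require.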
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Algorithm 1: Insights Generation Algorithm</head><p>Input: detection_data. Output: insights_object. insights_object ← {"personCountLeft": 0, "personCountRight": 0, "avgSpeed": 0, "estTimeSpent": 0, "socialDistCompliance": 100, "heatmap": </p></div>
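Algorithm 1 can be rendered in Python roughly as follows. The tracker and transform are assumed interfaces (a SORT-style `update()` returning `(x1, y1, x2, y2, obj_id)` tuples), and the metric updates are reduced to path accumulation for brevity.

```python
def generate_insights(detection_data, sort_tracker, perspective_transformation):
    """Sketch of Algorithm 1: turn a batch of per-frame detections into
    per-object ground-plane paths, from which the metrics are derived."""
    object_paths = {}
    for frame_detections in detection_data:
        for x1, y1, x2, y2, obj_id in sort_tracker.update(frame_detections):
            # Track the bottom-centre of the box: a person's feet sit on the
            # ground plane, so they warp cleanly into the bird's-eye view.
            feet = ((x1 + x2) / 2, y2)
            object_paths.setdefault(obj_id, []).append(
                perspective_transformation(feet))
    return object_paths
```

The real implementation additionally folds each updated path set into the insights object (counts, average speed, time spent, compliance, heatmap) on every frame.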
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Thermal Pedestrian Model and Counter</head><p>The three models presented in Table <ref type="table" target="#tab_4">3</ref> were tested using captured imagery from the Street 1 camera. The test dataset included 804 frames (half in direct sunlight, half in shade), with a total of 72 people counted across the frame sequence and 6,480 pedestrian instances. The recall metric (TP/(TP + FN)) was used to evaluate the models, as for this particular use case there is only one class and, for tracking, false positives are not a concern. For this test data, Model v2 clearly outperformed the other two models, as shown in Table <ref type="table" target="#tab_4">3</ref>. The training data used for Model v2 was very similar to the test set, including imagery from both streets. All three models struggled with the direct-sunlight frames, as well as with instances where pedestrians overlapped and there was very little contrast to segment them. The main objective of the captured insights was to communicate the data in the hope of starting discussions which can, in some cases, lead to positive change in the city. The insights data was captured at a 5-minute level of granularity and can be queried through a REST API. The person count (over a 2-month period) for Street 1 is shown in Fig. <ref type="figure" target="#fig_9">8</ref>.</p></div>
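The evaluation metric above is simply:

```python
def recall(tp, fn):
    """Recall = TP / (TP + FN): the fraction of actual pedestrians the model
    detected. Precision is de-emphasised here because the single-class
    tracking use case tolerates false positives."""
    total = tp + fn
    return tp / total if total else 0.0
```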
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Insights Generated</head><p>The first spike in footfall on the 2nd of September, and subsequent spikes thereafter, resulted from renovation work being carried out, with workers walking up and down the laneway throughout the day. This camera was installed to measure the impact that these installations had on pedestrian footfall. It could be argued that the marginal increase in pedestrian traffic seen after the initial spike was a direct result of the renovation work on the laneway. The social distancing compliance on Street 1 is presented in Fig. <ref type="figure" target="#fig_10">9</ref>. The question which could be posed here is whether the media coverage surrounding the increase in Covid-19 case numbers in Ireland resulted in better social distancing compliance in this laneway. Furthermore, measures could be introduced in this laneway to keep pedestrians distanced, and monitored in real time using this system.</p><p>The heatmap overlay is presented in Fig. <ref type="figure">10</ref>. The left-hand side of the laneway appears more popular, as does the top of the frame, where there is a seating area and an entrance to a café. This would suggest that the lane is predominantly used to access the café, and could be used to start a conversation surrounding the pedestrianization of this laneway. Future research should further improve the accuracy of the thermal person detection models, and could also examine how accurately the model detects cars and bicycles; detecting cars and bicycles would offer more insight into how urban spaces are used. Furthermore, there is scope to examine how cloud computing costs could be minimised, potentially by using an intermediate background subtraction layer to identify movement before passing a frame to the inference layer. Finally, over 4.5 million thermal images from several streets have been captured and stored as part of this research. 
Future work will also include the annotation of a large 160x120 thermal imagery dataset.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Conclusion</head><p>To conclude, a thermal imaging and deep learning based platform that uses AI algorithms to collect information on how pedestrians use public spaces has been developed and deployed in Limerick city. The system counts pedestrians (and their direction of movement), measures their average walking pace, the estimated time they spend in the frame, and their compliance with social distancing guidelines, and produces a generalised heat-map. These new understandings can be leveraged at city-planning level to introduce measures and invest in infrastructure that make urban spaces inclusive, safe, resilient and sustainable. The insights and thermal imagery dataset discussed will be released alongside this paper.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 :</head><label>1</label><figDesc>Fig. 1: System Level Diagram</figDesc><graphic coords="2,169.35,314.59,276.66,152.57" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 2 :</head><label>2</label><figDesc>Fig. 2: Street 2, Limerick -direct sunlight affecting natural segmentation.</figDesc><graphic coords="3,190.10,548.66,89.92,67.44" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>(a) Original segmentation (b) Global Thresholding (t=80) (c) Otsu Thresholding (d) Adaptive Mean Thresholding</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Fig. 3 :</head><label>3</label><figDesc>Fig. 3: Thresholding Techniques Results</figDesc><graphic coords="4,134.77,483.39,86.45,64.93" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Fig. 5 :</head><label>5</label><figDesc>Fig. 5: FLIR Thermal Dataset</figDesc><graphic coords="5,264.45,478.75,86.45,68.20" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Fig. 6 :</head><label>6</label><figDesc>Fig. 6: Street 2, Limerick -YOLOv3 thermal person classifier results.</figDesc><graphic coords="7,255.80,125.80,103.74,77.47" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head>Fig. 7 :</head><label>7</label><figDesc>Fig. 7: Bird's Eye Transformation used in image processing</figDesc><graphic coords="7,141.68,544.99,165.98,71.11" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_9"><head>Fig. 8 :</head><label>8</label><figDesc>Fig. 8: Daily Person Count on Street 1</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_10"><head>Fig. 9 :</head><label>9</label><figDesc>Fig. 9: 5-day Rolling Average -Social Distancing Compliance on Street 1 and New Covid-19 Cases in Ireland</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_11"><head>Fig. 10 :</head><label>10</label><figDesc>Fig. 10: Street 1 Heatmap Overlay</figDesc><graphic coords="11,221.22,115.84,172.92,130.20" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Descriptions of datasets used to train each model. Test and training scripts were used to evaluate and generate new iterations of the model. The original version of YOLO trained on RGB images was used as a baseline. As expected, the performance of this model on the captured 160x120 thermal dataset was poor. A transfer learning technique was used to generate a new model based on the FLIR thermal dataset.</figDesc><table><row><cell>Dataset</cell><cell cols="5">Images # of Person # of Car # of Bicycle Model Version</cell></row><row><cell>COCO</cell><cell>328,000</cell><cell>900,000</cell><cell>100,000</cell><cell>20,000</cell><cell>baseline</cell></row><row><cell>FLIR</cell><cell>10,228</cell><cell>28,151</cell><cell>46,692</cell><cell>4,457</cell><cell>v1</cell></row><row><cell cols="2">Street Cameras 922</cell><cell>1,030</cell><cell>0</cell><cell>0</cell><cell>v2</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>Model metrics using the Street 1 validation set</figDesc><table><row><cell cols="3">Model F1 mAP@50 Precision Recall</cell></row><row><cell cols="2">baseline 0.3360 0.1990</cell><cell>0.9530 0.2040</cell></row><row><cell>v1</cell><cell>0.3630 0.2200</cell><cell>0.9500 0.2240</cell></row><row><cell>v2</cell><cell>0.8897 0.9471</cell><cell>0.8904 0.8890</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head></head><label></label><figDesc>[H 25×19]} (the heatmap is initialised as a 25×19 matrix H)</figDesc><table><row><cell>for frame_detections in detection_data do</cell></row><row><cell>tracked_objects ← sort_tracker.update(frame_detections)</cell></row><row><cell>for x1, y1, x2, y2, obj_id in tracked_objects do</cell></row><row><cell>feet ← (x1 + (bbox_w / 2), y2)</cell></row><row><cell>feet_transformed ← perspective_transformation(feet)</cell></row><row><cell>object_paths[obj_id].append(feet_transformed)</cell></row><row><cell>insights_object ← update_insights_metrics(object_paths)</cell></row><row><cell>end</cell></row><row><cell>end</cell></row><row><cell>return insights_object</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 3 :</head><label>3</label><figDesc>Deployment dataset model results</figDesc><table /></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8">Acknowledgements</head><p>With thanks to Darryl Connell and Liam Mulcahy. This publication has emanated from research supported in part by a Grant from Science Foundation Ireland under Grant number 18/CRT/6049.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<ptr target="https://cityxchange.eu/" />
		<title level="m">Home -+CityxChange</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<ptr target="https://population.un.org/wup/Publications/" />
		<title level="m">World Urbanization Prospects -Population Division -United Nations</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The impact of technological innovation on building a sustainable city</title>
		<author>
			<persName><forename type="first">C</forename><surname>Goi</surname></persName>
		</author>
	</analytic>
	<monogr>
<title level="j">International Journal of Quality Innovation</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Redmon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Farhadi</surname></persName>
		</author>
		<ptr target="http://arxiv.org/abs/1804.02767" />
		<title level="m">YOLOv3: An Incremental Improvement</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Simple online and realtime tracking</title>
		<author>
			<persName><forename type="first">A</forename><surname>Bewley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Ge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ramos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Upcroft</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Image Processing (ICIP)</title>
				<imprint>
			<date type="published" when="2016">2016. 2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Thermal imaging systems for real-time applications in smart cities</title>
		<author>
			<persName><forename type="first">R</forename><surname>Gade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Moeslund</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Nielsen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Petersen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Andersen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Basselbjerg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Dam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Jensen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jørgensen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lahrmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Madsen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Bala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Povey</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Computer Applications in Technology</title>
		<imprint>
			<biblScope unit="volume">53</biblScope>
			<biblScope unit="page">291</biblScope>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Comparison of Background Subtraction Methods on Near Infra-Red Spectrum Video Sequences</title>
		<author>
			<persName><forename type="first">T</forename><surname>Trnovszký</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Sýkora</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Hudec</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Procedia Engineering</title>
		<imprint>
			<biblScope unit="volume">192</biblScope>
			<biblScope unit="page" from="887" to="892" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Microsoft COCO: Common Objects in Context</title>
		<author>
			<persName><forename type="first">T</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Maire</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Belongie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hays</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Perona</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ramanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dollár</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zitnick</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer Vision - ECCV 2014</title>
		<imprint>
			<biblScope unit="page" from="740" to="755" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A New Method of Image Detection for Small Datasets under the Framework of YOLO Network</title>
		<author>
			<persName><forename type="first">G</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Fu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 3rd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC)</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<ptr target="https://www.flir.com/oem/adas/adas-dataset-form/" />
		<title level="m">FREE - FLIR Thermal Dataset for Algorithm Training - FLIR Systems</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Simple online and realtime tracking with a deep association metric</title>
		<author>
			<persName><forename type="first">N</forename><surname>Wojke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bewley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Paulus</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Image Processing (ICIP)</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Low-cost implementation of bird&apos;s-eye view system for camera-on-vehicle</title>
		<author>
			<persName><forename type="first">Lin-Bo</forename><surname>Luo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">In-Sung</forename><surname>Koh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kyeong-Yuk</forename><surname>Min</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jun</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jongwha</forename><surname>Chong</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICCE.2010.5418845</idno>
	</analytic>
	<monogr>
		<title level="j">Digest of Technical Papers International Conference on Consumer Electronics</title>
		<imprint>
			<biblScope unit="page" from="311" to="312" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
	<note>ICCE</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Deep Learning vs. Traditional Computer Vision</title>
		<author>
			<persName><forename type="first">N</forename><surname>O'Mahony</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Campbell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Carvalho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Harapanahalli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Hernandez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Krpalkova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Riordan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Walsh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Intelligent Systems and Computing</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="128" to="144" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
