<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">UI element detection from wireframe drawings of websites</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Prasang</forename><surname>Gupta</surname></persName>
							<email>prasang.gupta@pwc.com</email>
							<affiliation key="aff0">
								<orgName type="institution">PwC US Advisory</orgName>
								<address>
									<addrLine>BG House, Lake Boulevard Road, Hiranandani Gardens</addrLine>
									<settlement>Powai</settlement>
									<region>Mumbai</region>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Vishakha</forename><surname>Bansal</surname></persName>
							<email>vishakha.bansal@pwc.com</email>
							<affiliation key="aff0">
								<orgName type="institution">PwC US Advisory</orgName>
								<address>
									<addrLine>BG House, Lake Boulevard Road, Hiranandani Gardens</addrLine>
									<settlement>Powai</settlement>
									<region>Mumbai</region>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">UI element detection from wireframe drawings of websites</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">9ADA2AEADDEE74012F4E63FA56074F9D</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T20:49+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Website UI elements</term>
					<term>UI element extraction</term>
					<term>Image Processing</term>
					<term>OpenCV</term>
					<term>Object Detection</term>
					<term>YOLOv5</term>
					<term>Confidence Cutoff Variation</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>User Interfaces (UIs) wireframe is a crucial part of designing front-end of websites and mobile applications. Detection of UI elements such as paragraphs, buttons, images etc. from the wireframes using advanced Artificial Intelligence (AI) algorithms pave the way to automate the process of conversion of wireframes to Hypertext Mark-up Language (HTML) code. In this paper, we have explored different variants of 5th generation of You Only Look Once (YOLOv5) algorithm and post-processing techniques involving tuning of confidence cut-off variable for detection of UI elements. Our final approach comprises of data pre-processing using contrast normalization and conversion to black and white (BW), detection and localization of UI elements using YOLOv5x variant followed by confidence cutoff for selecting final bounding boxes. This approach resulted in Mean Average Precision (mAP) of 0.836 on the test data.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In recent times, building an online presence through websites and mobile applications has become a necessity for businesses to create global outreach and provide better customer service. Designing such applications is a time-consuming and iterative process. Wireframing serves as a starting point for this process. There are various tools that could be used for creating such wireframes and converting them automatically to code. However, the tools could be expensive and require learning for specific usage.</p><p>The wireframe task from ImageCLEFdrawnUI 2021 Task <ref type="bibr" target="#b0">[1]</ref> which is part of ImageCLEF 2021 <ref type="bibr" target="#b1">[2]</ref> is the second edition in this area and aims at reducing the dependency on the tools and automating the code conversion process by using machine learning for detection and localization of UI elements in the wireframes. The dataset provided as part of this task has been enhanced from its previous edition in terms of volume and class distribution.</p><p>In this study, we focus on creating model-driven approach which is able to identify and localize the bounding boxes of all UI elements present in a wireframe. In the next section, we briefly describe the dataset used for training and validation of the models. In Section 3, we cover the methodology used âĂŞ data pre-processing, modelling and post-processing. In Section 4, we present the results from the final approach. The paper finishes with the conclusion and future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Dataset</head><p>The dataset for the ImageCLEFdrawnUI 2021 competition <ref type="bibr" target="#b0">[1]</ref> included snapshots of hand drawn wireframe images of website layouts. These images included a total of 21 classes of atomic UI elements including images, paragraphs, headers, links etc. The provided dataset included 4291 of such images. These images were divided into a development and a test set. The development set contained 3218 labelled images while the test set contained 1073 unlabelled images. The development set was further divided into a train and a validation set. Out of the 3218 images in the development set, 3058 images were included in the train set and the rest 160 images formed the validation set (about 5% of the development set). This division was done keeping in mind that the distributions of the classes remain as close as possible. The exact distribution is shown in Table <ref type="table" target="#tab_0">1</ref>.</p><p>The images were all RGB images having a myriad of different sizes. The size distribution of the images can be seen in the histograms for height and width distribution in Figure <ref type="figure" target="#fig_0">1</ref>. All the images were later resized to a constant size of 512 x 512 for training purposes. The 21 classes present had different amount of representation in the dataset. Some classes which are commonly found in websites were abundantly present and dominated most of the images in the dataset as well. These classes are paragraph, button, link, image etc. However, some classes which are not as abundantly present in websites such as table, video, stepperinput, list etc. were present in comparatively lesser number in the dataset as well. The distribution of the classes among the dataset can be seen in Figure <ref type="figure" target="#fig_1">2</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methodology</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Data Pre-Processing</head><p>There are different pre-processing techniques that we have employed to improve the viability and performance of our model. As we are dealing with image data, most of our pre-processing steps use OpenCV. We have opted to use OpenCV <ref type="bibr" target="#b2">[3]</ref> on C++ because of the added speed it provides when dealing with large images. Some of the techniques we have used are described in the subsequent sections.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.1.">Contrast improvement</head><p>The images present in the dataset are made by taking snapshots of wireframe drawings of websites drawn by users. These drawings were either made on paper using pen or on a whiteboard using a marker. As the final images are snapshots, they very much depend on the quality of the camera used. Since most of the cameras introduce noise or a brightness overlay on the image, it was expected and was verified that different images had different tints, brightness and contrast. To counter this issue, there are histogram equalisation techniques which change the range of the pixels present in the image from a very confined space to a much larger distribution. This generally results in a much clearer image with better separation. This can be visualised in Figure <ref type="figure" target="#fig_2">3</ref>.</p><p>The histogram equalisation technique is powerful, but it also has a shortcoming. In addition to enhancing the contrast of the image, it also enhances the noise. Since our dataset has a lot of noise due to the nature of how they are collected, we chose to employ the Contrast Limited Adaptive Histogram Equalisation(CLAHE) technique from the OpenCV library which overcomes this shortcoming.  above a specified contrast limit (we have used the default value 40 in this study) and then uniformly distributing the clipped pixels to the other bins. This can be visualised in Figure <ref type="figure" target="#fig_3">4</ref>. Using this technique provided us with much more cleaner images for our dataset. The effect of this technique can be visualised on a sample image from the dataset in Figure <ref type="figure" target="#fig_4">5</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.2.">Conversion to Black and White</head><p>After getting the contrast normalised images, the next step was to convert them into black and white. This would remove any noise present in the image and focus the model to identify only the things that matter i.e. the wireframe drawings.</p><p>There were a lot of iterations performed before by Gupta et. al. <ref type="bibr" target="#b3">[4]</ref> to get the most effective way to convert wireframe grayscale images to black and white. We have directly used the techniques discussed in that paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Modelling</head><p>This problem is an object detection problem at its heart. Hence, a lot of object detection models like Mask-RCNN <ref type="bibr" target="#b4">[5]</ref>, YOLO <ref type="bibr" target="#b5">[6]</ref> and EfficientDet <ref type="bibr" target="#b6">[7]</ref> come to mind. This problem can also be modelled as a segmentation problem. Hence, this adds other famous and proven models like U-Net <ref type="bibr" target="#b7">[8]</ref>, LinkNet <ref type="bibr" target="#b8">[9]</ref>, FPN <ref type="bibr" target="#b9">[10]</ref> and PSPNet <ref type="bibr" target="#b10">[11]</ref> to the list of possible modelling techniques.</p><p>For the purpose of this problem, we started with exploring U-Net and Mask-RCNN. The U-Net model is proven to work well on object detection problems, but considering that the size of the dataset is not huge as compared to deep learning (DL) standards, training a full U-Net would firstly, take a lot of time and secondly, would run into overfitting issues due to the sheer number of parameters involved.</p><p>Mask-RCNN was another great option to go ahead with. It is much easier to train and has been known to perform at par, if not better than U-Net on different use cases <ref type="bibr" target="#b11">[12]</ref>  <ref type="bibr" target="#b12">[13]</ref>. However, as it has already been explored and found to run into problems in detecting smaller UI elements, it was discarded <ref type="bibr" target="#b3">[4]</ref>.  We chose to go with the latest version of YOLOv5 <ref type="bibr" target="#b14">[15]</ref> to ensure speedy inference for real-life use cases as well as flexibility in choosing the right number of parameters based on its different flavours. The general architecture of YOLOv5 is shown in Figure <ref type="figure" target="#fig_5">6</ref>. Also, YOLOv5 is available in 4 different size variants, the details of which are present in Table <ref type="table" target="#tab_1">2</ref> most of which has been explored in this study.</p><p>Before diving into the details of all the different submissions made, let us go over the common elements of all the runs. The images were all resized to size 512 x 512 using OpenCV's linear interpolation method after performing the aforementioned pre-processing steps. Also, the dataset split was chosen to be about 95% for train and 5% for validation. A batch size of 32 was chosen throughout to train except for the extra large YOLO variants where a batch size of 16 was used. The models were trained on Google Colab using a GPU environment with a single NVIDIA Tesla K80 GPU. To establish a baseline for our runs, we used a basic YOLOv5s architecture with no pre-trained weights and trained the model for 100 epochs from scratch. The training metrics for the model can be seen in Figure <ref type="figure" target="#fig_6">7</ref>. We got a precision score of 0.876 and a recall score of 0.835 on our validation dataset. These numbers were decent for a small model with no starting weights. To explore further on how pre-trained weights would affect this, we incorporated that in the next run.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1.">Run 1 : YOLOv5s Baseline</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2.">Run 2 : YOLOv5s with pre-trained weights</head><p>This run included pre-trained weights on the same YOLOv5s architecture, These pre-trained weights were originally generated by training the model on COCO dataset <ref type="bibr" target="#b15">[16]</ref>. COCO dataset is very general in nature with 91 classes and contains 123,287 images. Having been trained on such a large gamut of images, these weights have a lot of information already encoded in them. Hence, very little training suffices for a decent performing model.</p><p>We loaded the model with the pre-trained weights and trained it for 100 epochs keeping all the layers unfrozen. The training metrics for the model can be seen in Figure <ref type="figure" target="#fig_7">8</ref>. We got a significant bump in our validation dataset metrics with a precision score of 0.951 and a recall score of 0.82 on the same. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3.">Run 3 : YOLOv5l</head><p>Having tried out the small variant of YOLOv5, we moved on to the large variant. However, this time we also employed a learning rate scheduler. We used pytorch's <ref type="bibr" target="#b16">[17]</ref> implementation of ReduceLROnPlateau. We also implemented early stopping in this run. Both of these settings were carried forward to the rest of the runs as well. We trained the large model with the pre-trained weights for 200 epochs. The training metrics for the model can be seen in Figure <ref type="figure" target="#fig_8">9</ref>. We got a precision score of 0.964 and a recall score of 0.944 on our validation dataset. This is an improvement over the small model which was expected because of the larger parameters present in the large variant. To extract the highest amount of performance from the YOLO models, we next tried the YOLOv5x variant which is the largest variant present. Because of the huge size of the model parameters, we reduced our batch size to 16. We trained this model with pre-trained weights for 200 epochs and got very similar performance on our validation dataset. The training metrics for the model can be seen in Figure <ref type="figure" target="#fig_9">10</ref>. We got a precision score of 0.961 and a recall score of 0.943 on the same.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.4.">Run 4 : YOLOv5x</head><p>However, the difference was notable in the test performance of the extra large and the large model. The large model got an mAP value of 0.81 while the extra large model got an mAP score of 0.82 on the test set. Even though there are diminishing returns with respect to the speed of the model, as speed was not an issue, we decided to go ahead with the extra large model for further investigation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.5.">Run 5 : YOLOv5x with frozen layers</head><p>Up until now, we were loading the pre-trained weights into the model, and re-training all the layers. In this run, we decided to freeze the early layers and train only the head of the model. This would ideally ensure much faster training time as the number of parameters to be updated currently are huge. We trained only the head (with 0-9 layers frozen of the model) for 100 epochs. The training was much faster, however their was a dip in performance. The training metrics for the model can be seen in Figure <ref type="figure" target="#fig_10">11</ref>. We got a precision score of 0.927 and a recall score of 0.863 on the validation dataset. Since this was a sizeable dip, we planned to go ahead with the model we got in Run 4.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Post-Processing</head><p>After getting the final trained model in Run 4, their were several post-processing methods that were employed. The first method that was employed was the multi-pass inference <ref type="bibr" target="#b3">[4]</ref> and the second method was model confidence variation. We will discuss both of these methods in detail in the following sections.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.1.">Multi-Pass Inference</head><p>This method is predominantly used for increasing the recall score. This technique works by sending the image through the model multiple times, each time removing the objects that were detected earlier and then smartly appending all the outputs together based on different confidence scores and pass numbers.</p><p>We employed this technique to our study, but this did not work well as we found that our model was performing very well in the recall department detecting almost all the UI elements. Hence, there was little to no scope for improvement using this technique. Hence, we scrapped this and went on to our next post processing method.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.2.">Confidence cutoff variation</head><p>Another variable that was found to be important for the performance of our model was the confidence cutoff variable. This cutoff is a hyperparameter for the model. and is responsible for selecting all the bounding boxes that appear in the final results of the model while the others are discarded. There were several different cutoff values we tried with different number of total labels detected in the model. We observed a marginal increase in the performance of the model. The cutoff variable was changed from 25% to 1%. The summary of the results is shown in Table <ref type="table" target="#tab_2">3</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Results and Discussion</head><p>A total of 10 submissions were made. The predictions on the test set images were collated in a csv file. For each image on the test set, the bounding boxes corresponding to each instance of a detected class and the confidence scores were submitted. The mAP and recall scores obtained on the test dataset are shown in Table <ref type="table" target="#tab_3">4</ref>. It has be seen that YOLOv5x performed best of all the YOLOv5 variants and confidence cutoff variable used for post processing is an important factor as it contributed to an increase in the performance of the model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>In this paper, we have built YOLOv5 based model to detect and localize UI elements in wireframes. We also performed contrast normalization for improvement in the quality of the input images to the model and introduced tuning of confidence cut-off variable for improving the output performance of the model. An mAP score north of 0.8 was attained using This approach on the test data which consisted of wireframes containing a range of UI elements helping us gain 2nd position on the leaderboard of wireframe task of ImageCLEFdrawnUI 2021 <ref type="bibr" target="#b0">[1]</ref>. This approach could be integrated further into the pipeline of automating the conversion to front-end code and ensure speedy inference for real-life use cases. There is also a scope of experimenting with the ensemble of two modelling approaches: one for wireframes with more compactly placed UI elements and another for wireframes with less compactly placed UI elements. This would ensure that confidence cutoff variable is correctly tuned and would result in getting the reasonable number of selected bounding boxes for these two different cases.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Distribution of heights and widths of the images in the development set with 20 bins.</figDesc><graphic coords="3,108.88,84.19,187.51,125.01" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Distribution of the classes within the development dataset. The plot on the left shows the total number of occurrences of a class in the dataset. The plot on the right shows the spread of the classes defined as the total number of unique images which have atleast 1 occurence of that class. Both of these plots show that some classes are abundantly present while some are a little less represented.</figDesc><graphic coords="3,108.88,301.60,187.51,125.01" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Histogram equalisation technique changing the narrow histogram of the pixels in the input image to a much more wide distribution.</figDesc><graphic coords="4,203.88,256.27,187.52,84.38" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Histogram equalisation after contrast limiting (CLAHE)</figDesc><graphic coords="4,203.88,532.55,187.51,68.17" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: A depiction of the effect of CLAHE on images. The image on the left is the original image and the one on the right is after applying CLAHE. It can be observed that the processed image is much more clear and legible than the original.</figDesc><graphic coords="5,108.88,84.19,187.50,192.99" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: YOLOv5 architecture [14]</figDesc><graphic coords="6,130.96,153.69,333.37,248.65" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: Train results for the YOLOv5s model with no pre-trained weights.</figDesc><graphic coords="7,89.29,183.75,416.69,208.35" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Figure 8 :</head><label>8</label><figDesc>Figure 8: Train results for the YOLOv5s model with pre-trained weights.</figDesc><graphic coords="8,89.29,84.19,416.69,208.35" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head>Figure 9 :</head><label>9</label><figDesc>Figure 9: Train results for the YOLOv5l model with pre-trained weights.</figDesc><graphic coords="8,89.29,424.18,416.69,208.35" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_9"><head>Figure 10 :</head><label>10</label><figDesc>Figure 10: Train results for the YOLOv5x model with pre-trained weights.</figDesc><graphic coords="9,89.29,183.75,416.69,208.35" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_10"><head>Figure 11 :</head><label>11</label><figDesc>Figure 11: Train results for the YOLOv5x model with pre-trained weights and layers 0-9 frozen (Trained only on the head layer)</figDesc><graphic coords="10,89.29,84.19,416.69,208.35" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Class distribution in the train and validation sets.</figDesc><table><row><cell>Label</cell><cell cols="4">Train Freq Val Freq Train Dist Val Dist</cell></row><row><cell>paragraph</cell><cell>2727</cell><cell>141</cell><cell>0.89</cell><cell>0.88</cell></row><row><cell>label</cell><cell>1004</cell><cell>42</cell><cell>0.33</cell><cell>0.26</cell></row><row><cell>header</cell><cell>2059</cell><cell>121</cell><cell>0.67</cell><cell>0.76</cell></row><row><cell>button</cell><cell>2991</cell><cell>153</cell><cell>0.98</cell><cell>0.96</cell></row><row><cell>image</cell><cell>2462</cell><cell>121</cell><cell>0.81</cell><cell>0.76</cell></row><row><cell>linebreak</cell><cell>2370</cell><cell>128</cell><cell>0.78</cell><cell>0.8</cell></row><row><cell>container</cell><cell>2923</cell><cell>153</cell><cell>0.96</cell><cell>0.96</cell></row><row><cell>link</cell><cell>1031</cell><cell>56</cell><cell>0.34</cell><cell>0.35</cell></row><row><cell>textinput</cell><cell>1453</cell><cell>69</cell><cell>0.48</cell><cell>0.43</cell></row><row><cell>dropdown</cell><cell>688</cell><cell>34</cell><cell>0.22</cell><cell>0.21</cell></row><row><cell>checkbox</cell><cell>663</cell><cell>25</cell><cell>0.22</cell><cell>0.16</cell></row><row><cell>radiobutton</cell><cell>478</cell><cell>14</cell><cell>0.16</cell><cell>0.09</cell></row><row><cell>rating</cell><cell>434</cell><cell>11</cell><cell>0.14</cell><cell>0.07</cell></row><row><cell>toggle</cell><cell>452</cell><cell>13</cell><cell>0.15</cell><cell>0.08</cell></row><row><cell>textarea</cell><cell>418</cell><cell>9</cell><cell>0.14</cell><cell>0.06</cell></row><row><cell>datepicker</cell><cell>468</cell><cell>12</cell><cell>0.15</cell><cell>0.08</cell></row><row><cell>stepperinput</cell><cell>91</cell><cell>3</cell><cell>0.03</cell><cell>0.02</cell></row><row><cell>slider</cell><cell>491</cell><cell>16</cell><cell>0.16</cell><cell>0.1</cell></row><row><cell>video</cell><cell>448</cell><cell>22</cell><cell>0.15</cell><cell>0.14</cell></row><row><cell>table</cell><cell>56</cell><cell>1</cell><cell>0.02</cell><cell>0.01</cell></row><row><cell>list</cell><cell>180</cell><cell>6</cell><cell>0.06</cell><cell>0.04</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Comparison of different variants of YOLOv5 using COCO dataset<ref type="bibr" target="#b14">[15]</ref> </figDesc><table><row><cell>YOLOv5s</cell><cell>43.3</cell><cell>43.3</cell><cell>61.9</cell><cell>4.3</cell><cell>12.7</cell></row><row><cell>YOLOv5m</cell><cell>50.5</cell><cell>50.5</cell><cell>68.7</cell><cell>8.4</cell><cell>35.9</cell></row><row><cell>YOLOv5l</cell><cell>53.4</cell><cell>53.4</cell><cell>71.1</cell><cell>12.3</cell><cell>77.2</cell></row><row><cell>YOLOv5x</cell><cell>54.4</cell><cell>54.4</cell><cell>72.0</cell><cell>22.4</cell><cell>41.8</cell></row></table><note>Model mAP val 0.5:0.95 mAP Test 0.5:0.95 mAP val 0.5 Speed on V100 (ms) Parameters (millions)</note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Different cutoff values tried and corresponding number of labels and test mAP score.</figDesc><table><row><cell cols="4">Confidence Cutoff Total Dataset labels % increase in labels Test mAP scores</cell></row><row><cell>25</cell><cell>101251</cell><cell>0.0</cell><cell>0.820</cell></row><row><cell>20</cell><cell>109848</cell><cell>8.5</cell><cell>0.824</cell></row><row><cell>15</cell><cell>119881</cell><cell>9.1</cell><cell>NA</cell></row><row><cell>10</cell><cell>130765</cell><cell>9.0</cell><cell>0.829</cell></row><row><cell>5</cell><cell>142963</cell><cell>9.3</cell><cell>0.832</cell></row><row><cell>1</cell><cell>165013</cell><cell>15.4</cell><cell>0.836</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4</head><label>4</label><figDesc>Table summarising the runs submitted for the challenge.</figDesc><table><row><cell></cell><cell>Run ID</cell><cell>Model Description</cell><cell>mAP Recall</cell></row><row><cell>Run 1</cell><cell>132552</cell><cell>YOLOv5s baseline</cell><cell>0.649 0.675</cell></row><row><cell>Run 2</cell><cell>132567</cell><cell>YOLOv5s with pre-trained weights</cell><cell>0.649 0.675</cell></row><row><cell>Run 3</cell><cell>132575</cell><cell>YOLOv5l with pre-trained weights</cell><cell>0.810 0.826</cell></row><row><cell>Run 4</cell><cell>132583</cell><cell>YOLOv5x with pre-trained weights , LR, Early Stopping</cell><cell>0.820 0.840</cell></row><row><cell>Run 5</cell><cell cols="3">132592 YOLOv5x with pre-trained weights and only heads trained 0.701 0.731</cell></row><row><cell>Run 6</cell><cell>134090</cell><cell>Run 4 with 0.2 confidence cutoff</cell><cell>0.824 0.844</cell></row><row><cell>Run 7</cell><cell>134099</cell><cell>Run 4 with 0.15 confidence cutoff</cell><cell>0.824 0.844</cell></row><row><cell>Run 8</cell><cell>134113</cell><cell>Run 4 with 0.1 confidence cutoff</cell><cell>0.829 0.852</cell></row><row><cell>Run 9</cell><cell>134133</cell><cell>Run 4 with 0.05 confidence cutoff</cell><cell>0.832 0.858</cell></row><row><cell cols="2">Run 10 134133</cell><cell>Run 4 with 0.01 confidence cutoff</cell><cell>0.836 0.865</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>Thanks to the developers of Google Colab for providing a free GPU environment for model training.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Overview of ImageCLEFdrawnUI 2021: The detection and recognition of hand drawn and digital website uis task</title>
		<author>
			<persName><forename type="first">R</forename><surname>Berari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tauteanu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Fichou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Brie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dogariu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">D</forename><surname>Ştefan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">G</forename><surname>Constantin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ionescu</surname></persName>
		</author>
		<ptr target="WS.org&lt;http://ceur-ws.org&gt;" />
	</analytic>
	<monogr>
		<title level="m">CLEF2021 Working Notes, CEUR Workshop Proceedings</title>
				<meeting><address><addrLine>Bucharest, Romania</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of the ImageCLEF 2021: Multimedia retrieval in medical, nature, internet and social media applications</title>
		<author>
			<persName><forename type="first">B</forename><surname>Ionescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Müller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Péteri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ben Abacha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sarrouti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Demner-Fushman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Hasan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kozlovski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Liauchuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Dicente</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kovalev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Pelka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">G S</forename><surname>De Herrera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Jacutprakart</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">M</forename><surname>Friedrich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Berari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tauteanu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Fichou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Brie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dogariu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">D</forename><surname>Ştefan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">G</forename><surname>Constantin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chamberlain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Campello</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">A</forename><surname>Oliver</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Moustahfid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Popescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Deshayes-Chossart</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Experimental IR Meets Multilinguality, Multimodality, and Interaction, Proceedings of the 12th International Conference of the CLEF Association (CLEF 2021)</title>
		<title level="s">LNCS Lecture Notes in Computer Science</title>
		<meeting><address><addrLine>Bucharest, Romania</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The OpenCV Library</title>
		<author>
			<persName><forename type="first">G</forename><surname>Bradski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Dr. Dobb&apos;s Journal of Software Tools</title>
		<imprint>
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">and novel multi-pass inference technique</title>
		<author>
			<persName><forename type="first">P</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mohapatra</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Html atomic ui elements extraction from hand-drawn website images using mask-rcnn</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Mask r-cnn</title>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gkioxari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dollãąr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Girshick</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICCV.2017.322</idno>
	</analytic>
	<monogr>
		<title level="m">2017 IEEE International Conference on Computer Vision (ICCV)</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="2980" to="2988" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">You only look once: Unified, real-time object detection</title>
		<author>
			<persName><forename type="first">J</forename><surname>Redmon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Divvala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Girshick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Farhadi</surname></persName>
		</author>
		<idno type="DOI">10.1109/CVPR.2016.91</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<date type="published" when="2016">2016. 2016</date>
			<biblScope unit="page" from="779" to="788" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Efficientdet: Scalable and efficient object detection</title>
		<author>
			<persName><forename type="first">M</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1911.09070</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">U-net: Convolutional networks for biomedical image segmentation</title>
		<author>
			<persName><forename type="first">O</forename><surname>Ronneberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fischer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Brox</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Medical Image Computing and Computer-Assisted Intervention -MICCAI 2015</title>
				<editor>
			<persName><forename type="first">N</forename><surname>Navab</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Hornegger</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">W</forename><forename type="middle">M</forename><surname>Wells</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><forename type="middle">F</forename><surname>Frangi</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="234" to="241" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Linknet: Exploiting encoder representations for efficient semantic segmentation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Chaurasia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Culurciello</surname></persName>
		</author>
		<idno type="DOI">10.1109/vcip.2017.8305148</idno>
		<ptr target="http://dx.doi.org/10.1109/VCIP.2017.8305148.doi:10.1109/vcip.2017.8305148" />
	</analytic>
	<monogr>
		<title level="m">IEEE Visual Communications and Image Processing</title>
				<imprint>
			<publisher>VCIP</publisher>
			<date type="published" when="2017">2017. 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Feature pyramid networks for object detection</title>
		<author>
			<persName><forename type="first">T.-Y</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dollãąr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Girshick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Hariharan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Belongie</surname></persName>
		</author>
		<idno type="DOI">10.1109/CVPR.2017.106</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<date type="published" when="2017">2017. 2017</date>
			<biblScope unit="page" from="936" to="944" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Pyramid scene parsing network</title>
		<author>
			<persName><forename type="first">H</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Jia</surname></persName>
		</author>
		<idno type="DOI">10.1109/CVPR.2017.660</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<date type="published" when="2017">2017. 2017</date>
			<biblScope unit="page" from="6230" to="6239" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Comparing Mask R-CNN and U-Net architectures for robust automatic segmentation of immune cells in immunofluorescence images of Lupus Nephritis biopsies</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Durkee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Abraham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Fuhrman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">R</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Giger</surname></persName>
		</author>
		<idno type="DOI">10.1117/12.2577785</idno>
		<ptr target="https://doi.org/10.1117/12.2577785.doi:10.1117/12.2577785" />
	</analytic>
	<monogr>
		<title level="m">Imaging, Manipulation, and Analysis of Biomolecules, Cells, and Tissues XIX</title>
				<editor>
			<persName><forename type="first">I</forename><surname>Georgakoudi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Tarnok</surname></persName>
		</editor>
		<imprint>
			<publisher>SPIE</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">11647</biblScope>
			<biblScope unit="page" from="109" to="115" />
		</imprint>
	</monogr>
	<note>International Society for Optics and Photonics</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Comparing u-net convolutional network with mask r-cnn in agricultural area segmentation on satellite images</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">T P</forename><surname>Quoc</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">T</forename><surname>Linh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">N T</forename><surname>Minh</surname></persName>
		</author>
		<idno type="DOI">10.1109/NICS51282.2020.9335856</idno>
	</analytic>
	<monogr>
		<title level="m">2020 7th NAFOSTED Conference on Information and Computer Science (NICS)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="124" to="129" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">A forest fire detection system based on ensemble learning</title>
		<author>
			<persName><forename type="first">R</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<idno type="DOI">10.3390/f12020217</idno>
	</analytic>
	<monogr>
		<title level="j">Forests</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page">217</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">G</forename><surname>Jocher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Stoken</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Borovec</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nanocode012</surname></persName>
		</author>
		<author>
			<persName><surname>Chaurasia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Taoxie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">V</forename><surname>Changyu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Laughing</surname></persName>
		</author>
		<author>
			<persName><surname>Hogan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Alexwang1900</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Hajek</surname></persName>
		</author>
		<author>
			<persName><surname>Diaconu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Marc</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Kwon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Defretin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Lohia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Milanko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Fineran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Khromov</surname></persName>
		</author>
		<author>
			<persName><surname>Yiwei</surname></persName>
		</author>
		<author>
			<persName><surname>Doug</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Durgesh</surname></persName>
		</author>
		<author>
			<persName><surname>Ingham</surname></persName>
		</author>
		<idno type="DOI">10.5281/zenodo.4679653</idno>
		<ptr target="https://doi.org/10.5281/zenodo.4679653.doi:10.5281/zenodo.4679653" />
		<title level="m">ultralytics/yolov5: v5.0 -YOLOv5-P6 1280 models, AWS, Supervise.ly and YouTube integrations</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<author>
			<persName><forename type="first">T.-Y</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Maire</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Belongie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bourdev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Girshick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hays</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Perona</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ramanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L</forename><surname>Zitnick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dollãąr</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1405.0312</idno>
		<title level="m">Microsoft coco: Common objects in context</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Pytorch: An imperative style, high-performance deep learning library</title>
		<author>
			<persName><forename type="first">A</forename><surname>Paszke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gross</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Massa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lerer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bradbury</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Chanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Killeen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Gimelshein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Antiga</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Desmaison</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kopf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Devito</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Raison</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tejani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chilamkurthy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Steiner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Fang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chintala</surname></persName>
		</author>
		<ptr target="http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf" />
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems 32</title>
				<editor>
			<persName><forename type="first">H</forename><surname>Wallach</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Larochelle</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Beygelzimer</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Alché-Buc</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Fox</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Garnett</surname></persName>
		</editor>
		<imprint>
			<publisher>Curran Associates, Inc</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="8024" to="8035" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
