<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Unsupervised Unknown Unknown Detection in Active Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Prajit T. Rajendran</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Huascar Espinoza</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Agnes Delaborde</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chokri Mraidha</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CEA, List</institution>
          ,
          <addr-line>F-91120, Palaiseau</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>KDT JU</institution>
          ,
          <addr-line>Avenue de la Toison d'Or 56-60, 1060 Brussels</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Laboratoire National de Metrologie et d'Essais</institution>
          ,
          <addr-line>Trappes</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Unknown unknowns in machine learning signify data points outside the distribution of known data and constitute blindspots of traditional machine learning models. As these data points typically involve rare and unexpected scenarios, the models may make wrong predictions, potentially leading to catastrophic situations. Detecting "unknown unknowns" is essential to ensure machine learning systems' reliability and robustness and to avoid unexpected failures in real-world safety-critical applications. This paper proposes Unsupervised Unknown Unknown Detection in Active Learning (U3DAL), an approach to detect "unknown unknowns" in a stream-based data setting using active learning data selection mechanisms that rely on uncertainty and diversity. The effectiveness of the proposed approach is validated on the Imagenet-A dataset and across different metrics, demonstrating that it outperforms existing methods for detecting "unknown unknowns".</p>
      </abstract>
      <kwd-group>
<kwd>Active learning</kwd>
        <kwd>safety</kwd>
        <kwd>unknown unknowns</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <title>1.1. Motivation</title>
<p>Thanks to its ability to make accurate predictions based
on patterns and trends in data, machine learning has
become a popular tool across various industries and
use cases. However, regarding the use of such models
in safety-critical applications, there are some potential
downsides such as distribution shift, adversarial
examples, lack of explainability, out-of-distribution examples,
anomalies, unknown unknowns and more. Unknown
unknowns refer to data points that are outside the
distribution of known data and, therefore, represent blind
spots of traditional machine learning models [1]. These
data points typically involve rare and unexpected
scenarios, and if a model is not able to detect them, it may make
wrong predictions, potentially leading to catastrophic
situations. Model monitoring mechanisms such as purely
uncertainty-based techniques fail in this regard, because
the model is highly confident about its misprediction.</p>
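<p>Why a purely uncertainty-based monitor misses unknown unknowns can be illustrated with a minimal sketch (illustrative only, not code from this paper; the function names are ours):</p>
<preformat><![CDATA[
```python
import numpy as np

def entropy(p):
    """Predictive entropy of a class-probability vector: -sum p log p."""
    p = np.clip(p, 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def uncertainty_monitor(probs, threshold):
    """Flag a prediction as risky only when its entropy exceeds the threshold."""
    return entropy(probs) > threshold

# A borderline prediction is flagged as risky...
print(uncertainty_monitor(np.array([0.6, 0.4]), threshold=0.5))      # True
# ...but a confidently wrong prediction on an unknown unknown passes
# the monitor, because its entropy is low despite the misprediction.
print(uncertainty_monitor(np.array([0.999, 0.001]), threshold=0.5))  # False
```
]]></preformat>
<p>The second point is exactly the failure mode discussed above: low entropy says nothing about whether the confidence is justified.</p>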
<p>Detecting unknown unknowns in machine learning
can be challenging because these are unanticipated issues
that have not been previously encountered or accounted
for in the design phase [2, 3, 4]. Some of the simpler, yet
not fully sufficient, ways to deal with them are discussed
below:
• Anomaly detection: Anomalies may be present
in the data, which may confuse the model into making
confident mispredictions [5]. During testing or
deployment, anomaly detectors could be deployed
to identify potentially anomalous inputs or states.
At training time, it is possible to analyze the data
thoroughly to determine biases and irregularities
so that these anomalies are not passed on to the
model. This is harder when we have no access
to what the true data is and what the anomaly is,
which is typical in stream-based data settings.
• Out-of-distribution detection: Machine
learning models perform poorly when shown data
points which are very different from previously
seen data points [6]. Detecting potential
out-of-distribution samples that may not belong to any
known classes or categories could also help in
identifying potential unknown unknowns. Note
that out-of-distribution samples are a subset of
unknown unknowns, which include all data points
which are high-confidence mispredictions by the
model.
• Adversarial attack detection: Adversarial
inputs may confuse the model into making highly
confident mispredictions, leading to unknown unknowns
[7]. There are various techniques to tackle
adversarial examples, which could also help in
mitigating some unknown unknowns.
• Human-in-the-Loop: Humans are equipped
with conceptual knowledge and hence can identify
potentially dangerous situations with their expert
knowledge. If a human is present in the loop, they
can assist the model by covering its blindspots, hence
mitigating some of the unknown unknowns [8].
• Robustness testing: If it is possible to test the
model under different scenarios, data distributions
and perturbations, some dangers of unknown
unknowns could be mitigated. However, in a
stream-based active learning setting wherein the data
arrives one by one, it is not possible to mitigate the
danger of unknown unknowns in advance; it is
necessary to detect these unknown unknowns in real
time.</p>
        <p>[Figure 1: Block diagram of the proposed approach. The data stream D feeds the prediction model M; a selection criterion with thresholds T = {U, D} decides which points are queried to the annotator for labels until the budget B is exceeded; selected points and labels fill the stream buffer S (limit L), and when the buffer is full it is emptied and the model is re-trained; unknown unknowns are then detected on the anomaly set.]</p>
        <p>The above methods can assist in identifying safety
issues to some extent, but it is not possible to detect or
account for all unknown unknowns.</p>
        <p>To ensure the reliability and robustness of machine
learning systems, it is crucial to detect unknown
unknowns. In this paper, we propose a new approach called
Unsupervised Unknown Unknown Detection in Active
Learning (U3DAL) to detect unknown unknowns in a
stream-based data setting using active learning data
selection mechanisms that rely on uncertainty and diversity
thresholds.</p>
        <p>In active learning, a model is trained with a subset
of initial labeled data. Based on a predefined function
called the acquisition function, the remaining data points
are analyzed to determine which of them are complex or
interesting enough to be labelled by the human [9]. Some
of the common functions include uncertainty, which is
a measure of how confident the model is in its
predictions [10], and diversity, which measures the distance of
the instances in the stream from those already in the
training set [11]. The acquisition function is designed
to select the most informative or diverse data points to
be labeled, within the constraint of the budget, without
compromising on performance [8].</p>
        <p>Stream-based active learning is a type of active
learning wherein the data arrives in a continuous stream [12].
Learning in real time is essential in applications where
the data distribution is time variant. A challenging
aspect of the stream-based learning approach is that it is
not possible to access future data points, and therefore
the decision of whether or not to choose a data point for
querying to the human oracle has to be made as the data
arrives.</p>
        <p>In this paper, we aim to solve the problem of detecting
unknown unknowns in a stream-based active learning
setting in an unsupervised manner, without access to what
constitutes a "good" or "bad" data point
beforehand. As the model has no access to future data points
and needs to make a decision to query each data
point one by one, it is interesting to determine which
points could be potentially unsafe as they arrive. Since
stream-based active learning methods have thresholds
for data selection by design, we hypothesize that these
thresholds can help us determine unknown unknown
data points. Moreover, through our empirical
experiments, we aim to explore the link between the unknown
unknown detection capability and the threshold levels.</p>
        <p>Contributions: This paper proposes an unknown
unknown detection mechanism in a stream-based active
learning application, making use of the thresholds for
uncertainty and diversity. The contributions of this paper
are listed as follows:
• Defined a novel unknown unknown detection
algorithm which uses the thresholds for
uncertainty and diversity to determine low entropy and
high diversity points.
• Conducted an empirical study with the datasets
Mini Imagenet and Imagenet-A, comparing with
state-of-the-art approaches in anomaly detection.
• Studied the impact of the uncertainty and
diversity thresholds over several acquisition functions
in terms of the unknown unknown detection
capability.</p>
      </sec>
    </sec>
    <sec id="sec-related">
      <title>2. Related works</title>
      <p>Detection of unknown unknowns and anomalies in
machine learning is of paramount importance in the case of
deployment in safety critical applications. Several studies
have researched effective techniques to tackle these
problems. Isolation Forest, proposed by Liu et al. [13], is
a powerful anomaly detection algorithm capable of
efficiently handling high-dimensional data, and is a popular
choice in the industry. It utilizes the principle of isolating
anomalies, making it potentially suitable for detecting
unknown unknowns efficiently. The Isolation Forest
algorithm constructs a random forest of isolation trees,
where anomalies are expected to have shorter average
path lengths. Studies such as Liu et al. [13] have
demonstrated the effectiveness of Isolation Forest in identifying
anomalies in diverse applications, including network
intrusion detection and fraud detection. Isolation Forest is
marked by its ability to handle high-dimensional data
and its resistance to outliers, and this makes it a popular
choice in anomaly detection tasks.</p>
      <p>Local Outlier Factor (LOF), introduced by Breunig et al.
[14], is another widely studied anomaly detection
technique. LOF measures the degree of local deviation of
a data point with respect to its neighboring points,
enabling it to identify anomalies based on the concept of
differing densities. Various studies have focussed on the
application of LOF in anomaly detection tasks, such as
Papadimitriou et al. [15], where LOF was applied for outlier
detection in sensor networks. Schubert et al. [16] used
LOF for detecting anomalies in spatial databases. LOF
has been shown to be effective in various domains,
including cybersecurity and finance, where the detection of
unknown unknowns is crucial for identifying emerging
threats or fraud.</p>
      <p>Apart from Isolation Forest and LOF, several other
techniques have also been introduced for anomaly
detection. The Density-Based Spatial Clustering of Applications
with Noise (DBSCAN) algorithm, proposed by Ester et
al. [17], is one such algorithm. DBSCAN groups together
densely connected data points and identifies outliers as
points that do not belong to any cluster. Studies such as
Tang et al. [18] have applied DBSCAN for anomaly
detection in computer networks, reporting its ability to detect
unknown unknowns. Ensemble-based methods, which
combine multiple anomaly detection algorithms, have
also been explored: works such as Hodge and Austin [19]
proposed an ensemble approach that combines Isolation
Forest, LOF, and other techniques to enhance the overall
detection performance.</p>
      <p>By making use of intrinsic data characteristics,
unsupervised anomaly detection methods have the potential to
be applied in various domains such as fraud detection,
cyber security and safety critical applications, where the
identification of unknown unknowns is of paramount
importance to improve safety. In this work, U3DAL, a novel
unsupervised anomaly detection method, is proposed.</p>
    </sec>
    <sec id="sec-method">
      <title>3. U3DAL Method</title>
      <p>Figure 2 demonstrates the quadrant of knowledge in
machine learning [8]. In the top left are the known knowns.
These are the data points which the model is confident
about, and for which it makes correct predictions. Therefore
these data points have a low predictive entropy and are
familiar, hence not too distant from what has already been
seen by the model. Here, we trust the model to make the
correct decisions. Known unknowns are data points which
the model is underconfident about and mispredicts.
The dangerous situations arising from these data points
can be captured easily using uncertainty-based monitors.
Here, we know that the model should not be trusted.
Unknown knowns are human blindspots, such as latent
features, which are nevertheless rich features from the
model's perspective and facilitate better prediction
capabilities. The last category in the quadrant consists of the
unknown unknowns. These are the data points which the
model mispredicts with a high confidence. Therefore
they are characterized by a low predictive entropy (high
confidence) and a high diversity score (different from data
seen previously).</p>
      <p>Figure 1 shows the block diagram of the proposed
approach. As in a typical stream-based learning setting,
there is a prediction model M trained on the initially
available labelled data. The data stream is passed into the
prediction model, and the acquisition function decides
whether the data point should be selected to be labelled
by the human annotator or not. This selection is based on
a preset criterion such as uncertainty or diversity, and
requires thresholds for each of the criteria. Data points
exceeding the threshold are passed on to the annotator to
provide labels. The human oracle can only provide labels
until the budget B is exhausted. The label and data point
are passed to a stream buffer. When the buffer is full, the
data is appended to the previously used training data and
the prediction model M is re-trained. This process
continues till either the budget B runs out, the data stream D
stops or the prediction model reaches a sufficient level of
performance. After each training of the model, it is
possible for the model to be used as an unknown unknown
detector as well, apart from its original functionality of
classification, regression etc. This is the core idea of
U3DAL: making use of the thresholds U, D and the
prediction model M to determine whether a given data point
is an unknown unknown data point or not. If the
normalized (min-max, for instance) predictive entropy of a
given data point is lower than the threshold U, and if its
distance score is greater than the threshold D, U3DAL
classifies that point as an unknown unknown. To
evaluate the efficacy of this approach, U3DAL is compared
with other state-of-the-art approaches such as Isolation
Forest and LOF on the same anomaly set (all data points
of which are curated to be very complex anti-examples)
to compare how many of the unknown unknowns are
detected accurately. Note that the approach is unsupervised
because the model is not provided any prior information
regarding which samples constitute unknown unknowns.</p>
      <p>The labelling that takes place in this pipeline refers to the
human oracle providing class labels to the
corresponding data points, which only influences the
performance of the model on the trained task of
classification and not on unknown unknown detection. The
unknown unknown detection model is based on the
uncertainty and diversity thresholds of selection and is
not dependent on the class labels provided by the human
oracle.</p>
      <p>In U3DAL, the measurement of uncertainty is entropy,
which is a well-established measure in the active learning
domain. Predictive entropy is a measure of the spread of
the probability distribution over all the possible classes.
High entropy indicates increased randomness, which
means that the model is unsure about the true class,
whereas low entropy indicates that the model is confident
in its prediction, regardless of its accuracy. High entropy
data points are usually close to the decision boundary and
therefore can be categorized as the known unknowns of
the model. Identifying these data points which are close
to the boundary and labelling them selectively results in
improved performance without the need to label all
instances.</p>
      <p>The diversity score of a data point is defined as the
Z-score of its distance from the centroid of the instances in
the training set. Each time the model is re-trained, the mean
and standard deviation of these distances are recomputed
over the current training set.</p>
    </sec>
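<p>The decision rule described in the method section (min-max normalized predictive entropy below U together with a distance Z-score above D) can be sketched as follows. This is our illustrative reading of the rule with hypothetical names, not the authors' implementation:</p>
<preformat><![CDATA[
```python
import numpy as np

def predictive_entropy(probs):
    """Spread of the class-probability distribution: -sum p log p."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def is_unknown_unknown(probs, features, centroid, dist_mean, dist_std,
                       ent_min, ent_max, U, D):
    """U3DAL rule: low normalized entropy AND high diversity Z-score."""
    ent = predictive_entropy(probs)
    ent_norm = (ent - ent_min) / (ent_max - ent_min)   # min-max normalization
    dist = float(np.linalg.norm(features - centroid))  # distance to training centroid
    z = (dist - dist_mean) / dist_std                  # diversity score (Z-score)
    return ent_norm < U and z > D
```
]]></preformat>
<p>A confident prediction far from the training centroid is flagged, while an equally confident prediction on a familiar point is not; the entropy range and distance statistics would be recomputed after each re-training, as the paper describes.</p>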
    <sec id="sec-2">
      <title>4. Evaluation</title>
      <sec id="sec-2-1">
        <title>4.1. Test methodology</title>
        <p>This section introduces the test methodology used in this work and presents the experimental results.</p>
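<p>The algorithm below normalizes the running entropy and diversity scores by their current min-max range before comparing them against the thresholds. That range can be tracked online over the stream; a minimal sketch (names are ours, not from the paper):</p>
<preformat><![CDATA[
```python
class RunningRange:
    """Track the running min/max of a score over the stream
    so that new scores can be min-max normalized into [0, 1]."""

    def __init__(self):
        self.lo = float("inf")
        self.hi = float("-inf")

    def update(self, v):
        self.lo = min(self.lo, v)
        self.hi = max(self.hi, v)

    def normalize(self, v):
        if self.hi == self.lo:  # degenerate range early in the stream
            return 0.0
        return (v - self.lo) / (self.hi - self.lo)
```
]]></preformat>
<p>For example, after observing scores 2.0, 4.0 and 3.0, a score of 3.0 normalizes to 0.5, matching the (v − v_min)/(v_max − v_min) step in the algorithm.</p>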
        <p>Algorithm 1: U3DAL algorithm
Require: Data stream: X = x_0, x_1, ...;
Budget: B; Uncertainty selection threshold: U ∈ [0, 1];
Diversity selection threshold: D ∈ [0, 1]; Stream
buffer: S; Current uncertainty range: u_min, u_max;
Current diversity range: d_min, d_max; Training
vector mean: μ; Training vector standard deviation: σ;
Classification model: M, trained on initially labeled
data
Initialize b = 0
while b &lt; B do
  u ← (u − u_min) / (u_max − u_min)
  d ← (d − d_min) / (d_max − d_min)
  if x selected by acquisition function then
    append x to stream buffer S
    b ← b + 1
  end if
  if stream buffer S is full then
    Re-train model M, empty buffer S
    Initialize n = 0
    for each data point in anomaly set do
      if u &lt; U and d &gt; D then
        n ← n + 1
      end if
    end for
  end if
end while
Output: n: Number of detected
unknown unknown data samples in anomaly set</p>
        <p>The proposed algorithm was tested on the classification
problem on the Imagenet-A dataset, which is a
challenging dataset that causes machine learning model
performance to degrade substantially. The authors of [21]
report that on this dataset, well-known CNN models
exhibit an accuracy drop of approximately 90%. The data
points are chosen to be those with limited spurious cues,
collected with a simple adversarial filtration technique.
The Imagenet-A dataset contains images belonging to
fifteen classes; 759 of them constitute the anomaly set in
this work. Each image is resized to the dimension 224 x
224 x 3, which is a standard input shape for most well-known
CNN models used in transfer learning. Note that
the anomaly set shall not be seen by the model at any
stage of training, and will only be used as the dataset
to assess the number of anomalies detected after each
round of training.</p>
        <p>The training set for the classification task consists of
images from the same fifteen classes, taken from the Mini
Imagenet dataset [22]. These images are also resized to the
dimension 224 x 224 x 3 and are 9000 in number, 600 from
each class. 1000 images are selected to be the initial
labelled points in the active learning pipeline, and a further
1000 points are set aside to be the validation set. Data
points from the Mini Imagenet dataset, fed in a stream to
simulate a stream-based active learning setting, are used
to train the model to perform image classification.
Unknown unknown detection is not the trained task of the
model, and is instead accomplished using the selection
thresholds for entropy and diversity achieved during the
training process. Note that the Imagenet-A data samples
are the anomalous samples used solely for testing the
performance of the unknown unknown detection model and
are not seen by the model during training. Figures 2 and 3
demonstrate how the Mini Imagenet dataset consists of
normal data points used for training whereas Imagenet-A
consists of more complex and confusing data points.</p>
        <p>Since the goal of the work is to evaluate unknown
unknown detection in a stream-based setting, the remaining
data points are fed into the active learning pipeline one
by one. The order of the data points to be fed to the
pipeline is shuffled, but the random seed is fixed in order
to facilitate comparison between different settings. In
the following experiments, the maximum budget is set
to be 4000 data points, meaning that at most 4000 data
points out of the dataset are fed to the human oracle for
labeling.</p>
        <p>Transfer learning based on the Mobilenet backbone
[23] is used as the prediction model. As the onus of the
paper is on unknown unknown detection, multiple
architectures were not tested for the prediction model.
However, as the proposed approach is model agnostic, there
are no limitations to applying the same for other model
architectures. The architecture used is: Mobilenet backbone +
GlobalAveragePooling2D + Dense(1024) + Dense(512) +
Dense(100) + Dense(15). The penultimate fully connected
(Dense) layer acts as the base to extract the intermediate
features, in order to compute the diversity score, as well
as input to the baselines of Local Outlier Factor (LOF) [14]
and Isolation Forest [13]. The other parameters are as
follows: B = 4000, Buffer size = 1000. Since the 15-class
classification problem in Mini Imagenet included a total
of 9000 images, the total budget B was set to be 4000
(&lt;50% of all images) to simulate a realistic active learning
setting with limited time and resources. The buffer size
was selected to be 1000 to ensure that the model is not
re-trained too often, to follow time constraints of
training. The Mobilenet backbone was selected because it is
a very popular CNN model used for image classification
tasks. Ablation studies are possible with different
architectures, budget values, buffer sizes and thresholds, and
this is deferred to future work.</p>
        <p>The algorithms used for unknown unknown detection
in this work are as follows:
• Local Outlier Factor (Baseline): Identifies
anomalies with the concept that outliers have
different densities compared to their neighboring
data points.
• Isolation Forest (Baseline): Measures the
anomaly score based on the average path length
required to isolate instances.
• U3DAL (Our approach): Detects anomalous
points as those having a low entropy and high
diversity, in a stream-based setting.</p>
        <p>LOF is a flexible algorithm, and it can handle different
types of data and adapt to various data distributions. It
is particularly useful in situations where the normal data
points exhibit complex patterns. Isolation Forest is
efficient and capable of dealing with high-dimensional
data, and is thus useful for detecting anomalies in
various applications. The above algorithms are extremely
popular in the world of anomaly detection, and they form
a good baseline to evaluate the efficacy of the proposed
method because of their extensive use in the industry.</p>
        <p>The proposed method is evaluated with the following
acquisition functions:
• Random selection: Data points from the stream
are selected at random to be queried to the
annotator.
• Entropy/uncertainty-based selection: Data
points are selected to be labeled if they have a
predictive entropy higher than a preset threshold.
• Distance/diversity-based selection: Data
points are selected to be labeled if they have a
Z-score higher than a preset threshold.</p>
        <p>Uncertainty and diversity based methods are popular
acquisition functions in active learning applications. In
uncertainty-based techniques, the focus is on selecting
instances that the model is unsure about, dealing with the
model blindspots, whereas diversity-based techniques
aim to maximize the diversity of the data points in the
training set, dealing with the data blindspots. Both
approaches possess different advantages, and are popular
choices because they improve the robustness and
generalization of the model. Random selection, on the other
hand, is a common baseline acquisition function in active
learning.</p>
        <sec id="sec-2-3">
          <title>4.2. Experimental results</title>
          <p>To demonstrate that the anomaly set is difficult for the
prediction model, we evaluate the classification accuracy
of the prediction model on the anomaly set over multiple
rounds of active learning. Table 1 shows the classification
accuracy on the anomaly set for each acquisition
function. Note that the initial 1000 data points are the
same for each of the acquisition functions. Subsequently,
due to the differing data selection mechanisms, the
prediction performance differs for each acquisition function.
It can be seen that the classification accuracy over the
anomaly set is significantly lower than that for the
validation set. This illustrates that the samples from the
anomaly set are vastly more challenging than the ones
used for training and validation. It is an expected
result because Imagenet-A was curated to be a challenging
dataset. Since the model confidently mispredicts the data
points, as expected Imagenet-A consists of unknown
unknown data points. Thus, in the following experiments,
the goal is to evaluate which algorithm can determine the
unknown unknown data points contained in the anomaly
set with a higher accuracy score.</p>
          <p>In the first experiment, we compare the variation of
the anomaly detection capability of U3DAL for various
uncertainty and diversity thresholds. Note that in U3DAL
unknown unknown data points are defined to be the low
uncertainty-high diversity data points. This means that
the data points with an entropy lower than the current
threshold and with a diversity score higher than the
current threshold are predicted to be the unknown unknown
data points. Tables 2, 3 and 4 illustrate how the
uncertainty and diversity thresholds influence the number of
unknown unknowns correctly detected for each
acquisition function. Combinations of 0.5, 0.6 and 0.7 were
tested for both the uncertainty and diversity thresholds.
It can be observed that the best configuration for this
anomaly set is U=0.7 and D=0.5. This implies that only
data points with a normalized prediction entropy lower
than 0.7 and those with a normalized Z-score greater than
0.5 are classified as unknown unknown data points. This
configuration is shown to detect the highest number of
unknown unknown data points. The variation amongst
the acquisition functions seems to be insignificant for the
most part.</p>
          <p>Table 2
Variation of the number of unknown unknown data points detected
as a function of the uncertainty threshold (U) and diversity
threshold (D), acquisition function = Random
Threshold  D=0.5  D=0.6  D=0.7
U=0.5      91     69     56
U=0.6      116    85     68
U=0.7      122    88     70</p>
          <p>Table 3
Variation of the number of unknown unknown data points detected
as a function of the uncertainty threshold (U) and diversity
threshold (D), acquisition function = Uncertainty
Threshold  D=0.5  D=0.6  D=0.7
U=0.5      84     62     46
U=0.6      96     69     52
U=0.7      108    77     58</p>
          <p>Table 4
Variation of the number of unknown unknown data points detected
as a function of the uncertainty threshold (U) and diversity
threshold (D), acquisition function = Diversity
Threshold  D=0.5  D=0.6  D=0.7
U=0.5      90     69     55
U=0.6      100    76     57
U=0.7      104    78     59</p>
          <p>In the second experiment, we stack the baseline
outlier detection methods of LOF and Isolation Forest
against U3DAL in this use case. We observed that in
the stream-based setting with a challenging anomaly set,
U3DAL outperformed both LOF and Isolation Forest in
detecting the unknown unknown data points contained
in the anomaly set. Table 5 reports the number of
unknown unknown data points detected by LOF, Isolation
Forest and the proposed method U3DAL. It can be seen
that U3DAL, making use of the uncertainty and diversity
thresholds, is able to detect a greater number of unknown
unknowns than the baseline methods. As the active
learning cycle proceeds and more data points are labelled by
the oracle, we can see an improvement in the unknown
unknown detection of all the algorithms. This is expected
because as the model is trained further, the predictive
performance (influencing the uncertainty score) and the
richness of the features (influencing the diversity score)
improve drastically. As the model also comes across
more data points, it learns the distribution of the data
better, and when the normalized entropy scores and
diversity scores are computed, the thresholds become a
better filter for detecting unknown unknowns. In an
adaptive threshold setting wherein the threshold changes
to adapt to data distribution shift, the performance could
be expected to be even better, although that is out of the
scope of this work.</p>
        </sec>
      </sec>
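<p>The threshold sweep reported above amounts to counting, for each (U, D) pair, the anomaly-set points whose normalized entropy falls below U while their Z-score exceeds D. A minimal sketch of that evaluation loop (names are ours, for illustration only):</p>
<preformat><![CDATA[
```python
import numpy as np

def count_unknown_unknowns(ent_norm, z_scores, U, D):
    """Count anomaly-set points with normalized entropy below U
    and diversity Z-score above D."""
    return int(np.sum((ent_norm < U) & (z_scores > D)))

def threshold_grid(ent_norm, z_scores, us=(0.5, 0.6, 0.7), ds=(0.5, 0.6, 0.7)):
    """Reproduce the layout of the threshold tables: rows indexed by U,
    columns by D."""
    return {u: {d: count_unknown_unknowns(ent_norm, z_scores, u, d) for d in ds}
            for u in us}
```
]]></preformat>
<p>Because raising U only admits more points and raising D only excludes points, the counts must grow along each column and shrink along each row, which is the pattern visible in the reported tables.</p>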
    </sec>
    <sec id="sec-3">
      <title>5. Conclusion</title>
      <p>[21] D. Hendrycks, K. Zhao, S. Basart, J. Steinhardt,
D. Song, Natural adversarial examples, in: Proceedings
of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition, 2021, pp. 15262–15271.
[22] O. Vinyals, C. Blundell, T. Lillicrap, D. Wierstra,
et al., Matching networks for one shot learning,
Advances in Neural Information Processing Systems
29 (2016).
[23] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko,
W. Wang, T. Weyand, M. Andreetto, H. Adam,
MobileNets: Efficient convolutional neural networks
for mobile vision applications, arXiv preprint
arXiv:1704.04861 (2017).</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>