<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Hybrid-AI Approach for Competence Assessment of Automated Driving Functions</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jan-Pieter Paardekooper</string-name>
          <email>jan-pieter.paardekooper@tno.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mauro Comi</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Corrado Grappiolo</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ron Snijders</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Willeke van Vught</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rutger Beekelaar</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Radboud University, Donders Institute for Brain</institution>
          ,
          <addr-line>Cognition and Behaviour, Nijmegen</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>TNO - Data Science</institution>
          ,
          <addr-line>Den Haag</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>TNO - Integrated Vehicle Safety</institution>
          ,
          <addr-line>Helmond</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>TNO - Monitoring &amp; Control Services</institution>
          ,
          <addr-line>Groningen</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
<p>An increasing number of tasks are being taken over from the human driver as automated driving technology is developed. Accidents have been reported in situations where the automated driving technology was not able to function according to specifications. As data-driven Artificial Intelligence (AI) systems are becoming more ubiquitous in automated vehicles, it is increasingly important to make AI systems situationally aware. One aspect of this is determining whether these systems are competent in the current and immediate traffic situation, or whether they should hand over control to the driver or safety system. We aim to increase the safety of automated driving functions by combining data-driven AI systems with knowledge-based AI into a hybrid-AI system that can reason about competence in the traffic state now and in the next few seconds. We showcase our method using an intention prediction algorithm that is based on a deep neural network and trained with real-world data of traffic participants performing a cut-in maneuver in front of the vehicle. This is combined with a unified, quantitative representation of the situation on the road, in the form of an ontology-based knowledge graph with first-order logic inference rules, which takes as input both the observations of the sensors of the automated vehicle and the output of the intention prediction algorithm. The knowledge graph utilises the two features of importance, based on domain knowledge, and doubt, based on the observations and information about the dataset, to reason about the competence of the intention prediction algorithm. We have applied the competence assessment of the intention prediction algorithm to two cut-in scenarios: a traffic situation that is well within the operational design domain described by the training data set, and a traffic situation that includes an unknown entity in the form of a motorcycle that was not part of the training set. In the latter case the knowledge graph correctly reasoned that the intention prediction algorithm was incapable of producing a reliable prediction. This shows that hybrid AI for situational awareness holds great promise to reduce the risk of automated driving functions in an open world containing unknowns.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The current application is a DNN that predicts the intention
of other road users to merge into the lane of the ego vehicle
(cut-in maneuver). This is combined with a knowledge
graph of the traffic state that relates the current situation
to what the predictor has learned from the training data.
The knowledge graph reasoner returns an estimate of
the reliability of the predictor, which it forecasts into the
immediate future (2 seconds ahead) so that it can warn the
driver or safety system in advance that a takeover of control
is imminent.</p>
    </sec>
    <sec id="sec-2">
      <title>Related work</title>
      <p>In the automotive domain, situation awareness (Endsley
1995) is a term often used to describe the readiness of human
drivers to make good decisions (Endsley 2020). It is based
on perception of the environment, comprehension of the
current situation, and projection into the future. Here we extend
situation awareness to an AI system in the vehicle, where we
add the assessment of competence in the current situation to
the comprehension of the current situation.</p>
      <p>
        Machine learning methods tend to underperform when the
distributions of the test dataset and training dataset differ
significantly. Throughout the paper, we refer to data samples
drawn from the training set distribution as in-distribution
(ID), and to samples drawn from a different distribution as
out-of-distribution (OOD). DNNs often attribute high
confidence to the classification or prediction of OOD
samples
        <xref ref-type="bibr" rid="ref3">(Hein, Andriushchenko, and Bitterwolf 2019)</xref>
        ; this
behaviour, which is especially valid for softmax classifiers
(Hendrycks and Gimpel 2016), can have dramatic
consequences in applications where model reliability and safety
are priorities. Various papers attempt to increase models’
robustness by calibrating the predicted probability estimates
(Guo et al. 2017) or by injecting small perturbations to the
input data (Liang, Li, and Srikant 2017). Density estimation
methods are also leveraged to detect OOD observations: the
likelihood over the in-distribution sample space can be
approximated (Dinh, Sohl-Dickstein, and Bengio 2016) (Ren
et al. 2019) and used to compute the likelihood of new
observations, thus detecting those samples that lie in low-density
regions.
      </p>
      <p>Another way of dealing with OOD observations is to have
the DNN output more accurate certainty values. Several
approaches have been described in the literature, ranging from
Monte Carlo Dropout (Gal and Ghahramani 2016) to adding
a Gaussian distribution over the weights in the last layer of
a ReLU network (Kristiadi, Hein, and Hennig 2020). It has
been shown that Bayesian deep learning is important for the
safety of automated vehicles (McAllister et al. 2017).</p>
    </sec>
    <sec id="sec-3">
      <title>Method</title>
<p>To assess the competence of data-driven-AI automated-driving
capabilities we propose a pipelined framework,
depicted in Figure 1. The framework receives as input the
observations of the current road situation and, via a pipelined
information flow, outputs the decision on whether the
driving mode should remain autonomous or should be handed
over to the human driver or backup safety system. The
framework’s internal structure is divided into three modules:
Intention Predictor, Reasoner and Competence Assessment.</p>
      <p>Raw observations related to each target vehicle, such as
their speed and acceleration, are fed to the Intention
Predictor. This module processes the information via two
submodules. The first one is a deep neural network trained to
output (predict) whether a given target vehicle will perform a
cut-in maneuver (Cut-in Classifier). The second sub-module
is a Feature Uncertainty Estimator. It holds univariate
densities of the classifier’s training set input features and
provides information on the in-distribution likelihood of the
network’s input data.</p>
      <p>The Intention Predictor’s output, the observations related
to road geometry (e.g. presence of entry lanes) and lane
visibility are fed to the framework’s second module, the
Reasoner. The reasoner — characterised by an ontology and
first-order logic rules — fuses the input observations with
domain knowledge (encoded in the ontology and in the
rules) into a knowledge graph. The graph realises the
framework’s situational awareness, as it holds a unified
representation of the current situation and is aware of what entities
are important and doubtful.</p>
      <p>The graph is then fed to the last module of our
framework, Competence Assessment. The module first
organises its present and past situation-aware knowledge. Then
it projects such knowledge into the future. Finally, it decides
whether such forecast is outside the autonomous system’s
competence level.</p>
      <p>In the next part, we will describe each module in more
detail.</p>
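      <p>As an illustration of the pipelined flow described above, the following Python sketch wires the three modules together. All module internals, names and numeric values are hypothetical placeholders, not the authors' implementation; each module is detailed in the sections that follow.</p>
      <p>
```python
# Minimal sketch of the pipeline: Intention Predictor -> Reasoner ->
# Competence Assessment. Module internals are illustrative placeholders.

def intention_predictor(tv_observations):
    # Returns a cut-in confidence for a target vehicle plus a feature
    # uncertainty derived from training-set densities (placeholder values).
    return {"cutin_confidence": 0.9, "feature_uncertainty": 0.2}

def reasoner(prediction, road_geometry, lane_visibility):
    # Fuses observations and predictions into a knowledge graph; here the
    # graph is summarised as (element, importance, doubt) triples.
    lane_doubt = round(1.0 - lane_visibility, 1)
    return [("entry_lane", "high", lane_doubt),
            ("cutin_prediction", "medium", prediction["feature_uncertainty"])]

def competence_assessment(graph_elements, threshold=0.7):
    # Embeds the graph (importance-weighted doubt) and thresholds competence.
    weights = {"low": 1.0, "medium": 2.0, "high": 3.0}
    total = sum(weights[imp] for _, imp, _ in graph_elements)
    embedding = sum(weights[imp] * d for _, imp, d in graph_elements) / total
    competence = 1.0 - embedding
    return "AD mode" if competence >= threshold else "takeover"

decision = competence_assessment(
    reasoner(intention_predictor({}), road_geometry={}, lane_visibility=0.9))
```
</p>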
      <sec id="sec-3-1">
        <title>Simulation Environment</title>
        <p>For simulation we use CARLA, an Open Source simulator
which aims to support the development, training, and
validation of autonomous driving systems (Dosovitskiy et al.
2017). The scenarios are defined using OpenSCENARIO
(https://www.asam.net/standards/detail/openscenario/), an
open format used to describe synchronized maneuvers of
vehicles.</p>
        <p>For a given location of the Ego Vehicle (EV), we use the
API of CARLA to extract a world model of the road
situation. This world model includes the number of lanes, the
presence of an entrance lane and all Target Vehicles (TVs).
For each TV, the velocity, acceleration, angle and position
relative to the EV are determined. The lane visibility v is
calculated as v = d / s, bounded such that 0 ≤ v ≤ 1. Here, d is the
distance to the closest TV on that lane and s the scope of the EV in
meters (s := 50 m by default).</p>
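        <p>A minimal sketch of this visibility computation, assuming v is capped at 1 when the closest vehicle lies beyond the scope:</p>
        <p>
```python
def lane_visibility(distance_to_closest_tv, scope=50.0):
    """Lane visibility v = d / s, bounded to [0, 1].

    d is the distance (m) to the closest target vehicle on the lane and s
    the scope of the ego vehicle (50 m by default). When no vehicle lies
    within scope, the lane counts as fully visible (v = 1); this capping
    is an assumption of the sketch.
    """
    return min(distance_to_closest_tv / scope, 1.0)
```
</p>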
      </sec>
      <sec id="sec-3-2">
        <title>Intention predictor</title>
        <p>Predicting the intention of a vehicle to perform a cut-in can
be framed as a binary classification task; the two labels to
classify are “cut-in” and “not cut-in”. A data point labeled as
“cut-in” refers to the collected information at timestep t for a
TV that performs a cut-in between t and t + 2s. Since more
than one vehicle can be present at the same time, multiple
data points can be collected at t.</p>
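        <p>The labelling scheme above can be sketched as follows, assuming a hypothetical data layout in which the cut-in start times of a given TV are known:</p>
        <p>
```python
def label_datapoints(observation_times, cutin_times, horizon=2.0):
    # A sample collected at time t for a given TV is labelled "cut-in" when
    # that TV starts a cut-in within the next `horizon` seconds, otherwise
    # "not cut-in". Comparisons are written with >= only for portability.
    labels = []
    for t in observation_times:
        positive = any((tc - t) >= 0.0 and (horizon - (tc - t)) >= 0.0
                       for tc in cutin_times)
        labels.append("cut-in" if positive else "not cut-in")
    return labels
```
</p>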
        <sec id="sec-3-2-1">
          <title/>
          <p>The dataset consists of 24305 data points, divided into
6348 instances labeled as “cut-in” and 17957 ones
labeled as “not cut-in”, drawn from the StreetWise database
(Paardekooper et al. 2019). While completeness measures
of this database have been developed (de Gelder et al. 2019),
the dataset used for training the intention predictor does not
cover the entire spectrum of cut-ins that are to be expected
in real-life traffic. However, for the purpose of this work a
complete dataset is not essential, as we are interested in
situations that the intention predictor has not been trained
for.</p>
          <p>To date, a variety of physics-based and data-driven
approaches have been developed to detect spatio-temporal
patterns in road users’ behaviour. Specifically, DNNs have been
commonly adopted for classification purposes as they can
often outperform other methods for high-dimensional data
(Sakr et al. 2017).</p>
          <p>The DNN we developed for this study is a two-layer fully
connected network trained with gradient-based
backpropagation. The input x ∈ R^m is mapped into an output y ∈ R^n,
where m = 30 and n = 1 are respectively the input and
output dimensionality. The two hidden layers contain 512
neurons each and are activated by a ReLU function (Nair
and Hinton 2010). The 30 features used as input represent
continuous values related to the dynamics of a TV present
at a given time t. Some of these variables, such as the TV’s
speed, acceleration, and the relative lateral and longitudinal
distance to EV, have been directly collected in real-life
driving scenarios; other variables are the result of feature
engineering techniques to develop expressive variables. The
output, which is a single non-linear sigmoid layer defined over
the domain o ∈ [0, 1], represents the predicted confidence in
the TV performing a cut-in within the next 3 seconds. The
result of this two-class logistic regression is converted into
a binary output by defining a maximum threshold on the
output.</p>
          <p>
            During training, the cross-entropy logarithmic loss is
weighted for the two different classes to take into account
their imbalance in the dataset. The classification threshold, the
learning rate and the number of neurons per layer are fine-tuned
using a Bayesian approach for global optimization
            <xref ref-type="bibr" rid="ref2">(Brochu,
Cora, and De Freitas 2010)</xref>
            . To reduce overfitting, dropout
and early stopping are used.
          </p>
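          <p>A sketch of the forward pass of the architecture described above (30 inputs, two 512-unit ReLU hidden layers, a sigmoid output unit, and a decision threshold). The weights are random stand-ins for the trained parameters, and the 0.5 threshold is illustrative; the fine-tuned threshold is not reported here.</p>
          <p>
```python
import numpy as np

rng = np.random.default_rng(0)

# Two fully connected hidden layers of 512 ReLU units each and a single
# sigmoid output, as described in the text. Weights are random stand-ins.
m, h, n = 30, 512, 1
W1, b1 = rng.normal(0, 0.05, (h, m)), np.zeros(h)
W2, b2 = rng.normal(0, 0.05, (h, h)), np.zeros(h)
W3, b3 = rng.normal(0, 0.05, (n, h)), np.zeros(n)

def predict_cutin(x, threshold=0.5):
    a1 = np.maximum(0.0, W1 @ x + b1)           # first ReLU layer
    a2 = np.maximum(0.0, W2 @ a1 + b2)          # second ReLU layer
    o = 1.0 / (1.0 + np.exp(-(W3 @ a2 + b3)))   # sigmoid confidence in [0, 1]
    return float(o[0]), bool(o[0] >= threshold)  # confidence, binary label
```
</p>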
          <p>Feature uncertainty estimator To assess the robustness
of the trained DNN predictor on unseen test scenarios, we
first analyzed the univariate distribution of each feature xᵢ
in the training set X = {x₁, …, x₃₀}. Among these features,
the most expressive ones for situational awareness were
extracted for further analysis. The following features were
chosen: the absolute EV and TVs' velocity and acceleration,
their relative velocity and acceleration, the relative
longitudinal and lateral distance between the vehicles, their relative
heading, and the distance between the EV and the closest lane
marker. A desired characteristic of these features concerns
their distribution. We observed that the distributions of these
data samples resemble multimodal skewed
distributions when the dynamic properties of the vehicles
change incrementally over time. Such distributions can be
approximated by traditional non-parametric density
estimation methods, such as Kernel Density Estimation (KDE)
(Parzen 1962).</p>
          <p>KDE is a technique used to reconstruct the probability
density function of given data samples, and it can be adopted
for a single feature (univariate KDE) or to multiple features
(multivariate KDE). In the case of the univariate version,
this technique consists of fitting a kernel function, such as
a Gaussian, over each of the k samples in the chosen
feature vector. The resulting k densities are then summed and
normalized to return the final density estimate of the
feature. The main hyperparameter of KDE, the bandwidth h,
controls the variance of the kernel function; its value
determines how smooth the final density estimate is. The
optimization of this parameter, which is necessary to guarantee
that the kernel function fits the data samples correctly, was
optimized using the Maximum Likelihood Cross-Validation
(MLCV) approach (Habbema et al. 1974):

MLCV(h) = (1/k) Σᵢ log[ Σ_{j≠i} K((xᵢ − xⱼ)/h) ] − log[(k − 1)h]   (1)

where k is the number of data samples to fit, K(·) is a Gaussian
kernel, xⱼ is a data point over the chosen domain,
and xᵢ is the i-th sample in the feature vector. Once the
final density estimate is computed, it is possible to evaluate
the likelihood of new samples for each feature; this
computation can be performed synchronously with the observation
of new data in unseen scenarios, as required in our study. For
practical purposes, the log-likelihood of the samples is used
instead of the likelihood.</p>
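          <p>A runnable sketch of this estimator, covering the univariate Gaussian KDE, an MLCV-style bandwidth search over a small illustrative grid, and the likelihood-ratio feature uncertainty defined in this section. To keep the uncertainty bounded in [0, 1] as described, the sketch uses the density ratio p(x)/p_max computed in log space, rather than the raw ratio of log-likelihoods; this is an assumption of the sketch.</p>
          <p>
```python
import numpy as np

def kde_loglik(samples, queries, h):
    # Univariate Gaussian KDE: average kernel density over the samples,
    # evaluated at each query point; returns log-densities.
    diffs = (queries[:, None] - samples[None, :]) / h
    dens = np.exp(-0.5 * diffs**2).sum(axis=1) / (len(samples) * h * np.sqrt(2 * np.pi))
    return np.log(dens + 1e-300)

def mlcv_bandwidth(samples, grid=(0.1, 0.2, 0.5, 1.0)):
    # Leave-one-out maximum-likelihood cross-validation over a small grid:
    # pick the bandwidth maximising the summed log density of each sample
    # under the KDE fitted on the remaining samples (Eq. 1).
    k = len(samples)
    def score(h):
        d = (samples[:, None] - samples[None, :]) / h
        kern = np.exp(-0.5 * d**2) / np.sqrt(2 * np.pi)
        np.fill_diagonal(kern, 0.0)
        return np.sum(np.log(kern.sum(axis=1) / ((k - 1) * h) + 1e-300))
    return max(grid, key=score)

def feature_uncertainty(train_features, observation):
    # Per-feature ratio of the observation's density to the maximum training
    # density, averaged over features and subtracted from 1: near 0 for
    # in-distribution observations, near 1 for out-of-distribution ones.
    ratios = []
    for col, x in zip(train_features.T, observation):
        h = mlcv_bandwidth(col)
        lmax = kde_loglik(col, col, h).max()   # pre-computable maximum
        r = min(np.exp(kde_loglik(col, np.array([x]), h)[0] - lmax), 1.0)
        ratios.append(r)
    return float(1.0 - np.mean(ratios))
```
</p>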
          <p>A main assumption in our investigation is that samples
with low likelihood lead to higher uncertainty on the DNN's
competence. To quantify this intuition, we define the ratio rᵢ as

rᵢ = L(xᵢ | Mᵢ) / L_max,i   (2)

where xᵢ is an observation that belongs to the i-th feature.
The value rᵢ represents the ratio between the estimated
log-likelihood of the new sample xᵢ given the fitted model Mᵢ
and the maximum log-likelihood L_max,i observed for the
i-th feature. The maximum log-likelihood was pre-computed
and stored during the kernel fitting phase. Finally, we define
the feature uncertainty u as

u = 1 − (1/m) Σᵢ rᵢ   (3)

with the sum running over the m features, where m is the
dimensionality of the feature space x. This
quantity is intrinsically related to the frequency of the
observation in the training set and reflects our previously
mentioned assumption on the DNN's competence. The
subtraction guarantees that the feature uncertainty tends to 1 when
all the features are out-of-distribution, thus maximizing the
uncertainty in the predictor's output, and to 0 when the
features are in-distribution, thus following the same trend as the
competence.</p>
          <p>Reasoner The second module of our framework, the
Reasoner, is in charge of aggregating all observations — namely
target vehicle data, road geometry, lane visibility and the
output of the Intention Predictor — into a unified and
quantitative representation of the situation on the road. This
view is represented by means of a knowledge graph based
on an underlying ontology and a set of first-order logic
inference rules. We will hereafter refer to the ontology-rule pair
as the schema. The Reasoner is implemented in Grakn
(https://grakn.ai/, last accessed 18 December 2020). The
ontology specifies (part of) the automotive domain via
entities, attributes and relations. Examples of entities are vehicles
and road lanes. An example of relations is “drive-on”,
linking vehicles and lanes. An example of attributes is
“distance-from-ego”, which both vehicles and lanes have.</p>
          <p>Given a set of observations in input, the reasoner first
initialises the related knowledge graph by creating nodes —
corresponding to entities and attributes — and edges —
corresponding to relations. Subsequently, the rules, defined as
Horn clauses (Horn 1951), augment the graph by creating
the two attributes “importance” and “doubt”, linked to
entities and relations. The importance aims to encode expert
knowledge of the automotive domain. Its purpose is
to categorise and rank nodes and edges. The doubt, on the
other hand, can be interpreted as a measure of uncertainty
associated to the nodes and edges. Its purpose is to assign a
single type of weight across all graph elements. The
two features are orthogonal to each other: the schema could
specify that a fully visible entry lane is important
independently of its doubt value. On the other hand, the cut-in
classifier prediction for a target vehicle that drives far away from
the ego vehicle, yet in an erratic way (high feature uncertainty), could
have a high doubt value associated to it and, concurrently, a
low importance value because of its position. We consider
three possible importance values, namely low, medium and
high, and 11 doubt values, bounded in the [0, 1] interval and
equidistant from each other (0, 0.1, 0.2, …, 1).</p>
        </sec>
        <sec id="sec-3-2-2">
          <title/>
          <p>
            An excerpt of the ontology’s entity organisation is
depicted in Fig.2, whilst a schematic representation of the
relations is shown in Fig. 3. The entities are organised
hierarchically and along three main branches: one
representing the possible vehicles, one the driving infrastructure, and
one the computational models external to the reasoner. The
non-ego vehicles are divided into two key-categories: known
and unknown. The categorisation is done based on the types
of vehicles present in the dataset the cut-in classifier was
trained on. For instance, if the dataset contained only
passenger cars, the passenger-car entity would be placed under the known
branch, whilst other vehicles such as lorries and
motorcycles would be inferred as children of the unknown entity. The
known/unknown information associated to observed TVs is
used by the rules to assign doubt values to the classifier’s
output and importance values to the graph nodes. The
driving infrastructure describes all non-vehicle entities present
on the road, such as lanes, ramps and signs, in accordance
with
            <xref ref-type="bibr" rid="ref6">(Zhao et al. 2015; Czarnecki 2018a,b)</xref>
            . Lanes have a
fundamental attribute: visibility. The rules implement a
negative correlation: the lower the visibility, the higher the doubt
associated to that lane. In this way, the framework aims to
speculate about the possible existence of hidden entities in
adjacent lanes. Computation entities represent framework
models which process raw observations to generate new
information, in our case the cut-in classifier. In case the models
are machine learned, the Reasoner infers doubt values
associated to the model outputs from the related
in-distribution likelihood values: the
lower the likelihood, the higher the doubt. An example of an
observation-reasoned knowledge graph is shown in Fig. 4.
          </p>
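          <p>The rule behaviour described here can be illustrated with a toy sketch. The function names, the quantisation into the 11 equidistant doubt levels, and the specific doubt values assigned to unknown vehicle types are illustrative assumptions, not the actual Grakn rule base.</p>
          <p>
```python
KNOWN_VEHICLES = {"passenger_car"}  # vehicle types present in the training set

def quantise_doubt(x):
    # Doubt takes one of 11 equidistant values in [0, 1]: 0.0, 0.1, ..., 1.0.
    return round(min(max(x, 0.0), 1.0), 1)

def lane_doubt(visibility):
    # Negative correlation: the lower the visibility, the higher the doubt,
    # so the framework can speculate about hidden entities in adjacent lanes.
    return quantise_doubt(1.0 - visibility)

def vehicle_doubt(vehicle_type):
    # Vehicle types absent from the training set get maximal doubt
    # (illustrative assumption).
    return 0.0 if vehicle_type in KNOWN_VEHICLES else 1.0

def model_output_doubt(in_distribution_likelihood_ratio):
    # The lower the in-distribution likelihood of the model's inputs,
    # the higher the doubt assigned to the model's output.
    return quantise_doubt(1.0 - in_distribution_likelihood_ratio)
```
</p>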
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>Competence Assessment</title>
        <p>The last module in our framework — Competence
Assessment — leverages previous and current knowledge graphs to
determine whether the EV should maintain an autonomous
driving modality or leave the control to the human driver
or backup safety system. Competence Assessment follows a
remember-forecast-decide processing flow.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Remember</title>
        <p>A time-indexed memory of graph embeddings e₁, …, e_t
is kept. The embedding of a particular time corresponds to
a single value encoding the graph related to that particular
time's road observations. Currently, the embedding
procedure corresponds to a weighted average of all doubts, where
the weights are associated to the relative importance values:
the higher the importance, the higher the weight.</p>
        <p>Forecast The remembered embeddings represent, albeit in
a compact way, reasoned (importance/doubt-aware)
situations. Intuitively, the lower an embedding, the more
competent the autonomous vehicle was in that situation, as low
importance and doubt attributes would predominantly exist
in the corresponding graph. We therefore define the
competence related to a graph embedding eᵢ as

cᵢ = 1 − eᵢ   (4)

Currently, the framework implements a linear regressor to
extrapolate future embeddings, based on the assumption that a
short-term linear dependency across observations holds.</p>
        <p>Decide The decision whether the driving should remain
autonomous or be handed over to a human is made based on
the lowest forecast competence value

ĉ_min = min ĉᵢ, i ∈ [1, …, T]   (5)

where T is the prediction horizon, and by comparing it to an
assessment threshold τ_c:

decision = takeover if ∃ i: ĉᵢ &lt; τ_c; AD mode otherwise   (6)

where AD stands for Autonomous Driving.</p>
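        <p>The remember-forecast-decide flow can be sketched as follows, assuming the competence of an embedding is its complement (c = 1 − e) and a linear extrapolation of the embedding history; the importance weights and the threshold value are illustrative.</p>
        <p>
```python
import numpy as np

def embed(graph_elements):
    # Importance-weighted average of all doubt values in the graph.
    weights = {"low": 1.0, "medium": 2.0, "high": 3.0}
    w = np.array([weights[imp] for imp, _ in graph_elements])
    d = np.array([doubt for _, doubt in graph_elements])
    return float((w * d).sum() / w.sum())

def forecast(embeddings, horizon=2):
    # Linear extrapolation of the doubt-embedding history (short-term
    # linear dependency assumption), one value per future time index.
    t = np.arange(len(embeddings))
    slope, intercept = np.polyfit(t, embeddings, 1)
    future_t = np.arange(len(embeddings), len(embeddings) + horizon)
    return np.clip(slope * future_t + intercept, 0.0, 1.0)

def decide(embeddings, horizon=2, threshold=0.7):
    # Competence is the complement of the embedding; take over when the
    # minimum forecast competence drops below the threshold.
    future_competence = 1.0 - forecast(embeddings, horizon)
    return "AD mode" if future_competence.min() >= threshold else "takeover"
```
</p>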
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results and Discussion</title>
      <p>We have trained the DNN for intention prediction on 24305
instances, divided into 6348 cut-ins and 17957 non cut-ins.
Since the dataset was unbalanced, we weighted the loss
function to compensate for the difference in the
observations per class. The algorithm was tested on 7200 instances,
resulting in an F-score of 0.98 (accuracy of 0.99).</p>
      <p>We have assessed the competence of the intention
predictor in two cut-in scenarios. The first scenario describes a
cut-in by a passenger car on an otherwise empty road (Fig. 5a).
The velocity, distance and driving profile of the TV were
designed not to pose any risk to the EV. In addition, every
vehicle present in the scenario was known to the knowledge
graph.</p>
      <p>[Table 1, reconstructed from residue in the text: Case 1,
low-risk scenario, Reasoner not present; Case 2, low-risk scenario,
Reasoner present; Case 3, high-risk scenario, Reasoner not present;
Case 4, high-risk scenario, Reasoner present.]</p>
      <p>Fig. 5: (a) The cut-in scenario that is within the operational
design domain, corresponding to Cases 1 and 2 in Table 1. (b) The
lane entrance scenario that is outside the operational design
domain, corresponding to Cases 3 and 4 in Table 1.</p>
      <p>The second scenario (Fig. 5b) describes multiple vehicles
(two trucks and a motorcycle) on the first right-most lane
and a truck approaching from the entrance lane. The EV is
in the left lane and cannot see the approaching truck as it is
occluded by the vehicles on its right. The features in this
scenario are out-of-distribution, as only two features lie within
the training set domain (Fig. 6). Moreover, the scenario
includes an unknown entity in the form of a motorcycle that
was not part of the training set. The rationale for this is that
a type of vehicle not present in the training set might
display a driving profile that the intention prediction does not
expect. In other words, the output of the predictor might be
incorrect since it relies on the detection of spatio-temporal
patterns in the vehicle’s driving behaviour. The visibility on
the road was reduced by the traffic on the first lane; this lane
was considered of high importance due to the road entrance.
This scenario was designed to pose potential risk to the
autonomous system, due to the out-of-distribution features and
unknown entities. The two scenarios were evaluated at the
moment that one of the TVs performs a cut-in.</p>
      <p>The two settings were first tested without the contribution
of the symbolic reasoning inference, shown as Case 1 and
Case 3 in Table 1. Since the Reasoner was not in place, the
feature uncertainty u was used as a proxy for the
Intention Predictor's ability to perform correctly in the given
situation. For clarity, the quantity 1 − u is reported; hence,
a score equal to 1 represents full confidence in the
Intention Predictor output and can be directly compared with the
Competence score. A threshold of 0.7 was defined to
establish whether it was necessary for the human driver to
take over (1 − u &lt; 0.7), or whether the vehicle could maintain AD
mode (1 − u ≥ 0.7). In both cases, the system decides not to
maintain the AD mode, due to the high feature uncertainty,
even though the scenario was safe. The absence of the Reasoner
causes a lack of situational awareness: the speed of the TV
was lower than the average velocities collected in the
training set (thus making the velocity an OOD feature),
but the large distance between the EV and TV is not used by
the Intention Predictor to reduce the importance attributed to
this quantity.</p>
      <p>Results of the competence assessment with the Reasoner
are shown as Case 2 and Case 4 in Table 1. The Current
Competence column refers to the competence c as inferred
by the first-order rules of the knowledge graph at the
current moment. In the event that more than one vehicle was
predicted to perform a cut-in, the reported value is the
lowest c estimated among all the vehicles. The Minimum
Future Competence ĉ_min was computed by converting the future
doubt-embedding extrapolated by the forecaster (Fig. 7) as
described in Eq. 4 and Eq. 5. Thus, the future
competence was calculated for a prediction horizon of 2 seconds.
The decision whether the vehicle should remain autonomous
was made by a thresholding function (threshold of 0.7) on the
future competence, as detailed in Eq. 6.</p>
      <p>We found that ĉ_min evaluated for Case 4 was six times
lower than in Case 2. In Case 2 the threshold for takeover
was never reached and the system did not hand over the
autonomous control. Due to the large distance to the TV and
the high visibility of the lanes, the Reasoner determined that
the vehicle could stay in AD mode despite the low
likelihood of the input data expressed by the average feature
uncertainty. In contrast, the system decided that a takeover of
the AD mode was necessary in Case 4, because the
numerous sources of risk in this setting caused a low future
competence. This is expressed by a competence value that is
substantially lower than a value based solely on the feature
uncertainty. Using the likelihood of the input data expressed by the
feature uncertainty alone is not sufficient to correctly assess the
confidence in the Intention Predictor output. This is evident
from the results of Case 1 and Case 3 (Table 1), where the
absence of the Reasoner leads to an incorrect assessment of
the situation. In addition, the competence returned by the
Reasoner shows a larger contrast between these two extreme
cases than the method based on the feature uncertainty alone.</p>
      <p>We found that the linear regression used to assess the
future competence was strongly affected by small variations
in the history of doubt-embeddings. Thus, we do not
consider that a prediction horizon longer than 2 seconds would
be reliable enough to support the decision-making process.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and future work</title>
      <p>We have presented a hybrid-AI framework for the safe
application of AI functions in automated driving. The framework
aggregates road observations and the results of data-driven
AI computations — such as a DNN for intention prediction
in our case study — into a knowledge graph. The graph is
built by means of an ontology, which specifies the entities
that can exist on the road, and a set of first-order logic
inference rules, the latter aiming to estimate the severity level
of the road situations. The knowledge graph is then
compressed into a single value (embedding), stored in a
working memory, and used to forecast imminent severity levels.
A final decision-making module establishes whether the
vehicle should continue driving autonomously or whether the
steering wheel should be handed over to a human driver or
backup safety system. The knowledge graphs encode the
situational awareness capabilities of the vehicle, whilst the
forecasting and decision making processes realise the
vehicle’s competence assessment capability.</p>
      <p>We have shown that the reasoner correctly assigns high
competence to the Intention Predictor in a situation in which
some features of the DNN are uncertain, but the TV poses no
safety threat to the EV due to the large distance and high lane
visibility. The added value of the Reasoner is also shown in
a situation that contains a vehicle (in this case a
motorcycle) that has never been seen before by the predictor, in an
environment with important entities that require attention (in
this case an entrance lane). The predictor output is unreliable
in this case, potentially leading to erratic and dangerous
behaviour of the EV if taken at face value. Here, the Reasoner
correctly assigns a low competence to the predictor based on
the presence of the motorcycle (high doubt) and the presence
of the entrance lane (high importance).</p>
      <p>These results provide a solid starting point for future
investigations on situational awareness. In future work, we
will extend situational awareness to the entire automated
vehicle instead of a single component. In addition, the
reasoner will aggregate more types of observations, for
example those regarding road works or weather conditions, and
its first-order logic inference rules could be parameterised
via data-driven approaches instead of solely relying on
domain knowledge. Combining the DNN with the knowledge
graph into a graph neural network will result in a better
estimation of competence, especially further into the future. Graph
neural networks might also aid in enhancing the explainability
of why a takeover is needed.</p>
      <p>While limited to a single function in a simulation
environment, our work shows that a hybrid-AI approach to
situational awareness is essential for the safe application of AI
systems in automated driving.</p>
      <p>Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler,
M.; Benenson, R.; Franke, U.; Roth, S.; and Schiele, B.
2016. The Cityscapes Dataset for Semantic Urban Scene
Understanding. In Proc. of the IEEE Conference on
Computer Vision and Pattern Recognition (CVPR).</p>
      <p>Czarnecki, K. 2018a. Operational world model ontology for
automated driving systems–part 1: Road structure. Waterloo
Intelligent Systems Engineering Lab (WISE) Report,
University of Waterloo.</p>
      <p>Czarnecki, K. 2018b. Operational world model ontology
for automated driving systems–part 2: Road users, animals,
other obstacles, and environmental conditions. Waterloo
Intelligent Systems Engineering Lab (WISE) Report,
University of Waterloo.</p>
      <p>de Gelder, E.; Paardekooper, J.-P.; Op den Camp, O.; and
De Schutter, B. 2019. Safety assessment of automated
vehicles: how to determine whether we have collected enough
field data? Traffic Injury Prevention 20(S1): S162–S170.</p>
      <p>Deo, N.; and Trivedi, M. M. 2018. Multi-Modal Trajectory
Prediction of Surrounding Vehicles with Maneuver based
LSTMs. In IEEE Intelligent Vehicles Symposium,
Proceedings, 1179–1184. University of California, San Diego, San
Diego, United States, IEEE.</p>
      <p>Dinh, L.; Sohl-Dickstein, J.; and Bengio, S. 2016. Density
estimation using Real NVP. arXiv preprint arXiv:1605.08803.</p>
      <p>Dosovitskiy, A.; Ros, G.; Codevilla, F.; Lopez, A.; and
Koltun, V. 2017. CARLA: An open urban driving
simulator. arXiv preprint arXiv:1711.03938.</p>
      <p>Endsley, M. R. 1995. Toward a theory of situation awareness
in dynamic systems. Human Factors 37(1): 32–64.</p>
      <p>Endsley, M. R. 2020. Situation Awareness in Driving. In
Fisher, D.; Horrey, W.; Lee, J.; and Regan, M., eds.,
Handbook of Human Factors for Automated, Connected, and
Intelligent Vehicles, chapter 7. CRC Press.</p>
      <p>Gal, Y.; and Ghahramani, Z. 2016. Dropout as a Bayesian
approximation: Representing model uncertainty in deep
learning. In 33rd International Conference on Machine
Learning, ICML 2016, 1651–1660. University of
Cambridge, Cambridge, United Kingdom.</p>
      <p>Guo, C.; Pleiss, G.; Sun, Y.; and Weinberger, K. Q. 2017.
On calibration of modern neural networks. arXiv preprint
arXiv:1706.04599.</p>
      <p>Habbema, J. D. F.; Hermans, J.; and Van den Broek, K. 1974. A
stepwise discriminant analysis program using density
estimation.</p>
      <p>Hein, M.; Andriushchenko, M.; and Bitterwolf, J. 2019.
Why ReLU networks yield high-confidence predictions far
away from the training data and how to mitigate the
problem. In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition, 41–50.</p>
      <p>Hendrycks, D.; and Gimpel, K. 2016. A baseline for
detecting misclassified and out-of-distribution examples in neural
networks. arXiv preprint arXiv:1610.02136 .</p>
      <p>Hendrycks, D.; and Gimpel, K. 2017. A Baseline for
Detecting Misclassified and Out-of-Distribution Examples in
Neural Networks. Proceedings of International Conference
on Learning Representations.</p>
      <p>Horn, A. 1951. On sentences which are true of direct unions
of algebras. The Journal of Symbolic Logic 16(1): 14–21.</p>
      <p>Koopman, P.; and Wagner, M. 2016. Challenges in
Autonomous Vehicle Testing and Validation. SAE International
Journal of Transportation Safety 4(1): 15–24.</p>
      <p>Kristiadi, A.; Hein, M.; and Hennig, P. 2020. Being
Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU
Networks. In Daumé III, H.; and Singh, A., eds.,
Proceedings of the 37th International Conference on Machine
Learning, volume 119 of Proceedings of Machine Learning
Research, 5436–5446. Virtual: PMLR.</p>
      <p>LeCun, Y.; Bengio, Y.; and Hinton, G. 2015. Deep learning.
Nature 521(7553): 436–444.</p>
      <p>Liang, S.; Li, Y.; and Srikant, R. 2017. Enhancing the
reliability of out-of-distribution image detection in neural
networks. arXiv preprint arXiv:1706.02690.</p>
      <p>McAllister, R.; Gal, Y.; Kendall, A.; van der Wilk, M.; Shah,
A.; Cipolla, R.; and Weller, A. 2017. Concrete problems for
autonomous vehicle safety: Advantages of Bayesian deep
learning. In IJCAI International Joint Conference on
Artificial Intelligence, 4745–4753. University of Cambridge,
Cambridge, United Kingdom.</p>
      <p>Meyer-Vitali, A.; Bakker, R.; van Bekkum, M.; Boer, M. d.;
Burghouts, G.; Diggelen, J. v.; Dijk, J.; Grappiolo, C.;
Greeff, J. d.; Huizing, A.; et al. 2019. Hybrid AI: white paper.
Technical report, TNO.</p>
      <p>Nair, V.; and Hinton, G. E. 2010. Rectified linear units
improve restricted Boltzmann machines. In ICML.</p>
      <p>Okuda, R.; Kajiwara, Y.; and Terashima, K. 2014. A survey
of technical trend of ADAS and autonomous driving. In
Proceedings of Technical Program - 2014 International
Symposium on VLSI Technology, Systems and Application,
VLSITSA 2014. Renesas Electronics Corporation, Tokyo, Japan.</p>
      <p>Paardekooper, J.-P.; van Montfort, S.; Manders, J.; Goos, J.;
de Gelder, E.; Op den Camp, O.; Bracquemond, A.; and
Thiolon, G. 2019. Automatic Detection of Critical Scenarios in
a Public Dataset of 6000 km of Public-Road Driving. In
Enhanced Safety of Vehicles, 1–8.</p>
      <p>Parzen, E. 1962. On estimation of a probability density
function and mode. The annals of mathematical statistics 33(3):
1065–1076.</p>
      <p>Ren, J.; Liu, P. J.; Fertig, E.; Snoek, J.; Poplin, R.; DePristo,
M.; Dillon, J.; and Lakshminarayanan, B. 2019. Likelihood
ratios for out-of-distribution detection. In Advances in
Neural Information Processing Systems, 14707–14718.</p>
      <p>Sakr, S.; Elshawi, R.; Ahmed, A. M.; Qureshi, W. T.;
Brawner, C. A.; Keteyian, S. J.; Blaha, M. J.; and Al-Mallah,
M. H. 2017. Comparison of machine learning techniques to
predict all-cause mortality using fitness data: the Henry Ford
exercIse testing (FIT) project. BMC medical informatics and
decision making 17(1): 174.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Bansal</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Krizhevsky</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Ogale</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2019</year>
          .
          <article-title>ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst</article-title>
          .
          <source>In Robotics: Science and Systems.</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Brochu</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Cora</surname>
            ,
            <given-names>V. M.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>De Freitas</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <year>2010</year>
          .
          <article-title>A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning</article-title>
          .
          <source>arXiv preprint arXiv:1012.2599</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2a">
        <mixed-citation>
          <string-name>
            <surname>Thill</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Hemeren</surname>
            ,
            <given-names>P. E.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Nilsson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>The apparent intelligence of a system as a factor in situation awareness</article-title>
          .
          <source>In 2014 IEEE International Inter-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support, CogSIMA 2014</source>
          ,
          <fpage>52</fpage>
          -
          <lpage>58</lpage>
          . RISE Viktoria, Gothenburg, Sweden, IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>van Harmelen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>ten Teije</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2019</year>
          .
          <article-title>A Boxology of Design Patterns for Hybrid Learning and Reasoning Systems</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Vellinga</surname>
            ,
            <given-names>N. E.</given-names>
          </string-name>
          <year>2019</year>
          .
          <article-title>Automated driving and its challenges to international traffic law: which way to go?</article-title>
          <source>Law, Innovation and Technology</source>
          <volume>11</volume>
          (
          <issue>2</issue>
          ):
          <fpage>257</fpage>
          -
          <lpage>278</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>WHO.</surname>
          </string-name>
          <year>2018</year>
          .
          <source>Global status report on road safety</source>
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ichise</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Yoshikawa</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Naito</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kakinami</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Sasaki</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Ontology-based decision making on uncontrolled intersections and narrow roads</article-title>
          .
          <source>In 2015 IEEE intelligent vehicles symposium (IV)</source>
          ,
          <fpage>83</fpage>
          -
          <lpage>88</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>