=Paper=
{{Paper
|id=Vol-2600/paper8
|storemode=property
|title=GiCoMAF: An Artificial Intelligence Algorithm to Utilize Maps for Operators of Unmanned Aerial Vehicles
|pdfUrl=https://ceur-ws.org/Vol-2600/paper8.pdf
|volume=Vol-2600
|authors=Yuval Zak,Yisrael Parmet,Tal Oron-Gilad
|dblpUrl=https://dblp.org/rec/conf/aaaiss/ZakPO20
}}
==GiCoMAF: An Artificial Intelligence Algorithm to Utilize Maps for Operators of Unmanned Aerial Vehicles==
Yuval Zak, Yisrael Parmet and Tal Oron-Gilad
Ben-Gurion University of the Negev
Beersheva, Israel
{zaky@post.bgu.ac.il, iparmet@bgu.ac.il, orontal@bgu.ac.il}
Copyright © 2020 held by the author(s). In A. Martin, K. Hinkelmann, H.-G. Fill, A. Gerber, D. Lenat, R. Stolle, F. van Harmelen (Eds.), Proceedings of the AAAI 2020 Spring Symposium on Combining Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE 2020). Stanford University, Palo Alto, California, USA, March 23-25, 2020. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

Unmanned Aerial Vehicle (UAV) operators must maintain high levels of situation awareness of their area of operation. To achieve this, they use the command and control (C2) map, which is shared among forces and is regularly overloaded with data that is irrelevant to their mission. UAV operators’ missions require distilled information at the right time. Yet, the existing filtering mechanisms of C2 maps are layer-based and insufficient. We propose a new approach to automatically and dynamically filter information items on the map based on environmental and mission context. To achieve this, we introduce a three-tier artificial intelligence (AI)-based algorithm (GiCoMAF), where we delineate the use of machine learning (ML) models to support UAV missions. For the GiCoMAF development, tagged data was collected in simulated experimental runs with professional UAS operators. Different types of ML models were evaluated and fitted into the algorithm. The models achieved a relatively high accuracy at modeling human preference and area of interest. The approach presented in this study can be further implemented to support other operators in time-critical spatial-temporal problems.

1 Introduction

In February 2010 in Afghanistan, an American helicopter fired on three suspected trucks, killing 23 innocent civilians and wounding 12. The attack was approved based on information provided by operators of an unmanned aerial vehicle (UAV) who did not report the presence of civilians in the trucks (Filkins 2010). In a later news report, Shanker and Richtel (2011) cite Army officials claiming that the leading cause of the tragic incident was information overload that led to poor Situation Awareness (SA). SA can be defined as one’s perception of the environment around him or her at any given point in time (Endsley 1988). Thus, although there was evidence of children in the trucks, the UAV operators “did not adequately focus on them amid the swirl of data”.

The use of UAVs in the military domain is increasing, due to their ability to perform missions without risking human operators (Izzetoglu et al. 2015). UAV operators monitor the payload, often a camera sending video feed, in various missions (e.g., reconnaissance, guidance of forces; Marusich et al. 2016), in addition to multiple tasks (e.g., navigation and orientation, flying the vehicle, radio communication; Everaerts 2008). A command and control (C2) map, often on a different display, is used for orienting and making sense of the payload’s outputs. The C2 map shows mission-critical information and intelligence-related information items such as markings of potential targets, location of allied forces and so forth. A cognitive work analysis of UAV operators’ workflows emphasizes frequent and continuous use of the C2 map during missions (Back et al. 2019). The C2 map, however, being shared among military elements, shows information that is irrelevant to the UAV operators. It has been indicated (Endsley 2000; Sandom 2000) that information overload is a contributor to poor SA, high workload, and low overall performance. According to Back et al. (2019), the information clutter in C2 maps may often lead operators to neglect the map and rely solely on the payload’s feed, and may also lead to fatal results, as in the tragic incident of February 2010. Together with Adams’ (2015) call for incorporating human factors limitations in the design of UAVs, this serves as an incentive to address the information overload problem. Some advanced solutions for improving UAV operators’ SA have been suggested, e.g. using synthetic vision (Calhoun et al. 2005). Such solutions, however, may require costly adjustments of the vehicle’s payload.

Figure 1: Left. An illustration of a layer-based filter, where entire layers can be turned on/off. Right. An information-item level based filter, where each single information item can be considered as a separate layer and handled individually.

For decluttering, most C2 maps are based on information layers. Layers can manually or automatically (via a set of rules) be hidden, shown or dimmed. The layer mechanism reduces the information overload problem by hiding or dimming layers, yet, at the same time, it may cause information deprivation due to an inherent tradeoff: any action performed on a layer affects it entirely. Thus, it is impossible to hide irrelevant information items in a layer while showing relevant ones of the same layer. Therefore, aspiring to solve the C2 map information overload tradeoff, Zak, Oron-Gilad, and Parmet (2018) called for ’breaking’ the layer mechanism and addressing the information at the information items’ level. Thus, instead of filtering layers of information items, the filter will handle each information item individually (as
illustrated in Figure 1). This can be achieved by an algorithm that dynamically and automatically filters information items on the C2 map by their importance and relevance to the UAV operator’s current mission. Automating this process, however, is challenging as it can inadvertently reduce operators’ SA and performance, especially regarding items chosen not to be shown (Endsley and Kiris 1995).

Two guidelines led the development of the automatic and dynamic algorithm. First, working at an information item level requires advanced techniques. Those techniques arise from the artificial intelligence (AI) domain. Second, as the problem is both spatial and temporal, the solution should adopt perceptual concepts inspired by Gibson and Crooks’ (1938) field of safe travel. The field of safe travel was defined as the spatial field where it was safe to steer a car. It was dependent on driver state, vehicle and environment, and it was constantly changing as the vehicle moved and the context of driving changed. Gibson and Crooks referred to the field’s intuitiveness as affordance. In Section 2 we use these guidelines for the development of the GiCoMAF (Gibsonian Command and Control Map AI Filter algorithm), but first, in Sections 1.1 and 1.2, we review Machine Learning (ML) models that were considered for the AI implementation and delineate the research goals.

1.1 Machine-Learning (ML) Models for C2 Systems

The military domain is seeking techniques to incorporate AI into C2 systems, and Machine Learning (ML) models have been used in UAV, GIS and C2 domains (e.g. Azak and Bayrak 2008; Bao 2016; Choi and Cha 2019; Dzieciuch et al. 2017; Noh and Jeong 2010; Rapaport 2015), but to our knowledge, not as the interface between the C2 map and the operators. For our research, we looked for models that can be used for classification (to classify an information item’s importance) or logistic regression (for defining the field of relevance, as detailed later). Table 1 describes four ML techniques, suitable for classification and regression and therefore applicable towards GiCoMAF development.

There are other ML models that can be used for the purpose of this study. For example, Long Short-Term Memory (LSTM) is a common deep learning technique for time series data that can be used for multi-stream data as well (Behera, Keidel, and Debnath 2019; Bouaziz et al. 2017). Given that each information item in the C2 map can be perceived as a standalone time series, this model was considered for this study too. However, the exact number of parallel streams (i.e. the number of the information items) is constantly changing as more information items are added to the map. Therefore, the LSTM model was not pursued.

Table 1: A short description of machine learning models suitable for classification and regression modeling.

Lasso Regression (LR): In Generalized Linear Models (GLM) the relationship between the predictor variables and the dependent variable is a linear function. It is mostly used to predict a single continuous numeric value, but variations allow it to be used for classification (multinomial linear regression; Greene 2012) or for predicting a probability (logistic regression; Walker and Duncan 1967). Lasso is a method of performing feature selection alongside GLM by introducing a coefficient size penalty as a constraint. The effect of the penalty on the regression is controlled by a user-defined λ parameter (Tibshirani 1996; Tibshirani et al. 2012).

Neural Networks (NN): An NN is a composite of input, output, and middle layers (a.k.a. ’hidden layers’), presuming to mimic the work of neurons in the human brain. Each layer consists of an undefined number of neurons, which sum the input received from the previous layer, apply an activation function f, and ’fire’ the result to the next layer. NNs can handle non-linear problems (Kanevski et al. 2004).

Random Forest (RF): A decision tree is a prediction model for non-linear problems in which each node of the graph represents a predictor variable. The returned value is the final node the decision led to (Quinlan 1986). The number of possible results is limited by the number of possible values of the label, and by the depth of the tree (maximum number of nodes a branch can have). Increasing the number of possible results can be done by increasing the tree’s depth, but it may result in undesired patterns of overfitting. Random Forest overcomes this limitation by producing multiple trees, each trained on a randomly selected subset of the same dataset. The returned value is the average of the predictions of all trees for a regression-type model, and the most frequent value for a classification-type model (Breiman 2001).

XGBoost: Gradient boosting is an iterative ensemble-of-trees model, where in each iteration a decision tree is learned using the residuals of the previous iteration (Friedman 2002). XGBoost is an open source package implementing an efficient, scalable tree boosting system, incorporating a regularized model to prevent overfitting (Chen and Guestrin 2016).

1.2 Research Goals

Acknowledging UAV operators’ C2 map needs, this study aims to introduce the GiCoMAF solution for distilling the information presented on the map to show mission-relevant important information items, while minimizing or hiding less important or distracting items. The first goal is to develop an algorithm that automatically and dynamically filters the information items on the C2 map. The automatic part of the algorithm addresses the filtering at the information items’ level. The dynamic part of the algorithm addresses the evolving environmental context of the area of operation. The second goal is to delineate the use of ML models in the construction of the algorithm by demonstrating how its construction can be achieved using ML models. This goal is attained using importance- and relevance-labelled data collected empirically from UAV operators. Their inputs enable the ML models to learn the operators’ contextual information needs, and showcase the algorithm’s feasibility. This paper describes the process of developing the GiCoMAF algorithm using ML models. A third goal, not described in this paper, is to evaluate the update rate of the GiCoMAF empirically, exploring its effect on operators’ mental workload, situation awareness and perception of the experience.

In the following sections the definition of the GiCoMAF algorithm is first outlined. Then, the process of acquiring the ML models constructing the algorithm’s tiers is detailed, including the data collection, manipulation, and model evaluation. Lastly, the Discussion section discusses how to combine all tiers into an operating filter algorithm.

2 Developing GiCoMAF

The GiCoMAF – Gibsonian Command and Control Map AI Filter algorithm consists of three tiers, each of which answers a different research question. The integration of the tiers creates the filter rule, and incorporates the outcomes of these questions into the workflow of the operators. Tier I aims at the information item level, and answers the question: what is the perceived importance of each information item? Tier II aims at the map as a whole, and answers the question: how can the operator’s area of interest be modeled on the map? Tier III aims at the dynamicity property of the algorithm, and answers the question: how often should the automatic filter be updated?

The construction of the GiCoMAF is illustrated in Figure 2. The distinction between Tiers I and II is important. Tier I predicts each information item’s importance. The scale is inclusive, i.e., it provides an indication of how important it is to show the information item on the map, and an indication of how important it is not to show the item. The incentive behind this logic is that some non-important information items may be harmless and operators will be oblivious to their presence, while others may be disturbing. Moreover, a predicted information item’s importance should not be handled in the same way throughout the map space, and this is where Tier II comes into play. Consider an information item with a neutral predicted importance, i.e., not important but not disturbing. While the item per se is not considered disturbing, if it is within the operators’ area of interest, they may be more sensitive to disruptions, and the item can inadvertently cause clutter or quickly become disturbing. To avoid such cases, it may be better to filter out the item. Outside the area of interest, however, leaving a neutral item may be a good strategy, as its importance may rise as the mission evolves. Hence, items that are non-important in the immediate context may be valuable for foreseeing the future evolution of the situation and preparing for it. Therefore, Tier I, which predicts the information items’ importance, is not enough, and the algorithm should model the operators’ area of interest as derived in Tier II. The final decision rule, hence, is a combination of these two tiers. Tiers I and II of the algorithm are based on ML models; thus, by learning examples
from UAV operators, the algorithm predicts and executes an automatic filter for new operators and in new unknown scenarios. ML models require tagged examples to learn from. Therefore, an experiment which emulated the work of UAV operators in operational scenarios was designed and conducted to collect tagged data from UAV operators (Zak, Parmet, and Oron-Gilad 2019a, 2019b). Tier III defines the rate at which the filter rule of Tiers I and II should be applied. At this stage of development Tier III cannot be based on ML models, and represents a purely cognitive issue. Since the scope of this study was to delineate the construction of the GiCoMAF algorithm using ML models, the process of studying the cognitive effect of various filter update rates and setting the optimal rate is left to future research.

2.1 Tier I – Information Item Importance

Tier I aims to model information items’ importance as perceived by the operators. An information item’s importance may vary based on environmental context, mission, and characteristics. Generally, operators want important information items to be shown on the map. If an information item is not perceived as important, its presence can be either disturbing, in which case operators would prefer that it not be shown, or not disturbing, in which case operators may be indifferent to its presence on the map. Therefore, Tier I attempts to predict and classify information items’ importance level into a four-level scale, based on the context: Positive importance represents important (1) and very important (2) information items. Zero importance represents non-important information items, for which operators have no preference as to whether they should be shown or not. Negative importance represents information items that disturb and distract operators from their mission context.

The prediction is done using an ML model. The model, once trained, has the ability to deduce item importance from insightful environmental and mission-related measures that describe the context. For example, the average distance of an information item from the UAV payload may describe its mission-related context. The density of information items around an information item, for example, may reflect the environmental context. Higher density raises the probability of some operational event happening at that location. An ML model can find the relationship between an information item and the route, and then predict the importance of the information item; and it can determine that as the environment is denser, the probability of showing non-important items increases. These examples of relations between derived measures like distance and density and the predicted value are given for simplification, and the real relationships that emerge from the ML model may be more complicated. Furthermore, the most suitable ML model (Table 1) was selected empirically, as detailed in Section 3.

2.2 Tier II – Operator’s Field of Relevance

Tier II of the algorithm addresses the areas of interest for the operators in the environment. It is essentially a spatial problem that can be represented using geospatial measures on the map (e.g. polygons, heatmaps, etc.). The ’area of interest’ is not necessarily a singular area, and can be phrased as ’areas of interest’, where each area has a different level of interest. For example, Figure 3 illustrates multilevel areas of interest, where the mission occurs in the center of the polygons, and therefore the smallest area around this focus has the highest interest. The surroundings still hold interest for the operator since they may affect the mission’s center. Their relevance decreases as they get farther from the mission’s center. The ’area of interest’ is modeled using the concept of field of relevance, an adaptation of Gibson and Crooks (1938). The field of relevance depicts operators’ areas of interest based on the environment, mission, operators’ behavior, etc. Moreover, it corresponds with the affordance property, as the field of relevance highlights areas that are intuitively more focused upon by the operators. Due to the probabilistic characteristic of ML models, it was decided to adopt the quality map approach of Morse, Engh, and Goodrich (2010), and to model the field of relevance as a pseudo-Gaussian heatmap, where each spatial element on the map gets a value between 0 (no relevance) and 1 (high relevance) corresponding to the probability of that spatial point being in the operator’s area of interest.

Similar to Tier I, prediction in this tier is done using an ML model, deducing from insightful environmental and mission-related measures that describe the context. In this tier those measures are in respect to a spatial element (e.g., a square of 10 m²), without relating to any particular information item within the area of interest. For example, a mission-related measure can be the average time a spatial element is in the UAV payload’s field of view, assuming higher average time may indicate higher relevance of that element. An environmental measure can be the density of information items around a certain element, assuming that higher density around a spatial element indicates a higher probability of some operational event happening at that location, and in turn higher relevance. Similar to Tier I, the examples of relations are given for simplification, and the exact ML model (Table 1) was determined through an empirical process detailed in Section 3.

3 Constructing the GiCoMAF

In the construction of the filter, an experiment emulating the work of UAV operators in the military domain was executed. Participants, professional military UAV operators, were asked to perform a mission of supporting a ground battalion in urban battlefield scenarios. The data collected from the feedback they provided during and after the experiment was used to construct the ML models of Tiers I and II. The process is illustrated in Figure 2. The research was approved by the Institutional Review Board at Ben Gurion University. Informed consent was obtained from each participant.

Figure 2: The construction process of the GiCoMAF. Tier I and II provide the decision rule for what should be presented on the C2 map. Tier III is related to the update rate of the filter and how it affects operators’ mission performance, workload and experience.

3.1 Data Tagging Experiment

Data for Tiers I and II were collected in a set of experimental runs, detailed in Zak, Parmet, and Oron-Gilad (2019a, 2019b). The experiment aimed to emulate the work of UAV operators in the military domain. Using a designated system developed for this task (UCES – UAV Command and Control Experiment System), a battlefield scenario was developed by subject matter experts with a UAV mission to
assist a ground battalion in conquering an urban neighborhood. The ‘five paragraph order’, a common military standard for writing the fight plan of a mission, was outlined on the C2 map (Figure 4) and programmed in the VR-Forces simulation engine. The mission was 12 minutes long, representing a sequence of events that in real-life settings may take several hours. Thirteen professional military UAV operators performed a reconnaissance mission as if they were acting in a real-world battlefield. The UCES allowed them to control the vehicle’s payload, observe the battlefield from an aerial point of view, and get real-time information from a C2 map. Occasionally, at specific points, the scenario paused and a scoring session started. The operators were asked to label two types of information using the UCES map: tagging individual information items’ importance, and drawing their current contextual area of interest as polygons on the map. There were 88 scoring sessions in total, an average of 6.5 sessions for each experimental run. The data collected from the runs was put together into two datasets. The dataset containing the information items’ importance tags was the input for Tier I, and the dataset containing the area of interest polygons was the input for Tier II. Then, ML models for the two tiers were developed. The processes and outcomes are detailed in Sections 3.2 and 3.3.

Figure 3: An illustration of multilevel areas of interest, as derived by operators in an operational context. Each polygon represents a level of interest, increasing from the outside-in.

3.2 Tier I ML Model – Information Items’ Importance

The data collected in the experiment was divided into two tables. The first table was an event diary, where each row
represents one event in one experimental run (e.g., UAV movement, forces movement, cross-fire event, etc.). This table contained approximately 90,000 rows. The second table is a tagging table, where each row represents an information item in a scoring session in an experimental run and its perceived importance as reported by the participant. This table contained 1,203 rows, where each information item has inherent attributes like its location, type, the time it was last modified, etc. However, item attributes are insufficient for introducing mission context to the ML models. Therefore, the two tables had to be joined into one training dataset. For that, the event diary had to be transformed into insightful variables. Thirty-five derived measures were calculated from the event diary to describe the environment and the mission.

Figure 4: A mockup of the C2 map, including the fight plan for the simulation’s scenario (in dark blue). The symbols are standard NATO symbols, where light blue represents allied forces, red represents enemy forces, and green represents neutral entities.

The objective of the ML model was to classify information items’ importance. Misclassifying a non-important information item as important does not have the same implication as misclassifying an important information item as disturbing. Therefore, the error function could not be merely classification accuracy, and weights were given as penalties for misclassifications based on the sensitivity and specificity of the misclassification.

All four techniques described in Table 1 were tested in a classification configuration. Model technique parameters were optimized using the k-fold cross validation technique, where k = 10. During this process, 22,945 models were built in total for all four techniques. Then, data were divided by participant: 10 randomly chosen participants as the training set and the 3 remaining as the validation set. The final model for each technique was built using the derived optimal parameter set. Figure 5 illustrates the results for two types of errors: (a) weighted error, shown for the average k-fold result of the optimal parameter set, the training set, and the validation set; and (b) sign error (percent of misclassifications of important/very important as disturbing, and vice versa), for all three sets.

From Figure 5 it is evident that RF shows patterns of overfitting, and multinomial LR and NN perform better than XGBoost in terms of weighted error. However, in terms of sign error XGBoost outperforms the others. Figure 6 delves into the differences among the models by illustrating multiclass receiver operating characteristic (ROC) graphs (Hand and Till 2001) on the validation set. In this figure, each line represents the ROC of the comparison of two possible values, and the area under the curve (AUC) is the average of all AUCs. It can explain the differences between the weighted error and sign error patterns of the NN and LR. As seen in the plot, NN and LR have almost no separation between the predicted values, probably because misclassifying very important information items as disturbing had the highest penalty. Thus, both models tend to classify all information items as ‘very important’. Moreover, according to Figure 6, RF had a slightly better AUC, although Figure 5 suggests it was overfitting.

A third analysis to evaluate the models’ performance was to look at how certain each model was in its predictions on the validation set. In the prediction process, each model provides a probability for each level, where the predicted value is set by choosing the level with the highest probability. It is expected that a more ‘decisive’ model would provide a relatively high probability for the predicted value and relatively low probabilities for the other values. The entropy for that prediction, defined by the formula $-\sum_i p_i \log_a p_i$ (where $p_i$ is the probability of predicting a given value $i$, and $a$ is the number of possible values), would then be close to zero. The results of the entropy analysis, given in Table 2, show the average entropy for each model and predicted value. It is evident from Table 2 that LR assigns only the value ‘2’, and NN only the values ‘1’ and ‘2’. All four models have relatively high entropy scores, sometimes close to one, indicating that the models’ classification decisions, based on the levels’ probabilities, hinge on small fluctuations in those probabilities; i.e., all four models are not very decisive. The only exception is the RF model, which had a relatively lower entropy, although still closer to one than zero.

Table 2: Entropy analysis for each model (average entropy by predicted value).

Model | Negative | 0 | 1 | 2
LR | NA | NA | NA | 0.921
NN | NA | NA | 0.891 | 0.917
RF | 0.716 | 0.875 | 0.739 | 0.661
XGBoost | 0.999 | 0.999 | 0.999 | 1
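As a minimal sketch of the entropy-based decisiveness check described above, the following Python snippet computes the normalized entropy $-\sum_i p_i \log_a p_i$ of a single prediction and averages it per predicted level, in the spirit of Table 2. The function names, the level encoding in `LEVELS`, and the example probabilities are illustrative assumptions, not the authors' code or data.

```python
import math
from collections import defaultdict

# Assumed encoding: disturbing, neutral, important, very important.
LEVELS = (-1, 0, 1, 2)


def normalized_entropy(probs):
    """Entropy with log base a = number of possible values, so it lies in [0, 1]."""
    a = len(probs)
    return -sum(p * math.log(p, a) for p in probs if p > 0)


def mean_entropy_by_prediction(prob_rows):
    """Group predictions by their arg-max level and average the entropy per group."""
    groups = defaultdict(list)
    for probs in prob_rows:
        best_index = max(range(len(probs)), key=probs.__getitem__)
        groups[LEVELS[best_index]].append(normalized_entropy(probs))
    return {level: sum(vals) / len(vals) for level, vals in groups.items()}


# Hypothetical class-probability outputs for three information items.
example = [
    (0.25, 0.25, 0.25, 0.25),  # maximally indecisive prediction
    (0.0, 0.0, 0.0, 1.0),      # fully decisive prediction
    (0.1, 0.1, 0.1, 0.7),      # moderately decisive prediction
]
result = mean_entropy_by_prediction(example)
```

Under this definition a uniform probability vector yields an entropy of one and a one-hot vector yields zero, so values close to one, as in Table 2, indicate that the winning level barely exceeds the alternatives.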
Figure 5: ML model performance result for Tier I. Left. The average weighted error of the k-fold result of the optimal parameters
set (red), the weighted error of the training (green) and the validation (blue) sets. Right. The corresponding sign error of the
k-fold and the sets. One can see that LR and NN models provide less weighted error in comparison to RF and XGBoost, but
more sign error.
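The two error types reported in Figure 5 can be sketched as follows. This is a minimal illustration, assuming the importance levels are encoded as -1 (disturbing), 0, 1, and 2. The penalty matrix `PENALTY` is a hypothetical placeholder, since the actual weights used in the study are not given here, and the sign error is computed, as an assumption, as the share of all items whose true and predicted levels fall on opposite sides of zero.

```python
# Hypothetical penalties PENALTY[true][predicted]; not the study's actual weights.
PENALTY = {
    -1: {-1: 0.0, 0: 1.0, 1: 2.0, 2: 3.0},
    0: {-1: 1.0, 0: 0.0, 1: 0.5, 2: 1.0},
    1: {-1: 3.0, 0: 1.0, 1: 0.0, 2: 0.5},
    2: {-1: 4.0, 0: 2.0, 1: 0.5, 2: 0.0},
}


def weighted_error(y_true, y_pred, penalty=PENALTY):
    """Average penalty over all items; 0 means every item was classified correctly."""
    return sum(penalty[t][p] for t, p in zip(y_true, y_pred)) / len(y_true)


def sign_error(y_true, y_pred):
    """Share of items whose true and predicted levels have opposite signs."""
    flips = sum(1 for t, p in zip(y_true, y_pred) if t * p < 0)
    return flips / len(y_true)


# Toy labels for five information items (illustrative only).
y_true = [2, 1, 0, -1, 2]
y_pred = [2, -1, 0, 1, 1]
w_err = weighted_error(y_true, y_pred)
s_err = sign_error(y_true, y_pred)
```

With asymmetric penalties of this kind, a model can trade one error type against the other, which is one reason to read the weighted error and the sign error side by side.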
3.3 Tier II ML Model – Field of Relevance

The objective of the model was to define the field of relevance by predicting the probability of each point on the map being in the area of interest. To facilitate the model construction, the C2 map was divided into a grid of cells, where each cell represents an area of 25x25 meters on the ground, 9,016 cells in total. Each row in the dataset represents one cell in a scoring session of an experimental run, with the predicted label of (1) if the cell was in the drawn area of interest in that scoring session, and (0) otherwise. The derived measures were calculated with respect to each cell (e.g. the average distance of a cell from the UAV route, etc.).

Similar to Tier I, the data was divided into two tables: an event diary and a table with the reported feedback from the participants. In this tier, each row in the reported feedback table represents the coordinates of the area of interest’s polygon of a scoring session in an experimental run. Each scoring session consisted of one polygon. Subtracting two cases where the participant forgot to draw an area of interest, this table had 86 rows. Also, the event diary was transformed into 20 derived measures.

The same four ML techniques that were used in Tier I were used in the regression configuration, but the predicted value was logistic (a number in [0,1]). Model performance was measured using root mean square error (RMSE). Due to the large scale of data and computing resource limitations, 408 models were built in total in four k-fold cross validation processes, where k = 3. Figure 7 illustrates the average k-fold RMSE results for the optimal parameter set, and the RMSE of the training and validation sets. While logistic LR and NN have better results on the validation set in terms of RMSE, both XGBoost and RF perform better on both the training set and k-fold results. Figure 8 delves into the results and illustrates a ’rotated confusion matrix’. The x-axis represents the predicted value, binned in intervals of 0.01. The y-axis represents the average original value of the rows corresponding to the bins of the x-axis. The size of the points represents the relative number of records in each bin. A perfect model would provide points aligning with the diagonal of the plot. According to Figure 8, both LR and NN collapse and do not provide good differentiation. RF provides something close to binary results, which may conflict with the field of relevance concept. Therefore, although not perfectly aligning with the diagonal, XGBoost performs better than the other three models. An illustration of XGBoost’s performance relative to the runner-up RF model, in two scoring sessions of two different participants, is given in Figure 9. Figure 9 demonstrates how each operator marked the field of relevance at the same stage, showing that the structure of the field of relevance depends on the environment, the mission, and operators’ individual characteristics and preferences. The XGBoost model seems to handle these characteristics well.

4 Discussion

Operators of UAVs work in uncertain and dynamic environments. Their main focus is on managing their vehicle’s
payload (e.g., camera video feed) while they are required to maintain orientation and awareness of their current location and its surroundings, and to plan ahead. In order to develop the essential SA, they continuously interact with a C2 map (see Back et al. (2019) for a Cognitive Work Analysis). The map is often overloaded with data that is irrelevant to their mission, and, like the operational environment, it changes rapidly. The existing layer-based map filter mechanism causes a tradeoff dilemma for the operator: showing an entire layer with its relevant and irrelevant information, or hiding an entire layer and losing important information. Furthermore, as noted in Back et al. (2019), the existing filtering mechanisms require operators to manually interact with the interface and highlight or hide layers of information at the busiest times of their mission. Thus, currently, obtaining information from C2 maps poses high demands on operators, and therefore they often cope with impaired SA. This study proposes a solution to dynamically and automatically adjust the C2 map display to operators’ needs by introducing an AI-based algorithm that filters the information on the C2 map at the level of individual items (as opposed to layers), as described in Section 2, and by laying the foundation for the algorithm through delineating the use of ML models in its construction, as detailed in Figure 2.
Figure 6: Multiclass ROC curves for the four ML models on the validation set, with the average AUC for each model. X-axis – False Positive Rate (1 – Specificity). Y-axis – True Positive Rate (Sensitivity). The width of each line represents the number of observations used for the development of the line in the curve.
Figure 7: ML model performance result for Tier II. The average RMSE of the k-fold result of the optimal parameters set, and the RMSE of the training and the validation sets. LR and NN have better results on the validation set; both XGBoost and RF perform better on the training set and k-fold results.
The data tagging experiment was designed to collect labelled data towards the construction of the ML models of Tiers I and II: predicting information items’ importance and predicting the field of relevance, respectively. Using the collected data, four different models were optimized and developed for each tier. In Tier I, both XGBoost and RF had good results on the validation set, with RF having the upper hand. However, since the RF had extremely low training error, for both error types, it may be overfitted. Further study on the effects of each model on UAV operators’ mission performance is needed to examine if the effect of overfitting is negligible. Therefore, to be cautious, the XGBoost model was chosen for further development. The XGBoost model, however, had an accuracy sign error of 25.9%, and its decisions were hanging on the thread of an (Table 2). This magnitude of the error is reasonable when modeling human preferences and performance (as seen in previous studies, e.g. Agichtein et al. 2016; Guimerà et al. 2012; Liu, Bian, and Agichtein 2008), especially when having a limited number of observations. The XGBoost results of Tier II were visually evaluated and provided the best and most accurate pseudo-Gaussian heatmap for the field of relevance, as illustrated in Figures 8 and 9. It should be noted that accuracy can be further improved by testing other ML models and parameters. Indeed, due to resource limitations, not all suitable models could be tested. Future studies can attempt to improve on these results by applying additional ML models and techniques.
The two tiers delineate the use of ML models in the GiCoMAF algorithm. Yet it is not clear-cut how to combine Tiers I and II into the GiCoMAF algorithm. Several options for combining the tiers are discussed in the following section.
4.1 Putting It All Together – Combining Tiers I and II
The exact combination of Tiers I and II into the GiCoMAF algorithm may depend on the mission, the environment, and even the organization that the operators are part of. This section suggests three possible combination approaches; however, the final setup is not limited to these approaches and
each organization/user/system case may dictate the development of a tailored solution. These approaches provide a show/no-show decision rule based on relevance and importance thresholds. The optimal approach used in the algorithm, as well as its thresholds, should be continuously evaluated through a process of experiments.
Positive Correlation – This approach assumes that there should be a positive correlation between the field of relevance and the information density. Thus, the more relevant a spatial element is, the more information should be presented around it. To avoid distraction, the farther away from the spatial element, the less information should be shown. Figure 10a illustrates this approach, where there is a negative relation between the field of relevance and the importance threshold for showing information items. In order to use this approach, an adaptive show/no-show threshold should be set based on information items’ importance, where the threshold is rather low at points with high relevance (implying more information items would be shown), and rather high at points with low relevance (only very important information items would be shown).
Negative Correlation – In contrast to the positive correlation approach, this approach assumes that there should be a negative correlation between the field of relevance and information density. That is, the more relevant a spatial element is, the more attention operators direct to it, and therefore fewer information items should be shown to avoid disruption. Thus, as illustrated in Figure 10b, there should be a positive relation between the field of relevance and the importance threshold for showing information items, the mirror image of the relation described for the positive correlation approach.
Binary Relevance Decision – This approach assumes that only important information items should be shown on the map, and only in areas which are currently most relevant to the operators. Therefore, there are only two constant thresholds in the algorithm: one to decide the minimum importance an information item should have in order to be shown, and one to decide the minimum relative relevance a spatial point should have in order to set the importance threshold into motion. Figure 10c illustrates the approach.
For a further study, we propose to examine these three approaches and choose the fittest approach for the context that was addressed in this paper (i.e., the same mission type and profile of participants). For that, a new multistage scenario similar in nature to the experimental scenarios detailed in this paper is required. Using the ML models developed in Tiers I and II, each scenario can be run in one of the three combination approaches, as well as with no filter at all. In the first experiment, participants will be introduced to all of the approaches and different threshold setups will be tested. Then, using the optimal threshold for each approach, new operators will participate in a two-stage crossover experiment with eight possible treatments – the three combination approaches and no filter at all.
Figure 8: Predicted values binned into intervals of 0.01 (X-axis), and the average original values (Y-axis) of corresponding records. A perfectly fit model should be aligned with the diagonal line. The XGBoost model seems to be the better predictor of the four techniques.
Figure 9: The field of relevance simulated in one scoring session in the validation set. The predicted field of relevance is illustrated in blue and the operator’s reported area of interest is shown in red. Subfigures (a)-(b) and (c)-(d) represent two different participants, in the same part of the scenario. (a)-(c) are the predictions of the XGBoost model, (b)-(d) are of the RF model.
5 Conclusions
This study aimed to introduce a solution to the information overload of C2 maps used by UAV operators. By meeting operators’ need for distilled information at the right time and in the right place, it is expected that operators will benefit more from the C2 map with less effort, and overall mission performance will improve. The study met its goals. First, a solution was achieved by introducing the three-tier Gibsonian Command and Control Map AI Filter algorithm (GiCoMAF). Then, ML models were developed and evaluated using tagged data collected in an experiment. The results of the experiment are encouraging. The ML models that emerged from the first experiment were satisfactory, and indicated high accuracy and usability of the algorithm, a step towards a solution to the information overload problem and the high workload of UAV operators. The combination of the tiers into a single algorithm is yet to be fully defined, and should be determined by an additional set of experiments we recommend performing in future studies.
The contribution of this study lies in various aspects. First, this study targets a long-neglected field of improving
Figure 10: An illustration of three possible algorithm combination approaches for Tiers I and II. The heatmap represents the field of relevance; dimmed information items represent items that would not be shown to the operator.
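The three combination approaches illustrated in Figure 10 can be sketched as show/no-show rules. The following is only a minimal sketch under stated assumptions, not the implementation of GiCoMAF: the `Item` class, the function names, the default threshold values, the scaling of both importance and relevance to [0, 1], and the linear form of the adaptive threshold are all hypothetical choices made for illustration. The paper itself only states that the threshold should be "rather low" or "rather high" depending on relevance.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Item:
    name: str
    importance: float  # hypothetical Tier I score, assumed scaled to [0, 1]
    relevance: float   # hypothetical Tier II value at the item's location, assumed in [0, 1]


def positive_correlation(item: Item, low: float = 0.2, high: float = 0.8) -> bool:
    """Show more items where relevance is high.

    Assumption: the importance threshold falls linearly from `high`
    (relevance 0) to `low` (relevance 1).
    """
    threshold = high - (high - low) * item.relevance
    return item.importance >= threshold


def negative_correlation(item: Item, low: float = 0.2, high: float = 0.8) -> bool:
    """Show fewer items where relevance is high.

    Assumption: the importance threshold rises linearly from `low`
    (relevance 0) to `high` (relevance 1).
    """
    threshold = low + (high - low) * item.relevance
    return item.importance >= threshold


def binary_relevance(item: Item, min_importance: float = 0.5,
                     min_relevance: float = 0.5) -> bool:
    """Two constant thresholds: show only important items in relevant areas."""
    return item.relevance >= min_relevance and item.importance >= min_importance


def visible_items(items: List[Item], rule: Callable[[Item], bool]) -> List[str]:
    """Return the names of the items that the given rule would show."""
    return [item.name for item in items if rule(item)]


if __name__ == "__main__":
    # Made-up items, not data from the experiment.
    demo = [
        Item("minor item near focus", importance=0.3, relevance=0.9),
        Item("major item far away", importance=0.9, relevance=0.1),
        Item("medium item near focus", importance=0.6, relevance=0.7),
        Item("minor item far away", importance=0.3, relevance=0.1),
    ]
    for rule in (positive_correlation, negative_correlation, binary_relevance):
        print(rule.__name__, visible_items(demo, rule))
```

Keeping each approach as a separate predicate makes it straightforward to swap rules per mission or organization and to tune the thresholds experimentally, in line with the evaluation process proposed above.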
the use of C2 maps by operators. Second, by using ML models in the construction of the GiCoMAF algorithm, this study utilizes AI concepts and techniques to lay the foundation of an algorithm that is targeted at improving human performance and the human cognitive aspects of workload and SA. And third, successful construction of the algorithm can improve mission performance and enhance the cognitive abilities of operators to perform spatial-temporal tasks beyond the UAV domain. Algorithms of this form can be further implemented in any domain where overloaded spatial and temporal information has to be filtered, e.g., emergency dispatching systems, information-guided surgeries, search and rescue missions, and air traffic control.
6 Acknowledgments
This research is partially funded by the “Negev” scholarship, and by the George Shrut Chair in Human Performance Management at Ben-Gurion University of the Negev. Corresponding author – Yuval Zak
References
Adams, J. A. 2015. Cognitive Task Analysis for Unmanned Aerial System Design. In Handbook of Unmanned Aerial Vehicles. Dordrecht: Springer Netherlands. 2425–2441.
Agichtein, E.; Brill, E.; Dumais, S.; and Ragno, R. 2016. Learning User Interaction Models for Predicting Web Search Result Preferences. In Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR ’06. Seattle, Washington, USA: ACM.
Azak, M., and Bayrak, A. E. 2008. A new approach for Threat Evaluation and Weapon Assignment problem, hybrid learning with multi-agent coordination. In 2008 23rd International Symposium on Computer and Information Sciences, ISCIS 2008.
Back, Y.; Zak, Y.; Parmet, Y.; and Oron-Gilad, T. 2019. Using Cognitive Work Analysis to Understand UAS Operators’ Map Display Needs. Submitted to Requirements Engineering, April 2019.
Bao, T. 2016. Swimming In Sensors, Drowning In Data – Big Data Analytics For Military Intelligence. Pointer, Journal of the Singapore Armed Forces 42(1):51–65.
Behera, A.; Keidel, A.; and Debnath, B. 2019. Context-driven Multi-stream LSTM (M-LSTM) for Recognizing Fine-Grained Activity of Drivers, volume 11269 LNCS. Springer International Publishing.
Bouaziz, M.; Morchid, M.; Dufour, R.; Linarès, G.; and De Mori, R. 2017. Parallel Long Short-Term Memory for multi-stream classification. 2016 IEEE Workshop on Spoken Language Technology, SLT 2016 - Proceedings 218–223.
Breiman, L. 2001. Random Forests. Machine Learning 45(1):5–32.
Calhoun, G. L.; Draper, M. H.; Abernathy, M. F.; Patzek, M.; and Delgado, F. 2005. Synthetic vision system for improving unmanned aerial vehicle operator. In Proc. SPIE 5802, Enhanced and Synthetic Vision 2005, volume 5802, 219–230.
Chen, T., and Guestrin, C. 2016. XGBoost: A Scalable Tree Boosting System. In KDD ’16 Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794.
Choi, S. Y., and Cha, D. 2019. Unmanned aerial vehicles using machine learning for autonomous flight; state-of-the-art. Advanced Robotics 33(6):265–277.
Dzieciuch, I.; Reeder, J.; Gutzwiller, R.; Gustafson, E.; Coronado, B.; Martinez, L.; Croft, B.; and Lange, D. S. 2017. Amplifying Human Ability through Autonomics and Machine Learning in IMPACT. In Proc. SPIE 10194, Micro- and Nanotechnology Sensors, Systems, and Applications IX.
Endsley, M. R., and Kiris, E. O. 1995. The Out-of-the-Loop Performance Problem and Level of Control in Automation. Human Factors 37(2):381–394.
Endsley, M. R. 1988. Situation awareness global assessment technique (SAGAT). Aerospace and Electronics Conference, 1988. NAECON 1988., Proceedings of the IEEE 1988 National 789–795.
Endsley, M. R. 2000. Theoretical Underpinnings of Situation Awareness: A Critical Review. In Situation Awareness Analysis and Measurement. Lawrence Erlbaum Associates. 3–32.
Everaerts, J. 2008. The Use of Unmanned Aerial Vehicles (UAVs) for Remote Sensing and Mapping. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XXXVII(Part B1):1187–1192.
Filkins, D. 2010. Operators of Drones Are Faulted in Afghan Deaths.
Friedman, J. H. 2002. Stochastic gradient boosting. Computational Statistics and Data Analysis 38(4):367–378.
Gibson, J. J., and Crooks, L. E. 1938. A Theoretical Field-Analysis of Automobile-Driving. The American Journal of Psychology 51(3):453–471.
Greene, W. H. 2012. Econometric Analysis (Seventh ed.). Boston: Pearson Education.
Guimerà, R.; Llorente, A.; Moro, E.; and Sales-Pardo, M. 2012. Predicting Human Preferences Using the Block Structure of Complex Social Networks. PLoS ONE 7(9):3–9.
Hand, D. J., and Till, R. J. 2001. A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems. Machine Learning 45(2):171–186.
Izzetoglu, K.; Ayaz, H.; Hing, J. T.; Shewokis, P. A.; Bunce, S. C.; Oh, P. Y.; and Onaral, B. 2015. UAV Operators Workload Assessment by Optical Brain Imaging Technology (fNIR). In Handbook of Unmanned Aerial Vehicles. Dordrecht: Springer Netherlands. 2475–2500.
Kanevski, M.; Parkin, R.; Pozdnukhov, A.; Timonin, V.; Maignan, M.; Demyanov, V.; and Canu, S. 2004. Environmental data mining and modeling based on machine learning algorithms and geostatistics. Environmental Modelling and Software 19(9):845–855.
Liu, Y.; Bian, J.; and Agichtein, E. 2008. Predicting Information Seeker Satisfaction in Community Question Answering. Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR ’08 483.
Marusich, L. R.; Bakdash, J. Z.; Onal, E.; Yu, M. S.; Schaffer, J.; O’Donovan, J.; Höllerer, T.; Buchler, N.; and Gonzalez, C. 2016. Effects of Information Availability on Command-and-Control Decision Making: Performance, Trust, and Situation Awareness. Human Factors: The Journal of the Human Factors and Ergonomics Society 58(2):301–321.
Morse, B. S.; Engh, C.; and Goodrich, M. A. 2010. UAV video coverage quality maps and prioritized indexing for wilderness search and rescue. In 2010 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI), 227–234. Osaka: IEEE.
Noh, S., and Jeong, U. 2010. Intelligent Command and Control Agent in Electronic Warfare Settings. International Journal of Intelligent Systems 25:514–528.
Quinlan, J. R. 1986. Induction of Decision Trees. Machine Learning 1(1):81–106.
Rapaport, A. 2015. “Quite a few Terrorists lost their lives owing to Big Data”.
Sandom, C. 2000. Operator Situational Awareness and System Safety. In IEE One Day Seminar on Systems Dependency on Humans, volume 2000, 5–5. IEE.
Shanker, T., and Richtel, M. 2011. In New Military, Data Overload Can Be Deadly.
Tibshirani, R.; Bien, J.; Friedman, J.; Hastie, T.; Simon, N.; Taylor, J.; and Tibshirani, R. J. 2012. Strong Rules for Discarding Predictors in Lasso-type Problems. Journal of the Royal Statistical Society. Series B (Methodological) 74(2):245–266.
Tibshirani, R. 1996. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological) 58(1):267–288.
Walker, S. H., and Duncan, D. B. 1967. Estimation of the probability of an event as a function of several independent variables. Biometrika 54(1 and 2):167–179.
Zak, Y.; Oron-Gilad, T.; and Parmet, Y. 2018. Operator Workload Reduced in Unmanned Aerial Vehicles: Making Command and Control (C2) Maps More Useful. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting.
Zak, Y.; Parmet, Y.; and Oron-Gilad, T. 2019a. Making Command and Control Maps More Useful for Operators of Unmanned Aerial Systems: A Preliminary Model and Empirical Approach. In HSI2019.
Zak, Y.; Parmet, Y.; and Oron-Gilad, T. 2019b. Towards the Development of a Display Filter Algorithm for Command and Control (C2) Maps for Operators of Unmanned Aerial Systems. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, CHI EA ’19. Glasgow, UK: ACM.