<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On feature selection and evaluation of transportation mode prediction strategies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mohammad Etemad</string-name>
          <email>etemad@dal.ca</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stan Matwin</string-name>
          <email>stan@cs.dal.ca</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amílcar Soares</string-name>
          <email>amilcar.soares@dal.ca</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luis Torgo</string-name>
          <email>ltorgo@dal.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Computer Science, Dalhousie University</institution>
          ,
          <addr-line>Halifax, NS</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute for Big Data Analytics, Dalhousie University</institution>
          ,
          <addr-line>Halifax, NS</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
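          <institution>Institute for Computer Science, Polish Academy of Sciences</institution>
          ,
          <addr-line>Warsaw</addr-line>
          ,
          <country country="PL">Poland</country>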
        </aff>
      </contrib-group>
      <fpage>2</fpage>
      <lpage>8</lpage>
      <abstract>
        <p>Transportation mode prediction is a fundamental task for decision making in smart cities and traffic management systems. Traffic policies based on trajectory mining can save money and time for authorities and the public. They may reduce fuel consumption and commute time and provide more pleasant moments for residents and tourists. Since the number of features that may be used to predict a user's transportation mode can be substantial, finding a subset of features that maximizes a performance measure is worth investigating. In this work, we explore a wrapper method and an information theoretical method to find the best subset of trajectory features for a transportation mode dataset. Our results were compared with two related papers that applied deep learning methods, and our approach achieved better performance. Furthermore, two types of cross-validation were investigated, and the performance results show that random cross-validation may provide overestimated results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Trajectory mining is an active research topic since positioning devices
are now used to track people, vehicles, vessels, natural
phenomena, and animals. Its applications include, but are not limited to,
transportation mode detection [
        <xref ref-type="bibr" rid="ref3 ref31 ref33 ref6 ref7">3, 6, 7, 31, 33</xref>
        ], fishing detection
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], tourism [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], vessel monitoring [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and animal behaviour
analysis [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. A number of topics in this field also
need further investigation, such as high-performance
trajectory classification methods [
        <xref ref-type="bibr" rid="ref20 ref3 ref31 ref33 ref6">3, 6, 20, 31, 33</xref>
        ], accurate trajectory
segmentation methods [
        <xref ref-type="bibr" rid="ref28 ref30 ref34">28, 30, 34</xref>
        ], trajectory similarity and
clustering [
        <xref ref-type="bibr" rid="ref10 ref17">10, 17</xref>
        ], dealing with trajectory uncertainty [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], active
learning [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ], and semantic trajectories [
        <xref ref-type="bibr" rid="ref2 ref22 ref24">2, 22, 24</xref>
        ]. These topics
are closely related, and solving one of them requires, to some
extent, exploring the others.
      </p>
      <p>As one of the trajectory mining applications, transportation
mode prediction is a fundamental task for decision making in
smart cities and traffic management systems. Traffic policies that
are designed based on trajectory mining can save money and time</p>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORKS</title>
      <p>
        Feature engineering is an essential part of building a learning
algorithm. Some algorithms artificially extract features
using representation learning methods; other
studies select a subset of the handcrafted features. Both
approaches have advantages such as faster learning, less storage space,
improved learning performance, and more general
models [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The two approaches differ in two
respects. First, artificially extracting features generates a new set
of features by learning, while feature selection chooses a subset
of existing handcrafted ones. Second, selecting handcrafted
features yields more readable and interpretable models than
artificially extracting features [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. This work focuses on the
handcrafted feature selection task.
      </p>
      <p>
        Feature selection methods can be categorized into three
general groups: filter methods, wrapper methods, and embedded
methods [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Filter methods are independent of the learning
algorithm. They select features based on the nature of data regardless
of the learning algorithm [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. On the other hand, wrapper
methods are based on a kind of search, such as sequential, best first, or
branch and bound, to find the best subset that gives the highest
score on a selected learning algorithm [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The embedded
methods apply both filter and wrapper [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Feature selection methods
can be grouped based on the type of data as well. The feature
selection methods that use the assumption of i.i.d. (independent
and identically distributed) are conventional feature selection
methods [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], such as Laplacian methods [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and spectral feature
selection methods [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ]. They are not designed to handle
heterogeneous or auto-correlated data. Some feature selection methods
have been introduced to handle heterogeneous data and stream
data, most of which work on graph structures, such as [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        Conventional feature selection methods are categorized into four
groups: similarity-based methods like Laplacian methods [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ],
information theoretical methods [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], sparse learning methods such
as [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], and statistical methods like chi-square [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ].
Similarity-based feature selection approaches are independent of the
learning algorithm, and most of them cannot handle feature
redundancy or correlation between features [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Likewise, statistical
methods like chi-square cannot handle feature redundancy, and
they need some discretization strategy [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. The statistical
methods are also not effective in high-dimensional spaces [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Since
our data is not sparse, and sparse learning methods need to
overcome the complexity of optimization methods, they were
not candidates for our experiments. On the other hand,
information theoretical methods can handle both feature relevance and
redundancy [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Furthermore, the selected features can be
generalized across learning tasks. However, information gain, which is the core
of information theoretical methods, assumes that samples are
independently and identically distributed. Finally, the wrapper
method only sees the score of the learning algorithm and tries to
maximize it.
      </p>
      <p>
        The most common evaluation metric reported in the related
work is the accuracy of the models. Therefore, we use
accuracy to compare our work with others from the literature.
Since the data is imbalanced, we report the f-score as well
to give equal importance to precision and recall. Although
most of the related work reports accuracy, it
is calculated using different methods, including random
cross-validation, cross-validation that divides users, cross-validation
that mixes users, and a simple division into training and test sets
without cross-validation. The latter is a weak method that is used
only in [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ]. Random (conventional)
cross-validation was applied in [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], and [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ] mixed the
training and test sets according to users so that 70% of the trajectories
of a user go to the training set and the rest go to the test set. Only
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] performed cross-validation by dividing users between the
training and test sets. Because trajectory data has
spatial and temporal dimensions, and users can be placed in the
same semantic hierarchical structure (such as students, workers,
visitors, and teachers), a conventional cross-validation method
could provide overestimated results, as studied in [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>NOTATIONS AND DEFINITIONS</title>
      <p>Definition 3.1. A trajectory point is l<sub>i</sub> ∈ L, where l<sub>i</sub> = (x<sub>i</sub>, y<sub>i</sub>, t<sub>i</sub>),
x<sub>i</sub> is the longitude, which varies from 0° to ±180°, y<sub>i</sub> is the latitude,
which varies from 0° to ±90°, t<sub>i</sub> (t<sub>i</sub> &lt; t<sub>i+1</sub>) is the capturing
time of the moving object, and L is the set of all trajectory points.</p>
      <p>
        Each trajectory point can be assigned features that
describe different attributes of the moving object at a specific
time-stamp and location. The time-stamp and location are two
dimensions that make trajectory points spatio-temporal data with
two important properties: (i) auto-correlation and (ii)
heterogeneity [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. These properties make conventional cross-validation
less suitable [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ].
      </p>
      <p>Definition 3.2. A raw trajectory, or simply a trajectory τ, is
a sequence of trajectory points captured through time, where
τ = (l<sub>i</sub>, l<sub>i+1</sub>, ..., l<sub>n</sub>), l<sub>i</sub> ∈ L, i ≤ n.</p>
      <p>Definition 3.3. A sub-trajectory is one of the consecutive
subsequences of a raw trajectory generated by splitting the raw
trajectory into two or more sub-trajectories.</p>
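      <p>For example, if we have one split point, k, and τ<sub>1</sub> is a raw
trajectory, then s<sub>1</sub> = (l<sub>i</sub>, l<sub>i+1</sub>, ..., l<sub>k</sub>) and s<sub>2</sub> = (l<sub>k+1</sub>, l<sub>k+2</sub>, ..., l<sub>n</sub>)
are two sub-trajectories generated from τ<sub>1</sub>.</p>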
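      <p>The split described in this example can be sketched in a few lines of Python. This is only an illustration of Definitions 3.1 to 3.3: the names TrajectoryPoint and split_trajectory are hypothetical and are not taken from the authors' code, and split points are assumed to be given as 0-based indices of the last point of each sub-trajectory.</p>
      <preformat><![CDATA[
```python
from typing import List, NamedTuple, Sequence


class TrajectoryPoint(NamedTuple):
    """A trajectory point l_i = (x_i, y_i, t_i) as in Definition 3.1."""
    x: float  # longitude in degrees
    y: float  # latitude in degrees
    t: float  # capture time (any increasing numeric time stamp)


def split_trajectory(points: Sequence[TrajectoryPoint],
                     split_indices: Sequence[int]) -> List[List[TrajectoryPoint]]:
    """Split a raw trajectory after each index in split_indices.

    Assumption: every split index k is 0-based and marks the last point
    of a sub-trajectory, so the next sub-trajectory starts at k + 1.
    """
    segments, start = [], 0
    for k in sorted(set(split_indices)):
        if not 0 <= k < len(points) - 1:
            raise ValueError("split index must leave at least one point on each side")
        segments.append(list(points[start:k + 1]))
        start = k + 1
    segments.append(list(points[start:]))
    return segments
```
]]></preformat>
      <p>With a single split point the function returns the two consecutive sub-trajectories of the example; with several split points it returns one more sub-trajectory than there are split points.</p>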
      <p>Definition 3.4. The process of generating sub-trajectories from
a raw trajectory is called segmentation.</p>
      <p>
        We first segmented the raw trajectories by day and then
partitioned the data using the transportation mode
annotations. This approach is also used in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>Definition 3.5. A point feature is a measured value F<sup>p</sup>, assigned
to each trajectory point of a sub-trajectory s.</p>
      <p>
        <disp-formula><label>(1)</label><tex-math>F^{p} = (f_i, f_{i+1}, \ldots, f_n)</tex-math></disp-formula>
      </p>
      <p>Notation 1 shows the feature F<sup>p</sup> for sub-trajectory s. For example,
speed can be a point feature since we can calculate the speed
of a moving object for each trajectory point. Since we need two
trajectory points to calculate speed, we assume the speed of the
first trajectory point is equal to the speed of the second trajectory
point.</p>
      <p>Definition 3.6. A trajectory feature is a measured value F<sub>t</sub>,
assigned to a sub-trajectory s.</p>
      <p>
        <disp-formula><label>(2)</label><tex-math>F_t = \frac{\sum_{k} f_k}{n}</tex-math></disp-formula>
      </p>
      <p>Equation 2 shows the feature F<sub>t</sub> for sub-trajectory s. For
example, the mean speed can be a trajectory feature since we can
calculate the mean speed of a moving object for a sub-trajectory.</p>
      <p>F<sub>t</sub><sup>p</sup> denotes all trajectory features that are generated
using point feature p. For example, F<sub>t</sub><sup>speed</sup> represents all the
trajectory features derived from the speed point feature. Moreover,
F<sub>mean</sub><sup>speed</sup> denotes the mean trajectory feature derived from
the speed point feature.</p>
    </sec>
    <sec id="sec-4">
      <title>THE FRAMEWORK</title>
      <p>In this section, the eight steps of the framework
are explained (Figure 1). The first step groups the trajectory
points by trajectory id to create daily sub-trajectories
(segmentation). Sub-trajectories with fewer than ten trajectory points were
discarded to avoid generating low-quality trajectories.</p>
      <p>
        Point features including speed, acceleration, bearing, jerk,
bearing rate, and the rate of the bearing rate were generated
in step two. The features speed, acceleration, and bearing were
first introduced in [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ], and jerk was proposed in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The very
first point feature that we generated was duration, the
time difference between two consecutive trajectory points. This feature gives
us essential information, including some of the segmentation
position points and signal-loss points, and is used to calculate
point features such as speed and acceleration. The distance was
calculated using the haversine formula. Having duration and
distance as two point features, we calculate speed, acceleration,
and jerk using Equations 3, 4, and 5, respectively. A function to
calculate the bearing (B) between two consecutive points was
also implemented and is detailed in Equation 6, where (ϕ<sub>i</sub>, λ<sub>i</sub>) is
the start point and (ϕ<sub>i+1</sub>, λ<sub>i+1</sub>) is the end point, with latitude ϕ and longitude λ.
      </p>
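      <p>
        <disp-formula><label>(3)</label><tex-math>S_i = \frac{\mathit{Distance}_i}{\mathit{Duration}_i}</tex-math></disp-formula>
      </p>
      <p>
        <disp-formula><label>(4)</label><tex-math>A_{i+1} = \frac{S_{i+1} - S_i}{\Delta t}</tex-math></disp-formula>
      </p>
      <p>
        <disp-formula><label>(5)</label><tex-math>J_{i+1} = \frac{A_{i+1} - A_i}{\Delta t}</tex-math></disp-formula>
      </p>
      <p>
        <disp-formula><label>(6)</label><tex-math>B_{i+1} = \operatorname{atan2}\bigl(\sin(\lambda_{i+1} - \lambda_i)\cos\phi_{i+1},\; \cos\phi_i \sin\phi_{i+1} - \sin\phi_i \cos\phi_{i+1} \cos(\lambda_{i+1} - \lambda_i)\bigr)</tex-math></disp-formula>
      </p>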
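      <p>A minimal vectorized sketch of Equations 3 to 6 with NumPy is given below. It is not the authors' TrajLib implementation: the function names, the Earth radius constant, the conversion of the bearing to degrees in [0, 360), and the choice to copy the second value into the first position of every point feature (the text states this only for speed) are assumptions made for illustration. Time stamps are assumed to be strictly increasing, so no duration is zero.</p>
      <preformat><![CDATA[
```python
import numpy as np

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres; inputs are arrays in degrees."""
    phi1, phi2 = np.radians(lat1), np.radians(lat2)
    dphi = phi2 - phi1
    dlam = np.radians(lon2) - np.radians(lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(phi1) * np.cos(phi2) * np.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Bearing from point 1 to point 2 (Equation 6), in degrees in [0, 360)."""
    phi1, phi2 = np.radians(lat1), np.radians(lat2)
    dlam = np.radians(lon2) - np.radians(lon1)
    y = np.sin(dlam) * np.cos(phi2)
    x = np.cos(phi1) * np.sin(phi2) - np.sin(phi1) * np.cos(phi2) * np.cos(dlam)
    return np.degrees(np.arctan2(y, x)) % 360.0


def pad_first(values):
    """Repeat the first value so the result has one entry per trajectory point."""
    return np.concatenate(([values[0]], values))


def point_features(lat, lon, t):
    """Point features of one sub-trajectory (needs at least two points)."""
    lat, lon, t = (np.asarray(a, dtype=float) for a in (lat, lon, t))
    duration = pad_first(np.diff(t))
    distance = pad_first(haversine_m(lat[:-1], lon[:-1], lat[1:], lon[1:]))
    speed = distance / duration                               # Equation 3
    acceleration = pad_first(np.diff(speed) / duration[1:])   # Equation 4
    jerk = pad_first(np.diff(acceleration) / duration[1:])    # Equation 5
    bearing = pad_first(bearing_deg(lat[:-1], lon[:-1], lat[1:], lon[1:]))
    return {
        "duration": duration,
        "distance": distance,
        "speed": speed,
        "acceleration": acceleration,
        "jerk": jerk,
        "bearing": bearing,
    }
```
]]></preformat>
      <p>All operations act on whole arrays, so no Python-level loop over trajectory points is needed. Speeds are in metres per second when the time stamps are in seconds.</p>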
      <p>
        Two new features were introduced in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], named bearing rate,
and the rate of the bearing rate. Applying equation 7, we
computed the bearing rate.
      </p>
      <p>
        <disp-formula><label>(7)</label><tex-math>B_{rate(i+1)} = \frac{B_{i+1} - B_i}{\Delta t}</tex-math></disp-formula>
      </p>
      <p>B<sub>i</sub> and B<sub>i+1</sub> are the bearing point feature values at points
i and i + 1, and ∆t is the time difference. The rate of the bearing
rate point feature is computed using Equation 8. Since extensive
calculations are done on trajectory points, an
efficient way to evaluate all these equations for each trajectory was necessary.
Therefore, the code was written in a vectorized manner in the Python
programming language, which is faster than other Python
implementations of the bearing calculation available online. It may be possible to
gain more performance using other languages such as C/C++.</p>
      <p>
        <disp-formula><label>(8)</label><tex-math>B_{rrate(i+1)} = \frac{B_{rate(i+1)} - B_{rate(i)}}{\Delta t}</tex-math></disp-formula>
      </p>
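      <p>The bearing rate and the rate of the bearing rate (Equations 7 and 8) can be sketched in the same vectorized style. The function name bearing_rates is hypothetical, and two choices are assumptions rather than details given in the text: the first value of each series is copied from the second, and the raw difference of consecutive bearings is used without handling the wrap-around at 360 degrees.</p>
      <preformat><![CDATA[
```python
import numpy as np


def bearing_rates(bearing, t):
    """Return (bearing rate, rate of the bearing rate) for one sub-trajectory."""
    bearing = np.asarray(bearing, dtype=float)
    dt = np.diff(np.asarray(t, dtype=float))  # time difference between points

    b_rate = np.empty_like(bearing)
    b_rate[1:] = np.diff(bearing) / dt        # Equation 7
    b_rate[0] = b_rate[1]                     # assumed padding of the first point

    b_rrate = np.empty_like(bearing)
    b_rrate[1:] = np.diff(b_rate) / dt        # Equation 8
    b_rrate[0] = b_rrate[1]                   # assumed padding of the first point
    return b_rate, b_rrate
```
]]></preformat>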
      <p>After calculating the point features for each trajectory, the
trajectory features were extracted in step three. Trajectory features
were divided into two types: global trajectory
features and local trajectory features. Global features, namely the
minimum, maximum, mean, median, and standard deviation,
summarize information about the whole trajectory, while local
trajectory features, the percentiles (10, 25, 50, 75, and 90), describe
a behavior related to part of a trajectory. The local trajectory
features extracted in this work were the percentiles of every
point feature. Five different global trajectory features were used
in the models tested in this work. In summary, we computed
70 trajectory features (10 statistical measures, five
global and five local, calculated for 7 point features)
for each sample trajectory. In step 4, two feature selection
approaches were performed: wrapper search and information
theoretical feature importance. According to the best accuracy results
on the development set, a subset of the top 19 features was selected in
step 5. The code implementation of all these steps is available at
https://github.com/metemaad/TrajLib.</p>
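      <p>The ten statistical measures per point feature can be sketched as follows. The function name, the feature naming scheme, and the use of the population standard deviation are assumptions for illustration; the authors' TrajLib code may differ.</p>
      <preformat><![CDATA[
```python
import numpy as np

PERCENTILES = (10, 25, 50, 75, 90)  # local trajectory features


def trajectory_features(point_features):
    """Map {point feature name: 1-D array} to a flat dict of trajectory features."""
    features = {}
    for name, values in point_features.items():
        v = np.asarray(values, dtype=float)
        # Five global features summarising the whole sub-trajectory.
        features[f"{name}_min"] = v.min()
        features[f"{name}_max"] = v.max()
        features[f"{name}_mean"] = v.mean()
        features[f"{name}_median"] = np.median(v)
        features[f"{name}_std"] = v.std()  # population std (assumption)
        # Five local features: percentiles of the point feature.
        for q, value in zip(PERCENTILES, np.percentile(v, PERCENTILES)):
            features[f"{name}_p{q}"] = value
    return features
```
]]></preformat>
      <p>Each point feature contributes ten values, so seven point features yield the 70 trajectory features mentioned above.</p>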
      <p>
        In step 6, the framework optionally removes noise from the
data; we ran the experiments both with and without
this step. Finally, we normalized the features (step 7) using the
Min-Max normalization method to avoid saturation. This
method transforms features to the same range while preserving the
relationship between the values, and improves the quality of the
classification process [
classification process [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Another possible method is Z-normalization;
however, finding the best normalization method was out
of the scope of this work.
      </p>
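      <p>Min-Max normalization rescales every feature column to the range [0, 1]. The sketch below uses hypothetical helper names and makes two assumptions that the text does not state: the minimum and range are estimated on the training data only and then reused for other data, and constant columns are mapped to 0.</p>
      <preformat><![CDATA[
```python
import numpy as np


def fit_min_max(X):
    """Return the per-column minimum and range of a 2-D feature matrix."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant column: avoid division by zero
    return lo, span


def apply_min_max(X, lo, span):
    """Rescale X with a previously fitted minimum and range."""
    return (np.asarray(X, dtype=float) - lo) / span
```
]]></preformat>
      <p>Values outside the fitted range, which can occur in unseen data, fall outside [0, 1]; whether to clip them is a separate design decision.</p>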
    </sec>
    <sec id="sec-5">
      <title>EXPERIMENTS</title>
      <p>
        In this section, we detail the four experiments performed in
this work. We used the GeoLife dataset [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ]. This
dataset has 5,504,363 GPS records collected by 69 users, and is
labeled with eleven transportation modes: taxi (4.41%); car (9.40%);
train (10.19%); subway (5.68%); walk (29.35%); airplane (0.16%);
boat (0.06%); bike (17.34%); run (0.03%); motorcycle (0.006%); and
bus (23.33%). The two primary sources of uncertainty in the Geolife
dataset are device error and human error. Device inaccuracy can be
categorized into two major groups: systematic errors and random errors
[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. A systematic error occurs when the recording device
cannot find enough satellites to provide precise data. A random
error can happen because of atmospheric and ionospheric effects.
Furthermore, the data annotation was done after
each tracking, as [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] explained in the Geolife dataset
documentation. As humans, we are all prone to errors when providing precise
information; it is possible that some users forgot to annotate the
trajectory when they switched from one transportation mode to
another. For example, a change in the speed pattern within a labeled
segment might indicate such a human error.
      </p>
      <p>Moreover, we divided the data into two folds: a development set and
a validation set. The two folds were divided in such a way that each user
appears either in the development set or in the validation set. Therefore,
there is no overlap in terms of users. This division is applied for
user-oriented cross-validation. We divided the validation fold into
five folds to perform cross-validation and used this fold to compare
our results with related work.</p>
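      <p>A user-oriented split keeps all samples of a user in the same fold. The sketch below shows one simple way to build such folds; the function names and the round-robin assignment of shuffled users to folds are assumptions, not the authors' procedure. Comparable behaviour is available in scikit-learn through GroupKFold.</p>
      <preformat><![CDATA[
```python
import random


def user_oriented_folds(user_ids, n_folds=5, seed=0):
    """Return one fold index per sample so that each user is in exactly one fold."""
    users = sorted(set(user_ids))
    random.Random(seed).shuffle(users)
    # Round-robin assignment of shuffled users to folds (assumption).
    fold_of_user = {user: i % n_folds for i, user in enumerate(users)}
    return [fold_of_user[user] for user in user_ids]


def train_test_indices(folds, test_fold):
    """Sample indices of the training part and of the held-out fold."""
    train = [i for i, f in enumerate(folds) if f != test_fold]
    test = [i for i, f in enumerate(folds) if f == test_fold]
    return train, test
```
]]></preformat>
      <p>Because folds are formed over users rather than samples, fold sizes can be unequal when users contribute different numbers of trajectories.</p>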
      <p>
        The best classifier using default input parameters
(Section 5.1) was found in our first experiment (see the scikit-learn
documentation<sup>1</sup> for the classifiers' default parameter values).
Tuning the classifiers' parameters may lead to a better
classifier, but a grid search is expensive and does not change the
framework. In our second experiment (Section 5.2), the wrapper
and information theoretical methods are used to search for the best
subset of our 70 features for the transportation mode prediction
task. The third experiment (Section 5.3) is a comparison between
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and our implementation. In the last experiment
(Section 5.4), the type of cross validation was investigated.
      </p>
      <p>In order to avoid using non-parametric statistical tests, we
repeated the experiments with different seeds and collected more
than 30 samples for performing the statistical tests. According to the
central limit theorem, we can assume that these samples follow a
normal distribution. Therefore, t-test results are reported.</p>
    </sec>
    <sec id="sec-6">
      <title>Classifier selection</title>
      <p>
        In this experiment, we investigated which of six classifiers
is the best. The experimental setting uses conventional
cross-validation to perform the transportation mode
prediction task presented in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. XGBoost, SVM, decision tree, random
forest, neural network, and AdaBoost are six classifiers that were
applied in the reviewed literature [
        <xref ref-type="bibr" rid="ref31 ref33 ref35 ref7">7, 31, 33, 35</xref>
        ].<sup>2</sup> The dataset is
filtered based on the labels that were applied in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] (e.g.,
walking, train, bus, bike, driving), and no noise removal method was
applied. The classifiers mentioned above were trained, and the
accuracy metric was calculated using random cross-validation
similar to [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], and [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This experiment was repeated for
eight randomly selected seeds (8, 65, 44, 7, 99, 654, 127, 653) to
generate more than 30 result samples, which makes it safe to assume
a normal distribution for the results based on the central limit theorem.
The results of cross-validation accuracy, presented in Figure 2,
show that the random forest performs better than the other models
(µ<sub>accuracy</sub> = 0.8189, σ = 0.10%) on the development set.
        <fn><label>1</label><p>https://scikit-learn.org/stable/supervised_learning.html#supervised-learning</p></fn>
        <fn><label>2</label><p>Available at https://github.com/metemaad/trajpred</p></fn>
      </p>
      <p>The results of cross-validation f-score, presented in Figure 3,
show that the random forest performs better than the other models
(µ<sub>f1</sub> = 0.8179, σ = 0.12%) on the development set.</p>
      <p>The second best model was XGBoost (µ<sub>accuracy</sub> = 0.8245, σ =
0.11%). A paired
t-test indicated that the random forest classifier results were
not statistically significantly higher than the XGBoost classifier
results; however, since XGBoost has a higher variance than random forest,
we decided to rank random forest first. On the other hand,
paired t-tests indicated that the random forest classifier results
were statistically significantly higher than the SVM, decision tree,
neural network, and AdaBoost classifier results.</p>
    </sec>
    <sec id="sec-7">
      <title>Feature selection using wrapper and information theoretical methods</title>
      <p>The second experiment aims to select the best features for the
transportation mode prediction task on the Geolife dataset.</p>
      <p>We selected one method from the filter category, the
information theoretical method, to see the effect of the heterogeneity of the
data on the feature selection method. Another method was selected
from the wrapper category, the full search wrapper method.
Filter methods suffer from the i.i.d. assumption, while
wrapper methods do not. Therefore, comparing these two methods
shows the importance of taking into account the heterogeneity
of the features of trajectory data.</p>
      <p>
        We selected the wrapper feature selection method because
it can be used with any classifier. Using this approach, we first
defined an empty set of selected features. Then, we evaluated
all the trajectory features one by one to find the best feature to
append to the selected feature set. The maximum accuracy score
was the criterion for selecting the best feature to append.
Afterwards, we removed the selected feature from the set of candidate
features and repeated the search for the union of the selected features
and each remaining candidate feature. We used the
labels applied in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and the same cross-validation technique.
      </p>
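      <p>The wrapper search described above is a greedy forward selection, which can be sketched independently of the classifier. In the sketch below, the names forward_selection and best_subset are hypothetical, and the score argument stands for any function that returns a performance value for a list of feature names; in the setting of this paper it would be the cross-validated accuracy of the chosen classifier, which is not implemented here.</p>
      <preformat><![CDATA[
```python
from typing import Callable, List, Sequence, Tuple


def forward_selection(features: Sequence[str],
                      score: Callable[[List[str]], float]) -> List[Tuple[List[str], float]]:
    """Greedy wrapper search.

    In every round, each remaining feature is tried together with the
    already selected ones, and the candidate with the highest score is kept.
    Returns the selected subset and its score after every round.
    """
    selected: List[str] = []
    remaining = list(features)
    history = []
    while remaining:
        best_feature, best_score = None, float("-inf")
        for candidate in remaining:
            s = score(selected + [candidate])
            if s > best_score:
                best_feature, best_score = candidate, s
        selected.append(best_feature)
        remaining.remove(best_feature)
        history.append((list(selected), best_score))
    return history


def best_subset(history):
    """Subset with the highest score among all rounds (first one on ties)."""
    return max(history, key=lambda item: item[1])[0]
```
]]></preformat>
      <p>With d candidate features the search evaluates d(d + 1)/2 subsets, so for 70 features the scoring function is called 2,485 times; this is why the cost of each cross-validation run matters.</p>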
      <p>The results are shown in Figure 4. The results of this method
suggest that the top 19 features achieve the highest accuracy.
Therefore, we selected this subset as the best subset for classification
purposes using the random forest algorithm.</p>
      <p>
        Information theoretical feature selection is one of the
methods widely used to select essential features. Random forest is a
classifier that has embedded feature selection using information
theoretical metrics. We calculated the feature importance using
random forest. Then, each feature was appended to the selected
feature set in order of importance, and the accuracy score of the random forest
classifier was calculated. User-oriented cross-validation was used here, and
the target labels are similar to [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Figure 5 shows the results of
cross-validation for appending features with respect to the
importance rank suggested by the random forest. We chose the wrapper
approach results since it produces a statistically significantly higher
accuracy score.
      </p>
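      <p>The importance-ranked procedure can be sketched in the same classifier-independent way. The function name ranked_selection is hypothetical; the importance argument is assumed to map feature names to importance values (for example, those exposed by the feature_importances_ attribute of a fitted scikit-learn random forest), and score is again a placeholder for the cross-validated accuracy.</p>
      <preformat><![CDATA[
```python
def ranked_selection(importance, score):
    """Score the top-k features for k = 1..d, in decreasing order of importance."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    return [(ranked[:k], score(ranked[:k])) for k in range(1, len(ranked) + 1)]
```
]]></preformat>
      <p>Unlike the wrapper search, the order of the features is fixed once by the importance ranking, so only d subsets are evaluated.</p>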
    </sec>
    <sec id="sec-8">
      <title>Comparison with the related work</title>
      <p>
        In this third experiment, we kept only the transportation modes that
were used by [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] for evaluation. We divided the validation
fold into training and test folds in such a way that each user
appears in either the training or the test fold, but not both. The top 19
features, the best feature subset identified in Section 5.2, were
used in this experiment. With this user-based division,
approximately 80% of the data was used for training and 20% of the
data for testing.
      </p>
      <p>
        We selected [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] because this is the only paper that divided
the dataset in a way that isolated users in the training and test sets.
Moreover, this research applied handcrafted features and
interpretable classifiers, while [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] did not isolate users and used
representation learning features. Therefore, these two studies
are at the two ends of the spectrum, and comparing our results with
theirs may provide insights for validating our results.
      </p>
      <p>
        We assume the Bayes error is the minimum possible error
and human error is close to the Bayes error [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. Avoidable bias
is defined as the difference between the training error and the
human error. Achieving performance close to human
performance in each task is the primary objective of the research.
Recent advancements in deep learning have achieved
performance levels that even exceed human performance
on some tasks because of the use of large samples and careful
cleaning of the data. However, “we cannot do better than
bayes error unless we are overfitting” [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. Having noise in GPS
data and human error, as we discussed, suggests that the avoidable
bias is more than five percent. This reasoning was our basis for
excluding papers that reported more than 95% accuracy.
      </p>
      <p>
        Thus, we compare our per-segment accuracy results, repeated
for 8 different seeds, against the mean accuracy of [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], 67.9%. A
one-sample t-test indicated that our accuracy results (70.97%) are
statistically significantly higher than the results of [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
(67.9%), p=0.0182.
      </p>
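      <p>The one-sample t-test compares the mean of repeated accuracy measurements with a single reference value. The sketch below computes only the t statistic and the degrees of freedom with the Python standard library; the function name is hypothetical, and the p-value would additionally require the cumulative distribution function of Student's t distribution, which is available in statistical libraries such as SciPy.</p>
      <preformat><![CDATA[
```python
import math
import statistics


def one_sample_t(samples, reference):
    """Return (t statistic, degrees of freedom) for H0: mean(samples) == reference."""
    n = len(samples)
    mean = statistics.fmean(samples)
    sd = statistics.stdev(samples)  # sample standard deviation (n - 1 in the denominator)
    t = (mean - reference) / (sd / math.sqrt(n))
    return t, n - 1
```
]]></preformat>
      <p>A large positive t statistic indicates that the measured accuracies lie above the reference value by more than their sampling variability would explain.</p>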
      <p>
        The label set in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is walking, train, bus, bike,
taxi, subway, and car, where taxi and car are merged into a class
called driving, and subway and train are merged into
the train class. We filtered the Geolife data to get the same
subsets as reported in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Then, we randomly selected
80% of the data as the training set and the rest as the test set,
applied five-fold cross-validation, and repeated this for 8 different
seeds. The best subset of features was the same as in the previous
experiment (Section 5.2). Running the random forest classifier
with 50 estimators, using the scikit-learn implementation [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], results
in a mean accuracy of 87.16% for the five-fold cross-validation.
A one-sample t-test indicated that our accuracy results (87.16%)
are statistically significantly higher than the results of [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
(84.8%), p=2.27e-12.
      </p>
      <p>We avoided using the noise removal method in the above
experiment because, in practice, we do not have access to the labels of
the test dataset, and using this method would only increase our accuracy
unrealistically.</p>
    </sec>
    <sec id="sec-9">
      <title>Effects of types of cross-validation</title>
      <p>To visualize the effect of the type of cross-validation on the
transportation mode prediction task, we set up a controlled experiment.
We used the same classifiers and the same features to calculate the
cross-validation accuracy on the whole dataset. Only the type of
cross-validation differs in this experiment: one is random
and the other is user-oriented cross-validation. Figure 6 shows that
there is a considerable difference between the cross-validation
accuracy results of user-oriented cross-validation and random
cross-validation.</p>
      <p>Furthermore, Figure 7 shows that there is a considerable
difference between the cross-validation f-score results of user-oriented
cross-validation and random cross-validation.</p>
      <p>These results indicate that random cross-validation provides
overestimated accuracy and f-score results. Since the
correlation between user-oriented cross-validation results is lower than
that of random cross-validation, proposing a specific cross-validation
method for evaluating transportation mode prediction is a
topic that needs attention.</p>
      <p>In this work, we reviewed some recent transportation mode
prediction methods and feature selection methods. We proposed
a framework for transportation mode prediction, and four
experiments were conducted to cover different aspects of transportation
mode prediction.</p>
      <p>First, the performance of six recently used classifiers for the
transportation modes prediction was evaluated. The results showed
that the random forest classifier performs the best among all the
evaluated classifiers. The SVM was the worst classifier, and the
accuracy result of XGBoost was competitive with the random
forest classifier.</p>
      <p>In the second experiment, the effect of features was evaluated using two
different approaches: the wrapper method and the information-theoretical
method. The wrapper method shows that
we can achieve the highest accuracy using the top 19 features.
Both approaches suggest that F<sub>p90</sub><sup>speed</sup> (the percentile 90 of
the speed as defined in Section 3) is the most essential feature
among all 70 introduced features. This feature is robust to noise
since the outlier values do not contribute to the calculation of
percentile 90.</p>
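      <p>The robustness argument can be checked on a toy example. The sketch below assumes a nearest-rank definition of the percentile, which may differ from the interpolation used for the features in Section 3; the speed values and the function name percentile_nearest_rank are hypothetical.</p>
      <preformat preformat-type="code">
```python
def percentile_nearest_rank(values, p):
    """Return the p-th percentile (integer p in 1..100) using the nearest-rank rule."""
    ordered = sorted(values)
    # Ceiling of p * n / 100 in integer arithmetic, used as a 1-based rank.
    rank = -(-p * len(ordered) // 100)
    return ordered[rank - 1]


# Hypothetical point speeds (m/s) of one segment.
clean_speeds = [round(1.0 + 0.1 * i, 1) for i in range(20)]
# Same segment with the last point replaced by a GPS-jump outlier.
noisy_speeds = clean_speeds[:-1] + [150.0]

p90_clean = percentile_nearest_rank(clean_speeds, 90)
p90_noisy = percentile_nearest_rank(noisy_speeds, 90)
print("p90:", p90_clean, p90_noisy)
print("max:", max(clean_speeds), max(noisy_speeds))
```
      </preformat>
      <p>In this example the 90th percentile is identical for both lists, whereas the maximum is dominated by the single outlier.</p>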
      <p>
        In the third experiment, the best model was compared with
the results reported in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The results show that our
suggested model achieved a higher accuracy. Our applied features
are readable and interpretable in comparison to [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and our model
has a lower computational cost.
      </p>
      <p>Finally, we investigated the effects of user-oriented cross-validation
and random cross-validation in the last experiment. The results
showed that random cross-validation provides overestimated
results in terms of the analyzed performance measures.</p>
      <p>We intend to extend this work in many directions. The
spatiotemporal characteristics of trajectory data (e.g., autocorrelation
and heterogeneity) are not taken into account in most works in the
literature. We plan to fine-tune the classification models with
grid search and automatic methods (e.g., genetic algorithms, racing
algorithms, and meta-learning). We also intend to investigate in depth
the effects of cross-validation and other strategies like
holdout in trajectory data. Finally, space and time dependencies
can also be explored to tailor features for transportation means
prediction.</p>
    </sec>
    <sec id="sec-10">
      <title>ACKNOWLEDGMENTS</title>
      <p>The authors would like to thank NSERC (Natural Sciences and
Engineering Research Council of Canada) for financial support.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Gowtham</given-names>
            <surname>Atluri</surname>
          </string-name>
          , Anuj Karpatne, and
          <string-name>
            <given-names>Vipin</given-names>
            <surname>Kumar</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Spatio-Temporal Data Mining: A Survey of Problems and Methods</article-title>
          .
          <source>arXiv arXiv:1711.04710</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Vania</given-names>
            <surname>Bogorny</surname>
          </string-name>
          , Chiara Renso, Artur Ribeiro de Aquino, Fernando de Lucca Siqueira, and Luis Otavio Alvares.
          <year>2014</year>
          .
          <article-title>Constant-a conceptual data model for semantic trajectories of moving objects</article-title>
          .
          <source>Transactions in GIS</source>
          <volume>18</volume>
          ,
          <issue>1</issue>
          (
          <year>2014</year>
          ),
          <fpage>66</fpage>
          -
          <lpage>88</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Sina</given-names>
            <surname>Dabiri</surname>
          </string-name>
          and
          <string-name>
            <given-names>Kevin</given-names>
            <surname>Heaslip</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Inferring transportation modes from GPS trajectories using a convolutional neural network</article-title>
          .
          <source>Transportation Research Part C: Emerging Technologies</source>
          <volume>86</volume>
          (
          <year>2018</year>
          ),
          <fpage>360</fpage>
          -
          <lpage>371</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Erico N</given-names>
            <surname>de Souza</surname>
          </string-name>
          , Kristina Boerder, Stan Matwin, and
          <string-name>
            <given-names>Boris</given-names>
            <surname>Worm</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Improving fishing pattern detection from satellite AIS using data mining and machine learning</article-title>
          .
          <source>PloS one</source>
          <volume>11</volume>
          ,
          <issue>7</issue>
          (
          <year>2016</year>
          ),
          <elocation-id>e0158248</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Renata</given-names>
            <surname>Dividino</surname>
          </string-name>
          , Amilcar Soares, Stan Matwin, Anthony W Isenor, Sean Webb, and
          <string-name>
            <given-names>Matthew</given-names>
            <surname>Brousseau</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Semantic Integration of Real-Time Heterogeneous Data Streams for Ocean-related Decision Making</article-title>
          .
          <source>In Big Data and Artificial Intelligence for Military Decision Making</source>
          . STO. https://doi.org/10.14339/STO-MP-IST-160-S1-3-PDF
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Yuki</given-names>
            <surname>Endo</surname>
          </string-name>
          , Hiroyuki Toda, Kyosuke Nishida, and
          <string-name>
            <given-names>Akihisa</given-names>
            <surname>Kawanobe</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Deep feature extraction from trajectories for transportation mode estimation</article-title>
          .
          <source>In Pacific-Asia Conference on Knowledge Discovery and Data Mining</source>
          . Springer,
          <fpage>54</fpage>
          -
          <lpage>66</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Mohammad</given-names>
            <surname>Etemad</surname>
          </string-name>
          , Amílcar Soares Júnior, and
          <string-name>
            <given-names>Stan</given-names>
            <surname>Matwin</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Predicting Transportation Modes of GPS Trajectories using Feature Engineering and Noise Removal</article-title>
          .
          <source>In Advances in AI: 31st Canadian Conf. on AI, Canadian AI 2018, Toronto, ON, CA, Proc. 31</source>
          . Springer,
          <fpage>259</fpage>
          -
          <lpage>264</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Shanshan</given-names>
            <surname>Feng</surname>
          </string-name>
          , Gao Cong, Bo An, and Yeow Meng Chee.
          <year>2017</year>
          .
          <article-title>POI2Vec: Geographical Latent Representation for Predicting Future Visitors</article-title>
          .
          <source>In AAAI</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Sabrina</given-names>
            <surname>Fossette</surname>
          </string-name>
          , Victoria J Hobson, Charlotte Girard, Beatriz Calmettes, Philippe Gaspar,
          <string-name>
            <given-names>Jean-Yves</given-names>
            <surname>Georges</surname>
          </string-name>
          , and Graeme C Hays.
          <year>2010</year>
          .
          <article-title>Spatiotemporal foraging patterns of a giant zooplanktivore, the leatherback turtle</article-title>
          .
          <source>Journal of Marine systems</source>
          <volume>81</volume>
          ,
          <issue>3</issue>
          (
          <year>2010</year>
          ),
          <fpage>225</fpage>
          -
          <lpage>234</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Andre Salvaro</given-names>
            <surname>Furtado</surname>
          </string-name>
          , Laercio Lima Pilla, and
          <string-name>
            <given-names>Vania</given-names>
            <surname>Bogorny</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>A branch and bound strategy for Fast Trajectory Similarity Measuring</article-title>
          .
          <source>Data &amp; Knowledge Engineering</source>
          <volume>115</volume>
          (
          <year>2018</year>
          ),
          <fpage>16</fpage>
          -
          <lpage>31</lpage>
          . https://doi.org/10.1016/j.datak.2018.01.003
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Quanquan</given-names>
            <surname>Gu</surname>
          </string-name>
          and Jiawei Han.
          <year>2011</year>
          .
          <article-title>Towards feature selection in network</article-title>
          .
          <source>In Proceedings of the 20th ACM ICIKM</source>
          . ACM,
          <fpage>1175</fpage>
          -
          <lpage>1184</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Isabelle</given-names>
            <surname>Guyon</surname>
          </string-name>
          and
          <string-name>
            <given-names>André</given-names>
            <surname>Elisseeff</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>An introduction to variable and feature selection</article-title>
          .
          <source>Journal of ML research</source>
          <volume>3</volume>
          , Mar
          (
          <year>2003</year>
          ),
          <fpage>1157</fpage>
          -
          <lpage>1182</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Jiawei</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jian</given-names>
            <surname>Pei</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Micheline</given-names>
            <surname>Kamber</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Data mining: concepts and techniques</article-title>
          . Elsevier.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>X</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D</given-names>
            <surname>Cai</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P</given-names>
            <surname>Niyogi</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Laplacian Score for Feature Selection</article-title>
          .
          <source>Advances in Neural Information Processing Systems</source>
          (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Sungsoon</given-names>
            <surname>Hwang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Cynthia</given-names>
            <surname>VanDeMark</surname>
          </string-name>
          , Navdeep Dhatt, Sai V Yalla, and
          <string-name>
            <given-names>Ryan T</given-names>
            <surname>Crews</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Segmenting human trajectory data by movement states while addressing signal loss and signal noise</article-title>
          .
          <source>International Journal of Geographical Information Science</source>
          (
          <year>2018</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Jungwook</given-names>
            <surname>Jun</surname>
          </string-name>
          , Randall Guensler, and
          <string-name>
            <given-names>Jennifer</given-names>
            <surname>Ogle</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Smoothing methods to minimize impact of global positioning system random error on travel distance, speed, and acceleration profile estimates</article-title>
          .
          <source>Transportation Research Record: Journal of the TRB</source>
          <volume>1972</volume>
          ,
          <issue>1</issue>
          (
          <year>2006</year>
          ),
          <fpage>141</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Hye-Young</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Joon-Seok</given-names>
            <surname>Kim</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ki-Joune</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Similarity measures for trajectory of moving objects in cellular space</article-title>
          .
          <source>In SIGAPP09</source>
          .
          <fpage>1325</fpage>
          -
          <lpage>1330</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Jundong</given-names>
            <surname>Li</surname>
          </string-name>
          , Kewei Cheng, Suhang Wang, Fred Morstatter, Robert P Trevino, Jiliang Tang, and Huan Liu.
          <year>2017</year>
          .
          <article-title>Feature selection: A data perspective</article-title>
          .
          <source>CSUR</source>
          <volume>50</volume>
          ,
          <issue>6</issue>
          (
          <year>2017</year>
          ),
          <fpage>94</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Zechao</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yi</given-names>
            <surname>Yang</surname>
          </string-name>
          , Jing Liu, Xiaofang Zhou,
          <string-name>
            <given-names>Hanqing</given-names>
            <surname>Lu</surname>
          </string-name>
          , et al.
          <year>2012</year>
          .
          <article-title>Unsupervised feature selection using nonnegative spectral analysis</article-title>
          .
          <source>In AAAI</source>
          , Vol.
          <volume>2</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Hongbin</given-names>
            <surname>Liu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ickjai</given-names>
            <surname>Lee</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>End-to-end trajectory transportation mode classification using Bi-LSTM recurrent neural network</article-title>
          .
          <source>In Intelligent Systems and Knowledge Engineering (ISKE), 2017 12th International Conference on</source>
          . IEEE,
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Huan</given-names>
            <surname>Liu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Rudy</given-names>
            <surname>Setiono</surname>
          </string-name>
          .
          <year>1995</year>
          .
          <article-title>Chi2: Feature selection and discretization of numeric attributes</article-title>
          .
          <source>In Tools with artificial intelligence, 1995. proceedings., seventh international conference on</source>
          . IEEE
          ,
          <fpage>388</fpage>
          -
          <lpage>391</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>B. N.</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Soares Júnior</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. C.</given-names>
            <surname>Times</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tedesco</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Stan</given-names>
            <surname>Matwin</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Weka-SAT: A Hierarchical Context-Based Inference Engine to Enrich Trajectories with Semantics</article-title>
          .
          <source>In Advances in Artificial Intelligence</source>
          . Springer International Publishing, Cham,
          <fpage>333</fpage>
          -
          <lpage>338</lpage>
          . https://doi.org/10.1007/978-3-319-06483-3_34
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Ng</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Nuts and bolts of building AI applications using Deep Learning</article-title>
          . NIPS.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Christine</given-names>
            <surname>Parent</surname>
          </string-name>
          , Stefano Spaccapietra, Chiara Renso, Gennady Andrienko, Natalia Andrienko, Vania Bogorny, Maria Luisa Damiani,
          <string-name>
            <given-names>Aris</given-names>
            <surname>Gkoulalas-Divanis</surname>
          </string-name>
          , Jose Macedo, Nikos Pelekis, Yannis Theodoridis, and
          <string-name>
            <given-names>Zhixian</given-names>
            <surname>Yan</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Semantic Trajectories Modeling and Analysis</article-title>
          .
          <source>ACM Comput. Surv.</source>
          <volume>45</volume>
          ,
          <issue>4</issue>
          , Article 42 (Aug.
          <year>2013</year>
          ), 32 pages.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Duchesnay</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Scikit-learn: Machine Learning in Python</article-title>
          .
          <source>MLR</source>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Hanchuan</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Fuhui</given-names>
            <surname>Long</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Chris</given-names>
            <surname>Ding</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy</article-title>
          .
          <source>IEEE Transactions on pattern analysis and machine intelligence</source>
          <volume>27</volume>
          ,
          <issue>8</issue>
          (
          <year>2005</year>
          ),
          <fpage>1226</fpage>
          -
          <lpage>1238</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>David R</given-names>
            <surname>Roberts</surname>
          </string-name>
          , Volker Bahn, Simone Ciuti, Mark S Boyce, Jane Elith, Gurutzeta Guillera-Arroita, Severin Hauenstein,
          <string-name>
            <given-names>José J</given-names>
            <surname>Lahoz-Monfort</surname>
          </string-name>
          , Boris Schröder,
          <string-name>
            <given-names>Wilfried</given-names>
            <surname>Thuiller</surname>
          </string-name>
          , et al.
          <year>2017</year>
          .
          <article-title>Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure</article-title>
          .
          <source>Ecography</source>
          <volume>40</volume>
          ,
          <issue>8</issue>
          (
          <year>2017</year>
          ),
          <fpage>913</fpage>
          -
          <lpage>929</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>A.</given-names>
            <surname>Soares Júnior</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. N.</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. C.</given-names>
            <surname>Times</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Matwin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L. A. F.</given-names>
            <surname>Cabral</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>GRASP-UTS: an algorithm for unsupervised trajectory segmentation</article-title>
          .
          <source>International Journal of Geographical Information Science</source>
          <volume>29</volume>
          ,
          <issue>1</issue>
          (
          <year>2015</year>
          ),
          <fpage>46</fpage>
          -
          <lpage>68</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>A.</given-names>
            <surname>Soares Júnior</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Renso</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Matwin</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>ANALYTiC: An Active Learning System for Trajectory Classification</article-title>
          .
          <source>IEEE Computer Graphics and Applications</source>
          <volume>37</volume>
          ,
          <issue>5</issue>
          (
          <year>2017</year>
          ),
          <fpage>28</fpage>
          -
          <lpage>39</lpage>
          . https://doi.org/10.1109/MCG.2017.3621221
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>A.</given-names>
            <surname>Soares Júnior</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Cesario Times</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Renso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Matwin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L. A. F.</given-names>
            <surname>Cabral</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>A Semi-Supervised Approach for the Semantic Segmentation of Trajectories</article-title>
          .
          <source>In 2018 19th IEEE International Conference on Mobile Data Management (MDM)</source>
          .
          <fpage>145</fpage>
          -
          <lpage>154</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <surname>Xiao</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Identifying Different Transportation Modes from Trajectory Data Using Tree-Based Ensemble Classifiers</article-title>
          .
          <source>ISPRS</source>
          <volume>6</volume>
          ,
          <issue>2</issue>
          (
          <year>2017</year>
          ),
          <fpage>57</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Zheng</given-names>
            <surname>Zhao</surname>
          </string-name>
          and
          <string-name>
            <given-names>Huan</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Spectral feature selection for supervised and unsupervised learning</article-title>
          .
          <source>In Proceedings of the 24th international conference on Machine learning</source>
          . ACM,
          <fpage>1151</fpage>
          -
          <lpage>1157</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Yu</given-names>
            <surname>Zheng</surname>
          </string-name>
          , Yukun Chen,
          <string-name>
            <given-names>Quannan</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xing</given-names>
            <surname>Xie</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Wei-Ying</given-names>
            <surname>Ma</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Understanding transportation modes based on GPS data for web applications</article-title>
          .
          <source>TWEB</source>
          <volume>4</volume>
          ,
          <issue>1</issue>
          (
          <year>2010</year>
          ),
          <fpage>1</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Yu</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Quannan</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yukun</given-names>
            <surname>Chen</surname>
          </string-name>
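          ,
          <string-name>
            <given-names>Xing</given-names>
            <surname>Xie</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Wei-Ying</given-names>
            <surname>Ma</surname>
          </string-name>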
          .
          <year>2008</year>
          .
          <article-title>Understanding mobility based on GPS data</article-title>
          .
          <source>In UbiComp 10th. ACM</source>
          ,
          <fpage>312</fpage>
          -
          <lpage>321</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>Qiuhui</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Min</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mingzhao</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Min</given-names>
            <surname>Fu</surname>
          </string-name>
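          ,
          <string-name>
            <given-names>Zhibiao</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Qihong</given-names>
            <surname>Gan</surname>
          </string-name>
          , and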
          <string-name>
            <given-names>Zhenghao</given-names>
            <surname>Zhou</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Transportation modes behaviour analysis based on raw GPS dataset</article-title>
          .
          <source>International Journal of Embedded Systems</source>
          <volume>10</volume>
          ,
          <issue>2</issue>
          (
          <year>2018</year>
          ),
          <fpage>126</fpage>
          -
          <lpage>136</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>