    Leverage the Predictive Power Score of Lifelog
      Data’s Attributes to Predict the Expected
                Athlete Performance

    Anh-Vu Mai-Nguyen1 , Van-Luon Tran1 , Minh-Son Dao*2 , and Koji Zettsu2
                 1 University of Science, VNU-HCMC, Vietnam
                   {1612904,1612362}@student.hcmus.edu.vn
       2 National Institute of Information and Communications Technology, Japan
                             {dao,zettsu}@nict.go.jp



        Abstract. Doing exercises regularly and scientifically can bring better
        health for people and improve sports performance for athletes. Many in-
        vestigations have been carried out to build models that utilize data col-
        lected from people to predict sports performance. Nevertheless, most
        studies use only data directly related to the moment when the exercise
        happens to build prediction models, even though data related to people's
        daily activities also impact sports performance. Thanks to lifelogging,
        we now have data not only on people's exercises but also on their daily
        activities (both mental and physical aspects). Unfortunately, finding out
        which data attributes correlate with changes in sports performance and
        leveraging these correlated attributes to build a precise prediction model
        is not a trivial problem. In this paper, we introduce a solution that utilizes
        the predictive power score of lifelog data attributes collected over a long
        period to predict the expected performance of an athlete training for a
        sporting event. We evaluate our solution using the dataset and evaluation
        metric given by the imageCLEFlifelog task 2: sports performance lifelog.


1     Introduction

Long-term and regular exercise can bring many benefits to people's health
and daily activities [1]. This argument is also valid for athletes who want to
improve their sports performance [2]. Hence, if we can monitor training periods
and other factors that could impact both the mental and physical aspects of a
person, we can predict that person’s sports performance.
   In [3], the authors used SVM with particle swarm optimization to tackle the
problem of athlete performance prediction. They used a dataset of 500 records
of 100-meter running for training and testing their model. The comparison
between the proposed method and linear regression and neural networks
  Copyright © 2020 for this paper by its authors. Use permitted under Creative Com-
  mons License Attribution 4.0 International (CC BY 4.0). CLEF 2020, 22-25 Septem-
  ber 2020, Thessaloniki, Greece.
* Corresponding author.
confirmed the better performance of the proposed method. This method's vital
contribution is to apply chaotic theory to the historical data of the athletes to
discover hidden rules that improve the productivity of prediction models. Un-
fortunately, the dataset was not described in sufficient detail; hence, reproducing
this method could be difficult.
    In [4], the authors introduced the problem of post hoc analysis (i.e., a pro-
cess to analyze the athlete's performance after the performed sports activity)
using artificial intelligence. The historical data of the athlete's performance,
mostly heart rate, was analyzed to automatically make up the time deficit in a
running competition by using a differential evolution (DE) algorithm. Unfortu-
nately, the model did not consider the environmental conditions (e.g., weather,
altitude, topography, humidity) that logically influence the athlete's perfor-
mance.
    In [5], the authors utilized computational intelligence and visualization to
analyze heart rate and GPS data to better understand cycling and fitness physi-
cal activities. The authors discovered a positive correlation between heart rate
and altitude gradient, a negative correlation between heart rate and speed, and
a correlation between the mean heart-rate change delay and changes in the
altitude gradients associated with cycling up and down.
    In [6], the authors introduced a new algorithm based on the behavior of
micro-bats for association rule mining (BatMiner) to explore the athlete's char-
acteristics that have the most significant positive impact on performance. Based
on the results, an athlete can practice alone without the presence of his/her
coach. There were two kinds of data sources: (1) activity datasets obtained by
sports trackers or other wearable mobile devices, and (2) subjective informa-
tion about psycho-physical characteristics of the athlete during training sessions,
gathered through conversations between the athlete and the trainer. The former
comprised the duration of the training session, the distance of the training ses-
sion, average heart rates, and calorie consumption. The latter comprised external
factors (e.g., weather conditions, the type of training session), sports nutrition,
rest time (e.g., afternoon, night), and overall health (e.g., fatigue, cramping,
welfare). The data were captured from TCX files of a professional, 32-year-old,
male cyclist with many years of experience, who underwent training sessions
during the first half of 2014 and prefers to remain anonymous. The result was
that the BatMiner algorithm is slightly better than HBCS-ARM (a family of
SI-based algorithms) on all measures when comparing the best ten association
rules discovered by both algorithms in 25 independent runs.
    The imageCLEFlifelog 2020 (task sports performance lifelog) [7] provides
such a lifelog dataset and raises the exciting challenge of predicting the change
in running time and weight of a person between the beginning and end of a
training period. The challenge here, in our opinion, is to find out which data
attributes correlate with changes in sports performance; leveraging these corre-
lated attributes to build a precise prediction model is not a trivial problem. In
this paper, we introduce a solution that utilizes the predictive power score of
lifelog data attributes collected over a long period to predict the expected per-
formance of an athlete training for a sporting event.
    The paper is organized as follows: Section 2 introduces our methodology,
Section 3 reports our results and discussions, and Section 4 concludes our con-
tribution.


2     Methodology
In this section, we describe the dataset given by the organizers and our approach
used to predict the athlete’s expected performance.

2.1   Data Preprocessing
We have to process a multi-modal dataset collected from a Fitbit Versa 2, PM-
SYS, and Google Forms [8]. This dataset consists of attributes observed at var-
ious intervals, such as minute-observed attributes (e.g., calories burned, heart
rate, steps), day-observed attributes (e.g., weight, meals), and event-observed
attributes (e.g., sleep, activity).
    First, we synchronized the different time intervals into one common interval
for further processing. To do that, we summarized the minute-observed attributes
over the time spans of the event-observed and day-observed attributes. For
instance, we calculated the total calories burned for each activity and the total
calories consumed per day.
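    As an illustration, the following is a minimal pandas sketch of this aggrega-
tion step; the file and column names (minute_calories.csv, exercise.csv, etc.) are
assumptions made for illustration, not the exact schema of the released dataset.

    import pandas as pd

    # Hypothetical file/column names, used only for illustration.
    minutes = pd.read_csv("minute_calories.csv", parse_dates=["timestamp"])  # one row per minute
    activities = pd.read_csv("exercise.csv", parse_dates=["start", "end"])   # one row per activity

    # Total calories per calendar day (day-observed granularity).
    daily_calories = (
        minutes.set_index("timestamp")["calories"]
               .resample("1D")
               .sum()
               .rename("daily_calories")
    )

    # Total calories burned inside each activity window (event-observed granularity).
    def calories_in_window(row):
        mask = minutes["timestamp"].between(row["start"], row["end"])
        return minutes.loc[mask, "calories"].sum()

    activities["activity_calories"] = activities.apply(calories_in_window, axis=1)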
    Then, we normalized attributes sharing the same meaning to one basic unit.
For example, we converted the 'duration' attribute from milliseconds to seconds
and the active-action attributes (e.g., lightly active, very active) from minutes
to seconds. Besides, we used one-hot encoding for categorical attributes such
as 'meal'.
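    A minimal sketch of the unit normalization and one-hot encoding, assuming
a pandas dataframe df with hypothetical column names:

    import pandas as pd

    # 'duration' is logged in milliseconds, active-action attributes in minutes;
    # convert both to seconds so every duration-like attribute shares one unit.
    df["duration"] = df["duration"] / 1000.0
    for col in ["lightly_active", "very_active"]:   # hypothetical column names
        df[col] = df[col] * 60.0

    # One-hot encode categorical attributes such as 'meal'.
    df = pd.get_dummies(df, columns=["meal"], prefix="meal")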
    Next, we dealt with missing values by filling them with the previous value
when they are not the first in time order, and by replacing first-in-time values
with their following values. For attributes with no value at all for a participant,
we filled in the average value of these attributes over all participants. We also
detected and deleted outliers to decrease their impact on the final results.
    Finally, we generated a new attribute representing the time per kilometer
of running activities in the exercise data, using the speed attributes.
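    The sketch below illustrates these last steps under the same assumptions as
above (a pandas dataframe df with hypothetical column names); the three-
standard-deviation outlier rule and the km/h speed unit are illustrative choices,
not necessarily the exact ones we used.

    # Forward-fill, then back-fill so the first time-ordered rows take their following values.
    df = df.sort_values("timestamp").ffill().bfill()

    # Attributes entirely missing for a participant: fill with the all-participant mean.
    df["resting_heart_rate"] = df["resting_heart_rate"].fillna(df["resting_heart_rate"].mean())

    # Drop rows more than three standard deviations away from the mean of a key attribute.
    z = (df["calories"] - df["calories"].mean()) / df["calories"].std()
    df = df[z.abs() <= 3.0]

    # Derived attribute: seconds needed per kilometer, assuming speed is given in km/h.
    df["time_per_km"] = 3600.0 / df["speed"]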

2.2   Feature Selection
As mentioned in the previous section, we use the predictive power score (PPS) to
quantify predictive relationships among the dataset's attributes. We utilize the
PPS to reveal the (hidden) correlations among attributes so that the following
things can be detected and summarized: (1) non-linear relationships, (2) asym-
metric correlations, and (3) predictive value among categorical variables and
nominal data. Clearly, the dataset we deal with has all three characteristics
mentioned above.
              Table 1: Chosen attributes for time prediction models
   (Models: LR, MLP, First/Second/Third CNN, Vanilla/Stack/Condition LSTM, GRU)

   Multi-attribute models
     Attribute                          Used by                  Data type
     active duration                    all models               time-series
     average heart rate                 all models               auxiliary
     calories                           all models               time-series
     distance                           all models               time-series
     elevation gain                     all models               time-series
     steps                              all models               time-series
     time in cardio heart rate zone     all models               auxiliary
     time in fat burn heart rate zone   all models               auxiliary
     time in peak heart rate zone       all models               auxiliary
     very active seconds                all models               auxiliary
     time per km                        all models               time-series
     id parti                           Condition LSTM only      auxiliary
     age, height, gender                Condition LSTM only      auxiliary

   One-attribute models
     time per km                        all models               time-series
     id parti                           GRU, Condition LSTM      auxiliary
     age, height, gender                Condition LSTM only      auxiliary



Another reason is that the PPS has some advantages over correlation for finding
predictive patterns in the data. This gives us the cue to select suitable features
for our prediction models.
    After building the PPS matrix from the cleaned data, we remove all attributes
that do not have any relation to running time and weight. Besides, we also ignore
attributes that can be predicted by another attribute, to reduce the complexity
of the feature sets. Moreover, we keep pairs of mutually predictive attributes to
maintain a strong correlation. Finally, we come up with the sets of attributes
denoted in Tables 1, 2, and 3.
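    A minimal sketch of this selection step is given below, assuming the open-
source ppscore package (whose matrix() function returns a long-format dataframe
with columns x, y, and ppscore) and two illustrative thresholds:

    import pandas as pd
    import ppscore as pps

    df = pd.read_csv("cleaned_lifelog.csv")      # hypothetical cleaned dataset

    # PPS matrix in long format: one row per (predictor x, target y) pair.
    matrix = pps.matrix(df)[["x", "y", "ppscore"]]
    matrix = matrix[matrix["x"] != matrix["y"]]  # drop trivial self-predictions

    TARGET = "time_per_km"                       # or "weight" for the weight subtasks
    MIN_PPS = 0.05                               # illustrative threshold

    # Keep attributes that carry some predictive power towards the target ...
    related = matrix[(matrix["y"] == TARGET) & (matrix["ppscore"] > MIN_PPS)]["x"]

    # ... and drop attributes that are almost perfectly predicted by another attribute,
    # unless the prediction is mutual (mutually predictive pairs are kept).
    redundant = set()
    for _, row in matrix[matrix["ppscore"] > 0.9].iterrows():
        reverse = matrix[(matrix["x"] == row["y"]) & (matrix["y"] == row["x"])]["ppscore"]
        if not (reverse > 0.9).any():            # not mutual: the predicted attribute is redundant
            redundant.add(row["y"])

    selected = [a for a in related if a not in redundant and a != TARGET]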
          Table 2: Chosen attributes for weight prediction models (part 1)
   (Models: LR, MLP, First/Second/Third CNN, Vanilla/Stack/Condition LSTM, GRU;
    every attribute below is used by all nine multi-attribute models)

     Attribute                      Data type
     weight                         time-series
     glasses                        auxiliary
     very active                    auxiliary
     lightly active                 auxiliary
     sedentary                      auxiliary
     calories                       auxiliary
     distance                       auxiliary
     steps                          auxiliary
     heart rate                     auxiliary
     fatigue                        auxiliary
     mood                           auxiliary
     readiness                      auxiliary
     sleep duration h               auxiliary
     sleep quality                  auxiliary
     soreness                       auxiliary
     stress                         auxiliary
     breakfast                      auxiliary
     lunch                          auxiliary
     dinner                         auxiliary
     evening                        auxiliary
     efficiency                     auxiliary
     end time                       auxiliary
     overall score                  auxiliary
     composition score              auxiliary
     revitalization score           auxiliary
     duration score                 auxiliary
     resting heart rate             auxiliary
     restlessness                   auxiliary
     deep seconds                   auxiliary
     deep thirty day avg seconds    auxiliary



2.3    Prediction Models

We consider the data attributes that report exercise activities (e.g., running, jog-
ging) as time-series data because the athletes are in a training period preparing
for sports events, which means that their exercises repeat regularly and season-
ally. We consider the rest of the data attributes, such as age, gender, and height,
as auxiliary data.
    As discussed in the previous sections, two main directions of research have
been investigated in the athlete's performance prediction topic.
          Table 3: Chosen attributes for weight prediction models (part 2)
   (Models: LR, MLP, First/Second/Third CNN, Vanilla/Stack/Condition LSTM, GRU)

   Multi-attribute models
     Attribute                      Used by                  Data type
     light thirty day avg seconds   all models               auxiliary
     rem count                      all models               auxiliary
     rem seconds                    all models               auxiliary
     rem thirty day avg seconds     all models               auxiliary
     wake count                     all models               auxiliary
     wake seconds                   all models               auxiliary
     wake thirty day avg seconds    all models               auxiliary
     id parti                       all models               auxiliary
     age, height, gender            Condition LSTM only      auxiliary

   One-attribute models
     weight                         all models               time-series
     id parti                       Condition LSTM only      auxiliary
     age, height, gender            Condition LSTM only      auxiliary



The first direction considers only data collected during the exercise and ignores
other data, even though these data probably correlate with the athlete's perfor-
mance. The second direction considers all correlated and related data. Following
these directions, we design two types of models: the first, called the univariate
time-series model, utilizes one attribute; the second uses a set of attributes.
    First, we build two baseline methods: (1) a simple linear regression model
and (2) a multilayer perceptron model using the ReLU activation function.
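    A minimal sketch of the two baselines, assuming scikit-learn and Keras; the
window length and layer widths are illustrative only, not the exact configurations
we trained:

    from sklearn.linear_model import LinearRegression
    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import Dense

    n_steps = 5                                   # illustrative window length

    # (1) Linear regression: predict the next value from the previous n_steps values.
    lr = LinearRegression()

    # (2) Multilayer perceptron with ReLU activations.
    mlp = Sequential([
        Dense(64, activation="relu", input_shape=(n_steps,)),
        Dense(32, activation="relu"),
        Dense(1),
    ])
    mlp.compile(optimizer="adam", loss="mse", metrics=["mae"])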




                                            Fig. 1: First CNN model architecture
                   Fig. 2: Second CNN model architecture




                   Fig. 3: Third CNN model architecture




(a) Vanilla LSTM model (b) Stack LSTM model     (c) Condition LSTM model

                   Fig. 4: LSTM-like model architectures


   Then, we build three CNN models: (1) the first consists of one 1D-convolution
layer and one 1D-max-pooling layer (Fig. 1), (2) the second contains more 1D-
convolution layers than the first (Fig. 2), and (3) the third includes more 1D-
convolution and 1D-max-pooling layers than the second (Fig. 3).
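    As an illustration, a minimal Keras sketch of what the first CNN model could
look like is given below; the filter count and kernel size are assumptions, not the
exact values of Fig. 1:

    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

    n_steps, n_features = 5, 1                      # univariate window of 5 time steps

    first_cnn = Sequential([
        Conv1D(filters=32, kernel_size=2, activation="relu",
               input_shape=(n_steps, n_features)),  # one 1D-convolution layer
        MaxPooling1D(pool_size=2),                  # one 1D-max-pooling layer
        Flatten(),
        Dense(16, activation="relu"),
        Dense(1),                                   # predicted time per km (or weight)
    ])
    first_cnn.compile(optimizer="adam", loss="mse", metrics=["mae"])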
   Next, we create three LSTM-like models: (1) the Vanilla LSTM model, whose
number of units in the hidden layer equals the number of time steps (Fig. 4.a),
(2) the Stack LSTM model (Fig. 4.b), and (3) the Condition LSTM model, which
takes initial condition attributes such as age, gender, and height (Fig. 4.c).
   Finally, we build the GRU model (Fig. 5).




                          Fig. 5: GRU model architecture
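    Below is a minimal sketch, again assuming Keras, of how the Vanilla LSTM,
the Condition LSTM, and the GRU model could be expressed. One plausible
reading of the Condition LSTM is to map the auxiliary attributes (e.g., age,
gender, height, participant id) to the initial recurrent state; the unit counts are
illustrative, not the exact architectures of Figs. 4 and 5:

    from tensorflow.keras import Model, Sequential
    from tensorflow.keras.layers import Input, LSTM, GRU, Dense

    n_steps, n_features, n_aux = 5, 1, 4          # window length, series width, auxiliary attrs

    # Vanilla LSTM: the hidden layer has as many units as there are time steps.
    vanilla_lstm = Sequential([LSTM(n_steps, input_shape=(n_steps, n_features)), Dense(1)])

    # Condition LSTM: auxiliary attributes set the initial hidden/cell state.
    series_in = Input(shape=(n_steps, n_features))
    aux_in = Input(shape=(n_aux,))
    h0 = Dense(n_steps, activation="tanh")(aux_in)   # initial hidden state from auxiliary data
    c0 = Dense(n_steps, activation="tanh")(aux_in)   # initial cell state from auxiliary data
    x = LSTM(n_steps)(series_in, initial_state=[h0, c0])
    condition_lstm = Model([series_in, aux_in], Dense(1)(x))

    # GRU model: the same univariate setup with a GRU layer instead of an LSTM.
    gru_model = Sequential([GRU(n_steps, input_shape=(n_steps, n_features)), Dense(1)])

    for m in (vanilla_lstm, condition_lstm, gru_model):
        m.compile(optimizer="adam", loss="mse", metrics=["mae"])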




3     Experimental Results
In this section, we describe how we utilize our models with the selected data
attributes to predict the change in running speed and weight from the beginning
of the reporting period to the end of the reporting period.
    We use the dataset and evaluation metric provided by the imageCLEFlifelog
organizers [9]. Readers can refer to the paper written by the organizers for more
details [7]. In short, the evaluation metric is defined as follows: "For the evalua-
tion of the tasks the main ranking will be based on whether there is a correct
positive or negative change (a point per correct) - and if there is a draw, the
difference between the predicted and actual change will be evaluated and used
to rank the task participants."3
    The organizers define three subtasks: "(1) predict the change in running speed
given by the change in seconds used per km (kilometer speed) from the initial run
to the run at the end of the reporting period, (2) predict the change in weight
since the beginning of the reporting period to the end of the reporting period
in kilos (1 decimal), and (3) predict the change in weight from the beginning
of February to the end of the reporting period in kilos (1 decimal) using the
images."
3 https://www.imageclef.org/2020/lifelog
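    To make the ranking rule concrete, the small sketch below shows one way it
could be computed from lists of predicted and actual changes; this is our reading
of the quoted rule, not official organizer code:

    def rank_key(predicted, actual):
        """Return (score, total absolute difference) for one run.

        A point is awarded when the predicted change has the correct sign; the
        summed absolute difference breaks ties between runs with equal scores.
        """
        correct = sum(1 for p, a in zip(predicted, actual) if (p > 0) == (a > 0))
        abs_diff = sum(abs(p - a) for p, a in zip(predicted, actual))
        return correct, abs_diff

    # Example: two participants, predicted vs. actual change in seconds per km.
    print(rank_key([-12.0, 3.5], [-20.0, -1.0]))  # -> (1, 12.5)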
                 Table 4: The ten best models for time prediction
Time Prediction Model                           Validation MSE Validation MAE Train MSE Train MAE
Vanilla LSTM, one attribute, time steps 5             3872.775       43.5962   3593.101  43.40623
Vanilla LSTM, one attribute, time steps 7             3922.615       47.3889   3554.712  42.87104
Condition LSTM, one attribute, time steps 5           4023.349      43.10283   4874.171  51.20744
Stack LSTM, one attribute, time steps 5               4039.655      44.82555   4191.725  45.65004
Stack LSTM, one attribute, time steps 3               4044.094      46.14255   2586.746  35.38247
Second CNN, one attribute, time steps 5               4046.895       44.3603   4435.651  48.17548
GRU, one attribute, time steps 7                      4076.255      47.21735   4922.799  52.2997
Condition LSTM, one attribute, time steps 7           4216.521      48.09542   4940      52.39553
First CNN, one attribute, time steps 7                4263.523      48.82619   2918.934  39.79049
MLP, one attribute, time steps 7                      4265.8        46.76684   4693.186  50.9879




3.1    Predict the change in running time
We train different prediction models for predicting the change in running time,
both with a univariate attribute (i.e., time per km) and with a set of attributes.
We use three different numbers of input time steps: 3, 5, and 7. Table 1 denotes
which attributes are used for training which models.
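    A minimal sketch of how a univariate series can be framed as supervised
windows for a given number of time steps (the helper name below is ours, not
from the task code):

    import numpy as np

    def make_windows(series, n_steps):
        """Turn a 1-D series into (samples, n_steps, 1) inputs and next-value targets."""
        X, y = [], []
        for i in range(len(series) - n_steps):
            X.append(series[i:i + n_steps])
            y.append(series[i + n_steps])
        return np.array(X)[..., np.newaxis], np.array(y)

    # Example: time-per-km values of one participant, window of 5 steps.
    X, y = make_windows(np.array([312.0, 305.5, 301.2, 298.7, 296.0, 294.1, 290.3]), 5)
    print(X.shape, y.shape)                       # (2, 5, 1) (2,)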
    After training these models, we evaluate them and select the ten models with
the lowest validation loss as the official models for this subtask. Table 4 shows
the information of these models during the training and validation stages.

3.2    Predict the change in weight
Like the first subtask, we build two types of models that use one attribute or a
set of attributes. These models have the same architectures as the models of the
first subtask. However, we use four different numbers of input time steps: 7, 14,
21, and 30.
    After training these models, we evaluate them and select the ten models with
the lowest validation loss as the official models for this subtask. Table 5 shows
the information of these models during the training and validation stages.
    For subtask 3, "Predict the change in weight from the beginning of February
to the end of the reporting period in kilos (1 decimal) using the images", we use
the Calorie Mama app4 to approximately calculate the calories from meal/food
images. Then, we add this attribute to the set of current attributes and proceed
as in subtask 2.
    Table 6 shows our results as evaluated by the organizers. We apply the ten
models with the best accuracy selected at the training stage, as described in
Tables 4 and 5, at the testing stage (i.e., the model IDs expressed in those tables
correspond to the run IDs denoted in Table 6). As we can see in Table 6, our
models cannot reach an optimal point where both the accuracy and the absolute
difference are optimal at the same time. For example, for subtask 1 (i.e., predict
the change in running time), run 8 got the best accuracy (i.e., 1) while run 10
received the smallest absolute difference (i.e., 96). For subtask 2 (i.e., predict the
change in weight without image data), run 9 reached the best accuracy (i.e.,
0.9), while runs 6 and 8 gained the smallest absolute difference (i.e.,
4 https://www.caloriemama.ai/CalorieMama
                 Table 5: The ten best models for weight prediction
Weight Prediction Model                                Validation MSE Validation MAE Train MSE Train MAE
Condition LSTM, weight, one attribute, time steps 14         0.204387      0.320392  0.350457  0.260382
LR, weight, one attribute, time steps 7                      0.211679      0.318684  0.426128  0.305971
Vanilla LSTM, weight, one attribute, time steps 7            0.211884      0.326093  0.379312  0.285081
LR, weight, one attribute, time steps 14                     0.215518      0.33098   0.448237  0.296486
Condition LSTM, weight, one attribute, time steps 7          0.216421      0.318217  0.353498  0.25426
Vanilla LSTM, weight, one attribute, time steps 14           0.217808      0.341755  0.402532  0.285623
Stack LSTM, weight, one attribute, time steps 7              0.218527      0.337486  0.38896   0.288394
Stack LSTM, weight, one attribute, time steps 14             0.218925      0.330796  0.328408  0.267264
Stack LSTM, weight, one attribute, time steps 21             0.220059      0.346851  0.387996  0.269111
GRU, weight, one attribute, time steps 7                     0.221908      0.323945  0.363954  0.24998




11). For subtask 3 (i.e., predict the change in weight with image data), the
results look more stable than for the other subtasks: at run 8, the accuracy
reached the maximum (i.e., 1), while at run 10 the smallest absolute difference
(i.e., 1) was obtained. Nevertheless, one can choose a model that balances the
accuracy and the absolute difference, according to the purpose of the user.
    Table 7 illustrates the results for the three subtasks obtained with the or-
ganizers' baseline. Regarding the first subtask, our results are far better than
the baseline on both metrics. For instance, except for run 5, all our runs have an
accuracy score higher than the best baseline accuracy by 0.2 to 0.4 points, while
runs 10 and 8 have absolute differences about 100 points lower than the best
baseline. Besides, the baseline cannot perform well on both the accuracy and the
absolute difference categories at the same time. Considering the second subtask,
although our best run for absolute difference (run 1) has a slightly worse absolute
difference than the baseline, by about 3 points, its accuracy is twice that of the
baseline (0.4). Coming to the third subtask, our results have double the accuracy
(run 8) and half the absolute difference (run 10) compared to the baseline.

3.3    Discussions
After gaining an insight into the given data, we find that people subjectively
provide plenty of information such as stress, fatigue, mood, and sleep score. This
means that these data are likely to be irrelevant to what we want to predict and
inconsistent among the people who provided them. This subjective data could
lead to the unstable accuracy of our models.
    Moreover, the data collected from the Fitbit device contain much noise, mak-
ing it more difficult to generalize models. Furthermore, the amount of data for
each participant is limited and inconsistent. For instance, there are approxi-
mately twenty running activities for participants 1, 2, and 4, but only three such
activities for participants 3 and 5. Another illustration of this is that the intervals
of the day-observed attributes are not equal, and some participants, such as par-
ticipant 12, lack much information such as sleep data. These issues also prevent
our models from reaching optimal accuracy.
    Additionally, regarding the running-time prediction task, the given data does
not have a direct attribute representing running time.
Table 6: The results of ten different runs for three subtasks returned by the
task’s organizers
               Run ID SubTask ID Accuracy Abs difference
                          1         0.8        291
                  1       2         0.8        11.3
                          3         0.5         4.6
                          1         0.6        290
                  2       2         0.5        14.5
                          3         0.5         4.6
                          1         0.6        238
                  3       2         0.6        12.1
                          3         0.5         4.6
                          1         0.6        356
                  4       2         0.6         12
                          3         0.5         4.6
                          1         0.4        358
                  5       2         0.6        12.6
                          3         0.5         4.6
                          1         0.6        304
                  6       2         0.7         11
                          3         0.5         4.6
                          1         0.8        234
                  7       2         0.7        11.6
                          3         0.5         4.6
                          1          1         232
                  8       2         0.7         11
                          3          1          2.6
                          1         0.8        112
                  9       2         0.9        11.4
                          3         0.5         4.6
                          1         0.8         96
                 10       2         0.6         15
                          3         0.5          1


Table 7: The results of two different runs for three subtasks using the baseline
of task’s organizers
               Run ID SubTask ID Accuracy Abs difference
                          1         0.4       192.6
                 1        2         0.4        8.5
                          3         0.5         2
                          1         0.6       302.8
                 2        2         0.4        8.5
                          3         0.5         2
However, there is an initial 5 km run time for each participant. We find that this
initial time is randomly extracted from each participant's running activities by
dividing the distance attribute by the speed attribute. Although the initial run-
ning time is claimed to be a 5 km running time, the distance attribute shows a
much shorter run. Meanwhile, we are informed that the data are collected from
16 people who train for a 5 km run, yet most running activities cover less than
5 km. Moreover, despite the requirement of predicting the difference in seconds
used per km (kilometer speed) between the initial run and the run at the end of
the reporting period, there is no information indicating which run or day is at
the end of the reporting period for each participant. To cope with these prob-
lems, we have to apply some ad-hoc preprocessing methods that prevent us from
generalizing our models.


4    Conclusions
We introduced a solution for predicting an athlete's performance during the
training period using neural networks and the predictive power score. The pre-
dictive power score supports us in enhancing the quality of the attribute/feature
sets towards improving the accuracy of the prediction models. We built different
prediction models and tested them with various parameters and hyperparame-
ters to find the best one. The results obtained are promising. In future work, we
will compare our solution with others and more thoroughly exploit the predictive
power score of data attributes to discover hidden patterns useful for improving
the accuracy of prediction models.


Acknowledgement
This research is conducted under the Collaborative Research Agreement be-
tween the National Institute of Information and Communications Technology
and the University of Science, Vietnam National University Ho Chi Minh City.


References
1. M. Reiner, C. Niermann, D. Jekauc, and A. Woll, “Long-term health benefits of
   physical activity–a systematic review of longitudinal studies,” BMC public health,
   vol. 13, no. 1, pp. 1–9, 2013.
2. R. P. Bunker and F. Thabtah, “A machine learning framework for sport result
   prediction,” Applied computing and informatics, vol. 15, no. 1, pp. 27–33, 2019.
3. P. Zhu and F. Sun, “Sports athletes performance prediction model based on machine
   learning algorithm,” in International Conference on Applications and Techniques in
   Cyber Security and Intelligence. Springer, 2019, pp. 498–505.
4. I. Fister, D. Fister, S. Deb, U. Mlakar, and J. Brest, “Post hoc analysis of sport
   performance with differential evolution,” Neural Computing and Applications, pp.
   1–10, 2018.
5. H. Charvátová, A. Procházka, S. Vaseghi, O. Vyšata, and M. Vališ, “Gps-based
   analysis of physical activities using positioning and heart rate cycling data,” Signal,
   Image and Video Processing, vol. 11, no. 2, pp. 251–258, 2017.
6. I. Fister, I. Fister Jr, and D. Fister, “Batminer for identifying the characteristics of
   athletes in training,” in Computational intelligence in sports. Springer, 2019, pp.
   201–221.
7. V.-T. Ninh, T.-K. Le, L. Zhou, L. Piras, M. Riegler, P. Halvorsen, M.-T. Tran,
   M. Lux, C. Gurrin, and D.-T. Dang-Nguyen, “Overview of ImageCLEF Lifelog
   2020: Lifelog Moment Retrieval and Sport Performance Lifelog,” in CLEF2020
   Working Notes, ser. CEUR Workshop Proceedings. Thessaloniki, Greece: CEUR-
   WS.org, September 22-25 2020.
8. V. Thambawita, S. A. Hicks, H. Borgli, H. K. Stensland, D. Jha, M. K. Svensen, S.-
   A. Pettersen, D. Johansen, H. D. Johansen, S. D. Pettersen et al., “Pmdata: a sports
   logging dataset,” in Proceedings of the 11th ACM Multimedia Systems Conference,
   2020, pp. 231–236.
9. B. Ionescu, H. Müller, R. Péteri, A. B. Abacha, V. Datla, S. A. Hasan, D. Demner-
   Fushman, S. Kozlovski, V. Liauchuk, Y. D. Cid, V. Kovalev, O. Pelka, C. M.
   Friedrich, A. G. S. de Herrera, V.-T. Ninh, T.-K. Le, L. Zhou, L. Piras, M. Riegler,
   P. Halvorsen, M.-T. Tran, M. Lux, C. Gurrin, D.-T. Dang-Nguyen, J. Chamber-
   lain, A. Clark, A. Campello, D. Fichou, R. Berari, P. Brie, M. Dogariu, L. D. Ştefan,
   and M. G. Constantin, “Overview of the ImageCLEF 2020: Multimedia retrieval in
   lifelogging, medical, nature, and internet applications,” in Experimental IR Meets
   Multilinguality, Multimodality, and Interaction, ser. Proceedings of the 11th Interna-
   tional Conference of the CLEF Association (CLEF 2020), vol. 12260. Thessaloniki,
   Greece: LNCS Lecture Notes in Computer Science, Springer, September 22-25 2020.