=Paper= {{Paper |id=Vol-1842/paper_14 |storemode=property |title=Data Mining in Stabilometry: Application to Patient Balance Study for Sports Talent Mapping |pdfUrl=https://ceur-ws.org/Vol-1842/paper_14.pdf |volume=Vol-1842 |authors=Juan. A. Lara,David Lizcano,David de la Peña,José M. Barreiro |dblpUrl=https://dblp.org/rec/conf/pkdd/LaraLPB16 }} ==Data Mining in Stabilometry: Application to Patient Balance Study for Sports Talent Mapping== https://ceur-ws.org/Vol-1842/paper_14.pdf
      Data mining in stabilometry: Application to patient
          balance study for sports talent mapping
          Juan A. Lara1, David Lizcano1, David de la Peña1, José M. Barreiro2
     1 Open University of Madrid, UDIMA - School of Engineering, Ctra. De la Coruña, km 38.500 –
            Vía de Servicio, 15 - 28400, Collado Villalba, Madrid, Spain
            {juanalfonso.lara, david.lizcano, franciscodavid.delapena}@udima.es
     2 Technical University of Madrid, School of Computer Science, Campus de Montegancedo, s/n -
            28660, Boadilla del Monte, Madrid, Spain
            jmbarreiro@fi.upm.es



       Abstract. Stabilometry is a branch of medicine responsible for the study of bal-
       ance and postural control in human beings. To do this, it uses devices known as
       posturographs, which collect data related to people’s balance. In this paper we
       propose the use of data mining techniques in order to build predictive models
       based on a number of variables related to the balance of the analysed subjects.
       The resulting models can be applied as classification tools for sports talent
       mapping by determining the sport or sporting discipline best suited to young
       sportspeople depending on balance, as balance plays a key role in many sports
       activities. According to the results for data on 15 professional basketball play-
       ers and 18 ice-skaters, the predictive power is 90.91% in the best case (Unilat-
       eral Stance Test – Left Leg). This suggests that there is a close relationship be-
       tween balance and the sport practised by professional sportspeople in our ex-
       periments.


Keywords: Stabilometry, Data Mining, Classification, Reference model, Sports Talent
mapping.




1 Introduction

Stabilometry or posturography is a branch of medicine concerned with studying
people’s postural control [1, 2]. Patients take a series of tests in order to measure
postural control [3]. Testing is based on the use of dynamometric platforms, called
posturographs.
Stabilometric platforms record a huge amount of interesting data related to people’s
balance and postural control. Knowledge discovered from stabilometric data has led to
quite a few advances in the field of medicine. However, these data are not straightfor-
ward to analyse, and specialized data analysis techniques have to be used. Data mining
plays a key role in this respect [4].
In this paper, we describe how we applied data mining techniques in order to build
classification models based on historical stabilometric data collected from different
individuals. In particular, we applied decision trees and logistic regression in order to
generate these models.
The analysed data include balance-related information (constituting the independent
variables of our study) and other characteristics (each separately considered as a de-
pendent variable in the conducted experiments). These characteristics include age,
height and gender, as well as information related to the sports played by the respective
individual. The resulting models explain these characteristics in terms of patient bal-
ance, as well as acting as a tool for recommending sports or sporting disciplines for
young sports talents.



2 Background

Stabilometry was originally conceived merely as a technique for assessing patient
postural control and balance. However, it is now considered to be a useful tool for
diagnosing [5, 6] and treating [7] balance-related disorders. Some examples of its use
are described in [8-13].
Throughout this research we used a modern posturography device called Balance
Master from NeuroCom® International [14]. Previous research has shown it to be
very precise and reliable for analysing postural control [15, 16]. Balance Master con-
sists of a metal platform, which is divided into two lengthwise interconnected plates
and placed on the floor. The patient has to stand on the metal platform to perform a
number of tests.
There are different types of tests. The patient has to perform the tests in a set order
following the physician’s instructions. Each test is designed to measure a different
component of patient balance. Additionally, each test is divided into test subtypes,
which include slight variations on the test.
As it is of special interest to the experts of this domain, we focused on the US (Unilat-
eral Stance) test. The aim of this test is to measure sway in patients standing on one
foot with eyes open and with eyes closed.
This test lasts 10 seconds, during which the patient has to stand as steadily as possible
on only one leg on the platform. Patients have to perform each of the four test sub-
types to complete the test: a) Stand on left leg with eyes open; b) Stand on left leg with
eyes closed; c) Stand on right leg with eyes open; and d) Stand on right leg with eyes
closed.
From the expert’s point of view, the most important aspect of this test is the analysis
of losses of balance as the patient performs the test. It is especially important to
find out the extent and direction of the loss of balance and whether the imbalance ends
in a fall, that is, whether patients are obliged to put down the foot that they are not
standing on, which should be raised at all times.
As stabilometry is a relatively modern discipline, there is not much background on the
application of data mining techniques to stabilometric data. Beyond the two proposals
described in [18, 19] (where the authors apply data mining techniques to detect motor
fluctuations in Parkinson's disease and to analyse postural instability and consequent
falls and hip fractures associated with the use of hypnotics in the elderly), we have not
found any research in the literature specific to the application of data mining in the
stabilometric domain, except for the investigation that we have conducted over the last
few years in this domain [20-25]. This is the first paper in which we conduct research
in order to use data mining as a tool for gaining a better understanding of the stabilo-
metric domain.



3 Data and methods used

For this research we used stabilometric data from a total of 56 individuals. Of the 56
subjects under analysis, 15 are professional basketball players, 18 are elite ice skaters and
the other 23 are members of a control group of healthy people of different gender who
are not professional sportspeople.
The studies focused on the US test. US is one of the tests that provides the most
interesting information about balance. The test was confined to studying both the Left and
Right subtypes with eyes closed. Eyes open subtypes provided hardly any information
of interest because the data in the different classes were almost constant.
The stabilometric data under analysis were acquired using the Balance Master static
posturograph manufactured by Neurocom (Figure 1). This posturograph is composed
of four sensors. Each sensor records the pressure of the patient’s feet at regular 10-
millisecond intervals throughout the test. These data generate time series.
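
As a rough check on the size of these time series, the short Python calculation below combines the 10-millisecond sampling interval with the 10-second duration of the US test given in Section 2. It assumes one reading per sensor per interval over the whole test, which the text does not state explicitly; the constant names are illustrative only.

```python
# Figures taken from the text; the "one reading per interval" rule is an assumption.
TEST_DURATION_S = 10        # duration of one US test subtype
SAMPLING_INTERVAL_MS = 10   # interval between consecutive sensor readings
NUM_SENSORS = 4             # sensors in the posturograph

samples_per_sensor = TEST_DURATION_S * 1000 // SAMPLING_INTERVAL_MS
total_readings = samples_per_sensor * NUM_SENSORS

print(f"{samples_per_sensor} samples per sensor, {total_readings} readings per test subtype")
```

Under these assumptions each sensor contributes on the order of a thousand points per test subtype.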




                 Figure 1. Patient performing a test on a stabilometric platform.
This device generates time series that are hard to interpret, for which reason they were
pre-processed by the time series knowledge discovery framework that we proposed
elsewhere [20, 21, 22]. This framework was applied to acquire the raw data for analy-
sis. These data include indicators related to subject balance, such as sway velocity,
number of recorded imbalances, number of recorded falls, sum of the lengths of the
recorded falls and maximum intensity of the recorded fall measurements. These are all
the attributes generated by the above framework. We used all these attributes as inde-
pendent variables in the later experiments.
   The above indicators are measured by a framework that we implemented ad hoc for
the stabilometry domain [22-25]. The patient time series constitute the framework
input. The framework uses specialized time series analysis techniques to identify and
characterize the time series events using a special-purpose event identification
language [20]. Note that traditional techniques like Fourier transforms or wavelets are not
applicable as they analyse time series as a whole, whereas experts in stabilometric
time series focus exclusively on certain regions of interest in the time series that have
particular features. These regions are the events based on which the balance indicators
used in this paper are calculated (mean values of the different events). Figure 2a
shows a snippet of a stabilometric time series, and Figure 2b illustrates an example of
a fall event in the US test.
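
To make the step from identified events to balance indicators more concrete, the Python sketch below aggregates a list of already identified events into four of the indicators named in this paper. It is not the framework of [20-25]: the `Event` record, its fields, the millisecond units and the rule that falls and imbalances are counted separately are all hypothetical choices made for illustration, and sway velocity is left out because its computation is not described here.

```python
from dataclasses import dataclass

@dataclass
class Event:
    """Hypothetical record for one identified event (not the framework's format)."""
    kind: str         # "fall" or "imbalance" (assumed labels)
    start_ms: int     # assumed unit: milliseconds from the start of the test
    end_ms: int
    intensity: float  # assumed: peak deviation measured during the event

def balance_indicators(events):
    """Aggregate identified events into per-test indicators."""
    falls = [e for e in events if e.kind == "fall"]
    imbalances = [e for e in events if e.kind == "imbalance"]
    return {
        "Falls": len(falls),
        "Imbalances": len(imbalances),
        "Fall_Length": sum(e.end_ms - e.start_ms for e in falls),
        "Max_Fall_Int": max((e.intensity for e in falls), default=0.0),
    }

# Made-up events for one test, purely to show the call
events = [
    Event("imbalance", 1200, 1500, 0.4),
    Event("fall", 3000, 3600, 1.8),
    Event("fall", 7000, 7250, 2.5),
]
indicators = balance_indicators(events)
print(indicators)
```

The indicator names follow the variable names used in Section 4; everything else in the sketch is an assumption.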




    Figure 2a. Snippet of a stabilometric time series.
    Figure 2b. Example of an identified and characterized fall event.
For each test, information is stored about patient age, height, sport (BASKETBALL,
SKATING or CG, control group) and gender. Each attribute will be the dependent
variable in one of the experiments run.
A series of data pre-processing tasks were performed on the original data [17]:
   - T1. Age attribute discretization (transforming a quantitative attribute into an
        ordinal qualitative attribute) and numeration (associating a numerical value
        with each of the qualitative values taken by the original variable): 0 if
        Age > 20 and 1 if Age ≤ 20.
   - T2. Height attribute discretization and numeration: 0 if Height > 170 and 1 if
        Height ≤ 170.
   - T3. Sport attribute numeration: 0 Basketball, 1 Skating. The control group
        individuals are omitted for this attribute.
   - T4. Gender attribute numeration: 0 Male, 1 Female.
   - T5. Sportsperson attributization (creating a new attribute from the values of
        other existing attribute(s)): 0 Sportsperson, 1 not Sportsperson.
   - T6. Skater attributization: 1 Skater, 0 not Skater. Basketball players were
        omitted for this attribute, as the aim is to distinguish skaters from the
        control group.
   - T7. Basketball Player attributization: 1 Basketball Player, 0 not Basketball
        Player. Skaters were omitted for this attribute, as the aim is to distinguish
        basketball players from the control group.
Finally, we divided the data into two subsets of records, one for each of the considered
subtypes (Left and Right).
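
The encoding rules T1-T7 can be summarized in a short Python sketch. The function name, the dictionary keys and the raw codes for gender ("M"/"F") are assumptions made for illustration; the thresholds and the 0/1 assignments are the ones listed above, and `None` marks the subjects omitted from a given attribute.

```python
def encode_subject(age, height, sport, gender):
    """Apply T1-T7 to one subject.

    sport is "BASKETBALL", "SKATING" or "CG"; gender is "M" or "F"
    (the gender codes are an assumed raw representation).
    """
    return {
        "Age": 0 if age > 20 else 1,                               # T1
        "Height": 0 if height > 170 else 1,                        # T2
        "Sport": {"BASKETBALL": 0, "SKATING": 1}.get(sport),       # T3, None for CG
        "Gender": 0 if gender == "M" else 1,                       # T4
        "Sportsperson": 0 if sport != "CG" else 1,                 # T5
        # T6: basketball players are omitted
        "Skater": None if sport == "BASKETBALL" else int(sport == "SKATING"),
        # T7: skaters are omitted
        "Basketball player": None if sport == "SKATING" else int(sport == "BASKETBALL"),
    }

skater = encode_subject(age=19, height=165, sport="SKATING", gender="F")
control = encode_subject(age=30, height=180, sport="CG", gender="M")
print(skater)
print(control)
```

The two example subjects are invented; they only show that a skater drops out of the Basketball player attribute while a control-group member drops out of the Sport attribute.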
In this research we have employed decision trees and logistic regression techniques.
Decision trees are tree-shaped structures that are used as predictive models in many
different areas [26]. To do this, the value of the known attributes of the object is used
to move down through the tree (each tree node contains a condition on known attrib-
ute values which determines the branch to be taken) to a leaf node. The algorithm that
we have used in this research is CART [27]. Logistic regression is a technique used in
data mining to predict the unknown value of a categorical, particularly a binary (two-
valued), variable based on the known values of other numerical variables [28].
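
The following Python sketch illustrates how each kind of model produces a prediction once it has been built. It does not implement CART training or the fitting of a logistic regression; the tree structure, thresholds, weights and bias are invented for illustration and were not learned from the data of this study.

```python
import math

def predict_tree(node, x):
    """Move down the tree: each internal node tests one attribute against a threshold."""
    while "label" not in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]

def predict_logistic(weights, bias, x):
    """Return (class, probability of class 1) for a binary logistic model."""
    z = bias + sum(w * x[name] for name, w in weights.items())
    p = 1.0 / (1.0 + math.exp(-z))
    return (1 if p >= 0.5 else 0), p

# Hypothetical models (made-up structure and coefficients)
tree = {
    "feature": "Sway_Vel", "threshold": 2.0,
    "left": {"label": 1},
    "right": {
        "feature": "Falls", "threshold": 0,
        "left": {"label": 1},
        "right": {"label": 0},
    },
}
weights = {"Sway_Vel": -1.0, "Falls": -0.5}
bias = 3.0

subject = {"Sway_Vel": 3.1, "Falls": 2}   # invented indicator values
tree_label = predict_tree(tree, subject)
logit_label, logit_prob = predict_logistic(weights, bias, subject)
print(tree_label, logit_label, round(logit_prob, 3))
```

The attribute names are those used in Section 4; a real model would use all five indicators and coefficients estimated from the data.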
   The Age and Height attributes were discretized following the instructions of domain
experts in stabilometry. Experiments without attribute discretization will be
conducted as part of future research in order to check whether there is any difference
in the results.



4 Data analysis and results

We have five independent variables for the experiments: Sway_Vel, Falls, Imbalanc-
es, Fall_Length and Max_Fall_Int. The dataset also includes another seven variables
that will be used as dependent variables in as many experiments. Also, each of these
seven experiments will be conducted twice (once for each of the two Left and Right
test subtypes). Therefore a total of 14 experiments, denoted Exp1-Exp14, will be
conducted.
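
The experimental design can be enumerated in a few lines of Python. The numbering below assumes that experiments are ordered by dependent variable first and subtype second, which is the order used in Table 1; the variable names are illustrative.

```python
from itertools import product

DEPENDENT = ["Age", "Gender", "Height", "Sport",
             "Sportsperson", "Skater", "Basketball player"]
SUBTYPES = ["Left", "Right"]

# Each dependent variable is paired with each subtype: 7 x 2 = 14 experiments
experiments = {
    f"Exp{i}": (subtype, dependent)
    for i, (dependent, subtype) in enumerate(product(DEPENDENT, SUBTYPES), start=1)
}

for name, (subtype, dependent) in experiments.items():
    print(name, subtype, dependent)
```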
Table 1 summarizes these experiments, specifying the respective subtype covered
(Left or Right), dependent variable, and number of samples of each of the classes
established by the dependent variable in percentage terms. With regard to the topic
addressed in this paper, the experiments considering the status of sportsperson or the
practised sporting discipline (Exp7-Exp14) are of most interest.

Table 1. Summary of the experiments

                                                         Instances for each class
 Experiment          Subtype      Dependent variable     0 (%)       1 (%) Total
 Exp1                Left         Age                    62.50       37.50 56
 Exp2                Right        Age                    62.50       37.50 56
 Exp3                Left         Gender                 67.86       32.14 56
 Exp4                Right        Gender                 67.86       32.14 56
 Exp5                Left         Height                 60.71       39.29 56
 Exp6                Right        Height                 60.71       39.29 56
 Exp7                Left         Sport                  45.45       54.55 33
 Exp8                Right        Sport                  45.45       54.55 33
 Exp9                Left         Sportsperson           58.93       41.07 56
 Exp10               Right        Sportsperson           58.93       41.07 56
 Exp11               Left         Skater                 56.10       43.90 41
 Exp12               Right        Skater                 56.10       43.90 41
 Exp13               Left         Basketball player      60.53       39.47 38
 Exp14               Right        Basketball player      60.53       39.47 38
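
The class percentages and totals for the sport-related experiments (Exp7-Exp14) follow directly from the group sizes given in Section 3 (15 basketball players, 18 skaters, 23 control-group members). The Python sketch below reproduces them; the helper name `split` is illustrative. The Age, Gender and Height rows cannot be derived this way because the per-subject values are not reported.

```python
BASKETBALL, SKATING, CONTROL = 15, 18, 23   # group sizes from Section 3

def split(class0, class1):
    """Return (% in class 0, % in class 1, total) rounded to two decimals."""
    total = class0 + class1
    return round(100 * class0 / total, 2), round(100 * class1 / total, 2), total

table1_rows = {
    "Sport": split(BASKETBALL, SKATING),                   # 0 Basketball, 1 Skating
    "Sportsperson": split(BASKETBALL + SKATING, CONTROL),  # 0 Sportsperson, 1 not
    "Skater": split(CONTROL, SKATING),                     # 0 not Skater, 1 Skater
    "Basketball player": split(CONTROL, BASKETBALL),       # 0 not player, 1 player
}

for name, row in table1_rows.items():
    print(name, row)
```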

The resulting models were validated using 10-fold cross validation and the results are
shown in Table 2.
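
For readers unfamiliar with the validation scheme, the following is a minimal pure-Python sketch of 10-fold cross validation. It is not the tooling used for the experiments, which the paper does not name. The shuffling seed, the unstratified fold assignment and the pooled accuracy (correct held-out predictions divided by the number of samples) are assumptions, and the majority-class model is only a stand-in for CART or logistic regression.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists so that each sample is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k folds of nearly equal size
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_val_accuracy(fit, predict, X, y, k=10, seed=0):
    """Pooled accuracy over all held-out predictions."""
    correct = 0
    for train, test in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in test)
    return correct / len(X)

# Stand-in model: always predicts the most frequent training class
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)

def predict_majority(model, x):
    return model

X = [[0.0]] * 33            # placeholder features, not real indicator values
y = [0] * 15 + [1] * 18     # class sizes of Exp7 and Exp8 (15 basketball, 18 skating)
accuracy = cross_val_accuracy(fit_majority, predict_majority, X, y)
print(f"baseline accuracy: {accuracy:.2%}")
```

Any pair of `fit` and `predict` functions with the same signatures could be plugged in instead of the baseline.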
Table 2. Results of the experiments in terms of classification accuracy

              Experiment        Decision Trees          Logistic Regression
              Exp1                 55.36%                     58.93%
              Exp2                 69.64%                     69.64%
              Exp3                 69.64%                     80.36%
              Exp4                 60.71%                     64.29%
              Exp5                 73.21%                     85.71%
              Exp6                 66.07%                     75.00%
              Exp7                 60.61%                     90.91%
              Exp8                 81.82%                     84.85%
              Exp9                 60.71%                     48.21%
              Exp10                39.29%                     55.36%
              Exp11                56.10%                     65.85%
              Exp12                56.10%                     63.41%
              Exp13                63.16%                     76.32%
              Exp14                47.37%                     71.05%
              MEAN                 61.41%                     70.71%



5 Discussion of Results

Looking at the results we find that logistic regression yields better results in 12 out of
the 14 conducted experiments (Exp1, Exp3-8 and Exp10-14), the CART algorithm is
better in one (Exp9) and both techniques achieve the same accuracy in one (Exp2).
Generally speaking, logistic regression therefore provides better results, outperforming
CART by 9.3 percentage points in accuracy on average.
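
These summary figures can be recomputed from Table 2. The Python sketch below uses only the accuracies reported in the table; the variable names are illustrative.

```python
from statistics import mean

# Accuracies (%) for Exp1..Exp14, copied from Table 2
DECISION_TREES = [55.36, 69.64, 69.64, 60.71, 73.21, 66.07, 60.61,
                  81.82, 60.71, 39.29, 56.10, 56.10, 63.16, 47.37]
LOGISTIC_REGRESSION = [58.93, 69.64, 80.36, 64.29, 85.71, 75.00, 90.91,
                       84.85, 48.21, 55.36, 65.85, 63.41, 76.32, 71.05]

pairs = list(zip(DECISION_TREES, LOGISTIC_REGRESSION))
lr_wins = sum(lr > dt for dt, lr in pairs)
dt_wins = sum(dt > lr for dt, lr in pairs)
ties = sum(dt == lr for dt, lr in pairs)

mean_dt = mean(DECISION_TREES)
mean_lr = mean(LOGISTIC_REGRESSION)
gap = mean_lr - mean_dt   # difference in percentage points

print(lr_wins, dt_wins, ties)
print(round(mean_dt, 2), round(mean_lr, 2), round(gap, 1))
```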
On the other hand, looking at logistic regression, we find that the Left subtype yields
better results in five out of the seven cases (Gender, Height, Sport, Skater and Basket-
ball player attributes), whereas the Right subtype is better in two out of the seven
cases (Age and Sportsperson attributes). This result is probably due to the fact that
most of the analysed basketball players are right-handed (11 vs. 4), whereas the skater
population is less skewed (8 right-handed vs. 10 left-handed skaters).
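
The Left versus Right comparison can be checked against the logistic regression column of Table 2 with the short Python sketch below; the dictionary layout is an illustrative choice.

```python
# dependent variable: (Left %, Right %), logistic regression accuracies from Table 2
LR_ACCURACY = {
    "Age": (58.93, 69.64),
    "Gender": (80.36, 64.29),
    "Height": (85.71, 75.00),
    "Sport": (90.91, 84.85),
    "Sportsperson": (48.21, 55.36),
    "Skater": (65.85, 63.41),
    "Basketball player": (76.32, 71.05),
}

left_better = [name for name, (left, right) in LR_ACCURACY.items() if left > right]
right_better = [name for name, (left, right) in LR_ACCURACY.items() if right > left]

print("Left better:", left_better)
print("Right better:", right_better)
```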
Considering the global results, the variables that appear to be most related to balance
are Height and Sport. This suggests, for example, that balance is related to the sport
played. However, the poor results for the Sportsperson variable suggest that there
appears to be no relationship between a person being or not being a professional
sportsperson and their degree of balance.
The models output in Experiments 7 and 8 (Sport variable) could be said to be espe-
cially applicable for mapping out the career of young sports talents. These models
could be used to propose, depending on balance, a sports discipline at an early age for
talented young sportspeople enrolling in high-performance training programmes de-
signed to forge future sports talents. However, the results suggest that height affects
balance characteristics, which is an issue that is worth analysing. In this respect, note
that the mean height of the samples is 200.3 centimetres for basketball players and
162.3 centimetres for skaters. This is a big difference, and it could be behind the good
classification results in Experiments 7 and 8. Therefore, further experiments should be
run considering other sports in which there is not such a pronounced mean height
difference between the sportspeople from the two groups.
   With regard to the above, other authors have published research on talent detection
and development in sport [29, 30]. Our research, however, focuses on more practical
issues related to talent management. Several lines of research have been opened in this
respect [31]. Vaeyens et al. claim that traditional cross-sectional talent identification
models are likely to exclude many, especially late maturing, promising children from
development programmes due to the dynamic and multidimensional nature of sport
talent [31]. Other practical research has focused above all on the soccer field [32, 33].
Other research has addressed other sports like water-polo [34].
Considering its relationship to the work presented here, we should mention research
by Mohamed et al., studying which specific morphological and performance
measures describe differences between elite and non-elite young handball players
[35]. They found that elite players were heavier and had greater muscle circumferences
than their non-elite peers. Elite players scored significantly better on strength,
speed and agility, and cardiorespiratory endurance, but not on balance.
In this respect, their findings appear to fit in with our results in the sense that balance
does not appear to differ substantially when comparing elite and non-elite sportspeople
(Experiments 9 and 10). According to our results, however, balance is applicable
in the field of talent management, as it is related to the practised sporting discipline
(Experiments 7 and 8). These results are applicable for matching sports talents to the
most suited discipline depending on their balance control.
In this regard, the research presented here is, to the best of our knowledge, the first to
address the topic of talent mapping and is one of very few to date that has established
balance as the discriminating factor. As far as we are aware, it is also the first paper to
use data mining techniques (classification in this case) for sports talent management
based on balance indicators.



6 Conclusions and Future Work

The stabilometric data acquired after examining a particular person can provide an
enormous amount of information concerning their balance and postural control. In this
article, we described the experiments conducted based on the stabilometric data of a
series of individuals, some of whom were elite sportspeople whereas others were
members of a control group.
The results of our experiments can be used to discover interesting knowledge about
existing relationships between people’s balance and other characteristics. A prominent
finding of this research is the close relationship between people’s balance and the
sport that sportspeople play. This is applicable in practice, where the models can be
applied in order to determine the best sport or sports discipline for each subject de-
pending on their balance. This is very useful for state programmes for capturing young
sports talents. Interestingly, the experiments did not reveal any significant relationship
between balance and the fact that a person does or does not practise sport profession-
ally. Some possible future research lines are:
   - Considering that the experiments were conducted on a very small sample of
        sportspeople and significance is limited, it would be worth broadening the
        sample in order to conduct a richer and more comprehensive analysis.
   - Another future line of work is to consider other stabilometric tests to check
        whether they confirm the results of the US test. It would also be interesting to
        add other dependent variables to the analysis, such as right- or left-handedness.
   - In view of the results, it would also be worthwhile gathering stabilometric data
        on elite professional sportspeople who play other sports apart from basketball
        and skating. It would be very useful to consider sports that are more alike (for
        example, compare basketball players with handball or volleyball players).
        Classification accuracy can be expected to drop in this case, although further
        experiments should be run in order to confirm that there is a drop and by how
        much.
   - It would be a good idea to use other techniques in order to analyse the correla-
        tions between the analysed data: clustering, plots, SVMs.
   - Although not a central issue to this paper, there appears, according to the ex-
        periments, to be a close relationship between balance and gender. One line of
        research would be to study this relationship and check whether it holds for oth-
        er datasets, tests, sports, etc. Also it would be interesting to further analyse the
        relationship between balance and age or height because postural control varies
        with age or is dependent on height.
   - Again it would be worth analysing the causality between balance and sports
        discipline in order to confirm whether balance determines the sport for which a
        player is best suited or balance is the result of training for each sport. In this
        paper, elite skaters clearly had much better balance.



References

1.   Barigant, P., Merlet, P., Orfait, J., Tetar, C., New design of E.L.A. Stato-
     kinesemeter. Agressol, 13(C): pp. 69-74, 1972.
2.   Boniver, R., Posture et posturographie. Rev Med Liege. May 1, 49(5): pp. 285-
     290, 1994.
3.   Sanz, R., Test vestibular de autorrotación y posturografía dinámica. Verteré, 25:
     pp. 5-15, 2000.
4.   Fayyad, U. M., Piatetsky-Shapiro, G., Smyth, P., From Data Mining To
     Knowledge Discovery: An Overview. In Advances In Knowledge Discovery And
     Data Mining, eds. U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthu-
     rusamy, AAAI Press/The MIT Press, Menlo Park, CA., pp. 1-34, 1996.
5.   Ronda, J.M., Galvañ, B., Monerris, E., Ballester, F., Asociación entre Síntomas
     Clínicos y Resultados de la Posturografía Computerizada Dinámica. Acta Otor-
     rinolaringología Española, 53: pp. 252-255, 2002.
6.  Rama, J., Pérez, N., Artículos de Revisión: “Pruebas vestibulares y Postur-
    ografía”. Revista Médica de la Universidad de Navarra, vol. 47, nº 4, pp. 21-28,
    2003.
7. Barona, R., Interés clínico del sistema NedSVE/IBV en el diagnóstico y val-
    oración de las alteraciones del equilibrio. Revista de Biomecánica del Instituto de
    Biomecánica de Valencia (IBV), Ed. February, 2003.
8. Lázaro, M., Cuesta, F., León, A., Sánchez, C., Feijoo, R., Montiel, M., Valor de
    la posturografía en ancianos con caídas de repetición. Med. Clin., Barcelona, pp.
    124:207-10, 2005.
9. Nguyen, D., Pongchaiyakul, C., Center, J. R., Eisman, J. A., Nguyen, T. V., Iden-
    tification of High-Risk Individuals for Hip Fracture: A 14-Year Prospective
    Study. Journal of Bone and Mineral Research, 20(11), 2005.
10. Raiva, V., Wannasetta, W., Gulsatitporn, S., Postural stability and dynamic bal-
    ance in Thai community dwelling adults. Chula Med J 2005 Mar, 49(3): pp. 129 –
    141, 2005.
11. Sinaki, M., Brey, R. H., Hughes, C. A., Larson, D. R., Kaufman, K. R., Signifi-
    cant Reduction in Risk of Falls and Back Pain in Osteoporotic-Kyphotic Women
    Through a Spinal Proprioceptive Extension Exercise Dynamic (SPEED) Program.
    Mayo Clin., 80(7): pp. 849-855, 2005.
12. Song, D., Chung, F., Wong, J., Yogendran, S., The Assessment of Postural Stabil-
    ity After Ambulatory Anesthesia: A Comparison of Desflurane with Propofol.
    Anesth. Analg., 2002.
13. Martín, E., Barona, R., Vértigo paroxístico benigno infantil: categorización y
    comparación con el vértigo posicional paroxístico benigno del adulto. Acta Otor-
    rinolaringología Española, 58(7): pp. 296-301, 2007.
14. BM Neurocom® International, Balance Master Operator’s Manual v8.2.
    www.onbalance.com (accessed in October 2010), 2004.
15. Liston, R. A., Brouwer, B. J., Reliability and validity of measures obtained from
    stroke patients using the Balance Master. Arch. Phys. Med. Rehabil., 77: pp.
    425–430, 1996.
16. Brouwer, B., Culbam, E. G., Liston, R. A., Grant, T., Normal variability of pos-
    tural measures: implications for the reliability of relative balance performance
    outcomes. Scand J Rehabil. Med., 30: pp. 131–137, 1998.
17. Lara, J. A., Manual de Minería de Datos, Ed. Udima, 2013.
18. Bonato, P., Sherrill, D. M., Standaert, D. G., Salles, S. S., Akay, M., Data Mining
    Techniques to Detect Motor Fluctuations in Parkinson's Disease, Proc. of the 26th
    Annual International Conference on the IEEE Engineering in Medicine and Biol-
    ogy Society, pp. 4766 – 4769, 2004.
19. Allain, H., Bentué-Ferrer, D., Polard, E., Akwa, Y., Patat, A., Postural Instability
    and Consequent Falls and Hip Fractures Associated with Use of Hypnotics in the
    Elderly, Drugs & Aging 22(9), pp. 749-756, 2005.
20. Anguera, A., Lara, J. A., Lizcano, D., Martínez, M.A., Pazos, J., Sensor-
    generated Time Series Events: A definition language, Sensors 12(9), pp. 11811-
    52, 2012.
21. Alonso, F., Lara, J. A., Martínez, L., Pérez, A., Valente, J. P., Generating Refer-
    ence Models for Structurally Complex Data: Application to the Stabilometry
    Medical Domain, Methods of Information in Medicine, 52, pp. 441-453. 2013.
22. Lara, J. A., Moreno, G., Pérez, A., Valente, J. P. , López-Illescas, A., Comparing
    posturographic time series through events detection, 21st IEEE International
    Symposium on Computer-Based Medical Systems, CBMS '08, pp. 293-295,
    2008.
23. Lara, J. A., Pérez, A., Valente, J. P., López-Illescas, A., Generating time series
    reference models based on event analysis, 19th European Conference on Artificial
    Intelligence - ECAI 2010, pp. 1115-116, 2010.
24. Lara, J. A., Lizcano, D., Martínez, M. A., Pazos, J., Riera, T., A UML Profile for
    the Conceptual Modelling of Structurally Complex Data: Easing Human Effort in
    the KDD Process, Information and Software Technology 56, pp. 335-351, 2014a.
25. Lara, J. A., Lizcano, D., Martínez, M. A., Pazos, J., Data preparation for KDD
    through automatic reasoning based on description logic, Information Systems 44,
    pp.54-72, 2014b.
26. Huo, X., Kim, S. B., Tsui, K.-L., & Wang, S. A frontier-based tree pruning algo-
    rithm (FBP). INFORMS Journal on Computing, 18, 494–505, 2006.
27. Hastie, T., Tibshirani, R., Friedman, J. The Elements of Statistical Learning. New
    York, NY: Springer, 2001.
28. James, G., Witten, D., Hastie, T., Tibshirani, R., An Introduction to Statistical
    Learning, Springer, 2013.
29. Morris, T., Psychological characteristics and talent identification in soccer. Jour-
    nal of Sport Sciences, 18:9, 715-726, 2000.
30. Abbott, A., Collins, D., Eliminating the dichotomy between theory and practice in
    talent identification and development: considering the role of psychology. Journal
    of Sport Sciences, 22:5, 395-408, 2004.
31. Vaeyens, R., Lenoir, M., Williams, A. M., Philippaerts, R. M., Talent identifica-
    tion and development programmes in sport. Sports Med, 38:9, 703-714, 2008.
32. Reilly, T., Williams, A. M., Nevill, A., Franks, A., A multidisciplinary approach
    to talent identification in soccer. Journal of Sport Sciences, 18:9, 695-702, 2000.
33. Williams, A. M., Perceptual skill in soccer: Implications for talent identification
    and development. Journal of Sport Sciences, 18:9, 737-750, 2000.
34. Falk, B., Lidor, R., Lander, Y., Lang, B., Talent identification and early devel-
    opment of elite water-polo players: a 2-year follow-up study, Journal of Sport
    Sciences, 22:4, 347-355, 2004.
35. Mohamed, H., Vaeyens, R., Matthys, S., Multael, M., Lefevre, J., Lenoir, M.,
    Philippaerts, R., Anthropometric and performance measures for the development
    of a talent detection and identification model in youth handball, Journal of Sport
    Sciences, 27:3, 257-266, 2009.