Using Machine Learning to Classify Volleyball Jumps

Miki Jauhiainen¹, Michael Jones¹,*
¹ Brigham Young University, Provo, Utah, USA 84602
NTSPORT'22: New Trends in HCI and Sports Workshop at MobileHCI'22, October 1, 2022
* Corresponding author: mikimj97@gmail.com (M. Jauhiainen); jones@cs.byu.edu (M. Jones); ORCID 0000-0002-0131-527X (M. Jones)

Abstract
In this study, inertial measurement units (IMUs) were used to train a random forest classifier to correctly classify different jump types in volleyball. Athlete motion data were collected in a controlled setting using three IMUs, one on the waist and one on each ankle. The 11 participants, seven male and four female, played volleyball at the collegiate level in the United States at the time of the study. Each performed the same set of jumps across the eight jump types (five BASIC jumps and three each of the other seven), resulting in 26 jumps per subject for a total of 286. The data were processed using a max-bin method, and the classifier was trained and evaluated with leave-one-out cross-validation, producing a classifier that can determine jump type with an F1-score of 0.967.

Keywords
sports, wearable sensors, supervised machine learning, volleyball

1. Introduction
In this paper, we investigate classification of blocking jumps in volleyball through supervised machine learning on inertial measurement unit (IMU) data. Jump classification could be used to create novel analysis tools for coaches and athletes. IMU sensors are inexpensive and can easily be attached to volleyball players in both practice and game settings. A single sensor can collect more than 100 readings per second, and each reading contains nine data points representing linear acceleration, rotational velocity, and magnetic field values. When such sensors are used to collect motion data from volleyball players, the challenge is turning IMU readings into useful insights for coaches, athletes, and others.

To use sensors to improve performance as part of sports training, we need to find specific events in the data and classify jumping movements, which is not a trivial task. Finding events and classifying movements in graphed data is hard for the untrained human eye, as Figure 1 exemplifies. Figure 1 contains data that we collected from an IMU attached to a volleyball player in a practice setting. The IMU measures linear acceleration, rotational velocity, and magnetic field in three dimensions, all of which are displayed in Figure 1. The different lines represent the values for the x, y, or z axes of the accelerometer, the gyroscope, or the magnetometer. For the gyroscope, the x, y, and z axes correspond to roll, pitch, and yaw.

Figure 1: A graph of an X3L jump using the waist sensor. The green line is the takeoff and the red is the landing.

The data in this graph were collected during a blocking move, which consists of movement along a volleyball net and a jump. We measured this movement because blocking is an important skill in volleyball. With some training, a person can spot the jump and the preceding movement in such a graph, depending on the movement type, but it is not easy. Training a classifier to recognize movements in the data could generate a more usable description of the data.
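As an illustration of what the classifier must work with, the following is a minimal sketch of loading and plotting one sensor's nine channels to produce a graph like Figure 1. The CSV file name and column names are hypothetical assumptions, not the authors' actual data format.

```python
# Minimal sketch: plot the 9 IMU channels (accelerometer, gyroscope,
# magnetometer, each with x/y/z axes) for one sensor, as in Figure 1.
# "waist_imu.csv" and the column names are hypothetical placeholders.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("waist_imu.csv")  # one row per reading, >100 readings/s

fig, axes = plt.subplots(3, 1, sharex=True, figsize=(8, 6))
for ax, group in zip(axes, ("acc", "gyro", "mag")):
    for axis_name in ("x", "y", "z"):
        ax.plot(df[f"{group}_{axis_name}"], label=f"{group} {axis_name}")
    ax.legend(loc="upper right")
axes[-1].set_xlabel("sample index")
plt.show()
```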
Training a classifier involves two tasks: processing the data for use as input and setting classifier parameters. There are many ways to process the data and many possible settings for the classifier parameters.

Classifying jump types in volleyball motion data has value for both players and coaches. One of the authors played volleyball at both the collegiate and international level. In that author's experience, athletes and coaches care about tracking and improving jumping skills while avoiding injury. To validate this perspective, we talked to two collegiate volleyball coaches about tracking jumps. One coach stated that measuring differences in jump height between different types of jumps would allow for more specific training programs. The other coach suggested aligning sensor data with film from a practice or match, which would allow looking up a jump on film using its timestamp. Building a system that matches sensor data with video from practice is a promising direction, and the work done in this report contributes to the future construction of such a system. However, the first step toward both of these ideas is identifying jumps in the data itself. Once jumps can be identified easily and accurately, systems can be built to measure and compare jump heights and to match specific jumps in sensor data with video.

Collecting and analyzing data can also help coaches and athletes by leading to better injury prevention protocols. It is in everyone's best interest (coaches, athletes, and team owners alike) for athletes to achieve better longevity through injury prevention: athletes can continue to do what they love doing while getting paid to do it, coaches do not lose their star players to chronic injuries as quickly, and owners enjoy the ticket sales that their best players continue to boost. Injuries incur both financial and personal costs. In one collegiate program in the United States, an MRI to diagnose a knee injury, due to overuse or otherwise, costs about $1,000, and the surgical repair is another $10,000 on top of that. Furthermore, it is extremely hard to recover from surgery and return to the same level of play, which is bad for both the player and the team. Anterior cruciate ligament (ACL) injuries are fairly common in volleyball [1], and they are among the injuries that require the expensive surgery. Counting jumps in data collected during training could be part of an injury prevention protocol.

The following scenario illustrates the need for the proposed work. Jack is a 20-year-old sophomore in college who aspires to play professional volleyball after graduating. His position is middle blocker. He is great at attacking but not so good at blocking, and he recognizes that blocking poorly could hinder his chances of making it as a professional. He asks his coach to help him with blocking, so they start using a sensor-based app to monitor Jack's training. Jack uses the app during practice and reviews the data afterward with his coach. Because the app can distinguish between different types of jumps, Jack and his coach can easily find the jumps going left and right and compare them on video. They notice that when going left, his steps are too small, so he does not travel far enough in time. The coach can then assign specific workouts to balance out Jack's leg strength and monitor his footwork to make sure he takes big enough steps.
Others have studied volleyball action detection and classification, but with limited accuracy. Using computer vision, Ibrahim et al. [2] attempted to classify blocking, hitting, and setting, among other actions, but achieved only 51.1% accuracy. Kautz et al. [3] used a wrist-worn IMU to identify different volleyball actions with near-perfect recall but only 34.8% overall accuracy. Their work with IMUs is encouraging, but use for performance improvement requires more accuracy. Furthermore, our top priority, blocking, had the lowest accuracy among the actions they targeted.

To classify volleyball jumps, we gathered data using IMUs and labeled it with the help of IMU-synchronized video. We processed the data using a max-bin approach, which allowed us to aggregate the data while preserving the peak values. We then trained a random forest classifier on the aggregated data, including only the segments containing jumps. Finally, we used leave-one-out cross-validation (LOOCV) to measure accuracy with an F1-score. We achieved an F1-score of 0.97 using the combination of the left and right foot sensors, a window size of 360, and a bin size of 25 with a random forest. Most results for any combination of sensors were between 0.85 and 0.95, as long as the bin size stayed under 100. These results suggest that we were able to solve our problem, as F1-scores above 0.90 are typically accepted as good results [4, 5, 6, 7, 8]. This classifier could be used in the future to build applications that measure jump height and synchronize with video for more efficient coaching.

2. Related Work
There exist ways, such as VERT [9], to measure jump height in sports like volleyball, but, to the best of our knowledge, there is no existing way to accurately determine what kinds of jumps volleyball players are performing. There have been no previous attempts in the research literature to classify volleyball jumps using IMU data, but similar work has been done in other sports that involve jumping, such as figure skating [10]. Similar to our work, that study used an IMU strapped to the waist together with synchronized video to gather and annotate data. The authors labeled the takeoff and landing times of the jumps and then used those labeled jumps as input to a supervised classification algorithm that learns to recognize them, which is the same approach we use here. Like figure skating, volleyball involves jumping and rotating in the air, which gives us confidence that this can be done. The jumps in figure skating involve more spinning, but the basic concept of movement followed by a jump is the same in both sports.

Others have studied the problem of identifying volleyball movements in video, but those efforts have not yet achieved the accuracy needed to improve performance outcomes in training. In [2], Ibrahim et al. attempted to pinpoint actions such as blocking, hitting, and setting, but achieved only 51.1% accuracy. In [11], Azar et al. recognized group activity fairly accurately by recognizing what individual players are doing, but important information, such as the ball and the net, was missing. Using computer vision would also require multiple expensive cameras and visibility of the whole volleyball court.
This might not be as feasible as using IMUs, for financial reasons and because of possible venue limitations: it might not be viable to set up cameras in good enough positions to use the system. Kautz et al. [3] recognized volleyball-specific actions, like passing or serving, using a wrist-worn IMU. Using a decision tree, they achieved high recall but only 34.8% overall accuracy, meaning there were many false positives. Their work suggests that machine learning is a reasonable approach, but more accuracy is needed for use in performance improvement. Additionally, the action identified with the lowest accuracy was blocking, which is our top priority since we study primarily blocking jumps. Salim et al. [12] performed a study similar to [3] with slightly better results. Both studies used wrist-worn IMUs, but in [12] there was one on each wrist, whereas in [3] it was only on the dominant hand. The F1-scores and accuracy scores ranged from 20% to 90%, although for most actions they were around 70-80%. Once again, however, blocking actions were not recognized accurately enough for performance improvement. Furthermore, attaching an IMU to a volleyball player's wrist would be like wearing a smart watch, which is generally not recommended in volleyball.

There is also a body of work on IMUs in swimming [13]. In [13], sensor placement appears to be significant, and the accuracy of the results when classifying stroke type looks promising. Distinguishing swim strokes based on motion is similar to classifying different volleyball blocking movements because both depend on the position and motion of the hips, which is where we placed one of our sensors. The results in [13] suggest that working with several sensor locations will be needed to find an optimal placement.

3. Volleyball Background
To fully understand this research, it helps to know how volleyball is played. Although volleyball is one of the most popular team sports in the world, the difficulty of the actions required and of the rules makes it hard for people unfamiliar with the sport to grasp. Volleyball is played on a court with two 9 x 9 meter sides divided by a net that stands at 243 centimeters for men and 224 centimeters for women. A line on each side, three meters from the net, separates the court into front court and back court. Both teams have six players on the court at once, although seven play actively. The seven comprise one setter, one opposite hitter, two outside hitters, two middle blockers, and one libero (a defensive specialist). Three players play at the net and three in the back court. The player who has most recently rotated into the back court is always the one to serve. The two middle blockers are positioned across from each other in the rotation, as are the two outside hitters. The lineup of one team on their half of the court is illustrated in Figure 2; the setter would be the one serving in this situation.

Each possession allows three touches. Ideally, the setter always gets the second touch and sets the ball to an attacker, meaning that the setter decides who gets to attack the ball over the net. The defending side usually attempts to block the attack with as many players as possible (at most three), but at least one. Players in the back court cannot put the ball over the net, or prevent it from coming over the net, if they step inside the three-meter line.
Hence, only three players can block. Because the blockers are spread out across the net but all try to end up blocking the ball at the same spot, they have to use different footwork to get there. That is why several different types of blocking jumps are recognized and taught at the highest levels of volleyball. The blocking jumps studied in this research are: BASIC, a jump straight up; Q3, a quick shuffle-step move with three steps left or right; X3, a crossover 3-step move left or right; X2, a crossover 2-step move left or right; and ATTACK, an attacking jump with typically a 3- or 4-step approach. Left and right are indicated by an "L" or an "R" after the jump type.

Figure 2: Volleyball lineup. The net is at the bottom, and the front court is blue.

Figure 3: Five elements of a volleyball blocking jump starting in the neutral position (a). Take-off occurs when the player's feet leave the ground (c) and landing when the player's feet touch the ground again (d).

During a Q3, the player's chest faces the net the whole time, and the jump happens off both feet. During the first step of an X3 or an X2, the player turns to face the direction of travel. Furthermore, the jump happens off one foot for an X2 and off both feet for an X3, and the chest starts turning back towards the net on takeoff so that at the peak of the jump the player faces the net. All movement in the blocking jumps happens parallel to the net; the attacking jump is the only one that approaches perpendicular to the net or at an angle.

4. Methods
This research consists of five major components: data collection, data labeling, data processing, training, and testing. In this report, we focus on classifying jump types from wearable 9-axis IMUs attached to each athlete's ankles and waist (but not wrists). We assume that jumps can be detected using a threshold-based algorithm, which means that every segment used in training and testing contains a jump. We had earlier experimentally determined that an x-axis linear acceleration value of 24.5 m/s² indicates a jump, but that detection step is outside the scope of this report. The classifier therefore classifies jump type assuming the data contain a jump.
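Although jump detection itself is outside the scope of this report, the following is a minimal sketch of one way such a threshold-based detector could work; the refractory period and function names are illustrative assumptions, not parameters reported here.

```python
# Sketch of a threshold-based jump detector: flag a jump whenever x-axis
# linear acceleration exceeds 24.5 m/s^2, then skip ahead so the same jump
# is not counted twice. The refractory period is an illustrative assumption.
import numpy as np

THRESHOLD = 24.5   # m/s^2, the experimentally determined value
REFRACTORY = 120   # samples to skip after a detection (~1 s at 120 Hz)

def detect_jumps(acc_x: np.ndarray) -> list:
    """Return the sample indices at which a jump is detected."""
    jumps, i = [], 0
    while i < len(acc_x):
        if acc_x[i] > THRESHOLD:
            jumps.append(i)
            i += REFRACTORY  # assume at most one jump per refractory window
        else:
            i += 1
    return jumps
```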
4.1. Data Collection
To gather the data, we recruited 11 NCAA Division I volleyball players at a university in the United States, seven male and four female. There were players from every position group except libero (because liberos do not perform jumps in games). Every participant was between 18 and 24 years old. All participants performed the same 26 jumps: five of type BASIC and three each of types Q3L, Q3R, X3L, X3R, X2L, X2R, and ATTACK. These jump types are defined in Section 3. Even though the focus of this study is blocking, attacking is such a common occurrence in volleyball that it is important to include it so that the classifier is trained on the complete set of jump types.

During the jumps, each subject wore three IMUs (Opal model, APDM, Inc., Portland, OR, USA): one around the waist, centered on the small of the back, and one on the lateral side of each ankle, right above the shoe. The IMUs were configured to measure linear acceleration, rotational velocity, and magnetic field on three axes at a rate of 120 samples per second. Jumps were filmed with a Qualisys Miqus Video camera synchronized with the IMUs, recording 120 frames per second at a resolution of 1280 x 720. The two systems were hardware synchronized using a common trigger wired to the sync inputs of both systems.

Two different courts were used to perform the jumps, both of which were empty except for the athlete jumping at the time. The courts were side by side, and all jumps were performed on the same side of the net. The jumps happened at the net and were filmed from the service line. Each athlete was allowed adequate warm-up time according to their needs. To allow full focus on the blocking motion, no balls were used. The athletes performed the jumps one at a time. To decrease the risk of the sensors and camera becoming unsynchronized, we recorded only a couple of minutes at a time, and the recordings were split up by jump type. We did not record each athlete's dominant foot, because for blocks the approach and jump motion are the same regardless of the athlete's dominant foot.

4.2. Data Labeling
Once the data had been collected, each jump was annotated with four events: motion started, feet left ground, one foot back on ground, and motion done. Every jump starts from a stationary neutral position, as shown in Figure 3 (a). For example, for an X3L, we would label the moment the subject's left foot starts moving to the left as the start of the movement (Figure 3 (b)), the moment their toes leave the ground as the takeoff (Figure 3 (c)), the moment the toes touch the ground again as the landing (Figure 3 (d)), and the moment they return to a relatively stable position (hard to pinpoint exactly) after landing as the end of the movement (Figure 3 (e)). Because the camera and the IMUs were synchronized, we could pinpoint the exact moments in the raw motion data where the jumps happened. After this initial round of annotation, a volleyball expert reviewed all labels to confirm that they were accurate and fixed any errors.

4.3. Data Processing
We processed the data using a max-bin approach because it smooths out high-frequency noise while preserving peaks. Peaks are important because they show when takeoff and landing happen. Given, for instance, a window size of 100 and a bin size of 10, the max-bin approach works as follows. First, we take the 100 rows of data and split them into 50 in the past and 50 in the future, with the current row arbitrarily assigned to the "past". We then apply a filter that takes the value with the maximum magnitude in each of the 9 columns for a single sensor over the first 10 rows in the past and adds those 9 values to the feature vector. Next, we take the max of the following 10 rows in the past and concatenate those values to the feature vector. We repeat this for the 50 rows in the past a total of five times, and then do the same for the 50 rows in the future. This creates one input vector with 9 x (5 + 5) = 90 values per sensor. The process is pictured in Figure 4.

Figure 4: The aggregation process for a single point in time, or row of data.

If the bin size does not divide the window evenly, the remaining rows are treated as their own bin. The aggregation process starts from the middle of the window, so the partial bins, if any, are at the beginning and end of the window. For instance, with a window size of 100 and a bin size of 15, the process begins with two halves of the window of 50 rows each. 15 goes into 50 three times with 5 left over. The binning starts from the center of the window and works toward either end, and any leftover rows in an incomplete bin are treated as a single partial bin. Thus, for a window size of 100 and a bin size of 15, the window is split into bins as follows: 5-15-15-15-15-15-15-5.
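The following is a minimal sketch of this aggregation for a single window, assuming a (window_size, 9) array of readings from one sensor. Keeping the sign of the max-magnitude value and the center-outward bin order are our assumptions about details the description above leaves open.

```python
# A sketch of the max-bin aggregation for one window of IMU readings.
import numpy as np

def max_magnitude(rows: np.ndarray) -> np.ndarray:
    """For each of the 9 columns, keep the value with the largest magnitude."""
    idx = np.abs(rows).argmax(axis=0)
    return rows[idx, np.arange(rows.shape[1])]

def bin_edges(half: int, bin_size: int) -> list:
    """Split one half-window into bins working outward from the center,
    so any partial bin lands at the outer edge of the window."""
    edges, start = [], 0
    while start < half:
        edges.append((start, min(start + bin_size, half)))
        start += bin_size
    return edges

def max_bin_features(window: np.ndarray, bin_size: int) -> np.ndarray:
    """Turn one (window_size, 9) window into a 1-D max-bin feature vector."""
    half = len(window) // 2
    # order both halves from the center of the window outward
    past, future = window[:half][::-1], window[half:]
    feats = [max_magnitude(rows[start:end])
             for rows in (past, future)
             for start, end in bin_edges(half, bin_size)]
    return np.concatenate(feats)

# A 100-row window with bin size 10 yields 9 x (5 + 5) = 90 values per sensor.
features = max_bin_features(np.random.randn(100, 9), bin_size=10)
assert features.shape == (90,)
```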
The whole process of creating a window and computing an input vector then "slides" across every row in the data frame, as shown in Figure 5. All of the feature vectors stacked together make up the rows of the final set of feature vectors. Since there are about 70-150 rows per jump, depending on the movement type, this process creates roughly a hundred slightly altered copies of a single jump, increasing the number of feature vectors. This way there is enough data to train a reasonably general classifier even with a smaller original data set. This process is repeated for each labeled data frame, and the results are concatenated to form one large preprocessed data frame of dimensions N x T, where N is the number of columns and T is the total number of rows from adding all the smaller data frames together.

Figure 5: The way the aggregation window "slides" through the data to form the preprocessed data frame. Each window is processed as shown in Figure 4. The gray row is the same row each time, visualizing how the window shifts around it.

4.4. Preliminary Study
Before running extensive experiments to find the best processing and training parameters for a classifier, we ran a preliminary study to compare performance across a group of supervised learning algorithms. The independent variables for the preliminary study were window size, bin size, algorithm, and sensor combination, and the dependent variable was accuracy, measured as an F1-score (defined in detail in Section 4.5). The Python library scikit-learn provides multiple supervised learning algorithms that handle multi-class classification problems. The ones we tested were random forest, decision tree, AdaBoost, logistic regression, multilayer perceptron (MLP), k-nearest neighbors (KNN), naive Bayes, and support vector machine (SVM). To compare the algorithms, we ran tests using a window size of 350 and a bin size of 25. We tested with all three sensors combined, as well as with each sensor separately. The summarized results are in Table 1: the first row contains the average accuracy across the four conditions (all sensors, left ankle, right ankle, and waist), and the second row contains the maximum observed accuracy in the same four conditions. Results ranged from 0.040 all the way to above 0.90, with random forest consistently producing the best results. Some algorithms, like SVM and naive Bayes, performed poorly across all tests. We did not expect accurate results from naive Bayes because it is a fairly simple classifier, but the poor accuracy of SVM surprised us. It is possible that the implementation of SVM we used was not equipped to handle the complexity of the input data.

Result type   RF     DT     AB     LR     KNN    NB     SVM    MLP
Average       0.898  0.721  0.266  0.809  0.524  0.565  0.040  0.629
Highest       0.970  0.753  0.288  0.845  0.607  0.670  0.040  0.704

Table 1: Algorithm comparison results. The random forest (RF) generated the most accurate average result as well as the most accurate single result.

As the results of the preliminary study show, random forest was more accurate than the other algorithms (for the chosen parameter settings), so further testing involved only the random forest algorithm.
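As a rough illustration, the following is a minimal sketch of such an algorithm comparison in scikit-learn; the synthetic X and y arrays are placeholders for the real max-bin feature vectors and jump labels, and the single train/test split is illustrative (our full evaluation uses LOOCV, Section 4.6).

```python
# Compare several scikit-learn classifiers on placeholder max-bin features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 90))    # placeholder feature vectors (90 per sensor)
y = rng.integers(0, 8, size=500)  # placeholder labels for the 8 jump types

classifiers = {
    "RF": RandomForestClassifier(),
    "DT": DecisionTreeClassifier(),
    "AB": AdaBoostClassifier(),
    "LR": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "NB": GaussianNB(),
    "SVM": SVC(),
    "MLP": MLPClassifier(max_iter=1000),
}

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
for name, clf in classifiers.items():
    clf.fit(X_tr, y_tr)
    print(f"{name}: {f1_score(y_te, clf.predict(X_te), average='macro'):.3f}")
```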
4.5. Variables
There are four independent variables in the second study: window size, bin size, movement type, and sensor combination. For movement type, there are only two options: full movement or jump only. There are seven different sensor combinations: waist, left foot, right foot, waist + left, waist + right, left + right, and all three. There is a large but finite number of options for window and bin sizes, but we imposed some restrictions on them to keep the experiment tractable. Since we sampled at 120 frames per second, and each row of data is one frame, 120 rows represent one second in real time. It takes about one second to perform a BASIC jump, and all the other jumps take longer, so we did not use window sizes smaller than 200, to allow fitting an entire jump sequence in the window. The jumps, including the approach motion, should not take longer than 3-4 seconds, so we used 440 as the largest window size. We used a step size of 20 (i.e., window sizes of 200, 220, 240, ..., 440); there is likely little benefit in trying every single window size, and going through all the results would have been extremely time-consuming. The bin sizes we used were 5, 10, 15, ..., 75, then 90, 110, 130, and so on up to the size of the window. To keep the experiment and analysis tractable, we used a bin size interval of 5 up to 75, at which point the bin size is already so large that trying every multiple of 5 would most likely have been redundant, hence the switch to an interval of 20. It does not make sense to have a bin size larger than the window size, so that is the upper limit.

The dependent variable is accuracy, as measured by an F1-score using a macro average over the eight jump types. The F1-score is defined as the harmonic mean of precision and recall:

$F_1 = \frac{tp}{tp + \frac{1}{2}(fp + fn)}$    (1)

where tp, fp, and fn stand for the numbers of true positives, false positives, and false negatives, respectively. Precision measures the proportion of picked items that are relevant, and recall measures the proportion of relevant items that were picked. A true positive is a correctly picked item, a false positive is an incorrectly picked item, and a false negative is an item that should have been picked but was not. We chose this measure because it penalizes extremes (overly aggressive or overly timid classification), ensuring that the classifier is balanced.

4.6. Training & Testing
As a result of the preprocessing, the data are organized into a collection of input vectors of dimensions N x T, with each input vector labeled as a type of jump or a non-jump; in this study, the input vectors all contain a jump. We initially chose the training and testing data randomly but decided to switch to LOOCV to simulate testing on completely new data from an unseen athlete. We used 10 of the 11 athletes for training and the remaining one for testing. This way the classifier had not seen any jumps from that specific athlete before testing, which combats overfitting. The process was repeated for every athlete so that the classifier was exhaustively tested on each one. All of the results presented are averages over all 11 athletes, so that results for a specific athlete do not dominate.
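A minimal sketch of this leave-one-athlete-out evaluation is shown below, using scikit-learn's LeaveOneGroupOut with a macro-averaged F1-score; the synthetic arrays are placeholders for the real feature vectors, jump-type labels, and per-vector athlete IDs.

```python
# Leave-one-athlete-out evaluation of a random forest with macro-averaged F1.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1100, 90))          # placeholder max-bin feature vectors
y = rng.integers(0, 8, size=1100)        # placeholder jump-type labels
athlete = np.repeat(np.arange(11), 100)  # which athlete produced each vector

scores = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=athlete):
    clf = RandomForestClassifier().fit(X[train_idx], y[train_idx])
    scores.append(f1_score(y[test_idx], clf.predict(X[test_idx]),
                           average="macro"))
print(f"mean F1 over {len(scores)} held-out athletes: {np.mean(scores):.3f}")
```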
Figure 6: Best scores across all bin sizes for all sensor combinations. The combination of sensors on the left and right ankles, shown by the grey line, consistently produced the best results.

5. Results
Figures 6 through 8 show the key results. Figure 6 shows the best scores by sensor combination across all bin sizes, for each window size and for both movement types. In the graph, the horizontal axis represents the window size. Each line represents F1-scores for a different combination of sensors, as shown in the legend at the bottom of the graph. The vertical axis represents the best F1-score across all bin sizes for a given window size and sensor combination. A combination of the left and right ankle sensors produced the best results, while the waist sensor alone produced the least accurate results.

Figure 7 also shows F1-scores for different window sizes and sensor combinations, but for a single bin size of 25. As in Figure 6, the left and right ankle sensors produce the best results, while the waist alone produces the least accurate results. Figure 8 shows F1-scores for all sensor combinations and bin sizes with a window size of 360. The vertical axis again represents the F1-score, and the horizontal axis is the bin size; note that the gap between bin sizes varies along the horizontal axis. Larger bin sizes produce less accurate results, as might be expected.

Figure 7: Results for all sensor combinations and window sizes with bin size 25. The left and right feet together performed the best.

6. Discussion
We obtained accurate jump type classifiers by training a random forest on input vectors generated from volleyball blocking jumps using a window size of 360, a bin size of 25, and the left and right ankle sensors together. Feature importance analysis did not indicate that any single feature was significantly more important than the others.

Compared to [14], which uses a similar approach, we obtained higher accuracy on a larger set of jump classes. Three factors may explain this. First, we generated more input vectors by sliding the feature window over the jumps: we went from 26 jumps per athlete to about 1,500 input vectors per athlete based on those jumps. The reason a single jump can be turned into many useful input vectors without creating redundant noise is that the jump itself moves around within the window. Because the windows are larger than the duration of the jumps, the jumps can slide around inside them, making each window unique even though the jump is the same. Additionally, depending on the bin size and how the values line up across bins, the peak values around takeoff and landing can end up slightly different after aggregation, altering the critical pieces of the jump each time.

Second, our data were collected in a highly controlled setting, while the data in [14] were collected in a more general practice setting. Moreover, figure skating motion data may include more motion that is not directly related to a jump, because the athlete is always in motion on the ice. In contrast, the volleyball players in our data collection process remained stationary until performing the actual jumping motion.

Third, we used data from sensors on the ankles rather than the waist. The better accuracy achieved by the ankle sensors could be because the waist moves in similar ways across the jump types, while the feet do something different every time. This could create additional inconvenience in practice, because the jump detection algorithm we rely on primarily uses the waist sensor, which means that usage in a live setting would require all three sensors.

Figure 8: Results for all bin sizes and sensor combinations with window size 360. The scores drop off significantly once the bin sizes get past 150.
Ideally we would need only one sensor, because having to strap them on can be annoying for the athletes. Collecting more input data from more athletes would likely increase the accuracy of our classifiers; this would involve recruiting more athletes and organizing more data collection sessions.

One weakness of this study is that we were not able to collect data and test the classifier in a live volleyball setting. We did our best to simulate one with our testing method, but nothing compares to testing during an actual game or practice, especially since the collected data, and hence the jumps left out for testing, were clean and came from a controlled setting. This could lead to overfitting, in which a classifier fits its training data closely but struggles to generalize to unseen data. Overfitting is a problem because game and practice settings involve more movement than our tightly controlled data collection sessions: that extra motion may prevent an overfit classifier from recognizing a jump, and it may also produce false positives. Overfitting may be exacerbated by the combination of max-bin and a small data set.

Another limitation is that the two courts we used to collect data had the same orientation, meaning that the values of the magnetometer, which tracks orientation relative to the magnetic north pole, were always similar. A classifier trained this way could confuse the left and right directions if used on jumps performed on the opposite sides of these nets or on a net with a different orientation. To avoid this problem, we could zero out the magnetometer values at takeoff to "reset" the orientation so that it accounts only for the rotation in the air. We tested this approach with the best parameters we found (window size 360, bin size 25, and the left and right feet together) and achieved an F1-score of 0.97, about the same as before, so at least the impact was not negative. This suggests that court orientation may not be a significant factor.

Overall, these results could support the implementation of an app that tracks volleyball jumps, which could be useful in a coaching setting. For example, tracking different types of jumps in a game or practice and being able to search for them would make film study much easier, and it could help spot aspects of a player's game that need work. Additionally, identifying and classifying jumps could become the foundation for a recommendation system that identifies trends or issues in a specific athlete's training. For example, such a system might notify a coach and athlete that the athlete's jumps to the left have lost power; the coach and athlete can then follow up to determine why.

References
[1] D. Xu, X. Jiang, X. Cen, J. S. Baker, Y. Gu, Single-leg landings following a volleyball spike may increase the risk of anterior cruciate ligament injury more than landing on both-legs, Applied Sciences 11 (2021). URL: https://www.mdpi.com/2076-3417/11/1/130. doi:10.3390/app11010130.
[2] M. S. Ibrahim, S. Muralidharan, Z. Deng, A. Vahdat, G. Mori, A hierarchical deep temporal model for group activity recognition, in: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 1971–1980. doi:10.1109/CVPR.2016.217.
[3] T. Kautz, B. H. Groh, J. Hannink, U. Jensen, H. Strubberg, B. M.
Eskofier, Activity recognition in beach volleyball using a deep convolutional neural network, Data Mining and Knowledge Discovery 31 (2017) 1678–1705. URL: https://doi.org/10.1007/s10618-017-0495-0. doi:10.1007/s10618-017-0495-0.
[4] F. Magalhães, G. Vannozzi, G. Gatta, S. Fantozzi, Wearable inertial sensors in swimming motion analysis: A systematic review, Journal of Sports Sciences 33 (2014). doi:10.1080/02640414.2014.962574.
[5] D. Dalmazzo, S. Tassani, R. Ramírez, A machine learning approach to violin bow technique classification: A comparison between IMU and MOCAP systems, in: Proceedings of the 5th International Workshop on Sensor-Based Activity Recognition and Interaction, iWOAR '18, Association for Computing Machinery, New York, NY, USA, 2018. URL: https://doi.org/10.1145/3266157.3266216. doi:10.1145/3266157.3266216.
[6] T. E. Lockhart, R. Soangra, J. Zhang, X. Wu, Wavelet based automated postural event detection and activity classification with single IMU, Biomedical Sciences Instrumentation 49 (2013) 224–233. URL: https://pubmed.ncbi.nlm.nih.gov/23686204.
[7] Z. Zhang, D. Xu, Z. Zhou, J. Mai, Z. He, Q. Wang, IMU-based underwater sensing system for swimming stroke classification and motion analysis, in: 2017 IEEE International Conference on Cyborg and Bionic Systems (CBS), 2017, pp. 268–272. doi:10.1109/CBS.2017.8266113.
[8] D. Yang, J. Tang, Y. Huang, C. Xu, J. Li, L. Hu, G. Shen, C.-J. M. Liang, H. Liu, TennisMaster: An IMU-based online serve performance evaluation system, in: Proceedings of the 8th Augmented Human International Conference, AH '17, Association for Computing Machinery, New York, NY, USA, 2017. URL: https://doi.org/10.1145/3041164.3041186. doi:10.1145/3041164.3041186.
[9] VERT: Player management system for injury prevention and player load management. https://www.myvert.com/. Accessed July 2022.
[10] D. A. Bruening, R. E. Reynolds, C. W. Adair, P. Zapalo, S. T. Ridge, A sport-specific wearable jump monitor for figure skating, PLOS ONE 13 (2018) 1–13. URL: https://doi.org/10.1371/journal.pone.0206162. doi:10.1371/journal.pone.0206162.
[11] S. Azar, M. Ghadimi Atigh, A. Nickabadi, A multi-stream convolutional neural network framework for group activity recognition, ArXiv (2018).
[12] F. A. Salim, F. Haider, D. Postma, R. van Delden, D. Reidsma, S. Luz, B.-J. van Beijnum, Towards automatic modeling of volleyball players' behavior for analysis, feedback, and hybrid training, Journal for the Measurement of Physical Behaviour 3 (2020) 323–330. URL: https://journals.humankinetics.com/view/journals/jmpb/3/4/article-p323.xml. doi:10.1123/jmpb.2020-0012.
[13] R. Mooney, G. Corley, A. Godfrey, L. R. Quinlan, G. ÓLaighin, Inertial sensor technology for elite swimming performance analysis: A systematic review, Sensors (Basel, Switzerland) 16 (2015) 18. URL: https://pubmed.ncbi.nlm.nih.gov/26712760. doi:10.3390/s16010018.
[14] M. D. Jones, S. T. Ridge, M. Caminita, K. E. Bassett, D. A. Bruening, Automatic classification of take-off type in figure skating jumps using a wearable sensor, in: ISEA Engineering of Sport 14, 2022.