<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Machine Learning Physical Fatigue Estimation Approach Based on IMU and EMG Wearable Sensors</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Suraj P Nair</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Sica</string-name>
          <email>marco.sica@tyndall.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Salvatore Tedesco</string-name>
          <email>salvatore.tedesco@tyndall.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Visentin</string-name>
          <email>andrea.visentin@ucc.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>SFI Insight Centre for Data Analytics, University College Cork</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Computer Science &amp; IT, University College Cork</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Tyndall National Institute, University College Cork</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Physical fatigue refers to a state of exhaustion or reduced capacity for physical performance due to prolonged exertion, repetitive movements, or lack of rest. It is a multifaceted condition that can severely impact performance, especially in activities requiring sustained effort, precision, or concentration. In physical tasks, fatigue manifests as a decrease in muscle strength, coordination, and endurance, leading to diminished performance and an increased risk of injury. Detecting physical fatigue is crucial in a variety of domains: professional sports, collaborative robotics, construction, and more. This research introduces a novel framework for predicting fatigue during shoulder movements using data collected from wearable inertial measurement units and electromyography sensors. By integrating the Borg Scale, a subjective measure of perceived exertion, our approach uniquely combines objective sensor data with user-reported fatigue levels, creating a more holistic fatigue assessment model. The primary aim of this study is to develop a predictive model capable of accurately estimating fatigue, as measured by the Borg Scale. An investigation of the best machine learning algorithm for this task ensures that the chosen method provides the most reliable predictions. Furthermore, by systematically reducing the number of sensors and analyzing the impact on model performance, it is possible to find a minimal sensor configuration that maintains the model's predictive power while reducing complexity and cost. The Ridge Regression model, after hyperparameter tuning, outperformed other models, achieving a mean absolute error of 2.417 in predicting fatigue. This preliminary study shows the potential of integrating data from different inertial and electromyography sensors for fatigue prediction in shoulder movements, with potential applications in occupational safety.</p>
      </abstract>
      <kwd-group>
        <kwd>Fatigue Estimation</kwd>
        <kwd>Wearable Sensors</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Feature Selection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Muscular fatigue is a critical factor influencing performance, safety, and recovery in a wide range of
activities, from athletic training to industrial work and rehabilitation. Repeated or sustained physical
exertion can lead to overuse and fatigue in specific muscle groups, particularly those involved in
repetitive or high-intensity movements. Shoulder joint fatigue, in particular, is of great concern
due to the central role it plays in numerous sports, occupational tasks, and rehabilitation exercises
[
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. Activities such as tennis serves, golf swings, swimming strokes, and overhead throwing place
significant demands on the shoulder muscles, making them highly susceptible to fatigue [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. In
work environments, tasks like lifting, assembling, and pulling often require prolonged or repetitive
shoulder movements, increasing the risk of fatigue-induced musculoskeletal disorders [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Likewise,
in rehabilitation, managing fatigue is crucial for developing safe, progressive exercise programs that
facilitate muscle recovery and prevent overstrain.
      </p>
      <p>
        The ability to accurately monitor and assess physical fatigue during shoulder movements is essential
for optimizing performance, preventing injury, and improving recovery outcomes. Traditional methods
of fatigue monitoring rely heavily on subjective reporting or periodic physical evaluations, but these
approaches are often limited by their inability to capture real-time changes in muscle performance and
fatigue progression. Advances in wearable technology, however, now offer an objective and continuous
means of monitoring fatigue. Wearable sensors such as Inertial Measurement Units (IMUs) and surface
Electromyography (EMG) enable real-time tracking of biomechanical and physiological data, providing
a more comprehensive understanding of how fatigue develops over time [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. EMG is a technique that
measures the electrical activity produced by muscles during contraction, providing insights into muscle
activation and overall function. IMUs are sensors that track motion and orientation using accelerometers,
gyroscopes, and sometimes magnetometers, capturing detailed kinematic data during physical activities.
IMUs and sEMG sensors are essential for capturing detailed biomechanical and muscle activity data
in fatigue prediction [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. This integration offers a holistic view of physiological states, enhancing
prediction accuracy. The combined use of IMU and EMG sensors provides deeper insights into fatigue
development across different muscles and activities, proving beneficial for real-time monitoring in
sports and rehabilitation contexts.
      </p>
      <p>This study aims to address the current gap in fatigue monitoring by providing a comprehensive
approach that includes both objective sensor data and subjective fatigue assessments during shoulder
internal rotation and external rotation exercises under varying load conditions. By analyzing the
relationship between muscle activity, motion patterns, and physiological responses, this work seeks
to develop a robust fatigue prediction model that can be applied in sports, occupational settings, and
rehabilitation. Furthermore, the dataset supports the development of machine learning algorithms
for real-time fatigue detection, ultimately contributing to improved performance management, injury
prevention, and workplace ergonomics.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Literature Review</title>
      <p>
        Wearable sensors are small, non-invasive devices that can measure a range of physiological signals,
such as muscle activity, heart rate, and movement patterns. The data collected from these sensors is
often time series data, which captures the dynamic changes in physiological parameters over time.
This data can then be analyzed using various predictive models to estimate the onset and progression
of fatigue during physical activities. In sports science, accurately predicting fatigue is crucial for
optimizing performance and preventing injuries. Fatigue can affect an athlete’s technique, reaction time,
and overall performance, making monitoring fatigue levels during training and competition essential.
Wearable sensors have been widely used to monitor athletes in real time, providing data that can be
analyzed to detect early signs of fatigue. One study by [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] utilized IMU data collected from athletes
performing running exercises to predict muscle fatigue. The IMU sensor was attached to their wrist. The
researchers applied a machine learning approach using models such as Linear Regression with Elastic
Net regularization (EN) and Linear Regression with Least Absolute Shrinkage and Selection Operator
regularization (LASSO) to predict the rating of perceived exertion (RPE) based on the IMU signals. The
study demonstrated that machine learning models could accurately predict fatigue, providing a valuable
tool for coaches and trainers to manage athlete workload and prevent overtraining. Similarly, a study by
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] explored the use of IMUs to analyze complex datasets, bringing innovation to the monitoring and
optimization of athlete training cycles. The raw time series data were used to train a supervised machine
learning model based on frequency and time-domain characteristics. The model was able to forecast the
beginning of fatigue before any physical symptoms appeared, highlighting the potential of wearable
sensors in sports performance optimization. The model demonstrated that timely interventions could
prevent overtraining and potential accidents. Additionally, the study demonstrated the model’s efficacy
in real-time monitoring, improving the decision-making abilities of both coaches and athletes. Another
notable study by [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] developed an intelligent sports fatigue prediction system based on spectral sensors
and machine learning algorithms. This system was particularly effective in handling the nonlinearities
associated with physiological data during sports activities, offering a robust solution for predicting the
degree of fatigue in athletes. These studies highlight the importance of integrating advanced machine
learning techniques with wearable sensor data to create predictive models that can accurately assess
and manage fatigue in sports contexts.
      </p>
      <p>
        Wearable sensors are vital in physical rehabilitation, as they continuously monitor patients during
recovery exercises. Pinto-Bernal et al. focused on fatigue management during walking tasks using
IMUs and sEMG sensors, developing a random forest model that classified fatigue into four states with
high accuracy. This tool is essential for customizing rehabilitation programs [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Additionally, EMG
and accelerometer signals have been used for predicting whole-body fatigue, showcasing the ability of
wearable sensors to provide real-time feedback and assist clinicians in adjusting therapy intensity [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        In industrial applications, wearable sensors are effective for predicting fatigue and enhancing
workplace safety and productivity. Kuber et al. developed a fatigue prediction model using EMG and IMUs
in workers wearing back-support exoskeletons during trunk-flexion tasks [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. They applied machine
learning algorithms like SVM, Random Forest, and XGBoost, achieving up to 95% accuracy with
combined sensor data. This study underscores the potential of wearable technology in monitoring fatigue
and improving ergonomic practices in demanding jobs.
      </p>
      <p>
        In the context of fatigue prediction, IMU and EMG sensors are particularly useful due to their ability
to capture detailed biomechanical and muscle activity data. These sensors complement each other;
IMUs track movement dynamics, while sEMG records muscle activation patterns, making them highly
effective in identifying fatigue. For instance, the study by [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] utilized a combination of EMG and IMU
data to monitor fatigue in real-time during rehabilitation exercises. By integrating data from these
sensors, the researchers could assess muscle fatigue more accurately, which is crucial for preventing
overexertion during recovery tasks. Similarly, [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] employed both EMG and IMUs in their study on
fatigue management during rehabilitation. Their approach involved using these sensors to gather
comprehensive data on muscle activity and movement, which was then processed using machine
learning algorithms to classify different fatigue states. This method proved highly effective, as the
combination of EMG and IMU data provided a more holistic view of the patient’s physiological states,
leading to more accurate fatigue prediction. This combined approach is particularly beneficial in
applications requiring real-time monitoring and feedback, such as in sports and rehabilitation settings.
However, their approach treated fatigue estimation as a classification problem, which provides coarser
predictions than a regression formulation.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset Preparation</title>
      <p>
        This section describes the dataset and the methodology used to preprocess it. This study is based on
the measurements collected by Yasar et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The data collection took place at the Tyndall National
Institute’s Wearable Laboratory, University College Cork, in Cork, Ireland, between April and July 2023.
The research was conducted on 34 healthy individuals, of which 23 were male and 11 were female. All
34 subjects had no previous history of musculoskeletal injuries. All the participants were physically
active and engaged in some form of physical training at least thrice a week.
      </p>
      <p>
        To evaluate muscle activation and upper extremity movements, EMG electrodes and IMU sensors
were applied to the dominant side of the upper body following the warm-up. A wireless EMG system
with a 1000 Hz sampling frequency was used to capture muscle activation. The reference areas indicated
in Cram’s Introduction to Surface Electromyography [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] were followed while positioning the EMG
electrodes. IMU sensors were placed on the hand, forearm, upper arm, shoulder, sternum, and pelvis. A
push/pull dynamometer was used to measure each volunteer’s Maximum Voluntary Isometric Contraction
(MVIC) forces during shoulder internal rotation and external rotation movements, that is, the
greatest force the muscles can exert without changing their length. Every measurement had a duration
of five seconds and was conducted twice every two minutes. Participants in the MVIC assessment were
sitting on a bench, pulling a dynamometer that was clamped to a table next to them, with their wrists
straight and elbows bent 90 degrees. For analysis, the mean of the two consecutive force measurements
was employed. After that, participants used cable pulley equipment to repeat shoulder internal rotation
and external rotation motions in a random order using 30–40% of their MVIC force. There was a
10-minute break in between each measurement. While standing, subjects were positioned laterally to
the fixed cable pulley, elbow flexed at a 90-degree angle, and wrist straight for the shoulder’s internal
and external rotation movements, as shown in Figure ??.
      </p>
      <p>
        The Borg RPE scale was used to gauge each participant’s perceived level of physical exhaustion before
the shoulder internal rotation and external rotation movements, and then every 10 seconds during them.
The Borg RPE scale ranges from 6 ("no exertion at all") to 20 ("maximum exertion"). It
enables people to gauge how hard a task is and how tired they are. The exercise was continued until
the subject was unable to exert any more effort (level 20). During the tests, a metronome set at 40 beats
per minute was utilized to guarantee that the workouts were performed at a steady pace. For more
details, the original dataset paper is publicly available [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <sec id="sec-3-1">
        <title>3.1. Data Pre-processing</title>
        <p>
          This section describes the procedure used to convert the collected dataset into a format suitable for a
supervised machine learning task. Established literature shows that a band-pass filter effectively
eliminates motion artefacts in EMG signals. Increasing the cut-off frequency reduces ECG contamination
and smooths the signal, and a high-pass cut-off of around 30 Hz is reported to be optimal.
For the EMG, the mean was subtracted from the signal, and a 4th-order Butterworth band-pass filter was
applied with a low cut-off at 30 Hz and a high cut-off at 350 Hz [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. The continuous
EMG and IMU signals were then segmented into individual repetitions by identifying
the relevant peaks and delimiting their beginning and end. For internal rotation movements, repetitions
were defined as segments between two successive zero-crossing points. For external rotation, they were
defined between a peak and the subsequent valley. This method ensured accurate segmentation for
both movements.
        </p>
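        <p>The zero-crossing rule described above for internal rotation can be illustrated with a minimal sketch. It assumes a one-dimensional signal stored as a plain Python list; the function names <monospace>zero_crossings</monospace> and <monospace>split_repetitions</monospace> are hypothetical, and the text does not state which sensor channel was used for segmentation, so this is an illustration of the idea rather than the procedure actually applied in the study.</p>
        <preformat>
```python
def zero_crossings(signal):
    """Return the indices at which the signal changes sign."""
    crossings = []
    previous_sign = 0
    for i, value in enumerate(signal):
        sign = (value > 0) - (0 > value)  # +1, -1, or 0
        if sign == 0:
            continue  # exact zeros do not change the current sign
        if previous_sign != 0 and sign != previous_sign:
            crossings.append(i)
        previous_sign = sign
    return crossings


def split_repetitions(signal):
    """Cut the signal into segments between successive zero crossings."""
    idx = zero_crossings(signal)
    return [signal[start:end] for start, end in zip(idx, idx[1:])]


# Toy signal (illustrative values only)
toy_signal = [1, 2, -1, -2, 1, 3, -1]
toy_crossings = zero_crossings(toy_signal)
toy_segments = split_repetitions(toy_signal)
```
        </preformat>
        <p>Samples before the first and after the last crossing are discarded in this sketch, since they do not belong to a complete segment.</p>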
        <p>The Borg scale ratings, a self-assessment metric of perceived exertion, had to be assigned to each
individual repetition. The Borg scale was recorded at 10-second intervals during the exercises.
Since repetitions occurring between assessments lacked corresponding Borg values, these values were
inferred by linear interpolation of the known Borg values. The duration of each repetition was
cumulatively summed to obtain its time, and a rating was assigned from the nearest recorded values. The
interpolated Borg ratings, starting at 6 and ending at 20, were stored and used as targets of the machine
learning task. Any samples exceeding the final Borg recording time by more than 10 seconds were excluded to
avoid inaccuracies.</p>
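        <p>A minimal sketch of this label assignment is given below. It assumes that each repetition is labelled at its cumulative end time and that at least two Borg recordings are available; the function name <monospace>interpolate_borg</monospace>, the <monospace>tolerance</monospace> parameter, and the toy values are hypothetical and do not come from the dataset.</p>
        <preformat>
```python
def interpolate_borg(rep_durations, borg_times, borg_values, tolerance=10.0):
    """Assign a Borg rating to each repetition by linear interpolation.

    Repetitions ending more than `tolerance` seconds after the last
    recorded rating receive None, meaning they are excluded.
    """
    ratings = []
    elapsed = 0.0
    for duration in rep_durations:
        elapsed += duration  # cumulative end time of this repetition
        if elapsed > borg_times[-1] + tolerance:
            ratings.append(None)
            continue
        # Clamp to the recorded time range before interpolating
        t = min(max(elapsed, borg_times[0]), borg_times[-1])
        k = 1
        while t > borg_times[k]:
            k += 1
        t0, t1 = borg_times[k - 1], borg_times[k]
        v0, v1 = borg_values[k - 1], borg_values[k]
        ratings.append(v0 + (t - t0) / (t1 - t0) * (v1 - v0))
    return ratings


# Toy example: ratings recorded at 0, 10 and 20 seconds
toy_ratings = interpolate_borg(
    rep_durations=[5, 5, 5, 5, 15],
    borg_times=[0, 10, 20],
    borg_values=[6, 8, 12],
)
```
        </preformat>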
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Feature Extraction</title>
        <p>Feature extraction is fundamental to avoid overfitting, given the high dimensionality of the data and the limited number
of participants. This process transforms raw data into representative features for analysis and modelling.
The goal is to reduce data dimensionality while retaining essential information. The TSFEL package has
been used to automatically extract features from time series data, encompassing statistical, temporal,
and frequency domains. It simplifies the extraction process, ensuring a diverse feature set for analysis
[16]. After segmenting the EMG and IMU data, features were extracted using a window size of 30
samples with a 10-sample overlap for EMG and 300 samples with a 100-sample overlap for IMU. This
approach effectively captured fine details in EMG and broader motion patterns in IMU data.</p>
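        <p>The windowing scheme can be sketched as follows. This example does not use TSFEL; it only illustrates how overlapping windows are formed (a window of 30 samples with a 10-sample overlap advances by 20 samples) and computes a few simple statistics per window. The function names and the chosen statistics are assumptions made for illustration.</p>
        <preformat>
```python
from statistics import mean, pstdev


def sliding_windows(samples, window, overlap):
    """Split samples into fixed-size windows that share `overlap` samples."""
    step = window - overlap
    last_start = len(samples) - window
    return [samples[s:s + window] for s in range(0, last_start + 1, step)]


def window_features(samples, window, overlap):
    """Compute a small set of statistical features for every window."""
    features = []
    for w in sliding_windows(samples, window, overlap):
        features.append(
            {"mean": mean(w), "std": pstdev(w), "min": min(w), "max": max(w)}
        )
    return features


# Toy EMG-like segment of 70 samples, using the EMG window settings
toy_segment = list(range(70))
toy_features = window_features(toy_segment, window=30, overlap=10)
```
        </preformat>
        <p>A trailing group of samples shorter than one full window is dropped in this sketch; how the original pipeline treats such remainders is not stated in the text.</p>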
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>The problem is a supervised regression task. Given all the signals collected by the sensors from a single
repetition, predict the Borg level. The preprocessed dataset described in the previous section is fed
to a machine learning pipeline composed of a min-max scaler, Principal Component Analysis (PCA)
dimensionality reduction, Recursive Feature Elimination (RFE) and finally, the chosen regressor.</p>
      <p>A large number of features were generated after feature extraction, leading to redundancy and
collinearity. Dimensionality reduction reduces the number of input variables while retaining relevant
information, simplifying interpretation and analysis. For these reasons, the feature space has been
reduced by deploying PCA. PCA is one of the most widely used dimensionality reduction techniques.
It identifies orthonormal vectors that explain the variance structure of the data, aiming to find the
most meaningful basis to re-express a dataset and filter out noise to reveal hidden structures.
RFE, in contrast, is a feature selection method that iteratively reduces the number of features to identify the most
important ones. It works by training an estimator on the full set of features and ranking them based on
their importance. The least important features are removed, and the process is repeated on the reduced
feature set until the desired number of features is selected.</p>
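      <p>The pipeline described above can be sketched with scikit-learn as follows. This is an illustrative assumption about how the steps may be chained, not the authors' code: the number of PCA components, the number of features kept by RFE, the Ridge penalty, and the random data are placeholders rather than the tuned values or the real dataset.</p>
      <preformat>
```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler


def build_pipeline(n_components=10, n_selected=5, alpha=1.0):
    """Min-max scaling, then PCA, then RFE, then the final regressor."""
    return Pipeline([
        ("scale", MinMaxScaler()),
        ("pca", PCA(n_components=n_components)),
        ("rfe", RFE(estimator=Ridge(alpha=alpha),
                    n_features_to_select=n_selected)),
        ("reg", Ridge(alpha=alpha)),
    ])


# Placeholder data: 60 repetitions with 20 extracted features each
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
y = rng.uniform(6, 20, size=60)  # Borg ratings lie between 6 and 20

model = build_pipeline().fit(X, y)
predictions = model.predict(X)
```
      </preformat>
      <p>Wrapping all steps in a single pipeline means that the scaler, PCA, and RFE are fitted only on the training portion of each fold when the pipeline is cross-validated.</p>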
      <p>The following machine learning regressors have been considered due to their good performance in
similar tasks:
• Random Forest Regressor: an ensemble learning algorithm used for regression tasks,
predicting continuous values by using a bagging technique (Bootstrap Aggregating).
• Lasso Regression: applies L1 regularization, which forces some coefficients to be exactly zero,
effectively performing feature selection and reducing overfitting.
• Ridge Regression: incorporates L2 regularization, adding the squared magnitude of coefficients
to the loss function, which helps prevent overfitting, especially in datasets with many features or
multicollinearity.
• Elastic Net: a hybrid of Lasso and Ridge regressions, combining both L1 and L2 penalties
to balance model complexity and improve predictive accuracy, particularly in datasets with
correlated features.</p>
      <p>The structure of the dataset requires a custom cross-validation approach to avoid data leakage. The
folds have been designed to group together all the repetitions of an individual, avoiding the presence of
data from a person in both the train and test sets. A leave-one-out approach at the subject level has
been used in this case. This created 34 models, each tested on one participant after being trained on all the
data from the other subjects. The best parameters for PCA (number of components) and the regressors
have been selected through an extensive hyperparameter tuning procedure using grid search.</p>
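      <p>The subject-level split can be sketched in a few lines. The generator below is a hypothetical helper, not the authors' implementation; it assumes one subject identifier per repetition and yields, for each subject, the indices used for training and for testing.</p>
      <preformat>
```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Toy example: six repetitions performed by three subjects
toy_subjects = ["a", "a", "b", "c", "c", "c"]
toy_folds = list(leave_one_subject_out(toy_subjects))
```
      </preformat>
      <p>With 34 subjects this produces 34 folds, and no repetition of the held-out subject ever appears in the training indices. scikit-learn offers an equivalent splitter, <monospace>LeaveOneGroupOut</monospace>, which can be passed to a grid search.</p>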
      <sec id="sec-4-1">
        <title>4.1. Sensor Reduction Procedure</title>
        <p>The dataset analysed comprised 12 sensors, requiring a complex setup that might be quite uncomfortable
for the subject. For instance, in sports involving physical activity, sensors placed on different parts of a
player’s body could restrict movement, cause irritation, or even affect the athlete’s natural performance.
This raises important considerations regarding the practicality and feasibility of deploying such a
comprehensive sensor array in real-world situations. The volume of data generated from these 12
sensors could present challenges in data storage and processing and in real-time analysis. Reducing the
number of sensors used to predict fatigue levels would lead to smaller, cheaper, and more comfortable
devices with longer charge life. For these reasons, a procedure to identify an efficient set of sensors is
crucial for the application. Once a sensor is added to the design, there is no additional cost to leverage
all the features provided by that sensor. Traditional feature selection does not consider this aspect.
Potentially, some sensors are not contributing to fatigue detection, or their impact can be inferred from
the other sensors, making them redundant. Evaluating all possible sensor combinations would require
an exponential number of models, 2<sup>12</sup> = 4096 in this case, making it infeasible in practice. For this reason, an
iterative backward sensor elimination procedure has been devised. Starting from the full set of sensors,
the procedure iteratively eliminates the sensor that has the least impact in terms of accuracy. Given a
set of n remaining sensors, the procedure applied is the following:
1. Create n models, each trained on a dataset obtained by removing all data from one sensor.
2. Evaluate each model and select the one with the highest accuracy. That model is the one that has
been the least affected by the sensor removal.
3. Remove from the dataset the sensor with the least impact.
4. Iterate the procedure with the n − 1 remaining sensors unless only one sensor is remaining.</p>
        <p>In this case, the number of models evaluated is the triangular number of the number of sensors (78
starting with 12 sensors). While this is not guaranteed to provide the best subset, it can be computed in
a reasonable time.</p>
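        <p>The elimination loop can be sketched as below. The <monospace>evaluate</monospace> callback is a hypothetical stand-in for training and cross-validating a model on a given subset of sensors and returning its error (lower is better); the toy scoring function and sensor names are invented purely to make the example runnable.</p>
        <preformat>
```python
def backward_sensor_elimination(sensors, evaluate):
    """Greedily drop the sensor whose removal yields the lowest error."""
    remaining = list(sensors)
    removal_order = []
    while len(remaining) > 1:
        candidates = []
        for sensor in remaining:
            # One model per candidate: all remaining sensors except this one
            subset = [s for s in remaining if s != sensor]
            candidates.append((evaluate(subset), sensor))
        best_error, least_useful = min(candidates)
        remaining.remove(least_useful)
        removal_order.append((least_useful, best_error))
    return removal_order, remaining


# Toy scoring: each sensor lowers the error by a fixed, made-up amount
toy_usefulness = {"emg_a": 3.0, "emg_b": 1.0, "imu_a": 2.0, "imu_b": 0.5}
toy_calls = []


def toy_evaluate(subset):
    toy_calls.append(tuple(subset))
    return 10.0 - sum(toy_usefulness[s] for s in subset)


toy_order, toy_remaining = backward_sensor_elimination(
    list(toy_usefulness), toy_evaluate
)
```
        </preformat>
        <p>Because the search is greedy, a sensor removed early is never reconsidered, which is why the procedure is fast but not guaranteed to find the best subset.</p>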
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Metrics</title>
        <p>All models have been compared in terms of Mean Squared Error (MSE), Root Mean Squared Error
(RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE). The MSE measures
the average squared prediction error, emphasizing larger errors due to its sensitivity to outliers. The
RMSE is the square root of MSE, providing an interpretable error metric in the same units as the
target variable while maintaining sensitivity to large errors. The MAE calculates the average absolute
difference between predicted and actual values, treating all errors equally and making it more robust
against outliers. Finally, the MAPE expresses the error as a percentage of the actual values by averaging
the absolute percentage differences between predicted and actual values. It provides an intuitive measure
of prediction accuracy but can be distorted by very small actual values, leading to large percentage
errors.</p>
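        <p>For reference, the four metrics can be computed as in the following minimal sketch. The function name <monospace>regression_metrics</monospace> and the toy values are illustrative assumptions; MAPE is expressed here as a percentage.</p>
        <preformat>
```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and MAPE (in percent) for paired values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = 100 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    return {"MSE": mse, "RMSE": sqrt(mse), "MAE": mae, "MAPE": mape}


# Toy example with two Borg ratings and two predictions
toy_metrics = regression_metrics(y_true=[10, 20], y_pred=[12, 16])
```
        </preformat>
        <p>Since Borg ratings range from 6 to 20, the division in the MAPE term never involves a zero target in this task.</p>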
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experimental Results</title>
      <p>This section presents the experimental study. The research was conducted
using Python. The code and preprocessed datasets are publicly available at https://github.com/andvise/FatigueEstimation.</p>
      <sec id="sec-5-1">
        <title>5.1. All Sensors</title>
        <p>This experiment evaluated the potential to predict the Borg scale with different regressors using data
from all the sensors available. Due to the lack of a comparable approach in the literature, a simple mean
baseline has been considered. This baseline predicted the average Borg value of the training set. Its
goal was to assess whether the regressors can extract information about fatigue from the sensors
or are only learning the distribution of the training labels.</p>
        <p>The Random Forest Regressor is the least effective machine learning model in this comparison. It
has the highest MAE (2.561), which means its predictions are, on average, the furthest from the actual
values compared to the other models. The poorer performance of the Random Forest Regressor may
be due to its complexity or the possibility that it is not as well-suited to the specific characteristics of
this dataset as the linear models. The learned patterns likely overfit the training subjects and transfer poorly to an
unseen subject. Figure 2 shows the model’s predictions. Compared to the Ridge Regressor, the points are
more widely scattered across the plot. Because the Ridge Regression model performed best, and for the sake of brevity,
the next sections focus on it.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. IMU and EMG Sensor Data Comparison</title>
        <p>Two Ridge models have been tested, one on the data from all IMU sensors and one on the data from all EMG sensors. The analysis provides an
overview of which type of sensor provides more relevant patterns for fatigue estimation. The results of
the evaluation are given in Table 2.</p>
        <p>The overall performance considerably worsens with only one type of sensor. The IMU-only model
outperforms the EMG-based one. However, its accuracy is considerably worse than that of the Random Forest
Regressor in Table 1, the model with the lowest accuracy. The effect can be seen in Figure 3; the model
is more conservative with the predictions, avoiding outputting high or low values. The performance of
the EMG-only model is surprisingly poor, given that EMG is considered an effective sensor for measuring
fatigue [17]. In this particular case, the full set of EMG sensors performs only slightly above the chance level.</p>
        <p>When both IMU and EMG data were integrated, the combined model from the previous section
achieved an MSE of 8.639, which is notably lower than the MSEs of both the IMU-only and EMG-only
models. These results clearly indicate that integrating both IMU and EMG data leads to a more accurate
and reliable model, leveraging the complementary strengths of each sensor type to enhance the overall
predictive performance.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Individual Sensor Comparison</title>
        <p>The set of 12 sensors is distributed across the body, and their connectivity requires a bulky device.
Wearing multiple sensors during physical activity could restrict movement, cause irritation, or even
affect the athlete’s natural performance. This raises important considerations regarding the practicality
and feasibility of deploying such a comprehensive sensor array in real-world situations. This experiment
evaluates the performance of the Ridge Regressor on data collected from a single sensor.</p>
        <p>Table 3 presents the accuracy of the model trained on individual sensors specifying their location. In
contrast with the previous section, the individual EMG sensors are the most accurate ones, outperforming
the set containing all EMG sensors. This could be caused by the fact that the high dimensionality of the
data leads to patterns that are not representative of the real phenomena. However, all models trained
on a single sensor are less accurate than the ones trained on the full dataset. The location of the most
relevant sensors is consistent with the biomechanics of the movement. The infraspinatus is one of the four
muscles of the rotator cuff; its main role is to rotate the humerus and stabilize the shoulder, while the
pectoralis major is the largest muscle of the front of the chest. Both of them are heavily involved in the two
rotation movements.</p>
      </sec>
      <sec id="sec-5-4">
        <title>5.4. Backward Sensor Elimination</title>
        <p>The previous experiments showed that while some individual sensors yielded relatively good MSE
results, they still fell short of the model’s performance using all sensors. This indicates that relying on a
single sensor is insufficient for accurately predicting outcomes in complex tasks like shoulder rotation
exercises. However, some of the individual sensors might be sampling areas that are not involved in the
movement or are redundant. The aim of this section is to identify a set of sensors that provide accuracy
comparable to the complete data. The procedure described in Section 4.1 is used for this goal.</p>
        <p>Figure 4 shows how the MAE decreases as the number of sensors increases. The horizontal
axis shows the sensor removed at each step. The rightmost point represents the accuracy of the full
dataset. The first sensor discarded was the EMG on the trapezius ascendens. The process is iterated
until the EMG sensor on the pectoralis major is removed, leaving only the one on the infraspinatus,
which alone reaches an MSE of 2.93 (since that sensor has not been removed, it is not present in the
horizontal axis labels). As expected, the progressive reduction of sensors leads to a decrease in the
model’s accuracy.</p>
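        <p>Since both MAE and MSE are reported in this section, the short sketch below shows how the two metrics differ
on the same predictions. The helper functions are defined here for illustration (they are not library calls), and
the ratings are made-up values, not data from the study.</p>
        <preformat>
```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference, in the same units as the rating."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference, which penalizes large misses more."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up fatigue ratings for illustration only (not from the dataset).
reported = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 4.0, 9.0]

mae = mean_absolute_error(reported, predicted)
mse = mean_squared_error(reported, predicted)
print(f"MAE = {mae:.2f}, MSE = {mse:.2f}")  # prints "MAE = 1.00, MSE = 1.50"
```
        </preformat>
        <p>The single two-point miss contributes half of the MAE but most of the MSE, which is why the two metrics can
rank sensor subsets differently.</p>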
        <p>Notably, the error increases significantly after removing the pelvis and shoulder IMU sensors. In
line with the previous results, the infraspinatus, pectoralis major and deltoideus anterior are the last ones
to be removed. This is due to their role in stabilizing the shoulder during external rotation, which makes
them sensitive to fatigue. The palm and upper arm IMUs could reflect a change in posture or movement
due to the overall muscle fatigue, effectively capturing key activities in shoulder external rotation. The
analysis indicates that removing the forearm and torso IMUs, and the deltoideus posterior and trapezius
ascendens, would likely not impact the accuracy of the model. This would allow the sensor setup to be
simplified without a loss in performance.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>This study aligns with research highlighting the usefulness of IMU and EMG sensors for tracking
physiological and biomechanical characteristics. Similar studies have demonstrated their utility across
various fields, including sports science, occupational health, and rehabilitation. However, predicting
the Borg scale value is a challenging task due to the subjective nature of physical fatigue perception
and the uncertainty caused by the self-assessment. The proposed data preprocessing and machine
learning pipeline demonstrated potential in predicting the perceived fatigue of previously
unseen individuals. The importance of both the mechanical aspect of the movement assessed by the IMUs and
the muscular electrical activity measured by the EMGs has been highlighted, since the best models
require both sensor types. Finally, a procedure to obtain a reduced set of sensors by iteratively removing the
least impactful one provided a smaller set of sensors with equivalent accuracy, opening the path for
the development of cheaper, lighter and more efficient tools. This work underscores advancements in
machine learning algorithms and sensor technologies, enhancing our understanding and application of
these instruments for real-time fatigue evaluation.</p>
      <p>While significant findings were achieved, the study had limitations. The primary limitation was the
relatively small sample size, which may restrict the generalizability of the results and the accuracy
of the models. The high dimensionality combined with the small population size can give rise to
patterns that are not related to fatigue. Additionally, findings obtained on healthy adults may not translate
to populations with musculoskeletal disorders or to athletes with different fatigue dynamics. However,
the procedures developed herein and their implementation can be easily adapted to newly
collected datasets. Finally, the addition of explainability approaches would provide a more informative picture
to practitioners [18].</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This work was conducted with the financial support of Science Foundation Ireland under Grant Nos.
12/RC/2289-P2 and 16/RC/3918 which are co-funded under the European Regional Development Fund.
This research was partially supported by the EU’s Horizon Digital, Industry, and Space program under
grant agreement ID 101092989-DATAMITE. For the purpose of Open Access, the author has applied a CC
BY public copyright license to any Author Accepted Manuscript version arising from this submission.
</p>
      <p>[16] M. Barandas, D. Folgado, L. Fernandes, S. Santos, M. Abreu, P. Bota, H. Liu, T. Schultz, H. Gamboa,
Tsfel: Time series feature extraction library, SoftwareX 11 (2020) 100456.</p>
      <p>[17] H. A. Yousif, A. Zakaria, N. A. Rahim, A. F. B. Salleh, M. Mahmood, K. A. Alfarhan, L. M. Kamarudin,
S. M. Mamduh, A. M. Hasan, M. K. Hussain, Assessment of muscles fatigue based on surface emg
signals using machine learning and statistical approaches: A review, in: IOP conference series:
materials science and engineering, volume 705, IOP Publishing, 2019, p. 012010.</p>
      <p>[18] P. O’Sullivan, M. Menolotto, A. Visentin, B. O’Flynn, D.-S. Komaris, Ai-based task classification
with pressure insoles for occupational safety, IEEE Access (2024).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N.-T.</given-names>
            <surname>Tsai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. W.</given-names>
            <surname>McClure</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Karduna</surname>
          </string-name>
          ,
          <article-title>Effects of muscle fatigue on 3-dimensional scapular kinematics</article-title>
          ,
          <source>Archives of Physical Medicine and Rehabilitation</source>
          <volume>84</volume>
          (
          <year>2003</year>
          )
          <fpage>1000</fpage>
          -
          <lpage>1005</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Garg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kapellusch</surname>
          </string-name>
          ,
          <article-title>Short-cycle overhead work and shoulder girdle muscle fatigue</article-title>
          ,
          <source>International Journal of Industrial Ergonomics</source>
          <volume>36</volume>
          (
          <year>2006</year>
          )
          <fpage>581</fpage>
          -
          <lpage>597</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Fett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ulbricht</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferrauti</surname>
          </string-name>
          ,
          <article-title>Impact of physical performance and anthropometric characteristics on serve velocity in elite junior tennis players</article-title>
          ,
          <source>The Journal of Strength &amp; Conditioning Research</source>
          <volume>34</volume>
          (
          <year>2020</year>
          )
          <fpage>192</fpage>
          -
          <lpage>202</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>N. A.</given-names>
            <surname>Evans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Simon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Konz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Nitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. L.</given-names>
            <surname>Uhl</surname>
          </string-name>
          ,
          <article-title>Reliability of isokinetic decay slope is superior to using fatigue indices for shoulder horizontal abduction</article-title>
          ,
          <source>Journal of Bodywork and Movement Therapies</source>
          <volume>37</volume>
          (
          <year>2024</year>
          )
          <fpage>372</fpage>
          -
          <lpage>378</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <article-title>Prevalence of work-related musculoskeletal disorders among workers in the automobile manufacturing industry in china: a systematic review and meta-analysis</article-title>
          ,
          <source>BMC Public Health</source>
          <volume>23</volume>
          (
          <year>2023</year>
          )
          <fpage>2042</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Lambay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Morgan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <article-title>A data-driven fatigue prediction using recurrent neural networks</article-title>
          ,
          <source>in: 2021 3rd International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA)</source>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ugalde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Figueroa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>El Saddik</surname>
          </string-name>
          ,
          <article-title>Towards whole body fatigue assessment of human movement: A fatigue-tracking system based on combined semg and accelerometer signals</article-title>
          ,
          <source>Sensors</source>
          <volume>14</volume>
          (
          <year>2014</year>
          )
          <fpage>2052</fpage>
          -
          <lpage>2070</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Pinto-Bernal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Cifuentes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Perdomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rincón-Roncancio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Múnera</surname>
          </string-name>
          ,
          <article-title>A data-driven approach to physical fatigue management using wearable sensors to classify four diagnostic fatigue states</article-title>
          ,
          <source>Sensors</source>
          <volume>21</volume>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T.</given-names>
            <surname>Op De Beéck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Meert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Schütte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Vanwanseele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <article-title>Fatigue prediction in outdoor runners via machine learning and sensor fusion</article-title>
          ,
          <source>in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining, KDD '18</source>
          ,
          <publisher-name>Association for Computing Machinery</publisher-name>
          ,
          <publisher-loc>New York, NY, USA</publisher-loc>
          ,
          <year>2018</year>
          , pp.
          <fpage>606</fpage>
          -
          <lpage>615</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Biró</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cuesta-Vargas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Szilagyi</surname>
          </string-name>
          ,
          <article-title>Ai-assisted fatigue and stamina control for performance sports on imu-generated multivariate times series datasets</article-title>
          ,
          <source>Sensors</source>
          <volume>24</volume>
          (
          <year>2023</year>
          )
          <fpage>132</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>H.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <article-title>Prediction of sports fatigue degree based on spectral sensors and machine learning algorithms</article-title>
          ,
          <source>Optical and Quantum Electronics</source>
          <volume>56</volume>
          (
          <year>2024</year>
          )
          <fpage>696</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Kuber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Kulkarni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Rashedi</surname>
          </string-name>
          ,
          <article-title>Machine learning-based fatigue level prediction for exoskeleton-assisted trunk flexion tasks using wearable sensors</article-title>
          ,
          <source>Applied Sciences</source>
          <volume>14</volume>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M. N.</given-names>
            <surname>Yasar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sica</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>O'Flynn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tedesco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Menolotto</surname>
          </string-name>
          ,
          <article-title>A dataset for fatigue estimation during shoulder internal and external rotation movements using wearables</article-title>
          ,
          <source>Scientific Data</source>
          <volume>11</volume>
          (
          <year>2024</year>
          )
          <fpage>433</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Cram</surname>
          </string-name>
          ,
          <source>Introduction to Surface Electromyography</source>
          , 1st ed.,
          <publisher-name>Aspen Publishers</publisher-name>
          ,
          <publisher-loc>Gaithersburg, MD</publisher-loc>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Redfern</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hughes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chaffin</surname>
          </string-name>
          ,
          <article-title>High-pass filtering to remove electrocardiographic interference from torso emg recordings</article-title>
          ,
          <source>Clinical Biomechanics</source>
          <volume>8</volume>
          (
          <year>1993</year>
          )
          <fpage>44</fpage>
          -
          <lpage>48</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>