<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using Machine Learning to Predict the Number of Latent Skills in Online Learning Environments</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Changsheng Chen</string-name>
          <email>changsheng.chen@kuleuven.be</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Robbe D'hondt</string-name>
          <email>robbe.dhondt@kuleuven.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Celine Vens</string-name>
          <email>celine.vens@kuleuven.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wim Van Den Noortgate</string-name>
          <email>wim.vandennoortgate@kuleuven.be</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Public Health and Primary Care, KU Leuven</institution>
          ,
          <addr-line>Campus KULAK, Kortrijk</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Faculty of Psychology and Educational Sciences, KU Leuven</institution>
          ,
          <addr-line>Campus KULAK, Kortrijk</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>imec research group itec, KU Leuven</institution>
          ,
          <addr-line>Kortrijk</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <fpage>15</fpage>
      <lpage>25</lpage>
      <abstract>
        <p>Extracting skill information for students in online learning environments has been a challenging topic across different domains. Predicting the number of skills is the first step towards estimating students' skills. In this paper, we propose prediction methods based on Machine Learning (ML) models: we used the analysis model to generate simulation data reflecting the data features of our target scenarios, and extracted features from the simulation data to train and test ML models. We illustrate this approach in tandem with Multidimensional Item Response Theory (MIRT) for the simple and complex structure, and further compare the trained ML models with a selection of statistical methods on the test data. Our preliminary results show that, compared to statistical methods, ML models generally reach a noticeably higher proportion of correct estimations for both structures. Additionally, we find that increasing the percentage of missing values and the sample size has negative and positive effects, respectively, on the methods' performance. Using simulation data from the analysis model to train ML models and make predictions can extend the current practice of skill extraction, providing extra options for practitioners.</p>
      </abstract>
      <kwd-group>
        <kwd>machine learning</kwd>
        <kwd>multidimensional item response theory</kwd>
        <kwd>latent skills</kwd>
        <kwd>online learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Skill information is a fundamental type of quantitative evidence for building an online
learning system (including adaptive lifelong learning systems). With accurate estimates of users’
skills, such a system can personalize materials and instructional design to improve the
learning experience effectively and efficiently. By monitoring changes in users’ skill
information, the system can frequently recommend further learning resources adapted to users’
situations. However, which skills can be extracted and monitored, and how the skill information
can be estimated from which test items and the relevant users’ responses, remain challenging questions.</p>
      <p>
        Several kinds of techniques have been used to extract users’ skill information based on users’
responses to test items, such as Multidimensional Item Response Theory (MIRT) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], Cognitive
Diagnostic Models (CDM) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], Matrix Factorization (MF) [
        <xref ref-type="bibr" rid="ref4 ref5">4,5</xref>
        ], and so forth. The common starting
point for these techniques is to decide the number of skills and to clarify the relationship
between items and skills (or knowledge components). In other words, the number of skills and
which items can be used to measure which skills are clearly defined before skill estimation and
tracing algorithms are performed. For example, in MIRT, the item-dimension relationship
needs to be explored after the number of latent dimensions is predetermined, and this
relationship serves as the basis for estimating users’ skill values. In CDM, the item-attribute
relationship depicted by the Q-matrix functions in a similar way, and the number of attributes
should also be confirmed beforehand. In MF, the number of ranks shaping the two
decomposed matrices (i.e., a user-factor matrix and an item-factor matrix) is required
before the technique is performed. Traditionally, the number of skills and the item-skill
relationship are theoretically defined by domain experts. However, human examination is too
inefficient to satisfy the needs of online learning systems because of the large number of items,
which calls for a data-driven approach (i.e., extracting the number of skills and exploring and
confirming the item-skill structure based on the response matrix).
      </p>
      <p>
        Many techniques have been proposed to estimate the number of skills based on data-driven
evidence. For example, in the MIRT, the number of latent dimensions is estimated by certain
statistical methods, such as Kaiser Criterion (KC) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], Empirical Kaiser Criterion (EKC) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
Parallel Analysis (PA) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], non-graphical Scree Plot with Optimal Coordinates (OC) or
Acceleration Factor (AF) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], Very Simple Structure (VSS) with two variants (i.e., C1 &amp; C2) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ],
and so forth. In CDM, the number of attributes and the related Q-matrix are estimated and
evaluated by purpose-designed algorithms or statistics, such as the G-DINA Discrimination Index
(GDI) method [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], the stepwise method [
        <xref ref-type="bibr" rid="ref12">12</xref>
], and so on. In MF, the number of ranks is
usually treated as a hyperparameter, chosen based on the evaluation of a defined loss
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Additionally, some researchers have explored using Machine Learning (ML) methods to
estimate the number of skills, and they found that it can increase the proportion of correct
predictions. For example, Goretzko &amp; Bühner [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] used eXtreme Gradient Boosting (XGBoost),
Random Forest (RF), and Adaptive Boosting to predict the number of factors for continuous
response simulation data, and found that these methods performed better than other traditional
statistical methods in terms of prediction accuracy (i.e., the proportion of correct estimation).
However, their study did not explore the possibility of using ML methods to predict the
number of skills for dichotomous responses while considering the features of online or
adaptive learning data (e.g., sparsity and a large number of items) and the properties of
different multidimensional structures.
      </p>
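      <p>As an illustration of the simplest of these rules, the Kaiser Criterion retains as many latent dimensions as there are eigenvalues of the correlation matrix greater than 1. A minimal sketch in Python (for illustration only; this study used R implementations):</p>

```python
import numpy as np

def kaiser_criterion(corr):
    """Estimate the number of factors as the count of eigenvalues > 1."""
    eigenvalues = np.linalg.eigvalsh(corr)  # eigenvalues of a symmetric matrix
    return int(np.sum(eigenvalues > 1.0))

# Toy 4-variable correlation matrix: three variables share one factor,
# the fourth is only weakly related
corr = np.array([
    [1.0, 0.6, 0.6, 0.1],
    [0.6, 1.0, 0.6, 0.1],
    [0.6, 0.6, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
])
print(kaiser_criterion(corr))  # → 1
```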
      <p>
        In this study, we aim to fill this research gap by proposing ML prediction methods inspired
by Goretzko &amp; Bühner [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and comparing their performance with that of selected statistical
methods. The general operation is that we use the analysis model (such as MIRT, CDM, or
MF) to generate simulation data reflecting the data features of target scenarios in online
learning environments. The simulation data consists of two parts: the training data (including
validation data) for training and tuning the ML models, and the test data for evaluating the
performance of the ML models and the selected statistical methods. In detail, the selected methods
included: 1) ML models: the regression variants of XGBoost and RF, whose results were rounded
to the nearest integer; 2) statistical methods: KC, PA, EKC, Scree Plot (OC), Scree Plot (AF), VSS (C1),
and VSS (C2). For the sake of parsimony, the explanation of the methods’ mechanisms is skipped;
relevant details can be consulted in the provided references.
      </p>
      <p>In the following sections, we illustrate this operation in tandem with MIRT for the simple
and complex structure. MIRT is the prevailing statistical model for analyzing students’ binary
responses (0: wrong; 1: right) to estimate students’ abilities and the relevant item parameters in the
field of psychological and educational assessments. The principle of MIRT is that it models the
probability of giving a correct answer based on the interaction between a student’s ability and
the item parameters. For example, a 2-parameter MIRT model can be expressed by:</p>
      <p>P(y_ij = 1 | θ_i; a_j, d_j) = exp(a_j′θ_i + d_j) / (1 + exp(a_j′θ_i + d_j))</p>
      <p>
        In the above formula, y_ij = 1 refers to the correct response of user i for item j, and θ_i =
(θ_i1, θ_i2, …, θ_ik), a_j = (a_j1, a_j2, …, a_jk), and d_j indicate the ability of user i for skill k, the item
discrimination of item j for skill k, and the item intercept of item j respectively [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. As for the
two multidimensional structures, under the simple structure, each item is solely related to one
latent skill and the latent skills are correlated with each other. Under the complex structure,
each item is related to more than one latent skill and the latent skills are likewise correlated with each
other.
      </p>
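      <p>To make the formula concrete, the following sketch evaluates the 2-parameter response probability for hypothetical parameter values (the symbols match the formula above; the numbers are illustrative, not from the study):</p>

```python
import numpy as np

def mirt_2pl_probability(theta, a, d):
    """P(y_ij = 1) under a 2-parameter MIRT model: logistic of a_j' theta_i + d_j."""
    z = np.dot(a, theta) + d
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical values for one user and one item on two latent skills
theta_i = np.array([0.5, -0.3])   # ability of user i on skills 1 and 2
a_j = np.array([1.2, 0.8])        # discrimination of item j on each skill
d_j = -0.4                        # intercept of item j

p = mirt_2pl_probability(theta_i, a_j, d_j)
print(round(p, 3))  # → 0.49
```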
    </sec>
    <sec id="sec-2">
      <title>2. Method</title>
      <sec id="sec-2-0">
        <title>2.1. Data</title>
        <p>Among the simulation features, the proportion of missingness ranged from 0 to 0.9
(levels: 0, 0.25, 0.5, 0.75, 0.9) and the correlation between latent skills ranged from 0.1 to 0.5
(levels: 0.1, 0.2, 0.3, 0.4, 0.5).</p>
      </sec>
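      <p>Crossing feature levels like those above into a grid of simulation scenarios can be sketched as follows (only the two features listed here; the study crossed additional features):</p>

```python
from itertools import product

# Levels of two simulation features (from the settings above)
missingness_levels = [0, 0.25, 0.5, 0.75, 0.9]
correlation_levels = [0.1, 0.2, 0.3, 0.4, 0.5]

# Full factorial crossing: one dict per simulation scenario
scenarios = [
    {"missingness": m, "correlation": r}
    for m, r in product(missingness_levels, correlation_levels)
]
print(len(scenarios))  # → 25 (5 x 5 scenarios)
```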
      <sec id="sec-2-1">
        <title>2.2. Methods Implementation</title>
        <p>
          All methods were implemented in R 4.3.2 [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. The statistical methods were mainly
implemented based on the tetrachoric correlation matrix corresponding to the dichotomous
responses, computed by the R function “tetrachoric2” of the R package “sirt” [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] with the Bonett method [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. The
results of KC and EKC were estimated by manually written functions in R. PA and the scree plot (OC &amp; AF)
were performed by relevant functions in the R package “nFactors” [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ], and VSS (C1 &amp; C2) was
implemented by relevant functions in the R package “psych” [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ].
        </p>
        <p>
          The RF and XGBoost were implemented by relevant functions in R package “mlr” [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ] and
“xgboost” [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]. Both ML models were trained and tested based on the features extracted from
available information, such as the original response matrix, the estimated tetrachoric
correlation matrix, and the estimated results of statistical methods. The features included [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]:
1) from the response matrix: the sample size, the number of items, and the proportion of
missingness; 2) from the correlation matrix: the determinant, the number of entries smaller than or
equal to 0.1, the number of eigenvalues larger than 0.7, the relative proportions of the eigenvalues,
the standard deviation of all eigenvalues, the number of eigenvalues accounting for over 50% or
75% of the variance, the matrix norms (i.e., the L1-norm, Frobenius-norm, maximum-norm, and
spectral-norm), the average of off-diagonal entries and the communality estimates, the
sampling adequacy [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ], the Gini-coefficient [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ], the Kolm inequality [27], the top 50
eigenvalue estimates; 3) from the results of statistical methods: KC, PA, EKC, scree plot (OC),
scree plot (AF), VSS (C1), and VSS (C2).
        </p>
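        <p>A few of the eigenvalue-based features above can be sketched as follows (using an ordinary correlation matrix for brevity, whereas the study used tetrachoric correlations; Python for illustration only, not the study's R code):</p>

```python
import numpy as np

def eigen_features(corr, top_k=5):
    """Extract a few eigenvalue-based features from a correlation matrix."""
    eig = np.sort(np.linalg.eigvalsh(corr))[::-1]  # eigenvalues, descending
    cum = np.cumsum(eig) / eig.sum()               # cumulative variance explained
    return {
        "determinant": float(np.linalg.det(corr)),
        "n_eigen_gt_0.7": int(np.sum(eig > 0.7)),
        "sd_eigen": float(np.std(eig)),
        # smallest number of eigenvalues accounting for >= 50% / 75% of variance
        "n_eigen_50pct": int(np.argmax(cum >= 0.5) + 1),
        "n_eigen_75pct": int(np.argmax(cum >= 0.75) + 1),
        "frobenius_norm": float(np.linalg.norm(corr, "fro")),
        "top_eigenvalues": eig[:top_k].tolist(),
    }

# Toy 3-variable correlation matrix with equal pairwise correlation 0.5
corr = np.array([
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
])
features = eigen_features(corr)
print(features["n_eigen_50pct"])  # → 1 (the first eigenvalue alone covers 2/3 of the variance)
```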
        <p>As the ML models can be trained by integrating the results of statistical methods, which may
raise a fairness concern regarding the method comparison, we trained RF and XGBoost in two
ways: once without including the results of statistical methods among the features, and once with
them included. Additionally, all ML models were trained with 10-fold cross-validation on
the training data. Table 2 provides the partial hyperparameter settings for RF and XGBoost
with or without the extra features (i.e., the results of statistical methods). The settings of the other
hyperparameters followed the defaults of the two R packages. The relevant code
can be obtained from the corresponding author and will be made publicly available when the paper with final
results is published.</p>
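        <p>The core of this setup, regression models whose predictions are rounded to integer skill counts, can be sketched as follows (a hypothetical Python/scikit-learn analogue with random placeholder data, not the study's R code):</p>

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical training table: rows = simulated datasets, columns = extracted
# features; target = true number of latent skills used to generate each dataset
X = rng.normal(size=(200, 10))
y = rng.integers(1, 9, size=200).astype(float)  # true counts in 1..8

# Regression variant of RF, evaluated with 10-fold cross-validation
model = RandomForestRegressor(n_estimators=100, random_state=0)
scores = cross_val_score(model, X, y, cv=10, scoring="neg_mean_absolute_error")

# Final predictions are rounded to the nearest integer skill count
model.fit(X, y)
predicted_n_skills = np.rint(model.predict(X[:5])).astype(int)
print(predicted_n_skills.shape)  # → (5,)
```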
      </sec>
      <sec id="sec-2-2">
        <title>2.3. Evaluation Metrics</title>
        <p>To evaluate and compare the performance of all candidate methods, the deviation score and
several metrics based on the deviation score were used. The deviation score is defined as the
estimated number of latent skills minus the true number of latent skills. The correct-estimation
proportion is the number of deviation scores equal to zero divided by the total number of
estimates (i.e., 1000). The under-estimation proportion is the number of deviation scores lower
than zero divided by the total number of estimates. The over-estimation proportion is the
number of deviation scores higher than zero divided by the total number of estimates. The bias
is the average of deviation scores. The precision is the average absolute deviation score.</p>
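        <p>These metrics can be computed directly from the deviation scores; a minimal sketch:</p>

```python
import numpy as np

def evaluation_metrics(estimated, true):
    """Deviation-score metrics: deviation = estimated minus true number of latent skills."""
    deviation = np.asarray(estimated) - np.asarray(true)
    return {
        "correct_proportion": float(np.mean(deviation == 0)),
        "under_proportion": float(np.mean(deviation < 0)),
        "over_proportion": float(np.mean(deviation > 0)),
        "bias": float(np.mean(deviation)),               # average deviation
        "precision": float(np.mean(np.abs(deviation))),  # average absolute deviation
    }

# Toy example: 5 estimates against a true count of 4
metrics = evaluation_metrics([3, 4, 4, 5, 2], [4, 4, 4, 4, 4])
print(metrics["correct_proportion"])  # → 0.4
```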
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results</title>
      <p>scree plot (AF), VSS (C1), and VSS (C2) tended to estimate a smaller number of latent skills (their
under-estimation proportions ranging from around 0.4 to 0.5). ML models also estimated a
smaller number of latent skills, but their under-estimation proportions (around 0.14) were
noticeably lower than those of the statistical methods. The patterns of under- and over-estimation were
further supported by the results for bias and precision.</p>
      <sec id="sec-3-8">
        <title>Table 3. Performance of all methods: correct-, under-, and over-estimation proportions, bias, and precision</title>
        <p>Figure 1 and Figure 2 present the effects of simulation features on the correct-estimation
proportions of the selected methods. As these proportions were extremely low for KC, PA, EKC,
and scree plot (OC), they were omitted from the effects analysis. For the simple structure, when
the percentage of missing values in the response matrix increased from 0 to 90%, the respective
proportions of all methods decreased, especially for the ML models (falling from above 0.8 to below
0.2). Raising the sample size from 300 to 800 generally led to an increase in the respective
proportions of the ML methods by 0.2, while the effects of sample size on the statistical methods were
not detectable due to fluctuations. Regarding the effects of the number of latent skills,
changing the setting from 1 to 8 was related to a tremendous decrease in the proportions of
scree plot (AF) and VSS (C1) by around 0.7. For the effects of the number of items, when it rose
from 400 to 600, the proportion of most methods went down by around 0.2.</p>
        <p>Compared to the patterns for the simple structure, the changes in proportions for the
complex structure fluctuated less. When the missingness percentage went up from 0 to 90%, the
proportions of the ML methods dropped from over 0.9 to below 0.3, while the proportions
of the statistical methods went down relatively slightly, by around 0.2. Raising the sample size led
to an increase in the proportions of the ML methods by around 0.2, while the proportions of the statistical
methods fluctuated by a small amount. In terms of the number of latent skills, when it changed
from 2 to 8, the proportion of the statistical methods fell massively from over 0.6 to below
0.1. In contrast, the proportion of the ML models stayed almost the same. Regarding the number of
items, the proportion of all methods fluctuated slightly, without noticeable changes across
the different settings.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion</title>
      <p>
        In the present study, we proposed a general operation of building prediction models using ML
with simulation data to estimate the number of latent skills in online learning environments,
which was illustrated based on MIRT. The results of the performance comparison revealed
that the ML models performed markedly better than the statistical methods regarding the
correct-estimation proportions. This finding is generally consistent with the previous study [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
However, the correct-estimation proportions in the previous study were higher than 0.9, which
differs from the results in this study (ranging from 0.65 to 0.8). One possible explanation
for this difference might be the different simulation models and scenarios. In the previous
study, dichotomous responses generated by MIRT were not considered. The simulation
settings reflected the features of relatively small-scale psychological tests rather than
large-scale online learning settings. For example, the number of items is usually set below 100
in the field of psychology, while it might be in the hundreds or even thousands in online
learning environments. Additionally, the problem of missingness or sparsity is also less of a
concern in previous research. Regarding the performance of the statistical methods, our results
showed that they performed surprisingly worse than in previous studies. Goretzko &amp; Bühner [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]
found that KC, EKC and PA reached over 0.75 regarding the correct-estimation proportion,
which is completely different from our results. Guo &amp; Choi [28] found that the proportion of
identifying the correct number of latent skills for PA with tetrachoric correlations ranged from 0.43 to 1
across various simulation features, which is also dissimilar from our results. It may be
speculated that this is because of the different settings of the simulation features.
      </p>
      <p>
        Beyond the method comparison, the effects analysis of the simulation features
found that increases in missingness and sample size led to downward and upward
trends, respectively, for most methods regarding the correct-estimation proportions. It is interesting to
note that raising missingness and sample size may have negative and positive impacts on the
methods’ performance respectively. As mentioned above, missingness was not considered in
the previous study, and our study fills this gap. As for the positive effect of sample size, our
results further confirm the findings of the previous study. For example, the correct-estimation
proportion of the ML models increased by 0.06 when the sample size rose from 250 to 1000 in the
study of Goretzko &amp; Bühner [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>Overall, the results of this study imply that compared to statistical methods, using simulation
data generated by the analysis model (e.g., the MIRT) to train ML models and applying them to
do predictions can work relatively effectively for estimating the number of latent skills in online
learning environments. This kind of operation can be generalized to other kinds of analysis
models. For example, when practitioners believe that their real-world data fits the assumptions
of CDM, they can choose a suitable model of CDM to simulate data reflecting the data features
of expected scenarios and train ML models to predict the number of attributes in the Q-matrix.
This can also be used for MF in terms of predicting the number of ranks.</p>
      <p>Several limitations of this study need to be acknowledged. First, the trained and tuned ML
models were not tested by real data. The conclusions of simulation study heavily rely on the
data-generation model and the settings of simulation features, so relevant findings should be
confirmed further based on real data. Second, due to the constraints of computational power,
the present preliminary study only covered part of the simulation scenarios, and the number of
simulated datasets was limited to one per scenario, which may make the relevant conclusions
less stable. Third, as mentioned above, the illustration was based on the MIRT, and whether the
findings remain the same for CDM or MF still needs to be tested.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this study, we used MIRT to generate simulation data reflecting the data features of target
scenarios and extracted features from the simulation data to train and test two ML models (i.e., RF
and XGBoost) for the simple and complex structure. These two ML models were compared with
selected statistical methods regarding their performance in predicting the number of latent
skills. The preliminary results show that the ML models (with or without including the results of
statistical methods during the training stage) generally outperform the statistical methods in terms
of correct-estimation proportions. Additionally, regarding the effects of the simulation features, we
find that raising the missingness level and the number of samples leads to a downward and an
upward trend, respectively, in the correct-estimation proportions of most methods. To conclude, our
results imply that, compared to statistical methods, using simulation data generated by the
selected analysis model to train ML models and then making predictions can improve
the prediction of the number of latent skills and extend the current operation of users’
skill extraction.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This work was funded by Research Fund Flanders (FWO fellowship 1S38023N). We also
acknowledge the Flemish Government (AI Research Program).</p>
      <p>[27] S.-C. Kolm, The rational foundations of income inequality measurement, in: Handbook of
Income Inequality Measurement, Springer, 1999: pp. 19-100.</p>
      <p>[28] W. Guo, Y.-J. Choi, Assessing dimensionality of IRT models using traditional and revised
parallel analyses, Educ. Psychol. Meas. 83 (2023) 609-629.
https://doi.org/10.1177/00131644221111838.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gharahighehi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Van Schoors</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Topali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ooge</surname>
          </string-name>
          ,
          <article-title>Adaptive Lifelong Learning (ALL)</article-title>
          ,
          <source>in: International Conference on Artificial Intelligence in Education</source>
          , Springer Nature Switzerland, Cham,
          <year>2024</year>
          : pp.
          <fpage>452</fpage>
          -
          <lpage>459</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>W.</given-names>
            <surname>Bonifay</surname>
          </string-name>
          ,
          <article-title>Multidimensional item response theory</article-title>
          ,
          <source>Sage</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>von Davier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.S.</given-names>
            <surname>Lee</surname>
          </string-name>
          , Handbook of diagnostic classification models, Springer Publishing,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.C.</given-names>
            <surname>Desmarais</surname>
          </string-name>
          ,
          <article-title>Mapping question items to skills with non-negative matrix factorization</article-title>
          ,
          <source>ACM SIGKDD Explorations Newsletter</source>
          <volume>13</volume>
          (
          <year>2012</year>
          )
          <fpage>30</fpage>
          -
          <lpage>36</lpage>
          . https://doi.org/10.1145/2207243.2207248.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.C.</given-names>
            <surname>Desmarais</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Naceur</surname>
          </string-name>
          ,
          <article-title>A matrix factorization method for mapping items to skills and for enhancing expert-based Q-matrices</article-title>
          , in:
          <string-name>
            <given-names>H.C.</given-names>
            <surname>Lane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yacef</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mostow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pavlik</surname>
          </string-name>
          (Eds.),
          <source>Artificial Intelligence in Education</source>
          , Springer Berlin Heidelberg, Berlin, Heidelberg,
          <year>2013</year>
          : pp.
          <fpage>441</fpage>
          -
          <lpage>450</lpage>
          . https://doi.org/10.1007/978-3-642-39112-5_45.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H.F.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <article-title>The application of electronic computers to factor analysis</article-title>
          ,
          <source>Educ. Psychol. Meas</source>
          .
          <volume>20</volume>
          (
          <year>1960</year>
          )
          <fpage>141</fpage>
          -
          <lpage>151</lpage>
          . https://doi.org/10.1177/001316446002000116.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Braeken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.A.L.M.</given-names>
            <surname>Van Assen</surname>
          </string-name>
          ,
          <article-title>An empirical Kaiser criterion</article-title>
          ,
          <source>Psychol. Methods</source>
          <volume>22</volume>
          (
          <year>2017</year>
          )
          <fpage>450</fpage>
          -
          <lpage>466</lpage>
          . https://doi.org/10.1037/met0000074.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.L.</given-names>
            <surname>Horn</surname>
          </string-name>
          ,
          <article-title>A rationale and test for the number of factors in factor analysis</article-title>
          ,
          <source>Psychometrika</source>
          <volume>30</volume>
          (
          <year>1965</year>
          )
          <fpage>179</fpage>
          -
          <lpage>185</lpage>
          . https://doi.org/10.1007/BF02289447.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>G.</given-names>
            <surname>Raîche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.A.</given-names>
            <surname>Walls</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Magis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Riopel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-G.</given-names>
            <surname>Blais</surname>
          </string-name>
          ,
          <article-title>Non-graphical solutions for Cattell's scree test</article-title>
          ,
          <source>Methodology</source>
          <volume>9</volume>
          (
          <year>2013</year>
          )
          <fpage>23</fpage>
          -
          <lpage>29</lpage>
          . https://doi.org/10.1027/1614-2241/a000051.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>W.</given-names>
            <surname>Revelle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rocklin</surname>
          </string-name>
          ,
          <article-title>Very simple structure: an alternative procedure for estimating the optimal number of interpretable factors</article-title>
          ,
          <source>Multivariate Behavioral Research</source>
          <volume>14</volume>
          (
          <year>1979</year>
          )
          <fpage>403</fpage>
          -
          <lpage>414</lpage>
          . https://doi.org/10.1207/s15327906mbr1404_2.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>De La Torre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-Y.</given-names>
            <surname>Chiu</surname>
          </string-name>
          ,
          <article-title>A general method of empirical Q-matrix validation</article-title>
          ,
          <source>Psychometrika</source>
          <volume>81</volume>
          (
          <year>2016</year>
          )
          <fpage>253</fpage>
          -
          <lpage>273</lpage>
          . https://doi.org/10.1007/s11336-015-9467-8.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>W.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>De La Torre</surname>
          </string-name>
          ,
          <article-title>An empirical Q‐matrix validation method for the sequential generalized DINA model</article-title>
          ,
          <source>Br. J. Math. Stat. Psychol</source>
          .
          <volume>73</volume>
          (
          <year>2020</year>
          )
          <fpage>142</fpage>
          -
          <lpage>163</lpage>
          . https://doi.org/10.1111/bmsp.12156.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>W.-S.</given-names>
            <surname>Chin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhuang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-C.</given-names>
            <surname>Juan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <article-title>A fast parallel stochastic gradient method for matrix factorization in shared memory systems</article-title>
          ,
          <source>ACM Trans. Intell. Syst. Technol</source>
          .
          <volume>6</volume>
          (
          <year>2015</year>
          )
          <fpage>2:1</fpage>
          -
          <lpage>2:24</lpage>
          . https://doi.org/10.1145/2668133.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>D.</given-names>
            <surname>Goretzko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bühner</surname>
          </string-name>
          ,
          <article-title>One model to rule them all? Using machine learning algorithms to determine the number of factors in exploratory factor analysis</article-title>
          ,
          <source>Psychological Methods</source>
          <volume>25</volume>
          (
          <year>2020</year>
          )
          <fpage>776</fpage>
          -
          <lpage>786</lpage>
          . https://doi.org/10.1037/met0000262.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R.P.</given-names>
            <surname>Chalmers</surname>
          </string-name>
          ,
          <article-title>mirt: a multidimensional item response theory package for the R environment</article-title>
          ,
          <source>Journal of Statistical Software</source>
          <volume>48</volume>
          (
          <year>2012</year>
          ). https://doi.org/10.18637/jss.v048.i06.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>R Core Team</surname>
          </string-name>
          ,
          <article-title>R: A language and environment for statistical computing</article-title>
          , (
          <year>2024</year>
          ). https://www.R-project.org/.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Robin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Manna</surname>
          </string-name>
          ,
          <article-title>Statistical Properties of the GRE® Psychology Test Subscores</article-title>
          ,
          <source>ETS Research Report Series</source>
          <year>2018</year>
          (
          <year>2018</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          . https://doi.org/10.1002/ets2.12206.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>USMLE</surname>
          </string-name>
          ,
          <article-title>2024 USMLE bulletin of information</article-title>
          , (
          <year>2023</year>
          ). https://www.usmle.org/sites/default/files/2023-08/2024bulletin.pdf (accessed March 23,
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A.</given-names>
            <surname>Robitzsch</surname>
          </string-name>
          ,
          <article-title>sirt: Supplementary item response theory models</article-title>
          , (
          <year>2024</year>
          ). https://CRAN.R-project.org/package=sirt.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>D.G.</given-names>
            <surname>Bonett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.M.</given-names>
            <surname>Price</surname>
          </string-name>
          ,
          <article-title>Inferential methods for the tetrachoric correlation coefficient</article-title>
          ,
          <source>J. Educ. Behav. Stat</source>
          .
          <volume>30</volume>
          (
          <year>2005</year>
          )
          <fpage>213</fpage>
          -
          <lpage>225</lpage>
          . https://doi.org/10.3102/10769986030002213.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>G.</given-names>
            <surname>Raîche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Magis</surname>
          </string-name>
          ,
          <article-title>nFactors: Parallel analysis and other non-graphical solutions to the Cattell scree test</article-title>
          , (
          <year>2022</year>
          ). https://CRAN.R-project.org/package=nFactors.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>W.</given-names>
            <surname>Revelle</surname>
          </string-name>
          ,
          <article-title>psych: Procedures for psychological, psychometric, and personality research</article-title>
          , (
          <year>2024</year>
          ). https://CRAN.R-project.org/package=psych.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>B.</given-names>
            <surname>Bischl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kotthoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schiffner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Richter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Studerus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Casalicchio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.M.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <article-title>mlr: Machine Learning in R</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>17</volume>
          (
          <year>2016</year>
          )
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>T.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Benesty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Khotilovich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mitchell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Cano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Geng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <article-title>xgboost: Extreme gradient boosting</article-title>
          , (
          <year>2024</year>
          ). https://CRAN.R-project.org/package=xgboost.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>H.F.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <article-title>A second generation little jiffy</article-title>
          ,
          <source>Psychometrika</source>
          <volume>35</volume>
          (
          <year>1970</year>
          )
          <fpage>401</fpage>
          -
          <lpage>415</lpage>
          . https://doi.org/10.1007/BF02291817.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>H.</given-names>
            <surname>Dalton</surname>
          </string-name>
          ,
          <article-title>The measurement of the inequality of incomes</article-title>
          ,
          <source>Econ. J.</source>
          <volume>30</volume>
          (
          <year>1920</year>
          )
          <fpage>348</fpage>
          . https://doi.org/10.2307/2223525.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>