<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Difficulty of Items - Predictions on Linguistic Features</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anna Winklerová</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Masaryk University</institution>
          ,
          <addr-line>Botanická 68a, 60200, Brno, Czechia</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>To fulfill the adaptive and mastery learning parameters of an educational system (both learning and assessment), it is necessary to continuously develop and manage a large item pool containing thousands of items in a properly designed structure. Content management can be efficiently supported by utilizing augmented intelligence models that can deduce the behaviour of items in the system based on linguistic features, independent of user data. This paper focuses on categorizing linguistic features for short L2 English multiple choice items, discusses ways of selecting features for interpretability and the consequences for model prediction, and demonstrates the practical application of prediction results for item management and further meaningful feature development.</p>
      </abstract>
      <kwd-group>
        <kwd>Item difficulty prediction</kwd>
        <kwd>Second language acquisition</kwd>
        <kwd>Natural language feature engineering</kwd>
        <kwd>Interpretable features</kwd>
        <kwd>Estimation of question statistics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The overall accuracy of an adaptive educational system (both learning and assessment)
depends on student–item interactions. Difficulty of an item is one of the most descriptive
metrics of how an item behaves in the system, but the reasoning behind that behaviour is more
complex. For instance, we can easily identify that an item behaves differently than expected by
measuring the distance of the item’s error rate from the mean error rate of the whole set of
items; however, the reasons for this behaviour cannot be established without further
complementary data. This applies to the item complexity features that are free of student
interaction [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], in other words, the textual and form-related content of items. Analysis of linguistic
features characterizing item complexity, in combination with item difficulty, serves as valuable
insight into item behaviour in the system.
      </p>
      <p>Recent research summarizes hundreds of features engineered from even such short digital
entities as multiple choice gap-filling (MCQ) items in educational content. Linguistic features
free of user interaction data take a vast share. They range from simple lexical and surface
features, through syntactic and basic semantic features, to composite discourse and embedding
features. The current acceleration in computational linguistics brings benefits in improved
methods and tools for feature engineering, but also challenges in dimensionality reduction
and proper machine learning model application to reach practical goals such as difficulty
prediction.</p>
      <p>
        Our research is motivated mainly by two groups of users: content decision makers and model
developers. These two distinct groups represent different views on item features, as each
stresses different objectives. Decision makers are mostly domain specialists, and they need to
comprehend the results of the predictions and their underlying decisions. Model developers, on
the other hand, aim to improve the precision of the predictions no matter how complex and
inarticulate the models and features are. Based on the work of Alexandra Zytek et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], we examine the overlapping Interpretable feature space and Model-ready feature space to
find a suitable set of feature properties for selecting relevant interpretable features in the
difficulty prediction setting.
      </p>
      <p>The aim of this paper is to (1) describe feature engineering methods in the context of item
difficulty prediction, (2) report on ongoing research into automated interpretable feature
selection methods on a 230-feature set of real-life data from an educational system containing
thousands of items for English L2 practice with thousands of student interactions, and (3)
demonstrate the exploitation of augmented intelligence for decision making in item
management.</p>
      <p>The difficulty estimation is implemented as a regression task utilizing a simple Random Forest
(RF) ensemble algorithm and Gradient Boosting Trees (GBT) for comparison. The focus of this
work is not on optimizing the ML algorithms, and cross-validation of hyperparameter settings
was not performed. Rather, the ML algorithms are used in a static setting to give comparable
results on feature engineering and selection.</p>
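      <p>This static setup can be sketched as follows; the synthetic data and default
hyperparameters below are illustrative assumptions, not the paper’s dataset or tuned models.</p>

```python
# Sketch of the static regression setup described above: RF and GBT with
# fixed default hyperparameters, compared by Pearson correlation. The data
# here are synthetic stand-ins for the real 230-feature item vectors.
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))                       # hypothetical item features
y = 1 / (1 + np.exp(-X[:, 0] - 0.5 * X[:, 1]))       # difficulty-like target in [0, 1]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for model in (RandomForestRegressor(random_state=0),
              GradientBoostingRegressor(random_state=0)):
    model.fit(X_tr, y_tr)
    r, _ = pearsonr(y_te, model.predict(X_te))
    print(f"{type(model).__name__}: r = {r:.3f}")
```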
    </sec>
    <sec id="sec-2">
      <title>2. Relevant research</title>
      <p>
        The recent comprehensive state-of-the-art overviews on item dificulty prediction [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ] have
demonstrated intensive ongoing research in various learning contexts. The predominant reason
to predict dificulty of an item is to accelerate establishment of new items in educational systems,
which could reduce the resource cost of item pool management and administration. Behind the
pursue of these clear quantitative goals there also emerge data quality benefits of item feature
modeling. As mentioned in[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the data-driven insights, such as intensive item modeling with
linguistic features, can (1) inform on overall structuring of content and in particular (2) help
predict the individual learners’ dificulties and skills.
      </p>
      <p>
        Distilling features as numerical representations of textual parameters of a natural language
text passage is increasingly influential in recent research and applications. There are being
developed libraries for basic handcrafted features engineering such as LFTK [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] containing 220
features covering lexical, semantic, syntactic or discourse features. The eforts to categorize and
standardize text features are creditable, reduce the heavy lifting of text preprocessing and, in
our research context, enable systematically build up specific features. In addition, implementing
standardized vector representations of items by linguistic features can contribute to sharing data
in educational and assessment systems. Lack of data for methodology comparison is identified
as one of the obstacles in item dificulty prediction research.
      </p>
      <p>
        With the increasing number of item features and the aim of practical usage of the predictive
models, the need for interpretable features is eminent. Consistently with the recent work of
Alexandra Zytek et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] we define the key stakeholders in item pool management as decision
makers and model developers. Decision makers use the model results to gain insight on creating
new items, identifying items for revision and taking appropriate action such as modifying
or removing dysfunctional items, inspecting the coherency of domain subsets (e.g., splitting
topics, adding new topics), analysing item functioning in order to gain insight with applicable
actions [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. These users need the model results to be understandable and consistent with their
domain expertise.
      </p>
      <p>Model developers use and fine-tune machine learning algorithms to improve model precision
for a given task (such as difficulty prediction) on a particular dataset. Their motivation is
therefore focused on collecting predictive, model-ready features that correlate with the target
variable.</p>
      <p>
        Building models on interpretable features aims to build trust [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] in end users and opens door
to further development including crowd feature engineering. With compliance with [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] we
require the interpretable features to have at the same time the following properties, or have
clearly defined transform functions from interpretable to model-ready feature space. Relevant
interpretable properties have features that are:
• Understandable – related to real-world metric, e.g. Age of acquisition stated in age rather
than given by a scale from 1 to 100,
• Readable – labeled with plain human language with understandable meaning (for our
purpose readable is defined to the extent of human-worded features), e.g. Number of words
in a sentence instead of Average number of tokens per sentence,
• Meaningful – with clear relation to the target variable, comprehensible to decision makers,
From the model point of view, a feature must be predictive: improve a model performance,
have suficient data coverage, explain data variance and be independent on other features.
      </p>
      <p>
        Interpretable features subset from the set of 230 features needs to be justified on a
computational basis. Based on comprehensive review of dimensionality reduction techniques by R.
Zebari et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] we examine two main groups of approaches of achieving reduced features set by
feature selection and feature extraction. Although we have experimented with feature extraction
methods such as PCA on feature correlation clusters obtaining significant computational time
improvements while keeping the prediction accuracy, the main objective of our work lies heavily
in the interpretability of features’ significance to item modeling. Therefore, the main ongoing
work leverages from the feature selection methods.
      </p>
      <p>
        The studied feature selection mechanisms focus on maximizing relevant information while
minimizing redundant information. The research in this area deals with automated calculation
of minimal or optimal number of features that still cover the relevant variance of data while
decreasing bias or noise, suggesting various approaches and methods such as mutual information,
vector variance inflation, clustering or correlation based analysis.[
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ]
      </p>
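      <p>One such criterion can be sketched briefly; this mutual-information ranking is our own
illustration of the approach named above, not code from the cited papers, and the synthetic
data are an assumption.</p>

```python
# Illustrative sketch: ranking features by mutual information with the
# target, one of the selection criteria mentioned above. Only feature 2
# carries signal in this synthetic example.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 5))
y = np.sin(X[:, 2]) + 0.1 * rng.normal(size=400)   # nonlinear dependence on feature 2
mi = mutual_info_regression(X, y, random_state=0)
ranking = np.argsort(mi)[::-1]                     # most informative feature first
print(ranking[0])
```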
      <p>
        Although, in many cases of feature selection, the best performance is obtained by using all
available features, [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] improvement in item dificulty prediction by feature subset selection
was demonstrated in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Furthermore, models with smaller interpretable set of features are
more usable to decision makers and are open to further development by users that are not
machine learner experts.
      </p>
      <p>[Figure: y-axis “Item difficulty”, scale 0.0-0.8]</p>
    </sec>
    <sec id="sec-3">
      <title>3. Framework implementation</title>
      <p>Our work does not focus merely on item difficulty prediction, but on the wider context of
augmented intelligence models supporting decisions in item pool management. Secondly, we
aim to give insights and recommendations to model developers concerning (1) which feature
types are still meaningful to investigate further in the context of educational content and (2)
which features are useful in a given ML application task. Therefore, the individual pipeline
steps, from text preprocessing up to result evaluation, are covered in a framework.</p>
      <p>In this paper, we focus mainly on the implementation and evaluation of the feature engineering
and selection methods.</p>
      <sec id="sec-3-1">
        <title>3.1. Umíme dataset</title>
        <p>The items in this dataset come from a private educational system for practicing grammar,
vocabulary and use of English. It targets L2 learners of English, from the first years of studying
the language up to advanced high school learners. Over 5,900 items are structured in 34 resource
sets focusing on different concepts of the language. Difficulty of items is calculated as an error
rate on a scale from 0 to 1.</p>
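        <p>This difficulty metric amounts to a simple ratio; the helper below is a minimal sketch of
it (the function name and the first-answer counting are our assumptions).</p>

```python
# Minimal sketch of the difficulty metric used for this dataset: the item
# error rate on a 0-1 scale.
def item_difficulty(wrong_answers: int, total_answers: int) -> float:
    """Error rate: 0 = every answer correct, 1 = every answer wrong."""
    if total_answers == 0:
        raise ValueError("item has no recorded answers")
    return wrong_answers / total_answers

print(item_difficulty(30, 50))  # → 0.6
```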
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Feature engineering</title>
        <p>MCQ items are short, one-to-two-sentence passages of text. The lexical and semantic
representation of these sentences hardly captures every aspect of what makes an item difficult.
To distinguish MCQ-relevant features, we propose describing item characteristics as static
or dynamic. Consider the sentence The United States have around 330 million inhabitants. This
sentence provides static features including POS tags, syntax tree parameters, surface features
such as word syllable counts, and various metrics such as readability indexes, word frequencies
or age of acquisition.</p>
        <p>Next, assume an MCQ item created from the basic sentence. The item has one correct answer
(stated first in the brackets) and one distractor. The item can be created as one of the
following examples:
• The United States have around [330;660] million inhabitants.
• The United States [has;have] around 330 million inhabitants.
• The United States has around 330 million[s;_] inhabitants.</p>
        <p>Although the static features of the above-mentioned MCQ items are almost identical, the
dynamics of student engagement with the individual items are essentially different. Features
describing the dynamic item component can be derived from the item answers or from the
grammar pattern of the underlying knowledge domain. These features attempt to explain the
context-sensitive stimuli that lead to a student action and resolve into item difficulty.</p>
        <p>Results of our experiments show that the development of dynamic item features contributes
to difficulty prediction and item behaviour modeling in an educational system.</p>
        <sec id="sec-3-2-1">
          <title>3.2.1. Feature set</title>
          <p>After standard text processing steps comprising item purification, contraction expansion
(i.e., it’s – it is), tokenization, stopword filtering, etc., we derived numerical representations of
the handcrafted features provided by the LFTK package for Python and handcrafted further
features describing mostly the dynamic item component.</p>
          <p>The LFTK package currently provides 220 features divided into overlapping sections of
foundation (e.g. verb count) and derivation (e.g. average verb count per word/per sentence)
features from diverse domains and families (syntactic, semantic, discourse or named entities).
The short, mostly one-sentence items do not utilize all features from the LFTK package, as
many of them are designed for longer passages of text (readability measures, counts of unique
words per sentence).</p>
          <p>MCQ-specific features are not present in the package. The 10 remaining features were
derived from the textual parts of items using NLP libraries such as SpaCy and NLTK for
distance metrics for distractor similarity measures, or a CEFR-level dictionary for vocabulary
difficulty analysis. These features are described in table 2.</p>
          <p>The resulting feature set size is 230, containing all LFTK features and the basic
MCQ-specific features.</p>
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Dimensionality reduction</title>
        <p>In order to achieve model interpretability and usability, we performed several feature
reduction experiments based on different methods and underlying calculations. To lower the
high number of dimensions and to reduce bias caused by collinearity of features, we
experimented with (1) hierarchical clustering based on feature correlations and Principal
Component Analysis, as well as (2) various approaches to feature selection.</p>
        <sec id="sec-3-3-1">
          <title>3.3.1. Feature extraction</title>
          <p>The MCQ-specific feature descriptions (table 2) are:
• Difference between the correct and the wrong sentence represented by the LFTK
feature vector,
• Edit distance between the correct answer and the distractor,
• Edit distance between the whole correct sentence (correct answer inserted into the gap)
and the wrong sentence,
• Percentage certainty of the BERT fill-mask pretrained model for the masked expression,
• Normalized position of the gap in the sentence on a 0-1 scale, representing the beginning
and the end of the sentence respectively,
• Mean of all words’ CEFR A1-C2 labels on a scale of 1-6,
• Maximum CEFR A1-C2 label of the distractor on a scale of 1-6,
• Maximum CEFR A1-C2 label of the correct answer on a scale of 1-6,
• Sentence parse depth represented by the average number of children of the parse
tree nodes.</p>
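          <p>Three of these MCQ-specific features can be sketched compactly; the function names and
the toy CEFR lookup below are illustrative assumptions, not the system’s actual code.</p>

```python
# Hedged sketches of three MCQ-specific features: answer/distractor edit
# distance, normalized gap position, and mean CEFR level of the words.
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[-1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def gap_position(tokens: list, gap_index: int) -> float:
    """Gap position normalized to 0-1 (0 = sentence start, 1 = sentence end)."""
    return gap_index / (len(tokens) - 1) if len(tokens) > 1 else 0.0

# Toy CEFR dictionary, A1=1 .. C2=6 (the real feature uses a full CEFR word list).
CEFR = {"the": 1, "united": 3, "states": 2, "has": 1, "around": 2, "inhabitants": 4}

def mean_cefr(tokens: list) -> float:
    """Mean CEFR label of all words; unknown words default to A1 here."""
    return sum(CEFR.get(t.lower(), 1) for t in tokens) / len(tokens)

print(edit_distance("has", "have"))  # → 2
```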
          <p>Feature extraction methods such as PCA are straightforward and automatically applicable
to any data type (with respect to numerical representation and normalization), preserving data
variation and improving ML tasks in some cases. Instead of applying PCA across the whole
set of 230 features, we first implemented clustering on correlated features as illustrated in
figure ??. The figure shows large groups of highly correlated linguistic features (e.g. absolute
and average token lengths and counts). Large clusters with closely similar features (correlating
above 0.8) indicate that these types of features are not interesting for further investigation, as
their informational gain has already been exhausted. Such features can be substituted in further
computations by a representative selection or a linear combination (e.g. PCA).</p>
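          <p>The workflow just described can be sketched as follows; the synthetic data, the
correlation-distance metric and the thresholds are our assumptions for illustration.</p>

```python
# Sketch: hierarchically cluster features by correlation distance, then
# replace large clusters (>= 5 features) with their first principal
# component, keeping small-cluster features as individual variables.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
base = rng.normal(size=(300, 4))                       # 4 independent signals
X = np.hstack([base + 0.05 * rng.normal(size=(300, 4)) for _ in range(5)])

corr = np.corrcoef(X, rowvar=False)
dist = 1 - np.abs(corr)                                # high correlation -> small distance
condensed = dist[np.triu_indices_from(dist, k=1)]      # condensed form for linkage
labels = fcluster(linkage(condensed, method="average"),
                  t=0.2, criterion="distance")         # the "distance factor" parameter

reduced_columns = []
for c in np.unique(labels):
    idx = np.where(labels == c)[0]
    if len(idx) >= 5:                                  # big cluster -> 1 PCA component
        reduced_columns.append(PCA(n_components=1).fit_transform(X[:, idx]))
    else:                                              # small cluster -> keep as-is
        reduced_columns.append(X[:, idx])
X_reduced = np.hstack(reduced_columns)
print(X.shape[1], "->", X_reduced.shape[1])
```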
          <p>On the other hand, small clusters may contain features of significant importance for the
ML task and represent promising areas for deeper feature analysis and engineering (e.g.
distractor similarity).</p>
          <p>The size of the clusters is parametrized by the distance factor, which directly influences
the number and size of feature groups, the PCA variation ratio and the difficulty prediction
accuracy.</p>
          <p>In this particular experiment, PCA was applied to features from clusters containing at least
5 features. Features in smaller clusters are used as individual independent variables in the
training set.</p>
          <p>The extracted principal component features lose their interpretable qualities and cannot be
used unless a transformation function is applied that translates the principal components or
factors into an abstract concept. Such an abstract concept (labeled, for example, as textual
complexity) must be understandable and meaningful to the decision maker.</p>
          <p>3.3.2. Feature selection</p>
          <p>To investigate the feature selection methods that best fit our dataset and ML task, we made
use of the fine granularity of the dataset structure. The feature correlations with the target value,
as well as the collinearity among features, proved to be very diverse among different item
resource sets. Future work on the item pool and feature set will focus on finding general
characteristics in resource sets with low prediction accuracy. The desired result is a predictive
model that is as general as possible, yet able to explain most of the variance in various data.</p>
          <p>Two approaches are described in detail in the following text: features selected based on
model feature importances, and interpretable features selected based on hierarchical
clustering.</p>
          <p>Features selected based on model feature importances. This approach combines the results
of the random forest decision algorithm, the model feature importances, and the structure of the
data. The difficulty prediction was calculated separately for each resource set (5,900 items in 34
resource sets). The importances of the top 20 features in each resource set, as calculated by the
RF, were multiplied by the Pearson’s correlation coefficient of the prediction on the whole
resource set. From these cumulated importances, 10 features were selected and a new prediction
was performed.</p>
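          <p>The cumulated-importance selection above can be sketched as follows; the synthetic
resource sets and the exact weighting details are our assumptions.</p>

```python
# Sketch of importance-based selection: per resource set, weight RF feature
# importances of the top 20 features by the set's prediction correlation,
# accumulate across sets, and keep the 10 best features overall.
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
n_features = 25
scores = np.zeros(n_features)
for _ in range(4):                                   # stand-in for the 34 resource sets
    X = rng.normal(size=(200, n_features))
    y = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=200)
    rf = RandomForestRegressor(random_state=0).fit(X, y)
    r, _ = pearsonr(y, rf.predict(X))                # prediction quality on this set
    top20 = np.argsort(rf.feature_importances_)[-20:]
    scores[top20] += rf.feature_importances_[top20] * r   # weight by correlation
selected = np.sort(np.argsort(scores)[-10:])         # final 10-feature subset
print(selected)
```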
          <p>Interpretable features selected based on hierarchical clustering. This approach uses the
results of the hierarchical clustering depicted in figure 3.3.1 and manually selects
representatives of the constructed clusters that are predictive and interpretable. For instance, all
features from the LFTK package based on verbs ended up in the "verb" cluster, as shown in
figure 4. From the features in the cluster, we selected the most interpretable feature while
maximizing the predictive quality (correlation with the target value). These are usually
translated into an understandable definition and were created as combinations of model-ready
features.</p>
          <p>Table 5 shows different results of predictions on the structured data. The table contains the
worst and best predictions expressed with the Pearson correlation. The difficulty predictions
vary greatly among different resource sets.</p>
          <p>The overall accuracy of predictions on all data combined is stated in table 6.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Applicable results for decision making</title>
      <p>Table 7 shows practical steps in the evaluation of difficulty prediction results. The difficulty
of the item "My sisters a beach house." based on linguistic features was predicted as 0.24,
whereas the observed difficulty (error rate) is calculated as 0.6.</p>
      <p>With further investigation of the linguistic parameters of similar items within the same
resource set (item modeling in context), table 8 shows that items with a similar mask "s have"
represent an outlier, with the suggested action for the content developer being to add more
similar items in order to balance the structure and content of the resource set. The overall
number of items with a correct answer containing have got or has got in the whole resource set
is 67, of which only two represent a similar token linkage with a plural noun, My sisters or My
cousins respectively. The token (grand)parents implies the plural form of the verb have got.
Other items use different subject forms, such as the plural pronouns they, we, you, and multiple
personal names Jane and Pete.</p>
      <p>This item could have been marked as dysfunctional simply by comparing its outlying
difficulty to the mean difficulty of the resource set. However, the further analysis leading to
content developer action is heavily dependent on rich item modeling (POS tags and syntactic
marking).</p>
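      <p>The simple flagging baseline mentioned above can be sketched briefly; the function name
and both thresholds are illustrative assumptions, not values used by the system.</p>

```python
# Minimal sketch: flag items whose observed difficulty is an outlier in its
# resource set, or whose prediction residual is large.
import statistics

def flag_items(predicted, observed, z_threshold=2.0, residual_threshold=0.3):
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    flagged = []
    for i, (p, o) in enumerate(zip(predicted, observed)):
        outlier = sd > 0 and abs(o - mean) / sd > z_threshold
        large_residual = abs(p - o) > residual_threshold
        if outlier or large_residual:
            flagged.append(i)
    return flagged

# Item 0 mirrors the example above: predicted 0.24, observed 0.60.
print(flag_items([0.24, 0.30, 0.35], [0.60, 0.28, 0.33]))  # → [0]
```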
      <p>The other side of the difficulty prediction error scale contains cases of significantly higher
predicted values than true values. This is a much more interesting result, considering the novel
insight into the data. The explanation is often an unintentional clue in the item discovered by
the students, or a wrongly placed item (too easy in the context of other items, but not necessarily
the easiest). This type of item deficiency can be corrected by the content developer, who should
be able to find the hidden clue and rewrite the item, or place it in a different set of items.</p>
      <p>Table 9 shows an example of an item with the highest difference between the predicted
difficulty and the true difficulty. The explanation lies in the markedly different options. The
correct answer Do I live and the distractor Have I live are a combination that has not appeared
in any other item, and the distractor holds no attraction for the student.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>Exploiting recent advancements in linguistic computation and natural language processing,
this paper demonstrates a framework covering the steps of feature engineering, dimensionality
reduction and practical application. The main importance lies in the interpretability of the model
through its features. We believe that models with understandable features are of the most use to
decision makers. Furthermore, interpretability can lead to model precision improvements, as
more involved users, who are not ML specialists, can contribute to dynamic feature
engineering.</p>
      <p>Difficulty prediction tasks give deeper insight into the anatomy of items in the context of
student skills, which helps to manage the content of educational systems.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Online Resources</title>
      <p>• The Umíme educational system: umimeto.org
• LFTK: lftk.readthedocs.io
• scikit-learn: scikit-learn.org
• BERT fill-mask pretrained model: huggingface.co/google-bert</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Pelánek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Efenberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Čechák</surname>
          </string-name>
          ,
          <article-title>Complexity and difficulty of items in learning systems</article-title>
          ,
          <source>International Journal of Artificial Intelligence in Education</source>
          <volume>32</volume>
          (
          <year>2022</year>
          )
          <fpage>196</fpage>
          -
          <lpage>232</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Zytek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Arnaldo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Berti-Equille</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Veeramachaneni</surname>
          </string-name>
          ,
          <article-title>The need for interpretable features: Motivation and taxonomy</article-title>
          ,
          <source>ACM SIGKDD Explorations Newsletter</source>
          <volume>24</volume>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>AlKhuzaey</surname>
          </string-name>
          , et al.,
          <article-title>Text-based question difficulty prediction: A systematic review of automatic approaches</article-title>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L.</given-names>
            <surname>Benedetto</surname>
          </string-name>
          , et al.,
          <article-title>A survey on recent approaches to question difficulty estimation from text</article-title>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>I.</given-names>
            <surname>Pandarova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hartig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Boubekki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. D.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Brefeld</surname>
          </string-name>
          ,
          <article-title>Predicting the difficulty of exercise items for dynamic difficulty adaptation in adaptive language tutoring</article-title>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B. W.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H.-J.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>LFTK: Handcrafted features in computational linguistics</article-title>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Pelánek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Efenberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kukučka</surname>
          </string-name>
          , et al.,
          <article-title>Towards design-loop adaptivity: identifying items for revision</article-title>
          ,
          <source>Journal of Educational Data Mining</source>
          <volume>14</volume>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Hong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hullman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Bertini</surname>
          </string-name>
          ,
          <article-title>Human factors in model interpretability: Industry practices, challenges, and needs</article-title>
          , arXiv preprint arXiv:2004.11440 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Zebari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Abdulazeez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zeebaree</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zebari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Saeed</surname>
          </string-name>
          ,
          <article-title>A comprehensive review of dimensionality reduction techniques for feature selection and feature extraction</article-title>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>Feature selection based on mutual information with correlation coefficient</article-title>
          ,
          <source>Applied Intelligence</source>
          <volume>52</volume>
          (
          <year>2022</year>
          )
          <fpage>5457</fpage>
          -
          <lpage>5474</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Feature selection based on data clustering</article-title>
          ,
          <source>in: Intelligent Computing Theories and Methodologies: 11th International Conference, ICIC</source>
          <year>2015</year>
          , Fuzhou, China,
          <source>August 20-23</source>
          ,
          <year>2015</year>
          , Proceedings,
          <source>Part I 11</source>
          , Springer,
          <year>2015</year>
          , pp.
          <fpage>227</fpage>
          -
          <lpage>236</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Munson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Caruana</surname>
          </string-name>
          ,
          <article-title>On feature selection, bias-variance, and bagging</article-title>
          ,
          <source>in: Joint European Conference on Machine Learning and Knowledge Discovery in Databases</source>
          , Springer,
          <year>2009</year>
          , pp.
          <fpage>144</fpage>
          -
          <lpage>159</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>