<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Integrating IRT Analysis into LMS for Item Pool Optimization</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Panagiotis Fotaris</string-name>
          <email>panagiotis.fotaris@kcl.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Theodoros Mastoras</string-name>
          <email>mastoras@uom.gr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dental Institute, King's College London</institution>
          ,
          <addr-line>London</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Applied Informatics, University of Macedonia</institution>
          ,
          <addr-line>Thessaloniki</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Due to the computerization of assessment tests, the use of Item Response Theory (IRT) has become commonplace for educational assessment development, evaluation, and refinement. When used appropriately by a Learning Management System (LMS), IRT can improve the assessment quality, increase the efficiency of the testing process, and provide in-depth descriptions of item and test properties. This paper introduces a methodological and architectural framework which embeds an IRT analysis tool in an LMS so as to extend its functionality with assessment optimization support. By applying a set of validity rules to the statistical indices produced by the IRT analysis, the enhanced LMS is able to detect several defective items from an item pool which are then reported for reviewing of their content. Assessment refinement is achieved by repeatedly employing this process until all flawed items are eliminated.</p>
      </abstract>
      <kwd-group>
        <kwd>e-learning</kwd>
        <kwd>Item Pool Optimization</kwd>
        <kwd>Computer Aided Assessment</kwd>
        <kwd>Item Analysis</kwd>
        <kwd>Massive Open Online Courses</kwd>
        <kwd>MOOCs</kwd>
        <kwd>Item Response Theory</kwd>
        <kwd>IRT</kwd>
        <kwd>Learning Management System</kwd>
        <kwd>Technology Enhanced Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Due to the recent advances in Internet technologies and the booming development of
massive open online courses (MOOCs), the use of Computer Aided Assessment
(CAA) tools has become a major trend in academic institutions worldwide [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Through these systems, tests composed of various question types can be presented to
students in order to assess their knowledge [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. However, there has been considerable
criticism of the test quality, with both research and experience showing that many test
items are flawed at the initial stage of their development. Test developers can expect
about 50% of the items in their item pool to fail to perform as intended, which may
eventually lead to unreliable results of examinee performance [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Thus a critical
challenge lies in how to ensure that the individual test items are of the highest quality
possible since an inferior item could have an inordinately large effect on some scores.
      </p>
      <p>
        The present paper introduces a comprehensible way to present IRT analysis results
to test developers without delving into unnecessary details. Instead of memorizing
numerous commands and scenarios from technical manuals, test developers can easily
detect problematic questions from the familiar user interface of an LMS. The latter
can automatically calculate the limits and rules for the α (discrimination), b
(difficulty), and c (guessing) parameters [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] based on the percentage of questions wanted for
revision. The examinee’s proficiency (θ) is represented on the usual scale (or metric)
with values ranging roughly between -3 and 3, but since these scores include negative
ability estimates which would undoubtedly confuse many users, they can optionally
be normalized to a 0…100 range scale score.
      </p>
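      <p>The optional rescaling of θ to a 0 to 100 score can be illustrated with a minimal sketch. The text does not state which mapping is used, so the linear transformation, the clamping of values outside the range from -3 to 3, and the function name normalize_theta are all assumptions made for illustration.</p>
      <preformat><![CDATA[
```python
def normalize_theta(theta, lo=-3.0, hi=3.0):
    """Map an ability estimate linearly from [lo, hi] onto a 0..100 scale.

    Assumption: estimates outside [lo, hi] are clamped to the nearest bound,
    so the reported score never leaves the 0..100 range.
    """
    clamped = max(lo, min(hi, theta))
    return 100.0 * (clamped - lo) / (hi - lo)


if __name__ == "__main__":
    for theta in (-3.0, 0.0, 1.5, 4.2):
        print(theta, normalize_theta(theta))
```
]]></preformat>
      <p>Under this assumed mapping an average examinee (θ = 0) receives a score of 50, and every unit of θ corresponds to roughly 16.7 points.</p>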
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Students’ increasing demand for more flexible learning options during the last decade
has led to the widespread use of LMS and CAA tools in education, and, more
recently, to the rapid expansion of MOOCs distributed in platforms such as Coursera,
Udacity, and edX. However, there is serious concern about the assessment of
student learning because only a small fraction of the aforementioned systems
supports an assessment quality control process based on the interpretation of item
statistic parameters. Popular e-learning platforms such as Moodle and Blackboard
have plug-ins or separate modules that provide statistics for test items, but apart from
that they offer no suggestions to test developers on how to improve their item pool.
Similarly, although new web technologies allow for scalable ways to deliver video
lectures, implement social fora, and track student progress in MOOCs [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], there is
limited feedback regarding the quality of the test items and the accuracy of the
assessment results. Therefore, many researchers have recently endeavored to provide
mechanisms for assessment optimization.
      </p>
      <p>
        Hsieh et al. introduced a model that presents test statistics and collects students’
learning behaviors to generate analysis results and feedback for tutors [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Hung et
al. proposed an analysis model based on Item Analysis (IA) that collects information
such as item difficulty and discrimination indices, questionnaire and question style,
etc. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. These data are combined with a set of rules in order to detect defective items,
which are signaled using traffic lights. Costagliola et al.’s eWorkbook system
improved this approach by using fuzzy rules to measure item quality, detect anomalies
on the items, and suggest improvements [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Nevertheless, all of the aforementioned
works preferred IA to IRT due to its ease of use without taking into consideration its
numerous deficiencies.
      </p>
      <p>
        On the other hand, IRT has been mainly applied in the Computerized Adaptive
Test (CAT) domain for personalized test construction based on individual ability [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
Despite its high degree of support among theoreticians and some practitioners, IRT’s
complexity and dependence on unidimensional test data and large samples often
relegate its application to experimental purposes only. While a literature review can
reveal many different IRT estimation algorithms, they all involve heavy mathematics
and are unsuitable for implementation in a scripting language designed for web
development (e.g., PHP). As a result, their integration in internet applications such as
LMSs is very limited. A way to address this issue is to have a web page call the
open-source analysis tool ICL [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] to carry out the estimation process and then import its
results for display. The present paper showcases a framework that follows this exact
method in order to extend an LMS with IRT analysis services at no extra
programming cost.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Open-source IRT Analysis Tool ICL</title>
      <p>
        Several computer programs that provide estimates of IRT parameters are currently
available for a variety of computer environments, including Rascal, Ascal,
WINSTEPS, BILOG-MG, MULTILOG, PARSCALE, RUMM and WINMIRA to
name a few that are easily obtainable [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Despite being the de facto standard for
dichotomous IRT model estimation, BILOG is a commercial product and limited in
other ways. Hanson provided an alternative stand-alone software for estimating the
parameters of IRT models called IRT Command Language (ICL) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. A recent
comparison between BILOG-MG and ICL [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] showed that both programs are
equally precise and reliable in their estimations. However, ICL is free, open-source, and
licensed in a way that allows it to be modified and extended. In fact, ICL consists of
IRT estimation functions embedded in TCL, a fully featured scripting language
that supports relatively complex operations. Additionally, ICL’s
command-line nature enables it to run in the background and produce analysis results in
the form of text files. Since the proposed framework uses only a three-parameter
binary-scoring IRT model (3PL), ICL proves more than sufficient for our purpose and was
therefore selected to complement the LMS for item pool optimization.
      </p>
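      <p>For reference, the 3PL model mentioned above is usually written in the logistic form below, using this paper's symbols: α for discrimination, b for difficulty, c for guessing, and θ for examinee proficiency. This is the common textbook form and not a statement of ICL's internal parameterization; in particular, whether a scaling constant such as D = 1.7 multiplies the exponent is an assumption that depends on the software settings.</p>
      <disp-formula><tex-math><![CDATA[
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-\alpha_i(\theta - b_i)}}
]]></tex-math></disp-formula>
      <p>At θ = b the exponent vanishes, so the probability of a correct response equals (1 + c)/2. For very low θ it approaches c, which is why c is read as the chance of answering correctly by guessing.</p>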
    </sec>
    <sec id="sec-4">
      <title>Integrating IRT Analysis in Dokeos</title>
      <p>Dokeos is an open-source LMS implemented in PHP that requires Apache as a
web server and MySQL as a database management system. It has been serving the
needs of two academic courses at the University of Macedonia for over six years,
receiving satisfactory feedback from both instructors and students. In order to extend
its functionality with IRT analysis and item pool optimization functions, we had to
modify its source code so as to support the following features:</p>
      <list list-type="order">
        <list-item><p>After completing a test session, the LMS stores in its database the examinee’s
response to each test item, instead of keeping only a final score as it does by default.</p></list-item>
        <list-item><p>Test developers define the acceptable limits for the following IRT analysis
parameters: item discrimination (α), item difficulty (b), and guessing (c). The LMS stores
these values as validity rules for each assessment. Alternatively, these limits can be
set automatically by the system in order to rule out a specific
percentage of questions (Fig. 1.1).</p></list-item>
        <list-item><p>Every time the LMS is asked to perform an IRT analysis, it displays a page with
the estimated difficulty, discrimination, and guessing parameters for each test item.
If an item violates any of the validity rules already defined in the assessment
profile, it is flagged for review of its content (Fig. 1.2). Once item responses are
evaluated, test developers can discard, revise, or retain items for future use.</p></list-item>
        <list-item><p>In addition to a total score, the assessment report screen displays the proficiency θ
per examinee as derived from the IRT analysis (Fig. 1.3).</p></list-item>
      </list>
      <p>The proposed methodology consists of four steps, each of which is an
action performed by the LMS (Fig. 2). Additionally, the initial database schema has
been extended in order to support some extra functions. Once an update of the IRT
results is called for, the LMS exports the proper data files and TCL scripts. It then
performs a number of calls to ICL using PHP and, after parsing the analysis
results, imports them into its database. A detailed description of the four methodology
steps follows:</p>
      <p>[Fig. 2. Architecture diagram. Labels recoverable from the figure: Examinee; Assessment Test Developer; Web Server; LMS “Dokeos” (Assessment Profile, IRT Analysis Results, Calibration Rules); IRT tool “ICL” (Assessment Results, Parameter Estimation Script, Theta Estimation Script, Estimated Parameters, Estimated Theta); step numbers 1 to 4.]</p>
      <list list-type="order">
        <list-item><p>The LMS exports the assessment results to a data file and generates a TCL script to
process them (parameter estimation script).</p></list-item>
        <list-item><p>The LMS then calls ICL with the parameter estimation script passed as an
argument in order to create a data file containing the α, b, and c values for each test
item. At the same time it prepares a second TCL script to process these IRT
parameters (θ estimation script).</p></list-item>
        <list-item><p>The LMS calls ICL with the θ estimation script passed as an argument so as to
produce a data file with the examinees’ θ values.</p></list-item>
        <list-item><p>Finally, the LMS imports the two ICL-produced data files (*.par and *.theta) into its
database for further processing in the context of the intended item pool optimization.</p></list-item>
      </list>
      <p>As already mentioned, some modifications to the Dokeos database schema had to be
performed in order for the system to function properly. More specifically, while the
initial schema supported only a total score per examinee (“track_e_exercices” table),
the proposed one requires a detailed recording of each examinee’s performance per
item. The additional functionalities of this new schema are outlined in the following
list:</p>
      <list list-type="order">
        <list-item><p>Each assessment can have multiple versions based on its revised items. By
monitoring the examinees’ performance on each item, test developers can determine
whether a certain modification of a specific item improved its quality. In
practice, each version serves as a new test for the LMS.</p></list-item>
        <list-item><p>Each examinee’s score per item is recorded for every test being administered.
These values are held in the assessment results data file (*.DAT) used by ICL.</p></list-item>
        <list-item><p>Test developers can establish a new set of rules for each version of the assessment.</p></list-item>
      </list>
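      <p>Steps 1, 2 and 4 of the methodology (exporting the per-item scores, calling ICL, and importing the estimated parameters) might look roughly like the sketch below. The file layouts are assumptions made for illustration: one line per examinee with one 0/1 character per item for the data file, and whitespace-separated records of the form "item a b c" for the parameter file. The actual layouts depend on the ICL scripts generated by the LMS, which are not reproduced in this paper. All function names, including the run_icl helper, are hypothetical, and Python stands in here for the PHP code of the LMS.</p>
      <preformat><![CDATA[
```python
import subprocess


def export_responses(matrix):
    """Serialize a 0/1 response matrix: one examinee per line, one character per item.

    Assumed layout, chosen for illustration only.
    """
    lines = ["".join("1" if score else "0" for score in row) for row in matrix]
    return "\n".join(lines) + "\n"


def run_icl(executable, script_path):
    """Call the estimation tool with a script file as its only argument.

    The executable location is supplied by the caller; nothing about the
    script contents is assumed here.
    """
    subprocess.run([executable, script_path], check=True)


def parse_item_parameters(text):
    """Parse records of the assumed form 'item a b c' into {item: (a, b, c)}."""
    params = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or incomplete lines
        item_id = int(fields[0])
        a, b, c = (float(value) for value in fields[1:4])
        params[item_id] = (a, b, c)
    return params


if __name__ == "__main__":
    print(export_responses([[1, 0, 1], [0, 0, 1]]), end="")
    print(parse_item_parameters("1 1.10 -0.50 0.20\n2 0.80 1.25 0.15\n"))
```
]]></preformat>
      <p>Keeping the export and parsing functions free of file and process handling makes them easy to check in isolation; only run_icl touches the external tool.</p>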
      <p>[Fig. 3. Database schema diagram. Recoverable content: table “quiz_version” with fields id, version, quiz_id, lower_a, lower_b, upper_b, and upper_c.]</p>
      <p>As the main aim of the revised solution is to facilitate further updating processes, the
structure and the fields of the initial LMS database have been kept intact, with the
only change being the addition of two new tables:</p>
      <list list-type="order">
        <list-item><p>Table “quiz_version” records each assessment’s versions and has a one-to-one
relationship to table “quiz” (Fig. 3.1).</p></list-item>
        <list-item><p>Table “track_e_answers” stores the examinee’s choice per item (fields “answer_id”
and “answer”), whether this choice was correct (field “correct”), and its weight
value (field “weighting”) (Fig. 3.2). Moreover, it supports the recording of multiple
responses for future polytomous analyses.</p></list-item>
      </list>
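      <p>The two added tables can be sketched as follows. This is an illustration only: SQLite stands in for the MySQL database used by Dokeos, all column types are assumptions, the limit columns are read from the Fig. 3 labels, and the examinee_id and item_id columns of “track_e_answers” are hypothetical linking fields that the text does not name.</p>
      <preformat><![CDATA[
```python
import sqlite3

# Field names quoted in the text are kept; everything marked "hypothetical"
# is an assumption added so that the example is self-contained.
SCHEMA = """
CREATE TABLE quiz_version (
    id      INTEGER NOT NULL,
    version INTEGER NOT NULL,
    quiz_id INTEGER NOT NULL,
    lower_a REAL,              -- minimum acceptable discrimination
    lower_b REAL,              -- minimum acceptable difficulty
    upper_b REAL,              -- maximum acceptable difficulty
    upper_c REAL,              -- maximum acceptable guessing
    PRIMARY KEY (id, version)
);
CREATE TABLE track_e_answers (
    examinee_id INTEGER NOT NULL,  -- hypothetical
    item_id     INTEGER NOT NULL,  -- hypothetical
    answer_id   INTEGER,
    answer      TEXT,
    correct     INTEGER,           -- 1 if the choice was correct, else 0
    weighting   REAL
);
"""


def create_schema(conn):
    """Create both tables on an open connection."""
    conn.executescript(SCHEMA)


def record_answer(conn, examinee_id, item_id, answer_id, answer, correct, weighting):
    """Store one examinee response to one item."""
    conn.execute(
        "INSERT INTO track_e_answers VALUES (?, ?, ?, ?, ?, ?)",
        (examinee_id, item_id, answer_id, answer, correct, weighting),
    )


def item_scores(conn, examinee_id):
    """Return (item_id, correct) pairs for one examinee, ordered by item."""
    rows = conn.execute(
        "SELECT item_id, correct FROM track_e_answers "
        "WHERE examinee_id = ? ORDER BY item_id",
        (examinee_id,),
    )
    return [(row[0], row[1]) for row in rows]
```
]]></preformat>
      <p>A per-examinee query such as item_scores yields exactly the row of item scores that the assessment results data file needs.</p>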
    </sec>
    <sec id="sec-5">
      <title>Item Pool Optimization Process</title>
      <p>The proposed system has been implemented by adding the previous features to an
existing version of Dokeos at the Department of Applied Informatics, University of
Macedonia. A pilot assessment test containing an item pool of 40 questions on
“Fundamentals of Information Systems” was arranged for the experiment. Since it was not
connected to an actual university course and contained questions of a general nature,
it managed to attract the attention of 113 students who voluntarily participated in the
experiment. Before administering the test, the acceptable limits for the IRT
parameters were set to α ≥ 0.5, -1.7 ≤ b ≤ 1.7, and c ≤ 0.25 respectively.</p>
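      <p>With the limits stated above (α ≥ 0.5, -1.7 ≤ b ≤ 1.7, c ≤ 0.25), the flagging of items can be sketched as below. The class and function names, the wording of the violation messages, and the parameter values in the usage example are hypothetical; only the limits themselves come from the text.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidityRules:
    """Acceptable limits for the estimated item parameters (pilot values)."""
    lower_a: float = 0.5
    lower_b: float = -1.7
    upper_b: float = 1.7
    upper_c: float = 0.25


def violations(a, b, c, rules=ValidityRules()):
    """Return the list of rules that one item's estimates violate."""
    problems = []
    if a < rules.lower_a:
        problems.append("low discrimination")
    if b < rules.lower_b:
        problems.append("too easy")
    elif b > rules.upper_b:
        problems.append("too difficult")
    if c > rules.upper_c:
        problems.append("high guessing")
    return problems


def flag_items(estimates, rules=ValidityRules()):
    """Map each violating item id to its violations.

    estimates: dict of item id -> (a, b, c) estimated parameters.
    """
    flagged = {}
    for item_id, (a, b, c) in estimates.items():
        problems = violations(a, b, c, rules)
        if problems:
            flagged[item_id] = problems
    return flagged


if __name__ == "__main__":
    # Invented estimates, for illustration only.
    sample = {1: (1.2, 0.3, 0.2), 2: (0.3, 2.1, 0.3), 3: (0.9, -2.0, 0.1)}
    print(flag_items(sample))
```
]]></preformat>
      <p>Returning the reasons along with the item identifier gives the reviewer a starting point, since a low α and an extreme b usually call for different kinds of revision.</p>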
      <p>Once an initial item pool has been optimized, examinees can be tested routinely.
Such a programme of testing is likely to generate a need to retire flawed, obsolete, or
frequently used items, and to replace these with new ones. The extended LMS under
consideration detects these problem areas, thus making it easier for test developers to
improve the quality of their tests provided that they investigate these issues further
and focus on addressing the root cause of the problem in each case (e.g., obscure or
ambiguous phrases, typographic or logical errors, a lack of essential information,
etc.). In addition, the LMS allows them to create a new version of the assessment test
effortlessly by copying the previous iteration and either correcting or replacing
whichever items have been flagged as defective. Subsequently, once the revised
examination cycle is completed, a new analysis report will ascertain whether all items
conform to the validity rules. The number of times a specific assessment must be
repeated before leading to a final version with all the problematic items eliminated
depends on how well the analysis results are understood. The faster test developers identify
the actual cause of each problem and come up with an appropriate solution, the fewer
the necessary iterations.
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>The present paper introduced a methodological and architectural framework for
extending an LMS with IRT–based assessment optimization. Instead of having web
developers implement complex IRT estimation algorithms within the LMS, the
proposed methodology uses ICL to obtain reliable IRT analysis results. The latter are
then automatically imported into the LMS, thus relieving test developers of this
burdensome duty. By applying a set of validity rules, the enhanced LMS is able to detect
several defective items which are then reported for review of their content. As a
result, the suggested approach is capable of assisting test developers in their continuous
effort to optimize their item pools. Moreover, the user-friendly interface allows users
with no previous expertise in statistics to comprehend and utilize the IRT analysis
results.</p>
      <p>
        According to research focused on IRT sample size effects, a great number of
examinees are needed to obtain accurate results [12]. For example, Swaminathan and
Gifford concluded that about 1,000 examinees are required when using the 3PL model
[13]. Such sample size requirements would normally pose a problem for most test
developers due to the fact that the number of examinees in academic courses rarely
exceeds 150. However, in cases where instructors are only trying to identify items that
are either unrelated to the overall score, too easy, or too difficult, reliable results can
be produced even for relatively small classrooms [14]. MOOCs, on the other hand,
enroll tens of thousands of students, which is more than enough to obtain accurate
estimates with any IRT model. As a result, the proposed system would be ideally
suited for a MOOC environment; optimizing its extensive item pools will improve the
quality of assessment of student learning and could possibly drive more institutions to
offer course credit for MOOC completion, thus further expanding the influence of
these courses on higher education throughout the world [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>This initial experiment produced encouraging results, showing that the system can
effectively evaluate item performance and therefore increase the overall validity of
the assessment process. The fact that the proposed methodology is not limited to
Dokeos but can be adopted by different e-learning environments (e.g., Moodle,
MOOC platforms etc.) makes it especially suitable for academic use.</p>
      <p>12. Bunderson, C.V., Inouye, D.K. &amp; Olsen, J.B. (1989) 'The Four Generations of
Computerized Educational Measurement', in Linn, R.L. (ed.), Educational
Measurement, Collier Macmillan Publishers, London.</p>
      <p>13. Swaminathan, H. &amp; Gifford, J.A. (1983) 'Estimation of Parameters in the
Three-parameter Latent Trait Model', in Weiss, D.J. (ed.), New Horizons in Testing,
Academic Press, New York.</p>
      <p>14. Fotaris, P., Mastoras, T., Mavridis, I., &amp; Manitsaris, A. (2011) 'Identifying
Potentially Flawed Items in the Context of Small Sample IRT Analysis',
International Journal On Advances In Intelligent Systems, vol. 4, no. 1&amp;2, pp.
31-42.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Virtanen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2009</year>
          )
          <article-title>'Increasing the self-study effort of higher education engineering students with an online learning platform'</article-title>
          ,
          <source>International Journal of Knowledge and Learning</source>
          , vol.
          <volume>4</volume>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>527</fpage>
          -
          <lpage>538</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Hindi</surname>
            ,
            <given-names>N.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Najdawi</surname>
            ,
            <given-names>M.K.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Jolo</surname>
            ,
            <given-names>H.A.M.</given-names>
          </string-name>
          (
          <year>2008</year>
          )
          <article-title>'An Examination of Assessment Practices in Colleges of Business at Various Middle East Countries'</article-title>
          ,
          <source>International Journal of Teaching and Case Studies</source>
          , vol.
          <volume>1</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>319</fpage>
          -
          <lpage>332</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Haladyna</surname>
            ,
            <given-names>T.M.</given-names>
          </string-name>
          (
          <year>1999</year>
          )
          <article-title>Developing and Validating Multiple-Choice Test Items (2nd edition)</article-title>
          , Lawrence Erlbaum Associates, Mahwah, New Jersey.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Lord</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          (
          <year>1980</year>
          )
          <article-title>Applications of Item Response Theory to Practical Testing Problems</article-title>
          , Lawrence Erlbaum Associates, Hillsdale, New Jersey.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Piech</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Do</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Koller</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2013</year>
          )
          <article-title>'Tuned Models of Peer Assessment'</article-title>
          , in
          <string-name>
            <surname>D'Mello</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvo</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Olney</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (eds.),
          <source>Proceedings of the 6th International Conference on Educational Data Mining (EDM 2013)</source>
          , July 6-9, Memphis, TN, USA.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Hsieh</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shih</surname>
            ,
            <given-names>T.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Ko</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          (
          <year>2003</year>
          )
          <article-title>'Feedback and Analysis from Assessment Metadata in E-learning'</article-title>
          ,
          <source>in 17th International Conference on Advanced Information Networking and Applications (AINA '03)</source>
          , pp.
          <fpage>155</fpage>
          -
          <lpage>158</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hung</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>L.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shih</surname>
            ,
            <given-names>T.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hsu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>H.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>H.P.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          (
          <year>2004</year>
          )
          <article-title>'A Cognition Assessment Authoring System for E-Learning'</article-title>
          ,
          <source>in 24th International Conference on Distributed Computing Systems Workshops (ICDCS 2004 Workshops)</source>
          , pp.
          <fpage>262</fpage>
          -
          <lpage>267</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Costagliola</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferrucci</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Fuccella</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          (
          <year>2008</year>
          )
          <article-title>'A Web-Based E-Testing System Supporting Test Quality Improvement'</article-title>
          , in
          <source>Advances in Web Based Learning - ICWL 2007</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Meyer</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2013</year>
          )
          <article-title>'Fair and equitable measurement of student learning in MOOCs: An introduction to item response theory, scale linking, and score equating'</article-title>
          ,
          <source>Research &amp; Practice in Assessment</source>
          , vol.
          <volume>8</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>26</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Hanson</surname>
            ,
            <given-names>B.A.</given-names>
          </string-name>
          (
          <year>2002</year>
          )
          <article-title>IRT Command Language (ICL)</article-title>
          . Obtained through the Internet: http://www.b-a-h.com/software/irt/icl/index.html, [accessed 26/6/2013].
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Mead</surname>
            ,
            <given-names>A.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morris</surname>
            ,
            <given-names>S.B.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Blitz</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          (
          <year>2007</year>
          )
          <article-title>Open-source IRT: A Comparison of BILOG-MG and ICL Features and Item Parameter Recovery</article-title>
          , Illinois Institute of Technology, Institute of Psychology, Chicago, Unpublished manuscript. Obtained through the Internet: http://mypages.iit.edu/~mead/MeadMorrisBlitz2007.pdf, [accessed 1/7/2013].
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>