<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Construction of Adaptive Educational Forums Based on Intellectual Analysis of Structural and Semantics Features of Messages</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alexander Kozko</string-name>
          <email>alkozko@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Chelyabinsk State University</institution>
          ,
          <addr-line>Chelyabinsk</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <fpage>46</fpage>
      <lpage>51</lpage>
      <abstract>
        <p>The paper deals with the organization of interaction and communication between the subjects of the distance learning process through forums and comment systems. We consider existing software tools, their structure, and their disadvantages. We propose a model of adaptive educational forums, together with structural and semantic similarity metrics for extracting dialogues and thematic discussions from arrays of individual comments, as a basis for constructing adaptive educational forums.</p>
      </abstract>
      <kwd-group>
        <kwd>distance learning</kwd>
        <kwd>online discussion forums</kwd>
        <kwd>semantic similarity</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The development of information technologies makes distance education more available to
everyone, and technologies for managing and supporting the education process are becoming
more and more important. Many scientific works are dedicated to learning management
systems (LMS): for example, works [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ] describe LMS, and works [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ] review existing LMS products. However, these works consider
learning management tools mainly from the following perspectives: development,
management, courses, and distribution of educational materials. They do not pay enough
attention to communication and information exchange between the different
subjects of the education process.
      </p>
      <p>
        Modern learning theories, such as the theory of connectivism proposed
by George Siemens and Stephen Downes, name as basic conditions for successful
learning not only communication with the teacher but also interaction
between students and the exchange of so-called “sensemaking artifacts” among them [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
Sensemaking artifacts are blog posts, notes, podcasts, and other educational
materials that a student creates for discussion with other students. The paper [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] analyzes
this learning approach and claims that it is more effective than the traditional one.
      </p>
      <p>Thus, the organization of communication and interaction between the
subjects of the distance education process is highly relevant today, but it receives little
attention in the existing works.</p>
    </sec>
    <sec id="sec-2">
      <title>Instruments of communication and informational exchange for education process</title>
      <p>Consider the instruments of communication that are used in distance education.</p>
      <p>On the one hand, LMS such as Moodle, ILIAS, and Sakai and, on the other hand,
platforms for organizing MOOCs (massive open online courses), such as Stepic,
Udemy, Coursera, EdX, and Udacity, offer online forums, blogs, chat rooms,
question-and-answer sites, and wiki pages for interaction within a course, and additionally
allow commenting on learning materials. For some courses, the authors suggest using
third-party sites such as Reddit, StackExchange, or social networks instead of the
platform's internal tools.</p>
      <p>Irrespective of the specifics and type of a course, the main tools used for
communication within the course are standard forums, blogs, question-and-answer
systems, and various systems for commenting on materials. However, such instruments
do not take into account the specifics of the education process and do not provide any
analytical functions useful for organizing education.</p>
    </sec>
    <sec id="sec-3">
      <title>Overview of informational exchange tools</title>
      <sec id="sec-3-1">
        <title>Interaction tools</title>
        <p>A forum, in general, is an online tool for communication between a website's visitors. The
essence of any forum is the creation of topics and their subsequent discussion. Users can
comment on created topics, ask questions and receive answers, answer other users'
questions, and give them advice. Thus, each topic is an initial entry
with a set of comments.</p>
        <p>Blogs are sets of authored entries, sorted by creation time, usually from the newest
to the oldest. Blogs are characterized by the ability of other users to publish responses
(comments), which makes blogs an instrument for information exchange. Technically, each
blog entry, like a forum topic, is an initial post with a set of comments attached to it.</p>
        <p>Comment systems are additionally used in distance education platforms to allow
discussion of any course materials, such as videos or articles.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Comments structure</title>
        <p>The structuring of a post's comments may also vary depending on the implementation.
There are three main types of structuring:</p>
        <p>• Tree comments - the comment list is presented as a hierarchical tree. A new
message is placed directly after the message it replies to (quoting it is not necessary). A new
comment can also start its own discussion branch.</p>
        <p>• Linear (flat) comments - comments within the same topic are published one under
another as they arrive; a new message is placed last (usually at the bottom);
the relationship between comments is expressed through specially formatted citations,
references to the author, and other means.</p>
        <p>• Hybrid comments - a cross between the tree and linear structures; at present, this
is the most popular form of presenting comments.</p>
        <p>Comments are usually ordered by date, popularity, or number of votes. In addition
to its text, each comment has the following attributes: author name and timestamp.</p>
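        <p>The structures described above can be illustrated with a minimal data model. This is only a sketch under our own assumptions: the class names, the field names, and the optional parent reference are hypothetical and do not come from any particular forum engine.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Comment:
    """A single comment: text plus author name and timestamp."""
    comment_id: int
    author: str
    timestamp: datetime
    text: str
    parent_id: Optional[int] = None  # None in a flat list or for a root comment


@dataclass
class Topic:
    """A forum topic or blog entry: an initial post with a set of comments."""
    title: str
    initial_post: str
    comments: List[Comment] = field(default_factory=list)

    def flat_view(self) -> List[Comment]:
        """Linear structure: comments in order of arrival."""
        return sorted(self.comments, key=lambda c: c.timestamp)

    def children_of(self, parent_id: Optional[int]) -> List[Comment]:
        """Tree structure: direct replies to a comment (or the root comments)."""
        replies = [c for c in self.comments if c.parent_id == parent_id]
        return sorted(replies, key=lambda c: c.timestamp)


topic = Topic("Week 1 questions", "Ask anything about the first lecture.")
topic.comments.append(Comment(1, "ann", datetime(2014, 1, 1, 10, 0), "What is an LMS?"))
topic.comments.append(Comment(3, "bob", datetime(2014, 1, 1, 10, 20), "Is there a quiz this week?"))
topic.comments.append(
    Comment(2, "eve", datetime(2014, 1, 1, 10, 5), "A learning management system.", parent_id=1)
)
```
]]></preformat>
        <p>In this sketch the same comment list serves both views: the flat view ignores the parent reference, while the tree view follows it. A hybrid structure could, for example, limit the depth to which the parent reference is followed.</p>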
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Adaptive learning forums</title>
      <p>As noted earlier, the above instruments of online communication do not take into
account the specifics of the education process and do not provide any specific functions for
it. Furthermore, they are not designed for a large number of participants, which is usual
for MOOCs. Problems such as a large number of overlapping topics, unanswered
questions, and incorrectly set statuses and tags are evidence that the forum is not an
effective communication instrument, which reduces the effectiveness of the learning process.</p>
      <p>To increase the effectiveness of forums as an instrument of communication
and information exchange within the education process, we propose to develop a technology
of adaptive educational forums based on data mining.</p>
      <p>An adaptive educational forum is an online forum whose structure is rebuilt depending
on the student's educational trajectory, the student's information needs, the features of the
course, and user activity. For teachers, the analysis of information from an educational forum
can be a source of data for implicit feedback on course quality, for identifying students'
problems with understanding educational materials, for evaluating student activity, etc.
Thus, the technology of adaptive educational forums could be a way to increase the quality of
distance learning.</p>
    </sec>
    <sec id="sec-5">
      <title>Analysis of comments on online discussion forum</title>
      <p>Based on the overview, it can be argued that in such systems the main content is
often neither the whole topic nor individual comments, but thematic discussions and
dialogues that consist of comments united by one common theme. Therefore, the
primary tasks are extracting individual discussions from the whole comment list and
determining the semantic similarity between discussions. For
extracting dialogues from the whole comment list, we propose to use both the semantics of
messages and their location in the comment structure.</p>
      <sec id="sec-5-1">
        <title>Extracting tree commenting relations from linear comments</title>
        <p>Because the main distinction of the linear structure is that the relations between
comments are implicit, we propose an approach that reduces the linear structure to an
explicit comment tree.</p>
        <p>To convert the linear comment structure into a tree and to find the relations between
comments, we propose to focus on the comment text and the message attributes, for example,
the author's nickname, the post timestamp, the nickname of the user being replied to, the
text of the citation, and the position in the comment list.</p>
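        <p>Using these attributes together with the semantics of the messages, we propose to determine pairwise relations between comments in order to convert linear comments into a tree. In general, the numerical metric that defines the association between two comments is given by equation (1), where <inline-formula><tex-math>R(c_1, c_2)</tex-math></inline-formula> is the relation between the two comments, <inline-formula><tex-math>S_{sem}(t_1, t_2)</tex-math></inline-formula> is the semantic similarity of the two comment texts, <inline-formula><tex-math>S_{attr}(a_1, a_2)</tex-math></inline-formula> is the attribute similarity of the two comments, defined by their attributes, and <inline-formula><tex-math>\alpha</tex-math></inline-formula> and <inline-formula><tex-math>\beta</tex-math></inline-formula> are coefficients.</p>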
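        <p>To extract the tree structure, it is necessary to calculate, for each comment <inline-formula><tex-math>c_1</tex-math></inline-formula>, its degree of relation to every earlier comment <inline-formula><tex-math>c_2</tex-math></inline-formula>. The comment <inline-formula><tex-math>c_2</tex-math></inline-formula> with the maximum relation score then becomes the parent of the current comment <inline-formula><tex-math>c_1</tex-math></inline-formula> in the tree structure.</p>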
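        <p>The parent selection step can be sketched as follows. This is an illustration under our own assumptions rather than the proposed implementation: the relation score is assumed here to be a weighted sum of the two similarity components, the semantic similarity is replaced by a simple word-overlap (Jaccard) stand-in, and the attribute similarity only checks whether a comment mentions the author of an earlier comment. All function and field names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Comment:
    comment_id: int
    author: str
    text: str  # comments are assumed to be listed in chronological order


def tokens(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))


def semantic_similarity(t1: str, t2: str) -> float:
    """Stand-in for the semantic measure: Jaccard overlap of word sets."""
    a, b = tokens(t1), tokens(t2)
    return len(a & b) / len(a | b) if a | b else 0.0


def attribute_similarity(c1: Comment, c2: Comment) -> float:
    """Stand-in attribute measure: 1.0 if c1 mentions the author of c2."""
    return 1.0 if c2.author.lower() in tokens(c1.text) else 0.0


def relation(c1: Comment, c2: Comment, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Assumed form of the relation score: a weighted sum of both parts."""
    return alpha * semantic_similarity(c1.text, c2.text) + beta * attribute_similarity(c1, c2)


def build_tree(comments: List[Comment]) -> Dict[int, Optional[int]]:
    """Map each comment id to the id of its most related earlier comment."""
    parents: Dict[int, Optional[int]] = {}
    for i, current in enumerate(comments):
        earlier = comments[:i]
        if not earlier:
            parents[current.comment_id] = None  # the first comment is a root
            continue
        best = max(earlier, key=lambda prev: relation(current, prev))
        parents[current.comment_id] = best.comment_id
    return parents


comments = [
    Comment(1, "ann", "How do I install Moodle on Linux?"),
    Comment(2, "bob", "When is the deadline for the essay?"),
    Comment(3, "eve", "ann, install Moodle from the Linux package."),
    Comment(4, "dan", "bob the essay deadline is Friday"),
]
parents = build_tree(comments)
```
]]></preformat>
        <p>In this sketch every comment except the first one receives a parent, even when its best relation score is zero. A practical system would probably need a threshold below which a comment starts a new branch; the value of such a threshold is not specified here.</p>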
      </sec>
      <sec id="sec-5-2">
        <title>Extraction of thematic discussions</title>
        <p>In a discussion with a large number of participants, some of the participants may
drift away from the main theme and start to discuss unrelated topics. Such thematic
discussions (subtopics) within the main topic can also provide useful information for
the user, but they are difficult to detect. Therefore, we propose to extract such subtopics
as separate entities and use them in the construction of adaptive educational forums.</p>
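        <p>In tree-like comment systems, a message has the following attributes: author
nickname, post timestamp, and comment depth in the tree. Based on this, we propose a similarity
metric for two comments, given by equation (2):
          <disp-formula id="eq-2">
            <label>(2)</label>
            <tex-math>S(c_1, c_2) = \alpha \cdot S_{sem}(t_1, t_2) \cdot \left( \frac{1}{d(c_1, c_2)} + \beta \cdot S_{attr}(a_1, a_2) \right)</tex-math>
          </disp-formula>
where <inline-formula><tex-math>S(c_1, c_2)</tex-math></inline-formula> is the similarity between the two comments, <inline-formula><tex-math>S_{sem}(t_1, t_2)</tex-math></inline-formula> is the semantic similarity of the two comment texts, <inline-formula><tex-math>S_{attr}(a_1, a_2)</tex-math></inline-formula> is the attribute similarity of the two comments, <inline-formula><tex-math>d(c_1, c_2)</tex-math></inline-formula> is the distance between the two comments in the tree, and <inline-formula><tex-math>\alpha</tex-math></inline-formula> and <inline-formula><tex-math>\beta</tex-math></inline-formula> are coefficients.</p>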
        <p>To unite a group of comments into a thematic discussion, we propose to use an
approach based on clustering algorithms, whose results (clusters of comments)
serve as boundaries for thematic subtopics.</p>
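        <p>As an illustration, the similarity of equation (2) and a simple grouping step can be sketched as follows. The sketch rests on several assumptions of our own: the tree distance is taken to be the number of edges between two comments, the clustering algorithm (which is not fixed above) is replaced by a plain threshold-based single-link grouping, and the semantic scores are toy values. All names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Callable, Dict, List, Optional


def tree_distance(a: int, b: int, parents: Dict[int, Optional[int]]) -> int:
    """Number of edges on the path between comments a and b in the tree."""
    def path_to_root(node: Optional[int]) -> List[int]:
        path = []
        while node is not None:
            path.append(node)
            node = parents[node]
        return path

    path_a, path_b = path_to_root(a), path_to_root(b)
    depth_in_b = {node: depth for depth, node in enumerate(path_b)}
    for depth_a, node in enumerate(path_a):
        if node in depth_in_b:  # lowest common ancestor
            return depth_a + depth_in_b[node]
    raise ValueError("comments are not in the same tree")


def similarity(sem: float, attr: float, dist: int,
               alpha: float = 1.0, beta: float = 1.0) -> float:
    """Equation (2): alpha * sem * (1 / dist + beta * attr), for dist >= 1."""
    return alpha * sem * (1.0 / dist + beta * attr)


def cluster(ids: List[int], sim: Callable[[int, int], float],
            threshold: float) -> List[List[int]]:
    """Group comments linked by a chain of pairs with sim >= threshold."""
    cluster_of = {i: i for i in ids}  # union-find parent pointers

    def find(i: int) -> int:
        while cluster_of[i] != i:
            i = cluster_of[i]
        return i

    for x in ids:
        for y in ids:
            if x < y and sim(x, y) >= threshold:
                cluster_of[find(y)] = find(x)

    groups: Dict[int, List[int]] = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


parents = {1: None, 2: 1, 3: 1, 4: 2, 5: 3}    # toy comment tree
sem = {(1, 2): 0.8, (2, 4): 0.6, (3, 5): 0.7}  # toy semantic scores


def toy_sim(x: int, y: int) -> float:
    # Attribute similarity is set to 0 here to keep the example small.
    return similarity(sem.get((x, y), 0.1), 0.0, tree_distance(x, y, parents))


clusters = cluster([1, 2, 3, 4, 5], toy_sim, threshold=0.5)
```
]]></preformat>
        <p>With these toy values the comments fall into two groups, one per branch of the tree. Any other clustering algorithm that accepts a pairwise similarity could be substituted for the grouping step.</p>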
      </sec>
      <sec id="sec-5-3">
        <title>Semantic similarity metrics for comments</title>
        <p>We consider separately the question of calculating the semantic similarity between
two comment texts. First, it is worth noting a specific property of the messages: almost all
posts are extremely short, from a few words to 2-3 sentences, so methods for
calculating the semantic similarity of documents can be difficult to apply. We propose to
define the similarity of documents as the similarity of the concepts they contain.</p>
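        <p>One possible way to turn concept-level similarity into a similarity of two short texts is sketched below. The aggregation rule (a symmetric average of best matches) is our assumption and is not prescribed above; the concept similarity function is passed in as a parameter, and the lookup table in the example is purely hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Callable, Iterable


def text_similarity(concepts1: Iterable[str], concepts2: Iterable[str],
                    concept_sim: Callable[[str, str], float]) -> float:
    """Symmetric average of best-match concept similarities."""
    c1, c2 = list(concepts1), list(concepts2)
    if not c1 or not c2:
        return 0.0

    def directed(source, target):
        # For every source concept take its closest target concept.
        return sum(max(concept_sim(s, t) for t in target) for s in source) / len(source)

    return (directed(c1, c2) + directed(c2, c1)) / 2.0


# Toy concept similarity: a hypothetical lookup table, 1.0 for identical concepts.
TOY = {frozenset(("forum", "blog")): 0.5, frozenset(("quiz", "exam")): 0.75}


def toy_sim(a: str, b: str) -> float:
    return 1.0 if a == b else TOY.get(frozenset((a, b)), 0.0)


score = text_similarity(["forum", "quiz"], ["blog", "exam"], toy_sim)
```
]]></preformat>
        <p>Because only the concepts are compared, the length of the texts does not enter the score directly, which suits messages of a few words.</p>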
        <p>
          Existing semantic similarity metrics can be divided into several classes. Work [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]
proposes the following classification:
        </p>
        <p>1) Measures based on a corpus of texts: for example, LSA and the Web-based measures NGD and
PMI-IR.</p>
        <p>2) Measures based on ontologies: the measures of Wu and Palmer, Leacock and
Chodorow, Resnik, Lin, and others.</p>
        <p>3) Measures based on definitions: ExtendedLesk, GlossVectors.</p>
        <p>One more group of metrics can be distinguished:</p>
        <p>4) Measures based on Wikipedia: Wikipedia Link-based Measure (WLM),
Explicit Semantic Analysis (ESA), WikiWalk, WikiRelate!, and others.</p>
        <p>When choosing a semantic similarity metric for concepts, it is necessary to pay attention to
the specifics of learning courses. The themes of learning courses may lie beyond existing
ontologies: the domain covered by a course may contain specific concepts that well-known
ontologies such as WordNet do not include, and a specific ontology for the course may not exist.
Therefore, similarity measures based on ontologies can be used to analyze
educational forums only if the teacher builds the ontology of the course himself.</p>
        <p>For this reason, we propose to use metrics based on the online encyclopedia
Wikipedia. The advantages of Wikipedia as a data source are its volume and wide
range of themes, its relevance, and its partial structure.</p>
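        <p>To give an idea of how a Wikipedia-based measure can work, the sketch below computes a link-overlap relatedness in the spirit of WLM: two articles are considered related when many of the same articles link to them. The formula follows the commonly cited normalized-distance form of the measure and should be checked against the original publication before use; the function name, the parameter names, and the toy link sets are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
from typing import Set


def wlm_relatedness(links_a: Set[str], links_b: Set[str], total_articles: int) -> float:
    """Link-based relatedness of two articles from their incoming-link sets.

    Assumes total_articles is larger than either link set.
    """
    common = links_a & links_b
    if not common:
        return 0.0  # no shared incoming links: treat as unrelated
    larger = max(len(links_a), len(links_b))
    smaller = min(len(links_a), len(links_b))
    distance = (math.log(larger) - math.log(len(common))) / (
        math.log(total_articles) - math.log(smaller)
    )
    return max(0.0, 1.0 - distance)


# Toy example: four articles link to A, two of them also link to B.
related = wlm_relatedness({"p1", "p2", "p3", "p4"}, {"p3", "p4"}, total_articles=16)
```
]]></preformat>
        <p>A function of this kind could serve as the concept similarity when comparing two comments by the concepts they contain, with the incoming-link sets taken from a Wikipedia dump.</p>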
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Related Work</title>
      <p>Forum analysis is not widely discussed in scientific papers; however, the
following studies can be highlighted.</p>
      <p>
        Work [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is devoted to the study of Usenet groups. It proposes a method for measuring
the similarity of different groups based on the activity of their participants, and it
introduces a measure for evaluating whether a post belongs to a specific group, which allows
cross-posts to be excluded from the analysis.
      </p>
      <p>
        Study [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] proposes a model that estimates the probability of a user's involvement or
noninvolvement in a specific online discussion based on the activity of the user's friends
and the user's interests, where the list of friends and the interests are derived from the
user's previous activity in other topics.
      </p>
      <p>
        Work [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] proposes an approach for extracting context information, as well
as questions and answers, from a topic using the SVM method; [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] considers
other methods used for this purpose.
      </p>
      <p>
        Finally, in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], the authors propose an approach that combines both structural and
semantic analysis for finding discussions and key messages in threads.
      </p>
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>Thanks to the Internet, distance education has become available to millions of
people around the world, and supporting communication within the education process is
playing an increasingly important role. The number of students enrolled in courses is
growing, and the growing share of distance education in the education process
sets new requirements that existing communication tools do not meet, which
reduces the efficiency of the education process as a whole.</p>
      <p>We propose the concept and the model of adaptive educational forums, whose use,
in our opinion, will increase the efficiency of interaction between students,
enhance the role of the communication environment in the distance education process,
and give teachers the ability to automatically collect information about students and about
the quality of the course.</p>
      <p>To create adaptive educational forums, we propose to use an approach that
includes extracting individual thematic subtopics, which in turn is based on the
structural and semantic features of messages.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Dalsgaard</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Social software: E-learning beyond learning management systems</article-title>
          .
          <source>European Journal of Open, Distance and E-Learning</source>
          ,
          <volume>2</volume>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Sclater</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Web 2.0, personal learning environments, and the future of learning management systems</article-title>
          .
          <source>Research Bulletin</source>
          ,
          <volume>13</volume>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Coates</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>James</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Baldwin</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>A critical examination of the effects of learning management systems on university teaching and learning</article-title>
          .
          <source>Tertiary education and management</source>
          ,
          <volume>11</volume>
          ,
          <fpage>19</fpage>
          -
          <lpage>36</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Ketcham</surname>
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Landa</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brown</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Charuk</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>DeFranco</surname>
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heise</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCabe</surname>
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Youngs-Maher</surname>
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Learning Management Systems Review</article-title>
          . (
          <year>2011</year>
          ) http://openscholar.purchase.edu/sites/default/files/keith_landa/files/doodle_lmsreport_final.pdf
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Siemens</surname>
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Learning or Management Systems? A Review of Learning Management System Reviews</article-title>
          . Learning Technologies Centre, University of Manitoba (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Siemens</surname>
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Connectivism: Design and Delivery of Social Networked Learning</article-title>
          .
          <source>The International Review of Research in Open and Distance Learning</source>
          ,
          <volume>12</volume>
          :
          <fpage>3</fpage>
          -
          <lpage>11</lpage>
          . (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Mott</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiley</surname>
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Open for Learning: The CMS and the Open Learning Network</article-title>
          . In education,
          <volume>15</volume>
          (
          <issue>2</issue>
          ). (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Panchenko</surname>
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Similarity Measures for Semantic Relation Extraction</article-title>
          (Ph.D. Thesis)
          . Université catholique de Louvain &amp; Bauman Moscow State Technical University, (
          <year>2012</year>
          -2013)
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>McGlohon</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hurst</surname>
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Community Structure and Information Flow in Usenet: Improving Analysis with a Thread Ownership Model</article-title>
          . ICWSM. (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qiu</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Modeling Dynamic MultiTopic Discussions in Online Forums</article-title>
          . In AAAI. (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Cao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>W. Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C. Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>A structural support vector method for extracting contexts and answers of questions from online forums</article-title>
          .
          <source>Information Processing &amp; Management</source>
          ,
          <volume>47</volume>
          (
          <issue>6</issue>
          ),
          <fpage>886</fpage>
          -
          <lpage>898</lpage>
          . (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Cong</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C. Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>Y. I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Finding question-answer pairs from online forums</article-title>
          .
          <source>In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval</source>
          (pp.
          <fpage>467</fpage>
          -
          <lpage>474</lpage>
          ). ACM. (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>J. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cai</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Simultaneously modeling semantics and structure of threaded discussions: a sparse coding approach and its applications</article-title>
          .
          <source>In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval</source>
          (pp.
          <fpage>131</fpage>
          -
          <lpage>138</lpage>
          ). ACM. (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>