=Paper= {{Paper |id=None |storemode=property |title= A Pedagogical Agent with Embedded Data Mining Functions to Support Collaborative Writing |pdfUrl=https://ceur-ws.org/Vol-1210/DC_01.pdf |volume=Vol-1210 |dblpUrl=https://dblp.org/rec/conf/ht/EpsteinR14 }} == A Pedagogical Agent with Embedded Data Mining Functions to Support Collaborative Writing== https://ceur-ws.org/Vol-1210/DC_01.pdf
A pedagogical agent with embedded data mining functions
            to support collaborative writing

                          Daniel Epstein                                         Eliseo Reategui
         Graduate Program of Computers in Education               Graduate Program of Computers in Education
           Federal University of Rio Grande do Sul                  Federal University of Rio Grande do Sul
                         (UFRGS)                                                  (UFRGS)
                    Porto Alegre - Brazil                                    Porto Alegre - Brazil
                    daepstein@gmail.com                                   eliseoreategui@gmail.com

ABSTRACT
Internet growth has induced the development of a large number of collaborative
tools for online writing and information sharing. Educators quickly realized the
benefits of such tools for learners, allowing them to work online, to share their
knowledge and help each other. Distance learning is a key concept in today’s
educational research; collaborative learning environments are becoming
widespread, being more dynamic and resourceful. However, distance learning also
introduced a series of problems, such as high dropout rates resulting from lack
of support and personalized feedback. It has also made it difficult for educators
to follow and review students’ assignments. Based on this scenario, the work
presented here proposes the development of a pedagogical agent supported by an
intelligent tutoring system to provide students and teachers with assistance in
order to minimize some of these problems. The use of a pedagogical agent allows
students to have constant feedback and guidance based on the identification of
problems that may emerge from an online collaborative writing activity. The
presence of this agent is intended to help students coordinate their efforts in
writing a text collaboratively, and improve their work in terms of coherence.
Furthermore, the pedagogical agent is also able to assist teachers, reporting
problems and simplifying their tasks related to the analysis of the work produced
by each student. To support our pedagogical agent we propose the use of data
mining tools to extract information related to the students’ writings, and a
recommender system to suggest additional resources.

Keywords
Pedagogical Agent, Intelligent Tutoring System, Data Mining, Collaborative,
Distance Learning

1.   INTRODUCTION AND MOTIVATION
In the last few years the Web 2.0 and 3.0 have helped proliferate a large number
of educationally driven tools, such as those to support collaborative writing.
These tools are meant to facilitate writing in a multiple author environment,
allowing users to work on the same document concurrently, notifying them when a
document is modified, and maintaining a revision history. Such features may
enhance collaboration, increasing group awareness, creating a sense of
consciousness in the group members about cooperative team work, and exposing
authors to different aspects of writing [1].

Alongside internet growth and online collaborative learning environments, data
mining has become more important and popular in the education field. From the
students’ perspective, the possibility to easily search for learning material
using indexes and other reference tools increases their resources while reducing
the effort needed to find relevant information. From the teachers’ perspective,
it can help them to summarize students’ writings in a distance learning context
[2], to assess the quality of posts in discussion forums [3] and to evaluate the
students’ participation in discussion forums [4]. Furthermore, it can provide
useful feedback to teachers so they can easily identify the main concepts in
students’ writings and the connection between these concepts [2].

The possibility to work collaboratively is attractive both for students and
teachers. It allows students to exchange knowledge, to help each other and
complement their work with different ideas. However, it also creates several
difficulties for teachers when evaluating the work done, since it is hard and
demanding to monitor each student’s production [5]. Besides, in a distance
learning context, certain barriers may hinder the establishment and maintenance
of distance learning programs, such as technical problems, infrastructure,
motivational difficulties, necessary skills, social problems and
time/interruptions [6]. This project’s goal is to contribute to the development
of a social learning environment in which collaborative writing takes place in a
cohesive way, minimizing technical and interactive/social difficulties that are
inherent to online collaborative work.

This work has also been motivated by the fast increase in the number of
collaborative writing tools available online, and the problems that often
originate from their use, for instance the lack of interaction and actual
collaboration in collaborative writing tasks, and the lack of supervision and
feedback on students’ work.
This project proposes a pedagogical agent to be used in           them during this process. The experiences and knowledge
a collaborative writing environment, so that it may assist        acquired may be reused in their future experiences [19, 20,
teachers and students in their tasks. The agent uses data         21, 22]. The PA may also be seen as a tool capable of mediat-
mining to identify problems in the students’ writings, and        ing learning, since it may interact with learners, individualize
based on this information it tries to guide them in improving     feedback and foster autonomy and collaborative skills. So-
their collaborative text production. Besides, the agent gives     cial interaction, according to the sociocultural perspective,
teachers more accurate information about the students’ par-       is essential for the promotion of learning and development
ticipation, their difficulties and interactive writing process.   [19, 20, 21].

2.   RELATED WORK                                                 3.   RESEARCH PROPOSAL
Intelligent tutoring systems (ITS) are computer tools capa-       Based on the highlighted difficulties of the development of
ble of providing customized instruction or feedback to learn-     collaborative work online, we propose a pedagogical agent
ers [7]. They usually operate without the need of human           to be inserted in an online collaborative writing environ-
intervention. They differ from traditional content-delivery       ment. This agent will be capable of helping students through
computerized learning systems for their ability to improve        immediate feedback about their collaborative text produc-
the effectiveness of a learner’s experience through the use of    tion, and it will assist teachers through the presentation of
artificial intelligence [8].                                      information/indicators regarding students’ participation and
                                                                  progress in the assigned tasks. Our goal is to provide
ITS often use a variety of computational resources for an-        full-time assistance to the users of the collaborative environment
alyzing the users’ interaction with the system. These are         and to reduce the amount of work needed to analyze their
adaptive mechanisms, capable of personalizing learning ac-        interactions and work.
cording to individual student characteristics, such as knowl-
edge on the subject, mood and emotion [9] and learning style      In order to do that, we designed a pedagogical agent to
[10]. They may be programmed to identify user’s informa-          be integrated in the intelligent tutoring system. The ITS
tion as they interact with the system and choose from many        will be responsible for collecting all information regarding
actions the one that most likely would be beneficial to each      the students’ activities in the environment. The student
particular user.                                                  could be adding text, images, audio, video or any other
                                                                  resource to the project or simply reviewing and modifying
However, the ability to properly help the user often depends      some previous work. In any case, the ITS must keep a log
on the interface between the system and the user. Not rarely,     of those interactions in order to determine which action to
ITS require a virtual character to interact with users. These     perform (when needed). Among the different information
characters are called pedagogical agents. The use of peda-        collected by the system, we may list: the student’s contri-
gogical agents (PAs) in educational applications has demon-       bution to the project (either by the addition of new content
strated that these animated characters may improve stu-           or revision/editing of previous work); frequency (how many
dents’ engagement and learning experience [11].                   times each student accessed the collaborative environment,
                                                                  for how long and when) and a concept map summarizing
A PA is a human-like virtual character that has the advan-        what has been written by each student. Since we are con-
tage of operating continuously and autonomously. It is ca-        sidering a collaborative environment, several users may be
pable of searching and interpreting information received or       in the system at the same time and it is important for the
perceived through the system and provides a more natural          ITS to identify which student is responsible for which ac-
interaction with the user. PAs are capable of adapting their      tion. Whether the students access the environment simulta-
actions and interventions, providing feedback and guiding         neously or separately, it is important for all users to have an
problem solving, reflection, understanding and collaborative      identifier that will inform the system which user is currently
learning [12, 13, 14].                                            online and modifying any given document.
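As a rough illustration of the logging described above, the sketch below records one entry per interaction and aggregates, per student identifier, the indicators listed in the text (number of accesses, time spent, additions and revisions). The `Interaction` record, its field names and the action labels are hypothetical; the paper does not specify a log format.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Interaction:
    """One logged action; all field names are hypothetical."""
    student_id: str    # identifier telling the system who is acting
    document_id: str
    action: str        # assumed labels: "add", "revise" or "view"
    start: datetime
    end: datetime


def summarize(log):
    """Aggregate accesses, time spent and contributions per student."""
    summary = defaultdict(
        lambda: {"accesses": 0, "time": timedelta(), "add": 0, "revise": 0}
    )
    for event in log:
        entry = summary[event.student_id]
        entry["accesses"] += 1
        entry["time"] += event.end - event.start
        if event.action in ("add", "revise"):
            entry[event.action] += 1
    return dict(summary)


log = [
    Interaction("s1", "doc", "add",
                datetime(2014, 3, 1, 10, 0), datetime(2014, 3, 1, 10, 30)),
    Interaction("s1", "doc", "revise",
                datetime(2014, 3, 2, 9, 0), datetime(2014, 3, 2, 9, 15)),
    Interaction("s2", "doc", "view",
                datetime(2014, 3, 1, 10, 5), datetime(2014, 3, 1, 10, 10)),
]
report = summarize(log)
print(report["s1"]["accesses"], report["s1"]["time"])  # 2 0:45:00
```

Keeping the student identifier on every entry is what lets simultaneous edits by several users be attributed to the right author afterwards.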

Among the many benefits of using PAs are the increase of          All the information collected by the ITS will be processed
motivation, perception of ease and comfort in the learn-          using data mining techniques. This allows the system to
ing environment, the promotion of fundamental behaviors of        identify what type of contribution the user has made to the
learning, the realization of a need for personal relationships    project and infer how cohesive and coherent the text pro-
in learning and gains in terms of memory, understanding           duced collaboratively is. A data miner similar to Sobek [2]
and problem solving [15]. Besides presenting con-                 will be used to perform these tasks. Sobek is a text miner
tent to users, PAs may suggest additional resources,              that uses statistical analysis to obtain the most relevant con-
highlighting important issues and recommending new ex-            cepts in a text and the relationships between them. A data
ercises and reference materials according to user’s progress      mining process will be used to convert multimedia resources
[16]. Studies have shown that the use of a PA with text min-      present in the project into concepts and relations.
ing features could help students bring relevant contributions
to a reading discussion [17].                                     The system will combine the data extracted from the users’
                                                                  writings with the data provided by the teacher to evaluate
According to sociocultural theory, learning can be consid-        if the students’ project is related to the requirements. In
ered a regulatory process that is mediated by social interac-     order to do so, the agent will compare the concepts and re-
tion among individuals, cultural artifacts (computer, peda-       lationships extracted from the students’ writings with the
gogical agent) and speech [18]. Users’ interactions help them     concepts and relationships extracted from the task speci-
construct and share knowledge, which is internalized by           fication and resources provided by the teacher. Breno et
al. [4] showed that this kind of comparison could provide          work is constantly edited by other students). The second
useful information regarding the quality/relevance of stu-         form of interaction will be a support interface, where the
dents’ contributions in discussion forums. The results of          teacher will be able to request specific information regard-
this comparison will determine if the PA has to make any           ing student’s activities in the environment. All those PA
intervention to help students improve their text. It is par-       features are meant to provide a more personalized contact
ticularly important for this intervention to result in positive    between the agent, the students and/or teachers.
reinforcement.
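The comparison between the students' writings and the task specification can be pictured with the minimal sketch below. It is a stand-in built on stated assumptions, not the Sobek algorithm: concepts are approximated by the most frequent non-stopword terms, relationships are ignored, and the similarity measure (Jaccard overlap) and the 0.3 threshold are arbitrary choices made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real miner would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
             "for", "on", "that", "it", "with", "as", "by"}


def top_concepts(text, k=5):
    """Return the k most frequent non-stopword terms as a set."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return {term for term, _ in counts.most_common(k)}


def jaccard(a, b):
    """Overlap of two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def needs_intervention(student_text, task_text, threshold=0.3, k=5):
    """Flag a text whose concepts overlap too little with the task's."""
    score = jaccard(top_concepts(student_text, k), top_concepts(task_text, k))
    return score < threshold


task = "Describe photosynthesis: light energy converts"
on_topic = "Photosynthesis converts light energy"
off_topic = "Football match results yesterday"
print(needs_intervention(on_topic, task))   # False
print(needs_intervention(off_topic, task))  # True
```

Relationships between concepts could be compared in the same way by treating each pair of related concepts as one set element.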
                                                                   Another aspect the agent is concerned with is the coherence
There are several types of interventions planned for both          of the project. The students may not be together when
students and teachers. The most common type of interven-           writing the project and it may result in disjoint texts, unre-
tion for students is a direct message sent by the PA inquir-       lated or redundant information. Therefore, it is important
ing about some aspect of the project. Those inquiries are          for the agent to identify coherence problems and contact the
intended to foster critical thinking and help students cor-        students who produced the incoherent parts of the project.
rect what the ITS identifies as a problem. The messages            The agent may also use the information gathered from those
are sent when the students’ work is incomplete or lacks co-        incoherent parts to search for additional material and learn-
hesion. In both cases, it is possible for the agent to suggest     ing objects that could help students fill the possible gaps in
additional material that may help them correct the prob-           the project. Although we considered using Latent Seman-
lem. Another possible intervention is the use of e-mail mes-       tic Analysis to perform this task [23], it is still problematic
sages when students are not participating in the collabora-        to decide on how to interpret multimedia resources that are
tive work, or when their contributions are not coherent with       neither text nor learning objects (in which case it is possible
the rest of the project. The last type of intervention is not      to use their keywords and descriptions).
an automatic answer from the PA, but an explicit request for
help from a student. Specific functions are being developed
to enable students to ask the PA for further information           4.      STATE OF THE PROJECT
about some aspect of the project, or about its structure and       In the current state of the project, we are developing a script
coherence.                                                         program that will collect the data from students’ writings
                                                                   and send it to the pedagogical agent. This is part of the
A key feature in those interactions between PA and students        development of our intelligent tutoring system. This is also
is the agent’s ability to identify additional resources that       one of the most challenging parts of this project, as it is
may help the students’ text production. As the students’           important to correctly interpret and evaluate multimedia
project may include several types of media and resource for-       resources. Through the use of scripts without a particular
mats, the agent is able to recommend to students                   user interface, we intend to create a more reusable system,
additional learning objects extracted from repositories            allowing it to be used in many tools and environments with-
as well as from the web. Learning objects are usually in-          out the need for changing the program’s core.
dexed using metadata, in which keywords and object descrip-
tions are very common; this information is used to search          Our data mining system based on Sobek is already being
for learning objects that are most related to the topic at         modified to provide useful information regarding resource
hand. Using a similar technique to the one used by Breno           similarity, concepts and relationships. The input for our
et al. [4], the learning objects that present the highest simi-    data miner is very restrictive, but we are working on mak-
larity values are presented to the students. Learning objects      ing it more general. This is especially useful for investigating how to
selected will be separated by their format (video, audio, im-      successfully mine multimedia resources. Using the learning
age, text, etc.) so that it will be easier for the student to      object repository, we may conduct experiments that will as-
select the most appropriate ones.                                  sess the quality of the results and the reliability of our data
                                                                   mining tool.
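The recommendation step can be sketched as follows, assuming each learning object carries a set of metadata keywords and a media format. The `LearningObject` record, the Jaccard similarity and the `top_n` cut-off are illustrative assumptions; the technique actually used in [4] is not detailed here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LearningObject:
    """Repository entry; the fields are hypothetical metadata."""
    title: str
    media_format: str      # e.g. "video", "audio", "image", "text"
    keywords: frozenset    # metadata keywords describing the object


def similarity(concepts, keywords):
    """Jaccard overlap between project concepts and object keywords."""
    union = concepts | keywords
    return len(concepts & keywords) / len(union) if union else 0.0


def recommend(concepts, repository, top_n=3):
    """Keep the top_n most similar objects and group titles by format."""
    scored = [(similarity(concepts, lo.keywords), lo) for lo in repository]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    grouped = defaultdict(list)
    for _, lo in scored[:top_n]:
        grouped[lo.media_format].append(lo.title)
    return dict(grouped)


repository = [
    LearningObject("Intro to photosynthesis", "video",
                   frozenset({"photosynthesis", "light"})),
    LearningObject("Plant cells", "text", frozenset({"cell", "plant"})),
    LearningObject("Chlorophyll basics", "text",
                   frozenset({"chlorophyll", "light", "photosynthesis"})),
]
suggestions = recommend({"photosynthesis", "light", "chlorophyll"}, repository)
print(suggestions)
```

Grouping the result by format mirrors the separation into video, audio, image and text described above, so that a student can pick the most convenient medium.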
Some of the PA’s functions are specific to help teachers. As
it is often difficult and time consuming for teachers to ana-      The experiments will be carried out in projects using Google
lyze the individual production of each student in collabora-       Drive¹. This environment was chosen for the ex-
tive writing tasks, the PA will provide accurate information       tensive database and number of projects that are developed
about each student’s contributions and progress based on           with this technology daily. It is one of the best-known and
the information collected by the ITS. This will allow teach-       most complete environments for collaborative writing, and it also
ers to identify relevant contributions made by each student        allows the input of scripts that could facilitate the integra-
or the absence of a student in a particular subject or part        tion of our pedagogical agent with the system. Google sup-
of the project.                                                    port and APIs make it a natural choice for our project to
                                                                   be integrated in Google Drive Documents, to ensure the dis-
The interaction between the PA and the teachers will be            semination of our ideas and software to a larger number of
set in two ways. The first one is through direct messages          people across the world.
sent by the PA to the teacher’s email or other communica-
tion method. Those messages are sent to inform about lack
of participation from some student or if the system detects
that one or more students need further assistance with the
project (this may be identified through a constant request for
help from a student, when his/her work is always identified
to have coherence/cohesion problems, or when the student’s

1 https://drive.google.com

5.   REFERENCES
 [1] S. G. Tammaro, J. N. Mosier, N. C. Goodwin, and
     G. Spitz, “Collaborative writing is hard to support: A
     field study of collaborative writing.” Computer
     Supported Cooperative Work, vol. 6, no. 1, pp. 19–51,
     1997.
 [2] E. Reategui, D. Epstein, A. Lorenzatti, and
     M. Klemann, “Sobek: a text mining tool for
     educational applications,” 2011.
 [3] Y. Chen, X. Cheng, and Y. Huang, “A wavelet-based
     model to recognize high-quality topics on web forum.”
     in IEEE/WIC/ACM International Conference on Web
     Intelligence and Intelligent Agent Technology. IEEE,
     2008, pp. 343–351.
 [4] B. F. Azevedo, P. A. Behar, and E. B. Reategui,
     “Qualitative analysis of discussion forums,” 2011.
 [5] A. A. Juan, T. Daradoumis, J. Faulin, and F. Xhafa,
     “Developing an information system for monitoring
     student’s activity in online collaborative learning.” in
     Proceedings of the International Conference on
     Complex, Intelligent and Software intensive Systems,
     CISIS, 2008, pp. 270–275.
 [6] F. Dabaj, “Analysis of communication barriers to
     distance learning: A review study,” Online Journal of
     Communication and Media Technologies, 2011.
 [7] J. Psotka, L. Massey, and S. Mutter, Intelligent
     Tutoring Systems: Lessons Learned. L. Erlbaum
     Associates, 1988.
 [8] P. Brusilovsky and C. Peylo, “Adaptive and intelligent
     web-based educational systems,” International Journal
     of Artificial Intelligence in Education, vol. 13, pp.
     159–172, 2003.
 [9] S. D’Mello, R. W. Picard, and A. Graesser, “Toward
     an affect-sensitive autotutor,” IEEE Intelligent
     Systems, vol. 22, no. 4, pp. 53–61, 2007.
[10] V. Yannibelli, D. Godoy, and A. Amandi, “A genetic
     algorithm approach to recognise students’ learning
     styles.” Interactive Learning Environments, vol. 14,
     no. 1, pp. 55–78, 2006.
[11] S. D. Craig, B. Gholson, and D. M. Driscoll,
     “Animated pedagogical agents in multimedia
     educational environments,” 2002.
[12] R. Moreno, “Multimedia learning with animated
     pedagogical agents,” 2005.
[13] R. Moreno, R. E. Mayer, H. A. Spires, and J. C.
     Lester, “The case for social agency in computer-based
     teaching. Do students learn more deeply when they
     interact with animated pedagogical agents?” 2001.
[14] D. M. Dehn and S. van Mulken, “The impact of
     animated interface agents: A review of empirical
     research,” 2000.
[15] A. Gulz, “Benefits of virtual characters in
     computer-based learning environments: Claims and
     evidence,” 2004.
[16] E. B. Reategui, L. M. A. Santos, and T. L. M. R,
     “Pedagogical agents and the efficiency of instructional
     conditions in educational applications,” 2012.
[17] I. da Costa Pinho, D. Epstein, E. B. Reategui,
     Y. Correa, and E. Polonia, “The use of text mining to
     build a pedagogical agent capable of mediating
     synchronous online discussions in the context of
     foreign language learning,” 2013 IEEE Frontiers in
     Education Conference (FIE), pp. 393–399, 2013.
[18] D. H. Jonassen, C. B. Lee, C. Yang, and L. Laffey,
     “The collaboration principle in multimedia learning,”
     in The Cambridge Handbook of Multimedia Learning,
     2005.
[19] L. S. Vygotsky, “Mind in society: the development of
     higher psychological processes.” Cambridge, MA,
     1978.
[20] L. S. Vygotsky, “Thought and language.” Cambridge,
     MA, 1986.
[21] L. S. Vygotsky, “The collected works of L. S. Vygotsky.
     V.1. Thinking and speaking.” New York, N.Y.: Plenum
     Press, 1987.
[22] J. Lantolf and S. Thorne, “Sociocultural theory and
     the genesis of second language development. Oxford
     University Press, 2006.” Applied Linguistics, vol. 28,
     no. 3, p. 477, 2007.
[23] M. A. Hearst, “The debate on automated essay
     grading,” IEEE Intelligent Systems, vol. 15, no. 5, 2000.

Daniel Epstein has a Master’s degree in Computer Science
from Universidade Federal do Rio Grande do Sul, where he worked
with robotics and artificial intelligence. He is a Ph.D. stu-
dent at the Graduate Program in Computer Science
in Education, currently researching Data Mining and Peda-
gogical Agent use in collaborative learning. He started the
program in March 2013 and is currently writing his thesis
proposal. He expects to complete his Ph.D. in mid-2016.

Curriculum: http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4475739T8



Eliseo Berni Reategui has a PhD in Computer
Science from the University of London, England, and an MS in
Computer Science from Universidade Federal do Rio Grande
do Sul, Brazil. After finishing his PhD and working in
industry for 5 years, Dr. Reategui held a lecturer position at
the University of Caxias do Sul, Brazil, for a few more years.
Nowadays, he works as a lecturer and researcher at the Fed-
eral University of Rio Grande do Sul. His research interests
are related to the use of computers in education, involving
areas such as artificial intelligence and human-computer in-
teraction.

Curriculum: http://lattes.cnpq.br/9140136724972740



The main topics on which the student would like to receive
advice are:

  • Data Mining
  • Recommender system