Human context in Sentiment Analysis symbolic technique
Daniel Amo-Filvà 1, Mireia Usart 2, Carme Grimalt-Álvaro 2 and Jiahui Chen 1
1 La Salle, Universitat Ramon Llull, Barcelona, Spain
2 Universitat Rovira i Virgili, Tarragona, Spain

                      Abstract
                      Learning methodologies in Virtual Learning Environments that encourage students' written
                      communication require additional effort from the trainers, in terms of management and
                      emotional awareness of both the group and each participant. Analysing and evaluating the
                      sentiment of every message in every conversation is hard and tedious work. This is one of
                      the reasons why Natural Language Processing (NLP) and Sentiment Analysis (SA) are gaining
                      popularity. The idea of automating the emotional evaluation of students' conversations in
                      an academic context invites us to consider such automatisms as substitutes for manual
                      processes. The challenge of including the human context, together with treating the data
                      with adequate privacy under current legislation, makes these techniques complex. There are
                      two main techniques in SA: those based on lexicons and those based on machine learning. In
                      the present study, the results of SA based on two different lexicons are compared with the
                      results of a manual labelling performed by human trainers to test the effectiveness of the
                      SA technique. Regarding privacy concerns, an open-source local analysis tool was updated to
                      incorporate these automated processes, both for the present study and for trainers to use
                      with the extracted results. The results show that lexicon-based SA processes tend to push
                      messages towards the extremes (positive/negative), while human evaluation tends towards
                      sentimental neutrality, for both female and male participants.

                      Keywords
                      Sentiment Analysis, Natural Language Processing, Word List, Human Context.

1. Introduction
    The use of learning management systems (LMS) has become even more widespread after the COVID-
19 pandemic [1]. In these virtual contexts, communication between students and trainers is enabled via
discussion forums. Conversations about learning topics occur asynchronously, and interactions take
place naturally within the cultural, social, and political context of each participant. Emotions
always guide interactions [2]. In face-to-face interactions, non-verbal language modulates those
interactions and provides additional information about the meaning and implications of the messages.
In written communication, however, the work is more tedious (each message must be read), and the
non-verbal part is missing, which can cause misinterpretation of the messages and leave teachers and
participants unaware of the state of others.
    Hence, Sentiment Analysis (SA) can contribute to better assessing participants' emotions and the
emotional climate of the classroom, improving teachers' feedback to participants [3, 4]. With emotional
climate assessment, trainers can change the flow, meaning, and purpose of participants' conversations
to avoid misunderstandings and build better relations among participants, individually and as a group.
    The emotional climate of participants is “defined as the quality of the social and emotional
interactions between trainees, their peers, and trainers” and is “a key element for creating safe and
creative learning environments” [5]. In recent years, trainers have been identifying the emotional
climate through Big Data techniques based on Machine Learning algorithms or other Artificial
Intelligence approaches running in the cloud. This techno-solutionism raises concerns about


Learning Analytics Summer Institute Spain (LASI Spain) 2022, June 20–21, 2022, Salamanca, Spain
EMAIL: daniel.amo@salle.url.edu (A. 1); mireia.usart@urv.cat (A. 2); carme.grimalt@urv.cat (A. 3); jiahui1@hotmail.es (A. 4)
© 2022 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

ethical practices, privacy and security of data, and context awareness in algorithms. However, analysing
textual data in an LMS while teaching can be both stressful and time-consuming for trainers [1]; thus,
automatic detection techniques such as SA should be used to help trainers optimize their teaching time.
    SA is a natural language processing (NLP) technique that analyses textual data in natural language
(human-readable text) with the aim of extracting and interpreting the expressed emotions. As stated by
Usart et al. [5], SA of text input has been characterized as a non-intrusive and behavioural manner of
emotion measurement. Hence, a group of participants can be assessed emotionally with this approach
to detect the emotional climate and present the results through simple visualizations or dashboards [6–8].
    There are two main techniques in SA: those based on lexicons and those based on Machine Learning.
In both, as automatic processes, it is very complex to introduce the social, cultural, and political context
of human beings. Written texts include different rhetorical figures, such as hyperbole, metaphor,
sarcasm, irony, and oxymoron, which, depending on the context of the discourse, can be semantically
misinterpreted by automatisms. The challenge of including the human context, together with the
challenge of treating the data with adequate privacy under current legislation, makes SA complex
and calls for advanced approaches such as supervised machine learning. SA based on a general lexicon
analyses the lexemes of the words in the input texts, which can distort the sentiment result, requiring
a lexical corpus adapted to the reality being analysed. The same happens with a machine learning
approach, since the training datasets must consider the analysed context. In either case, the real
context is complex to incorporate into the automatisms. In summary, SA works using one of two
methodologies [9], both facing challenges such as high-quality data acquisition, human-in-the-loop
machine learning in production, human context detection, and ethics and privacy issues:
     • Symbolic techniques: The symbolic technique works by breaking a message down into words
          or sentences and then assigning a sentiment score to each one. A linguistic corpus previously
          labelled with the sentiments is used to find the sentiment of the analysed texts by applying some
          sort of combination and aggregation function to the word scoring.
     • Machine Learning: The machine learning technique uses a training (fitting) set and a testing set
          to develop classification models. The training set is previously labelled by humans and used
          to build the sentiment classification model. Results are compared with the test set to
          validate the model.
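The symbolic technique described above can be sketched in a few lines. This is an illustrative sketch, not the MLA implementation: the tiny Spanish word list and the averaging aggregation function are hypothetical stand-ins for a real labelled corpus such as ML-SentiCon.

```python
import re

# Hypothetical mini-lexicon: word -> valence score in [-1, 1] (stand-in for a
# real corpus such as ML-SentiCon).
LEXICON = {"bueno": 0.7, "excelente": 0.9, "malo": -0.7, "terrible": -0.9}

def score_message(text: str, lexicon: dict) -> float:
    """Break the message into words and average the scores of known words.

    Averaging is one possible aggregation function; summing is another.
    Words absent from the lexicon contribute nothing, so a message with no
    known words scores 0.0 (neutral).
    """
    words = re.findall(r"\w+", text.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    return sum(scores) / len(scores) if scores else 0.0
```

For example, `score_message("El trabajo es excelente pero el plazo es malo", LEXICON)` averages 0.9 and -0.7 into a mildly positive 0.1, illustrating how opposing word scores partially cancel under this aggregation.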
    In this research, the objective is two-fold. On the one hand, we focus on the development of a tool
that allows the execution of SA (symbolic technique). On the other hand, we aim to automatically detect
emotions in different learning forums through SA (symbolic technique). Considering that both SA
techniques face similar problems, we decided to use the symbolic technique as a first approach for three
reasons:
    1. The development of the tool entails less technological complexity.
    2. The scientific literature in education focuses mostly on machine learning approaches, which
         makes the symbolic technique interesting to research.
    3. Most of the lexicon-based literature focuses on MOOC forums [10]. Manually labelling a
         MOOC forum is tedious and exhausting work. This could be the reason why no literature
         was found in educational research comparing the lexicon-based symbolic technique against
         manual labelling.
    The third reason points out the novelty of our work. Our general aim is to compare SA results with
a manual detection made by trainers in a specific educational context. One issue that both SA
techniques present is the difficulty of detecting the human social and cultural context in natural
language. In symbolic techniques, some authors point out the need for an additional domain-focused
lexicon to improve analysis accuracy [11, 12]. Comparing results may help to understand how context
is handled by both SA techniques in the educational domain considered in this work. There may be
important differences between manual and automatic results, which would show that the human context
is not fully addressed by SA techniques. The results may also explain possible biases in the labelling of
sentiments and thus foster the research and development of new domain-focused lexicons. Considering
the above, we want to answer how a symbolic technique using a general lexicon database differs from
labelling made by human experts in pedagogy.
    Therefore, it is necessary not only to develop methods that help trainers process the large amount
of textual data generated in LMSs, but also to develop tools that execute those methods, considering
concerns regarding ethics, data privacy, and human context, such as gender, which could help trainers
detect differences in participants’ interactions caused by gender inequalities. Hence, we will develop
tools that run locally, without the need to send and process data in the (public) cloud, so any
trainer can use them on their own computer. In a next phase, as future work, we will apply the
counterpart Machine Learning technique to compare automatic and human labelling.
    The rest of this paper is organized as follows. Section 2 describes the methodology followed
throughout this work. Section 3 presents the results obtained. Section 4 discusses the results. Finally,
Section 5 presents the conclusions drawn from this work.

2. Methodology

    To achieve the proposed objectives, this study is divided into three phases. In the first phase, all
datasets are downloaded and prepared for sentiment processing. The datasets are discussion forums
exported from pedagogy courses on Universitat Rovira i Virgili’s Moodle platform. In these forums,
students discussed topics related to their subject in Spanish. Two datasets, covering courses from 2019
to 2021, have been used for the comparison. The first, exported from the 2019-2020 academic course
forum, consists of 145 messages; the second, from the 2020-2021 forum, of 37 messages. A total
sample of 182 messages was analysed, both manually and automatically.
    In the second phase, jsMLA (JavaScript Moodle Learning Analytics) is adapted and updated [13].
This tool, which was developed, scientifically substantiated, and published previously, can be used for
local data analysis in any web browser. It analyses Moodle logs locally, computes different educational
metrics, and renders a dashboard with indicator visualizations (see Fig. 1).




Figure 1: Previous jsMLA dashboard. General information is shown in cards on the top side.
Interactions are shown both in graphical and tabular mode.

   The new version of the jsMLA tool is renamed MLA (Moodle Learning Analytics) [14, 15], also
developed with web technologies for local deployment to enhance privacy and security in data analysis.
The new MLA is shipped as a standalone multi-platform application to facilitate its distribution and
installation, with an enhanced user interface for any teacher (see Fig. 2). The newer platform also adds
support for SA of Moodle forum logs.

Figure 2: New MLA dashboard.

    The process of SA in MLA is divided into different phases (see Fig. 3). Once Moodle forum JSON
log files are imported into the platform, the tool takes the following steps: (i) detects the language of
the messages, (ii) performs the SA evaluation, and (iii) renders the data back to the UI.
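The three steps above can be sketched as a small pipeline. This is a hypothetical illustration, not MLA's code: the helper names, the naive stopword-based language detector, and the placeholder analyser are our assumptions about one possible shape of such a pipeline.

```python
import json

# A few high-frequency Spanish function words used as a crude language hint
# (illustrative only; a real tool would use a proper language detector).
SPANISH_HINTS = {"el", "la", "de", "que", "y", "es"}

def detect_language(text: str) -> str:
    """Step (i): guess the message language from common function words."""
    return "es" if set(text.lower().split()) & SPANISH_HINTS else "unknown"

def analyse_sentiment(text: str) -> str:
    """Step (ii): placeholder for the lexicon-based sentiment evaluation."""
    return "neutral"

def process_forum_log(raw_json: str) -> list:
    """Steps (i)-(iii): parse the forum log, annotate each message, and
    return a structure the UI could render."""
    messages = json.loads(raw_json)
    return [{"text": m["message"],
             "language": detect_language(m["message"]),
             "sentiment": analyse_sentiment(m["message"])}
            for m in messages]
```

A forum export like `[{"message": "El curso es interesante"}]` would come back annotated with `"language": "es"` and a sentiment label, ready for visualization.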




Figure 3: MLA SA processing methodology diagram.

    The language used in the forums conditions the possible lexicons, as does the actual context of
the datasets. After an exhaustive search, we found no specific lexicon for the context of the datasets.
Therefore, the sentiment analysis is implemented by leveraging AXA’s NLP.js [16] JavaScript library,
which 1) by default uses a lexicon-based approach and 2) supports Spanish word lists. The NLP.js
library can be configured with two different word lists: ML-SentiCon [17] (Multi-Layered, Multilingual
Sentiment Lexicon) and a Spanish translation of the AFINN [18] lexicon. Both lexicon corpora are
compared with the human labelling to find out which of the two is more accurate.
    In the third phase, both automated and human sentiment analysis are conducted. The forum’s
messages are classified by the author’s gender (M - male / F - female), so this perspective is taken into
account in the resulting insights. The labelling comprises the following three types of sentiment,
regardless of the labelling process (manual or automated) (see Fig. 4):
     • Positive: messages that transmit an optimistic attitude.
     • Negative: messages that lean towards an adverse attitude.
     • Neutral: messages that do not fall into either of the previous categories.
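A numeric valence score can be mapped onto these three classes with a simple thresholding rule. The sketch below is illustrative: the ±0.05 neutrality band is an assumed threshold, not a value taken from the MLA tool.

```python
def label_from_score(score: float, band: float = 0.05) -> str:
    """Map a valence score in [-1, 1] onto the three sentiment classes.

    Scores inside the (assumed) +/-band window count as neutral; anything
    above is positive, anything below is negative.
    """
    if score > band:
        return "positive"
    if score < -band:
        return "negative"
    return "neutral"
```

The width of the neutral band directly controls how often the automatic labeller outputs "neutral", which matters for the human-vs-machine comparison discussed later.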

Figure 4: Sentiment Analysis (SA) feature of the MLA tool. Messages are shown in the left bottom
section. Positive messages are in green, neutral in yellow and negative in red.

   SentiCon uses a valence scoring system that spans from -1 (extremely negative) to +1 (extremely
positive), while AFINN’s range goes from -5 to +5. The more positive the vocabulary used in a text,
the more optimistic the attitude, and vice versa. In manual classification, by contrast, humans carry out
this process considering their contextual background (cultural, social, and political) as well as the
content of the messages themselves. The datasets contain conversations with human-specific nuances
that automatisms could overlook. As a hypothesis of the study, it is expected that the results will reflect
a difference between the manual and the lexicon-based automated analysis.
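Since the two lexicons score on different scales, per-message scores can be made comparable by a simple rescaling: AFINN values span [-5, +5] while ML-SentiCon spans [-1, +1], so dividing AFINN scores by 5 puts both on a common unit. This normalization is our illustration, not part of either lexicon or of the MLA tool.

```python
def afinn_to_unit(score: float) -> float:
    """Rescale an AFINN score in [-5, 5] to the [-1, 1] SentiCon range."""
    if not -5 <= score <= 5:
        raise ValueError("AFINN scores lie in [-5, 5]")
    return score / 5.0
```

With this mapping, an AFINN score of -3 becomes -0.6, so thresholds or comparisons defined on the [-1, 1] scale apply to both lexicons' output.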

3. Results
   The results of both manual and automated (using both dictionaries, SentiCon and AFINN) SA are
displayed in a vertical bar chart (see Fig. 5). Considering messages processed by human beings:
    • The sentiment of the 2019 - 2020 forum is strongly neutral (89.7%), with low positive rating
        (9.65%) and very low negative rating (0.7%)
    • The sentiment of the 2020 - 2021 forum is strongly neutral (75.7%), with a low positive rating
        (10.8%) and a slight negative rating (13.5%) compared to the previous period

    Considering the SentiCon library:
    • The sentiment of the 2019 - 2020 forum is strongly positive (68.3%), with substantial negative
       ratings (31.7%) and no neutral ratings
    • The sentiment of the 2020 - 2021 forum is positive (51.4%), with considerable negative rating
       (29.7%) and neutral (18.9%)

    Considering the AFINN library:
    • The sentiment of the 2019 - 2020 forum is balanced, positive ratings (49%) and negative
       (44.8%) with a very low neutral rating (6.2%)
    • The sentiment of the 2020 - 2021 forum is slightly more positive (51.4%) than negative (40.5%)
       with a low neutral rating (8.1%)

Figure 5: Bar graph of sentiment analysis on the 2019-20 and 2020-21 discussion forum datasets, a
comparison between human vs machine (SentiCon) vs machine (Spanish AFINN).

   Fig. 6 shows the SA results considering the gender of each message’s author.
Considering the results for the female gender:
   • The sentiment of the 2019 - 2020 forum is strongly neutral (88.5%) in human being ratings,
       with low positive (10.1%) and exceptionally low negative (1.4%) ratings. The SentiCon based
       SA tends towards a more positive (69.6%) than negative (30.4%) rating, with no neutrality
       considered. The AFINN based SA tends towards a low variability ranking between positivity
       (49.3%) and negativity (43.5%), with an incredibly low neutrality (7.2%).
   •    The sentiment of the 2020 - 2021 forum is strongly neutral (79.3%) in human being ratings,
        with low positive (13.8%) and very low negative (6.9%) ratings. The SentiCon based SA tends
        towards a more positive (51.7%) than negative (37.9%) rating, with very low neutrality
        (10.4%). The AFINN based SA tends towards a more positive (51.7%) than negative (31%)
        rating, with a low neutrality (17.3%).

   Considering the results for the male gender:
   • The sentiment of the 2019 - 2020 forum is strongly neutral (90.8%) in human being ratings,
       with few positive ratings (9.2%) and no negative connotations. The SentiCon based SA tends
       towards a more positive (67.1%) than negative (32.9%) rating, with no neutrality considered.
        The AFINN based SA tends towards a low variability between positivity (48.7%) and negativity
        (46%), with a low neutrality (5.3%)
   •    The sentiment of the 2020 - 2021 forum is more neutral (62.5%) than negative (37.5%) in the
        human being ratings, with no positive ratings. The SentiCon based SA showed equally
        positive (50%) and negative (50%) ratings, with no neutral consideration. The AFINN based SA
        leans towards a positive ranking (50%), with equal negative (25%) and neutral (25%)
        ratings
Figure 6: SA bar graph of students’ messages performed by human (female vs male), vs machine
(SentiCon) vs machine (Spanish AFINN).

4. Discussion

    In light of the results obtained above, human beings are more prone to classify messages as neutral.
Changing the lexicon used by the machine to perform NLP does not significantly change the labelled
sentiment. However, the machine’s evaluations are more prone to classify messages as either positive
or negative. There is a clear difference between humans’ sentiment assessment and lexicon-based SA.
    Another insightful result extracted from this analysis is the relation between gender-differentiated
messages and the distribution of the types of connotations in them. In the first dataset (2019 - 2020
forum), the distribution of messages sent by females and males is balanced, 69 and 76 messages,
respectively. Based on these results, there is no clear evidence that being female or male influences the
labelling, though females have (very few) negative considerations as opposed to males.
    In the second dataset (2020 - 2021 forum), discussions are predominantly female, with 29 messages
from women versus 8 from men, a 78% to 22% ratio. Based on these results, there is no clear evidence
that being female or male influences the labelling, though males show no positive considerations
compared to women.
    Finally, the results regarding automatic labelling indicate that the lexicon-based NLP analysis
labelled about half of the messages as positive and almost half as negative. Neutrality is negligible in
the automatic labelling results.
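The comparison carried out in this section can be expressed compactly: given parallel human and machine labels for the same messages, compute each label distribution and the per-message agreement rate. The sketch below is illustrative; the toy label lists are hypothetical, not the study's data.

```python
from collections import Counter

def distribution(labels: list) -> dict:
    """Share of each sentiment class, as fractions summing to 1."""
    counts = Counter(labels)
    return {k: v / len(labels) for k, v in counts.items()}

def agreement(human: list, machine: list) -> float:
    """Fraction of messages where human and machine labels coincide."""
    matches = sum(h == m for h, m in zip(human, machine))
    return matches / len(human)
```

Comparing `distribution(human)` against `distribution(machine)` exposes the neutral-vs-extremes gap described above, while `agreement` summarizes per-message disagreement in a single number.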

5. Conclusions

   The different LMS-mediated learning methodologies that encourage students' written
communication require additional effort on the part of the trainers, in terms of management and
emotional awareness of both the group and each participant. Analysing and evaluating the sentiment of
every message in every conversation is hard and tedious work. This is one of the reasons why NLP and
Sentiment Analysis are gaining popularity [5]. The idea of automating the emotional evaluation of
certain academic texts invites us to consider these automatisms as substitutes for manual processes.
    In the present study we compare the results of two automated SA processes, based on two different
lexicons (SentiCon [17] and AFINN [18]), with the results of a manual labelling performed by human
trainers. To address privacy concerns, we updated an open-source local analysis tool [13, 15] to
incorporate these automated processes, both for the study and for trainers to use with the extracted
results.
    The results of the comparison show that lexicon-based SA processes tend to push messages towards
the extremes (positive/negative), with those from the datasets used being rated more positive than
negative. The results produced by human beings tend towards sentimental neutrality, for messages
written by both females and males, with most messages marked as neutral and very few as
negative/positive. These results can be explained by 1) the use of general lexical dictionaries
disconnected from the human context in which the conversations take place, and 2) the type of activities
related to pedagogical studies, where the conversations concern academic work and theory. However,
these conclusions do not mean that SA symbolic techniques are biased towards the extremes, nor can
they be generalized. In particular, these results show how the lack of context in a word dictionary can
produce bias relative to human expertise in a specific domain context.
    In future research we will consider this study’s datasets as training datasets to apply machine
learning-based SA, to overcome the limitations found regarding the lack of human context in generic
lexicon-based SA.

6. Acknowledgments

   We appreciate the support received by Silvia Blasi, Aleix Ollé, and Ángel García in the manual
labelling process. This project has been funded by the Social Observatory of the “la Caixa” Foundation
as part of the project LCF/PR/SR19/52540001.

7. References

[1]   Knopik, T., Oszwa, U.: E-cooperative problem solving as a strategy for learning mathematics
      during the COVID-19 pandemic. Education in the Knowledge Society (EKS). 22, e25176 (2021).
      https://doi.org/10.14201/eks.25176
[2]   Bakhtiar, A., Webster, E.A., Hadwin, A.F.: Regulation and socio-emotional interactions in a
      positive and a negative group climate. Metacognition Learning. 13, 57–90 (2018).
      https://doi.org/10.1007/s11409-017-9178-x
[3]   García-Peñalvo, F.J., Corell, A., Abella-García, V., Grande-de-Prado, M.: Online Assessment
      in Higher Education in the Time of COVID-19. Education in the Knowledge Society (EKS). 21,
      26 (2020). https://doi.org/10.14201/eks.23086
[4]   García-Peñalvo, F.J., Corell, A., Abella-García, V., Grande-de-Prado, M.: Recommendations
      for Mandatory Online Assessment in Higher Education During the COVID-19 Pandemic. In:
      Burgos, D., Tlili, A., and Tabacco, A. (eds.) Radical Solutions for Education in a Crisis Context:
      COVID-19 as an Opportunity for Global Learning. pp. 85–98. Springer, Singapore (2021)
[5]   Usart, M., Grimalt-Álvaro, C., Iglesias-Estradé, A.M.: Gender-sensitive sentiment analysis for
      estimating the emotional climate in online teacher education. Learning Environments Research.
      2, 1–20 (2022). https://doi.org/10.1007/s10984-022-09405-1
[6]   Álvarez-Arana, A., Villamañe-Gironés, M., Larrañaga-Olagaray, M.: Mejora de los procesos de
      evaluación mediante analítica visual del aprendizaje. Education in the Knowledge Society (EKS).
      21, 13 (2020). https://doi.org/10.14201/eks.22914
[7]   Vázquez-Ingelmo, A., García-Peñalvo, F.J., Therón, R.: Towards a Technological Ecosystem to
      Provide Information Dashboards as a Service: A Dynamic Proposal for Supplying Dashboards
      Adapted to Specific Scenarios. Applied Sciences. 11, 14 (2021).
      https://doi.org/10.3390/app11073249
[8]    Sarikaya, A., Correll, M., Bartram, L., Tory, M., Fisher, D.: What Do We Talk About When We
       Talk About Dashboards? IEEE Transactions on Visualization and Computer Graphics. 25, 682–
       692 (2018). https://doi.org/10.1109/TVCG.2018.2864903
[9]    Zhang, H., Gan, W., Jiang, B.: Machine Learning and Lexicon Based Methods for Sentiment
       Classification: A Survey. In: Proceedings of the 2014 11th Web Information System and
       Application Conference. pp. 262–265. IEEE Computer Society, Tianjin, China (2014)
[10]   Mite-Baidal, K., Delgado-Vera, C., Solís-Avilés, E., Espinoza, A.H., Ortiz-Zambrano, J., Varela-
       Tapia, E.: Sentiment Analysis in Education Domain: A Systematic Literature Review. In:
       Valencia-García, R., Alcaraz-Mármol, G., Del Cioppo-Morstadt, J., Vera-Lucio, N., and
       Bucaram-Leverone, M. (eds.) Technologies and Innovation. pp. 285–297. Springer International
       Publishing, Cham (2018)
[11]   Yadav, S., Sarkar, M.: Enhancing Sentiment Analysis Using Domain-Specific Lexicon: A Case
       Study on GST. In: Proceedings of the 2018 International Conference on Advances in Computing,
       Communications and Informatics (ICACCI). pp. 1109–1114. IEEE, Bangalore, India (2018)
[12]   Muhammad, A., Wiratunga, N., Lothian, R., Glassey, R.: Domain-based Lexicon Enhancement
       for Sentiment Analysis. In: Proceedings of the BCS SGAI Workshop on Social Media Analysis
       2013 co-located with 33rd Annual International Conference of the British Computer Society’s
       Specialist Group on Artificial Intelligence (BCS SGAI 2013). pp. 7–18, Cambridge, UK (2013)
[13]   Amo, D., Cea, S., Jimenez, N.M., Gómez, P., Fonseca, D.: A Privacy-Oriented Local Web
       Learning Analytics JavaScript Library with a Configurable Schema to Analyze Any Edtech Log:
       Moodle’s Case Study. Sustainability. 13, 28 (2021). https://doi.org/10.3390/su13095085
[14]   Chen, J., Amo, D.: Moodle Learning Analytics, https://ls-leda.github.io/Moodle-Learning-Analytics/
[15]   Amo-Filva, D., Chen, J.: Moodle Learning Analytics. La Salle, URL (2022)
[16]   Seijas, J., Ràfols, R.: NLP.js. AXA (2022)
[17]   Cruz, F., Troyano, J., Pontes, B., Ortega, F.J.: ML-SentiCon: A multilingual, lemma-level
       sentiment lexicon. Procesamiento de Lenguaje Natural. 53, 113–120 (2014)
[18]   Nielsen, F.: AFINN, http://www2.imm.dtu.dk/pubdb/pubs/6010-full.html