<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>ChatGPT to Enhance Students' Behavior in Social Media via the Moral Foundation Theory</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Daniele Schicchi</string-name>
          <email>daniele.schicchi@itd.cnr.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Apoorva Upadhyaya</string-name>
          <email>upadhyaya@l3s.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Fisichella</string-name>
          <email>mfisichella@l3s.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Davide Taibi</string-name>
          <email>davide.taibi@itd.cnr.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute for Education Technology, National Research Council of Italy</institution>
          ,
          <addr-line>Palermo</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>L3S Research Center, Leibniz University</institution>
          ,
          <addr-line>Hannover</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <abstract>
        <p>Social media poses numerous dangers, such as the spread of toxic content, hate speech, false information, and moral outrage. In particular, the propagation of moral outrage on social media can result in harmful outcomes like promoting conspiracy theories, violent protests, and political polarization. Teenagers frequently use social media and are especially susceptible to these threats, being exposed to several risks that can impact their lives. Recent studies have confirmed that a combination of human and AI eforts can be highly efective in several domains. In this study, we examine human-AI collaboration's potential and eficacy in detecting morality in online posts. To this end, we conducted a pilot study with teenagers to determine their ability to recognize moral content in online posts. Consequently, we exploited Prompt Engineering to program ChatGPT to recognize morality and comprehend its potential as an intelligent tutor in supporting students in such a context.</p>
      </abstract>
      <kwd-group>
        <kwd>Human-AI collaboration</kwd>
        <kwd>social media platforms</kwd>
        <kwd>moral classification</kwd>
        <kwd>ChatGPT</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Social media (SM) are not immune to harmful elements such as online bullying, moral outrage,
cyber harassment, and hate speech. These toxic components can lead to severe implications in
the real world, such as depression and anxiety, especially for young individuals who are more
susceptible to the efects of such phenomena [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Prior research has placed considerable emphasis
on moral sentiments when analyzing social media discourse providing insightful perspectives on
significant occurrences such as the practice of social distancing and the spread of misinformation.
Evaluating moral sentiments on SM has been proven useful for several tasks such as describing
the comparative/cooperative attitude towards the pandemic [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], comprehending public opinions
expressed on various societal issues [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and measuring the social distance between people [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
Therefore, the influence of moral values cannot be underestimated as they profoundly afect
∗Corresponding author.
†These authors contributed equally.
CEUR
Workshop
Proceedings
people’s outlooks and actions. Whether acknowledged or not, these values significantly shape
people’s positions and decisions on various issues. As such, it is imperative to scrutinize the
prevalent moral sentiments on social media platforms. This measure is essential in safeguarding
young individuals from experiences that may negatively impact their well-being [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
Recent advances in Artificial Intelligence have paved the way for new methodologies to help
in education [6, 7]. Recent intelligent systems rely on Large Language Models (LLMs). These
systems can provide answers to a wide array of questions across various subject areas. They
are designed to be user-friendly, with natural language interfaces that allow users to interact
with the system conveniently. ChatGPT [8] is an exceptional chatbot that harnesses the power
of LLMs to generate digital content that is both relevant and accurate with impressive speed.
Moreover, it ensures a seamless and eficient communication experience by ofering text-based
responses to user queries, setting it apart as a truly advanced and valuable tool for every type
of user.
      </p>
      <p>ChatGPT can be a valuable tool in education. It can help students understand complex concepts,
gather preliminary information on research topics, and support writing tasks by generating
outlines and improving sentence structure. Since its high performance, ChatGPT is a perfect
candidate to enhance the learning experience, and its usage is getting more frequent. The
main interface to interact with ChatGPT is a prompt, a set of instructions to program the LLM
customizing, enhancing and/or refining its capabilities [ 9]. A prompt has a significant impact
on the behavior of the LLM. It defines the rules and guidelines the system must follow and sets
the context for the conversation, indicating to the LLM what information is crucial and what
the desired output form and content should be.</p>
      <p>In this paper, we aim to analyze the capability of ChatGPT to act as a companion to detect
moral content on social media. To achieve this objective, we analyzed how ChatGPT classifies
the content when diferent prompt patterns are used. Using the Moral Foundation Twitter
Corpus as the ground truth of our experiment, we evaluated the F1 score to ponder the efects
of precision and recall and determine ChatGPT performance in this task.</p>
      <p>The paper is organized as follows: in section 2 we give an overview about prompt engineering
and the usage of ChatGPT in education. In section 3 we describe the method that have conducted
to the results that are analyzed in section 4. Finally, in section 6 we give the final discussion
and conclusion.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Literature Review</title>
      <sec id="sec-3-1">
        <title>2.1. Importance for Adolescents in Detecting Moral Content on Social Media</title>
        <p>Previous literature has applied AI techniques to combat societal issues such as the ones relative
to climate change and communal riots resulting from misinformation and hate speech on social
media [10, 11, 12, 13]. The presence of immoral and toxic content in online communication
eventually lead to harmful efects, especially on teenagers such as teen depression, teen anxiety
disorder, mental illness and more1. Moreover, a recent survey2 revealed how social media users
1https://www.newportacademy.com/resources/well-being/effect-of-social-media-on-teenagers/
2https://www.medienanstalt-nrw.de/fileadmin/user_upload/NeueWebsite_0120/Themen/Hass/Ergebnisbericht_
forsa-Befragung_zu_Hate_Speech_im_Internet_2022.pdf
aged 14-24 engage with hate speech due to its entertainment value. Therefore, it is crucial for
teenagers to recognize moral/non-moral content in social media and be warned to exercise
caution when consuming such content. This prompted us to use the moral task and conduct a
pilot study that analyzes teenagers’ and AI performance in detecting moral sentiments.</p>
      </sec>
      <sec id="sec-3-2">
        <title>2.2. ChatGPT: Prompt Engineering and Student Assistance</title>
        <p>Language models such as ChatGPT exploit prompts to personalize the model’s behavior and
shape its outputs. A prompt [9] is defined as a set of instructions that programs the LLM to
enhance and refine its capabilities. [ 14] review the most important patterns that can be used in
the context of LLM. In our work, we focused on the categories of Interaction and Customization
patterns. The Interaction category pattern allows the user to interact with the LLM in diferent
ways, for example, by letting the LLM ask questions to the users to accomplish a task. The
Customization category includes patterns that allow the user to limit the output to meeting
specific types, and structures, such as indicating the LLM to play as a specific person. Although
LLMs have shown drawbacks for students, such as cheating, plagiarism, and overreliance [15],
the education field has also benefited from their use. ChatGPT [ 16], the most famous LLM
nowadays, has made significant contributions to help students improve their competencies [ 17].
Students have been using ChatGPT to enhance their learning activities by quickly referencing
it and utilizing it for self-study [18]. One of the recent works [19] uses ChatGPT to improve
users’ moral judgment and decisions. The experiments in the study ask ChatGPT questions
such as “whether it is right to sacrifice one person’s life to save those of five others” or other
issues with the questions on abortion etc. and further prove that ChatGPT corrupts rather than
improves the moral judgement of its users. These works motivated us to investigate whether
ChatGPT can support students in classifying moral content on social media platforms.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Methods</title>
      <sec id="sec-4-1">
        <title>3.1. Dataset</title>
        <p>This study uses the publicly available Moral Foundations Twitter (MFT) corpus [20]. The
dataset was annotated by at least three trained annotators, with the majority for each label
considered ≥ 50%. The annotators considered morality as a set of vices and virtues according
to Moral Foundation Theory [21], including care–harm, fairness–cheating, loyalty–betrayal,
authority–subversion, and purity–degradation and provided the annotations for each tweet
with the presence of the vice-virtues or non-moral. Specifically, each tweet was assigned a label
indicating the presence or absence of each virtue and vice, or a label indicating that the tweet
was non-moral. This resulted in a set of 11 labels for each tweet. However, to simplify the study
and follow the existing work [22], we also consider the moral task as a binary classification
task, where each tweet belongs either to the moral class (belonging to any moral vice/virtue
category) or to the non-moral class. As a result, every tweet in the dataset falls into one of
two classes: moral or non-moral. We consider the labels (moral/non-moral) of the dataset as
ground truth and use these labels to evaluate the performance of students and ChatGPT in
the binary task of moral classification. Due to time constraints concerning the execution time,
we randomly selected 200 tweets from the publicly available MFT corpus and used these for
experiments involving ChatGPT in our research. In addition, to compare students’ performance
with each other and with ChatGPT, we use the subset of 15 tweets for the moral classification
task (moral/non-moral) as a part of the preliminary research for the pilot study to get a better
understanding of the performance in qualitative terms.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Large Language Model- ChatGPT</title>
        <p>We use ChatGPT (GPT-3.5-Turbo) model using Azure OpenAI3 Service to classify the tweet
text. Our current study only examines ChatGPT’s performance in a zero-shot setting to explore
its ability without training. However, we plan to analyze ChatGPT in one-shot and few-shot
settings as part of our future research. Moreover, we utilized mixed prompt patterns by taking
advantage of the ability of LLMs to mimic diferent personas and the capacity to adjust prompts
by gathering more data for improved results. In detail, we leveraged the persona pattern and
the Flipped Interaction Pattern. The Persona Pattern (PP) allows users to generate personalized
outputs from a specific perspective. The purpose of this pattern is to give the LLM a specific
character that guides its output and prioritizes details. We have also used the Flipped Interaction
Pattern (FIP) to manage the conversation with ChatGPT to achieve the best prompt for our goal.
In the FIP, after the user provides a brief description of the problem, the LLM takes charge of
the conversation by asking specific questions that assist in creating an efective prompt capable
of achieving the desired task.</p>
      </sec>
      <sec id="sec-4-3">
        <title>3.3. Pilot Study Framework</title>
        <p>A pilot study involving 67 adolescents from a secondary school was conducted for our study.
Each participant was assigned a unique identifier and provided a link for direct access to
the platform. The platform was designed as a basic user interface containing tweet text and
moral/non-moral radio buttons to allow each participant to provide their labels for the binary
moral classification task. Each student was given a similar set of 15 tweets in a random fashion
order to classify the moral/non-moral content. We do not capture any personal data concerning
the privacy of each individual. Further, all students were asked to provide informed consent to
participate in agreement with the Data Protection Oficer of our University.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Results</title>
      <p>We first analyzed ChatGPT’s results on 200 tweets dataset based on the diferent contexts. Then
we compared the performance of students and ChatGPT on 15 tweets as a part of the pilot study.
We use accuracy, macro variant of precision, recall, and F1 score as performance metrics.</p>
      <sec id="sec-5-1">
        <title>4.1. Performance of ChatGPT based on diferent context (roles/personas)</title>
        <p>From Table 1, it can be seen that the performance of ChatGPT is diferent when we specify
diferent roles and personas in ChatGPT’s prompt. It is observed that ChatGPT performs best
3https://learn.microsoft.com/en-us/azure/ai-services/openai/overview
when it acts as a tutor for adults, resulting in 3.05% performance improvement in F1 score
compared to other scenarios. We also noticed that ChatGPT does not perform well enough in its
role as a tutor for children and adolescents with F1 scores of 74.65 and 73.57 respectively, even
though these age groups are the ones that need more guidance and are the main focus groups
for imparting education on the morality/non-morality of the content. As ChatGPT’s ability
to identify moral content varies across roles and performs poorly in the case of children and
young people, this provided us with a future research direction to explore ChatGPT’s generated
explanations to explore any significant diferences in helping kids and teenagers understand
the moral/non-moral content.</p>
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Pilot Study Analysis: Performance Comparison of Students and ChatGPT</title>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Limitations</title>
      <p>In this study, we aim to investigate the extent to which ChatGPT can help students understand
moral content in social media. However, there are certain limitations in our current work. Due
to the disadvantage of the short length of tweets, students and ChatGPT may have trouble
identifying whether a tweet is moral or non-moral content. Better training of students on
moral content and a few-shot training scenario for ChatGPT could be beneficial to improve
the performance of both students and ChatGPT in the classification task. While we use the
GPT-3.5-Turbo model for the experiments, an investigation with a newer version of ChatGPT,
such as GPT4 and other more advanced LLMs, could reveal better qualities of the LLMs’ to
classify the moral content. Time constraints limit the number of students and the dataset size
for the pilot study. However, we plan to use a larger number of informative tweets from the
ground-truth dataset to evaluate the performance of LLMs in identifying moral content.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Discussion and Conclusion</title>
      <p>In this research, we conduct a preliminary study whose main objective is to investigate the
capabilities of ChatGPT to guide students, especially children and adolescents, in identifying
moral/non-moral content disseminated on social media platforms such as Twitter. We first
analyze the performance of ChatGPT based on diferent roles and personas. We then conduct a
pilot study in which students from the teenage group categorize tweets into moral/non-moral
content. Our preliminary results suggest that ChatGPT performs better when it behaves as
a tutor for adults rather than as a tutor for children and teenagers. Moreover, we find that
ChatGPT outperforms the average student when acting as a trained AI model and teenager,
but 20.89% of students still perform better than ChatGPT’s best results. However, a large
proportion of students (53.73%) perform even worse than the average scores for all students.
Such results prompted us to analyze in more detail how students and ChatGPT understand the
moral and non-moral labels separately. This first round of analysis has also led us to explore
the explanations ChatGPT gives in the case of a tutor for children or young people to gain a
deeper understanding of the impact of diferent personas on ChatGPT’s evaluations. We will
consider this as part of our future research. Hence, in this study, we perform investigations
from a higher-level perspective, and we will look into in-depth details in our future works.
[6] C. Limongelli, D. Schicchi, D. Taibi, Enriching didactic similarity measures of concept maps
by a deep learning based approach, in: 2021 25th International Conference Information
Visualisation (IV), IEEE, 2021, pp. 261–266.
[7] D. Schicchi, B. Marino, D. Taibi, Exploring learning analytics on youtube: a tool to support
students’ interactions analysis, in: Proceedings of the 22nd International Conference on
Computer Systems and Technologies, 2021, pp. 207–211.
[8] OpenAI, 2023. URL: https://chat.openai.com/.
[9] P. Liu, W. Yuan, J. Fu, Z. Jiang, H. Hayashi, G. Neubig, Pre-train, prompt, and predict: A
systematic survey of prompting methods in natural language processing, ACM Computing
Surveys 55 (2023) 1–35.
[10] A. Upadhyaya, M. Fisichella, W. Nejdl, A multi-task model for emotion and ofensive aided
stance detection of climate change tweets, in: Proceedings of the ACM Web Conference
2023, 2023, pp. 3948–3958.
[11] A. Upadhyaya, M. Fisichella, W. Nejdl, A multi-task model for sentiment aided stance
detection of climate change tweets, in: Proceedings of the International AAAI Conference
on Web and Social Media, volume 17, 2023, pp. 854–865.
[12] A. Upadhyaya, M. Fisichella, W. Nejdl, Towards sentiment and temporal aided stance
detection of climate change tweets, Information Processing &amp; Management 60 (2023).
[13] A. Upadhyaya, M. Fisichella, W. Nejdl, Intensity-valued emotions help stance detection
of climate change twitter data, in: Proceedings of the Thirty-Second International Joint
Conference on Artificial Intelligence, IJCAI 2023, ijcai.org, 2023, pp. 6246–6254.
[14] J. White, Q. Fu, S. Hays, M. Sandborn, C. Olea, H. Gilbert, A. Elnashar, J. Spencer-Smith,
D. C. Schmidt, A prompt pattern catalog to enhance prompt engineering with chatgpt,
arXiv preprint arXiv:2302.11382 (2023).
[15] A. Tlili, B. Shehata, M. A. Adarkwah, A. Bozkurt, D. T. Hickey, R. Huang, B. Agyemang,
What if the devil is my guardian angel: Chatgpt as a case study of using chatbots in
education, Smart Learning Environments 10 (2023) 15.
[16] N. Fijačko, L. Gosak, G. Štiglic, C. T. Picard, M. J. Douma, Can chatgpt pass the life support
exams without entering the american heart association course?, Resuscitation 185 (2023).
[17] C. K. Lo, What is the impact of chatgpt on education? a rapid review of the literature,</p>
      <p>Education Sciences 13 (2023) 410.
[18] S. Nisar, M. S. Aslam, Is chatgpt a good tool for t&amp;cm students in studying pharmacology?,</p>
      <p>Available at SSRN 4324310 (2023).
[19] S. Krügel, A. Ostermaier, M. Uhl, Chatgpt’s inconsistent moral advice influences users’
judgment, Scientific Reports 13 (2023) 4569.
[20] J. Hoover, G. Portillo-Wightman, L. Yeh, S. Havaldar, A. M. Davani, Y. Lin, B. Kennedy,
M. Atari, Z. Kamel, M. Mendlen, G. Moreno, C. Park, T. E. Chang, J. Chin, C. Leong, J. Y.
Leung, A. Mirinjian, M. Dehghani, Moral foundations twitter corpus: A collection of 35k
tweets annotated for moral sentiment, Social Psychological and Personality Science 11
(2020) 1057–1071.
[21] J. Graham, J. Haidt, S. Koleva, M. Motyl, R. Iyer, S. P. Wojcik, P. H. Ditto, Chapter two
- moral foundations theory: The pragmatic validity of moral pluralism, volume 47 of
Advances in Experimental Social Psychology, Academic Press, 2013, pp. 55–130.
[22] J. W. Burton, N. Cruz, U. Hahn, Reconsidering evidence of moral contagion in online social
networks, Nature Human Behaviour 5 (2021) 1629–1635.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Verma</surname>
          </string-name>
          , et al.,
          <article-title>Impact of social media on mental health of adolescents</article-title>
          ,
          <source>Journal of Pharmaceutical Negative Results</source>
          (
          <year>2022</year>
          )
          <fpage>779</fpage>
          -
          <lpage>783</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>O.</given-names>
            <surname>Araque</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kalimeri</surname>
          </string-name>
          , L. Gatti, Sentiment and moral narratives during covid-
          <volume>19</volume>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Rezapour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Dinh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Diesner</surname>
          </string-name>
          ,
          <article-title>Incorporating the measurement of moral foundations theory into analyzing stances on controversial topics</article-title>
          ,
          <source>in: Proceedings of the 32nd ACM conference on hypertext and social media</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>177</fpage>
          -
          <lpage>188</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Dehghani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Johnson</surname>
          </string-name>
          , J.
          <string-name>
            <surname>Hoover</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <string-name>
            <surname>Sagi</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Garten</surname>
            ,
            <given-names>N. J.</given-names>
          </string-name>
          <string-name>
            <surname>Parmar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Vaisey</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Iliev</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Graham</surname>
          </string-name>
          ,
          <article-title>Purity homophily in social networks</article-title>
          .,
          <source>Journal of Experimental Psychology: General</source>
          <volume>145</volume>
          (
          <year>2016</year>
          )
          <fpage>366</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Taibi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Fulantelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Monteleone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schicchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Scifo</surname>
          </string-name>
          ,
          <article-title>An innovative platform to promote social media literacy in school contexts</article-title>
          ,
          <source>in: ECEL 2021 20th European Conference on e-Learning</source>
          , Academic Conferences International limited,
          <year>2021</year>
          , p.
          <fpage>460</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>