A Large Language Model Implementing an AI Assistant in a
                                Higher Education Setting
                                Anna Mavroudi1,†, Gerald Torgersen2,∗

                                1 The University of Oslo (UiO), Boks 1072 Blindern, NO-0316 Oslo, Norway


                                                Abstract
                                                We are reporting on a piloting phase of creating, using, and evaluating an AI assistant for
                                                university students in the context of dental education. The assistant was meant to be used as a
                                                conversational partner with the students and it was instructed to implement the Socratic method;
                                                that is, avoiding giving answers to students, and help them obtain own understanding. We
                                                conducted a focus group discussion with a small number of students that had used the assistant.
                                                The results of our pilot pinpoint that the students used the AI assistant both as an information
                                                source as a conversation partner. Also, to new skills for both the teacher and the students to
                                                effectively use this type of technology.

                                                Keywords
                                                Large Language Model, chatGPT, artificial intelligence assistant, higher education 1


                                1. Introduction
                                Lately, Large Language Models (LLM) have attracted the attention of several researchers
                                worldwide working not only in Artificial Intelligence (AI), but also in other domains, such
                                as: business, security, and education. The main reason for the awakening of the interest of
                                the research community concerns the latest advancements on chatbots (for example
                                chatGPT from OpenAI) implementing LLM freely available online. In 2023, many prominent
                                researches and experts in AI signed an open letter calling for a six-month pause on
                                development of AI systems more capable than OpenAI’s latest GPT-4 arguing that AI is
                                advancing quickly and unpredictably. The letter garnered over 50,000 signatures [10].
                                ChatGPT is built upon the LLM technology called “Generative Pretrained Transformer”
                                (GPT). The use of tools that resemble chatbots, such as chatGPT, along with the LLM that
                                they incorporate has accelerated in most domains. Yet, LLM are in their infancy and we are
                                in a very early stage of understanding and using them [1]. The focus of this paper is in higher
                                education.


                                Proceedings of the 8th International Workshop on Cultures of Participation in the Digital Age (CoPDA 2024):
                                Differentiating and Deepening the Concept of "End User" in the Digital Age, June 2024, Arenzano, Italy
                                ∗ Corresponding author.
                                † These authors contributed equally.

                                   anna.mavroudi@iped.uio.no (A. Mavroudi); gerald.torgersen@odont.uio.no (G. Torgersen)
                                    0000-0002-3930-6336 (A. Mavroudi); 0000-0001-8335-7053 (G. Torgersen)
                                           © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


CEUR
                  ceur-ws.org
Workshop      ISSN 1613-0073
Proceedings
This paper discusses the case of an AI assistant for university students that incorporated
chatGPT with the aim of helping students with their assignments in a dental education
course. On the one side, we know from existing literature that chatGPT can generate human-
like responses that are more natural compared with most educational technologies. This
makes the tool appropriate for applications, such as AI assistants. On the other side, its
limitations include the need for large amounts of data and computational resources to train
and run the LLM [2].
 In this paper, we are describing the whole lifecycle of the AI assistant starting from its
conception, design, and development, to its use and empirical evaluation. With respect on
the first three phases, we highlight particularly the different roles that interplay and their
main tasks focusing on the division between professional developers and end users and
their collaborative interactions. Finally, we conclude with a discussion of preliminary
results on its use and evaluation. This involves the results of the thematic analysis of a focus
group discussion with final users i.e. a small group of university students. The results
indicate some interesting insights on additional learning demands that empower students
as end users to assess LLM possibilities and limits. Our hypothesis is that the development
of critical reflection around the use of this technology would be essential on behalf of the
students. The main reason for our hypothesis stems from the problem of overreliance of
end users in AI systems on the one hand [3] and on the other hand on the fact that LLM
frequently hallucinate [4]. The use of chatGPT shows that its output in many cases cannot
be regarded as secure knowledge, it is imprecise, or too general, which has consequences
for the learning activities. Academically sound knowledge must then be secured in other
ways [5]. Critical assessments must then be based on users’ previous knowledge and their
work with multiple information and knowledge resources [4].

2. Background
LLMs facilitate user interaction with generative transformers through conversational
interfaces. These transformers, such as the GPT, are constructed using reinforcement
learning and leverage large-scale online data, including human dialogues, to simulate
natural language responses. The GPT's network architecture allows it to pretrain extensive
language models using web-based textual resources like Wikipedia [7]. Pretrained models
undergo initial training on general-domain datasets, followed by fine-tuning for specific
tasks. While the GPT can generate contextually relevant text, it operates by discerning
statistical patterns rather than truly comprehending word meanings. Consequently, it can
produce coherent and contextually suitable language responses but lacks true semantic
comprehension.
   Prompt Engineering (PE) is the process of designing (or fine-tuning) effective questions
or instructions ("prompts") for AI language models with the aim of producing the desired
results [8]; herein, to help students develop their own understanding and ideas. In the
context of higher education, PE is important because it pinpoints to the fact that the success
of AI language models, like chatGPT, is not merely determined by their algorithms or data,
but also on the skills of the who creates the prompts [8]. Therefore, this technology
introduces a new role for the university teacher (prompt engineer) along with the
associated skills, a topic which is still under-researched.
   The use of pedagogical conversational agents in digital learning environments is not,
new since it dates back to the 70s. There exists a body of literature that discusses their
functions and their advantages. Regarding the former, their main teaching functions have
been categorized in [12] as: 1) motivation, 2) information, 3) information processing, 4)
storing and retrieving (information), 5) transfer of information and 6) monitoring and
guiding students. Regarding the latter, main advantages include[11]: their 24/7 availability
and their ability to respond naturally through dialogue-based systems.

3. Lifecycle of the AI assistant in the piloting phase
   The lifecycle of the AI assistant comprises the following phases: 1) conceptualization and
requirement analysis, 2) design and development, and 3) use and evaluation. We use two
actors named “Alex” and “Dora” who are actual professionals that cooperated to set up the
AI assistant.

1.      Conceptualization and requirements analysis: Our intention was to support
students in their learning process, using an approach in which the role of the tool is
primarily to respond by asking questions to promote student learning mimicking as much
as possible the Socratic method and avoiding as much as possible to provide answers to
students’ questions. We were also aiming in our approach to help students critically reflect
on the use of chatGPT for learning purposes. To do that, we needed an interdisciplinary
development team with a set of competences within AI, digital pedagogies, and the
disciplinary territory of the subject matter which was dental hygiene.

2.       Design and development: The development was based on an idea by "Alex," the
project leader, who had two roles: a university teacher and a supporter of the dental hygiene
course. Alex aimed to create an AI assistant for the dental hygiene course he was teaching.
Recognizing the importance of pedagogical integrity, Alex enlisted "Dora," a specialist in
digital pedagogy and AI in education, to support the project's educational effectiveness and
assess its outcomes. As mentioned already, our intention was not to create an agent that
would answer students’ questions, but a conversation partner. This was possible by using
the function “instruction” of the privacy-friendly chatGPT version 4 offered by the
university that both Alex and Dora work along with the following system prompt:

   “You are an intelligent teaching assistant, expert on medical physics and radiation
protection in oral radiology. You are supposed to help students learning and are not supposed
to give final answers. Give hints on the next step and inspire students to find the solution. If the
student is stuck, provide a little more help to avoid frustration. You should not give complete
answers, but help the students one step at a time to solve the problem. All answers and
communication must take place in Norwegian, and you must use questions and answers from
the given file as a source of knowledge when relevant. You must also be careful not to convey
information that is not supported by the documents in the file. The aim is to help students think
independently and critically, by giving them the tools they need to understand and solve
problems on their own.”

3.      Use and evaluation: Students got help from the AI assistant to write an essay for
their course. We conducted the evaluation with a focus group interview using an interview
protocol inspired by [6]. It comprises of the following stages:

       -   Introduction: purpose of the study, objectives, ethical considerations

       -   Main part: prior experience, quality of responses, functionality, pedagogical
           value, difficulties, affective domain (engagement, motivation), intentions of use
           in the future

       -   Closing: final questions, acknowledgment, considerations

The project team received ethical approval from the Norwegian Centre for Research Data
and the participant students gave written consent. A focus group discussion took place
using the interview protocol with five first year students that have used the AI assistant.
The focus group discussion was recorded and transcribed using a combination of a
dedicated AI tool offered by the university for the creation of the transcript and human
intervention for making corrections whenever needed in the transcript. The transcript was
qualitatively analyzed using thematic analysis comprising six phases [9]: 1) critically read
the transcript, 2) generating initial codes, 3) search for themes, 4) review themes, 5)
defining themes, 6) write up the results.

4. Preliminary results of the focus group discussion analysis
The main themes generated are:

   •   The difference in terms of student experience between the printed book, google
       search, and the AI assistant. This involves the role of other existing knowledge
       sources and what is unique for the assistant in comparison with the other sources.

“Sometimes it can take you a while to find it on google. And the information you need isn't
always on google. So chatGPT can help you with that. You just get a general overview of what
you're looking for. So it's straight to the point.” (Student 3)

“You want easy answers that you understand. Very simple and easy to explain. And you don't
always get that in textbooks or the internet. So you look up chatGPT, and you get short and
simple answers that you understand. And then you understand what's written in the
textbook.” (Student 2)

“No. And if you don't understand, it often says “it's not what you mean. Did you mean that?”
And then it's easy to say, “no, I meant this”. It's very rare that it doesn't understand. It's a
conversation you have with the chatbot.” (Student 1)
“It's like you have a teacher who asks…” (Student 4)

“It's like a different person. An assistant who helps.” (Student 5)

       •   In terms of learning demands posed, the students say that it requires critical
           thinking from them. Issues of trust were central to the discussion and in
           particular a) the degree to which the tools that implements the LLM can be
           considered a reliable knowledge source and b) what constitutes plagiarism in
           such a learning environment

About point (a):

 “You have to be careful with learning from it, that you can't... How do I say this? Source
critical? Yes, to be source critical. Because something has been wrong. Not everything is
correct. So you have to be very critical of what you get there.” (Student 5)

“But you have other sources that you refer to. I didn't trust chatGPT blindly, but I trusted it
enough to use it, to help me to understand better. You read the original text, or watch it on
YouTube, or on TV.” (Student 4)

About point (b):

“I've heard that someone has cheated with chatGPT. Because they have copied and pasted.
Yes, I agree. And that's not okay. “(Student 1)

“You can use it to steal information. Because it's easily accessible... You can tell it to write a
text about this and that. And then it can write it. So, you can just take this text. If you have an
assignment. “ (Student 3)

“It has more advantages in the learning experience (than disadvantages). And in the student
experience. Even though it's so strict with cheating. So not many people dare to abuse the
platform.” (Student 2)

5. Discussion
We report on a pilot phase of an AI assistant in the context of higher education that was
implemented using chatGPT4 along with an instruction to support and guide the students
in their assignment while avoid providing answers. The paper concludes on changes
regarding the role of end users in the era of LLMs. The main conclusions can be summarized
as follows: firstly, this new technology introduces a new role for the university teacher, that
of prompt engineer. On behalf of the students, the use of our AI assistant required critical
reflection. The students used it as a source of information, but also as a conversation partner
and this combination constitutes a different learning experience compared to other popular
means, namely the printed book and information fetched via google search.
Future plans include fine-tuning a standard LLM or using Retrieval-Augmented Generation
(RAG) with such a model. RAG is a method that improves the quality of generated text by
incorporating relevant information from external sources. To tailor the AI model, a
copyright-free dataset has been compiled, drawing from various reliable sources, including
government documents, publicly available internet materials, and Alex's teaching content.
Originally, we aimed in fine-tuning the AI assistant to the specific context and subject matter
of the course, but due to time restrictions in the piloting phase, this was not possible.

Limitations include the fact that we cannot claim generalization of findings since this was a
small piloting phase. The results are indicative of what other researchers might be
interested in focusing on in the future, since the field is still in its infancy.

References
   [1] F. Khaddage and K. Flintoff, Say Goodbye to Structured Learning chatGPT in
       Education, is it a Threat or an Opportunity?, in Society for Information Technology &
       Teacher Education International Conference, March 2023, pp. 2108-2114.
       Association for the Advancement of Computing in Education (AACE).

   [2] Thurzo, M. Strunga, R. Urban, J. Surovková, and K. I. Afrashtehfar, Impact of artificial
       intelligence on dental education: A review and guide for curriculum update.
       Education Sciences, vol. 13, no. 2, p. 150, 2023.

   [3] F. Leiser, S. Eckhardt, M. Knaeble, A. Maedche, G. Schwabe, and A. Sunyaev, From
       chatgpt to factgpt: A participatory design study to mitigate the effects of large
       language model hallucinations on users. In Proceedings of Mensch und Computer
       2023, 2023, pp. 81-90. https://dl.acm.org/doi/abs/10.1145/3603555.3603565 G.

   [4] Bansal, V. Chamola, A. Hussain, M. Guizani, and D. Niyato, Transforming
       Conversations with AI—A Comprehensive Study of ChatGPT. Cognitive Computation,
       pp. 1-24, 2024. https://link.springer.com/article/10.1007/s10639-023-11834-1

   [5] S. Ludvigsen, A.I. Mørch, and R.B. Wagstaffe, Lett å bruke – Vanskelig å forstå.
       Studenters samtaler og bruk av generativ KI: chatGPT. Institutt for pedagogikk. UV,
       UiO. Rapport, 2023.

   [6] M. Rodriguez-Arrastia, A. Martinez-Ortigosa, C. Ruiz-Gonzalez, C. Ropero-Padilla, P.
       Roman, and N. Sanchez-Labraca, Experiences and perceptions of final-year nursing
       students of using a chatbot in a simulated emergency situation: A qualitative study.
       Journal of Nursing Management, vol. 30, no. 8, pp. 3874-3884, 2022.

   [7] D. Luitse and W. Denkena, The great transformer: Examining the role of large
       language models in the political economy of AI. Big Data & Society, vol. 8, no. 2, 2021.
       https://doi.org/10.1177/20539517211047734
[8] Jacobsen, Lucas J., and Kira E. Weber. “The Promises and Pitfalls of chatgpt as a
    Feedback Provider in Higher Education: An Exploratory Study of Prompt
    Engineering and the Quality of Ai-driven Feedback.” OSF Preprints, 29 Sept. 2023.
    Web. https://doi.org/10.31219/osf.io/cr257

[9] V. Braun, and V. Clarke. Thematic analysis. In H. Cooper, P. M. Camic, D. L. Long, A. T.
    Panter, D. Rindskopf, & K. J. Sher (Eds.), APA handbook of research methods in
    psychology, Vol. 2. Research designs: Quantitative, qualitative, neuropsychological,
    and biological, pp. 57–71, American Psychological Association, 2012.
    https://doi.org/10.1037/13620-00

[10] I., Struckman and S. Kupiec. Why They're Worried: Examining Experts'
    Motivations for Signing the'Pause Letter'. arXiv preprint arXiv:2306.00891, 2023.
    URL: https://arxiv.org/abs/2306.00891

[11] F., Weber, T., Wambsganss, D. Rüttimann, and M., Söllner, 2021, December.
    Pedagogical Agents for Interactive Learning: A Taxonomy of Conversational Agents
    in Education. In International Conference on Information Systems (ICIS).

[12] S., Heidig, S. and G., Clarebout. Do pedagogical agents make a difference to student
    motivation and learning?. Educational Research Review, vol. 6, no. 1, pp.27-54, 2011.