=Paper=
{{Paper
|id=Vol-3796/short4
|storemode=property
|title=Automated Identification of Relevant Worked Examples for Programming Problems
|pdfUrl=https://ceur-ws.org/Vol-3796/CSEDM-24_paper_9053.pdf
|volume=Vol-3796
|authors=Muntasir Hoq,Atharva Patil,Kamil Akhuseyinoglu,Bita Akram,Peter Brusilovsky
|dblpUrl=https://dblp.org/rec/conf/edm/HoqPAAB24
}}
==Automated Identification of Relevant Worked Examples for Programming Problems==
Muntasir Hoq¹, Atharva Patil¹, Kamil Akhuseyinoglu², Bita Akram¹ and Peter Brusilovsky²

¹ North Carolina State University
² University of Pittsburgh
Abstract
Novice programmers can greatly benefit from using worked examples demonstrating the implementation of programming concepts
that are challenging to them. Although large repositories of effective worked examples have been generated by CS education experts,
one main challenge is identifying the most relevant worked example in accordance with the particular programming problem assigned
to a student and their unique challenges in understanding and solving the problem. Previous studies have explored similar example
recommendation approaches. Our work takes a novel approach by employing deep learning code representation models to extract
code vectors, capturing both syntactic and semantic similarities among programming examples. Motivated by the challenge of offering
relevant and personalized examples to programming students, our approach focuses on similarity assessment approaches and clustering
techniques to identify similar code problems, examples, and challenges. We aim to provide more accurate and contextually relevant
recommendations to students based on their individual learning needs. Providing tailored support to students in real-time facilitates
better problem-solving strategies and enhances students’ learning experiences, contributing to the advancement of programming
education.
Keywords
problem-solving support, program examples, code structure, code similarities
1. Introduction

Example-based problem solving is the cornerstone of intelligent tutoring systems (ITSs) within the programming domain [1]. When students encounter difficulties in problem solving, such systems aim to provide relevant examples to aid in comprehension and resolution. Traditionally, selecting these examples has relied heavily on domain experts, a time-consuming and resource-intensive process, particularly as the volume of learning content expands. However, alternative approaches have emerged, seeking to link problems and examples dynamically without expert intervention. Content-based methodologies, such as keyword-based approaches, analyze surface-level similarities but often lack the depth necessary to discern truly relevant content [2, 3]. In contrast, knowledge-based approaches investigate the semantic understanding of content, offering higher-quality links by focusing on the underlying concepts [4, 5].

The motivation for exploring innovative example selection methodologies arises from recognizing the significant benefits novice programmers can gain from worked examples that illustrate challenging programming concepts. Hosseini et al. [6] demonstrated the engagement and performance benefits of directly connecting worked examples and similar completion problems into a "bundle" with a tool called Program Construction Examples (PCEX). A more recent study [7] demonstrated that semantic similarity between connected problems and examples is one of the keys to the better problem-solving performance and persistence achieved when this connection is provided by a domain expert. In cases where worked examples and problems are not explicitly linked, it is essential to provide clear guidance to students, such as recommending semantically similar examples following a failed problem-solving attempt [8]. Despite the availability of extensive repositories of such examples curated by computer science (CS) education experts, a fundamental challenge persists: how to identify, with a scalable and reliable approach, the most relevant worked example tailored to each student's specific learning needs and the nuances of the programming problem at hand.

In response to this challenge, we aim to develop an automated recommender system that recommends the most relevant problems and examples to students when they face difficulty solving programming problems. We undertake a vector-based approach where we embed problems and examples into vector representations, preserving their structural and semantic information. To this end, we leverage a deep learning code representation model, the Subtree-based Attention Neural Network (SANN) [9], to extract nuanced similarities among programming problems and examples. We applied this model to problems and examples available in PCEX [6].

We aim to provide contextually relevant recommendations that enhance students' problem-solving abilities and enrich their learning experiences in programming education. Using the extracted vectors from SANN, we recommend similar worked examples for a problem to students based on vector similarity. To demonstrate the effectiveness of our recommendation system, we evaluated it using Top-N accuracy metrics (N = 1, 3, and 5), which measure how often the correct example, as labeled by experts, appears within the top N recommendations. Additionally, we used clustering techniques such as DBSCAN and hierarchical clustering to group similar problems and examples, aiming to reduce the manual effort required by experts. Our results suggest that this method effectively identifies similar problems and examples, enabling us to provide guidance and support to students facing similar challenges. Using these advanced techniques, we aim to bridge the gap between the vast repository of programming examples and problems and the lack of manual support for selecting resources according to the specific needs of individual students, thus fostering more effective and personalized learning experiences in programming education [10].

CSEDM'24: 8th Educational Data Mining in Computer Science Education Workshop, July 14, 2024, Atlanta, GA
mhoq@ncsu.edu (M. Hoq); aspatil2@ncsu.edu (A. Patil); kaa108@pitt.edu (K. Akhuseyinoglu); bakram@ncsu.edu (B. Akram); peterb@pitt.edu (P. Brusilovsky)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
2. Related Work

The concept of automatically connecting similar content items traces back to the pioneering work of Mayes and Kibby in the realm of educational hypertext [2, 3]. Initially, similarity-based navigation centered on keyword-level similarity, but due to its limited quality, this approach has since been supplanted by more robust semantic linking methodologies, often referred to as intelligent linking. One such approach is metadata-based linking, which computes similarity measures across various facets of metadata to generate higher-quality links [11].

In recent years, the focus has shifted towards ontology-based linking, particularly within the hypermedia research community. Ontology-based linking involves indexing documents with ontology terms and then leveraging ontological structures to identify similar documents [5, 12]. Although early investigations primarily focused on hypermedia applications, the educational domain saw a subsequent adoption of ontology-based linking methodologies [13].

In the programming domain, similarity assessment has mainly relied on content-level information. For example, Gross et al. [14] linked Java programming contents based on the similarity of their Abstract Syntax Trees (ASTs), which encompassed the entire body of examples and problems. However, this approach may overlook finer-grained similarities in smaller code fragments [15]. Recent approaches have explored concept-based similarity methodologies [16], that is, representing examples and problems as vectors of domain concepts and measuring similarity between these vectors. Further work attempted to calculate ontology-based similarity metrics for programming items [8].

With the advent of automated program representation techniques that use deep learning methodologies [9], the extraction of syntactic and semantic structural information from programming code snippets has gained traction. These techniques offer the potential to alleviate the reliance on experts for designing isomorphic problem-example pairs, instead enabling the discovery of relevant examples within learning materials. While such similarity approaches hold promise across a range of code-related programming problems [17, 9], our study focuses on code comprehension problems, wherein students are tasked with predicting the output or final value of variables of a given program, rather than on program composition problems requiring code-writing tasks.
3. Dataset

In this study, we used a Python programming dataset sourced from the PCEX system [6]. PCEX offers online access to working code examples and small completion problems referred to as "challenges". To increase learners' motivation and improve overall learning outcomes, problems and examples are further organized into bundles by domain experts, who group together problems and examples that target similar programming constructs and patterns. This combination approach was validated in previous research [6, 7], demonstrating its value across various metrics and stressing the importance of connecting learning and assessment on the level of specific programming patterns, in addition to their traditional integration on the level of broader course topics.

The PCEX dataset comprises 123 programming code problems and examples that span 13 topics, including Variables and Operations, If-Else Statements, Boolean Expressions, For Loops, and Nested Loops. The problems and examples in the dataset are organized into 52 bundles, with an average of 4 bundles per topic. Bundles start with a single fully worked-out example and are followed by 1 to 3 similar problems. On average, each bundle has 1.35 problems. We used the current content organization in PCEX, which represents expert knowledge, as the gold standard for the evaluation of content recommendation approaches [18]. Figures 1 and 2 show two program examples from the same bundle under the Variables and Operations topic. Figure 3 shows another example from a different bundle within the same topic to show the difference in the bundle structures.

Figure 1: Example 1 from the same bundle
Figure 2: Example 2 from the same bundle
Figure 3: Example 3 from a different bundle
4. Methodology

In this study, we used the Subtree-based Attention Neural Network (SANN) [9] to encode programs into vector representations. We computed the cosine similarity between these vectors to recommend the closest examples for a given problem. To further analyze and group similar problems and examples, we employed clustering techniques such as DBSCAN and hierarchical clustering.

SANN is our primary model for encoding programs in vector format. It has demonstrated its efficiency in capturing both syntactic and semantic information from programs in an interpretable manner and in understanding intricate code structure [9, 17, 19]. SANN operates by encoding the source code into vector representations using subtrees extracted from the Abstract Syntax Tree (AST) representation of the code. These subtrees undergo a two-way embedding process in which each subtree and its constituent nodes are individually embedded. The resulting embeddings are then merged into a single embedded vector. Subsequently, the embedded vectors from both approaches are concatenated and passed through a time-distributed, fully connected layer, generating subtree vectors that incorporate both node-level and subtree-level information.
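The paper does not include code for this step; purely as an illustration of what enumerating AST subtrees can look like in Python's standard ast module (SANN's optimized subtree selection itself is described in [9]), the sketch below treats every AST node as the root of a candidate subtree. The helper name and the min_size filter are our own illustrative choices.

```python
import ast

def extract_subtrees(source: str, min_size: int = 2):
    """Enumerate candidate subtrees of a program's AST.

    Every AST node roots one subtree; trivially small fragments are
    dropped. Illustrative only: SANN's optimized subtree extraction
    is described in the SANN paper [9].
    """
    tree = ast.parse(source)
    subtrees = []
    for node in ast.walk(tree):                # visits each node once
        size = sum(1 for _ in ast.walk(node))  # nodes in this subtree
        if size >= min_size:
            subtrees.append(node)
    return subtrees

code = "total = 0\nfor i in range(5):\n    total += i\nprint(total)"
for st in extract_subtrees(code):
    print(type(st).__name__)
```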
After the generation of subtree vectors, an attention neural network is employed to condense all subtree vectors into a single source code vector. The attention mechanism assigns a scalar weight to each subtree vector, facilitating the aggregation of all subtree vectors into a weighted average. These weights are determined through a normalized inner product between each subtree vector and a global attention vector, followed by a softmax function to ensure that the weights sum to 1. The resulting weighted average of subtree vectors, as determined by the attention mechanism, encapsulates the entire source code snippet. The SANN model thus leverages attention weights to prioritize the most important subtrees when generating the source code vector. We recursively extract all subtrees from an AST, ensuring comprehensive coverage of the code structure during the encoding process.
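As a minimal sketch of the attention pooling just described, assuming NumPy arrays and ignoring training details (in SANN the global attention vector is a learned parameter), the aggregation can be written as:

```python
import numpy as np

def attention_pool(subtree_vecs: np.ndarray, global_attn: np.ndarray) -> np.ndarray:
    """Condense n subtree vectors (n x d) into one code vector (d,).

    Scores are inner products with a global attention vector; softmax
    normalizes them so the weights sum to 1, and the code vector is
    the weighted average of the subtree vectors.
    """
    scores = subtree_vecs @ global_attn              # (n,)
    scores -= scores.max()                           # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax
    return weights @ subtree_vecs                    # (d,)

rng = np.random.default_rng(0)
subtrees = rng.normal(size=(7, 128))  # 7 subtree vectors of size 128
g = rng.normal(size=128)              # global attention vector (learned in SANN)
code_vec = attention_pool(subtrees, g)
print(code_vec.shape)                 # (128,)
```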
Following the extraction of code vectors, we calculated cosine similarity to find the closest example to a given problem for the recommendation. Furthermore, we utilized various clustering techniques, including DBSCAN and hierarchical clustering, to group similar problems and examples. DBSCAN is adept at identifying clusters of varying shapes and sizes while being robust to noise, whereas hierarchical clustering provides insights into the clustering structure through dendrogram analysis. Using these techniques, we aim to comprehensively explore the similarity structure within our dataset and facilitate the identification of cohesive groups of programming problems and examples.
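A hedged sketch of this similarity-based lookup and of the Top-N evaluation used later, with hypothetical names (problem_vecs, example_bundles) standing in for the SANN outputs and expert labels:

```python
import numpy as np

def recommend_examples(problem_vec, example_vecs, top_n=3):
    """Return indices of the top_n examples by cosine similarity."""
    p = problem_vec / np.linalg.norm(problem_vec)
    e = example_vecs / np.linalg.norm(example_vecs, axis=1, keepdims=True)
    sims = e @ p                           # cosine similarity per example
    return np.argsort(sims)[::-1][:top_n]  # best first

def top_n_accuracy(problem_vecs, problem_bundles, example_vecs,
                   example_bundles, n=3):
    """Fraction of problems whose top-n list contains an example
    from the same expert-identified bundle."""
    hits = 0
    for vec, bundle in zip(problem_vecs, problem_bundles):
        top = recommend_examples(vec, example_vecs, top_n=n)
        hits += any(example_bundles[j] == bundle for j in top)
    return hits / len(problem_vecs)

# Tiny demo on random stand-in data (not the PCEX vectors).
rng = np.random.default_rng(0)
print(top_n_accuracy(rng.normal(size=(10, 128)), [i % 5 for i in range(10)],
                     rng.normal(size=(40, 128)), [j % 5 for j in range(40)]))
```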
5. Experiments and Results

5.1. Code Vector Extraction and Example Recommendation

We employed the Python AST parser (https://docs.python.org/3/library/ast.html) to parse Python programming code into ASTs. For SANN training, we partitioned our dataset into 80% training data and 20% testing data. During the splitting process, we ensured that no bundle was excluded from the training set, to retain all the diverse structural variations for comprehensive training. The embedding size for both subtree-based and node-based embeddings was set to 64, chosen from {64, 128, 256}. Consequently, each source code vector was of size 128. Throughout the model training phase, we employed the Adamax optimizer [9] with a default learning rate of 0.001 to learn the weight matrices. The batch size was set to 32, and the maximum number of epochs was capped at 200, with an early stopping patience of 20, to prevent overfitting of the model.

The dataset has problems/challenges and examples bundled together based on similarity (these are called bundles), and different bundles are combined under different topics by the experts. Therefore, the dataset shows a hierarchy of topics and bundles: each topic has some bundles, and each bundle has some similar challenges and examples. If a student faces difficulty with a problem, an example from the same bundle will be recommended. We trained the SANN model using only the topic information for challenges and examples, intentionally omitting any bundle information. Although bundles encapsulate more detailed and granular-level information about the program structures, our objective was for SANN to learn this granular insight exclusively from the more superficial and abstract topic information. By training on topics alone and attempting to reconstruct the underlying bundles based on similarity in program pattern and structure, we aim to enable SANN to generalize effectively across diverse program structures and to evaluate the reconstructed groups against the expert-identified bundles. There are 13 topics and 52 bundles in the dataset. After the training, we tested the trained model on the test data to predict the associated topics; SANN showed a testing accuracy of 88%. Afterward, we extracted the source code vectors for the problems and examples from SANN for further study. Finally, we investigated the effectiveness of these vectors in forming groups of similar examples and problems that can serve as a recommendation tool. We calculated the cosine similarity to find the closest example for a given problem. If the closest example is also from the same original expert-identified bundle as the challenge, the recommended example is correct. We calculated the Top-N accuracy, where N = 1, 3, and 5, as stated in Table 1. The experimental result suggests that our recommendation can effectively find similar worked examples for a given problem when a student is facing difficulty. However, we speculate that this accuracy can be improved with a bigger dataset to train SANN, since the current dataset has only 123 challenges and examples, where the average number of examples per problem is only 0.73. We want to investigate the impact of dataset size on performance in the future.

Table 1: Top-N accuracy for recommending worked examples

  Top-N   Accuracy (%)
  Top-1   70.97
  Top-3   83.10
  Top-5   87.32

We further hypothesize that since the bundles represent very similar challenges and examples, the corresponding vectors should show these similarities by being closer to programs of the same bundle than to others. The same hypothesis is applicable to topics; however, the topics contain slightly less similar challenges and examples. Hence, the vectors of a topic should be close to each other, but not as close as those of a bundle. According to our hypothesis, the vectors in these bundles and topics should show patterns in their tightness, where tightness refers to the average distance between points of a bundle or topic. To calculate the tightness, we used the expert labels from the dataset as the gold standard to show the effectiveness of our method and verify the hypothesis. For each topic/bundle, we first calculated the pairwise distances for all the points within it. Then, we calculated the mean of these pairwise distances, which is the tightness within the vectors of the topic/bundle. Figure 4 shows the scatter plot of the bundles using PCA with 2 components.
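The tightness computation is straightforward; a small sketch using SciPy's pdist, with our own helper names, assuming the code vectors sit in a NumPy array alongside expert bundle (or topic) labels:

```python
import numpy as np
from scipy.spatial.distance import pdist

def tightness(group_vecs: np.ndarray) -> float:
    """Mean pairwise Euclidean distance within one bundle/topic."""
    return float(pdist(group_vecs).mean()) if len(group_vecs) > 1 else 0.0

def average_tightness(vecs: np.ndarray, labels: np.ndarray) -> float:
    """Average the per-group tightness over all bundles (or topics)."""
    return float(np.mean([tightness(vecs[labels == g])
                          for g in np.unique(labels)]))

rng = np.random.default_rng(0)
vecs = rng.normal(size=(123, 128))            # stand-in for SANN vectors
labels = rng.integers(0, 52, size=123)        # stand-in for bundle labels
print(average_tightness(vecs, labels))
```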
Figure 4: Bundle clusters

To verify our hypothesis, we calculated the degree of tightness for (1) vectors of the same bundle, (2) vectors of the same topic, and (3) all vectors in the dataset (the entire course). Figure 5 shows the topic-level tightness, and Figure 6 shows the bundle-level tightness. Here, we can see that bundles have lower distances, whereas topics have higher distances. We plotted the mean degree of tightness for topics, bundles, and the whole dataset in Figure 7 to get a clearer comparative view. For topics and bundles, the average tightness measures over all individual topics and bundles were calculated. The average tightness of bundles was found to be 0.4 units, and the average tightness of topics was found to be 0.8 units. This implies that the points of a bundle are much closer to each other than those of a topic. Finally, the average distance between all dataset samples was found to be 2.7 units. These results suggest that samples belonging to the same bundle are semantically very similar to each other; samples belonging to the same topic may have more variation than those of a bundle but are still more similar to each other than to samples from other topics in the course.

Figure 5: Average tightness of topics
Figure 6: Average tightness of bundles
Figure 7: Average tightness of topics and bundles

5.2. Clustering Similar Examples

We investigated the effectiveness of multiple clustering approaches in identifying bundles of similar problems and examples. First, we employed DBSCAN clustering for topics, given its capability to handle irregularly shaped clusters when the number of clusters (topics) is unknown and differently structured problems and examples can fall under the same topic. Setting the epsilon value to 0.85 and the minimum points parameter to 2, we successfully identified 13 distinct clusters based on topics, with only 2 points classified as noise. This is because we assume each topic cluster must have at least 2 points, and if some point is not in the vicinity of any other, it is better to consider it noise rather than part of some cluster. Figure 8 shows the scatter plot visualization that highlights the nonspherical nature of the clusters, indicating their irregular shapes.

Figure 8: Topic clusters using DBSCAN
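A minimal sketch of this DBSCAN configuration with scikit-learn, using the stated parameters (eps = 0.85, min_samples = 2); the random array stands in for the actual SANN vectors, so the printed counts will not reproduce the 13 clusters reported above:

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
code_vectors = rng.normal(size=(123, 128))   # stand-in for SANN vectors

db = DBSCAN(eps=0.85, min_samples=2).fit(code_vectors)
labels = db.labels_                          # -1 marks noise points

n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int((labels == -1).sum())
print(f"clusters: {n_clusters}, noise points: {n_noise}")
```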
We calculated the accuracy of the topic clustering using DBSCAN by determining a clustering error, which was assessed by comparing the assigned clusters to gold-standard clusters based on the predefined topics. Specifically, we calculated how many items were incorrectly assigned to clusters compared to their actual topic labels. The clustering error averaged 11.69% (std. dev. 0.15) over all the problems and examples. The highest clustering error for a topic was 44%, for the topic "Strings." This can be considered an outlier because the code for string programs is likely similar to that of other topics where some string operations are also required. It is important to note that three topics, "For Loops," "Nested Loops," and "Lists," were assigned to the same cluster. We found that these topics are very similar in structure and have overlapping constructs, e.g., using loops to traverse a list, or hierarchical for loops in nested loops.

Hierarchical clustering was utilized for bundles inside topics, as it allows for the exploration of hierarchical structures within the data and accommodates scenarios where the number of clusters is uncertain. DBSCAN may not be ideal here because the plotted points for bundles are unlikely to have irregular shapes, since problems and examples inside a bundle tend to be the most similar. Hierarchical clustering starts by treating each sample as a separate cluster. Then, it repeatedly executes the following two steps: (i) identify the two clusters that are closest together, and (ii) merge these two most similar clusters. This iterative process continues until all the clusters have merged together.
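A brief sketch of this agglomerative procedure with SciPy, assuming stand-in vectors; the linkage method and the cut into four flat clusters are illustrative choices, not values from the paper:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
vecs = rng.normal(size=(20, 128))   # stand-in for one topic's code vectors

# Agglomerative clustering: each sample starts as its own cluster, and
# the two closest clusters are merged repeatedly until one remains.
Z = linkage(vecs, method="average", metric="euclidean")

# Cut the merge tree to obtain flat clusters (candidate "bundles").
bundles = fcluster(Z, t=4, criterion="maxclust")
print(bundles)

# scipy.cluster.hierarchy.dendrogram(Z) would visualize the merge
# structure (requires matplotlib).
```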
The dendrogram from hierarchical clustering illustrated that samples sharing similar bundles and topics clustered closely together, with their parent clusters predominantly aligning with their respective topics. In addition, we assessed the closest sample for each item, categorizing the pairs by bundle name and topic similarity. Based on this closest-sample data, we evaluated the number of items whose closest sample had (1) the same bundle name, (2) a different bundle name but the same topic, or (3) a different bundle name and topic. As evident in Figure 9, 43.9% of items had their closest sample from the same bundle, and 30.9% had their closest pair from the same topic. However, 25.2% of samples had a closest pair from a different topic and bundle. This result suggests that samples of the same topic are closer and contained within the same local region; in addition, samples belonging to the same bundle are even closer to each other. However, discrepancies between the clustering results and the expert labels emerged when problems and examples involved multiple topics or multiple bundles, for example, the use of loops in For Loops, Nested Loops, Lists, and Strings.

Figure 9: Hierarchical clustering summary

6. Discussion

In this study, we tried to address the long-standing challenge of dynamically recommending relevant programming examples tailored to individual student needs within the context of computer science (CS) education. Our approach centered on leveraging the Subtree-based Attention Neural Network (SANN) model to extract nuanced syntactic and semantic similarities among programming examples, thus facilitating the identification of analogous examples crucial for problem-solving support. In this study, SANN was trained only on the topic information of the examples. However, the dataset used also contains bundle information, where similar problems and examples are bundled together under a topic. We used topic-level information about the problems and examples to obtain deeper structural insight using SANN, which helps to identify similar worked examples for struggling students. Using the extracted code vectors, we recommend worked examples to students for a given problem based on vector similarity. The experiment suggests that the recommendation has an accuracy of 70.97%, 83.10%, and 87.32% for the Top-1, Top-3, and Top-5 recommendations, respectively.

In addition, we show the effectiveness of these vectors by measuring the tightness of each topic and bundle in the course. The results suggest that the bundles represent very similar problems and examples, as reflected by the proximity of their corresponding vectors. In contrast, the topics contain multiple bundles with slightly less similar problems and examples. Consequently, the vectors within a topic are close to each other but not as tightly clustered as those within a bundle. We further employed clustering techniques, including DBSCAN and hierarchical clustering, to effectively group similar programming problems and examples and alleviate the expert effort of bundling them. This outcome highlights the initial effectiveness of our approach in organizing and understanding the structural and semantic relationships inherent in programming education datasets. However, with the limited training data (the current dataset has only 123 problems and examples, where the average number of examples per problem is only 0.73), our clustering and performance did not fully align with the expert labels. We hypothesize that minor structural changes and overlapping topics in smaller problems and examples could be captured more accurately with a larger dataset. Exploring this possibility is an interesting direction for future research.

The significance of our study lies in addressing a key challenge in CS education: identifying relevant and contextually appropriate programming examples [1]. By offering a methodological framework for dynamically recommending personalized examples, our study provides a scalable solution to the resource-intensive process of example selection, traditionally reliant on domain experts. Our approach effectively connects the extensive collection of programming examples with the unique needs of individual students, improving programming education by promoting more efficient and personalized learning experiences [6, 20].
7. Limitations and Future Work
There are a few limitations that need to be addressed in this study. Firstly, SANN was trained on the topics associated with each problem and example, labeled by experts. This current setup limits our ability to use a vast corpus of worked examples and programming problems that are not labeled with topics. In the future, we want to eliminate this limitation by training the SANN model in a topic-agnostic way. We propose training the model not on explicit topic information but instead on the underlying code structure, using an encoder-decoder architecture. In this approach, the encoder would process the source code to generate a latent representation that captures the structural and semantic nuances of the code. The decoder would then reconstruct the code from this latent representation. This unsupervised learning method aims to enable the model to understand and encode the intricate structure of the code more effectively, leading to better generalization and more accurate recommendations based on structural similarities rather than predefined topic labels.
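The paper does not specify this architecture beyond the encoder-decoder idea; purely as an illustration of one possible instantiation, a token-level GRU autoencoder in PyTorch could look like the following, where the vocabulary size, embedding width, and latent size are all assumed values:

```python
import torch
import torch.nn as nn

class CodeAutoencoder(nn.Module):
    """Hypothetical topic-agnostic encoder-decoder sketch.

    The encoder compresses a token sequence into a latent vector; the
    decoder is trained to reconstruct the sequence, so the latent space
    must capture code structure without any topic labels.
    """
    def __init__(self, vocab_size: int, emb: int = 64, latent: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.GRU(emb, latent, batch_first=True)
        self.decoder = nn.GRU(emb, latent, batch_first=True)
        self.out = nn.Linear(latent, vocab_size)

    def forward(self, tokens):             # tokens: (batch, seq)
        x = self.embed(tokens)
        _, h = self.encoder(x)             # h: (1, batch, latent)
        y, _ = self.decoder(x, h)          # teacher-forced decoding
        return self.out(y), h.squeeze(0)   # logits, latent code vector

model = CodeAutoencoder(vocab_size=500)
batch = torch.randint(0, 500, (8, 40))     # dummy token ids
logits, code_vecs = model(batch)
loss = nn.CrossEntropyLoss()(logits.transpose(1, 2), batch)
loss.backward()                            # reconstruction objective
```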
Additionally, when we explored clustering techniques, we observed that some worked examples are similar even though they are from different topics. This happens because some topics overlap with previously learned topics; for example, List problems might require knowledge of loops. In such cases, in the future, we might consider sub-categories of these bundles to recommend previous topics when necessary, based on the difficulty progression of a student. For example, if a student struggles with traversing a list due to difficulties using loops, they would benefit from revisiting similar examples that focus on loops from previously covered topics.
Another future direction of this work is to make the recommendations more personalized based on student knowledge. We want to track students' learning at various stages of the course and incorporate that information in recommending examples for the current problems they face. The tracing of student learning can also be on a topic level. If a student faces difficulty in a particular topic, this can be important information, along with the problem code structure, for the recommender system. In addition, struggling with the same topic can also act as an alarm for instructors, indicating that a student needs personalized intervention and support. We also intend to add baselines from the literature in a future study to show the comparative effectiveness of our framework.
8. Conclusion

In this study, we used the Subtree-based Attention Neural Network (SANN) model to recommend relevant programming examples tailored to individual student needs in computer science (CS) education. Through clustering techniques, including DBSCAN and hierarchical clustering, we effectively organized the structural and semantic relationships of problems and examples to guide the recommendation of similar practices to programming students. Our approach offers a scalable solution to the resource-intensive process of example selection, providing contextually appropriate learning resources tailored to individual student needs.
References

[1] P. Brusilovsky, C. Peylo, Adaptive and intelligent web-based educational systems, International Journal of Artificial Intelligence in Education 13 (2003) 156–169.
[2] M. Kibby, J. Mayes, Towards intelligent hypertext, Hypertext: Theory into Practice (1989) 164–172.
[3] J. T. Mayes, M. R. Kibby, H. Watson, StrathTutor: The development and evaluation of a learning-by-browsing system on the Macintosh, Computers & Education 12 (1988) 221–229.
[4] K. R. Koedinger, A. T. Corbett, C. Perfetti, The knowledge-learning-instruction framework: Bridging the science-practice chasm to enhance robust student learning, Cognitive Science 36 (2012) 757–798.
[5] L. Carr, W. Hall, S. Bechhofer, C. Goble, Conceptual linking: ontology-based open hypermedia, in: Proceedings of the 10th International Conference on World Wide Web, 2001, pp. 334–342.
[6] R. Hosseini, K. Akhuseyinoglu, P. Brusilovsky, L. Malmi, K. Pollari-Malmi, C. Schunn, T. Sirkiä, Improving engagement in program construction examples for learning Python programming, International Journal of Artificial Intelligence in Education 30 (2020) 299–336.
[7] K. Akhuseyinoglu, A. Klašnja-Milićević, P. Brusilovsky, The impact of connecting worked examples and completion problems for introductory programming practice, in: European Conference on Technology Enhanced Learning (EC-TEL 2024), Lecture Notes in Computer Science, Springer International Publishing, 2024.
[8] R. Hosseini, P. Brusilovsky, A study of concept-based similarity approaches for recommending program examples, New Review of Hypermedia and Multimedia 23 (2017) 161–188.
[9] M. Hoq, S. R. Chilla, M. Ahmadi Ranjbar, P. Brusilovsky, B. Akram, SANN: Programming code representation using attention neural network with optimized subtree extraction, in: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023, pp. 783–792.
[10] K. Muldner, J. Jennings, V. Chiarelli, A review of worked examples in programming activities, ACM Transactions on Computing Education 23 (2022) 1–35.
[11] D. Tudhope, C. Taylor, Navigation via similarity: automatic linking based on semantic closeness, Information Processing & Management 33 (1997) 233–242.
[12] M. Crampes, S. Ranwez, Ontology-supported and ontology-driven conceptual navigation on the world wide web, in: Proceedings of the 11th ACM Conference on Hypertext and Hypermedia, 2000, pp. 191–199.
[13] P. Dolog, N. Henze, W. Nejdl, Logic-based open hypermedia for the semantic web, in: Proceedings of the International Workshop on Hypermedia and the Semantic Web, Hypertext 2003 Conference, 2003.
[14] S. Gross, B. Mokbel, B. Hammer, N. Pinkwart, How to select an example? A comparison of selection strategies in example-based learning, in: Proceedings of the Intelligent Tutoring Systems: 12th International Conference, ITS 2014, Springer, 2014, pp. 340–347.
[15] G. Weber, A. Mollenberg, ELM-PE: A knowledge-based programming environment for learning Lisp, 1994.
[16] R. Hosseini, P. Brusilovsky, Example-based problem solving support using concept analysis of programming content, in: Proceedings of the Intelligent Tutoring Systems: 12th International Conference, ITS 2014, Springer, 2014, pp. 683–685.
[17] M. Hoq, Y. Shi, J. Leinonen, D. Babalola, C. Lynch, T. Price, B. Akram, Detecting ChatGPT-generated code submissions in a CS1 course using machine learning models, in: Proceedings of the 55th ACM Technical Symposium on Computer Science Education V. 1, 2024, pp. 526–532.
[18] A. J. Sabet, I. Alpizar-Chacon, J. Barria-Pineda, P. Brusilovsky, S. Sosnovsky, A. Lan, et al., Enriching intelligent textbooks with interactivity: When smart content allocation goes wrong, in: Proceedings of the 4th International Workshop on Intelligent Textbooks, volume 3192, 2022.
[19] M. Hoq, J. Vandenberg, B. Mott, J. Lester, N. Norouzi, B. Akram, Towards attention-based automatic misconception identification in introductory programming courses, in: Proceedings of the 55th ACM Technical Symposium on Computer Science Education V. 2, 2024, pp. 1680–1681.
[20] M. Hoq, P. Brusilovsky, B. Akram, Analysis of an explainable student performance prediction model in an introductory programming course, in: Proceedings of the 16th International Conference on Educational Data Mining, 2023, pp. 79–90.