Analyzing Parsons Puzzle Solutions using Modified Levenshtein's Algorithm Salil Maharjan Amruth Kumar Ramapo College of New Jersey Ramapo College of New Jersey smaharj3@ramapo.edu amruth@ramapo.edu ABSTRACT solution. The series of edit distances computed after each puzzle- We adapted Levenshtein's algorithm to compute edit distance of solving step taken by the student provides a progress report of the student solution from the correct solution in Parsons puzzles with student solving the puzzle. the intention of using edit distance as an estimate of the degree of Since edit distance is computed in terms of the number of correctness of a student’s solution. We modified the algorithm to operations needed to covert one string to another, our interest was eliminate substitution operation, which is not allowed in Parsons in finding an edit distance algorithm that would consider the puzzles, and include reordering operation which is allowed. We operations (and only the operations) allowed when solving a used the solution log data from the tutor to reconstruct each step Parsons puzzle: insertion (when a student inserts another line of that the student took to solve the puzzle and applied the modified scrambled code into the solution), deletion (when the student algorithm to compute the edit distance for each step to generate deletes a line of code from the solution) and reordering (when a edit distance trails of student solutions. We used these edit student reorders lines of code within the solution). We wanted to distance trails to represent student solutions and applied k-means use this algorithm to generate edit distance trails of student clustering to find patterns. The analysis was conducted on the data solutions in order to find interpretable patterns from student log collected by a tutor on selection statements over four data from the tutor by using clustering techniques. years. We found interpretable patterns among complete solutions, of optimal versus sub-optimal solutions, based on the inclusion of optional lines of code. Among incomplete solutions, we found 2. MODIFYING LEVENSHTEIN’S patterns of known puzzle-solving behaviors. Edit distance trails ALGORITHM helped identify student patterns regardless of the sequence of The edit operations allowed in a Parsons puzzle are 1) insertion of individual statements manipulated. However, by being puzzle- a statement into the solution 2) deletion of a statement from the independent, they lose the ability to identify puzzle-specific solution and 3) reordering of a statement within the solution. The information. edit distance of a student’s solution from the correct solution is the number of these actions necessary to reach the correct solution Keywords from the student solution. Edit-Distance, Modified Levenshtein’s Algorithm, K-means In order to calculate edit distances, we modified Levenshtein’s clustering, Patterns in puzzle solutions algorithm [9]. Levenshtein’s algorithm calculates edit distance based on three operations: insertion, deletion and substitution. We 1. INTRODUCTION modified the algorithm to eliminate substitution and incorporate Edit distance is a widely used string similarity measure to quantify reordering operation as substitution is not an operation permitted how dissimilar two strings are based on the number of operations in the Parsons puzzle tutor, but reordering is. required to convert one string to another. It can be used to Levenshtein’s algorithm starts with a m x n matrix, where m and n compute the degree of correctness of a student’s answer in are the sizes of the two strings being compared. It initializes the Parsons Puzzle. matrix by filling the first row and column by row/column number and filling the cells row by row using the minimum of the In a Parsons puzzle, the lines of a correct program are scrambled following three states [9], given (i, j) is the matrix index: and presented to a student, who is tasked with reassembling the code in its correct order. In this scenario, edit distance is the 1. Insertion: largest when the student starts solving the puzzle, and reduces to 0 when the solution is complete and correct. So, edit distance is an 2. Deletion: indicator of how close a student’s solution is to the correct 3. Substitution: , when ( ) If the characters being compared are not the same, a unit cost is added. If they are the same, matrix(i, j) = matrix(i-1, j-1). For example, in Table 1, the edit distance to convert character ‘B’ to ‘A’ is at matrix (1, 1). It is computed as the minimum of three operations: insertion – cost at matrix (1, 0), deletion – cost at matrix (0, 1), and substitution – cost at matrix(0, 0). The minimum of these three costs is at matrix(0, 0). Therefore, matrix Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). (1, 1) is set to 0 + 1 = 1 after adding unit cost for substitution Table 2 shows the trace back from (m, n) to (0, 0), where the operation since ‘B’ ≠ ‘A’. length of both strings is 3. At (3, 3), the minimum value among matrix (3,2) and matrix (2, 3) is 3. Since both cells have the Omitting substitution operation: We modified the algorithm to minimum value, either cell can be chosen to visit next. In this remove substitution operation and compute the minimum from example, assume that the cell to the left, i.e., matrix (3, 2) is only two operations - insertion (matrix (i, j-1)) and deletion visited next. This highlights an insertion operation of character (matrix (i-1, j)). To convert character ‘B’ to ‘A’ without ‘C’. The hash map is updated with key C and value ‘1’, followed substitution, we require two operations: deletion followed by by another insertion of ‘B’. At index (3, 1), the character is ‘A’ in insertion, giving us an edit distance of 2. We compute this as the both (row and column) strings. Therefore, the algorithm moves minimum of matrix (0, 1) and matrix (1, 0), and add a unit cost diagonally without any cost. Next, consecutive deletion operations since ‘B’ ≠ ‘A’, yielding an edit distance of 2 for cell (1, 1). The are carried out to trace back to (0, 0). The resulting hash map has algorithm repeats this process for all the cells in row-major order. the following values: Table 1. Levenshtein distance matrix with substitution Table 3. Hash map record of trace back of (Table 2.) Col.(j) 0 1 2 3 Key C B E B Row(i) - - A B C Value 1 1 -1 (-1) 0 - 0 1 2 3 Upon reaching matrix (1, 0) (Table 2), a deletion operation of character ‘B’ is recorded to reach position matrix (0, 0). Since the 1 B 1 1 1 2 hash map already has an entry for ‘B’, and the value for it is 2 E 2 2 2 2 positive corresponding to insertion of ‘B’, the subsequent deletion of ‘B’ implies a reordering operation of ‘B’. Since insertion is 3 A 3 2 3 3 followed by deletion, character ‘B’ is reordered to a later index in the string. The string transformation can be summarized as: Table 2. Levenshtein distance matrix without substitution (with trace back) Table 4. Transformation of string “BEA” to “ABC” Col.(j) 0 1 2 3 Operation String (Source string) BEA Row(i) - - A B C Insertion of C BEAC Insertion of B BEABC 0 - 0 1 2 3 Deletion of E BABC 1 B 1 2 1 2 Deletion of B (Target string) ABC 2 E 2 3 2 3 In every case where the entry is positive for a particular key in the 3 A 3 2 3 4 hash map and a deletion operation is performed or where the entry is negative and an insertion operation is performed, a reordering operation is identified and the edit distance is decremented by Adding reordering operation using trace back: A reordering one. As seen in Table 3, operations do not have to be back-to- operation in Parsons puzzle can be broken down into two back. When the operations on a character are back-to-back, it consecutive operations – insertion and deletion: signifies a transposition of the character, i.e., reordering by a single position. • Insertion followed by a deletion of the same character later in the string is a moving operation towards the front of the The algorithm has two boundary conditions, when either ‘i’ or ‘j’ string. reaches 0. When j reduces to 0, we have a left column boundary condition: since no insertion operations are possible, the hash map • Deletion followed by an insertion of the same character later is updated with deletion operation based on the character from the in the string is a moving operation towards the end of the source (row) string. Table 2 exemplifies a column boundary string. condition. Similarly, when i reduces to 0, we have an upper row boundary condition: since no deletion operations are possible, the To identify reordering operations, we traced back from the end of hash map is updated by recording insertion operation using the the matrix (m, n), to the initial position (0, 0) and used a hash map characters in the target (column) string as the key. to determine that the insertion and deletion operations had been applied back to back to the same line. If they were, we counted Several other extensions of Levenshtein’s algorithm have been the insertion and deletion operations as one reordering operation. attempted before. Damerau-Levenshtein algorithm [4] extends Levenshtein’s algorithm by considering adjacent character For insertion operation, the character from the target string (row transpositions as another operation. Transposition is a special case character) at the current position is used as the key in the hash of reordering, where the reordering is done by just one character. map and its value is incremented. For a deletion operation, the We needed an algorithm that treated reordering by any number of character from the source string (column character) of which edit characters as a single-cost operation. distance is to be calculated is used as the key in the hash map and its value is decremented. A constant unit cost is used for each operation. Several modifications have dealt with the general case of two numbers and print the smaller value among them. The student reordering an entire substring. Shapira et al. provide a polynomial is expected to arrange the scrambled lines of code that contains if- time greedy algorithm to move substrings [3]. They identify this else statements in the correct order. problem as NP-complete. Since it uses a greedy strategy to handle move operations, the algorithm is unable identify all move cases; it rather gives an approximation. Comrode and Muthukrishnan provide a general approach that is subquadratic and deterministic called edit-sensitive-parsing (ESP) [5]. The algorithm approximates the edit distance with moves in O(n logn). Takabatake et al. [8] further optimize the index structure used in the ESP technique to make the algorithm near linear time. These modifications to accommodate reordering of substrings are more general than what we need in Parsons puzzles, where, only one character is reordered at a time. Because of the NP-complete nature of the problem of reordering substrings, these algorithms approximate edit distance calculations. For our problem, we were interested in exact calculation of edit distance, while restricting the moved substring to a single character, i.e., line of code. Figure 1. Edit distance trail graphical representation. So we simply looked at ways to identify move operations using the dynamically filled matrix generated by Levenshtein algorithm. The edit distance trail of student “Anon 25” is [14, 13, 12, 11, 10, We incorporate backtracking to the algorithm to effectively track 10, 9, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0] which has a length of 17. reordering operations in a Parsons’s puzzle. Dynamic Similarly, student “Anon 8” has a longer trail of length 26 that programming and backtracking are more expensive in terms of does not converge to an edit distance of value 0. This shows that time and memory than the approaches mentioned above but we student Anon25 was able to solve the puzzle in 17 steps whereas were able to correctly identify reordering operations as single-cost student Anon8 took a total of 26 steps but was unable to solve the operations in the scope of our problem. puzzle. Using this edit distance trail, we are able to identify the 3. COMPUTING EDIT DISTANCE TRAILS correctness of each action taken by the student. By doing so, we A student’s solution of a Parsons puzzle is logged as a sequence are able to plot the problem solving strategy of various students of actions such as: for different types of problems. 1. Moved from problem to solution at line 7: short firstNum 4. AN APPLICATION OF EDIT DISTANCE 2. Moved from problem to solution at line 9: TRAILS short secondValue; For this study, we used data collected by a Parsons puzzle tutor on 3. Moved from problem to solution at line 10: if-else statements from a suite of such tutors available online cout << "Enter the first value"; called epplets (epplets.org) [1]. In the tutor, the student was tasked 4. Reordered from line 10 to 12: with solving Parsons puzzles using drag-and-drop actions. The cout << "Enter the first value"; student was required to solve each puzzle completely and The tutor logs the sequence of actions taken by students to solve correctly before going on to the next puzzle. If the student took a each puzzle. Student are tasked with solving Parsons puzzles lot of redundant actions to solve a puzzle, the tutor scheduled using drag-and-drop actions. We wrote a program to reconstruct additional similar puzzles for the student to solve. If the student the partial solution of the student after each action, the partial took more than twice as many actions as necessary to solve a solution being the program the student had assembled so far for puzzle, the tutor offered the student the option to bail out. If a the puzzle. Next, we computed the edit distance of each partial student bailed out, the solution was marked as incomplete and the solution from the correct solution for the puzzle. If a puzzle had student was presented additional similar puzzles. multiple correct solutions, we computed the distance of each The first 3 puzzles presented by the tutor were on the following partial solution from the specific correct solution eventually programs, listed here along with a unique puzzle id associated reached by the student. The resulting edit distance trail of a with each puzzle: student for a puzzle with 6 lines of code might look like this: 1. A program to read two numbers and print the smaller value [6, 5, 5, 4, 5, 4, 3, 2, 3, 2, 1, 0] among them (puzzle id 2005). Since not everyone solved each puzzle with the same number of 2. A program to read numerical grade, convert it to letter grade actions, the length of the edit distance trail varied from student to - A (90 and up), B (80-89), C (70-79), D (55-69) and F student. But, the minimum length of the trail was n + 1 where n otherwise - and print it (puzzle id 2105). was the number of lines in the correct solution of the puzzle. 3. A program to read a number and print whether it is odd or Figure 1 shows a graphical representation of two edit distance even (puzzle id 2000). trails. These trails show the progress report of two different The first and third puzzle was on if-else statement and the second students who attempted the 2005 template problem on if-else puzzle was on nested if-else statements. If a student solved the statements. The 2005 template problem asks the student to read first puzzle with too many redundant actions, the tutor presented additional puzzles, the first of which was the third puzzle – it was around if-clause and else-clause. Both the pairs of braces were similar to the first puzzle. Each puzzle had two distracters – lines optional since both the clauses contained a single statement. So, of code that did not belong in the solution. the puzzle could have been solved with 14 (both pairs included), The tutor was used by our introductory programming students as 12 (only one pair included) or 10 (neither pair included) lines - All after-class assignments. For this study, we used the data collected three versions were accepted as correct. Therefore, all the trails start at a value between 10 and 14 in Figure 2. Since a puzzle with by the tutor over eight semesters: Fall 2015 – Spring 2019. We n lines can be optimally solved with n actions, cluster 1 (leftmost included data from only the students who gave permission for centroid in Figure 2) with a mean of 17.26 actions included all the their data to be used for research purposes. Students could use the optimal solutions. tutor as often as they wished. Students used the tutor in four In the figure, data points at 15 or 16 correspond to the start of different languages: C, C++, Java and C#. We combined the data trails in which students inserted one or both distracters into the from all four languages in our analysis. When a student used the solution before inserting any lines of code that actually belonged tutor multiple times, data from all the sessions was included in the in the solution. Each data point is part of one of more trails – study. In all, 1068 students used the tutor during those eight when a data point is shared among trails of different clusters, the semesters. 275 students withheld permission for use of their data colors of the different clusters have blended. Since our interest during at least one session (but, may have given permission was in finding patterns in the trails, i.e., centroid curves, and not during other sessions). distribution of data, we used a regular graph rather than a bubble In order to find patterns among edit distance trails, we used k- chart. means clustering in scikit-learn Python package. Since edit distance trails were not all of the same length, we padded trails at the end with -1 so that all the trails were of the length of the longest trail for the puzzle. We used elbow method to determine the optimum number of clusters k. k-means algorithm clustered the trails in n-dimensional hyperspace, wherein n was the uniform length of all the edit distance trails. So, the algorithm clustered edit distance trails, not individual edit distances in the trails. After grouping edit distance trails into clusters, we computed centroid curve of each cluster, which represents the pattern or archetype trail of the cluster. For the calculation of centroid curve, we ignored the -1 values used to right-pad the trails so that they would not affect the shape of the curve. We analyzed the edit distance trails of each puzzle separately. Within each puzzle, we analyzed edit distance trails of complete and incomplete solutions separately. The number of edit distance Figure 2. Clusters of Complete Solutions of the First Puzzle trails available for each puzzle and the optimal number of clusters found for each puzzle are listed in Table 5. We did not apply clustering algorithm if the number of trails was less than 50. Table 6: Complete Solution Clusters of the First Puzzle: Number of trails, minimum, maximum and mean actions Table 5: Number of Edit Distance Trails Available and taken to solve the puzzle Optimal Number of Clusters Found for each Puzzle Cluster N Actions to Solve the Puzzle Puzzle Complete Solutions Incomplete Solutions Number Minimum Maximum Mean No. (Id) 1 533 16 38 17.26 Trails Clusters Trails Clusters 2 157 18 44 24.66 1 (2005) 785 4 371 4 3 77 28 58 38.81 2 (2105) 376 4 356 3 4 18 42 94 63.88 3 (2000) 358 4 26 N/A The clusters found for incomplete solutions of the first puzzle are shown in Figure 3. Table 7 lists the number of incomplete 4.1 Puzzle 1 solutions in each of the four clusters, the minimum, maximum and The clusters found for complete solutions of the first puzzle are mean number of actions taken in the solutions of the clusters and shown in Figure 2 along with their centroids, which are the mean of the final edit distance of all the solutions in the themselves trails. We found that clustering separated edit cluster. The final edit distance shows how many more actions distance trails of completed solutions by how optimally students would have been necessary to complete the solution. solved the puzzle, i.e., by the slope of edit distance trails. Table 6 The first cluster corresponded to students bailing out after just two lists the four clusters, number of solutions in each cluster, and the actions. Since this was the first puzzle presented by the tutor, it is minimum, maximum and mean number of actions taken in those likely that students were familiarizing themselves with the user solutions to solve the puzzle. The puzzle contained 14 lines of interface of the puzzle and planned to return to use it in code and 2 distracters. The 14 lines included two pairs of braces seriousness later. Cluster 3 (leftmost centroid line) comprised of students who made steady progress (mean of 11.94 actions), but included in the final solution (i.e., the y intercept of edit distance reached a plateau at the end before bailing out. Cluster 2 (second trails at x = 0). Cluster 4 comprised of students who took a lot centroid curve from the left) comprised of students who made more actions to solve the puzzle than necessary. gradual progress towards the solution (mean of 24.44 actions) before bailing out. Both the clusters bailed out about 8 actions away from solving the puzzle, i.e., they bailed out about halfway Table 8: Complete Solution Clusters of the Second Puzzle: through the solution to the puzzle that contained 14 lines. Cluster Number of trails, minimum, maximum and mean actions 4 was comprised of students who were lost from the beginning. taken to solve the puzzle Note that the slopes of the centroid curves of incomplete solution Cluster N Actions to Solve the Puzzle clusters provide qualitative information about incomplete solutions in the cluster: solutions that were informed (steep slope) Number Minimum Maximum Mean versus those that were not informed and included a lot of 1 130 36 67 40.75 redundant actions (shallow slope), and the point at which a 2 44 36 57 41.45 solution hit a dead-end (plateau). 3 171 36 74 40.67 4 31 50 123 71.48 Figure 3. Clusters of Incomplete Solutions of the First Puzzle Table 7: Incomplete Solution Clusters of the First Puzzle: Figure 4. Clusters of Complete Solutions of the Second Puzzle Number of trails, minimum, maximum and mean actions taken to solve the puzzle and mean final edit distance The clusters found among incomplete solutions of the second Cluster N Actions to Solve the Puzzle Mean final puzzle are shown in Figure 5 and Table 9. Cluster 2 comprised of Number distance students who bailed out early. Cluster 1 students made rapid Min Max Mean progress (leftmost centroid curve in Figure 5), but abandoned the 1 171 0 2 0.43 13.98 solution about 9 actions short. Cluster 3 students struggled to 2 92 12 48 24.44 8.29 solve the puzzle (right-most centroid curve in the figure) and bailed out 14 actions short. Once again, we see steep versus 3 99 3 51 11.94 8.12 shallow slope and plateau – features of the centroid curves that 4 9 50 87 67.11 4.66 provide qualitative information about the solutions in the clusters. The analysis so far supports our hypothesis that patterns could be found in student solutions of Parsons puzzles that were interpretable. 4.2 Puzzle 2 Figure 4 and Table 8 show the clusters found among complete solutions of the second puzzle, which contained 34 lines of code Table 9: Incomplete Solution Clusters of the Second Puzzle: and 2 distracters. 16 of the 34 lines were open or close braces Number of trails, minimum, maximum and mean actions which were optional as explained earlier. So, complete solution taken to solve the puzzle and mean final edit distance edit distance trails started with a value in the range 18-34 and ended with 0. Cluster N Actions to Solve the Puzzle Mean final Number Min Max Mean distance The main difference between the four clusters was the number of optional braces that were included in the final solution: cluster 1 1 130 9 78 43.53 8.89 (third centroid from the left in Figure 4) included all the braces, cluster 2 (leftmost centroid), none of the braces, and cluster 3 and 2 156 0 9 1.05 27.33 4 (second and right-most centroid from the left), some of the 3 70 32 138 61.22 14.28 braces. In this case, clustering separated edit distance trails of completed solutions by the number of optional lines that were Figure 5. Clusters of Incomplete Solutions of the Second Puzzle Figure 6. Clusters of Complete Solutions of the Third Puzzle 4.3 Puzzle 3 The incomplete solutions of the third puzzle contained only 26 Figure 6 show the clusters found among complete solutions of the trails. So, clustering was not performed on incomplete trails. third puzzle. Table 10 lists the minimum, maximum and mean number of actions taken to solve the third puzzle for each complete solution cluster. 5. DISCUSSION In order to be able to objectively contrast the clusters of a puzzle The puzzle contained 11 lines of code and 2 distracters. Similar to as well as compare the clusters of different puzzles, we computed the first puzzle, puzzle 3 also includes two optional pairs of braces the degree of optimality of the solutions included in each cluster. around the if-clause and else-clause. All centroid curves start at a A puzzle with n lines needs no more than n actions to solve value between 7 and 13 in Figure 6 because of the four optional completely and correctly. So, an optimal solution of the puzzle code lines and two additional distractors. The puzzle could be has the same number of actions as the number of lines in the optimally solved with 11 actions. puzzle. The degree of optimality (O) of the solutions in a cluster is In puzzle three, we found that clustering separated edit distance calculated as μ / n, wherein, μ is the mean of the number of trails of completed solutions by how optimally students solved the puzzle- solving actions taken by all the solutions in the cluster and puzzle and also by the number of optional lines that were included n is the number of lines in the puzzle. in the final solution. Cluster 3 (left-most centroid curve) grouped Table 11 lists the degree of optimality of all complete solution students that did not use any optional braces. Figure 6 showed that clusters of all three puzzles. Different puzzles may have different this centroid curve starts at an edit distance value of 7. Clusters 1 number of lines of code. The degree of optimality abstracts away and 2 (second and third centroid from the left) grouped students this difference, thereby enabling us to compare clusters of who use both pairs of optional braces to solve the puzzle. The different puzzles. Note that on the first puzzle, the first cluster centroid curve for both these clusters starts at an edit distance comprised of the most optimal solutions (Table 6). In Table 10, value of 11. Furthermore, cluster 1 grouped students who solved the first cluster had the lowest degree of optimality for puzzle 3. the puzzle more optimally with an average of 13.75 actions and Clusters 1, 2 and 3 were all optimal for puzzle 2 (Table 8), cluster 2 grouped students who took more moves, an average of differing only in the number of optional statements that were 17.51 actions, to complete the puzzle. Cluster 4 (right most included in the final solution. All three clusters of puzzle 2 have centroid curve) grouped students who used only a single pair of similar degrees of optimality in Table 11. optional braces since their centroid curve starts at a value of 10. Figure 6 showed that this group of students took the most actions to complete the puzzle. Table 11: The Number of Solutions (N) and Degree of Optimality (O) in each Cluster of each Puzzle Table 10: Complete Solution Clusters of the Third Puzzle: No Complete Solution Cluster Number Number of trails, minimum, maximum and mean actions 1 2 3 4 taken to solve the puzzle N O N O N O N O Cluster N Actions to Solve the Puzzle 1 533 1.14 157 1.6 77 2.48 18 4.05 Number Minimum Maximum Mean 2 130 1.15 44 1.17 171 1.15 31 2.01 1 202 13 18 13.75 3 202 1.13 87 1.42 52 1.2 17 2.28 2 87 14 26 17.51 3 52 13 19 14.69 The complete solutions of all three puzzles yielded four clusters. 4 17 21 36 28.64 These clusters corresponded to either various levels of optimality or the number of optional lines that were included in the final solution. The first puzzle required students to assemble a single block of if-else statement and was the first puzzle that was than the centroids in the first puzzle. This showed that the students presented to students. Figure 2 showed that the majority of the in all the clusters of the second puzzle faced more difficulty trails start at an edit distance value of 14 as all the centroid curves completing the last few steps in the puzzle compared to the start at the same value. The first puzzle had twice the number of students who solved the first puzzle. All three puzzles cover if- complete solution trails compared to the second and third puzzles else statements, but the second puzzle is more difficult than the (785 vs 376 and 358 respectively). We do not see major variations first and third puzzles because it involves the concept of nesting, in the use of optional braces in the first puzzle even though the which is harder for novice programmers. We have found similar sample size is much larger. This highlights how most students in tails in edit distance trails of harder puzzles during the analysis of the first puzzle used all the optional braces to assemble the other concepts [7]. solution. Beginner programmers might not be aware that clauses Additionally, Table 11 showed that the second puzzle had a enclosing single statements do not necessarily require braces. maximum optimality 2.01. This means that the students took at Students might be including the optional braces as it is a good most twice the optimal number of moves to try and solve the practice, or they might just be unaware that the tutor considers second puzzle. The first puzzle included a group of students who braces as optional. This is why the first puzzle is clustered solely took about four times the required moves to solve the puzzle based on various levels of optimality. (cluster 4 with a degree of optimality of 4.05). This highlights that The third puzzle, which is a follow-up puzzle for students who students were more motivated to solve the first puzzle. Students struggled with the first puzzle showed more variety with the use most likely gave up more quickly on the second puzzle because of of optional braces. In the third puzzle, students who used all the the added nesting complexity. optional braces are clustered in clusters 1 and 2, students who Among the incomplete solutions, the first cluster in the first used none of the optional braces are clustered in cluster 3, and puzzle and the second cluster in the second puzzle, shown in students who used one of the two optional pairs are clustered in Table 7 and Table 9, identifies “lurkers” [6]. Lurkers are students cluster 4. This variety in the use of optional braces might be who take a couple of actions and bail out quickly. Hosseini in her accounted for by the student’s experience in programming or literature identifies lurkers as “stoppers”, who do not take any using the tutor. Interestingly, Table 10 showed that a majority of actions after encountering a problem. We use the term “lurkers” students who used only one of the two pairs of optional braces because we believe that these students were probably just testing take the greatest number of actions to solve the puzzle. These the interface to gain familiarity with the tutor. “Movers” identified students are clustered in cluster 4 and used an average of 28.64 in the literature [6] corresponded to all the students grouped in the actions to solve the puzzle. Table 11 showed that this cluster had complete clusters. These students were gradually able to solve the the most non-optimal solutions with a degree of optimality of puzzle by taking steps towards the correct solution. “Tinkerers” 2.28. [6] were students who took several actions to solve the puzzle by The second puzzle covered nested if-else statements and is more making small changes but were ultimately unable to solve the complex than both the first and third puzzles. This puzzle puzzle. All the other clusters in the incomplete solutions contained eight pairs of optional braces. Table 11 showed that the excluding the “lurkers” identify as “tinkerers”. Edit distance trails most non-optimal solutions were clustered in cluster 4 with a in this case helped identify the known problem-solving behaviors degree of optimality of 2.01. Similar to the third puzzle, cluster 4 of students. in the second puzzle corresponded to students who used some of Edit distance space has been used to generate hints in code- the optional braces. This observation asserts that students who do writing tasks [2]. We use edit distance to track progress of not follow one of the two practices (i.e., either including all students, not provide hints; we addressed Parsons puzzle solutions optional braces or eliminating all optional braces) struggle the which have a finite search space and a single correct solution most with assembling if-else statements. compared to code-writing exercises which must accommodate any Furthermore, in the second puzzle, the cluster with the second code written by the student and can have multiple possible most non-optimal solutions was cluster 2 with a degree of solutions; and we computed edit distance from the final and only optimality of 1.17. This cluster grouped students who did not use solution to the puzzle, not a weighted average of all possible any of the optional braces to construct the final solution. As this nearby paths to the solution. puzzle used nested if-else statements, students might had confused Edit distance trails helped to identify patterns by clustering some of the lines of the problem since they did not use any of the student solutions regardless of the sequence of individual optional braces. Optional braces are not necessary, but it improves statements manipulated by them. One shortcoming of using edit the readability of the code. This might be the reason why students distance trails is that since they abstract away puzzle-specific who did not use any braces took more actions to assemble the information, they cannot be used to determine the specific lines of puzzle than students who used optional braces. This shows that code that most students might have problems assembling the use of optional braces helped in solving more complex tasks. correctly. Cluster 1 grouped students who used all the optional braces and it had a degree of optimality of 1.15. Even though they assemble a In the future, we plan to analyze data from Parsons puzzle tutors larger puzzle (34 lines vs 18 lines), the solutions were more on other topics to see if we can generalize the results of this study optimal compared to students who did not use optional braces. across topics. We expect that we may find more clusters if we gather more data 6. ACKNOWLEDGEMENTS for the second puzzle. This might also show optimal and sub- Partial support for this work was provided by the National Science optimal solutions for various types of solutions: 1) solutions with Foundation under grant DUE-1432190. no optional lines included, 2) solutions with all optional lines included, and 3) solutions with some optional lines included. 7. REFERENCES From a visual inspection of complete clusters of the three puzzles, [1] Amruth N. Kumar. 2018. Epplets: A Tool for Solving we find that the centroids in the second puzzle had a longer tail Parsons Puzzles. In Proceedings of the 49th ACM Technical Symposium on Computer Science Education (SIGCSE '18). [6] Hosseini, Roya & Hellas, Arto & Brusilovsky, Peter. (2014). ACM, New York, NY, USA, 527-532. DOI: Exploring Problem Solving Paths in a Java Programming https://doi.org/10.1145/3159450.3159576. Course. [2] B. Paaßen, T.W. Price, S. Gross, B. Hammer, T. Barnes, N. [7] S. Maharjan and A.N. Kumar. 2020. Using Edit Distance Pinkwart. 2018. The Continuous Hint Factory – Providing Trails to Analyze Path Solutions of Parsons Puzzles. 13th Hints in Vast and Sparsely Populated Edit Distance Spaces. International Conference on Educational Data Mining Journal of Educational Data Mining, Volume 10, No 1. (EDM) 2020 (forthcoming). [3] D. Shapira and J. A. Storer. 2007. Edit distance with move [8] Takabatake, Yoshimasa & Nakashima, Kenta & Kuboyama, operations. Journal of Discrete Algorithms, 5, 2, 380–392. Tetsuji & Tabei, Yasuo & Sakamoto, Hiroshi. 2016. siEDM: [4] F.J. Damerau. 1964. A technique for computer detection and An Efficient String Index and Search Algorithm for Edit correction of spelling errors. Communications of the ACM, 7, Distance with Moves. Algorithms. 9. 26. 10.3390/a9020026. 3, 171-176. [9] V.I. Levenshtein. 1966. Binary codes capable of correcting [5] G. Cormode and S. Muthukrishnan. 2007. The String Edit deletions, insertions, and reversals. In Soviet physics doklady, Distance Matching Problem with Moves. ACM Trans. Algor. 10, 8, 707-710. 3, 1, Article 2 (Feb. 2007), 19 pages. DOI= http://doi.acm.org/10.1145/1186810.1186812.