=Paper=
{{Paper
|id=Vol-2434/paper2
|storemode=property
|title=Bad Smells in Scratch Projects: A Preliminary Analysis
|pdfUrl=https://ceur-ws.org/Vol-2434/paper2.pdf
|volume=Vol-2434
|authors=Ángela Vargas-Alba,Giovanni Maria Troiano,Quinyu Chen,Casper Harteveld,Gregorio Robles
|dblpUrl=https://dblp.org/rec/conf/ectel/Vargas-AlbaTCHR19
}}
==Bad Smells in Scratch Projects: A Preliminary Analysis ==
Ángela Vargas-Alba (1), Giovanni Maria Troiano (2), Quinyu Chen (2), Casper Harteveld (2) and Gregorio Robles (1)
(1) Universidad Rey Juan Carlos, Madrid, Spain, {a.vargasa@alumnos,grex@gsyc}.urjc.es
(2) Northeastern University, Boston, MA, {g.troiano, q.chen, c.harteveld}@northeastern.edu
Abstract
Computational Thinking (CT) is an area of great relevance today. Although its skills may be developed in various ways, one of the most common ways to learn, train and develop them is through programming. From software engineering, we know that problems solved through programming may not have been solved in the most appropriate way. The symptoms of such solutions are known as “bad smells”. This article aims to analyze the presence of several bad smells in Scratch projects and how they relate to the development of CT skills. To this end, we make use of a dataset of several hundred Scratch projects whose aim was to create a game on climate change. Our results show that bad smells can be found in all types of Scratch projects, independently of the development of CT skills they require. We discuss why the learning community should address bad smells appropriately, as they may hinder the development of abstraction, reuse and other relevant skills.
Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
In: I. Fronza, C. Pahl (eds.): Proceedings of the 2nd Systems of Assessments for Computational Thinking Learning Workshop (TACKLE 2019), co-located with 14th European Conference on Technology Enhanced Learning (EC-TEL 2019), 17-09-2019, published at http://ceur-ws.org.
1 Introduction
Bad (code) smells are symptoms that the problem to be solved has not been approached in the most appropriate way. In other words, the program may run and may even solve the problem, but it contains elements that make it difficult to understand, to modify and to reuse [9]. Martin defines code smells as follows [3]: “Code smells are usually not bugs; they are not technically incorrect and do not prevent the program from functioning. Instead, they indicate weaknesses in design that may slow down development or increase the risk of bugs or failures in the future.” Despite the negative effect they produce, code smells have been little investigated and analyzed in Computational Thinking (CT) research. In line with what Hermans and Aivaloglou found in an experiment with Scratch learners [1], we argue that bad smells hinder the proper development of CT skills in learners. Their identification should be a first step towards guiding learners to good practices that allow them to develop to their full potential.
For this reason, the main motivation of this research is to analyze to what extent bad smells are present in projects written in Scratch, a block-based language that is widely used around the globe to develop CT skills. Our research is similar to a previous study on LEGO MINDSTORMS EV3 and Microsoft’s Kodu [2], which we extend with information on the complexity of the projects. To that end, we evaluate the Scratch projects with Dr. Scratch, a tool that assesses the richness of elements used in the programs. Dr. Scratch also makes it possible to detect different types of bad smells that are present in the code.
The remainder of this paper is structured as follows: In Section 2, we introduce and motivate the research goal and research questions that we address in this paper. A more detailed definition of bad smells is given in Section 3. Section 4 and Section 5 describe the data set used in the study, as well as the functionality and design of Dr. Scratch in more detail. Section 6 shows the results obtained from the analysis, and Section 7 discusses them. Limitations and problems found are described in Section 7.1. Finally, Section 8 contains the main conclusions and the future work that we envision.
2 Research Goal and Questions
The main objective of this paper is to analyze the presence of bad smells in a large set of Scratch projects. To this end, we address the following research questions:
RQ1. To what extent are bad habits present in Scratch projects?
We answer this question by reporting the percentage of projects that have at least one type of bad smell. This shows how frequently projects exhibit a bad smell, hinting at the relevance of the topic. We expect a significant number of projects to contain bad smells.
RQ2. Does the development of CT skills relate to the presence of bad smells?
We would like to find out whether the presence of bad smells correlates with the complexity of the projects. Our hypothesis is that projects with higher degrees of CT development will have fewer bad smells, as these may hinder the development of CT skills.
RQ3. Do projects with more blocks have a higher number of bad smells?
More complex projects usually have more blocks. Thus, a single bad smell in a small, simple project may have more impact than in a project with hundreds of blocks. In the former case, the impact could be big, while in the latter it could be seen as an exception, with little impact.
To answer this question, for projects of the same level of CT development we calculate the ratio of the number of bad smells detected to the total number of blocks. We expect this ratio to decrease as the CT skills required to create the Scratch projects increase.
RQ4. Can we find a relation among specific bad smells?
So far, we have considered all types of bad smells together. In this question, we look into each of them separately. It may be that some bad smells appear more frequently in projects of lower complexity, while others appear in more complex projects.
RQ5. To which extent can bad smells be identified in each of the CT development phases?
Related to the previous question, we analyze how the different bad smell types appear in projects in the different stages of CT development. To do so, we consider projects of low complexity (basic), medium complexity (developing) and high complexity (proficiency) and compute how often they contain a specific type of bad smell.
We expect that several types of bad smells appear in the early phases (basic), while others are more prominent in more complex projects (proficiency). We therefore assume that learners who achieve higher levels of complexity have overcome certain bad smells by having developed certain CT skills, while other bad smells appear in those more complex projects.
3 Bad smells in Scratch
Scratch is a visual programming language, built from different blocks and designed for children and beginner programmers, whose programs can contain different bad smells related to the use of these blocks [6].
In our research, we have identified four different types of bad smells that can be present in Scratch projects: copied and pasted code (duplicate scripts) [7], the use of default names for sprites (default names), code that is never executed (dead code), and variables that are not correctly initialized (attribute initialization). Their characteristics and impact are summarized in Table 1.
4 Research Context
To analyze the presence of bad smells, as well as their relationship with the users' level of CT development, a large set of projects is necessary. The data set used in this article is the same one used in previous research [8]. A group of 438 students designed games for STEM using Scratch 2.0. During this process, we took snapshots of the projects at different points in time in order to capture their evolution over time (see footnote 1). The total number of projects, without taking into account the replicas over time, is 711. Including the snapshots, the complete data set comprises 62,074 projects. With the total data set, we wanted to analyze the same 711 projects at different points in time.
All these snapshots were analyzed with Dr. Scratch. For 2,158 of them the analysis failed for different reasons: the project was saved incorrectly, the code contained special characters, etc. The final set of projects analyzed was 59,916 (further details can be found in [8]).
Footnote 1: https://drive.google.com/drive/u/0/folders/1tDI6nx2f6344xJAKeUeWBeTg0YzxE3bO
5 Methodology
We have taken all snapshots of the Scratch projects and analyzed them with Dr. Scratch. Dr. Scratch is a web-based tool that analyzes different categories related to computational thinking based on the blocks of the Scratch projects [4] (a screenshot of its main web page can be seen in Figure 1). It analyzes the code and, depending on the diversity of blocks used, gives a score to the project.
Table 1: Types of Bad Smells.
Bad Smell Type | Definition | Impact on Learning
Duplicate scripts | Code is copied and pasted, sometimes with minor changes | It hinders the use of user-defined blocks and as such can be seen as a limitation to the development of the abstraction skill
Default names | Objects are not given a meaningful name, but keep the default SpriteN name | It hinders interaction among objects, as using them in other objects becomes more difficult
Dead code | Code that is never executed (usually because it does not have a starting condition) | It may indicate missing functionality
Attribute initialization | Variables are not well initialized | It hinders the start of some objects, because their position, size, costume, etc. are not correctly initialized
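To make the definitions in Table 1 more concrete, the sketch below checks a toy project for three of the four smells: default names, dead code and duplicate scripts. It is only an illustration under simplifying assumptions and not how Dr. Scratch works internally. The project representation (a dictionary from sprite name to a list of scripts, each script a list of block names), the block names, and the set of starting blocks are all hypothetical. Duplicates are detected by exact match only, although the paper notes that copies often have minor changes.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

Script = List[str]                 # a script is a sequence of block names
Project = Dict[str, List[Script]]  # sprite name -> scripts of that sprite

# Assumed names for blocks that give a script a starting condition.
HAT_BLOCKS = {"whenGreenFlag", "whenKeyPressed", "whenClicked", "whenIReceive"}

# Default sprite names follow the SpriteN pattern from Table 1.
DEFAULT_NAME = re.compile(r"^Sprite\d+$")


def default_names(project: Project) -> List[str]:
    """Sprites that still carry a default SpriteN name."""
    return sorted(name for name in project if DEFAULT_NAME.match(name))


def dead_scripts(project: Project) -> List[Tuple[str, Script]]:
    """Non-empty scripts without a starting block, so they never run."""
    return [
        (name, script)
        for name, scripts in project.items()
        for script in scripts
        if script and script[0] not in HAT_BLOCKS
    ]


def duplicate_scripts(project: Project) -> List[Tuple[str, ...]]:
    """Scripts that occur more than once in the project (exact copies)."""
    counts = Counter(
        tuple(script) for scripts in project.values() for script in scripts
    )
    return [script for script, n in counts.items() if n > 1]


example: Project = {
    "Sprite1": [["whenGreenFlag", "move", "turn"], ["move", "turn"]],
    "Polar bear": [["whenGreenFlag", "move", "turn"]],
}
print(default_names(example))
print(dead_scripts(example))
print(duplicate_scripts(example))
```

In the example, Sprite1 keeps its default name, its second script has no starting block, and the first script is repeated verbatim in another sprite.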
Figure 1: Main page of the web tool Dr. Scratch
Figure 2: Example of the analysis of a Scratch project with Dr. Scratch
The outcome is a numeric score based on seven categories of computational thinking: parallelism, logical thinking, flow control, user interactivity, data representation, abstraction and synchronization. For each of these abilities a project can obtain a score from 0 to 3 points, according to the different blocks used in the Scratch project. In this way, the final score ranges from 0 to 21 points. Based on that, there are three different learner profiles: Basic (between 0 and 7 points), Developing (between 8 and 15) and Proficiency (between 16 and 21). Once the project is analyzed, the application shows different dashboards with these results, as can be seen in Figure 2.
In addition, Dr. Scratch identifies the four types of bad smells that we study in this work.
6 Results
In this section we describe the results obtained from addressing our research questions using the previously described data set.
6.1 To what extent are bad habits present?
As shown in Table 2, bad smells can be found in almost all projects in our data set: over 97% of the projects have at least one bad smell.
Table 2: General presence of Bad Smells (RQ1)
 | Projects with bad smells | Projects without bad smells
Count | 58,162 | 1,754
Share | 97.07% | 2.93%
We expected a high share of projects to have bad smells, but this result still surprised us: the presence of bad smells is not only more frequent than expected, but almost universal.
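The shares in Table 2 follow directly from the counts reported in Section 4. A minimal sketch that reproduces them, with variable names chosen here purely for illustration:

```python
# Counts reported in the paper (Section 4 and Table 2).
total_snapshots = 62_074   # all snapshots in the data set
failed_analyses = 2_158    # snapshots Dr. Scratch could not analyze
with_smells = 58_162       # analyzed projects with at least one bad smell
without_smells = 1_754     # analyzed projects with no bad smell

analyzed = total_snapshots - failed_analyses


def share(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# The two groups in Table 2 should add up to the analyzed set.
assert with_smells + without_smells == analyzed

print(analyzed)
print(share(with_smells, analyzed))
print(share(without_smells, analyzed))
```

The check confirms that the two groups in Table 2 partition the 59,916 analyzed projects.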
6.2 Does the development of CT skills relate to a lower presence of bad smells?
We have analyzed in more detail the projects that have bad smells and those that do not. In particular, we want to see how the complexity of the projects is related to having a bad smell. We use the mastery required to create a project, as measured by Dr. Scratch, as a proxy for the complexity of the project.
In Figure 3 we can observe the distribution of each set of projects. More than 50% of the projects with no bad smells have a total mastery of 0. In other words, these are skeleton projects without any content. The number of projects with content and without any bad smell is therefore even lower than calculated in the previous research question. In addition, we see that projects with no bad smells are in the lower part of the complexity ladder.
Figure 3: Distribution for projects without bad smells and with bad smells (RQ2).
In summary, bad smells in Scratch can be found in almost all projects. Only a small set of projects with low complexity do not have them.
6.3 Do projects with more blocks have a higher number of bad smells?
One could argue that projects with more complexity (those with higher CT scores in Dr. Scratch) usually have more blocks, and that the impact of a code smell there is lower than in less complex projects. In other words, even if more complex projects have bad smells, their presence is mitigated by the fact that these are large projects. This would imply that achieving high values of CT development means having fewer bad smells.
Figure 4 shows the number of blocks for all projects with a given CT score (blue line) and the number of bad smells in those projects. Both curves have been normalized to their maximum values. We can observe how the two curves run almost in parallel (up to 8 points they share the same ratio, and then they share a ratio of around 0.5 up to 21 points, where the ratio is over 1).
Figure 4: Relation between the total number of bad smells and the total number of blocks for each mastery level (RQ3).
Table 3 offers more details about this situation.
6.4 Can we find a relation among specific bad smells?
From Figure 5 (again, this graph is normalized), we can observe that the four types of bad smells that we have studied have a similar distribution below the proficiency level. The presence of bad smells behaves similarly for projects up to 17 points of mastery. Above 17 points, each type behaves differently: while duplicated scripts and bad attribute initialization continue to grow slightly, the number of dead code blocks grows more abruptly, and the number of default names decreases considerably. This last trend may be because in projects with a high number of blocks and objects it is more difficult to program with the default names (Sprite 1, Sprite 2, etc.) than with personalized ones.
6.5 To which extent can bad smells be identified in each of the CT development phases?
As already seen with RQ3, the number of bad smells is higher when the total mastery increases. In Figure 6, we have represented the percentage of projects that have at least one bad smell of a specific type for each level of CT proficiency. The percentage of projects from users with the basic level that have bad smells is much smaller than the percentage of projects that require a proficiency level. This result indicates that all bad smells grow incrementally as CT proficiency increases. So, it seems that bad smells appear in early phases and instead of disappearing with the
Table 3: Detailed information on number of blocks and bad smells for each mastery level (RQ3).
Total Mastery | Total Projects | Total Blocks | Mean Blocks | Median Blocks | Total Bad Smells | Mean Bad Smells | Median Bad Smells | Total Blocks / Bad Smells
0 | 1747 | 198 | 0.11 | 0 | 2087 | 1.19 | 1 | 0.09
1 | 279 | 2096 | 7.51 | 1 | 2200 | 7.89 | 1 | 0.95
2 | 171 | 702 | 4.11 | 3 | 468 | 2.74 | 1 | 1.50
3 | 464 | 3519 | 7.58 | 4 | 2914 | 6.28 | 0 | 1.21
4 | 425 | 2902 | 6.83 | 5 | 1414 | 3.33 | 0 | 2.05
5 | 1325 | 15365 | 11.60 | 9 | 7268 | 5.49 | 1 | 2.11
6 | 1417 | 28878 | 20.38 | 16 | 8298 | 5.86 | 2 | 3.48
7 | 1470 | 51024 | 34.71 | 22 | 10304 | 7.01 | 2 | 4.95
8 | 1935 | 92437 | 47.77 | 29 | 13459 | 6.96 | 2 | 6.87
9 | 2349 | 191908 | 81.70 | 55 | 20221 | 8.61 | 3 | 9.49
10 | 3473 | 334075 | 96.19 | 66 | 38376 | 11.05 | 3 | 8.71
11 | 5900 | 683490 | 115.85 | 82 | 64702 | 10.97 | 4 | 10.56
12 | 5786 | 779130 | 134.66 | 103 | 66401 | 11.48 | 4 | 11.73
13 | 6206 | 1142705 | 184.13 | 142 | 108307 | 17.45 | 7 | 10.55
14 | 8173 | 2184999 | 267.34 | 201 | 182128 | 22.28 | 10 | 12.00
15 | 4954 | 1383152 | 279.20 | 193 | 104416 | 21.08 | 7 | 13.25
16 | 5346 | 1735651 | 324.66 | 255 | 130280 | 24.37 | 10 | 13.22
17 | 3352 | 1276143 | 380.71 | 291 | 103394 | 30.85 | 16 | 12.34
18 | 1795 | 631899 | 352.03 | 279 | 50581 | 28.18 | 10 | 12.49
19 | 2092 | 872027 | 416.84 | 342 | 71391 | 34.13 | 14 | 12.21
20 | 960 | 533958 | 556.21 | 507 | 66419 | 69.19 | 26 | 8.04
21 | 296 | 1639757 | 5539.72 | 7183 | 355809 | 1202.06 | 1505 | 4.61
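The last column of Table 3 can be recomputed from the totals in the same row. Note that it reports blocks per bad smell, which is the inverse of the ratio described for RQ3 (bad smells per block). The sketch below recomputes both for a few rows copied from the table and shows the kind of max-normalization mentioned for Figure 4. The function names are illustrative, and the exact procedure used to draw the figure is an assumption.

```python
from typing import Dict, List, Tuple

# (total blocks, total bad smells) for a few mastery levels, copied from Table 3.
TABLE_3_SAMPLE: Dict[int, Tuple[int, int]] = {
    0: (198, 2087),
    8: (92437, 13459),
    14: (2184999, 182128),
    21: (1639757, 355809),
}


def blocks_per_smell(blocks: int, smells: int) -> float:
    """Value of the last column of Table 3, rounded to two decimals."""
    return round(blocks / smells, 2)


def smells_per_block(blocks: int, smells: int) -> float:
    """Ratio as worded in RQ3: bad smells relative to the number of blocks."""
    return smells / blocks


def normalize_to_max(values: List[float]) -> List[float]:
    """Scale a series so that its maximum becomes 1 (assumed Figure 4 style)."""
    peak = max(values)
    return [value / peak for value in values]


for level, (blocks, smells) in TABLE_3_SAMPLE.items():
    print(level, blocks_per_smell(blocks, smells),
          round(smells_per_block(blocks, smells), 3))

# Normalized block and bad smell totals for the sampled levels.
print(normalize_to_max([blocks for blocks, _ in TABLE_3_SAMPLE.values()]))
print(normalize_to_max([smells for _, smells in TABLE_3_SAMPLE.values()]))
```

A falling value of smells per block at higher mastery levels would support the expectation stated for RQ3, while a rising value would contradict it.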
development of more advanced CT skills, they become more prominent.
Figure 5: Evolution of the different types of bad smells with CT mastery (RQ4).
Figure 6: Presence of bad smells for each proficiency level (RQ5).
7 Discussion
We have observed that bad smells are very common in Scratch projects. The results also indicate that bad smells do not limit the development of a project, because it is possible to create projects at the proficiency level even with a large number of bad smells.
We argue that researchers, educators and learners should devote more effort to avoiding the presence of bad smells in learning projects. The very nature of bad smells makes them difficult to identify. They are not errors, and a program could run perfectly while having many of them. However, their effects are very well known in Software Engineering research. These effects appear in the long term, when the maintainability of a software system is considered. In such a situation, the software has to be understood and changed, and that is where these bad smells become more prominent. The large presence of bad smells indicates that understanding and changing code is not among the priorities of the projects under study, although “good code” has always been considered to be code that is easy to understand and to change. That is why we think that with the current presence of bad smells learners do not achieve their full potential in CT skill development. In our opinion, further research and tools should be envisioned and created to address this issue.
7.1 Limitations
Like any research, our work comes with a number of limitations that can be seen as threats to its validity.
The first one is related to our methodology, and in particular to the limited set of bad smells that we can identify. We are sure that many other types of bad smells could be thought of in Scratch. Second, we use as complexity metrics the CT scores provided by Dr. Scratch. While some research has studied how Dr. Scratch correlates with other complexity metrics [5], this is always a correlation and not causation.
We have studied projects from a specific environment, which may not be representative of all Scratch projects, so the generalization of results is a threat to validity. However, it should be noted that this is a case study, with the aim of raising attention to this matter.
We should analyze a wider data set that includes different kinds of Scratch projects, such as stories, music and animations, among others. We do not know if the behavior of bad smells could be different in other areas of programming.
8 Conclusions
Bad smells are common in Scratch projects, from basic to proficient. As the complexity increases, bad smells do not disappear, so learners can create projects that demand a high development of CT skills while having bad smells in them.
We ask for further research on this topic.
Acknowledgments
This work has been co-funded by the Madrid Regional Government, through the project e-Madrid-CM (P2018/TCS-4307). The e-Madrid-CM project is also co-financed by the Structural Funds (FSE and FEDER).
References
[1] F. Hermans and E. Aivaloglou. Do code smells hamper novice programming? A controlled experiment on Scratch programs. In 2016 IEEE 24th International Conference on Program Comprehension (ICPC), pages 1–10. IEEE, 2016.
[2] F. Hermans, K. T. Stolee, and D. Hoepelman. Smells in block-based programming languages. In 2016 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), pages 68–72. IEEE, 2016.
[3] R. C. Martin. Clean code: a handbook of agile software craftsmanship. Pearson Education, 2009.
[4] J. Moreno-León, G. Robles, et al. Analyze your Scratch projects with Dr. Scratch and assess your computational thinking skills. In Scratch Conference, pages 12–15, 2015.
[5] J. Moreno-León, G. Robles, and M. Román-González. Comparing computational thinking development assessment scores with software complexity metrics. In 2016 IEEE Global Engineering Education Conference (EDUCON), pages 1040–1045. IEEE, 2016.
[6] M. Resnick, J. Maloney, A. Monroy-Hernández, N. Rusk, E. Eastmond, K. Brennan, A. Millner, E. Rosenbaum, J. S. Silver, B. Silverman, et al. Scratch: Programming for all. Commun. ACM, 52(11):60–67, 2009.
[7] G. Robles, J. Moreno-León, E. Aivaloglou, and F. Hermans. Software clones in Scratch projects: On the presence of copy-and-paste in computational thinking learning. In 2017 IEEE 11th International Workshop on Software Clones (IWSC), pages 1–7. IEEE, 2017.
[8] G. M. Troiano, S. Snodgrass, E. Argımak, G. Robles, G. Smith, M. Cassidy, E. Tucker-Raymond, G. Puttick, and C. Harteveld. Is my game OK Dr. Scratch?: Exploring programming and computational thinking development via metrics in student-designed serious games for STEM. In Proceedings of the 18th ACM International Conference on Interaction Design and Children, pages 208–219. ACM, 2019.
[9] M. Zhang, T. Hall, and N. Baddoo. Code bad smells: a review of current knowledge. Journal of Software Maintenance and Evolution: Research and Practice, 23(3):179–202, 2011.