JupyterLab Extensions for Blocks Programming, Self-Explanations, and HTML Injection

Andrew M. Olney
University of Memphis
365 Innovation Drive, Suite 303
Memphis, Tennessee 38152
aolney@memphis.edu

Scott D. Fleming
University of Memphis
Dunn Hall 375
Memphis, Tennessee 38152
Scott.Fleming@memphis.edu

ABSTRACT
JupyterLab is a widely used platform for programming and data science using computational notebooks, but it has not been widely used in the educational data mining community as a source of student data. We have developed three JupyterLab extensions to enable educational data mining research in CSEd and data science. Our Blockly extension supports blocks-based programming in JupyterLab and logs event-level blocks actions as well as kernel actions and errors. Our self-explanation extension appends self-explanation prompts to code cells and logs the input text for further analysis. Finally, our HTML injection extension allows injection of arbitrary HTML and JavaScript into JupyterLab notebooks to enable pedagogies and data collection currently unsupported by JupyterLab. All extensions are open-source and distributed through NPM.

Keywords
JupyterLab, blocks programming, self-explanations, process data, HTML injection

1. INTRODUCTION
Computational notebooks have been adopted by professional data scientists [8] and by scientists generally [18], and are becoming increasingly popular in computer science education [7, 12, 20]. The popularity of computational notebooks stems from their ability to combine text, mathematical equations, code, and graphs. By combining these elements, computational notebooks allow data scientists to create shareable, reproducible reports: anyone receiving a computational notebook can recreate the original analysis or modify it to ask new questions. Like any report, a computational notebook contains text explaining each step and describing results. O’Hara et al. summarize it well: “A computational notebook is a document that can be read like a journal paper and run like a computer program.” [12, p. 263]

JupyterLab is the current iteration of the open-source Project Jupyter and is widely used, with over a million computational notebooks on GitHub [17]. The name Jupyter is a portmanteau of the programming languages Julia, Python, and R, which were the original target languages of Jupyter, but now dozens of languages are supported and can operate simultaneously within a single notebook [14]. The success of Project Jupyter has seen it integrated in a variety of platforms, including Google’s Colaboratory, Kaggle’s Kernels, and Microsoft Azure Notebooks, and as a result, Project Jupyter is likely the most widely used computational notebook today.

JupyterLab’s notebook frontend is web-based and presents the user with a top-level menu followed by an expanding list of cells. Each cell can be text, code, or multimedia output. For example, if the last line of a code cell produced a matrix, then the following output cell would be formatted as a table, and if the last line generated a graph, then the output cell would be the graph rendered as an image. Each cell is runnable and re-runnable, and implicitly references the context of previously executed cells. This ability to chunk pieces of code is an advance over traditional interactive programming, where statements are entered line by line at the command prompt, because it allows larger chunks to be created and run at once.

JupyterLab has what is commonly described as a plugin architecture, which makes it possible to modify the behavior of JupyterLab without changing its source code. In JupyterLab terminology, these plugins are called JupyterLab extensions. An extension is a software library, written in JavaScript, that extends the functionality of JupyterLab.

Despite its wide use by professionals and in classrooms, JupyterLab has not been used as a source of student educational data in research published in educational data mining (EDM) conferences and journals. In this paper we present three JupyterLab extensions to advance EDM research. Our Blockly extension supports blocks-based programming in JupyterLab and logs event-level blocks actions as well as kernel actions and errors. Our self-explanation extension appends self-explanation prompts to code cells and logs the input text for further analysis. Finally, our HTML injection extension allows injection of arbitrary HTML and JavaScript into JupyterLab notebooks to enable pedagogies and data collection currently unsupported by JupyterLab. All extensions are open-source and can be installed at the command line with the standard jupyter labextension install command. They may be used independently, simultaneously, or merely as models for future extensions supporting research in EDM.

Copyright ©2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)

2. BLOCKLY EXTENSION
In the last decade, blocks languages have seen wide adoption for teaching introductory programming [2, 16] as they have shown multiple positive effects on learning, including both cognitive and motivational effects, in introductory undergraduate courses [1, 5, 9, 10, 15]. Blocks languages compose code elements via irregularly shaped graphical widgets, similar to puzzle pieces or LEGO®. Their design typically makes syntactic mistakes difficult or impossible because the widgets cannot fit together in nonsyntactic ways. Furthermore, since blocks are visually browsable on an interface palette, students need only recognize them rather than perform the more difficult task of recalling code, cf. [19].

Blockly is an open-source JavaScript library for creating blocks-based editors for programming languages within a web browser [6]. Blockly supports five languages out of the box, namely JavaScript, Python, PHP, Lua, and Dart, and compiles a given assemblage of blocks into any one of these languages through code generators. A variety of other blocks-based projects use Blockly, including AppInventor, Microsoft’s MakeCode, and Code.org.

Blockly’s user interface minimally consists of a workspace for arranging blocks and a toolbox, or palette, for introducing blocks to the workspace. Within the blocks workspace, blocks can be dragged, copied, pasted, deleted, or snapped together. The blocks themselves contain elements like free-text entry fields and dropdowns, and dropdowns can be set to populate dynamically, e.g. with a list of current variables, rather than solely being static. The variable and function categories of the toolbox are also dynamic: as a variable is created, blocks for getting, setting, and similar operations are dynamically created. Likewise, the function category of the toolbox yields blocks for functions which, once defined, are dynamically added to the toolbox so they can be called. To make the creation of new blocks and blocks languages easier, Blockly also provides a web-based graphical authoring tool for blocks that allows authors to create, modify, and save blocks configurations, including code generation.

We have integrated Blockly with JupyterLab by building a JupyterLab extension. When the user selects the extension, it opens by default side by side with the active notebook, as shown in Figure 1. When a user arranges blocks in the Blockly workspace and then presses the Blocks to Code button, the corresponding Python code is generated in the active cell in the notebook, along with a serialized XML string in a code comment. The XML string allows the user to reconstruct the blocks workspace used to generate the code by clicking the Code to Blocks button.

Because the workspace can rapidly fill up with blocks, we have introduced a feature called notebook sync. When notebook sync is activated, clicking on a cell with an XML comment clears the current workspace and replaces it with the workspace that generated the code in that cell. This sync action is equivalent to the user manually deleting all the blocks in the workspace, selecting the target cell, and pressing the Code to Blocks button. This feature allows users to focus on the blocks being used in their current workflow without being distracted by blocks they have already used. Notebook sync brings the experience of working in Blockly closer to the experience of working in JupyterLab, where each cell can be manipulated independently of other cells.

We have also introduced a feature called intelliblocks [13]. Intelliblocks are blocks that are dynamically configured by querying the kernel for string completions and variable information (i.e. intellisense queries). Intelliblocks first query the variable named by the block for type information, e.g. pd in Figure 1, and then query all the children of that variable for both completions, e.g. pd., and type information. Intelliblocks appear to solve the block authoring problem of Blockly and allow it to be scaled up to arbitrary libraries. Without a solution like intelliblocks, a human author would be required to make thousands of blocks for a library of sufficient size, and the user would then need to navigate through all these blocks to find the ones needed. Through our extension, intelliblocks allow research on blocks-based programming without any additional authoring effort.

The extension logs both Jupyter and Blockly process data. The logging is controlled by three query string parameters that may be appended to JupyterHub links distributed to participants: log=xxx, which enables logging to the specified URL endpoint via POST requests; id=xxx, which logs with the specified participant identifier; and bl=1, which sets the Blockly extension to auto-open in split-pane view. Each datum is logged in JSON format with a payload that is either a string or a JSON object. In the case of JupyterLab data, the format is fairly simple, as shown in Table 1. However, in the case of Blockly, the data is rather dense and complex. For example, a move event includes the block id, the previous x/y position, and the current x/y position, in addition to ids for the larger group of attached blocks and the current blocks workspace. Because of this complexity, the Blockly events are logged exactly as they appear in execution, again in JSON format.

Table 1: Blockly Extension Logged Data

Source      Name                Payload
JupyterLab  execute-code        code executed
JupyterLab  execute-code-error  message / stack trace
JupyterLab  active-cell-change  contents of active cell
JupyterLab  xml-to-blocks       xml string
JupyterLab  block-to-code       code / xml string
JupyterLab  notebook-changed    new notebook name
Blockly     block-create        event object
Blockly     block-delete        event object
Blockly     block-change        event object
Blockly     block-move          event object
Blockly     var-create          event object
Blockly     var-delete          event object
Blockly     var-rename          event object
Blockly     ui-selected         event object
Blockly     ui-category         event object
Blockly     ui-click            event object
Blockly     ui-commentOpen      event object
Blockly     ui-mutatorOpen      event object
Blockly     ui-warningOpen      event object
Blockly     ui-theme            event object

3. SELF-EXPLANATION EXTENSION
The self-explanation effect is a well-known and well-studied effect in which asking a student to produce self-explanations enhances learning [3, 4, 11]. Self-explanation prompts elicit unstructured input from students about their thinking, and so represent a rich source of data for understanding their thought processes. Unlike the Blockly extension described in Section 2, there is no need for a side pane with self-explanations; rather it is more parsimonious to prompt for self-explanations within the notebook itself.

We have created a self-explanation extension that automatically adds self-explanation prompts to each code cell, as shown in Figure 2. Below each prompt is a text entry box for the student’s explanation and a button to save their explanation. While the student types, the font of the text is red, until they press the save button, at which point it is logged and the text changes to black. Similar to the Blockly extension, the self-explanation extension can be configured using query string parameters for id, log, and se=1 (to enable the extension). The data is POSTed to the endpoint specified by log and consists of the contents of the code cell and the self-explanation. Pairing the code and self-explanation ensures that they are properly analyzed together as a snapshot in time, as the student is always free to rewrite the code, self-explanation, or both.

4. HTML INJECTION EXTENSION
The last extension we present, the HTML injection extension, is quite different from the others in that it does not intrinsically collect data. Rather, this extension is a template for creating extensions that can be used for various purposes, including collecting data. We developed this extension in response to a particular problem: displaying hosted videos embedded in JupyterLab. Typically this is done by executing Python and using associated widgets, but for experimental purposes we wanted to present embedded videos without associated code. Surprisingly, this is not possible under the JupyterLab security model, which forbids JavaScript running in Markdown cells.

To circumvent this limitation, we use the metadata associated with the Markdown cell to specify arbitrary HTML and JavaScript, which we then inject into the Markdown cell outside the standard JupyterLab rendering of the notebook. The extension allows for easy embedding of video in the JupyterLab notebook, as shown in Figure 3. Any other rich media disallowed by JupyterLab’s security model can be embedded in the same way, as can JavaScript that executes data collection functions. However, we note two caveats with this approach. First, the elements specified in the metadata will not render on servers that do not have this extension installed, making the notebooks that depend on it non-portable to some extent. Second, the extension allows for non-obvious code to be run when a notebook loads, so it is not recommended for use outside the research context.

5. CONCLUSION
We have presented three JupyterLab extensions to enable educational data mining research in CSEd and data science. Two of these, the Blockly extension and the self-explanation extension, support direct logging of data to a standard endpoint for POST requests. The Blockly extension provides clickstream-level data for blocks manipulation that can be used for future research on learning programming. The self-explanation extension provides rich natural language data reflecting student reasoning during problem solving. The third extension, the HTML injection extension, can be used flexibly for presentation of rich media or for data collection. All extensions are open-source and distributed through NPM at https://www.npmjs.com/~aolney. We hope these extensions will enable future work using JupyterLab as a source of student data for educational data mining.

6. ACKNOWLEDGMENTS
This material is based upon work supported by the National Science Foundation under Grants 1918751 and 1934745 and by the Institute of Education Sciences under Grant R305A190448. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or the Institute of Education Sciences.

7. REFERENCES
[1] M. Armoni, O. Meerbaum-Salant, and M. Ben-Ari. From Scratch to “real” programming. ACM Transactions on Computing Education, 14(4):25:1–25:15, Feb. 2015.
[2] D. Bau, J. Gray, C. Kelleher, J. Sheldon, and F. Turbak. Learnable programming: Blocks and beyond. Communications of the ACM, 60(6):72–80, May 2017.
[3] M. T. H. Chi, M. Bassok, M. W. Lewis, P. Reimann, and R. Glaser. Self-explanations: How students study and use examples in learning to solve problems. Cognitive Science, 13:145–182, 1989.
[4] M. T. H. Chi, N. de Leeuw, M. H. Chiu, and C. LaVancher. Eliciting self-explanations improves understanding. Cognitive Science, 18(3):439–477, 1994.
[5] W. Dann, D. Cosgrove, D. Slater, D. Culyba, and S. Cooper. Mediated transfer: Alice 3 to Java. In Proceedings of the 43rd ACM Technical Symposium on Computer Science Education, SIGCSE ’12, pages 141–146, New York, NY, USA, 2012. ACM.
[6] Google. Blockly, 2019. original-date: 2013-10-25T21:13:33Z.
[7] J. B. Hamrick. Creating and grading IPython/Jupyter notebook assignments with NbGrader. In Proceedings of the 47th ACM Technical Symposium on Computer Science Education, SIGCSE ’16, pages 242–242, New York, NY, USA, 2016. ACM.
[8] Kaggle. The state of ML and data science 2017, 2017.
[9] C. M. Lewis. How programming environment shapes perception, learning and goals: Logo vs. Scratch. In Proceedings of the 41st ACM Technical Symposium on Computer Science Education, SIGCSE ’10, pages 346–350, New York, NY, USA, 2010. ACM.
[10] B. Moskal, D. Lurie, and S. Cooper. Evaluating the effectiveness of a new instructional approach. ACM SIGCSE Bulletin, 36(1):75–79, Mar. 2004.
[11] T. J. Nokes, R. G. M. Hausmann, K. VanLehn, and S. Gershman. Testing the instructional fit hypothesis: the case of self-explanation prompts. Instructional Science, 39(5):645–666, Sept. 2011.
[12] K. J. O’Hara, D. Blank, and J. Marshall. Computational notebooks for AI education. In I. Russell and W. Eberle, editors, Proceedings of the Twenty-Eighth International Florida Artificial Intelligence Research Society Conference, pages 263–268. AAAI Press, 2015.
[13] A. M. Olney, S. D. Fleming, and J. C. Johnson. Learning data science with Blockly in JupyterLab. In Proceedings of the 52nd ACM Technical Symposium on Computer Science Education, SIGCSE ’21, page 1373, New York, NY, USA, 2021. Association for Computing Machinery.
[14] B. Peng, G. Wang, J. Ma, M. C. Leong, C. Wakefield, J. Melott, Y. Chiu, D. Du, and J. N. Weinstein. SoS notebook: an interactive multi-language data analysis environment. Bioinformatics, 34(21):3768–3770, 2018.
[15] T. W. Price and T. Barnes. Comparing textual and block interfaces in a novice programming environment. In Proceedings of the Eleventh Annual International Conference on International Computing Education Research, ICER ’15, pages 91–99, New York, NY, USA, 2015. ACM.
[16] M. Resnick, J. Maloney, A. Monroy-Hernández, N. Rusk, E. Eastmond, K. Brennan, A. Millner, E. Rosenbaum, J. Silver, B. Silverman, and Y. Kafai. Scratch: Programming for all. Communications of the ACM, 52(11):60–67, Nov. 2009.
[17] A. Rule. We analyzed 1 million Jupyter notebooks – now you can too.
[18] H. Shen. Interactive notebooks: Sharing the code. Nature News, 515(7525):151, Nov. 2014.
[19] E. Tulving. How many memory systems are there? American Psychologist, 40(4):385–398, 1985.
[20] G. Wilson, F. Perez, and P. Norvig. Teaching computing with the IPython Notebook (abstract only). In Proceedings of the 45th ACM Technical Symposium on Computer Science Education, SIGCSE ’14, pages 740–740, New York, NY, USA, 2014. ACM.

APPENDIX
A. FIGURES
Additional figures that support the main text are shown below.

Figure 1: The JupyterLab Blockly extension showing the Blockly workspace on the left and notebook on the right. The workspace shows a dynamically generated intelliblock for pandas with the possible function calls from the pd alias. Tooltips for pd and the functions are not shown due to space limitations. The notebook on the right shows the code generated from a previous import block with the commented serialized XML used to regenerate that workspace when the cell is clicked again using notebook sync.

Figure 2: The JupyterLab self-explanation extension showing the text entry box it appends to the bottom of every code cell. As the student types, the text turns red to indicate unsaved changes. When the student presses the “save” button, the self-explanation is logged and the text changes to black.

Figure 3: The JupyterLab HTML injection extension showing a video embedded directly in the notebook markdown via the metadata for that cell, circumventing the JupyterLab security model. Note that such elements will not render if the notebook is viewed on a server without this extension installed.
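
Sections 2 and 3 describe query string parameters (log, id, bl, and se) that are appended to the links distributed to participants. The following is a minimal sketch of how a researcher might generate such links. The host names, notebook path, and the helper participant_link are hypothetical, and whether the extensions expect the log URL to be percent-encoded as done here is an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def participant_link(notebook_url, log_endpoint, participant_id,
                     open_blockly=True, self_explain=False):
    """Append the logging query parameters to a notebook link.

    Parameter names (log, id, bl, se) come from the paper; everything
    else in this helper is an illustrative assumption.
    """
    params = {"log": log_endpoint, "id": participant_id}
    if open_blockly:
        params["bl"] = "1"   # auto-open Blockly in split-pane view
    if self_explain:
        params["se"] = "1"   # enable the self-explanation extension
    scheme, netloc, path, query, fragment = urlsplit(notebook_url)
    extra = urlencode(params)  # percent-encodes the endpoint URL
    query = f"{query}&{extra}" if query else extra
    return urlunsplit((scheme, netloc, path, query, fragment))


# Hypothetical hub address, endpoint, and participant identifier.
link = participant_link(
    "https://hub.example.org/user/p042/lab/tree/lesson1.ipynb",
    "https://logs.example.org/datum",
    "p042",
)
print(link)
```

Generating one link per participant in this way keeps the identifier out of the notebook itself, so the same notebook file can be shared across a study.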
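
The intelliblocks description in Section 2 (query a variable for its type, then query its children for completions and type information) can be illustrated with plain Python introspection. This is only an analogy under stated assumptions: the extension queries the running kernel, whereas this sketch calls dir() and type() locally, and the standard-library json module stands in for a library alias such as pd. The function describe and the shape of its result are hypothetical.

```python
import json  # stands in for an imported library alias such as pd


def describe(name, namespace):
    """Return type information for a variable and for each public child."""
    obj = namespace[name]
    children = []
    for child in dir(obj):
        if child.startswith("_"):
            continue  # skip private and dunder attributes
        value = getattr(obj, child)
        children.append({
            "completion": f"{name}.{child}",   # e.g. "json.dumps"
            "type": type(value).__name__,
            "callable": callable(value),
        })
    return {"name": name, "type": type(obj).__name__, "children": children}


info = describe("json", {"json": json})
callables = [c["completion"] for c in info["children"] if c["callable"]]
print(info["type"], callables)
```

A block configured from such a result could offer the callable children as dropdown entries, which is the kind of authoring effort the paper says intelliblocks avoid.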
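
Section 4 explains that the HTML injection extension reads HTML and JavaScript from the metadata of a Markdown cell. The sketch below builds a minimal notebook document with such a cell. The cell and notebook fields follow the standard notebook JSON layout, but the metadata key "html" and the video URL are hypothetical, since the paper does not state the key the extension actually reads.

```python
import json

# HYPOTHETICAL metadata key; the paper does not name the real one.
INJECT_KEY = "html"


def markdown_cell_with_html(text, html):
    """Build a Markdown cell whose metadata carries HTML to inject."""
    return {
        "cell_type": "markdown",
        "metadata": {INJECT_KEY: html},
        "source": text,
    }


# Hypothetical hosted video embedded through an iframe.
video = ('<iframe src="https://video.example.org/embed/abc" '
         'width="560" height="315"></iframe>')

notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [markdown_cell_with_html("Watch the video below.", video)],
}

serialized = json.dumps(notebook, indent=1)
print(serialized)
```

Because the HTML lives only in cell metadata, a server without the extension would show just the Markdown text, which matches the portability caveat noted in Section 4.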