Containerizing an eTextbook Infrastructure

Alexander Hicks
Virginia Tech
Blacksburg, United States
alexhicks@vt.edu

Clifford A. Shaffer
Virginia Tech
Blacksburg, United States
shaffer@vt.edu

ABSTRACT
The CS Education community has developed many educational tools in recent years, such as interactive exercises. Often the developer makes them freely available for use, hosted on their own server, and usually they are directly accessible within the instructor’s LMS through the LTI protocol. As convenient as this can be, instructors using these third-party tools for their courses can experience issues related to data access and privacy concerns. The tools typically collect clickstream data on student use. But they might not make it easy for the instructor to access these data, and the institution might be concerned about privacy violations. While the developers might allow and even support local installation of the tool, this can be a difficult process unless the tool is carefully designed for third-party installation. And integration of small tools within larger frameworks (like a type of interactive exercise within an eTextbook framework) is also difficult without proper design.

This paper describes an ongoing containerization effort for the OpenDSA eTextbook project. Our goal is both to serve our needs by creating an easier-to-manage decomposition of the many tools and sub-servers required by this complex system, and also to provide an easily installable production environment that instructors can run locally. This new system provides better access to developer-level data analysis tools and potentially removes many FERPA-related privacy concerns. We also describe our efforts to integrate Caliper Analytics into OpenDSA to expand the data collection and analysis services. We hope that our containerization architecture can help provide a roadmap for similar projects to follow.

Keywords
Containerization, Caliper Analytics, Learning Tools Interoperability

Copyright ©2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)

1. INTRODUCTION
Currently, the OpenDSA eTextbook project gives instructors and students access to a hosted version of the application for free. But this is not sufficient for all institutional users, or even for our own installation as demand and subsequent computational load grows. In principle, OpenDSA (being an open-source project) has supported users or universities seeking to deploy their own copy. But this could only be done by following a complex set of instructions.

This paper describes our work to simplify the installation process. We have many motivations for this. At first we wanted to allow instructors to host this tool on their own, both to reduce our server load and to avoid complications when their university requires strong privacy restrictions. Hosting the tool locally also lets them gain access to additional student data that is being collected. So, this work began as an effort to create a new development environment for OpenDSA that would work on any platform. Once we saw the benefits of this new development environment, and realized the benefits of keeping our development and production environments as similar as possible, we started creating the production environment described below. In particular, we have now made it easy for the many students who work on the project to spin up a complete development environment for any part of the system, a great savings in effort for both those students and the project managers.

The OpenDSA system has two major parts that needed to be containerized: a front-end content delivery server and a back-end LTI and data collection server [1, 11]. In this paper, the front-end server will be referred to as OpenDSA and the back-end server is called OpenDSA-LTI or LTI. These two parts work together to serve content to students using the LTI protocol [2]. OpenDSA compiles the book instances from RST files into HTML files (using Sphinx [5]) that can be displayed in a user’s browser. Then OpenDSA-LTI communicates with the learning management system (LMS) to serve content files located on the server to the student.

2. BACKGROUND
Containers are lightweight and portable packages of software that can run anywhere their runtime is supported [8]. There are several different container providers, but we selected Docker as our container platform due to its familiarity and its current status as an industry standard for containerization [10].

A containerized application has several benefits over traditionally packaged applications, including portability and ease of setup. With Docker, the only program a user will need to install on their server is Docker, and everything else will be installed and run within Docker containers using built-in orchestration through Docker Compose. This makes it particularly easy for new developers to start working on copies of the system.

3. ARCHITECTURE
In order to containerize the existing OpenDSA architecture, we first had to investigate how the technologies that make up the current stack could be split and containerized. The current stack consists primarily of a Ruby on Rails application (OpenDSA-LTI), a MySQL database, and an Nginx web server along with several other tools and requirements.

Since OpenDSA and OpenDSA-LTI work by serving content files to the LMS, these containers, described below in Section 3.1, need to share a filesystem. Originally, these two processes ran on the same host natively, so there were no issues with where files were located as long as both components were installed in the correct location [11]. As shown in Figure 1, the previous OpenDSA system architecture had a copy of OpenDSA (the content part) within OpenDSA-LTI (the server), and maintained external connections on the host machine with the database and the web server. Additionally, a supporting visualization tool named OpenPOP ran as a separate server on the same machine, but not in a dedicated environment. Under the new organization shown in Figure 2, all of the systems are running in Docker on the same host, meaning we no longer have to worry about compatibility between OpenDSA and OpenPOP’s dependencies. All of the networking is handled by Docker rather than an Nginx configuration file. In the first attempt at containerizing this system, these two processes were combined into only one Docker container, which was functional but slow because the container required three language runtimes and thus ran many different processes. In order to split this container into two, we created a REST API around the OpenDSA functionality that handles book compilation. This API is accessed by the LTI container using HTTP requests and places the compiled books in the Docker shared directory that LTI can access.

Figure 1: Previous OpenDSA System Architecture

Figure 2: Containerized OpenDSA System Architecture

The OpenDSA production environment currently consists of five containers: OpenDSA, OpenDSA-LTI, OpenPOP, a database, and a web server. All of these images that we created are published and accessible on Docker Hub at https://hub.docker.com/orgs/opendsa/repositories and described in Section 3.1.

3.1 Component Containers
The OpenDSA container includes the OpenDSA repository (https://github.com/OpenDSA/OpenDSA). Python scripts support compiling books to a specific location on the shared filesystem that OpenDSA will serve using the LTI container. This repository also stores the configurations from which books are compiled. The OpenDSA-LTI container contains the code from https://github.com/OpenDSA/OpenDSA-LTI in a Ruby container designed by Bitnami for Rails production deployments [6]. This container consists of the Rails server, and delivers the compiled content as required by an LMS such as Canvas using the LTI protocol.

OpenPOP’s (https://github.com/OpenDSA/OpenPOP) container is structured in a way similar to the OpenDSA-LTI container. OpenPOP is an external tool that is used by OpenDSA through a different external tool, CodeWorkout, and this work includes OpenPOP in the installation rather than keeping it separate as was done previously. OpenDSA uses MySQL as the database and provides that in a separate container.

In order to avoid negative impacts from using a database in an ephemeral container, we use a Docker volume to mount the database onto the host filesystem to preserve the data if the container fails, and to allow the container to be restarted separately for routine maintenance. Finally, we include two options for a web server container. For users who can acquire their own SSL certificates, this stack will have an option to import those certificates and use them in a Traefik web server [7]. For users who do not have their own SSL certificates, there is an option to use Let’s Encrypt to automate their SSL encryption, as long as they provide a domain [9]. Both of these options require minimal overhead to set up and help keep the overall setup effort for the complete application to a minimum.

4. DISCUSSION AND FUTURE WORK
As the CS Education community grows, there is an increasing availability both of individual tools that need to be integrated into broader systems and of integrated systems such as OpenDSA. In both cases, containerization can extend their use by other parties. As the number of these systems grows, the CS Education community needs to address how to keep all of these tools interoperable and more available to instructors. We hope to provide a roadmap for other tools to follow to extend their reach. As issues around data access and privacy become more prevalent, hosting software on premises becomes more attractive to administrators, and the work presented here provides one method for doing so. In addition to the privacy benefits, the roadmap explored in this paper provides a centralized data repository that is closer to the consumers and allows for greater access and additional analysis by the instructors. Containerization also opens the possibility of a cloud-based installation using AWS, or of a more robust container orchestration system such as Kubernetes.

While the efforts described in Section 3 are ongoing, there are several other future enhancements planned for the platform. Currently, OpenDSA uses LTI 1.1, which is due to be deprecated, and it will be upgraded to LTI 1.3. Along with providing additional security, LTI 1.3 adds a new feature set and an opportunity to connect OpenDSA with other learning tools and platforms such as Caliper Analytics [3, 4]. Integrating Caliper will allow OpenDSA to collect additional, standardized data on student progress that can be shared when appropriate across platforms that use the LTI system. Further work would include breaking the Rails application into a more containerizable microservice architecture. Currently, OpenDSA-LTI is a monolith that cannot take advantage of some benefits of containerization, including scalability.

5. REFERENCES
[1] 12. OpenDSA-Introduction — OpenDSA System Documentation. https://opendsa.readthedocs.io/en/latest/Introduction.html.
[2] Basic Overview of How LTI works | IMS Global Learning Consortium. https://www.imsglobal.org/basic-overview-how-lti-works.
[3] Caliper Analytics | IMS Global Learning Consortium. https://www.imsglobal.org/activity/caliper.
[4] LTI Security Announcement and Deprecation Schedule | IMS Global Learning Consortium. https://www.imsglobal.org/lti-security-announcement-and-deprecation-schedule.
[5] Overview — Sphinx documentation. https://www.sphinx-doc.org/en/master/.
[6] Secure and Optimize a Rails Web Application with Bitnami’s Production Containers. https://docs.bitnami.com/tutorials/secure-optimize-rails-application-bitnami-ruby-production/.
[7] Traefik. https://doc.traefik.io/traefik/.
[8] What is a Container? | App Containerization | Docker. https://www.docker.com/resources/what-container.
[9] Linuxserver/docker-swag. LinuxServer.io, May 2021.
[10] Dave Bartoletti and Charlie Dai. The Forrester New Wave™: Enterprise Container Platform Software Suites, Q4 2018, 2020. https://www.docker.com/resources/report/the-forrester-wave-enterprise-container-platform-software-suites-2018, accessed on September 15, 2020.
[11] H. L. Shahin. Design and Implementation of OpenDSA Interoperable Infrastructure. Thesis, Virginia Tech, Aug. 2017.
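
As an illustration of the container split described in Section 3, where the LTI container asks the OpenDSA container over HTTP to compile a book and then reads the result from a shared directory, the following is a minimal sketch in Python. It is not the project's actual API: the `/compile` route, the JSON body with a `book` field, the `SHARED_DIR` temporary directory (standing in for the Docker shared directory), and the `compile_book` placeholder (standing in for the Sphinx build) are all assumptions made for this example. The real client is the Rails application rather than a Python function.

```python
import json
import tempfile
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

# Hypothetical stand-in for the Docker shared directory mounted by both containers.
SHARED_DIR = Path(tempfile.mkdtemp(prefix="shared_books_"))


def compile_book(name: str) -> Path:
    """Placeholder for the real Sphinx build: writes a single HTML file."""
    out_dir = SHARED_DIR / name / "html"
    out_dir.mkdir(parents=True, exist_ok=True)
    index = out_dir / "index.html"
    index.write_text(f"<html><body><h1>{name}</h1></body></html>", encoding="utf-8")
    return index


class CompileHandler(BaseHTTPRequestHandler):
    """Accepts POST /compile with a JSON body such as {"book": "demo"} (assumed shape)."""

    def do_POST(self):
        if self.path != "/compile":
            self.send_error(404)
            return
        try:
            length = int(self.headers.get("Content-Length", 0))
            book = json.loads(self.rfile.read(length))["book"]
        except (ValueError, KeyError, TypeError):
            book = None
        # Accept only plain alphanumeric names so a request cannot escape SHARED_DIR.
        if not isinstance(book, str) or not book.isalnum():
            self.send_error(400, "expected JSON body with an alphanumeric 'book'")
            return
        index = compile_book(book)
        rel_path = index.relative_to(SHARED_DIR).as_posix()
        body = json.dumps({"path": rel_path}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_server(port: int = 0) -> HTTPServer:
    """Runs the compile service in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), CompileHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Containers talk to each other directly, so ignore any proxy settings.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def request_compile(base_url: str, book: str) -> Path:
    """Plays the role of the LTI side: request a build, then locate it on the shared mount."""
    req = urllib.request.Request(
        f"{base_url}/compile",
        data=json.dumps({"book": book}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with _opener.open(req) as resp:
        return SHARED_DIR / json.load(resp)["path"]
```

Calling `start_server()` and then `request_compile("http://127.0.0.1:<port>", "demo")` returns the path of the generated `index.html` under the shared directory. The point of the design is that the two sides exchange only a small request and a relative path, while the compiled files themselves travel through the shared filesystem.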