Privacy of Crowdsourcing Educational Platforms in the Light of New EU Regulations

Katerina Zdravkova
University Ss. Cyril and Methodius, Faculty of Computer Science and Engineering
Rudjer Boshkovikj 16, 1000 Skopje, Macedonia
katerina.zdravkova@finki.ukim.mk

EnetCollect WG3 & WG5 Meeting, 24-25 October 2018, Leiden, Netherlands

Abstract

Many crowdsourcing systems enable anonymous access and the opportunity to namelessly contribute self-generated content without providing any personal data. However, Internet browsers collect metadata on a large scale, and learning management systems (LMSs) collect and store many identity and contact data. System administrators and the teachers responsible for the courses can access them at any time. Interactive activities embedded in the LMS can reveal sensitive data, such as religious beliefs, political views, health, sexual orientation, race, or membership of organizations. They are visible to all the enrolled students. Educational organizations that host LMSs also collect a lot of data, which is usually transferred to third countries and transmitted to third parties, including university researchers, outside companies, and often even governments. This paper examines the challenges of a prospective crowdsourcing platform intended for education, which must be taken into consideration by design. It presents examples of violated privacy in education, the student protection regulations, and the privacy concerns of learning management systems. The compliance of the most popular LMSs, MOOCs and crowdsourcing systems with GDPR is examined and compared. The paper concludes with the privacy policy guidelines of the prospective crowdsourcing educational platform in the light of GDPR.

Keywords: Crowdsourcing, GDPR, LMS, rights of data subject

1. Introduction

Many crowdsourcing systems enable anonymous access and the opportunity to namelessly contribute self-generated content without providing any personal data (Halder, 2014). However, Internet browsers, which support the functioning of crowdsourcing platforms, collect metadata on a large scale (Soltani and Seno, 2017). Digital traces include: the users' IP address, their exact location, time zone and language, the type of device used (PC, laptop, tablet, mobile), hardware features (CPU, graphics cards, RAM specifications), the operating system, the screen resolution, the battery level, the moment and the duration of accessing the browser, as well as the installed browser plugins. Together, these facts form a browser fingerprint, which is a very accurate means of identifying unique browsers and tracking online activities (Eckersley, 2010). Moreover, servers send HTTP cookies to the user's browser, such as authentication cookies, user preferences and settings, which are stored on the user's computer. Since data collection and cookie depositing are almost unavoidable, and permitted according to most privacy protection laws, crowdsourcing can be considered privacy safeguarded per se.

New learning management systems collect and store a lot of identity and contact data, such as student ID, name, e-mail and picture, in addition to a list of server logs, all activities undertaken, their duration, grades of learning assignments, and the browser type and language (Flanagan and Ogata, 2017). System administrators and all the teachers responsible for the course can access them at any time. Educational organizations that host LMSs collect additional identifiable data. Student records are sometimes extensive and completely incompatible with modern laws, which tend to minimize the amount of personally identifiable information. Moreover, the collected data are usually transmitted to third parties via government agencies, mainly to education researchers (Joiner, 2018). Interactive activities embedded in the LMS, such as wikis, discussion forums and blogs, are always associated with the name and the picture of the content provider, who can be either a teacher or another student enrolled in the same course (Poore, 2015).

When students equate their performance within the interactive educational system with their behaviour in social media, they can accidentally reveal some sensitive data, like their religious and political views, health status, sexual orientation, race, and membership of organizations, or intentionally impose their dogmas, enforced decisions, or beliefs (Zdravkova, 2016). Once posted, this information could remain visible to all the course participants. These issues are a further privacy threat that has so far usually not been addressed at all (Drummond and Fischhoff, 2017).

In 2016, the EU adopted the General Data Protection Regulation (GDPR), which became enforceable in May 2018 (European Commission, 2018). It strengthens the rules on personally identifiable information, its processing and its free movement. GDPR's main purpose is "to enhance data protection rights of individuals and to improve business opportunities by facilitating the free flow of personal data in the digital single market". It harmonized the protection of "fundamental rights and freedoms" in the context of technological developments, globalization, and the increasing scale of data collection and sharing, with regard to the necessity of the free flow of personal data, not only within the EU, but also towards third countries.

Educational crowdsourcing systems are a symbiosis of crowdsourcing systems and LMSs. For educational purposes, most of the previously mentioned data and metadata should inevitably be collected. Responsible platforms should enable their processing, accessing, sharing and transfer to third parties and countries while obeying precisely the privacy protection principles. New EU regulations affect the creation of privacy policies of educational crowdsourcing.

This paper examines the challenges of a crowdsourcing platform intended for education, which should be taken into consideration prior to its launching. It continues with examples of violated privacy in education, privacy concerns of learning management systems, and student protection regulations. In section 3, the compliance of the most popular LMSs, MOOCs and crowdsourcing systems with GDPR is examined and compared. Section 4 is dedicated to enetCollect's affiliated organisations EURAC and ILIAS. The paper concludes with the privacy policy guidelines of a prospective crowdsourcing platform.

2. Privacy in education

One of the major imperatives of the European higher education area (EHEA) is student-centred learning, which promotes a supportive and inspiring learning environment based on innovative teaching methods, pedagogical innovation and digital technologies (Bergan and Deca, 2018). The effectiveness of digitally supported education highly depends on well-established privacy protection (Zeide and Nissenbaum, 2018). Privacy concerns additionally grow due to the emergence of MOOCs alongside the existing online learning management systems (Sandeen, 2013). They enable universal access, which amplifies their disruptive nature (Jones and Regner, 2016). The involvement of many non-educational institutions in the MOOCs additionally complicates the intention to establish strict privacy policy regulations. The following subsections observe three aspects: examples of violated privacy, general privacy concerns of learning management systems, and the privacy protection regulations applied to education.

2.1 Violated privacy in education

Suzanne Widup's (2010) exhaustive report revealed that from 2005 to 2009, more than 2 800 data breach incidents occurred, 549 of them in educational organizations. The number of breached records exceeded 10 million (Widup, 2010). According to this report, one of the crucial reasons for such a high occurrence of data violations in education was the absence of monitoring systems that might prevent the malicious use of student data. Another recent report found that larger universities, universities with more financial resources, and universities with weak privacy policies were more susceptible to data breaches (Mello, 2018).

The DLA Piper study reports almost 60 000 data breaches in Europe after the introduction of GDPR, more than one sixth of them in the UK (DLA Piper, 2019). Most notifications came from private and public organisations in the Netherlands, Germany and the UK. Even though the report doesn't highlight the type of the organisation, it is very realistic that at least 10 000 belong to educational establishments.

2.2 Privacy concerns of learning environments

Academic analytics has become an inevitable and very reliable tool for assessment and auditing of education (Campbell, DeBlois and Oblinger, 2007). It is usually combined with educational data mining, "providing useful insights into student behavior online" (Baepler and Murdoch, 2010). The process of gathering, analysing, and presenting student data is usually performed within learning management systems. Student data have nowadays expanded to big data (Chen, Mao and Liu, 2014; Godwin-Jones, 2017). Their huge volume makes them a fruitful arena for rich data analysis, which increases the possibility of uncontrolled data mining and significantly reduces privacy (Johnson, 2014).

An additional problem is the redirection of the traditional eLearning methods towards cloud services, where privacy and security issues are a real challenge (Sen, 2015).

However, the greatest privacy challenge for the learners and their teachers is the opportunity to generate interactive content, where all the uploaded information is visible to all other participants of the course, and the authorship is associated with its creator. Even when the content is erased, the traces of its existence remain permanent.

2.3 Student protection regulations

Most LMSs, MOOCs and crowdsourcing projects are hosted in the US, and are used massively outside of it, which led to the necessity to establish a reasonable framework in order to avoid prospective international conflicts. In spite of many regulations, such as FERPA, PPRA, IDEA and COPPA, there is not a single comprehensive federal U.S. law regulating the collection and use of personal data (https://www.usa.gov/privacy)¹. To handle the problem, mutual EU-US and Swiss-US privacy agreements have been established. They regulate data privacy, safety and security, as well as cross-border data transfers. The two frameworks are standardised for all other European national privacy regulations, so if an organization is compliant with GDPR, it is very probable that it also fulfils the national regulations.

¹ All the online resources, privacy policies and terms of use were last retrieved on 10th April 2019.

3. Compliance with GDPR

The new EU privacy protecting regulations contain 99 articles divided into 11 chapters (EC, 2018). For the prospective crowd-oriented learning system, it is essential to study the "rights of data subject" (chapter 3), where "data subject" is any "identified or identifiable natural person", and the "transfers of personal data to third countries or international organisations" (chapter 5). Article 85, which deals with the "processing and freedom of expression and information", might also be decisive for enetCollect. If the rights of data subject and the cross-border data transfers are not carefully established, the "remedies, liability and penalties" from chapter 8 will apply. They can be gigantic, like the fine of 50 million EUR, which was imposed on Google by the French data protection watchdog (DLA Piper, 2019).

The basic rights of data subject of the most popular LMSs, MOOCs and crowdsourcing systems are presented in Table 1, which appears at the end of the paper. GDPR rights are clustered into five sections: transparency and modalities, information and access to personal data, rectification and erasure, right to object and automated individual decision-making, and restrictions (EC, 2018). The compliance of the educational systems with them is judged according to their privacy notes and terms of use. The defined criteria for each are presented in the following five paragraphs.

Compliance with the transparency and modalities items means, among other things: the existence of an appointed controller; provided written or oral information related to data processing; provided information related to data transfers to a third country or to an international organisation; the controller's duty to protect data processing; protection of the data subject from any legal effects based solely on automated processing; and implementation of suitable measures to safeguard the data subject's rights, freedoms and legitimate interests.

Information and access to personal data refer to: the purpose of data collection; contact details of the controller; the recipients of collected data; the period of storing the data; the right to access the data; the right to demand an erasure of personal data; the right to restrict processing; detailed information about accessing data; and direct access to collected data.

Rectification and erasure clauses imply that the data subject has: the right to demand a rectification of inaccurate personal data; the right to erasure ('right to be forgotten'); the right to restriction of processing; the right to notification that any of the three preceding actions has been performed; and the right to receive the personal data.

The right to object and automated individual decision-making comprise the right to object to data processing at any time, and the right to object to data processing for direct marketing purposes.

Restriction refers to a limited scope of obligations in special circumstances related to the fundamental rights and freedoms, and to the safeguarding of democratic society.

Blackboard is one of the leading LMSs and, in its own words, the #1 Global Education Software Provider. With more than 100 million users, Blackboard must guarantee the best conditions, including privacy. Blackboard has a very strict and detailed privacy policy, which is EU-U.S. Privacy Shield certified. The compliance with GDPR is presented in the 21-page GDPR White Paper.

Canvas is Instructure's LMS with more than 18 million users (instructure.com), intended for K-12 and university students. In parallel with the privacy policy, Canvas has extensions for the residents of the EU and Switzerland. Canvas is also dedicated to adapting its own privacy policy to GDPR. It is self-certified under the EU-U.S. Privacy Shield. Recently, there were complaints about data treatment and third parties (privacy.commonsense.org/evaluation/canvas).

With more than 300 million users and the "world's largest collection of language-learning data", Duolingo is the biggest educational community dedicated to language learning, which presents completely crowdsourced language courses (ai.duolingo.com/). It has the most comprehensive privacy policy, which carefully covers all the privacy, safety and security rights of data subject (duolingo.com/privacy). In spite of the declared readiness to protect users' data, the application is criticized for "third-party advertising or tracking services" (privacy.commonsense.org/evaluation/duolingo).

Intended for K-12, Edmodo is another example of a learning management system with a detailed privacy policy (go.edmodo.com/privacy-policy/) and terms of service. These regulations are not fully compatible with GDPR, but still offer significant rights to data subjects. In May 2017, Edmodo suffered a severe data breach, which affected 77 million users (EHL, 2017).

EdX is an open-source platform and MOOC provider with more than 130 partners and 18 million users. They claim: "edX is making a good faith effort to comply, given our global reach with learners and partners." The privacy policy proves it (edx.org/edx-privacy-policy).

FutureLearn is a digital educational platform "wholly owned by The Open University" (futurelearn.com/about-futurelearn). The highly experienced OU prepared a very concise and fully GDPR compliant privacy policy (about.futurelearn.com/terms/privacy-policy).

Khan Academy is a global multilingual classroom for millions of users. Its privacy policy is carefully prepared, and it includes special clauses for European users only (khanacademy.org/about/privacy-policy).

Mechanical Turk's privacy notice redirects towards Amazon, whose privacy notice has not been recently updated (mturk.com/privacy-notice), thus it is hardly compliant with GDPR. This might have been crucial for its unethical use in harvesting Facebook profiles and manipulating people (EFF, 2018).

Moodle is the most popular open source LMS, with almost 150 million registered users (moodle.net/stats/), and strives for the highest ethical standards. MoodleDocs privacy rights are compatible with GDPR at all points. However, in January 2019, Moodle experienced an outage (Greidanos, 2019). Unlike Edmodo, which suffered a data breach, Moodle suffered from a lack of reliability.

SAP SuccessFactors is a cloud provider with 120 million users, whose cloud security and data privacy are carefully designed and maintained, providing complete compliance with privacy and security standards worldwide (www.successfactors.com/content/ssf-site/en/about/privacy.html).

In parallel with the rights of data subjects, the compliance of all the studied platforms with Article 85 was also examined. After an exhaustive examination of their corresponding policies, it was noticed that none mentions the freedom of expression and information. An exception is Moodle, which contains a word censorship filter, intended to disable the submission of "obscene or other unwanted words in the text" within forums and wikis (https://docs.moodle.org/36/en/Word_censorship_filter). It can be misused to restrict free expression, because the censor.php file can be tailored to disable some word strings.

Most observed educational and crowdsourcing systems have shown a very high social responsibility and a serious concern about the privacy rights of their users. Unfortunately, the abuse of users' confidence has occurred in both observed crowdsourcing systems.

4. EnetCollect and new EU regulations

The major motivation of this study was to discover the deficiencies of the related educational platforms in order to avoid them carefully while creating enetCollect's crowd-oriented language learning system. It was concluded that, declaratively, all of them respect the rights of data subject and pay attention to information security. Well-established policies and terms of use converge to some general rules and recommendations, which should be taken into consideration for the prospective platform.

It is very probable that the platform provider will be selected from the two technically most engaged partners of the action: EURAC or ILIAS. Namely, the official presentation of enetCollect is hosted by EURAC (http://enetcollect.eurac.edu/), while the intranet website is available from ILIAS (https://enetcollect.net/). How compliant are they with the new EU regulations?

EURAC research has a privacy policy which has been recently adjusted according to EU Regulation 2016/679 (eurac.edu/en/aboutus/Pages/Privacy.aspx). However, it warns the users about the use of Google Analytics, without an immediate possibility to "decline the use of cookies". Furthermore, the website "may use the third-party cookies", including some social plugins. With these official announcements, EURAC research disclaims responsibility for any privacy violation.

Although ILIAS is a multi-language open-source LMS, its privacy policy, or more precisely, its terms of service are presented in German only (docu.ilias.de/ilias.php?cmd=showTermsOfService&cmdClass=ilstartupgui&cmdNode=k8&baseClass=ilStartUpGUI). The policy starts with the intellectual property rights under GPL, carries on with the limitations of inappropriate content, and continues with data protection. The compliance with GDPR is not explicitly highlighted, but all the rights of data subject are carefully examined. The possibility of using the LMS by people with blindness or visual impairments, which is guaranteed by the Marrakesh Treaty, is not enabled (www.wipo.int/marrakesh_treaty/en/). This is the only system that reveals the responsible authority for all the data protection issues (http://www.ldi.nrw.de).

5. Conclusion

EnetCollect's crowdsourcing framework for language learning can initially adopt EURAC's prudent privacy policy. Privacy notes should be accompanied by terms of use and by a rational acceptable use policy. Furthermore, the Marrakesh Treaty should also be taken into consideration, to enable access to learning resources for all the learners and teachers, without any disability discrimination. A corresponding national regulation in the UK is the Equality Act (equalityhumanrights.com/en/equality-act).

All the issues pointed out are primarily recommended for enetCollect's framework, but they are also applicable to all the existing or new educational platforms worldwide, including the crowd-oriented ones.

After alerting the prospective users about all these documents, a written consent about data privacy and intellectual property should be obtained from all of them. But first, the users should be properly introduced to the documents and advised to read them carefully. To that end, the documents should be as clear as possible, very concise and easily comprehensible.

To guarantee that all the sensitive student information is safeguarded, the regulations defined should be obeyed with no exclusions. Accountability measures should be strict. Otherwise, enetCollect's system will be one of those experiments which impose "privacy concerns and the safety of student data as obstacles" (Johnson et al., 2016).

6. Bibliographical References

Baepler, P., & Murdoch, C. J. (2010). Academic analytics and data mining in higher education. Int. journal for the scholarship of teaching and learning, 4(2), 17.
Bergan, S., & Deca, L. (2018). Twenty years of Bologna and a decade of EHEA: what is next? European higher education area: The impact of past and future policies (pp. 295-319). Springer, Cham.
Campbell, J. P., DeBlois, P. B., & Oblinger, D. G. (2007). Academic analytics: A new tool for a new era. EDUCAUSE review, 42(4), 40.
Chen, M., Mao, S., & Liu, Y. (2014). Big data: A survey. Mobile networks and applications, 19(2), 171-209.
DLA Piper (2019). DLA Piper GDPR data breach survey: https://www.dlapiper.com/en/uk/insights/publications/2019/01/gdpr-data-breach-survey/
Drummond, C., & Fischhoff, B. (2017). Individuals with greater science literacy and education have more polarized beliefs on controversial science topics. Proceedings of the National Academy of Sciences, 114(36): 9587-9592.
Eckersley, P. (2010). How unique is your web browser? International Symposium on Privacy Enhancing Technologies Symposium, Springer: 1-18.
EFF (2018). Yet another lesson from the Cambridge Analytica fiasco: Remove the barriers to user privacy control, https://www.eff.org/deeplinks/2018/03/why-we-didnt-make-fix-my-facebook-privacy-settings-tool
EHL, Edmodo Help Center (2017). Important notice about your Edmodo account: https://support.edmodo.com/hc/en-us/articles/115007376848-Important-Notice-About-Your-Edmodo-Account
EC, European Commission (2018). Data protection: 2018 reform of EU data protection rules, https://eur-lex.europa.eu/eli/reg/2016/679/oj
Flanagan, B., & Ogata, H. (2017). Integration of learning analytics research and production systems while protecting privacy. The 25th International Conference on Computers in Education, New Zealand: 333-338.
Godwin-Jones, R. (2017). Scaling up and zooming in: Big data and personalization in language learning. Language Learning & Technology, 21(1), 4-15.
Greidanos, P. (2019). Moodle.org outage and data loss: https://moodle.org/news/#p1535490
Halder, B. (2014). Evolution of crowdsourcing: potential data protection, privacy and security concerns under the new media age. Revista Democracia Digital e Governo Eletrônico, 1(10): 377-393.
Johnson, J. A. (2014). The ethics of big data in higher education. International Review of Information Ethics, 21(21), 3-10.
Johnson, L., Becker, S. A., Cummins, M., Estrada, V., Freeman, A., & Hall, C. (2016). NMC horizon report: 2016 higher education edition (pp. 1-50). The New Media Consortium.
Joiner, M. C. (2018). To see or not to see: the constant conflict between promoting public access to information whilst maintaining confidentiality, Student Records.
Jones, M. L., & Regner, L. (2016). Users or students? Privacy in university MOOCS. Science and engineering ethics, 22(5): 1473-1496.
Mello, S. (2018). Data Breaches in Higher Education Institutions, University of New Hampshire.
Poore, M. (2015). Using social media in the classroom: A best practice guide. Sage.
Sandeen, C. (2013). Assessment's place in the new MOOC world. Research & practice in assessment, 8, 5-12.
Sen, J. (2015). Security and privacy issues in cloud computing. Cloud technology: concepts, methodologies, tools, and applications (pp. 1585-1630). IGI Global.
Soltani, S., & Seno, S. A. H. (2017). A survey on digital evidence collection and analysis. 7th International Conference on Computer and Knowledge Engineering (ICCKE), IEEE: 247-253.
Widup, S. (2010). The leaking vault: Five years of data breaches. Digital Forensics Association, 1-42.
Zdravkova, K. (2016). Reinforcing social media based learning, knowledge acquisition and learning evaluation. Procedia - Social and Behavioral Sciences, 228: 16-23.
Zeide, E., & Nissenbaum, H. (2018). Learner privacy in MOOCs and virtual education. Theory and Research in Education, 16(3), 280-307.

Rights of data subject | Transparency and modalities | Access to personal data | Rectification and erasure | Right to object | Restrictions
Blackboard (blackboard.com) | Partial compliance | Complete compliance | Complete compliance | Complete compliance | Complete compliance
Canvas (canvaslms.com) | Partial compliance | Complete compliance | Complete compliance | Partial compliance | Partial compliance
Duolingo (duolingo.com) | Complete compliance | Complete compliance | Complete compliance | Complete compliance | Complete compliance
Edmodo (edmodo.com) | Partial compliance | Complete compliance | Not designated | Not designated | Complete compliance
EdX (www.edx.org) | Complete compliance | Complete compliance | Complete compliance | Complete compliance | Not designated
FutureLearn (futurelearn.com) | Complete compliance | Complete compliance | Complete compliance | Complete compliance | Complete compliance
Khan Academy (khanacademy.org) | Complete compliance | Complete compliance | Complete compliance | Complete compliance | Complete compliance
Mechanical Turk (mturk.com) | Partial compliance | Not designated | Not designated | Not designated | Not designated
Moodle (Moodle.org) | Complete compliance | Complete compliance | Complete compliance | Complete compliance | Complete compliance
SAP SuccessFactors (successfactors.com) | Complete compliance | Complete compliance | Complete compliance | Complete compliance | Complete compliance

Table 1: Rights of data subjects in learning management systems and crowdsourcing platforms
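As a reading aid for Table 1, the ratings can be tallied per platform and per right. The following Python sketch is only an illustration: the one-letter encoding, the short column labels and the function names are assumptions introduced here, and the row values are transcribed from the table as it could be recovered from the extracted text, so they should be checked against the original layout before being relied on.

```python
from collections import Counter

# Column order of Table 1 (short labels are hypothetical, chosen for this sketch).
RIGHTS = ("transparency", "access", "rectification", "objection", "restrictions")

# Hypothetical one-letter encoding of the table cells:
# C = complete compliance, P = partial compliance, N = not designated.
TABLE_1 = {
    "Blackboard": "PCCCC",
    "Canvas": "PCCPP",
    "Duolingo": "CCCCC",
    "Edmodo": "PCNNC",
    "EdX": "CCCCN",
    "FutureLearn": "CCCCC",
    "Khan Academy": "CCCCC",
    "Mechanical Turk": "PNNNN",
    "Moodle": "CCCCC",
    "SAP SuccessFactors": "CCCCC",
}


def fully_compliant(table):
    """Return the platforms rated 'complete' in every column."""
    return [name for name, row in table.items() if set(row) == {"C"}]


def tally_by_right(table):
    """Count the C/P/N ratings in each column of the table."""
    return {
        right: Counter(row[i] for row in table.values())
        for i, right in enumerate(RIGHTS)
    }


if __name__ == "__main__":
    print(fully_compliant(TABLE_1))
    for right, counts in tally_by_right(TABLE_1).items():
        print(right, dict(counts))
```

Under this transcription, five of the ten platforms are rated complete in all five columns, and transparency and modalities is the column with the most partial ratings.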