AI-driven big web redesign: two case studies in Italian universities

Andrea Vian1,†, Daniele Pretolesi2,†, Lucia Rampino3 and Annalisa Barla4,5,∗,†

1 Dipartimento Architettura e Design, Università di Genova, Genoa, Italy
2 AIT - Austrian Institute of Technology, Vienna, Austria
3 Dipartimento di Design, Politecnico di Milano, Milan, Italy
4 Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi, Università di Genova, Genoa, Italy
5 Machine Learning Genoa Center, Università di Genova, Genoa, Italy

Abstract
This paper explores the challenges of web redesign in Public Administration (PA), particularly within universities. Universities often struggle with fragmented online presences due to distributed editorial models and diverse communication needs across research, education, and dissemination activities. Limited resources further restrict investment in upskilling staff and adopting modern technologies. Open source solutions, though cost-effective, are often chosen without considering user experience. We present a methodology that combines user-centered design, Artificial Intelligence (AI), and “radical collaboration” to achieve a future-proof and scalable redesign. Starting from a case study of a major web redesign project at an Italian university (2014-2020) involving hundreds of websites and over 200 content editors, the paper details the process, including a large-scale content audit using AI, single sourcing with AI-powered content transformation, and user experience (UX) testing with data visualization. This approach resulted in a unified, user-centric online presence and garnered recognition, including the ForumPA award for best innovator. The paper concludes by discussing the applicability of this methodology to other PA institutions facing similar challenges.

Keywords
Large-scale Web Redesign, Artificial Intelligence, Topic Modeling, Academic Communication

Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
∗ Corresponding author.
† These authors contributed equally.
Email: andrea.vian@unige.it (A. Vian); daniele.pretolesi@ait.ac.at (D. Pretolesi); lucia.rampino@polimi.it (L. Rampino); annalisa.barla@unige.it (A. Barla)
ORCID: 0000-0003-0629-0427 (A. Vian); 0000-0001-9075-0187 (D. Pretolesi); 0000-0002-7591-9324 (L. Rampino); 0000-0002-3436-035X (A. Barla)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073.

1. Introduction

This work addresses the topic of web redesign for Public Administration (PA), which typically constitutes a large-scale institution often tasked with managing a multitude of websites and touchpoints aimed at diverse audiences, and governed by a distributed editorial model, frequently lacking central coordination. This scenario is particularly pronounced within the context of universities, whose communication needs usually respond to three centrifugal driving forces: educational, research, and dissemination activities [1]. Devising a unified communication strategy is therefore a very challenging problem that is compounded by the limited specialized internal resources typically available within universities. This usually leads such institutions to create a disaggregated online presence, supported by technologies with limited scalability. The result is usually a proliferation of touchpoints designed exclusively to respond to the bureaucratic necessities of those who would maintain them, disregarding users’ needs. Moreover, due to budget constraints and resource limitations, institutions often lack the capacity to invest in the upskilling and reskilling of their staff. This perpetuates a cycle where existing technological resources are relied upon, even if they are outdated or insufficient for evolving communication needs. Consequently, open source solutions become appealing not because they are a model of knowledge dissemination and data sharing, but because they offer a seemingly cost-effective solution without the need for significant investment in training or professional development. From the viewpoint of human-centered design, it is relevant to acknowledge how this vicious circle can be broken and how the institution can start a maturation process toward the experience of its users [2].

In this paper, we move from our experience at the University of Genova (UniGe) from 2014 to 2020, where some of the authors faced the wicked problem of redesigning a plethora of hundreds of websites that were left abandoned and unmanaged for years and where more than 200 content editors were contributing without any coordination [3]. We illustrate our methodology based on user-centered design, artificial intelligence (AI), and radical collaboration and explain how it allowed us to approach the problem of large-scale web redesign in a principled way that is future-proof and scalable. We show how this methodology allowed us to successfully redesign all touchpoints and how this approach can be applied to other existing realities.

2. A big web redesign for UniGe homepage and education websites

Established in 1481, UniGe has a long history of academic excellence and innovation. The university employs about 3000 faculty and staff and offers more than 200 undergraduate and graduate programs across various disciplines, including arts, sciences, engineering, law, economics, medicine, and more, attracting tens of thousands of students every year. UniGe’s major web redesign project launched in 2014 in sync with and in support of the university’s strategic goals at the time, summarized by the five keywords of the governance vision: simplification, participation, welcoming, integration and growth. Simplification guided the redesign work from the very beginning. The starting point is to let the communication system deal with inherent content complexity and present the users with selected information that is defined by their unique profile. Participation materialized in the system of content single-sourcing, which made possible the coexistence of central editing and distributed editing. Welcoming consisted in adopting the user-centered design approach that guided the entire project: from user research to user profiling. Integration proved to be one of the biggest challenges. It first originated from the adoption of the principle of decoupling of the backend and frontend [4] and the principle of headless development (Koenig, 2018). It then guided us to develop a middleware layer, which allowed for the coexistence of new and legacy systems and enabled data interconnection [5]. Lastly, growth was the result of design activities geared toward substantial change in the services offered to students. The aim here was to build an experiential reputation based on improving the UX rather than forcibly pursuing a transformational facade.

The first interventions required by the new governance were a feasibility study and an assessment of the effort needed to redesign the UniGe website in light of the keywords of the new strategic vision. An in-depth analysis revealed a common situation among Italian PAs: the website structure mirrored the institution’s ambiguous internal processes. It was, in fact, a multi-site. The homepage was the cohabitation space for an inordinate number of pages that make up a multitude of independently managed websites supported by different technological solutions. Overall, this panorama of websites lacked a unified design and even more so a cohesive Information Architecture (IA). Webpages were linked in a maze of cross-references, with entire sections forgotten and completely disconnected, reachable only, and not always, through search engines. This condition is clearly observable by looking at Figure 1, where we show the structure of one of the major components of the UniGe website panorama, studenti.unige.it, before the redesign process. It is easily noted that the graph has a highly entropic structure, displaying subgraphs composed only of PDF files, clique-like components of pages all connected to one another, and disconnected subgraphs which are only reachable through search engines.

Figure 1: studenti.unige.it website structure (mid-2017). The website structure displays a set of idiosyncrasies, such as clique-like subgraphs, many clusters of PDF files linked from a few node pages, and a connected component that is disconnected from the rest of the graph.

Content was in even worse condition: incomplete or redundant, outdated, self-contradictory, systematically drafted in an involuted form and with bureaucratic and sectorial language, incomprehensible to the users to whom it was addressed. In the context of such a transformation project, the evolutionary web redesign mode works as long as it is free to scale up. But when the complexity of the domain or of the intervention increases significantly, a technological and process paradigm shift is required. Without this, the resources and procedures in use up to that point prove wholly inadequate to handle the complexity involved. The risk is the indefinite absorption of all available resources, resulting in the intervention not being successful [6]. In the context of a complex socio-technical system, such as a large PA, this leads to a waste of public resources and the realization of inadequate outcomes. Paradoxically, it also fosters internal resistance to change, manifesting as a rejection of a goal (user experience) that is now seen as entirely alien to the culture of PAs.

3. AI-driven big web redesign methodology

In this section we describe the methodologies employed to tackle the redesign issue according to the pipeline illustrated in Figure 2. The redesign process starts from a large-scale website whose structure is highly entropic. Borrowing methods from design and computer science, we combine quantitative and qualitative approaches to achieve a multidisciplinary future-proof solution capable of adapting to different users’ needs, whose result is a newly designed website that maximizes findability and maintainability over time.

Figure 2: Pipeline representing the AI-driven big web redesign methodology.

3.1. Design and interdisciplinarity

The first condition for a successful redesign is an evolution of the radical collaboration approach [7]: in fact, when dealing with the UniGe use case, the first author conducted the design and prototyping phases in a multidisciplinary group, involving faculty with diverse expertise and other stakeholders within the university. This led to the subsequent structuring of a small, permanent multidisciplinary group consisting of designers, developers, process analysts, copywriters and data scientists. As the complexity and project burdens grew, the prototyping group evolved into the permanent radical collaboration group, eventually becoming a design group as well. At the same time, the group’s traction also expanded: the design research driving the original initiative increasingly needed data and the ability to analyze it to interpret the complexity of the socio-technical context and to guide design action. This, in turn, continued to exert a profound influence on technological choices and the choice of questions to be answered in the data.

The second condition necessary to achieve change in complex settings, designed and built around the ascertained needs of users, is to have the deep and ongoing support of governance. In a PA or complex entity, services and processes are interconnected and depend, to name just a few factors, on the people sustaining them, the organizational models, the technological infrastructure, the operational practices, the available skills, the investments in reskilling and upskilling, human resource recruitment and management policies, the regulatory environment, and the financial situation. These are all mutually conditioning factors, and as the size of the institution grows, they grow more than proportionally, until they generate a system of constraints and cascading repercussions involving the entire institution. The human-centered design or redesign of a service impacts this system of interconnected constraints, eventually involving compartments of the institution even quite distant from those directly involved in service delivery [8].

3.2. Network Analytics and AI for large-scale web redesign

The process of an effective web redesign that is more profound and radical than a “Face Lift” usually starts with a thorough content audit of the entire existing set of pages. This process is time-consuming and cannot be done manually if the number of pages ranges in the order of tens of thousands, a typical figure for large-scale websites. This was the main reason that guided us into leveraging AI approaches to understand the structure of the existing website, and define the optimal IA by categorizing the existing content.

Statistical testing to assess large-scale website structure. The statistical analysis of the existing website starts by considering the website as a directed network of documents (nodes) connected through hyperlinks (edges). One example is represented in Figure 1, where the main structure of studenti.unige.it (mid-2017) is depicted. By exploiting network analytics [9], we explored the distributions of the in-degree and out-degree, that is, the number of incoming links and the number of outgoing links, respectively. We argue that the scale-free property, which characterizes the World-Wide-Web as a whole [10], is no longer evident for specific sub-categories of websites. Particularly, we defined a method that could be used to better characterize topological properties deriving from different generative principles: central or peripheral. The first category is typically characterized by a strong central control in the design and evolution of the IA and content generation. Conversely, the latter category is completely user-guided and its evolution is likely to be random [11]. This method may be used to trace and monitor the evolution over time of the website content also according to the editorial model in use: a few people that are allowed to write anywhere in the website and tightly control the structure, or a multitude of contributors that are allowed to write based on collaborative editing and community moderation. The result on the UniGe website denoted a combination of central and peripheral editorial strategies that, over time, led to a chaotic arrangement, confirming the need for a profound redesign.

Topic Modeling to infer an optimal Information Architecture. We then proceeded by defining an AI-driven approach to define the optimal IA. Specifically, we exploited topic modeling methods [12, 13] to identify how many topics were discussed on the website and visualize how the pages often presented multiple unrelated topics, leading to a confusing UX. The result of such an experiment on studenti.unige.it is visualized in Figure 3, where the fuchsia squares represent the topics and the blue dots represent the pages. Even at first glance, it is clear that many pages are linked to two or more topics, with a central cluster of pages connected to up to 5 topics.

Figure 3: Topic modeling on studenti.unige.it website structure (mid-2017). The analysis identified 8 main topics, with many pages linked to multiple topics.

3.3. AI-based component management system

To implement a new strategy for content creation and management, we devised a system that exploits single sourcing and AI methods to transform information into structured and reusable data, making it easier to create, maintain, and update content across multiple channels and user profiles. For example, a single piece of information can be automatically tailored to the needs of different audiences, such as students, faculty, and staff. The system also integrates with machine learning methods for image recognition and automatic translation, further reducing the burden on human editors. This overcomes the struggle typical of traditional content management systems to handle the ever-changing regulations, information overload, and new digital technologies that large organizations face [14]. The system highlights the importance of human-centered design throughout the content management process. It is designed to be user-friendly and easy to learn, even for non-technical users. This ensures that content creators can focus on creating high-quality content, rather than struggling with the technology. This process resulted in the publication of a new set of federated websites whose structure and content were the result of the entire pipeline just described. Figure 4 illustrates its hierarchical structure.

Figure 4: corsi.unige.it website structure.

3.4. Assessing results through data visualization

To assess the overall usability of the newly designed website and the effectiveness of the entire procedure, we resorted to a set of UX tests conducted on a sample of 60 students divided into two groups (M_age = 22.7 yrs), with the purpose of comparing the usability of the corsi.unige.it website, designed with the methods above, and of studenti.unige.it, which UniGe has historically used to communicate with its students. Each user had to answer three questions related to a course of study, navigating exclusively on one of the two sites. The questions were the following.
Question A: Is there an exam to enroll in the Business Administration BSc? What does it consist of?
Question B: When is the deadline for Erasmus+ application?
Question C: Find where to ask for an internship in car design as an MSc student in Design.

The exercise was deemed successful if the subject correctly answered the question at the end of the search. Researchers evaluated the effectiveness of the site by measuring the number of clicks and the time taken to complete the exercise, with each question allotted a maximum response time of 600 seconds. There were three potential outcomes: correct, partially correct, or incorrect answers. The vertical bubble chart in Figure 5 shows the result of the UX test. Each circle corresponds to a user and is arranged along the vertical axis according to the number of clicks. The diameter is proportional to the time required to perform the exercise while the color represents its success (correct answer, partially correct, incorrect). Table 1 shows how on average users’ performance on studenti.unige.it was significantly worse than on corsi.unige.it. In particular, in the case of corsi.unige.it, three out of four users answer correctly with just a few clicks and in a short time, with a success rate of 75 percent, while less than one in ten users responds correctly using studenti.unige.it.

                 corsi.unige.it   studenti.unige.it
Time (s)         150.5            253.5
# Clicks         5                16
Success Score    1                -1
Success Rate     75%              7%

Table 1: Median time, median number of clicks, median success score, and global success rate between the two websites over the three questions.

Figure 5: Usability tests: corsi.unige.it vs. studenti.unige.it

Although our experience was successful in achieving the objective we set in the beginning, this was not without hiccups and obstacles. In particular, a strong resistance to change was observed at the moment of implementing the newly redesigned system in the university procedures. Thus, human-centered design approaches and codesign and shared prototyping activities have proved invaluable allies in gaining consensus, incorporating the knowledge and skills of all stakeholders into the design process, and disseminating the culture of UX. However, splitting the design-driven digital innovation project into autonomous modules was necessary. This behavior, as predicted by [15], is to be expected when working within complex organizations to step up in the UX maturity model, particularly if the starting point is the bottom of the ladder, i.e. absent or limited UX. In this case, besides working on technological improvements, the institution must focus on a profound cultural change that supports UX knowledge and data awareness.

4. A new challenge: PhD Programme at PoliMi

Our successful experience in UniGe resonated across several national outlets and was presented in several news channels and conferences [14, 16] and culminated in winning the ForumPA award for best innovator [17]. The exposure we received from these initiatives allowed us to get in touch with other realities that were facing similar issues as the ones we faced in UniGe. In this line, we were able to begin a tight-knit collaboration with the Department of Design at Politecnico di Milano (PoliMi), where the Design PhD Programme was facing the problem of redesigning its website. The PhD Programme in Design at PoliMi is the largest PhD design course in Italy, with almost 90 PhD students enrolled. Their goal was to devise a system that could allow for an optimal presentation of the research activity carried on by the PhD alumni and candidates as well as a website that could describe procedures to potential new students (admission procedure, course requirements, etc.) and stakeholders. Our involvement since the very beginning of the redesign process allowed us to adopt the methodology defined for UniGe. Together, we identified the major pain points from their old websites, mainly consisting of a lack of automation, an inactive homepage, and a content strategy that did not leverage single-sourcing, user profiling, or structured information. The result was a set of websites that penalized the research efforts. Applying our methodology allowed us to identify the root of these inefficiencies. In particular, PhD procedures lack a connection between the legacy data, processes and the website content. Also, in research dissemination pages, the existing data structure does not account for the interactions among researchers and therefore it is difficult to keep up-to-date and coherent. Currently, the project is focusing on designing a data-driven system which may enhance the usability of the websites through refining the quality of the underlying information structure. A primary objective is to establish an architecture capable of robust automation. This architecture aims to convert natural language and simple information about researchers’ activity into structured data. By exploiting legacy data in a seamless flux, the system reduces maintenance effort to the bare minimum and keeps data in sync and up to date. Also, the system will behave by considering research as the result of heterogeneous networks of people and topics. In doing so, the data produced by the research efforts become structured and easily reusable across different touchpoints.

5. Discussion and Conclusion

Our experience with two large-scale universities suggests that Italian academia still lacks concrete strategies and methods to tackle the ever increasing amount of information a PA faces today. Particularly in the context of web redesign, we defined a methodology that goes beyond a shallow visual redesign in favor of a new methodology where existing data is first used to guide the definition of an optimal IA and then structured to feed the touchpoints based on single-sourcing, interoperability, and a user-profiled approach. Our methodology exploits design thinking principles, User-Centred Design methods, agile programming, prototyping and innovators (young fearless developers and designers) to define a new way of thinking about information and how to distribute it across complex organizations. The result is a system that is able to govern a multitude of different and interconnected websites thanks to AI, Natural Language Processing and data management principles.

Academia represents a unique type of PA, tasked with the crucial responsibility of not only accumulating vast knowledge but also effectively sharing it with the broader public. However, like many PAs, academia tends to exhibit a conservative stance toward digital advancements, often resisting change and clinging to outdated methods. Rather than embracing true innovation, there’s a tendency to equate progress with simply digitizing existing processes. This approach typically prioritizes documents over recognizing the central importance of data, which should serve as the foundation for all operations, particularly within sprawling institutions characterized by numerous interconnected yet autonomous departments.

Acknowledgments

A. Barla is part of the RAISE Innovation Ecosystem funded by the Recovery and Resilience Plan (NRRP) (ECS00000035).

References

[1] C. Pinho, M. Franco, L. Mendes, Web portals as tools to support information management in higher education institutions: A systematic literature review, International Journal of Information Management 41 (2018) 80–92.
[2] J. Nielsen, H. Loranger, Prioritizing web usability, Pearson Education, 2006.
[3] A. Vian, A big web redesign: data driven design research through practice and implementation, Fuori Collana, Genoa University Press, 2020.
[4] S. Grünwald, H. Bergius, Decoupling content management, in: Proc. of WWW2012, 2012.
[5] C. Bizer, T. Heath, T. Berners-Lee, Linked data: The story so far, in: Semantic services, interoperability and web applications: emerging concepts, IGI global, 2011, pp. 205–227.
[6] L. Rosenfeld, P. Morville, Information architecture for the world wide web, O’Reilly Media, Inc., 2002.
[7] B. Burnett, D. Evans, Designing your work life: how to thrive and change and find happiness at work, Knopf, 2020.
[8] D. A. Norman, P. J. Stappers, DesignX: Complex Sociotechnical Systems, She Ji: The Journal of Design, Economics, and Innovation 1 (2015).
[9] A.-L. Barabási, Network science, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 371 (2013).
[10] A.-L. Barabási, Scale-free networks: a decade and beyond, Science 325 (2009) 412–413.
[11] D. Garbarino, V. Tozzo, A. Vian, A. Barla, A robust method for statistical testing of empirical power-law distributions, in: Proc. of WAW 2020, Springer, 2020.
[12] J. Chang, D. Blei, Relational topic models for document networks, in: Artificial intelligence and statistics, PMLR, 2009, pp. 81–88.
[13] J. Chang, D. M. Blei, Hierarchical relational models for document networks (2010).
[14] A. Barla, M. Cuneo, S. R. Nunzi, G. Paniati, A. Vian, AI-based component management system for structured content creation, annotation, and publication, in: Proc. of IHSI 2022, volume 22, AHFE International, 2022.
[15] K. Pernice, K. Moran, K. Whitenton, S. Gibbons, The 6 levels of ux maturity, 2024. URL: https://www.nngroup.com/articles/ux-maturity-model/.
[16] E. Capone, Intelligenza artificiale e big data, il futuro dell’università di genova passa da internet, 2020. URL: https://bit.ly/3TPJIed.
[17] ForumPA, Andrea vian: “la trasformazione digitale dei servizi informativi”, 2023. URL: https://bit.ly/3J6eM4b.
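As an illustrative supplement to the network analysis of Section 3.2, the following minimal Python sketch shows how a website can be treated as a directed graph of pages and hyperlinks, how in-degree and out-degree distributions can be tabulated, and how disconnected groups of pages can be found. It is a sketch under stated assumptions, not the statistical test of [11] nor the tooling used in the project: the page names, the EDGES list and all function names are hypothetical, and only the Python standard library is used.

```python
from collections import Counter, defaultdict


def degree_distributions(edges):
    """Return (in_degree, out_degree) Counters keyed by page."""
    in_deg, out_deg = Counter(), Counter()
    for src, dst in edges:
        out_deg[src] += 1  # one more outgoing link for the source page
        in_deg[dst] += 1   # one more incoming link for the target page
    return in_deg, out_deg


def degree_histogram(degrees, nodes):
    """Map each degree value to the number of pages having it (zeros included)."""
    return Counter(degrees.get(n, 0) for n in nodes)


def weak_components(nodes, edges):
    """Group pages into weakly connected components (link direction ignored)."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            comp.add(node)
            stack.extend(neighbours[node] - seen)
        components.append(comp)
    return components


# Hypothetical toy site: (source page, linked page) pairs.
EDGES = [
    ("home", "courses"), ("home", "fees"), ("courses", "fees"),
    ("fees", "fees.pdf"), ("old-news", "old-form.pdf"),
]
NODES = sorted({page for edge in EDGES for page in edge})

in_deg, out_deg = degree_distributions(EDGES)
in_hist = degree_histogram(in_deg, NODES)
components = weak_components(NODES, EDGES)

print("in-degree histogram:", dict(in_hist))
print("number of components:", len(components))
```

On a real crawl, the edge list would come from the extracted hyperlinks; a component that does not contain the homepage corresponds to the kind of section described above as reachable only through search engines.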