=Paper= {{Paper |id=Vol-3762/496 |storemode=property |title=AI-driven big web redesign: two case studies in Italian universities |pdfUrl=https://ceur-ws.org/Vol-3762/496.pdf |volume=Vol-3762 |authors=Andrea Vian,Daniele Pretolesi,Lucia Rampino,Annalisa Barla |dblpUrl=https://dblp.org/rec/conf/ital-ia/VianPRB24 }} ==AI-driven big web redesign: two case studies in Italian universities== https://ceur-ws.org/Vol-3762/496.pdf
AI-driven big web redesign: two case studies in Italian universities
Andrea Vian1,†, Daniele Pretolesi2,†, Lucia Rampino3 and Annalisa Barla4,5,∗,†
1 Dipartimento Architettura e Design, Università di Genova, Genoa, Italy
2 AIT - Austrian Institute of Technology, Vienna, Austria
3 Dipartimento di Design, Politecnico di Milano, Milan, Italy
4 Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi, Università di Genova, Genoa, Italy
5 Machine Learning Genoa Center, Università di Genova, Genoa, Italy


                                                  Abstract
                                                  This paper explores the challenges of web redesign in Public Administration (PA), particularly within universities. Universities
                                                  often struggle with fragmented online presences due to distributed editorial models and diverse communication needs across
                                                  research, education, and dissemination activities. Limited resources further restrict investment in upskilling staff and adopting
                                                  modern technologies. Open source solutions, though cost-effective, are often chosen without considering user experience. We
                                                  present a methodology that combines user-centered design, Artificial Intelligence (AI), and “radical collaboration” to achieve
                                                  a future-proof and scalable redesign. Starting from a case study of a major web redesign project at an Italian university
                                                  (2014-2020) involving hundreds of websites and over 200 content editors, the paper details the process, including a large-scale
                                                  content audit using AI, single sourcing with AI-powered content transformation, and user experience (UX) testing with
                                                  data visualization. This approach resulted in a unified, user-centric online presence and garnered recognition, including the
                                                  ForumPA award for best innovator. The paper concludes by discussing the applicability of this methodology to other PA
                                                  institutions facing similar challenges.

                                                  Keywords
                                                  Large-scale Web Redesign, Artificial Intelligence, Topic Modeling, Academic Communication



Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
∗ Corresponding author.
† These authors contributed equally.
Email: andrea.vian@unige.it (A. Vian); daniele.pretolesi@ait.ac.at (D. Pretolesi); lucia.rampino@polimi.it (L. Rampino); annalisa.barla@unige.it (A. Barla)
ORCID: 0000-0003-0629-0427 (A. Vian); 0000-0001-9075-0187 (D. Pretolesi); 0000-0002-7591-9324 (L. Rampino); 0000-0002-3436-035X (A. Barla)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. Introduction

This work addresses the topic of web redesign for Public Administration (PA), which typically constitutes a large-scale institution often tasked with managing a multitude of websites and touchpoints aimed at diverse audiences, and governed by a distributed editorial model, frequently lacking central coordination. This scenario is particularly pronounced within the context of universities, whose communication needs usually respond to three centrifugal driving forces: educational, research, and dissemination activities [1]. Devising a unified communication strategy is therefore a very challenging problem that is compounded by the limited specialized internal resources typically available within universities. This usually leads such institutions to create a disaggregated online presence, supported by technologies with limited scalability. The result is usually a proliferation of touchpoints designed exclusively to respond to the bureaucratic necessities of those who would maintain them, disregarding users’ needs. Moreover, due to budget constraints and resource limitations, institutions often lack the capacity to invest in the upskilling and reskilling of their staff. This perpetuates a cycle where existing technological resources are relied upon, even if they are outdated or insufficient for evolving communication needs. Consequently, open source solutions become appealing not because they are a model of knowledge dissemination and data sharing, but because they offer a seemingly cost-effective solution without the need for significant investment in training or professional development. From the viewpoint of human-centered design, it is relevant to acknowledge how this vicious circle can be broken and how the institution can start a maturation process toward the experience of its users [2]. In this paper, we start from our experience at the University of Genova (UniGe) from 2014 to 2020, where some of the authors faced the wicked problem of redesigning hundreds of websites that had been left abandoned and unmanaged for years and where more than 200 content editors were contributing without any coordination [3]. We illustrate our methodology based on user-centered design, artificial intelligence (AI), and radical collaboration and explain how it allowed us to approach the problem of large-scale web redesign in a principled way that is future-proof and scalable. We show how this methodology allowed us to successfully




redesign all touchpoints and how this approach can be applied to other existing realities.

2. A big web redesign for UniGe homepage and education websites

Established in 1481, UniGe has a long history of academic excellence and innovation. The university employs about 3000 faculty and staff and offers more than 200 undergraduate and graduate programs across various disciplines, including arts, sciences, engineering, law, economics, medicine, and more, attracting tens of thousands of students every year. UniGe’s major web redesign project was launched in 2014 in sync with and in support of the university’s strategic goals at the time, summarized by the five keywords of the governance vision: simplification, participation, welcoming, integration and growth. Simplification guided the redesign work from the very beginning. The starting point is to let the communication system deal with inherent content complexity and present the users with selected information that is defined by their unique profile. Participation materialized in the system of content single-sourcing, which made possible the coexistence of central editing and distributed editing. Welcoming consisted in adopting the user-centered design approach that guided the entire project: from user research to user profiling. Integration proved to be one of the biggest challenges. It first originated from the adoption of the principle of decoupling of the backend and frontend [4] and the principle of headless development (Koenig, 2018). It then guided us to develop a middleware layer, which allowed for the coexistence of new and legacy systems and enabled data interconnection [5]. Lastly, growth was the result of design activities geared toward substantial change in the services offered to students. The aim here was to build an experiential reputation based on improving the UX rather than forcibly pursuing a transformational facade.

The first interventions required by the new governance were a feasibility study and an assessment of the effort needed to redesign UniGe’s website in light of the keywords of the new strategic vision. An in-depth analysis revealed a common situation among Italian PAs: the website structure mirrored the institution’s ambiguous internal processes. It was, in fact, a multi-site. The homepage was the cohabitation space for an inordinate number of pages that make up a multitude of independently managed websites supported by different technological solutions. Overall, this panorama of websites lacked a unified design and even more so a cohesive Information Architecture (IA). Webpages were linked in a maze of cross-references, with entire sections forgotten and completely disconnected, reachable only, and not always, through search engines. This condition is clearly observable in Figure 1, where we show the structure of one of the major components of the UniGe website panorama, studenti.unige.it, before the redesign process. The graph has a highly entropic structure, displaying subgraphs composed only of PDF files, clique-like components of pages all connected to one another, and disconnected subgraphs which are only reachable through search engines.

Content was in even worse condition: incomplete or redundant, outdated, self-contradictory, systematically drafted in an involuted form and with bureaucratic and sectorial language, incomprehensible to the users to whom it was addressed. In the context of such a transformation project, the evolutionary web redesign mode works as long as it is free to scale up. But when the complexity of the domain or of the intervention increases significantly, a technological and process paradigm shift is required. Without this, the resources and procedures in use up to that point prove wholly inadequate to handle the complexity involved. The risk is the indefinite absorption of all available resources, resulting in the intervention not being successful [6]. In the context of a complex socio-technical system, such as a large PA, this leads to a waste of public resources and the realization of inadequate outcomes. Paradoxically, it also fosters internal resistance to change, manifesting as a rejection of a goal, user experience, that is now seen as entirely alien to the culture of PAs.

3. AI-driven big web redesign methodology

In this section we describe the methodologies employed to tackle the redesign issue according to the pipeline illustrated in Figure 2. The redesign process starts from a large-scale website whose structure is highly entropic. Borrowing methods from design and computer science, we combine quantitative and qualitative approaches to achieve a multidisciplinary, future-proof solution capable of adapting to different users’ needs, whose result is a newly designed website that maximizes findability and maintainability over time.

3.1. Design and interdisciplinarity

The first condition for a successful redesign is an evolution of the radical collaboration approach [7]: in fact, when dealing with the UniGe use case, the first author conducted the design and prototyping phases in a multidisciplinary group, involving faculty with diverse expertise and other stakeholders within the university. This led to the subsequent structuring of a small, permanent
Figure 1: studenti.unige.it website structure (mid-2017). The website structure displays a set of idiosyncrasies, such as clique-like subgraphs, many clusters of PDF files linked from a few node pages, and a connected component that is disconnected from the rest of the graph.



multidisciplinary group consisting of designers, developers, process analysts, copywriters and data scientists. As the complexity and project burdens grew, the prototyping group evolved into the permanent radical collaboration group, eventually becoming a design group as well. At the same time, the group’s traction also expanded: the design research driving the original initiative increasingly needed data and the ability to analyze it to interpret the complexity of the socio-technical context and to guide design action. This, in turn, continued to exert a profound influence on technological choices and on the choice of questions to be answered with the data. The second condition necessary to achieve change in complex settings, designed and built around the ascertained needs of users, is to have the deep and ongoing support of governance. In a PA or complex entity, services and processes are interconnected and depend - to name just a few factors - on the people sustaining them, the organizational models, the technological infrastructure, the operational practices, the available skills, the investments in reskilling and upskilling, human resource recruitment and management policies, the regulatory environment, and the financial situation. These are all mutually conditioning factors, and as the size of the institution grows, they grow more than proportionally, until they generate a system of constraints and cascading repercussions involving the entire institution. The human-centered design or redesign of a service impacts this system of interconnected constraints, eventually involving compartments of the institution even quite distant from those directly involved in service delivery [8].

3.2. Network Analytics and AI for large-scale web redesign

The process of an effective web redesign that is more profound and radical than a “Face Lift” usually starts with a thorough content audit of the entire existing set of pages. This process is time-consuming and cannot be done manually if the number of pages is in the order of tens of thousands - a typical figure for large-scale websites. This was the main reason that guided us into leveraging AI approaches to understand the structure of the existing website, and define the optimal IA by categorizing the existing content.

Statistical testing to assess large-scale website structure. The statistical analysis of the existing website starts by considering the website as a directed network of documents (nodes) connected through hyperlinks (edges). One example is represented in Figure 1, where the main structure of studenti.unige.it (mid-2017) is depicted. By exploiting network analytics [9], we explored the distributions of the in-degree and out-degree, that is, the number of incoming links and the number of outgoing links, respectively. We argue that the scale-free property, which characterizes the World-Wide-Web as a whole [10], is no longer evident for specific sub-categories of websites. Particularly, we defined a method that could be used to better characterize topological properties deriving from different generative principles: central or peripheral. The first category is typically characterized by a strong central control in the design and evolution of the IA and content generation. Conversely, the last

Figure 2: Pipeline representing the AI-driven big web redesign methodology
Figure 3: Topic modeling on studenti.unige.it website structure (mid-2017). The analysis identified 8 main topics, with many pages linked to multiple topics.
Figure 4: corsi.unige.it website structure.

the struggle typical of traditional content management systems to handle the ever-changing regulations, information overload, and new digital technologies that large organizations face [14]. The system highlights the importance of human-centered design throughout the content management process. It is designed to be user-friendly and easy to learn, even for non-technical users. This ensures that content creators can focus on creating high-quality content, rather than struggling with the technology. This process resulted in the publication of a new set of federated websites whose structure and content were the result of the entire pipeline just described. Figure 4 illustrates its hierarchical structure.

category is completely user-guided and its evolution is likely to be random [11]. This method may be used to trace and monitor the evolution over time of the website content also according to the editorial model in use: a few people that are allowed to write anywhere in the website and tightly control the structure or a multitude of contributors that are allowed to write based on collaborative editing and community moderation. The result on the UniGe website denoted a combination of central and peripheral editorial strategies that, over time, led to a chaotic arrangement, confirming the need for a profound redesign.
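
As an illustration of the degree analysis described above, the following minimal Python sketch computes in-degree and out-degree histograms for a website modeled as a directed network of pages and hyperlinks. It is not the authors' implementation: the function name `degree_histograms` and the toy page names are hypothetical, and the sketch stops at the histograms without performing any statistical test for the scale-free property.

```python
from collections import Counter


def degree_histograms(edges):
    """Return (in_hist, out_hist) for a directed hyperlink graph.

    edges: iterable of (source_page, target_page) pairs, one per hyperlink.
    Each histogram maps a degree k to the number of pages with that degree.
    """
    unique_edges = set(edges)  # count repeated links between two pages once
    in_degree, out_degree, pages = Counter(), Counter(), set()
    for source, target in unique_edges:
        out_degree[source] += 1  # outgoing link of the source page
        in_degree[target] += 1   # incoming link of the target page
        pages.update((source, target))
    # A Counter returns 0 for pages that never appear as a target (or source).
    in_hist = Counter(in_degree[p] for p in pages)
    out_hist = Counter(out_degree[p] for p in pages)
    return in_hist, out_hist


# Hypothetical toy crawl: the page names are illustrative only.
toy_edges = [
    ("home", "enrolment"),
    ("home", "fees"),
    ("home", "rules.pdf"),
    ("enrolment", "fees"),
    ("fees", "home"),
]
in_hist, out_hist = degree_histograms(toy_edges)
for k in sorted(in_hist):
    print(f"in-degree {k}: {in_hist[k]} page(s)")
for k in sorted(out_hist):
    print(f"out-degree {k}: {out_hist[k]} page(s)")
```

On a real crawl, the two histograms could be plotted on log-log axes; a heavy, roughly linear tail would be compatible with a scale-free structure, whereas a marked deviation would call for the kind of closer characterization discussed in the text.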

Topic Modeling to infer an optimal Information Architecture. We then devised an AI-driven approach to define the optimal IA. Specifically, we exploited topic modeling methods [12, 13] to identify how many topics were discussed on the website and visualize how the pages often presented multiple unrelated topics, leading to a confusing UX. The result of such an experiment on studenti.unige.it is visualized in Figure 3, where the fuchsia squares represent the topics and the blue dots represent the pages. Even at first glance, it is clear that many pages are linked to two or more topics, with a central cluster of pages connected to up to 5 topics.

3.3. AI-based component management system

To implement a new strategy for content creation and management, we devised a system that exploits single sourcing and AI methods to transform information into structured and reusable data, making it easier to create, maintain, and update content across multiple channels and user profiles. For example, a single piece of information can be automatically tailored to the needs of different audiences, such as students, faculty, and staff. The system also integrates with machine learning methods for image recognition and automatic translation, further reducing the burden on human editors. This overcomes

3.4. Assessing results through data visualization

To assess the overall usability of the newly designed website and the effectiveness of the entire procedure, we resorted to a set of UX tests conducted on a sample of 60 students divided into two groups (M_age = 22.7 yrs), with the purpose of comparing the usability of the corsi.unige.it website, designed with the methods above, and of studenti.unige.it, which UniGe has historically used to communicate with its students. Each user had to answer three questions related to a course of study, navigating exclusively on one of the two sites. The questions were the following. Question A: Is there an exam to enroll in the Business Administration BSc? What does it consist of? Question B: When is the deadline for Erasmus+ application? Question C: Find where to ask for an internship in car design as an MSc student in Design.

The exercise was deemed successful if the subject correctly answered the question at the end of the search. Researchers evaluated the effectiveness of the site by measuring the number of clicks and the time taken to complete the exercise, with each question allotted a maximum response time of 600 seconds. There were three potential outcomes: correct, partially correct, or incorrect answers. The vertical bubble chart in Figure 5 shows the result of the UX test. Each circle corresponds to a
of implementing the newly redesigned system in the university procedures. Thus, human-centered design approaches and codesign and shared prototyping activities have proved invaluable allies in gaining consensus, incorporating the knowledge and skills of all stakeholders into the design process, and disseminating the culture of UX. However, splitting the design-driven digital innovation project into autonomous modules was necessary.

                   corsi.unige.it    studenti.unige.it
  Time (s)             150.5              253.5
  # Clicks               5                  16
  Success Score          1                  -1
  Success Rate          75%                 7%

Table 1: Median time, median number of clicks, median success score, and global success rate for the two websites over the three questions.
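
The kind of per-site summary reported in Table 1 can be sketched in a few lines of Python. This is an illustrative sketch only: the trial records below are invented and are not the study's data, the names `SCORE` and `summarize` are hypothetical, and the mapping of outcomes to scores (correct = 1, partially correct = 0, incorrect = -1) is an assumption, since the paper does not spell out how the success score is encoded.

```python
from statistics import median

# Assumed scoring of the three outcomes (not stated explicitly in the paper).
SCORE = {"correct": 1, "partial": 0, "incorrect": -1}


def summarize(trials):
    """Summarize UX test trials for one website.

    trials: non-empty list of dicts with keys
      "time_s"  - seconds taken (capped at 600 by the test protocol),
      "clicks"  - number of clicks,
      "outcome" - "correct", "partial" or "incorrect".
    """
    scores = [SCORE[t["outcome"]] for t in trials]
    n_correct = sum(1 for t in trials if t["outcome"] == "correct")
    return {
        "median_time_s": median(t["time_s"] for t in trials),
        "median_clicks": median(t["clicks"] for t in trials),
        "median_success_score": median(scores),
        "success_rate": n_correct / len(trials),  # share of fully correct answers
    }


# Invented example records, for illustration only.
example_trials = [
    {"time_s": 100, "clicks": 3, "outcome": "correct"},
    {"time_s": 150, "clicks": 5, "outcome": "correct"},
    {"time_s": 200, "clicks": 6, "outcome": "partial"},
    {"time_s": 600, "clicks": 20, "outcome": "incorrect"},
]
print(summarize(example_trials))
```

Medians are used here, as in Table 1, because completion times capped at 600 seconds and occasional very long click paths would otherwise dominate a mean.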
user and is arranged along the vertical axis according to the number of clicks. The diameter is proportional to the time required to perform the exercise while the color represents its success (correct answer, partially correct, incorrect).

Table 1 shows how, on average, users' performance on studenti.unige.it was significantly worse than on corsi.unige.it. In particular, in the case of corsi.unige.it, three out of four users answer correctly with just a few clicks and in a short time, with a success rate of 75 percent, while less than one in ten users responds correctly using studenti.unige.it. Although our experience was successful in achieving the objective we set in the beginning, this was not without hiccups and obstacles. In particular, a strong resistance to change was observed at the moment

This behavior, as predicted by [15], is to be expected when working within complex organizations to step up in the UX maturity model, particularly if the starting point is the bottom of the ladder, i.e. absent or limited UX. In this case, besides working on technological improvements, the institution must focus on a profound cultural change that supports UX knowledge and data awareness.

4. A new challenge: PhD Programme at PoliMi

Our successful experience in UniGe resonated across several national outlets and was presented in several news channels and conferences [14, 16] and culminated in winning the ForumPA award for best innovator [17]. The exposure we received from these initiatives allowed us to get in touch with other realities that were facing similar
issues as the ones we faced in UniGe. In this vein, we were able to begin a tight-knit collaboration with the Department of Design at Politecnico di Milano (PoliMi), where the Design PhD Programme was facing the problem of redesigning its website. The PhD Programme in Design at PoliMi is the largest PhD design course in Italy, with almost 90 PhD students enrolled.

Their goal was to devise a system that could allow for an optimal presentation of the research activity carried out by the PhD alumni and candidates as well as a website that could describe procedures to potential new students (admission procedure, course requirements, etc.) and stakeholders. Our involvement since the very beginning of the redesign process allowed us to adopt the methodology defined for UniGe. Together, we identified the major pain points of their old websites, mainly consisting of a lack of automation, an inactive homepage, and a content strategy that did not leverage single-sourcing, user profiling, or structured information. The result was a set of websites that penalized the research efforts. Applying our methodology allowed us to identify the root of these inefficiencies. In particular, PhD procedures lack a connection between the legacy data, processes and the website content. Also, in research dissemination pages, the existing data structure does not account for the interactions among researchers and is therefore difficult to keep up-to-date and coherent. Currently, the project is
focusing on designing a data-driven system that may enhance the usability of the websites by
refining the quality of the underlying information structure. A primary objective is to
establish an architecture capable of robust automation. This architecture aims to convert
natural language and simple information about researchers' activity into structured data. By
exploiting legacy data in a seamless flow, the system reduces maintenance effort to the bare
minimum and keeps data in sync and up to date. The system will also treat research as the result
of heterogeneous networks of people and topics. In doing so, the data produced by the research
efforts become structured and easily reusable across different touchpoints.

Figure 5: Usability tests: corsi.unige.it vs. studenti.unige.it
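
The idea of treating research as a heterogeneous network of people and topics can be illustrated with a minimal Python sketch. It is not the system under development: the class names (Node, ResearchNetwork), the functions (profile_page, topic_index), and the sample researchers and topics are all hypothetical, chosen only to show how a single structured store could feed more than one touchpoint.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """A node in the network: a person or a topic (hypothetical labels)."""
    kind: str  # "person" or "topic"
    name: str


@dataclass
class ResearchNetwork:
    """Undirected heterogeneous graph linking people and topics."""
    edges: dict = field(default_factory=lambda: defaultdict(set))

    def link(self, a: Node, b: Node) -> None:
        # Store the relation in both directions so either side can be queried.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def neighbours(self, node: Node, kind: str) -> list:
        # Names of adjacent nodes of the requested kind, in a stable order.
        return sorted(n.name for n in self.edges[node] if n.kind == kind)

    def collaborators(self, person: Node) -> list:
        # People who share at least one topic with `person`.
        found = set()
        for topic in self.edges[person]:
            if topic.kind != "topic":
                continue
            for other in self.edges[topic]:
                if other.kind == "person" and other != person:
                    found.add(other.name)
        return sorted(found)


def profile_page(net: ResearchNetwork, person: Node) -> str:
    # Touchpoint 1: a researcher profile listing their topics.
    return f"{person.name}: {', '.join(net.neighbours(person, 'topic'))}"


def topic_index(net: ResearchNetwork, topic: Node) -> str:
    # Touchpoint 2: a topic page listing the people working on it.
    return f"{topic.name}: {', '.join(net.neighbours(topic, 'person'))}"


# Placeholder data, invented for illustration only.
net = ResearchNetwork()
researcher_a = Node("person", "Researcher A")
researcher_b = Node("person", "Researcher B")
service = Node("topic", "service design")
interaction = Node("topic", "interaction design")
net.link(researcher_a, service)
net.link(researcher_b, service)
net.link(researcher_b, interaction)

print(profile_page(net, researcher_b))
print(topic_index(net, service))
print(net.collaborators(researcher_a))
```

Both views are derived from the same links, so updating one relation updates every page that depends on it. This is the property the paragraph above describes as keeping data in sync across touchpoints.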

5. Discussion and Conclusion

Our experience with two large universities suggests that Italian academia still lacks concrete
strategies and methods to tackle the ever-increasing amount of information a PA faces today. For
web redesign in particular, we defined a methodology that goes beyond a shallow visual redesign:
existing data is first used to guide the definition of an optimal IA and is then structured to
feed the touchpoints through single-sourcing, interoperability, and user profiling. Our
methodology draws on design thinking principles, User-Centred Design methods, agile programming,
prototyping, and innovators (young, fearless developers and designers) to define a new way of
thinking about information and how to distribute it across complex organizations. The result is
a system able to govern a multitude of different, interconnected websites thanks to AI, Natural
Language Processing, and data management principles.
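
As a rough illustration of single-sourcing combined with user profiling, the following Python sketch stores each piece of content once and derives both a web fragment and a feed entry from it, filtered by audience. It is an assumption-laden toy rather than the system described here: ContentItem, for_profile, render_web, render_feed, the profile labels, and the sample texts are all hypothetical.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class ContentItem:
    """One structured record, written once and reused by every touchpoint."""
    title: str
    body: str
    profiles: frozenset  # user profiles the item is relevant to


def for_profile(items, profile):
    # User profiling: keep only the items addressed to the given audience.
    return [item for item in items if profile in item.profiles]


def render_web(item):
    # Touchpoint 1: an HTML fragment for a website page.
    return f"<h2>{escape(item.title)}</h2><p>{escape(item.body)}</p>"


def render_feed(item):
    # Touchpoint 2: a plain dictionary, e.g. for a feed or an API response.
    return {"title": item.title, "summary": item.body}


# Placeholder content, invented for illustration only.
items = [
    ContentItem("Admission procedure", "How to apply to the programme.",
                frozenset({"prospective student"})),
    ContentItem("Course requirements", "Credits to earn each year.",
                frozenset({"prospective student", "phd candidate"})),
    ContentItem("Thesis submission", "Steps and deadlines for the final thesis.",
                frozenset({"phd candidate"})),
]

for item in for_profile(items, "prospective student"):
    print(render_web(item))
```

Because every renderer reads the same record, a correction to a title or body is made once and reaches all touchpoints, and the profile filter decides which audience sees it.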

Academia represents a unique type of PA, tasked with the crucial responsibility of not only
accumulating vast knowledge but also sharing it effectively with the broader public. However,
like many PAs, academia tends to take a conservative stance toward digital advancements, often
resisting change and clinging to outdated methods. Rather than embracing true innovation, there
is a tendency to equate progress with simply digitizing existing processes. This approach
typically prioritizes documents and fails to recognize the central importance of data, which
should serve as the foundation for all operations, particularly within sprawling institutions
characterized by numerous interconnected yet autonomous departments.

Acknowledgments

A. Barla is part of the RAISE Innovation Ecosystem funded by the National Recovery and
Resilience Plan (NRRP) (ECS00000035).