<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Analysis and Methodology of Inhibiting COVID-19 Spread on a University Campus</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Minhao Zhang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Geng Zhang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shengxi Zhang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zian Liu</string-name>
          <email>zliu@arcadia.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vitaly Ford</string-name>
          <email>fordv@arcadia.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Victoria Turygina</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Arcadia University</institution>
          ,
          <addr-line>450 S Easton Rd, Glenside, PA 19038</addr-line>
          ,
          <country country="US">U.S.A</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Ural Federal University</institution>
          ,
          <addr-line>Prospekt Lenina, 51, Yekaterinburg, Sverdlovskaya oblast', Russia, 620075</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Most U.S. states are slowly transitioning back to “normal”, and educational institutions must weigh maintaining course quality against protecting the health of students in the academic years ahead. We investigate the circumstances that would help schools stay open during COVID-19, creating safe educational conditions under such a severe situation. Our goal is to move a certain number of courses online to achieve a satisfactory infection rate as efficiently as possible. At the same time, we attempt to maximize the number of face-to-face classroom experiences, as most students prefer attending courses on campus over attending them online. In our model, we introduce three parameters to evaluate the risk of every course and determine the most suitable set of courses to be converted online: Degree Centrality, Closeness Centrality, and Betweenness Centrality. These parameters are aggregated into a rectified value. We describe the methodology of our approach and future work, in which we will conduct simulation and sensitivity analyses.</p>
      </abstract>
      <kwd-group>
        <kwd>COVID-19</kwd>
        <kwd>university campus</kwd>
        <kwd>centrality parameter</kwd>
        <kwd>social network</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>should reopen and how to protect the safety of students, faculty, and staff. We are interested in
investigating the circumstances that would allow schools to open again in the academic years
ahead, creating safe educational conditions under such a severe situation.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background Information</title>
      <p>One of the major safety concerns is raised by the fact that students regularly encounter many of
their peers on campus throughout the day and gather together in groups outside of classrooms.
Consequently, we decided to evaluate the degree of interconnectedness/centrality of students
with respect to a potential viral outbreak. Our research goal is to develop data-driven policies
and simulate virus spread on the Arcadia University campus to provide a safer environment
for students to take face-to-face classes by converting a targeted number of the courses to an
online format.</p>
      <p>
        Given that schools would follow the guidelines and procedures defined at the federal, state,
and local levels, we assume that students would follow safety practices and that only some courses
need to be transferred online. However, based on Laurence Steinberg’s article “Expecting
Students to Play It Safe if Colleges Reopen Is a Fantasy” [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], we take into consideration that the
students’ age group could be characterized as “high risk-takers” and that students are therefore prone
to allowing themselves some leeway, especially in informal group settings. There is thus a
possibility that students will not wear masks or follow the safety protocols when they
should.
      </p>
      <p>We realize that it is not possible to completely suppress the virus if it is already on
campus, even though it would be feasible to contain it, assuming that everyone follows the safety
guidelines. However, we hope that this research can provide a data-driven approach that could
be used as a reference for school decision-makers to make teaching plans.</p>
      <p>
        By studying Cornell University’s research findings [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], we found that the Cornell University
team used transcript data describing 3 enrollment networks that connect students via classes,
creating certain conditions that simulate the virus spread. Drawing on epidemiological
studies, they assumed that canceling face-to-face courses would reduce the spread of the
virus, which required building a bipartite network. Following their data acquisition
and implementation methodology, we built bipartite social network graphs based on students’
schedules from Arcadia University’s fall semester data to observe the degree of correlation
among students and classes. In addition, by analyzing statistical data, we simulated the spread of the
virus and removed the highly concentrated courses to observe whether moving some courses
and certain students online would significantly change the virus’s transmission results.
      </p>
      <p>In this study, we added extra parameters to simulate different situations that may occur on
campus. For example, we determined the virus spread with or without masks and the impact
of opening the dining hall.</p>
      <p>The core motivation for this research is to answer the following questions:</p>
      <p>1. Would schools be able to stay open during COVID-19 by moving high-sensitivity classes
and particular students online?</p>
      <p>2. How many students would be infected compared to the model without converting classes
to an online format?</p>
      <p>
        Our simulation (presented in the follow-up article [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]) is based on the assumption that the
school has reduced the infection rate as much as possible and can frequently test students on
campus. We started our simulation based on the policies presented in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], specifically focusing
on the initial value of <italic>R</italic><sub>0</sub> [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <sec id="sec-2-1">
        <title>2.1. Notable characteristics of our approach</title>
        <p>1. We have students’ actual course information, meaning that our data are real rather than
hypothetical, leading to more realistic predictions during the simulation analysis.</p>
        <p>
          2. The parameter <italic>R</italic><sub>0</sub> [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], defined as the metric of how many other people an infected
individual could infect, directly reflects the impact of our measures, such as “moving certain
courses online.” After obtaining an <italic>R</italic><sub>0</sub> adjusted during our simulation, we can quickly
retrieve updated results, including the number of infected people, and rerun
experiments numerous times with different parameters.
        </p>
        <p>
          3. To evaluate and determine a satisfactory result, we compare the infection rate of
students attending classes face-to-face with the general infection rate in the local region.
The general infection rate in Pennsylvania, represented by <italic>R</italic><sub>0</sub>, is 3 [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] based on CDC [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] at
the time of writing this paper. Such a comparison allows us to estimate the risk of getting
infected on campus versus staying home and taking classes online. If the infection rate
of students in the school is lower, we consider it a satisfactory result.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Data Source</title>
      <p>
        To analyze the interconnectedness among students on campus, we used their course
schedule information for fall 2020. We designed a program in Python available at [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] to obtain all
the course registration information and clean the data. After cleaning the raw data (e.g.,
removing students who have not registered at all), we created our
data source based on the information of 1,816 undergraduate students, including the registered
courses, the time period of each course, and the classroom locations. An example of the finalized
data format is presented in figure 1.
      </p>
      <p>
        Our dataset contains 1,816 students and 689 courses. We anonymized student IDs
representing each student because we focus on reducing the spread of the virus rather than analyzing
specific personal information. As shown in figure 1, we extracted the course codes and time
periods of the students, allowing us to cluster them accordingly. Based on these data, we can
determine which courses have the highest degree of centrality to decide if they need to be
moved online. Also, since we keep the data in a   format, it is straightforward to add
other appropriate parameters during the simulation. A more detailed analysis of the collected
data is available in the follow-up paper [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
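      <p>The cleaning and anonymization steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the program referenced in [7]: the row layout, the sample rows, and the names <monospace>RAW_ROWS</monospace>, <monospace>clean_and_anonymize</monospace>, and <monospace>invert</monospace> are hypothetical.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict

# Hypothetical registration rows: (student_id, course_code, time_period).
# The real field layout of the dataset is not reproduced here.
RAW_ROWS = [
    ("s-1001", "CS101", "MWF 09:00-09:50"),
    ("s-1001", "MA201", "TR 10:00-11:15"),
    ("s-1002", "CS101", "MWF 09:00-09:50"),
    ("s-1003", None, None),  # not registered: dropped during cleaning
]


def clean_and_anonymize(rows):
    """Drop unregistered students and replace IDs with sequential labels."""
    alias = {}                     # original ID -> anonymous label
    enrollment = defaultdict(set)  # anonymous label -> set of course codes
    for student_id, course, _period in rows:
        if course is None:
            continue
        label = alias.setdefault(student_id, f"student_{len(alias)}")
        enrollment[label].add(course)
    return dict(enrollment)


def invert(enrollment):
    """Build the course -> students side of the bipartite structure."""
    roster = defaultdict(set)
    for student, courses in enrollment.items():
        for course in courses:
            roster[course].add(student)
    return dict(roster)


enrollment = clean_and_anonymize(RAW_ROWS)
roster = invert(enrollment)
print(enrollment)
print(roster)
```
]]></preformat>
      <p>Keeping both mappings (student to courses and course to students) makes it cheap to look at the data from either side of the bipartite network.</p>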
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>We present a mathematical modeling network for COVID-19 at Arcadia University. Excluding
students who have not registered for the fall semester and graduate students, whose data were
unavailable, we gathered data on 1,816 undergraduate students out of about 2,500
undergraduate and graduate students to support our model. Most graduate programs at Arcadia
are online, so graduate students do not affect our simulation.</p>
      <p>Compared with universities that have a large student population, Arcadia has
relatively small classes, typically with fewer than 20 students. This greatly reduces contact among
students. Also, Arcadia has enough classroom resources to avoid possible infection caused by
classes happening one after another in the same room. Based on these conditions, our model is
designed to provide suggestions and help university leadership with the decision-making
process when they consider whether or how to bring students back for a residential fall semester.</p>
      <p>
        We are aware that an accurate prediction is not an attainable goal. We therefore established a model
with <italic>R</italic><sub>0</sub>, the most commonly used parameter in epidemic transmission, and determined the
initial <italic>R</italic><sub>0</sub> for schools under sufficient protection following the CDC recommendations [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
Our simulation goal is to reduce <italic>R</italic><sub>0</sub> as much as possible by decreasing the interconnectedness
among students, looking for a situation in which the majority of
students can take classes without being infected.
      </p>
      <p>
        Our approach focuses on finding the courses with the highest sensitivity coefficient in
the social network and shifting as few courses as possible to an online format to reduce
students’ direct contact and achieve an acceptable infection rate. We also use the method of
controlling variables to find the parameters that have the most impact on the epidemic model
and formulate relevant policies for them. Moreover, we remove large courses with students from
different majors to reduce the possibility of inter-departmental infection spread. It is worth
noting that our model is based on the scenario that students reduce the number of other forms
of contact as much as possible besides extracurricular activities; that is, students comply with
the CDC policy [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        The simulation outcome is optimistic. Even with a very pessimistic initial <italic>R</italic><sub>0</sub> value of
3.2, we can still reach a satisfactory <italic>R</italic><sub>0</sub> after moving only the top 10 high-sensitivity courses online,
resulting in fewer than 316 students being infected before the end of the semester, meaning that
very few students would potentially need to quarantine. We chose to base the simulation
results on moving 0, 5, and 10 courses online because these cases fit the approach we take to
evaluate our proposed policy measures. Based on our calculations in the simulation analysis,
removing 0, 5, and 10 classes from the network yields a number of infected students in
the face-to-face semester that is, respectively, higher than, roughly equal to, and lower than that of a
fully online semester. In the simulation section of our follow-up article [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], this will be clearly
demonstrated.
      </p>
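      <p>As a rough illustration of what “moving the top courses online” does to the contact network, the sketch below counts the distinct student pairs that still share an in-person course after the first <italic>k</italic> courses of a ranking are removed. It is only a structural proxy under stated assumptions, not the epidemic simulation of [3]: the roster, the ranking, and the function name <monospace>contact_pairs</monospace> are hypothetical.</p>
      <preformat><![CDATA[
```python
from itertools import combinations

# Hypothetical course rosters and risk ranking (highest risk first).
ROSTER = {
    "CS101": {"s0", "s1", "s2", "s3"},
    "MA201": {"s0", "s4"},
    "AR110": {"s4", "s5"},
}
RANKING = ["CS101", "MA201", "AR110"]


def contact_pairs(roster, online):
    """Count distinct student pairs that still share an in-person course."""
    pairs = set()
    for course, students in roster.items():
        if course in online:
            continue  # course moved online: no classroom contact
        pairs.update(combinations(sorted(students), 2))
    return len(pairs)


remaining = {k: contact_pairs(ROSTER, set(RANKING[:k])) for k in (0, 1, 2)}
for k, count in remaining.items():
    print(f"courses moved online: {k}, remaining contact pairs: {count}")
```
]]></preformat>
      <p>In this toy example, removing the single largest course eliminates most classroom contact pairs, which mirrors the intuition behind shifting as few courses as possible.</p>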
      <p>
        Public places with heavy traffic, such as cafeterias, should be closed; otherwise, the spread
of the epidemic would never be controlled. We have provided an alternative for these necessary
closures, and it will be presented in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <sec id="sec-4-1">
        <title>4.1. Bipartite social network graphs of collected data</title>
        <p>We collected the course information of all students and established the network diagram
between “students and courses” and between “students and students.” In figure 2, we identified
18 components including one big component and 17 small components. We focus our attention
on the one big component.</p>
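        <p>Identifying the components of such a network can be sketched with a breadth-first search. The toy graph and the function name <monospace>connected_components</monospace> below are hypothetical and serve only to illustrate the idea of separating one large component from smaller ones.</p>
        <preformat><![CDATA[
```python
from collections import deque

# Hypothetical toy bipartite graph: node -> set of neighbours.
GRAPH = {
    "student_0": {"CS101", "MA201"},
    "student_1": {"CS101"},
    "student_2": {"AR110"},
    "CS101": {"student_0", "student_1"},
    "MA201": {"student_0"},
    "AR110": {"student_2"},
}


def connected_components(graph):
    """Return the components as a list of node sets, largest first."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = {start}
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


components = connected_components(GRAPH)
giant = components[0]
print(len(components), sorted(giant))
```
]]></preformat>
        <p>The first element of the returned list plays the role of the big component on which the analysis concentrates.</p>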
        <p>We made two diagrams, one for the network of “students and students” and one for the network of
“students and courses.” In the students-only figure, there are
numerous connections among students, represented by the red lines, meaning that if the school opens
without taking any effective measures, the spread of the epidemic will be significant enough to close it
down within a week.</p>
        <p>In the second graph of “students and courses” (figure 3), we use asterisks to represent classes;
the more students there are in a class, the larger the symbol. Most of the courses are
interconnected via students’ connections.</p>
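        <p>The relation between the two diagrams can be sketched as a projection: two students are linked in the “students and students” network whenever they appear in the roster of the same course. The roster and the function name <monospace>student_projection</monospace> are hypothetical.</p>
        <preformat><![CDATA[
```python
from itertools import combinations

# Hypothetical course rosters (course -> enrolled students).
ROSTER = {
    "CS101": {"student_0", "student_1", "student_2"},
    "MA201": {"student_0", "student_3"},
}


def student_projection(roster):
    """Link two students whenever they share at least one course.

    Returns a dict mapping each student pair to the courses they share.
    """
    edges = {}
    for course, students in roster.items():
        for a, b in combinations(sorted(students), 2):
            edges.setdefault((a, b), set()).add(course)
    return edges


edges = student_projection(ROSTER)
for pair, shared in sorted(edges.items()):
    print(pair, sorted(shared))
```
]]></preformat>
        <p>A course with <italic>n</italic> students contributes up to <italic>n</italic>(<italic>n</italic> − 1)/2 student pairs, which is why large courses dominate the students-only diagram.</p>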
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Virus spread parameters and assumptions</title>
        <p>
          We used three different <italic>R</italic><sub>0</sub> values to simulate the spread of the virus in optimistic, nominal, and
pessimistic scenarios. The main difference among those scenarios is the number of days the students
can potentially infect others before they are quarantined. On a relatively small campus like
Arcadia, it is possible to conduct regular virus testing so that latent or asymptomatic
patients can be quarantined as soon as possible, which would significantly reduce the number of days infected students
could transmit the virus. In the paper [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], the researchers
estimated <italic>R</italic><sub>0</sub> in three different situations as well, namely optimistic, nominal, and pessimistic,
defining the seriousness of the virus spread accordingly. However, due to the large number of
graphs per scenario, we focused our attention on the pessimistic <italic>R</italic><sub>0</sub> value, as it describes the
worst infection spread.
        </p>
        <p>
          These data were used to simulate the spread of the virus at Cornell University [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
        <p>Note: As Arcadia has fewer students (no more than 20) in one class, the above-mentioned
numbers are slightly higher than Arcadia-related numbers.</p>
        <p>
          We assume that everyone will comply with the policy and wear masks. Arcadia’s
administration has issued a document explaining the requirements for the face-to-face meetings,
including a requirement to wear a mask in public places with a note that failure to comply
will constitute a violation of the code of conduct and could result in disciplinary actions [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
Classrooms and other public places will be effectively and regularly cleaned. In the paper [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ],
the researchers have also simulated a similar scenario.
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Goals of the proposed policies</title>
        <p>We will turn a certain number of courses into an online format to achieve a satisfactory
infection rate as efficiently as possible. At the same time, we attempt to maximize the number of face-to-face
classroom experiences, as most students prefer attending courses on campus over attending
them online.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Data analysis</title>
        <p>In our model, we introduce three parameters to evaluate the risk of every course and
determine the most suitable set of courses to be converted online. The parameters include Degree
Centrality, Closeness Centrality, and Betweenness Centrality.</p>
        <p>To calculate these three parameters, we need to understand what story they tell in our
specific scenario:
• Degree Centrality identifies the important nodes with many connections.
• Closeness Centrality identifies the important nodes that are close to each other.
• Betweenness Centrality identifies the important nodes that are located on the shortest
paths between other nodes on the network.</p>
        <p>In our model, we assume that the most important nodes have many connections and the
shortest path between any two nodes is short enough for a fast virus spread.</p>
        <p>
          Under the above assumptions, the parameters mentioned above are defined
by the following equations [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
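        <p>Degree Centrality:
          <disp-formula><label>(3)</label><tex-math><![CDATA[C_D(v) = \frac{d_v}{|V| - 1}]]></tex-math></disp-formula>
        </p>
        <p>Closeness Centrality:
          <disp-formula><label>(4)</label><tex-math><![CDATA[C_C(v) = \frac{|V| - 1}{\sum_{u \in V \setminus \{v\}} d(u, v)}]]></tex-math></disp-formula>
        </p>
        <p>Betweenness Centrality:
          <disp-formula><label>(5)</label><tex-math><![CDATA[C_B(v) = \sum_{s, t \in V} \frac{\sigma_{s,t}(v)}{\sigma_{s,t}}]]></tex-math></disp-formula>
        </p>
        <p>where:</p>
        <p><inline-formula><tex-math>V</tex-math></inline-formula> is the set of the nodes in the network.</p>
        <p><inline-formula><tex-math>d_v</tex-math></inline-formula> is the degree of node <inline-formula><tex-math>v</tex-math></inline-formula>.</p>
        <p><inline-formula><tex-math>d(u, v)</tex-math></inline-formula> is the distance between nodes <inline-formula><tex-math>u</tex-math></inline-formula> and <inline-formula><tex-math>v</tex-math></inline-formula>.</p>
        <p><inline-formula><tex-math>\sigma_{s,t}</tex-math></inline-formula> is the number of the shortest paths between nodes <inline-formula><tex-math>s</tex-math></inline-formula> and <inline-formula><tex-math>t</tex-math></inline-formula>.</p>
        <p><inline-formula><tex-math>\sigma_{s,t}(v)</tex-math></inline-formula> is the number of shortest paths between nodes <inline-formula><tex-math>s</tex-math></inline-formula> and <inline-formula><tex-math>t</tex-math></inline-formula> that pass through node <inline-formula><tex-math>v</tex-math></inline-formula>.</p>
        <p>Note: all the nodes belong to the set of courses.</p>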
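        <p>The three centrality measures can be computed with breadth-first searches, as in the following sketch. It is an illustration under stated assumptions rather than the implementation used in this study: the toy course graph is hypothetical, the graph is assumed to be connected and unweighted, and betweenness is accumulated with Brandes' algorithm, counting each unordered pair of end nodes once.</p>
        <preformat><![CDATA[
```python
from collections import deque

# Hypothetical course-to-course graph: two courses are linked when they
# share at least one student. Path A - B - C - D with a pendant E on B.
GRAPH = {
    "A": {"B"},
    "B": {"A", "C", "E"},
    "C": {"B", "D"},
    "D": {"C"},
    "E": {"B"},
}


def bfs(graph, source):
    """Return distances, shortest-path counts, predecessors, visit order."""
    dist = {source: 0}
    sigma = {node: 0 for node in graph}
    sigma[source] = 1
    preds = {node: [] for node in graph}
    order = []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
            if dist[nxt] == dist[node] + 1:
                sigma[nxt] += sigma[node]
                preds[nxt].append(node)
    return dist, sigma, preds, order


def degree_centrality(graph):
    n = len(graph)
    return {v: len(graph[v]) / (n - 1) for v in graph}


def closeness_centrality(graph):
    # ASSUMPTION: the graph is connected (e.g. the largest component).
    n = len(graph)
    return {v: (n - 1) / sum(bfs(graph, v)[0].values()) for v in graph}


def betweenness_centrality(graph):
    """Brandes' accumulation; each unordered pair {s, t} is counted once."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        _dist, sigma, preds, order = bfs(graph, s)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for p in preds[w]:
                delta[p] += sigma[p] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: value / 2 for v, value in score.items()}


degree = degree_centrality(GRAPH)
closeness = closeness_centrality(GRAPH)
betweenness = betweenness_centrality(GRAPH)
print(degree["B"], closeness["B"], betweenness["B"])
```
]]></preformat>
        <p>In the toy graph, course B has the most neighbours, the smallest total distance to the other courses, and lies on the most shortest paths, so it scores highest on all three measures.</p>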
        <p>In our social network, Degree Centrality describes the number of classes connected to a
specific student as well as how many students are connected to a particular course. Closeness
Centrality explains how a course connects to other courses in the school and represents the sum
of the shortest paths (the start points are the students in the class) to all students. Betweenness
Centrality is similar to Closeness Centrality, except that it represents the importance of courses
that are located on the path for the other courses to become connected. When a course has a
high degree of Betweenness Centrality, it becomes the Suez Canal in our network: it greatly
reduces the shortest path for other courses to become connected.</p>
        <p>After obtaining the three centrality parameters of all nodes, we can get the rectified value of
each node. The node with a high rectified value usually has more connections, closer distance,
and stronger connectivity with other nodes.</p>
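        <p>We define the function of rectification as the following:
          <disp-formula><label>(6)</label><tex-math><![CDATA[R(v) = \bar{C}_D(v) + \bar{C}_C(v) + \bar{C}_B(v)]]></tex-math></disp-formula>
where the barred terms denote the rectified Degree, Closeness, and Betweenness Centrality of node <inline-formula><tex-math>v</tex-math></inline-formula>.</p>
        <p>Through the rectification method, we remove the influence of weight on the pure numerical
values and look at the aggregate result across all centrality parameters.</p>
        <p>The outcome of arranging the courses from high to low according to their rectified values is
presented in table 1.</p>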
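        <p>The aggregation and ranking step can be sketched as follows. The exact rescaling behind the rectified value is not reproduced here, so the sketch ASSUMES a min-max rescaling of each centrality to the range [0, 1] before summation; the scores, course codes, and function names are hypothetical.</p>
        <preformat><![CDATA[
```python
# Hypothetical centrality scores for four courses (illustrative only).
DEGREE = {"CS101": 0.75, "MA201": 0.50, "AR110": 0.25, "EN101": 0.25}
CLOSENESS = {"CS101": 0.80, "MA201": 0.60, "AR110": 0.40, "EN101": 0.50}
BETWEENNESS = {"CS101": 5.0, "MA201": 3.0, "AR110": 0.0, "EN101": 0.0}


def rescale(scores):
    """Min-max rescale to [0, 1].

    ASSUMPTION: this normalization is one common choice and is not
    necessarily the one behind the rectified value in the text.
    """
    low, high = min(scores.values()), max(scores.values())
    span = high - low
    if span == 0:
        return {k: 0.0 for k in scores}
    return {k: (v - low) / span for k, v in scores.items()}


def rectified_values(*centralities):
    """Sum the rescaled centralities for each course."""
    scaled = [rescale(c) for c in centralities]
    return {k: sum(s[k] for s in scaled) for k in scaled[0]}


rectified = rectified_values(DEGREE, CLOSENESS, BETWEENNESS)
ranking = sorted(rectified, key=rectified.get, reverse=True)
print(ranking)
```
]]></preformat>
        <p>Rescaling before summation keeps a measure with a large numerical range, such as betweenness, from dominating the aggregate.</p>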
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>
        In this paper, we presented our initial analysis of the academic course data, demonstrating that
the rectified centrality value (based on the degree centrality, closeness centrality, and
betweenness centrality) can identify courses that would be beneficial to switch to the online modality.
We described the methodology and algorithms taking into account such information as course
location, time, and the number of enrolled students. In the paper that will follow this
article [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], we will expand on the knowledge obtained in this research and conduct simulations
with sensitivity analysis and recommendations for building a safer environment on campus.
      </p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>We would like to thank the Computer Science and Math Department, Arcadia University, for
providing support and help during this research.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Steinberg</surname>
          </string-name>
          , Expecting Students to Play It Safe if Colleges Reopen Is a Fantasy,
          <year>2020</year>
          . URL: https://www.nytimes.com/2020/06/15/opinion/coronavirus-college-safe.html.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Cashore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Duan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Janmohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Henderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Shmoys</surname>
          </string-name>
          , P. Frazier,
          <source>COVID-19 Mathematical Modeling for Cornell's Fall Semester</source>
          ,
          <year>2020</year>
          . URL: https://people.orie.cornell.edu/pfrazier/COVID 19.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , G. Zhang,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Turygina</surname>
          </string-name>
          ,
          <article-title>Application of Data-Driven Measures for Impeding COVID-19 Spread at an Academic Institution</article-title>
          , in: accepted, 26th International Conference Information Society and University Studies - IVUS,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Ridenhour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Kowalik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Shay</surname>
          </string-name>
          ,
          <article-title>Unraveling R0: Considerations for public health applications</article-title>
          ,
          <source>American journal of public health 108</source>
          (
          <year>2018</year>
          )
          <fpage>S445</fpage>
          -
          <lpage>S454</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Matrajt</surname>
          </string-name>
          , T. Leung,
          <article-title>Evaluating the effectiveness of social distancing interventions against COVID-19</article-title>
          ,
          <source>medRxiv</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] Centers for Disease Control and Prevention,
          <article-title>Searching CDC site</article-title>
          ,
          <year>2021</year>
          . URL: https://search.cdc.gov/search/index.html.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] Zhang, Minhao,
          <source>Coronavirus: A Case Study at Arcadia University</source>
          ,
          <year>2020</year>
          . URL: https://github.com/zhangminhao00/Coronavirus-A-Case-Study-At-Arcadia-University.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] Centers for Disease Control and Prevention,
          <article-title>Considerations for Institutions of Higher Education</article-title>
          ,
          <year>2020</year>
          . URL: https://www.cdc.gov/coronavirus/2019-ncov/community/colleges-universities/considerations.html.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9] Centers for Disease Control and Prevention,
          <article-title>How to Protect Yourself &amp; Others</article-title>
          ,
          <year>2021</year>
          . URL: https://www.cdc.gov/coronavirus/2019-ncov/prevent-getting-sick.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10] Arcadia University,
          <article-title>Conduct and Compliance</article-title>
          ,
          <year>2020</year>
          . URL: https://www.arcadia.edu/covid-19-health-and-safety-plan/conduct-and-compliance.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P.</given-names>
            <surname>Hage</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Harary</surname>
          </string-name>
          ,
          <article-title>Eccentricity and centrality in networks</article-title>
          ,
          <source>Social networks 17</source>
          (
          <year>1995</year>
          )
          <fpage>57</fpage>
          -
          <lpage>63</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>