=Paper= {{Paper |id=Vol-2884/paper_119 |storemode=property |title=COVID-19 in Spain and India: Comparing Policy Implications by Analyzing Epidemiological and Social Media Data |pdfUrl=https://ceur-ws.org/Vol-2884/paper_119.pdf |volume=Vol-2884 |authors=Parth Asawa,Manas Gaur,Kaushik Roy,Amit Sheth }} ==COVID-19 in Spain and India: Comparing Policy Implications by Analyzing Epidemiological and Social Media Data== https://ceur-ws.org/Vol-2884/paper_119.pdf
    COVID-19 in Spain and India: Comparing Policy Implications by Analyzing
                    Epidemiological and Social Media Data
                    Parth Asawa¹, Manas Gaur², Kaushik Roy², Amit Sheth²
                    ¹ Monta Vista High School, Cupertino, CA, USA
                    pgasawa@gmail.com
                    ² Artificial Intelligence Institute, University of South Carolina, Columbia, SC, USA
                    mgaur@email.sc.edu, kaushikr@email.sc.edu, amit@sc.edu


                        Abstract

The COVID-19 pandemic has forced public health experts to develop contingent policies to stem the spread of infection, including measures such as partial/complete lockdowns. The effectiveness of these policies has varied with geography, population distribution, and effectiveness in implementation. Consequently, some nations (e.g., Taiwan, Haiti) have been more successful than others (e.g., United States) in curbing the outbreak. A data-driven investigation into effective public health policies of a country would allow public health experts in other nations to decide future courses of action to control the outbreaks of disease and epidemics. We chose Spain and India to present our analysis on regions that were similar in terms of certain factors: (1) population density, (2) unemployment rate, (3) tourism, and (4) quality of living. We posit that citizen ideology obtainable from Twitter conversations can provide insights into conformity to policy and suitably reflect on future case predictions. A milestone when the curves show the number of new cases diverging from each other is used to define a time period to extract policy-related tweets, while the concepts from a causality network of policy-dependent sub-events are used to generate concept clouds. The number of new cases is predicted using sentiment scores in a regression model. We see that the new case predictions reflect Twitter sentiment, meaningfully tied to a trigger sub-event that enables policy-related findings for Spain and India to be effectively compared.

Figure 1: Top Row: April 11th, Spain 27.8% and India 4.3%, where x% refers to the share of COVID-19 tests that came back as positive in a 7-day rolling average. Bottom Row: June 5th, Spain 0.9% and India 6.7%, where x% refers to the share of COVID-19 tests that came back as positive in a 7-day rolling average.

                        Introduction

The COVID-19 pandemic has seen several countries become epicenters for spread. Spain was one such country; however, its policies were effective in curbing the initial outbreak of COVID-19 in March-May of 2020. This is arguably due to people and governments taking precautions from an early stage to limit the population of people susceptible to the virus (masks, social distancing, lockdowns, business closures, etc.)¹. Accordingly, the effectiveness of individual countries' policy responses to an epidemic or pandemic can be determined by how well citizens respond to those policies². A person's conformity to a policy may be inferred from their ideologies mined through social media, such as Twitter (van Holm et al. 2020). As shown in Figure 1, over three months, Spain recorded a decline of 97% in the number of new cases, whereas India has shown a 36% increase in new patients. Is it possible to explore policy transfer from Spain to India to curb the alarming COVID-19 cases? Could the number of infections be modeled using the Twitter concepts about causal trigger sub-events in a causality network (Helbing, Ammoser, and Kühnert 2006)? We conduct this study because there is limited prior research relating policy to changes in case counts, through social media analysis, for COVID-19. We use Twitter as the active platform for live information on the spread of COVID-19. Government policies based on the epidemiological data, especially in developing nations, ignore the population-specific behaviors of culture, ideology, and politics that hinder these policies' implementation. For example, a large number of people in the US are opposed to wearing masks. To this end, we juxtapose Spain and India's epidemiological data to identify a date when the curves show the number of new cases diverging from each other and India started showing worsening conditions. Although it could be argued that the differences we see in cases were due to travel from hotspots, it is important to note that India closed its borders by suspending all international flights starting March 22nd, in addition to taking steps to suspend inter-state travel by suspending domestic flights and domestic trains throughout the time frame of our analysis³. We recognized some critical policy-related concepts which are causally related in the COVID-19 context. For instance, “settlement areas”, “confinement to barracks”, “mistrust of people”, and “loss of government authority” causally follow the announcement of “public policy”. Hence, we used the causality network of policy-related concepts identified by experts during severe acute respiratory syndrome (SARS) to perform a knowledge-guided search on Twitter (Helbing, Ammoser, and Kühnert 2006) (see Figure 2). We show Kerala and Mumbai's policy-related concept clouds. Then we investigate the applicability of interventional policies in Madrid and Barcelona to Kerala and Mumbai. Likewise, we observed a policy-level association between the Canary Islands and Andhra Pradesh, as both regions have strong healthcare infrastructure.

The main contributions of this work are thus investigating Twitter conversations corresponding to explanatory causal trigger events, to form an ideological map of the population that provides insights into response to government policy (see Methods). In turn, this is validated through the prediction of new cases using the sentiment scores of the Twitter conversation (see Regression Analysis and Explanatory events). Finally, a comparison of policy and responses across similar regions in Spain and India is discussed (see Discussion and Findings).

Figure 2: Causality network of sub-events during the SARS pandemic by (Helbing, Ammoser, and Kühnert 2006). We utilized this graph to represent sub-events within the COVID-19 pandemic during extraction of the word cloud.

                        Related Work

(Cowling et al. 2020) statistically analyzed the impact of policy on reducing the transmissibility rate of COVID-19. The study was conducted on the epidemiological data of Hong Kong, and inferences were made using confidence intervals. Our research aims to investigate the applicability of policies created by developed nations onto developing nations. Such an exploration is not possible in Cowling et al.'s study. Further, Cowling et al. provide statistical explanations on government policies' potency in Hong Kong rather than conceptual explanations, which are required to decide the “what next.” While probing government policies' relevance from one nation to another, population-specific behaviors negatively affect cross-nation policy transfer. For instance, a likely source of infection in India was the Tablighi Jamaat movement, a religious gathering⁴, which became a coronavirus vector and was not taken into account in government policy or enforcement (Sivaraman et al. 2020). Likewise, the return of migrant laborers to their home states in India and long weekend celebrations and parties in the United States led to an increase in COVID-19 cases. As a result, policies such as reopening, contact tracing, and ensuring public compliance, which were effective in Europe, are not directly applicable to India and the United States (Hellewell et al. 2020). It is essential to relate patterns in epidemiological data with evolving policy-related concepts and sentiment on social media to better study the likelihood of policy effectiveness (Kalteh and Rajabi 2020). Other regression models that predict new cases do not consider social media information, which we posit is a significant predictor (Shayak, Sharma, and Gaur 2020; Prem et al. 2020).

                        Materials and Methods

Materials

In this research problem, we use multiple publicly available datasets and government resources, specific to Spain and India (e.g., news reports, insights on epidemiological data).

The first country dataset is a COVID-19 dataset for Spain. The dataset is available here: Link. It contains attributes including but not limited to: Total # of Cases, Total # of Hospitalizations, Total # of Patients in the ICU, Total # of Recovered Patients, and Total # of New Cases. The dataset was derived entirely from Spain's Ministry of Health website and transformed into CSV files. All of the data is available by province (the equivalent of states in the United States). The second dataset we use is a COVID-19 dataset for India, available here⁵. This dataset contains attributes including but not limited to: # of Confirmed Cases per Day, # of Recovered per Day, # of Deaths per Day, # of People in the ICU, and # of People on Ventilators. The dataset was compiled from several sources, a list of which can be found here⁶. All of the data is available on a state-by-state level within India.

After using the two datasets to identify divergence points and to frame the problem, the final dataset we use is a dataset of Twitter IDs for our Twitter social media analysis, available here⁷. As stated in the dataset,

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
¹ https://www.healthaffairs.org/doi/10.1377/hlthaff.2020.00818
² http://bit.ly/citizenResponses
³ https://www.nytimes.com/article/coronavirus-travel-restrictions.html
⁴ https://www.aljazeera.com/news/2020/04/tablighi-jamaat-event-india-worst-coronavirus-vector-200407052957511.html
⁵ https://api.covid19india.org/
⁶ https://telegra.ph/Covid-19-Sources-03-19
⁷ https://github.com/echen102/COVID-19-TweetIDs
“The repository contains an ongoing collection of tweets IDs associated with the novel coronavirus COVID-19 (SARS-CoV-2), which commenced on January 28, 2020. We used Twitter's search API to gather historical Tweets from the preceding seven days, leading to the first Tweets in our dataset dating back to January 21, 2020.” This dataset gives us access to Tweet IDs pre-filtered for the coronavirus, with the keywords accessible here⁸. From this dataset, we hydrated 5,075,830 tweets from April 15 to May 15, of which 534 were geotagged from the state of Kerala and 7,094 from Mumbai.

Methods

We want to analyze the differences between the spread of the virus in Spain and India; however, the countries are too diverse to compare in their entirety. Thus, we instead propose comparing the two countries on more granular scales, specifically by identifying pairs of states/regions (India/Spain) that are similar on the following grounds: (1) population density, (2) unemployment rate, (3) tourism, and (4) quality of living, and examining the results. For this study, we restrict to the following two pairs of states/regions: (1) Kerala and Madrid, and (2) Maharashtra (Mumbai city) and Cataluña (Barcelona region).

On the data from these states/regions, we visualized counts of new cases during April and May. This period was essential to assess the effectiveness of government policies in controlling the COVID-19 pandemic. By creating pairs of states/regions from India and Spain, we identified divergence points where India started showing worsening public health. Figure 4 shows May 1st, 2020, as the divergence point for Kerala and Madrid. Likewise, April 22nd, 2020, is the divergence point for Mumbai and Barcelona (Figure 5).

Once the relevant timeframe is defined, we extract tweets geotagged to the local Indian regions, such as Kerala and Mumbai. This allows us to explore people's responses to government policies, which helps assess the rise in COVID-19 cases. Semantically understanding people's reactions from their Twitter conversations is a challenging task for statistical natural language processing. Hence, we utilize a hypothesized causal graph of policy-dependent sub-events in Helbing et al., which describes a series of activities occurring during a pandemic. Some of the concepts described by Helbing et al. are mistrust, church hospitals, mask distribution, and mental health. We identify a set of relevant concepts that describe Kerala and Mumbai's tweets using a pre-trained multilingual ConceptNet model from a SemEval task (Speer and Lowry-Duda 2017). We use the spaCy parser to generate phrase embeddings of concepts and nouns extracted from tweets⁹. Next, we compute the cosine similarity between the tweet vector and the concept vector, with an empirically determined threshold of 0.45. The frequency of concept phrases was recorded and presented as people's responses in the given region during the given time frame.

                        Exploratory Data Analysis

We begin by performing a preliminary visualization of the dataset. In Figure 4, we observe the new case counts in Kerala scaled up by a factor of 100 (for trend visibility) compared to Madrid's region. It seems that the data points remained reasonably close from March 15th to May 1st, after which there is a second wave of COVID-19 spread in Kerala. In contrast, Madrid remained relatively close to 0 for the rest of the period. This divergence of Kerala from its previous relative similarity to Madrid is a key feature we intend to explore using real-time conversations on Twitter. Through semantic analysis of Kerala's tweets around the point of inflection, we recorded mentions of gatherings such as marriages and of poor capacity of the health system, which are potential causes of the rise in new cases (see Figure 6).

Furthermore, people mentioned information on ways of transmission with no known source of origin, prompting the government to reinstate lockdown procedures. Overextension of lockdown by the government produced a panic reaction among individuals in Kerala. The state also saw a lack of cooperation among authorities in affected regions, which contributed to a surge in cases. Rumors circulated through misleading campaigns, creating uncertainty and fear that upset people's livelihood in Kerala and made them restless in critical containment zones. From April to May, people's responses to government policies showed expressions of social instability, unemployment, uncontrolled infection transmission, and circulation of rumors.

In Figure 5, we observe the plots of daily new cases in Maharashtra (whose case counts were almost all from Mumbai) and Cataluña (Barcelona, Spain). First, it seems that the data points remained fairly close from March 15th to April 22nd, at which point the new cases in Cataluña remained fairly close to 0 for the rest of the period. Though the population density and social composition of Mumbai are different from Kerala, we recorded the use of similar concept phrases reflecting similar consequences of government policies. Examples include social instability, reaching out to Catholic hospitals¹⁰ (or church hospitals), seeking military aid during lockdown¹¹, mental health, panic reaction, and people seeking therapy. Compared to Kerala, Mumbai showed a significant rise in unemployment, which is relatively similar to the trend in unemployment in Barcelona and Madrid¹². The situation of unemployment remained constant from April to May in Kerala and Mumbai. Further, the concept of “general population behavior” describes the migrant population, which constituted 93% of the workforce in India and contributed to the rise in COVID-19 cases as people travelled back to their homes for security. These external factors, which aren't recorded in epidemiological data but explain epidemiology

⁸ https://github.com/echen102/COVID-19-TweetIDs/blob/master/keywords.txt
⁹ https://spacy.io/api/dependencyparser
¹⁰ https://www.licas.news/2020/06/18/as-indias-healthcare-system-struggles-with-covid-19-catholic-hospitals-join-the-front-line/
¹¹ https://www.thehindu.com/news/cities/mumbai/lockdown-state-seeks-armys-help/article31188053.ece
¹² https://www.theolivepress.es/spain-news/2020/05/12/madrid-and-barcelona-both-rank-in-the-bottom-10-of-best-cities-for-jobs-following-coronavirus-crisis/
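
The concept-matching step described in Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the toy three-dimensional vectors stand in for the ConceptNet/spaCy phrase embeddings, the names `cosine_similarity`, `match_concepts`, and `THRESHOLD` are hypothetical, and treating the cutoff as inclusive (>=) is an assumption. Only the 0.45 value comes from the text.

```python
import math
from collections import Counter

THRESHOLD = 0.45  # cutoff reported in the text; inclusive comparison is an assumption


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_concepts(tweet_vectors, concept_vectors, threshold=THRESHOLD):
    """Count, per concept, how many tweets reach the similarity threshold."""
    counts = Counter()
    for tweet_vec in tweet_vectors:
        for concept, concept_vec in concept_vectors.items():
            if cosine_similarity(tweet_vec, concept_vec) >= threshold:
                counts[concept] += 1
    return counts


# Toy embeddings (assumed values, for illustration only).
concepts = {
    "mistrust": [1.0, 0.0, 0.0],
    "mental health": [0.0, 1.0, 0.0],
}
tweets = [
    [0.9, 0.1, 0.0],  # close to "mistrust"
    [0.1, 0.8, 0.2],  # close to "mental health"
    [0.0, 0.0, 1.0],  # matches neither concept
]
concept_frequency = match_concepts(tweets, concepts)
```

The resulting per-concept counts play the role of the concept-phrase frequencies that feed the concept clouds; with real embeddings, the threshold would need to be re-tuned for the embedding model in use.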
patterns, should be incorporated in models like SIR to better estimate the future patterns in the spread of disease (Sivaraman et al. 2020). As we can see, within both states, the topical content being discussed is relatively the same. In the time series curve, including April, we saw that the number of new coronavirus cases per day increased steadily, with a slight curvature. This indicates that the similarity in thinking over time compounded, possibly resulting in the eventual seemingly exponential growth in the spread of COVID-19. We will next validate whether these thinking patterns captured in Twitter sentiments are a good predictor of new cases.

Regression Analysis and Explanatory events
We use Multivariate Linear Regression (MVR) with tweet sentiment to predict future cases in Kerala and Mumbai's regions from mid-April to mid-May, over a month across

Figure 3: Workflow detailing the approach described in this study to analyze citizen response to policies and generate explainable inferences on the epidemiological data, in addition to predicting future changes in the spread of an epidemic.

Figure 4: Daily New Cases of COVID-19 in Kerala (scaled up by 100 for visibility) and Madrid plotted against time from March 15th to June 1st, with an identified Divergence Point where the two curves no longer follow the same trend.

Figure 5: Daily New Cases of COVID-19 in Maharashtra (Mumbai City) and Cataluña (Barcelona region) plotted against time from March 15th to June 1st, with an identified Divergence Point where the two curves intersected.
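
The text does not say how the divergence points in Figures 4 and 5 were chosen, and they may well have been read off the plots. As one possible way to locate such a point programmatically, the sketch below returns the first day from which two daily new-case curves stay further apart than a tolerance. The function name `find_divergence_index`, the tolerance, and the sample numbers are all hypothetical and are not the paper's data or method.

```python
def find_divergence_index(series_a, series_b, tolerance):
    """Return the first index from which |a - b| exceeds `tolerance` on every
    remaining day, or None if the two curves never separate for good."""
    if len(series_a) != len(series_b):
        raise ValueError("series must cover the same days")
    divergence = None
    for i, (a, b) in enumerate(zip(series_a, series_b)):
        if abs(a - b) > tolerance:
            if divergence is None:
                divergence = i  # candidate start of a lasting gap
        else:
            divergence = None   # curves came back together; discard candidate
    return divergence


# Made-up daily new-case counts (illustration only).
region_a = [10, 12, 11, 9, 8, 15, 22, 30, 41]
region_b = [11, 13, 10, 8, 6, 4, 3, 2, 1]
day = find_divergence_index(region_a, region_b, tolerance=5)
```

If one series is rescaled for comparison, as Kerala is in Figure 4, the same scaling would have to be applied before calling the function, and the tolerance chosen on that scale.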
different periods. To determine each tweet's sentiment, we use the flairNLP Python library¹³. We combine sentiments of concepts (Figures 6 and 7) identified from each tweet into daily sentiment values for the period from April 16th to May 14th/15th. We then fit one MVR model using the features described in the Materials section and another that also includes tweet sentiment. The first MVR model uses the past 30 days of new cases and recovered cases to predict the next 30, and the second MVR model also uses tweet sentiment to predict the next 30 days. We use a cumulative function on both new cases and recovered cases to better reflect the upward trend.

We find that the regression error does indeed decrease when using the tweet sentiments. We specifically look at the differences in the RMSE values and the adjusted R² for quantitative performance gains. Further, we use periods of 3, 7, and 14 days from May 15th for the two MVR models, as these have been shown in (Pavlicek, Rehak, and Kral 2020) to be the periods of days with which COVID-19 deaths show regularities (see Tables 1 and 2). Previous literature suggests that the RMSE uncertainty for this number of data points would be approximately 12.9% (Faber 1999).

A model's explainability is vital in such a high-stakes application for humans to trust and understand its predictions. While the weights of a linear model lend themselves nicely to interpretation, they alone do not provide any insight into the type of events that may have triggered such conversation on Twitter. For tweets with concepts of high sentiment score weight in the model, we use the causal graph (Helbing, Ammoser, and Kühnert 2006) built for the SARS epidemic to provide explanatory sub-event triggers for those concepts. An example is shown in Figure 2, where the causal structure of sub-events that guided the extraction of Twitter conversation is marked. The government can use this graphical explanation to shape its policy going forward.

Note that the dataset of Mumbai tweets was 14 times more extensive than Kerala's, resulting in high RMSE. We see a more noticeable difference in adjusted R² and RMSE values for Mumbai further in time from May 15th than we do for Kerala, except for the 14-day period. Thus, we believe that this research can be explored further, with potentially more statistically significant findings, through access to larger datasets and more extensive experimentation. However, the gain in accuracy from using sentiment does seem to grow for both states further away from May 15th, i.e., the model extrapolates better.

                        Discussion and Findings

In this paper, we presented a methodology to determine crowd responses to governmental policies that can impact health and new case predictions in real-time, and evaluate those responses to provide direction for new public health policy.

Figure 6: After the first wave of COVID-19 spread in the month of March, the government of India instituted various policies, such as school closings, business closings, travel bans, and over-extensions, which impacted public life, especially for daily wage families. Hence, we see a rise in the frequency of tweets concerning mental health, medical care, and unemployment. As a consequence of the policies, we observe emerging events such as rumors, churches becoming hospitals due to overloaded healthcare facilities, social instability, and mistrust (in rectangle black box). Through citizen sensing around the point of inflection (Figure 4), we noticed a constant frequency of concepts such as poor public life and bad condition of the state, which reflected on the imperfection in policy implementation.

Figure 7: As we can see, within both states, the topical content being discussed is relatively the same. Throughout the frame of the time series, including April, we saw that the trend in coronavirus cases had a steadily increasing number of new cases per day, or a positive second derivative. This indicates that the similarity in thinking over time compounded, possibly resulting in the eventual seemingly exponential growth in the spread of COVID-19 that we witness.

  Time period         With Sentiment        Without Sentiment
  for Prediction      RMSE      adj R²      RMSE      adj R²
  14 Days             9.54      0.84        11.73     0.76
  7 Days              7.85      0.68        7.85      0.68
  3 Days              6.46      0.63        6.51      0.63

Table 1: RMSE and adjusted R² regression results with and without sentiment for the State of Kerala, model trained on values from April 16th to May 14th. All the scores are significant with a one-tailed t-test at p-value 0.1.
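
The comparison reported in Tables 1 and 2 (one linear model on case features only, one that also receives daily sentiment) can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: the series are synthetic, the helper names are hypothetical, the fit is evaluated in-sample rather than over the 3-, 7-, and 14-day horizons used in the paper, and adjusted R² is computed with the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1), which the text does not spell out.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(features, targets):
    """Least-squares coefficients [intercept, b1, ..., bp] via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, features):
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], f)) for f in features]


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def adjusted_r2(actual, predicted, n_predictors):
    n = len(actual)
    mean = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Synthetic, made-up series (not the paper's data).
cum_cases = [10, 25, 45, 70, 100, 135, 175, 220, 270, 325]
cum_recovered = [2, 6, 13, 22, 35, 50, 70, 92, 120, 150]
sentiment = [0.2, -0.4, 0.1, -0.6, 0.3, -0.2, 0.5, -0.7, 0.0, -0.3]
# Toy target constructed so that sentiment carries real signal.
new_cases = [5 + 0.2 * c + 0.1 * r - 10 * s
             for c, r, s in zip(cum_cases, cum_recovered, sentiment)]

base_features = list(zip(cum_cases, cum_recovered))
full_features = list(zip(cum_cases, cum_recovered, sentiment))

results = {}
for name, feats in (("without sentiment", base_features),
                    ("with sentiment", full_features)):
    coefs = fit_ols(feats, new_cases)
    fitted = predict(coefs, feats)
    results[name] = (rmse(new_cases, fitted),
                     adjusted_r2(new_cases, fitted, len(feats[0])))
```

Because the toy target is built from sentiment, the model with sentiment necessarily fits better here; on real data the size of the gain is an empirical question, which is what the tables report. In practice a library routine would replace the hand-written solver.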
  13
       https://github.com/flairNLP/flair                              In broad terms, the method presented is the first visual-
                      With Sentiment      Without Sentiment           nitely. Need to work out a way.#RahulShowsTheWay” —–
  Time period
                                                                      Spain’s Civil Guard dedicated time to compiling a report
  for Prediction     RMSE      adjR2      RMSE       adjR2
                                                                      and evaluating possible scenarios of growing social unrest
  14 Days            286.16    0.95       310.60     0.88             in conjunction with law enforcement agencies, coming up
  7 Days             235.38    0.96       245.27     0.93             with different responses to rising crime rates or civil un-
  3 Days             232.09    0.97       238.57     0.92             rest. The report specifically noted that the Spanish popu-
                                                                      lation has accepted the lockdown, “which started out as
Table 2: RMSE and adjR2 Regression Results with and                   one of the strictest in Europe” 15 .
without Sentiment for the State of Mumbai, model trained           4. Cancelled Events tweets (Mumbai): “#MAMI Mumbai
on values from April 16th to May 14th. All the scores are             Film Festival 2020 cancelled. Second major event in
significant with one-tailed t-test at p-value 0.1                     Mumbai to be cancelled this year after Lalbaugcha Raja
                                                                      Ganeshotsav. Cannot imagine the loss of revenues.” —–
ization of the data to identify the features of interest, elicit      A number of events, such as Easter Sunday, were can-
time-frames of events upon which to focus analysis, and ex-           celled in Spain16 . Further, a selective set of international
plain the pattern in epidemiological data with social network         events were allowed with limited capacity and stringent
sentiment analysis. For our comparison of the effectiveness           laws (e.g. Live Music) 17 .
of policies in Spain and India, we were able to identify a
                                                                      This is where real-time NLP analysis plays an instrumen-
critical time-frame across multiple state/province pairs that
                                                                   tal role. Identifying topical categories and sentiments asso-
proved to be a divergence point in the spread of the virus
                                                                   ciated with them through social network analyses like Twit-
where Spain appeared to be succeeding in containing the
                                                                   ter provides an avenue to quantitatively and qualitatively
virus. In contrast, India seemed to be experiencing exponen-
                                                                   evaluate and rank responses to different policies. For quan-
tial growth. Looking at the timelines of government lock-
                                                                   titative assessment, we considered intuitive model perfor-
downs: After the 10th case, India took action on Day 21 and
                                                                   mance metrics, such as RMSE and adjR2 . Qualitative in-
Spain on Day 16. After the 1st death, India took action on
                                                                   spection was performed by mapping the people’s response
Day 13 and Spain on Day 29. Finally, after the 100th case,
                                                                   to sub-events in SARS’s causality network. We project the
India took action on Day 13 and Spain on Day 10.
                                                                   identified causally triggered sub-events onto a concept cloud
   We see that arguably, the nations took action on a similar
                                                                   and analyze over two critical months post-initiation policies.
timescale concerning the beginning of the spread. We posit,
                                                                   Even though a linear model is already interpretable in terms
therefore, that the differences in responses to policies can be
                                                                   of weights, this type of explainability is of paramount impor-
found in crowd ideology via Twitter. Looking at a few of the
                                                                   tance to understand and trust the model predictions in such
previously identified key phrases, we can see some examples
                                                                   a high stakes application. This can give governments insight
of selected tweets that display concepts previously identified
                                                                   into whether they must make policies stricter, add more poli-
in the concept clouds, along with a timely response from au-
                                                                   cies, or enforce policies differently than they are at the mo-
thorities in Spain:
                                                                   ment. Real-time analysis of the social network and virus data
1. Tourism tweet (Kerala): “One of the largest sectors of #In-     can significantly change the course of health events and are
   dianeconomy, #Tourism, lies in tatters due to the #Coro-        a promising yet relatively unexplored tool for governments
   naPandemic and the #lockdown”—– Spain chose to han-             and policymakers to use.
   dle tourism by closing its border to outsiders, as of April,
   only allowing diplomats, traveling for emergencies, or                                 Future Work
   residents of the European Union, and assorted smaller           We have presented in this work a case study with two (State,
   states14 .                                                      Region) pairs, specifically (Mumbai, Barcelona) and (Ker-
2. Medical Care tweet (Mumbai): “When the richest coun-            ala, Madrid). We posit that this work can be extended to
   try has zero public health care in place and they need          other (State, County) pairs. Considering one pair such as
   to hire in the middle of a pandemic” —– Spain used a            Andhra Pradesh and the Canary Islands (see Figure 8) —
   royal decree to declare a 15-day national emergency back        both of which are known to have strong healthcare systems
   on March 15th (Legido-Quigley et al. 2020). It dedicated        relative to the rest of their countries — we can plot the time
   significant investments to its healthcare system, quoted “It    series visualization and analyze the divergence point.
   had allocated C2.8 billion to all regions for health services      It’s important to note that there other uncontrolled vari-
   and created a new fund with C1 billion for priority health      ables that make it hard to draw affirmative causal conclu-
   interventions.”                                                 sions, and this is an important aspect we hope to consider in
3. Social Instability tweets (Kerala and Mumbai): (a) “If you         15
                                                                         https://english.elpais.com/society/2020-05-15/spains-civil-
   get into a cyclical lockdown it will be devastating for         guard-warns-about-risk-of-social-unrest-due-to-covid-19-
   economic activity because that would destroy trust.”(b)         crisis.html
                                                                      16
   “People will lose trust if the lockdown continues indefi-             https://gulfnews.com/world/europe/easter-sunday-events-in-
                                                                   spain-cancelled-communities-make-masks-amid-virus-outbreak-
   14                                                              1.1586627331285
    https://www.euronews.com/2020/05/23/spain-will-
                                                                      17
open-borders-to-foreign-tourists-in-july-in-phasing-out-of-              https://www.nme.com/news/music/spain-to-phase-in-live-
coronavirus-restrict                                               music-events-in-may-as-part-of-lockdown-exit-plan-2656841
future work. The results from this preliminary work could be used to explain
epidemiological models, specifically, the Exo-SIR (Exogenous - Susceptible,
Infected, Recovered) model. Exo-SIR is built to model the disease’s spread
while taking into account exogenous factors (e.g., gathering, compliance with
public policy). Since our study identified concepts such as social instability,
mistrust, and poor medical care as responses of the population to the instated
policies, these could be considered potential exogenous factors influencing SIR
models. Our future research may entail including government policies themselves
as the exogenous impact on an SIR population, and more accurately identifying
and explaining the spread of a disease in a community by considering citizen
response to policies.

Figure 8: Daily New Cases of COVID-19 in Andhra Pradesh (not scaled) and the
Canary Islands plotted against time from March 15th to June 1st, with an
identified Divergence Point where the two curves intersect.

All the code and datasets for this study are available for the reproducibility
of our results here.

                    Acknowledgements
We would like to acknowledge Dr. Victor Vicente Palacios for his support in the
Spain data collection and its interpretation. We would also like to acknowledge
Mr. Nirmal Sivaraman and Dr. Sakthi Balan of LNMIIT-Jaipur for their
brainstorming and input into the direction of this research. We acknowledge
partial support from the National Science Foundation (NSF) award 1761880:
“Spokes: MEDIUM: MIDWEST: Collaborative: Community-Driven Data Engineering for
Substance Abuse Prevention in the Rural Midwest”. Any opinions, conclusions or
recommendations expressed in this material are those of the authors and do not
necessarily reflect the views of the NSF.

                          References
[Cowling et al. 2020] Cowling, B. J.; Ali, S. T.; Ng, T. W.; Tsang, T. K.;
 Li, J. C.; Fong, M. W.; Liao, Q.; Kwan, M. Y.; Lee, S. L.; Chiu, S. S.; et al.
 2020. Impact assessment of non-pharmaceutical interventions against
 coronavirus disease 2019 and influenza in Hong Kong: an observational study.
 The Lancet Public Health.
[Faber 1999] Faber, N. K. M. 1999. Estimating the uncertainty in estimates of
 root mean square error of prediction: application to determining the size of
 an adequate test set in multivariate calibration. Chemometrics and
 Intelligent Laboratory Systems.
[Helbing, Ammoser, and Kühnert 2006] Helbing, D.; Ammoser, H.; and Kühnert, C.
 2006. Disasters as extreme events and the importance of network interactions
 for disaster response management. In Extreme events in nature and society.
[Hellewell et al. 2020] Hellewell, J.; Abbott, S.; Gimma, A.; Bosse, N. I.;
 Jarvis, C. I.; Russell, T. W.; Munday, J. D.; Kucharski, A. J.;
 Edmunds, W. J.; Sun, F.; et al. 2020. Feasibility of controlling COVID-19
 outbreaks by isolation of cases and contacts. The Lancet Global Health.
[Kalteh and Rajabi 2020] Kalteh, E. A., and Rajabi, A. 2020. COVID-19 and
 digital epidemiology. Z Gesundh Wiss.
[Legido-Quigley et al. 2020] Legido-Quigley, H.; Mateos-García, J. T.;
 Campos, V. R.; Gea-Sánchez, M.; Muntaner, C.; and McKee, M. 2020. The
 resilience of the Spanish health system against the COVID-19 pandemic. The
 Lancet Public Health.
[Pavlicek, Rehak, and Kral 2020] Pavlicek, T.; Rehak, P.; and Kral, P. 2020.
 Oscillatory dynamics in infectivity and death rates of COVID-19. medRxiv.
[Prem et al. 2020] Prem, K.; Liu, Y.; Russell, T. W.; Kucharski, A. J.;
 Eggo, R. M.; Davies, N.; Flasche, S.; Clifford, S.; Pearson, C. A.;
 Munday, J. D.; et al. 2020. The effect of control strategies to reduce social
 mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling
 study. The Lancet Public Health.
[Shayak, Sharma, and Gaur 2020] Shayak, B.; Sharma, M. M.; and Gaur, M. 2020.
 A new delay differential equation model for COVID-19.
[Sivaraman et al. 2020] Sivaraman, N. K.; Gaur, M.; Baijal, S.; Rupesh, C. V.;
 Muthiah, S. B.; and Sheth, A. 2020. Exo-SIR: An epidemiological model to
 analyze the impact of exogenous infection of COVID-19 in India. arXiv
 preprint arXiv:2008.06335.
[Speer and Lowry-Duda 2017] Speer, R., and Lowry-Duda, J. 2017. ConceptNet at
 SemEval-2017 Task 2: Extending word embeddings with multilingual relational
 knowledge. arXiv preprint arXiv:1704.03560.
[van Holm et al. 2020] van Holm, E.; Monaghan, J.; Shahar, D. C.; Messina, J.;
 and Surprenant, C. 2020. The impact of political ideology on concern and
 behavior during COVID-19. Available at SSRN 3573224.
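
Supplementary sketch (regression metrics). The quantitative assessment in this
paper reports RMSE and adjR² for a linear model fitted with and without a
sentiment feature (Table 2). The following is a minimal sketch of how such a
comparison could be computed; it is not the code used in this study. The data
are synthetic, the 29-point window merely mirrors the length of the April 16th
to May 14th training period, and all function and variable names are
hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply fitted coefficients (intercept first) to a feature matrix."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def adjusted_r2(y_true, y_pred, n_features):
    """R^2 penalized for the number of predictors (intercept excluded)."""
    n = len(y_true)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return float(1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1))

# Synthetic stand-in data (assumption): 29 daily observations.
rng = np.random.default_rng(0)
days = np.arange(29, dtype=float)
sentiment = rng.normal(size=29)  # made-up daily sentiment score
cases = 50.0 * days + 120.0 * sentiment + rng.normal(scale=20.0, size=29)

X_without = days.reshape(-1, 1)               # time only
X_with = np.column_stack([days, sentiment])   # time plus sentiment

for label, X in [("without sentiment", X_without), ("with sentiment", X_with)]:
    pred = predict(fit_linear(X, cases), X)
    print(label, "RMSE:", round(rmse(cases, pred), 2),
          "adjR2:", round(adjusted_r2(cases, pred, X.shape[1]), 3))
```

Because adjR² penalizes each added predictor, it is a fairer basis than plain
R² for comparing the model with sentiment against the one without it.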
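
Supplementary sketch (divergence point). Figure 8 describes the divergence
point as the place where the two daily-new-case curves intersect. One simple
way to locate such a point, assuming two equally long daily series, is to find
the first day on which their difference changes sign. The sketch below uses
made-up numbers and a hypothetical helper name; it does not reproduce the
procedure or data behind the figures in this paper.

```python
import numpy as np

def first_crossing(series_a, series_b):
    """Return the first index at which the two series cross, or None.

    A crossing at index i means the difference (a - b) has opposite signs
    on days i-1 and i, or reaches exactly zero on day i.
    """
    diff = np.asarray(series_a, dtype=float) - np.asarray(series_b, dtype=float)
    for i in range(1, len(diff)):
        sign_flip = diff[i - 1] * diff[i] < 0
        touches = diff[i] == 0 and diff[i - 1] != 0
        if sign_flip or touches:
            return i
    return None

# Made-up daily new-case counts for two regions (illustrative only).
region_a = [2, 5, 9, 14, 22, 35]     # keeps growing
region_b = [10, 16, 19, 17, 12, 8]   # peaks, then declines

idx = first_crossing(region_a, region_b)
print("Curves cross at day index:", idx)
```

Real case counts are noisy, so in practice the series would likely be smoothed
(for example with a moving average) before looking for the crossing.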