Cross-Country Analysis on Connection Between Financial Lifestyle and Happiness
Anastasiia Dziuba¹, Uroš Sergaš¹
¹ University of Primorska, Koper, Slovenia


Abstract
This research investigates whether happiness depends on the lifestyle that inhabitants can afford from a
financial point of view. Furthermore, the paper examines in which countries and regions it is possible to
cover middle-class expenses with an average salary. In addition, correlations between other features and
happiness are examined. For specific correlations, linear models and clustering techniques are presented to
identify associations. The obtained results and the patterns behind them are then discussed, which provides
a clearer understanding of the research topic.

Keywords
happiness, cost of living, financial condition, monthly expenses


1. Introduction
The term “happiness” is becoming more widely used, and there are different assumptions about the
reasons behind being happy. For instance, Brooks presents several scientific claims regarding the
“ingredients of happiness”. Firstly, happiness is described as subjective well-being, which comprises
genes, circumstances, and habits. Habits, according to the author, comprise faith, family,
friends, and work. Brooks believes that work in this equation means that the individual can earn success
and serve others. However, he claims that “Happiness cannot buy satisfaction.” Since people can never
have enough money, their wants rise with their paycheck, and the picture of a perfect life is adjusted very
quickly [1].
   Kahneman argued in a 2010 study that income influences life satisfaction, but not happiness
[2]. However, in 2023, Killingsworth, Kahneman, and Mellers, in their article “Income and emotional
well-being: A conflict resolved”, investigated the relationship between level of income and happiness.
For this purpose, a survey was conducted among employed adults in the United
States whose yearly income exceeded $10,000. Using quantile regression to analyze trends in lower
and higher income ranges (below and above $100,000, respectively), the authors concluded that higher
income positively influences respondents’ happiness scores. In addition, the authors mentioned that the
bottom of the happiness distribution rises more rapidly than the top in that range of income. In particular,
individuals with an annual income below $100,000 experience a 15% faster rise in happiness compared
to those earning more [3].
   Nevertheless, the role of money in society is rising considerably. There is now a noticeable
trend of “buying happiness”. This century can be described as one of consumerism, which leaves an
imprint on people’s understanding of happiness. Social media influences it as well: we compare
our lifestyle with that of others, starting from an early age. In the past, people did not have as many types of
entertainment and could value small things. One of the reasons behind this could be that we are in an
“endless pursuit of wanting more” [4].
   This research aims to observe the relationship between the financial well-being of residents in
different countries and their perceived happiness. To achieve this objective, various statistical methods
are employed, including heat-map visualisations, correlation analysis, clustering, t-tests, and linear and
logistic regressions. Various indexes for evaluating quality of life can now be found on the internet, and
these happiness scores are based on different features. In this research, we consider the index provided
by the World Happiness Report [5].

HCI SI 2023: Human-Computer Interaction Slovenia 2023, January 26, 2024, Maribor, Slovenia
89232017@student.upr.si (A. Dziuba); uros.sergas@upr.si (U. Sergaš)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073


2. Methodology and gathering data
To study inhabitants’ happiness scores and their financial condition, exploratory data analysis was
applied to quantitative data. The data for this research was obtained
from several sources, since a dataset containing all features required for the desired analysis
was not publicly available. Therefore, WHO [6], Kaggle [7], and Gapminder [8] were used. Kaggle is a site
meant for machine learning practitioners and data scientists; Gapminder is a site displaying time series
of development statistics for all countries; WHO stands for World Health Organization, a specialized
agency of the United Nations responsible for international public health.
   For this research, we used a region classification that includes Australia and Oceania,
North America, South America, Central America and Caribbean, Europe, East and Southeast Asia,
Central Asia, South Asia, Middle East, and Africa.
   To avoid bias when combining data from different sources, only values present in all sources were
used for observation. More details regarding data cleaning are provided below for the specific datasets.
   Every dataset used was available with a .csv extension, was published to the public domain, and is
free to use.
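
The restriction to values common to all sources can be pictured as an inner join on the country name. The following is a minimal sketch of that idea, assuming each source has already been loaded into a dictionary keyed by country. The function name `merge_common` and all numbers are illustrative placeholders, not the study's data or code.

```python
# Hypothetical per-source tables keyed by country name; all values are illustrative only.
happiness = {"Finland": 7.8, "Slovenia": 6.7, "Japan": 6.1}
life_expectancy = {"Finland": 82.0, "Slovenia": 81.0, "Kenya": 66.0}
savings = {"Finland": 1500.0, "Slovenia": 600.0, "Japan": 900.0, "Kenya": -50.0}


def merge_common(**sources):
    """Keep only countries that appear in every source (an inner join on country)."""
    common = set.intersection(*(set(table) for table in sources.values()))
    return {
        country: {name: table[country] for name, table in sources.items()}
        for country in sorted(common)
    }


merged = merge_common(happiness=happiness, life=life_expectancy, savings=savings)
for country, row in merged.items():
    print(country, row)
```

Countries missing from any one source (here Japan and Kenya) drop out of the merged table, which is why the final number of countries is smaller than in any single dataset.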

2.1. Cost of living dataset
The dataset was provided on Kaggle [7] by Miguel Piedade, a member and contributor of the platform.
Miguel Piedade gathered the data from the Numbeo website [9] by web scraping. Numbeo is a worldwide
cost of living database, where the data is updated quarterly. The dataset published on Kaggle was based
on prices in the 3rd quarter of 2022. The dataset comprised 4874 cities from different countries; due to
data quality, this list was reduced to 760 cities. However, we used only capital cities for observation,
which resulted in 133 different countries. The data provided on Kaggle comprised 54 price metrics.
In this research, we used the following metrics as the average required expenses for living: internet, fitness
club (monthly fee), monthly pass (local transport), basic utilities (electricity, heating, cooling, water, garbage),
550 min of prepaid local mobile tariff, 4 meals in an inexpensive restaurant, 8 bottles of water, 24 cups
of coffee, and 2 pieces of clothing. In addition, the expenses for food were based on the minimum basket
list for consuming 2400 calories daily on the Western diet type, which can be found in [10].
Initially, we included rent expenses as well; however, this led to biased results. Since many people do
not live alone, it is difficult to evaluate the rent price for individuals. We should also consider that part
of the population lives in their own property and therefore has no monthly rent expenses. Overall,
we tried to follow the 50/30/20 rule for dividing spending, where 50% of the salary goes to needs,
30% to wants, and 20% is supposed to be saved [11].
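
The basket described above can be turned into a monthly expense figure by multiplying each quantity by its unit price and adding the food basket. The sketch below illustrates that arithmetic. The quantities follow the text, while the item names, every price, the salary, and the treatment of the 550-minute mobile tariff as a single package are assumptions made only for illustration.

```python
# Quantities follow the basket listed in the text; item names are hypothetical labels.
BASKET = {
    "internet": 1,
    "fitness_club": 1,
    "transport_pass": 1,
    "utilities_basic": 1,
    "mobile_tariff_550min": 1,  # assumed to be priced as one package
    "restaurant_meal": 4,
    "water_bottle": 8,
    "coffee": 24,
    "clothing_item": 2,
}


def monthly_expenses(prices, food_basket):
    """Sum quantity * unit price over the basket and add the monthly food basket cost."""
    return sum(qty * prices[item] for item, qty in BASKET.items()) + food_basket


def monthly_savings(net_salary, prices, food_basket):
    """Net salary minus basket expenses; a negative value means the basket is unaffordable."""
    return net_salary - monthly_expenses(prices, food_basket)


# Made-up prices in US dollars for one hypothetical capital city.
example_prices = {
    "internet": 30,
    "fitness_club": 40,
    "transport_pass": 50,
    "utilities_basic": 150,
    "mobile_tariff_550min": 20,
    "restaurant_meal": 12,
    "water_bottle": 1,
    "coffee": 3,
    "clothing_item": 40,
}

expenses = monthly_expenses(example_prices, food_basket=300)
balance = monthly_savings(1500, example_prices, food_basket=300)
print(expenses, balance)
```

A negative balance in this sketch corresponds to the red shades in the heatmaps, and a positive one to the blue shades.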

2.2. Happiness score dataset
The dataset was published as an appendix to the figures of the World Happiness Report 2023 [5], which is
a publication of the Sustainable Development Solutions Network. The data for the report is powered
by the Gallup World Poll, run by a global analytics and advice company [8]. The company collects data using
telephone surveys in the majority of cases; however, in some parts of the world (Latin America, Asia,
Africa, etc.) face-to-face interviews are conducted in randomly selected households. The happiness rankings are based
on the respondents’ assessment of their lives, in particular a single-item Cantril ladder life-evaluation
question (from 0 to 10, where 10 is the highest satisfaction with life). Although the data was
published in the report for 2023, the survey was conducted in 2022.

2.3. Other datasets
To see whether other features impact the life evaluation of people worldwide, we used the following datasets:
Life Expectancy, Human Development Index, Coverage Index for essential health services (UHC index),
number of suicides, which was further adjusted to a suicide rate (number of suicides per 100,000 people), child
mortality rate, GDP per capita, and net migration. Most of the data was released in 2021; however, the
datasets for child mortality rate and number of suicides were last updated in 2019. We obtained the
datasets for life expectancy, UHC index, and child mortality rate from the World Health Organization
[6]. The net migration rate was gathered from the Central Intelligence Agency [12]. GDP per capita was included
in the dataset published as an annex to the World Happiness Report [5]. The HDI (Human Development Index)
was published in the Human Development Reports [13].
   The final set of features is provided in Table 1.


Table 1: Final dataset with feature abbreviations and descriptions used for analysis

Column name   Measurement   Variable type   Description
country       categorical   nominal         Name of the country
region        categorical   nominal         Region to which the country belongs
savings       numerical     ratio           Average salary after deducting minimal expenses and taxes (in US dollars)
happiness     categorical   ordinal         Metric measured in 2022 by asking participants about their life satisfaction (scale from 0 to 10)
life          numerical     ratio           Life expectancy, estimate of average age of participants
hdi           categorical   ordinal         Human Development Index, summary measure of average achievement in key areas of human development
hcov          categorical   ordinal         Coverage index for essential health services
suicide       categorical   ordinal         Suicide rate, number of suicides per 100,000 people
child         categorical   ordinal         Child mortality rate, probability that a child will die in the period between birth and 5 years
gdp           numerical     ratio           GDP per capita (in US dollars)
mig           numerical     ratio           Net migration, difference between immigration and emigration in the particular country


3. Statistical analysis and its results
3.1. Heatmaps
Firstly, let us observe the cost of living around the world. The heatmaps clearly show
in which countries inhabitants are left with a negative balance after average monthly expenses (from
red to light red) and where people have the opportunity to save some money (shades of blue). Since
it can be difficult to pick out a specific country on the world map, plots of all world parts are provided
as well (Figure 1).
Figure 1: Account balance at the end of the month, US Dollars. Panels: (a) Europe, (b) Africa, (c) Asia, (d) Latin America, (e) North America, (f) Oceania, (g) World outlook.


3.2. Regions’ outlook
Figure 2 shows the distribution of savings grouped by the region classification.
Different patterns can be seen across world regions in the shapes and positions of the boxplots.
The boxplot for Africa is comparatively short, which suggests that country values there have low dispersion
and are approximately normally distributed. On average, people there cannot afford all expenses. Australia and Oceania
show an approximately normal distribution and low dispersion as well; however, the average there is a surplus of $3000.
The boxplot for East and Southeast Asia suggests a positively skewed distribution of values, with
higher dispersion in the upper quartile. Nevertheless, on average, people in this region are left with
almost nothing at the end of the month (median ≈ 0). In the European region, the
values are also highly dispersed and a right-skewed distribution can be observed. A likely
reason is that the classification used in this research does not subdivide Europe, so quite different
parts of the region are grouped together. There is less dispersion in the 1st and 2nd
quartiles of the boxplot, and although the maximum values go beyond $4000 (the 4th quartile
shows high dispersion of values), the median is around $750. Middle Eastern countries share a
similar distribution pattern with European ones, although the account balance there varies from negative
to $1500. We can notice high dispersion in the 3rd quartile and a significantly lower one in the 2nd
quartile. Despite a considerably high maximum, on average people in Middle Eastern countries do not
have money left for savings at the end of the month. The distribution in the North American region tends
to be normal, and people there are left with about $2000 in their bank accounts on average. South Asian and South
American countries share similar patterns; however, we can notice a negative outlier in South Asia.
People in both regions do not have enough money to afford the average standard of living. Unfortunately, the
outlook for the Central Asian region is quite biased, since only 2 countries are present in this group due to
low data quality.
   Overall, we can observe that the financial situation differs considerably between world regions, which
demonstrates inequality between countries with developing and developed economies, as classified by the United
Nations M49 standard [14]. Most of the developed economies are represented by the countries in
Europe, North America, and Oceania.
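
The boxplot readings above rest on quartiles, the interquartile range, and the position of the median between the quartiles. A minimal sketch of how these quantities can be computed for one region follows. The helper names and the sample values are hypothetical, and the skewness hint is only a rough rule of thumb based on the box, not a formal test.

```python
from statistics import quantiles


def box_summary(values):
    """Five-number summary plus interquartile range, as drawn in a boxplot."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(values),
        "iqr": q3 - q1,
    }


def skew_hint(summary):
    """Compare the upper and lower halves of the box to hint at the direction of skew."""
    upper = summary["q3"] - summary["median"]
    lower = summary["median"] - summary["q1"]
    if upper > lower:
        return "right-skewed"
    if upper < lower:
        return "left-skewed"
    return "roughly symmetric"


# Made-up monthly balances (US dollars) for the countries of one hypothetical region.
region_savings = [-100, 0, 100, 500, 2000]
summary = box_summary(region_savings)
print(summary, skew_hint(summary))
```

In this example the median sits much closer to the lower quartile than to the upper one, which is the pattern described for the right-skewed regions.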


Figure 2: Money left at the end of the month by region, US Dollars


3.3. Correlation
The correlation plot allows us to quantify the direction and strength of the relationships between the
variables included in our research. Figure 3 gives an overview of the correlations in our dataset. We
can notice a high positive correlation between happiness ranking and health coverage,
human development index, life expectancy, GDP, and the financial condition of people. In addition, there
is a high negative correlation between happiness score and child mortality. However, from the plot, we
assume that there is a low negative correlation between the suicide rate and the other presented variables.
   The relationships between financial condition and the other variables show a similar picture
to those of the happiness score. A difference is noticeable for life expectancy, whose
correlation with financial condition is only moderately positive. Finally, we observe a low negative correlation between financial
condition and child mortality. It is important to mention that the correlation between inhabitants’ bank
account balances and happiness scores is 0.8 in European countries, which is higher than in the world in general.
This suggests that even though people in most European countries are left with a positive balance at the
end of the month, they feel happier with higher income. However, the correlation between GDP and
happiness rating stays at the same level as in other parts of the world.
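
Each cell of such a correlation plot is a correlation coefficient between two columns. The sketch below computes a Pearson coefficient from its definition, assuming that Pearson correlation is the measure behind the plot (the text does not name the coefficient). The two short lists are made-up values, not the study's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical savings (US dollars) and happiness scores for five countries.
savings = [-200, 0, 400, 900, 1500]
happiness = [4.5, 5.2, 5.9, 6.4, 7.1]
r = pearson_r(savings, happiness)
print(round(r, 2))
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 a weak one, which is how the terms "high" and "low" correlation are used in this section.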
   An independent t-test was performed to compare the adjusted means of people’s savings in countries
with developing and developed economies [14]. There was a significant difference between the means of
the two groups (t(54) = 5.076, p < 0.001). The 95% confidence interval (CI) of the mean difference
ranged from $490 to $1130. In addition, we observed a significant difference
between happiness scores in the examined groups (t(94) = 8.25, p < 0.001).
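
For readers unfamiliar with the test, the sketch below computes a two-sample t statistic and its degrees of freedom. It uses Welch's unequal-variance form, which is an assumption: the text does not state which variant of the independent t-test was applied. The two groups are made-up numbers, and the p-value is omitted because it requires the t-distribution, which is not part of the Python standard library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom for two samples."""
    var_a = variance(group_a) / len(group_a)  # squared standard error of mean A
    var_b = variance(group_b) / len(group_b)  # squared standard error of mean B
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(var_a + var_b)
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(group_a) - 1) + var_b ** 2 / (len(group_b) - 1)
    )
    return t_stat, dof


# Hypothetical monthly savings (US dollars) in two groups of countries.
developed = [1200, 1500, 900, 2000, 1700]
developing = [100, -50, 300, 200, 0]
t_stat, dof = welch_t(developed, developing)
print(round(t_stat, 2), round(dof, 1))
```

A large positive statistic, as in this toy example, means that the first group's mean is well above the second's relative to the sampling noise.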
   Overall, we assume that inhabitants of happier countries tend to live longer, are provided with better
health services, and can afford more in financial terms.


Figure 3: Correlation of used features


3.4. Clustering
This subsection aims to find similar patterns in the relationship between happiness scores and the
financial condition of people in different world regions. For this purpose, a k-means clustering algorithm
was used, which computes the distance between each point in the dataset and the cluster centroids and
assigns the point to the cluster with the nearest centroid. Besides the world region clustering (Figure 4a), a clustering
of European countries is presented as well (Figure 4b).
   For region clustering, the optimal number of clusters according to all indices except GAP,
Gamma, Gplus, and Tau was 3. We can observe that two regions, Australia and Oceania
and North America, have the highest values of happiness score and potential savings. In the
middle group, we find countries from the Europe, East and Southeast Asia, and Middle East regions. It
is worth mentioning that the list of happiest countries is led by European countries; however,
when the financial condition of people living there is also considered, North America and the Pacific
countries are leading.
   The last group is represented by South and Central Asia, Africa, South America, Central America, and
the Caribbean. The last three regions mentioned are almost on the same level as East and Southeast
Asia in happiness ranking; however, they show low results in financial condition.
   Let us also look at the clustering plot of European countries (Figure 4b). Here we can observe 5 clusters;
however, one of them is represented only by Switzerland. Finland, which is considered the happiest
country in the world [5], can be found in the third cluster, where inhabitants have lower
savings than in the countries of the first cluster (Luxembourg, United Kingdom, Netherlands, Denmark,
Norway, and Iceland).
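
The assignment and update steps of k-means described in this subsection can be sketched in a few lines. The version below is a plain implementation of Lloyd's algorithm with fixed starting centroids so that the result is reproducible. The points are made-up, already standardized (happiness, savings) pairs; standardizing the two features before clustering is an assumption of this sketch, not something the text states.

```python
from math import dist


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: assign points to the nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return labels, centroids


# Hypothetical standardized (happiness, savings) values for six regions.
points = [(-1.2, -1.0), (-1.0, -1.1), (0.0, 0.1), (0.1, -0.1), (1.2, 1.3), (1.1, 1.0)]
labels, centers = kmeans(points, centroids=[points[0], points[2], points[4]])
print(labels)
```

In practice the starting centroids are usually chosen at random and the algorithm is run several times; the fixed start here only keeps the example deterministic.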


Figure 4: K-Means clustering based on happiness rate and savings. Panels: (a) World regions, (b) European countries.


3.5. Linear regression model
The following linear regression model aims to predict happiness scores based on the
amount of money people are left with at the end of the month (Figure 5). Looking at the summary more closely,
we can notice that the model predicts roughly evenly at both the high and low ends of our dataset, since the
median of the residuals is 0.2. The coefficients allow us to construct the equation of the model: y = 0.0006680 * x +
5.365, where x is the money left at the end of the month and y is the predicted happiness score. As a baseline,
if the money at the end of the month equals 0, the happiness score would be 5.365.
Then, for each $1000 of saved money, the happiness score of the country would increase by 0.67 points.
In our case, the standard error (0.00007128) is much smaller than the coefficient, which indicates that the
coefficient is unlikely to be 0. Specifically, the savings coefficient is 9.37 standard errors away
from zero, which is far enough to conclude that the coefficient is not 0. The p-values are much smaller than
0.05 (<2e-16 and 3.34e-15 in our case), which means that our coefficient is significant for the model.
According to the residual standard error, actual happiness scores are on average 0.83 points away from the predicted
ones, which is quite high considering that the highest actual score is 7.8. Savings
at the end of the month explain 47.77% of the variation in happiness score. Looking at the F-statistic
and its p-value, we can conclude that the null hypothesis is rejected and there is a relationship between
happiness score and the amount of money people are left with at the end of the month after average expenses.
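
The arithmetic behind these statements can be reproduced directly from the reported coefficients. The sketch below plugs the intercept, slope, and standard error quoted above into the model equation; the function and constant names are illustrative, and no data from the study is refitted here.

```python
INTERCEPT = 5.365        # reported intercept of the model
SLOPE = 0.0006680        # reported savings coefficient (per US dollar)
SLOPE_SE = 0.00007128    # reported standard error of the savings coefficient


def predict_happiness(savings_usd):
    """Predicted happiness score for a given end-of-month balance in US dollars."""
    return INTERCEPT + SLOPE * savings_usd


# Number of standard errors between the coefficient and zero (the t value).
t_value = SLOPE / SLOPE_SE

print(round(predict_happiness(0), 3))     # baseline score at zero savings
print(round(predict_happiness(1000), 3))  # score after 1000 dollars of savings
print(round(t_value, 2))
```

The difference between the two predictions is the 0.67-point increase per $1000 mentioned in the text, and the ratio of coefficient to standard error gives the 9.37 figure.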
   Now we can look at the provided graphs (Figure 5). The Normal Q-Q plot suggests that
the distribution is negatively skewed, which means that the model possibly overestimates the happiness
score for countries with smaller income and underestimates it for countries with higher income. The
Scale-Location plot shows that homoscedasticity is likely to be satisfied for our model. To check that
homoscedasticity is met, we performed the Breusch-Pagan test; in our case the p-value is 0.175, which
is higher than 0.05. This indicates that we do not have sufficient evidence to claim that
heteroscedasticity is present. Based on the Residuals vs Leverage plot, we notice that there are no
influential points in our model (all points lie well within the Cook’s distance lines). Therefore, we do not
need to remove any points, since the model fits the data.
   To conclude, taking into consideration all results provided by the linear regression model, there is a
relationship between the happiness score of a specific country and the amount of money people are left with
at the end of the month.


Figure 5: Linear regression model


3.6. Logistic regression model
To better understand the relationship between happiness rate and citizens’ financial condition around
the world, binary logistic regression was applied. Here we aim to predict whether people can cover minimal
expenses during the month based on the happiness rate. For this purpose, a threshold of 0 was used
for “Savings”. The dataset was randomly divided into a training set and a test set in proportions of 0.7 and 0.3,
respectively.
   Based on the obtained results, we claim that a one-unit change in happiness rate increases the log odds
of people living in country X having a positive account balance after necessary expenses by 1.76. The
p-value is smaller than the significance level, which indicates that the happiness rate is statistically
significant.
   In addition, binomial logistic regression was used in the opposite direction, to predict from citizens’
financial condition whether a country’s happiness rate is above average. For this purpose, we used the happiness score mean as
a threshold, which in our case was 5.76. In our original dataset, 43% of sampled countries belong to the
group with a happiness rate lower than the mean. The modeling dataset was again randomly divided
into a training set and a set used for testing the produced model. Different splitting proportions
were considered to achieve accurate results. The results of one split are described here, namely 70% of the
original dataset for training and 30% for testing. According to the obtained p-value, we claim
that the savings variable is statistically significant. In our case, for every one-dollar increase in savings,
the log odds of a higher happiness rate increase by 0.004. Considering the intercept of the
model, we estimate that a country where people are left with no money at the end of
the month has a probability of approximately 49% of achieving a happiness rate higher than 5.76.
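
The step from log odds to a probability can be made concrete with the logistic function. In the sketch below the slope of 0.004 is the reported value, whereas the intercept is not reported in the text and is back-computed here from the stated probability of roughly 49% at zero savings; that back-computation and all names are assumptions of this illustration.

```python
from math import exp, log


def sigmoid(z):
    """Logistic function: maps log odds to a probability between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))


def logit(p):
    """Inverse of the logistic function: probability to log odds."""
    return log(p / (1.0 - p))


SLOPE = 0.004            # reported change in log odds per US dollar of savings
INTERCEPT = logit(0.49)  # assumption: derived from the reported ~49% at zero savings


def prob_above_mean(savings_usd):
    """Probability that a country's happiness rate exceeds the mean, given savings."""
    return sigmoid(INTERCEPT + SLOPE * savings_usd)


for amount in (-500, 0, 500):
    print(amount, round(prob_above_mean(amount), 2))
```

Because the slope acts on the log odds, equal steps in savings do not translate into equal steps in probability; the change is largest near a probability of 0.5.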
   The presented classification model was used to make predictions on our test data and was further
evaluated with a confusion matrix. According to the statistics of the confusion matrix, the accuracy of the
model predictions is 89.66%. The Kappa metric (0.79) suggests that the model performs better than random
chance. The proportion of correctly detected positive cases was 0.517, compared to a proportion of 0.55
positive cases in our dataset. The ROC curve is provided in Figure 6; the curve is close to the
upper left corner of the graph, which indicates the accuracy of our test. The area under the ROC curve
(AUC) is 0.95.
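
Accuracy and Cohen's kappa both follow from the four cells of a confusion matrix. The sketch below shows the calculation on one hypothetical matrix of 29 test cases whose cell counts were chosen here so that they reproduce the reported accuracy, kappa, and proportions; the authors' actual confusion matrix is not given in the text, so these counts are an assumption.

```python
def accuracy_and_kappa(tp, fp, fn, tn):
    """Accuracy and Cohen's kappa from the four cells of a binary confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # share of correct predictions
    # Agreement expected by chance, from the predicted and actual class totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical cell counts consistent with the reported statistics (not the authors' matrix).
tp, fp, fn, tn = 15, 2, 1, 11
acc, kappa = accuracy_and_kappa(tp, fp, fn, tn)
total = tp + fp + fn + tn
print(round(acc * 100, 2), round(kappa, 2))
print(round(tp / total, 3), round((tp + fn) / total, 2))  # detection rate, prevalence
```

Kappa discounts the agreement that would occur by chance, which is why it is lower than the raw accuracy.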
   Based on the described results, we assume that the model fits our data and that the financial condition of
people is related to the happiness level of the country.
   Other models were considered as well, namely the relationship between life expectancy and financial
condition, and the prediction of the account balance at the end of the month based on the happiness
rate. However, these models did not show accurate results and are therefore not presented in
detail.


Figure 6: ROC curve for logistic regression model


4. Discussion and Limitations
The statistical methods applied to the data provide a brief overview of the impact of financial
condition on happiness levels around the world. With other features, such as average life
expectancy and the human development index, we were able to observe how different development levels
are interrelated with happiness score and cost of living.
   One of the biggest limitations of this research is that the data provides a picture only of capital cities;
therefore, we cannot draw conclusions about the countries overall. Instead, the study can
be considered research on capital cities worldwide. We can suggest that in small cities and
villages the situation with expenses and income differs from that in the capital, although most companies vary
product prices between countries to ensure a circular flow of the economy, while prices within a
country remain fairly stable across cities.
   Another limitation is data quality for the African region: since some of the cost of living data
is absent, we cannot draw fair conclusions regarding this region. In addition,
the population could be divided into groups based on age and marital status, since expenses in
families are structured differently from those of people who live alone. In some families only one person
is employed, so in such cases income is divided between several people.
   Since we did not consider rent expenses in our study, it is difficult to evaluate whether the final results are
biased. Average property ownership varies from country to country, and in some places it is more affordable
to rent a property than to own one. We should also consider that not everyone lives in
an equal-sized house and that there are considerable differences between countries. Property taxation and
conditions for purchasing also depend on governmental processes.
   We can suggest that people with higher income buy more and that, therefore, demand is higher in such
countries. For instance, in some developing countries the individuals’ picture of prosperity would be
totally different from that in developed countries. Several decades ago, people did not have such a
variety of products and services available; consequently, their demand was lower. Therefore, further
research could examine the level of consumerism and happiness scores worldwide.


5. Conclusion
From the presented analysis we can see that there is a relationship between the financial condition of
inhabitants and the happiness rating in different countries of the world. However, we also noted the
limitations faced during the research.
   Despite the obtained results, it is quite difficult to say that happiness depends on the level of income.
Happiness is abstract and difficult to describe. There is a picture of happiness in society
in which an individual is free, has a lot of money, and can do whatever s/he wants. In real life, our picture
of happiness is based on the information we consume. Information shapes our wants, needs,
and therefore our expectations of life. Although information is freely available in various parts of the
world, news feeds differ, and the picture of prosperity, values, satisfaction, happiness, and
misery cannot be compared from country to country. Nevertheless, life satisfaction is impacted first
by the family environment and then by society.


References
 [1] A. C. Brooks, 2020, The 3 equations for a happy life, even during a pandemic, URL:
     https://www.theatlantic.com/family/archive/2020/04/how-increase-happiness-according-research/609619/.
 [2] D. Kahneman, A. Deaton, High income improves evaluation of life but not emotional well-being,
     Proceedings of the National Academy of Sciences of the United States of America (2010).
     doi:10.1073/pnas.1011492107.
 [3] M. A. Killingsworth, D. Kahneman, B. Mellers, Income and emotional well-being: A conflict
     resolved, Proceedings of the National Academy of Sciences of the United States of America (2023).
     doi:10.1073/pnas.2208661120.
 [4] C. Campbell, The Romantic Ethic and the Spirit of Modern Consumerism: New Extended Edition,
     Springer Verlag, 2018.
 [5] J. Helliwell, R. Layard, J. D. Sachs, J.-E. De Neve, L. B. Aknin, S. Wang, World Happiness Report
     2023 (2023). URL: https://worldhappiness.report/ed/2023/#appendices-and-data.
 [6] WHO, 2023, World Health Organization, URL: https://www.who.int.
 [7] Kaggle: Your Machine Learning and Data Science Community, 2022, URL: https://www.kaggle.com/.
 [8] Gapminder, 2023, URL: https://www.gapminder.org.
 [9] Numbeo, 2023, Cost of living, URL: https://www.numbeo.com/cost-of-living/.
[10] Numbeo, 2023, Food Prices, URL: https://www.numbeo.com/food-prices/.
[11] The 50/30/20 rule: how to budget your money more efficiently, 2023. URL:
     https://n26.com/en-eu/blog/50-30-20-rule.
[12] Central Intelligence Agency, 2023, URL: https://www.cia.gov.
[13] UNDP, 2023, Human Development Reports, URL:
     https://hdr.undp.org/data-center/human-development-index#/indicies/HDI.
[14] United Nations, 2022, Methodology: Standard country or area codes for statistical use (M49), URL:
     https://unstats.un.org/unsd/methodology/m49/.