<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Fourth Italian Workshop on Artificial Intelligence for an Ageing Society, November</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Artificial Intelligence approach to predict multidimensional poverty of older people from unlabelled data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lorenzo Olearo</string-name>
          <email>lorenzo.olearo@.unimib.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabio D'Adda</string-name>
          <email>fabio.dadda@unimib.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vincenzina Messina</string-name>
          <email>enza.messina@unimib.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Cremaschi</string-name>
          <email>marco.cremaschi@unimib.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefania Bandini</string-name>
          <email>stefania.bandini@unimib.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesca Gasparini</string-name>
          <email>francesca.gasparini@unimib.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science</institution>
          ,
          <addr-line>Systems and Communications</addr-line>
          ,
          <institution>University of Milano - Bicocca</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>0</volume>
      <fpage>6</fpage>
      <lpage>09</lpage>
      <abstract>
        <p>Despite the rapid development in recent years of Artificial Intelligence models to predict poverty, the problem remains open, especially from a multidimensional perspective. In this work we present our proposal to address multidimensional poverty in a fragile population, older adults, starting from an unlabelled dataset collected by administering a dedicated questionnaire to about 500 individuals. First, a model that labels the collected data into three classes of poverty is proposed. Then, XGBoost and Naive Bayes classifiers are considered to solve the classification problem. Finally, after determining the relative importance of each feature, a novel Naive Bayes model is proposed that relies on new aggregated features representing five poverty dimensions. These aggregated features are obtained by combining the variables collected through the questionnaire with cut-offs defined by a domain expert.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Poverty is one of the most significant social problems according to the Organisation for Economic
Co-operation and Development (OECD). Poverty does not only affect developing countries but
can also concern specific categories of fragile people. There is a strong relationship
between poverty and human rights, as also emphasised by the Council of Europe [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. An extreme
imbalance in people’s wealth leads to extreme inequality in the enjoyment of human rights.
Therefore, overcoming poverty also implies reducing this inequality. Being able to predict
poverty therefore becomes crucial as one of the first steps in defeating it. The application
of artificial intelligence to the problem of poverty prediction is very recent, with significant
early work published in 2016 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], and a rapid increase in interest in 2021 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This increase
in applications of artificial intelligence to tackle poverty and, in general, to contribute to the
achievement of sustainable development goals, especially in the last two years, is also linked to
the cumulative effect of the COVID-19 pandemic, the Russia-Ukraine war and climate change [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ],
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The first approaches considered poverty and the corresponding indicators mainly referring
to monetary aspects [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6, 7, 8</xref>
        ]. The recent trend, on the other hand, considers poverty as the
consequence of several concomitant effects and thus requires the definition of multidimensional
indicators [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], [10]. Several definitions and measurements can be given for multidimensional
poverty, depending on how data are collected and analysed. Moreover, addressing the
different categories of fragile people requires identifying adequate measures that capture the peculiarities of their
conditions.
      </p>
      <p>
        One of the major problems in predicting poverty with AI is the lack of high-quality labelled
data. Different proxies can be assumed as ground truth for Machine Learning (ML) training,
such as the Proxy Means Test (PMT) labels. However, these proxies are not easily verifiable
[11]. Among the ML techniques for poverty classification, decision trees [12], random
forests [11], and ensemble approaches [13] are the most used. A pioneering work [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] proposed
analysing high-resolution satellite images to predict poverty using a CNN model
pre-trained on ImageNet. In poverty prediction the role of the different poverty dimensions is
crucial. AI models have also been applied for feature selection [14] and feature ranking [15] to
create models that rely only on the most important variables [12].
      </p>
      <p>The AMPEL Project (Artificial intelligence facing Multidimensional Poverty in ELderly) addresses
the problem of predicting poverty in a particular fragile population: older adults. Especially
in this case, income-based indicators are poor proxies of material conditions [16, 17], whereas
non-monetary ones improve our understanding of who is poor. The final aim of the project
is to define a three-level poverty risk classifier useful to identify subjects who need prompt
support, especially in case of emergencies. As a secondary outcome of no less importance, the
project aims at identifying which indicators may be the
most meaningful to describe multidimensional poverty in the elderly population. To this end, a dedicated questionnaire
covering several aspects, including economic status, needs, health conditions, loneliness and
social interactions, has been designed. Thanks to the help of Auser volunteers,
the questionnaire has been delivered to around 500 elderly people in Lombardy, collecting a
valuable amount of data.</p>
      <p>In this paper, we report our classification approach to identify a three-level risk of poverty,
considering the following main steps:
1. Pre-processing the collected data in order to obtain a clean dataset and select robust features
to feed ML algorithms;
2. Applying an automatic procedure to label the collected data, taking into account the
multifaceted aspects of poverty, that initially relies on the intervention of domain experts;
3. Applying a suitable model to understand how different features contribute to
identifying poverty;
4. Defining multiple classifiers based on new feature vectors that combine the original
features according to their relevance.</p>
      <p>The paper is organised as follows. In Section 2 the data and its sources are presented. In
Section 3, the proposed framework of analysis is described. Section 4 analyses the relevance of the
features and Section 5 presents the classification models. In Section 6 a brief description of
the technologies used for the implementation is provided, emphasizing the data visualization
dashboard that has been implemented. Finally, we conclude this paper and discuss
future directions in Section 7.</p>
    </sec>
    <sec id="sec-3">
      <title>2. The AMPEL Dataset</title>
      <p>To collect the dataset, a questionnaire properly defined by domain experts was administered to
older people by trained interviewers to investigate various aspects of their living conditions.
These questions cover topics such as monetary aspects, health status, environment, social
networks, and quality of life, resulting in a total of 125 variables (features) for each individual.
The questionnaires were administered to 496 people (336 females) with an average age of 76.4
(standard deviation = 8.9).</p>
      <p>The collected variables can be grouped into two categories: i) Categorical variables: variables
which assume a fixed set of values (e.g., private health insurance coverage: yes or no); ii)
Numeric variables: variables which assume numeric values, both discrete and continuous (e.g.,
the number of individuals that live in a house). It should be noted, however, that in the collected
dataset numerical features are considerably fewer than categorical ones, a topic that will later be
addressed in Section 3.1.</p>
      <p>With the supervision of a domain expert, each of these features has been assigned to one of
the five poverty dimensions listed here:
• Maintenance Capacity: the financial situation of an individual;
• Consumption Deprivation: groups information on what the individual
can afford;
• Health Status: collects features related to the health status of the individual;
• Housing Facilities: contains all the information that describes the conditions of the
dwelling and the neighbourhood in which the individual lives;
• Social and Context Deprivation: reports information about social relations.</p>
    </sec>
    <sec id="sec-4">
      <title>3. The proposed framework of analysis</title>
      <p>The prediction of multidimensional poverty using AI-based solutions requires addressing
methodological issues in order to face the complexity of the task, especially in this case, in which
the dataset has a limited sample size (496 individuals) and contains missing values
due to the lack of information from participants. In this section, a methodology that addresses
this critical data gap and estimates the distribution of poverty among individuals within the dataset is
presented as a pipeline of four distinct phases, as illustrated in Figure 1:
1. Data Cleaning: This step aims to identify and correct errors or inconsistencies in the
dataset to improve its quality, accuracy, and reliability. It is a crucial step for the next
phases because the quality of the data directly impacts the validity and effectiveness
of the employed machine learning model. Moreover, it requires feature engineering to
correct for type heterogeneity within the dataset.
2. Data Labelling: The dataset does not explicitly state the distribution of poverty across
the population, so this fundamental step implements an approach that classifies
individuals into different poverty levels.
3. Feature Relevance: The approach relies on multidimensional poverty in order to
consider the different levels of deprivation of an individual. The purpose of this
phase is to find which features have the greatest impact on the definition of poverty.
4. Poverty Prediction: The last step aims to build a machine learning model that uses
the labelled data and the feature relevance to classify new, previously
unseen data instances. The goal is to define a three-level poverty predictor.</p>
      <sec id="sec-4-1">
        <title>3.1. Data Cleaning</title>
        <p>This dataset requires essential preliminary steps of data pre-processing to clean and normalise
the data entries. First, personal identifying details are removed, along with other irrelevant features and
features with too many missing or noisy values. When possible, however, missing values are filled
with relevant ones. For example, an individual who does not have a certain condition does not
answer the related question in the form; all such entries are thus filled with a categorical value
representing the absence of that specific condition.</p>
        <p>The dataset is composed of data collected from a survey. Some of its features are answers
to open-ended questions and, as such, they present considerable name heterogeneity: answers
that are semantically equivalent but syntactically different. To address this heterogeneity, when
possible, the syntactically different values are mapped into a common one; otherwise,
given the small size of the dataset, these values are manually corrected one at a
time.</p>
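        <p>The mapping of syntactically different answers onto a common value can be sketched as follows. This is a minimal illustration and not the project code: the dictionary <monospace>CANONICAL</monospace>, the function <monospace>normalise_answer</monospace> and the sample answers are hypothetical, and answers that cannot be mapped are only flagged so that they can be corrected by hand.</p>
        <preformat><![CDATA[
```python
# Hypothetical lookup table: free-text variants -> one canonical category.
CANONICAL = {
    "si": "yes", "yes": "yes", "y": "yes",
    "no": "no", "n": "no",
}


def normalise_answer(raw, mapping=CANONICAL, unknown="__review__"):
    """Map a free-text answer to its canonical value.

    Answers without a known mapping are flagged with `unknown` so that
    they can be inspected and corrected manually.
    """
    key = raw.strip().lower()
    return mapping.get(key, unknown)


answers = [" Si", "YES", "no ", "forse"]
cleaned = [normalise_answer(a) for a in answers]
```
]]></preformat>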
        <p>Another considerable issue of this dataset is represented by features that are related and
dependent. For example, the fact that an individual is retired from work is clearly related to
the features describing the job in which an individual is engaged and to those describing whether an individual
is unemployed. To address this redundant information, the dataset is simplified by
reducing its dimensionality: each group of redundant columns is collapsed into a single
meaningful one.</p>
        <p>The dataset is composed mostly of categorical features, but some numerical ones are
present, such as the age of a given person or the number of cigarettes an individual smokes over
the course of a week.</p>
        <p>Considering the small number of observations in this dataset, namely 496, the numerical
columns are simplified by binarizing their values. For example,
instead of representing the number of people that live with an individual, a possibly challenging
numerical value to handle in an otherwise categorical dataset, this information is transformed
into a flag representing whether this individual lives alone or not. This type of feature engineering
is applied with the same logic to all the numerical features remaining in the dataset, thus
transforming it into a fully categorical dataset.</p>
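        <p>The binarization described above can be illustrated with a short sketch. The field names and the thresholds are assumptions made for the example: the lives-alone rule follows the text, while the smoker rule (any cigarette in a week) is hypothetical.</p>
        <preformat><![CDATA[
```python
def binarise(record):
    """Turn the numerical fields of one respondent into categorical flags.

    `cohabitants` and `cigarettes_per_week` are hypothetical field names;
    the smoker threshold is an assumption made for this example.
    """
    return {
        "lives_alone": "yes" if record["cohabitants"] == 0 else "no",
        "smoker": "yes" if record["cigarettes_per_week"] > 0 else "no",
    }


records = [
    {"cohabitants": 0, "cigarettes_per_week": 0},
    {"cohabitants": 2, "cigarettes_per_week": 35},
]
categorical = [binarise(r) for r in records]
```
]]></preformat>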
        <p>The steps illustrated above are repeated on the whole dataset and allow reducing the number
of features from 125 to 103.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Data Labelling</title>
        <p>Since the dataset does not contain information about the distribution of poverty, it is
necessary to define a process that introduces this information, providing labelled data.
This issue has been addressed by applying the framework described in a previous work [18] and
first proposed by Liberati et al. [19]. This approach seeks to assess the likelihood of poverty
for each person by employing a set of weight vectors to capture the various aspects of
multidimensional poverty. Instead of assigning a single weight to each variable, this technique
generates a representation of many potential weight combinations. These weights
enable the classification of individuals in the dataset into three poverty levels: high (identified
with red color), medium (yellow color), or low (green color). The implementation is the same as
reported in [18], with the initial deprivation cut-offs adjusted to the new
dataset. In order to understand how the vector space has been built, it is necessary to summarize
the steps of the labelling process, which are also illustrated in Figure 2:
1. Load Dataset: This preliminary step loads the dataset and the
related cut-offs, and prepares the data for the next steps of the pipeline.
2. Deprivation Matrix: The first phase maps the initial dataset onto a deprivation
matrix by applying the binary cut-offs defined by a domain expert for this data.
The resulting matrix is a binary matrix that indicates whether an individual is deprived or not
with respect to each feature.
3. Deprivation Score Matrix: The deprivation matrix makes it possible to calculate a poverty score
for each individual by weighting each feature with a predefined weight. Instead of defining
a single vector of weights, Liberati et al. [19] propose to randomly generate a set of
weight vectors from a uniform distribution, which allows a better exploration of the feasible
weight space. In this work we use 10000 weight vectors. The final matrix contains, for each
individual, one deprivation score per weight vector. For this reason, the vector of scores
of an individual can be considered as an embedding representation of the poverty level of
that individual. The number of random vectors is a hyperparameter to be
tuned empirically.
4. Poverty Indicator Matrix: Starting from the deprivation score matrix, the poverty
indicator matrix is defined. This matrix represents the poverty rank of each individual
based on their deprivation scores within the considered population. For each vector of
weights, the rank of each individual is computed as one plus the number of individuals
whose deprivation score is higher than the deprivation score of the considered individual.
In simpler terms, the rank is one plus the count of individuals who are
multidimensionally poorer than the current individual. Therefore, the higher the rank, the
lower the poverty level.
5. Probability Matrix: The last phase builds a probability matrix, which represents the
probabilities of individuals being in poverty. Each entry in the matrix indicates the
probability that an individual occupies a given rank from 1 to N in the final poverty matrix.
[Figure 2: the labelling process on a toy example, in which a data matrix is mapped to a binary deprivation matrix that is then multiplied by a vector of weights.]</p>
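        <p>The labelling steps listed above can be sketched in a few lines of plain Python. This is an illustrative reading of the procedure and not the implementation of [18] or [19]: the toy data, the cut-offs, the rule that a value at or above its cut-off counts as deprived, and the normalisation of each random weight vector to sum to one are all assumptions made for the example.</p>
        <preformat><![CDATA[
```python
import random


def deprivation_matrix(data, cutoffs):
    """Binary matrix: 1 where a value reaches its cut-off (assumed rule)."""
    return [[1 if v >= c else 0 for v, c in zip(row, cutoffs)] for row in data]


def random_weights(n_features, rng):
    """Draw weights uniformly in [0, 1) and normalise them (assumption)."""
    raw = [rng.random() for _ in range(n_features)]
    total = sum(raw)
    return [w / total for w in raw]


def score_matrix(dep, weight_vectors):
    """Rows: individuals. Columns: one deprivation score per weight vector."""
    return [
        [sum(w * d for w, d in zip(wv, row)) for wv in weight_vectors]
        for row in dep
    ]


def rank_matrix(scores):
    """Rank = 1 + number of individuals with a strictly higher score."""
    n, m = len(scores), len(scores[0])
    ranks = [[0] * m for _ in range(n)]
    for k in range(m):
        column = [scores[i][k] for i in range(n)]
        for i in range(n):
            ranks[i][k] = 1 + sum(1 for s in column if s > column[i])
    return ranks


def probability_matrix(ranks):
    """Entry [i][r - 1]: share of weight vectors giving individual i rank r."""
    n, m = len(ranks), len(ranks[0])
    probs = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for r in ranks[i]:
            probs[i][r - 1] += 1.0 / m
    return probs


rng = random.Random(0)  # fixed seed so the sketch is reproducible
data = [[2, 2, 2, 6], [1, 2, 3, 5], [1, 2, 1, 1]]  # toy answers
cutoffs = [2, 2, 3, 5]  # toy cut-offs
dep = deprivation_matrix(data, cutoffs)
weights = [random_weights(len(cutoffs), rng) for _ in range(1000)]
scores = score_matrix(dep, weights)
ranks = rank_matrix(scores)
probs = probability_matrix(ranks)
```
]]></preformat>
        <p>In this toy example the third individual is deprived only on a feature on which the other two are also deprived, so it receives the last rank under every weight vector, while the ranks of the first two individuals depend on the weights drawn.</p>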
        <p>From the Poverty Indicator Matrix the individuals can be classified into three levels of poverty,
high (visually coded with red color), medium (visually coded with yellow color) or low (visually
coded with green color).</p>
        <p>Once individuals are labelled, also the associated vectors in the Deprivation Score Matrix can
be referred to the three poverty levels.</p>
        <p>A dimensionality reduction technique such as t-distributed Stochastic Neighbor Embedding
(t-SNE) [20], which is commonly used in machine learning and data analysis for visualizing
high-dimensional data in a lower-dimensional space, can be used. This representation allows us
to visualize how individuals are distributed in the representation space as the number of
random weight vectors varies, and to assess whether it is possible to visually distinguish poverty
groups. This analysis shows how the number of random vectors changes the
representation space of individuals, as reported in Figure 3, considering 5, 10, 50, 100, 1000 and
10000 weight vectors respectively.</p>
        <p>When the number of vectors increases, the representation space changes, and the poverty
groups become more separated. This behavior illustrates the benefits offered by considering a
set of randomized vectors:
• Exploration of Solution Space: Randomized vectors make it possible to explore a wider range of
solutions within the feasible weight space. With a single vector of weights, the
solution would be locked into one point of that space. In contrast, randomized vectors provide a
more comprehensive view of the solution space, which can be beneficial in cases where
the optimal solution is not known in advance.
• Robustness: Randomized vectors can make the procedure more
robust to variations in the data, since the solution becomes
less sensitive to specific data characteristics.
• Efficient Exploration: Randomization allows for efficient exploration of the solution
space, because it is possible to explore a wide range of weight vectors quickly, which is
particularly useful when searching for good solutions in large and complex optimization
problems.</p>
        <p>As the number of weight vectors considered increases, the convergence of the model improves,
as also reported in Table 2.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Feature Relevance</title>
      <p>One of the key aspects of this project lies in understanding which features
from different domains contribute to an individual’s poverty, and how much. However, with such a complex
categorical dataset this task becomes challenging.</p>
      <p>The strength of the association between the features is first analyzed through Cramér’s
V [21], a measure of association between two nominal variables based on Pearson’s χ² test.
The resulting correlation matrix suggests that there is a significant correlation only among
features related to the health status, while no other significant correlation among other variables
has been found.</p>
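      <p>For reference, Cramér’s V for two nominal variables observed on n individuals is the square root of χ² / (n · min(r − 1, c − 1)), where r and c are the numbers of distinct values of the two variables. The following sketch computes it without any bias correction; the function name <monospace>cramers_v</monospace> and the sample data are illustrative only.</p>
      <preformat><![CDATA[
```python
from collections import Counter
from math import sqrt


def cramers_v(x, y):
    """Cramér's V between two equally long sequences of nominal values."""
    n = len(x)
    joint = Counter(zip(x, y))   # observed count of each pair of values
    row_tot = Counter(x)         # marginal counts of the first variable
    col_tot = Counter(y)         # marginal counts of the second variable
    chi2 = 0.0
    for a in row_tot:
        for b in col_tot:
            expected = row_tot[a] * col_tot[b] / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_tot), len(col_tot)) - 1
    if k == 0:
        return 0.0  # a constant variable carries no association
    return sqrt(chi2 / (n * k))


perfect = cramers_v(["a", "a", "b", "b"], ["u", "u", "v", "v"])
independent = cramers_v(["a", "a", "b", "b"], ["u", "v", "u", "v"])
```
]]></preformat>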
      <p>To obtain more informative results, a different approach based on
machine learning models was attempted. In particular, the information gain of each feature in a tree-based
model, in this case XGBoost [22], is investigated.</p>
      <p>XGBoost is used to fit boosted trees for the three-class classification task. The
tree-based structure of the classifier can be exploited to better understand the features in our
dataset; furthermore, it allows us to operate with categorical data.</p>
      <p>Once fitted, an XGBoost model returns the importance of each feature in the dataset the
model was trained with. This importance can be defined according to different metrics. In this
case, particular interest lies in the gain score of each feature, which is defined as the average
gain across all splits the feature is used in. The gain of a certain feature represents its relative
contribution to the classification for each of the trees in the model: it measures the amount of
information gained about the target class after adding the corresponding split to the model.</p>
      <p>The information gain of a tree-based model is a powerful metric, but it can be misleading.
This is especially true in our case, in which the dataset has a large number of features (103) while the
observations are few (496): the model is at high risk of overfitting, as verified in
our experiments, which are available at the following link: AMPEL Code.</p>
      <p>To deal with this issue, a 5-fold cross-validation process is employed. The entire dataset is
initially divided into two subsets: 80% of it is used for training while the remaining 20%
is retained for testing. On the training subset of the dataset, a 5-fold cross-validation of the
XGBoost model is carried out, and all 5 models fitted in the process are saved along with their
gain scores. To reduce the effect of overfitting, these scores are then averaged.</p>
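      <p>The averaging of the gain scores over the cross-validated models can be sketched as follows. The per-fold dictionaries mimic the shape of the mapping returned by the XGBoost Python call <monospace>Booster.get_score(importance_type="gain")</monospace>, which omits features that were never used in a split. The feature names and values are hypothetical, and counting an omitted feature as zero gain for that fold is an assumption of this sketch.</p>
      <preformat><![CDATA[
```python
def average_gain(fold_gains):
    """Average per-feature gain over folds.

    A feature that does not appear in a fold (never used in a split)
    is counted as 0 for that fold, which is an assumption of this sketch.
    """
    features = set().union(*fold_gains)
    k = len(fold_gains)
    return {f: sum(g.get(f, 0.0) for g in fold_gains) / k for f in features}


def top_features(mean_gain, n):
    """Names of the n features with the highest average gain."""
    return sorted(mean_gain, key=mean_gain.get, reverse=True)[:n]


# Hypothetical gain scores from two cross-validated models.
fold_gains = [
    {"lives_alone": 4.0, "chronic_illness": 2.0},
    {"lives_alone": 2.0, "home_ownership": 3.0},
]
mean_gain = average_gain(fold_gains)
ranking = top_features(mean_gain, 2)
```
]]></preformat>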
      <p>Moreover, from the gain scores of the features within each group, it is also possible to compute
the relevance of each of the five dimensions, as reported in Figure 4, that shows a fairly balanced
distribution of importance.</p>
      <p>Computing the relevance of the features and of their groups is a fundamental task: it allows
a subset of questions to be identified and focused upon in case a new questionnaire is
submitted to another sample of individuals.</p>
    </sec>
    <sec id="sec-5b">
      <title>5. Towards the prediction of multidimensional poverty</title>
      <p>Having pre-processed the dataset and assigned the poverty labels, the problem of
multidimensional poverty prediction can be addressed. Three distinct classification models are
investigated here: XGBoost, Categorical Naive Bayes and a novel hybrid approach that combines
XGBoost feature gain scores with a Naive Bayes classifier, considering the five poverty dimensions
defined in Section 2.</p>
      <p>In all of the following analyses, the same 80% of the dataset previously used for the feature
relevance is used to train the presented models, while the remaining 20% is retained for
testing. All of the models are thus trained on the same data and evaluated on the same test set.</p>
      <p>In this section the proposed models are evaluated considering three different metrics: accuracy,
recall and F1-score.</p>
      <sec id="sec-5-1">
        <title>5.1. The XGBoost classification model</title>
        <p>Following our previous work [18], and having used it to understand the impact of each feature,
the first model analyzed is XGBoost in its categorical classification form. This model provides
the baseline for subsequent predictive models. It is chosen as it offers significant
interpretability through the visualization of the aforementioned feature importance; moreover,
both training and inference on the model are remarkably fast. Once
fitted, our XGBoost model achieves 79% overall accuracy on the test set.</p>
        <table-wrap>
          <caption>
            <p>Per-class results of the XGBoost model on the test set.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Class</th><th>Accuracy</th><th>Recall</th><th>F1-Score</th><th>Support</th></tr>
            </thead>
            <tbody>
              <tr><td>Green</td><td>0.73</td><td>0.90</td><td>0.81</td><td>21</td></tr>
              <tr><td>Yellow</td><td>0.72</td><td>0.70</td><td>0.71</td><td>37</td></tr>
              <tr><td>Red</td><td>0.89</td><td>0.81</td><td>0.85</td><td>42</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. The Categorical Naive Bayes classification model</title>
        <p>With the aim of having an explainable model, a Bayesian approach is analyzed, more precisely a
Naive Bayes classifier. No conditional dependence is assumed between the various features
of the dataset. This is a reasonable assumption, as the questions asked to the individuals cover
different subjects, albeit within the same topic in some cases. Furthermore,
finding any conditional dependence between such a large number of features is non-trivial and
requires rich domain knowledge.</p>
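        <p>To make the independence assumption concrete, the sketch below implements a small Naive Bayes classifier for categorical features in plain Python. It is an illustration and not the model used in the experiments: the additive (Laplace) smoothing with parameter <monospace>alpha</monospace>, the class name <monospace>CategoricalNB</monospace> and the toy data are assumptions of the example.</p>
        <preformat><![CDATA[
```python
from collections import Counter
from math import log


class CategoricalNB:
    """Naive Bayes for categorical features with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.class_count = Counter(y)
        n_features = len(X[0])
        # values[j]: categories observed for feature j in the training data
        self.values = [set(row[j] for row in X) for j in range(n_features)]
        # counts[c][j][v]: how often feature j took value v within class c
        self.counts = {
            c: [Counter() for _ in range(n_features)] for c in self.classes
        }
        for row, label in zip(X, y):
            for j, v in enumerate(row):
                self.counts[label][j][v] += 1
        return self

    def log_posterior(self, row):
        """Unnormalised log-posterior of each class for one individual."""
        total = sum(self.class_count.values())
        scores = {}
        for c in self.classes:
            lp = log(self.class_count[c] / total)  # class prior
            for j, v in enumerate(row):
                # Features contribute independently given the class.
                num = self.counts[c][j][v] + self.alpha
                den = self.class_count[c] + self.alpha * len(self.values[j])
                lp += log(num / den)
            scores[c] = lp
        return scores

    def predict(self, row):
        scores = self.log_posterior(row)
        return max(scores, key=scores.get)


# Toy training data: two categorical features, two poverty classes.
X = [["yes", "low"], ["yes", "low"], ["no", "high"], ["no", "high"]]
y = ["red", "red", "green", "green"]
model = CategoricalNB().fit(X, y)
label = model.predict(["yes", "low"])
```
]]></preformat>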
        <p>Once again, the model is trained on the entire training set, this time achieving slightly better
results than the XGBoost model, with a global accuracy of 84% as shown in Table 4.</p>
        <sec id="sec-5-2-1">
          <p>[Table 4: per-class results of the Categorical Naive Bayes model on the test set.]</p>
          <p>Increasing the number of features tends to add more information that the model can learn.
Ideally this can lead to better discrimination between classes. However, this can also have its
downsides: as the number of features grows, so does the risk of overfitting the training data. The
model might learn unrelated noise while losing the ability to later generalize on
the test set.</p>
          <p>Naive Bayes classifiers are known for their simplicity and effectiveness in many classification
tasks [23]; however, they tend to be less informative as the number of different features increases.
In order to create a model as explainable as possible, it is interesting to study how the
performance of the model changes while increasing the number of features it is trained with.</p>
          <p>In Figure 5, we present the performance evaluation of the model employing a 5-fold strategy,
showcasing the model performance as we increment the number of features. These features are
arranged based on their relevance in descending order, determined by computing the average
feature gain scores across the 5 cross-validated XGBoost models discussed in Section 4.</p>
          <p>The accuracy trend is irregular as the number of features considered varies, probably
due to the small number of observations in the dataset. However, an
increasing tendency is still visible, which seems to stabilize around the top 60 features.</p>
        </sec>
        <sec id="sec-5-2-2">
          <title>5.3. The Naive Bayes multidimensional poverty classifier</title>
          <p>This model is introduced with the aim of having a predictive model as explainable as possible,
capable of operating with a reduced number of features but in a multidimensional
setting. These dimensions correspond to the five groups defined in Section 2: Social and Context Deprivation,
Maintenance Capacity, Consumption Deprivation, Housing Facilities and Health Status.</p>
          <p>For each dimension, a new feature is computed by linearly combining the XGBoost
information gain of each feature with its corresponding cut-off defined by the domain expert. In order to
improve the generalization power of the model, the gain used for this linear combination is the
one averaged over the 5 cross-validated models of Section 4.</p>
          <p>The procedure to compute the five new features is reported below. Starting from the binary
deprivation matrix defined in Section 3.2, each row describes the features on which an individual
is considered deprived. These rows are here referred to as deprivation vectors. Each column,
instead, represents the distribution of individuals that exceed the cut-offs defined by the domain
expert.</p>
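          <p>The gain scores obtained for each feature are grouped
according to their previously defined poverty dimension and linearly combined with the corresponding
deprivation vector as shown in Equation 1, where <inline-formula><tex-math>\mathbf{g}</tex-math></inline-formula> is the importance vector for the group of
features of the considered poverty dimension, each <inline-formula><tex-math>g_i</tex-math></inline-formula> with <inline-formula><tex-math>i \in \{1, \dots, n\}</tex-math></inline-formula> is the gain score of the
<inline-formula><tex-math>i</tex-math></inline-formula>-th feature, and <inline-formula><tex-math>\mathbf{d}</tex-math></inline-formula> is the deprivation vector, where each <inline-formula><tex-math>d_i</tex-math></inline-formula> with <inline-formula><tex-math>i \in \{1, \dots, n\}</tex-math></inline-formula> represents
the <inline-formula><tex-math>i</tex-math></inline-formula>-th feature of that group.</p>
          <p>
            <disp-formula id="eq-1">
              <label>(1)</label>
              <tex-math><![CDATA[\mathbf{g}^{\top}\mathbf{d} = \begin{bmatrix} g_1 & g_2 & \cdots & g_n \end{bmatrix} \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix} = \sum_{i=1}^{n} g_i d_i]]></tex-math>
            </disp-formula>
          </p>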
          <p>These computations result in five numerical values, referred to as deprivation scores, representing
the “magnitude” of the deprivation an individual is subject to in each of the five dimensions.</p>
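          <p>The computation of the deprivation scores per dimension can be sketched as a sum of gain-weighted deprivation flags within each dimension. The feature names, the gain values and the assignment of features to dimensions below are hypothetical, and a feature without a gain score is assumed to contribute zero.</p>
          <preformat><![CDATA[
```python
def dimension_scores(deprivation_vector, gain, dimension_of):
    """Sum gain_i * d_i separately within each poverty dimension.

    deprivation_vector: feature -> 0/1 flag for one individual
    gain:               feature -> averaged gain score
    dimension_of:       feature -> name of its poverty dimension
    """
    scores = {}
    for feature, deprived in deprivation_vector.items():
        dim = dimension_of[feature]
        weight = gain.get(feature, 0.0)  # assumption: missing gain counts as 0
        scores[dim] = scores.get(dim, 0.0) + weight * deprived
    return scores


# Hypothetical features, gains and dimension assignment.
gain = {"lives_alone": 3.0, "no_heating": 1.5, "damp_walls": 0.5}
dimension_of = {
    "lives_alone": "Social and Context Deprivation",
    "no_heating": "Housing Facilities",
    "damp_walls": "Housing Facilities",
}
individual = {"lives_alone": 1, "no_heating": 1, "damp_walls": 0}
scores = dimension_scores(individual, gain, dimension_of)
```
]]></preformat>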
          <p>Each individual at this point is encoded into 5 values that are used to determine their
poverty class. To this end, a Naive Bayes classifier is applied, having a parent node for each
of the 5 classes of features as shown in Figure 6.</p>
          <p>The classifier built on top of this pipeline shows good performance, as reported in Table 5,
reaching an overall accuracy of 89% on the test set, which, it is important to mention, has never
been used in the entire pipeline except to assess this metric. This result shows how well the
model is able to generalize by using the average of the gain extracted from the cross-validated
XGBoost models introduced in Section 4.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. The AMPEL Project Dashboard</title>
      <p>The implementation of the project involves multiple independent software modules that
cooperate by exchanging data resulting from the analyses described above. Figure 7 shows
the different pieces of software implemented and deployed by using Docker containers. A brief
description of each backend component is reported below:
• MongoDB: MongoDB, an open-source NoSQL
database management system, has been used to store the AMPEL data. It falls under the category of document-oriented databases
and is designed to handle large volumes of unstructured or semi-structured data.
• API: The deployable code runs in the AMPEL API container, since APIs enable the
integration of diverse applications and technologies.
• Jupyter Notebook: Most of the analyses have been implemented in a Jupyter Notebook
because it makes it possible to create and share documents containing live code, equations and
visualizations.</p>
      <p>The last service shown in Figure 7 is the Dashboard (shown in Figure 8). It plays a crucial
role because it allows the visual analysis of the results of the analyses carried out in the previous sections.
The Dashboard is a visual representation of key information, data, metrics and performance
indicators, organized and displayed in a centralized and easily accessible manner. In this case, the
dashboard has been designed to provide a quick and concise overview of important information,
allowing experts to monitor, analyze, and make informed decisions based on real-time and
historical data. The charts available in the dashboard include:
• Poverty on Degree: Poverty is analyzed by categorizing individuals according to their
educational attainment.
• Poverty on Gender: Poverty is analyzed by categorizing individuals according to their
gender.
• Map Poverty: Poverty is analyzed by highlighting on a geographical map the areas with
the highest concentrations of poverty.</p>
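      <p>The aggregation behind charts such as Poverty on Degree and Poverty on Gender can be sketched as a simple group-and-count step. The snippet below is only an illustration: the record layout, the field names and the class labels are hypothetical and do not reflect the actual AMPEL API or database schema.</p>
      <preformat>
```python
from collections import Counter, defaultdict


def poverty_by_group(records, group_key, class_key="poverty_class"):
    """Count the predicted poverty classes within each value of group_key."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record[group_key]][record[class_key]] += 1
    return {group: dict(counter) for group, counter in counts.items()}


# Hypothetical records; field names and labels are illustrative only.
records = [
    {"degree": "primary", "gender": "F", "poverty_class": "high"},
    {"degree": "primary", "gender": "M", "poverty_class": "low"},
    {"degree": "tertiary", "gender": "F", "poverty_class": "low"},
    {"degree": "tertiary", "gender": "M", "poverty_class": "low"},
]

by_degree = poverty_by_group(records, "degree")
by_gender = poverty_by_group(records, "gender")
print(by_degree)
print(by_gender)
```
</preformat>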
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion and future work</title>
      <p>The proposed AI approaches to predicting multidimensional levels of poverty have shown good
performance. The feature relevance analysis makes it possible to define poverty scores, related only to
the five poverty dimensions, as suitable linear combinations of the original features. This
approach allows a model to be trained on lower-dimensional feature vectors that preserve the
valuable information of the original set of features. However, it is worth noting that the labelling
procedure could have introduced a bias in the final results, and that the definition of the five
dimensions could be revised and extended. To test our proposal and make it more generalizable,
we plan to adopt other labelled datasets available in the literature, in order to evaluate both the labelling
procedure and the proposed Naive Bayes multidimensional classifier.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This research is supported by the Fondazione CARIPLO project “AMPEL: Artificial intelligence facing
Multidimensional Poverty in ELderly” (CUP H45F20000840007, Ref. 2020-0232). This publication
was also produced with the co-funding of the European Union – Next Generation EU, in the context
of the National Recovery and Resilience Plan, Investment Partenariato Esteso PE8 “Conseguenze
e sfide dell’invecchiamento”, Project Age-It (Ageing Well in an Ageing Society), PE00000015,
CUP: H43C22000840006.</p>
      <p>Figure 8: (a) Dashboard Panel; (b) Dashboard Charts.</p>
    </sec>
  </body>
</article>