=Paper=
{{Paper
|id=Vol-3074/paper07
|storemode=property
|title=Spatial weighted robust clustering of multivariate time series based on quantile dependence with an application to mobility during COVID-19 pandemic
|pdfUrl=https://ceur-ws.org/Vol-3074/paper07.pdf
|volume=Vol-3074
|authors=Àngel López-Oriona, Pierpaolo D'Urso, Jose Vilar,Borja Lafuente-Rego
|dblpUrl=https://dblp.org/rec/conf/wilf/Lopez-OrionaDVL21
}}
==Spatial weighted robust clustering of multivariate time series based on quantile dependence with an application to mobility during COVID-19 pandemic==
Spatial weighted robust clustering of multivariate
time series based on quantile dependence with an
application to mobility during COVID-19 pandemic
Ángel López-Oriona1 , Pierpaolo D’Urso2 , José A. Vilar1 and Borja Lafuente-Rego1
1
Department of Mathematics, Research Group MODES, Research Center for Information and Communication
Technologies (CITIC), University of A Coruña, Spain
2
Department of Social Sciences and Economics, Sapienza University of Rome, Italy
Abstract
In this paper, a fuzzy clustering model for multivariate time series based on the quantile cross-spectral density and
principal component analysis is improved. The extension consists of (i) a weighting system which assigns a weight
to each principal component in accordance with its importance concerning the underlying clustering structure and
(ii) a penalization term allowing to take into account the spatial information. The iterative solutions of the new
model, which employs the exponential distance in order to gain robustness against outlying series, are derived. A
simulation study shows that the introduction of the weighting system substantially enhances the effectiveness of the
former approach. The behaviour of the extended model in terms of the spatial penalization term is also analysed. An
application involving multivariate time series of mobility indicators concerning COVID-19 pandemic highlights the
usefulness of the proposed technique.
Keywords
clustering, multivariate time series, principal component analysis, quantile cross-spectral density, weighting system,
spatial statistics, COVID-19
1. Introduction and related work
Clustering of time series is a central problem in data mining with applications in many fields [1, 2]. The objective
is to split a large set of unlabelled time series into homogeneous groups so that similar series are placed together
in the same group and dissimilar series are located in different groups. The majority of the proposed approaches
concern univariate time series (UTS) [3, 4, 5], while clustering of multivariate time series (MTS) has received much
less attention [6].
This paper improves a fuzzy clustering model for MTS proposed in our previous work [7]. This model is based
on the so-called Quantile Cross-spectral Density (QCD) and the classical Principal Component Analysis (PCA).
Although highly successful, the model in [7] suffers from two major drawbacks. First, each principal component
contributes equally to the objective function of the clustering algorithm. This overlooks the fact that some principal
components could have more importance than others concerning the underlying clustering structure. Second,
no spatial information is considered in the clustering model, so it can not handle time series datasets containing
geographical information in a proper way. This paper proposes a new fuzzy clustering model aimed at overcoming
both previously mentioned issues.
The 13th International Workshop on Fuzzy Logic and Applications
" oriona38@hotmail.com (.́ López-Oriona); pierpaolo.durso@uniroma1.it (P. D’Urso); jose.vilarf@udc.es
(J. A. Vilar); borja.lafuente@udc.es (B. Lafuente-Rego)
0000-0003-1456-7342 (.́ López-Oriona); 0000-0002-7406-6411 (P. D’Urso); 0000-0001-5494-171X (J. A. Vilar);
0000-0002-0877-7063 (B. Lafuente-Rego)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR
Workshop
Proceedings
http://ceur-ws.org
ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org)
In Section 2, the procedure presented in [7] is concisely revised and its weighted version is introduced. A brief
numerical experiment is carried out in order to show why introducing a weighting system in the principal compo-
nents space is advantageous. Section 3 describes the new proposed model, in which spatial information is taken
into account. A simple simulation study is performed to assess the behaviour of the procedure. In Section 4, the
technique is applied to a dataset of mobility concerning the COVID-19 pandemic. Section 5 concludes. Finally, the
Appendix contains the derivation of the iterative solutions of the proposed algorithm.
2. Fuzzy clustering models based on the quantile cross-spectral
density
In this section, we first introduce a fuzzy clustering model based on QCD and PCA. Then, we present a modification
of this model in which a different weight is given to each principal component. A numerical experiment shows the
usefulness of the latter approach.
2.1. A fuzzy clustering model based on QCD and PCA
Let {𝑋 𝑡 , 𝑡 ∈ Z} = {(𝑋𝑡,1 , . . . , 𝑋𝑡,𝑑 ), 𝑡 ∈ Z} be a 𝑑-variate real-valued strictly stationary stochastic process.
Denote by 𝐹𝑗 the marginal distribution function of 𝑋𝑡,𝑗 , 𝑗 = 1, . . . , 𝑑, and by 𝑞𝑗 (𝜏 ) = 𝐹𝑗−1 (𝜏 ), 𝜏 ∈ [0, 1], the
corresponding quantile function. Fixed 𝑙 ∈ Z and an arbitrary couple of quantile levels (𝜏, 𝜏 ′ ) ∈ [0, 1]2 , consider
the cross-covariance of the indicator functions 𝐼 {𝑋𝑡,𝑗1 ≤ 𝑞𝑗1 (𝜏 )} and 𝐼 {𝑋𝑡+𝑙,𝑗2 ≤ 𝑞𝑗2 (𝜏 ′ )} given by
𝛾𝑗1 ,𝑗2 (𝑙, 𝜏, 𝜏 ′ ) = Cov 𝐼 {𝑋𝑡,𝑗1 ≤ 𝑞𝑗1 (𝜏 )} , 𝐼 𝑋𝑡+𝑙,𝑗2 ≤ 𝑞𝑗2 (𝜏 ′ ) , (1)
(︀ {︀ }︀)︀
for 1 ≤ 𝑗1 , 𝑗2 ≤ 𝑑. Taking 𝑗1 = 𝑗2 = 𝑗, the function 𝛾𝑗,𝑗 (𝑙, 𝜏, 𝜏 ′ ), with (𝜏, 𝜏 ′ ) ∈ [0, 1]2 , so-called quantile
autocovariance function (QAF) of lag 𝑙, generalizes the traditional autocovariance function.
In the case of the multivariate process {𝑋 𝑡 , 𝑡 ∈ Z}, we can consider the 𝑑 × 𝑑 matrix
Γ(𝑙, 𝜏, 𝜏 ′ ) = 𝛾𝑗1 ,𝑗2 (𝑙, 𝜏, 𝜏 ′ ) 1≤𝑗 ,𝑗 ≤𝑑 , (2)
(︀ )︀
1 2
which jointly provides information about both the cross-dependence (when 𝑗1 ̸= 𝑗2 ) and the serial dependence
(because the lag 𝑙 is considered).
In the same way as the spectral density is the representation in the frequency domain of the autocovariance func-
tion, the spectral counterpart for the cross-covariances 𝛾𝑗1 ,𝑗2 (𝑙, 𝜏, 𝜏 ′ ) can be introduced. Under suitable summabil-
ity conditions (mixing conditions), the Fourier transform of the cross-covariances is well-defined and the quantile
cross-spectral density is given by
∞
∑︁
f𝑗1 ,𝑗2 (𝜔, 𝜏, 𝜏 ′ ) = (1/2𝜋) 𝛾𝑗1 ,𝑗2 (𝑙, 𝜏, 𝜏 ′ )𝑒−𝑖𝑙𝜔 , (3)
𝑙=−∞
for 1 ≤ 𝑗1 , 𝑗2 ≤ 𝑑, 𝜔 ∈ R and 𝜏, 𝜏 ′ ∈ [0, 1]. Note that f𝑗1 ,𝑗2 (𝜔, 𝜏, 𝜏 ′ ) is complex-valued.
The quantile cross-spectral density contains information about the general dependence structure of a given
stochastic process. For a specific realization of the process, this quantity can be consistently estimated by means of
𝑗1 ,𝑗2
the so-called smoothed CCR-periodogram, 𝐺 ^ 𝑇,𝑅 (𝜔, 𝜏, 𝜏 ′ ), proposed by [8].
Based on previous remarks, a simple dissimilarity measure between two realizations of the 𝑑-variate process
(𝑖) 𝑗1 ,𝑗2
(MTS) can be defined as follows. Given the 𝑖-th MTS, 𝑋 𝑡 , consider the set 𝐺(𝑖) = {𝐺 ^ 𝑇,𝑅 (𝜔, 𝜏, 𝜏 ′ ), 𝑗1 , 𝑗2 =
1, . . . , 𝑑, 𝜔 ∈ Ω, 𝜏, 𝜏 ∈ 𝒯 }, where Ω is the set of Fourier frequencies and 𝒯 = {0.1, 0.5, 0.9}. Let Ψ(𝑖) be
′
the vector formed by concatenating separately the real and imaginary parts of the elements of the set 𝐺(𝑖) . A
(1) (2)
dissimilarity measure between the series 𝑋 𝑡 and 𝑋 𝑡 is defined as the squared Euclidean distance between the
vectors Ψ and Ψ . We call this dissimilarity 𝑑𝑄𝐶𝐷 .
(1) (2)
The distance measure 𝑑𝑄𝐶𝐷 can be used as input to the traditional fuzzy 𝐶-medoids algorithm to develop a
clustering procedure for MTS. However, the numerical simulations carried out in [7] have shown that performing
PCA as a preprocessing step frequently improves the effectiveness of the metric 𝑑𝑄𝐶𝐷 in a clustering context. More
(1) (𝑛)
precisely, given a set of 𝑛 MTS, {𝑋 𝑡 , . . . , 𝑋 𝑡 }, the previously mentioned QCD-based features are extracted
from each series, thus providing the set Ψ = {Ψ(1) , . . . , Ψ(𝑛) }. These vectors are transformed by means of
(1) (𝑛)
PCA, obtaining the set of score vectors Ψ𝑃 𝐶𝐴 = {Ψ𝑃 𝐶𝐴 , . . . , Ψ𝑃 𝐶𝐴 }. The corresponding distance used in
⃦ ⃦2
(1) (2) ⃦ (1) (2) ⃦
the clustering algorithm is 𝑑𝑄𝐶𝐷𝑃 𝐶𝐴 (𝑋 𝑡 , 𝑋 𝑡 ) = ⃦Ψ𝑃 𝐶𝐴 − Ψ𝑃 𝐶𝐴 ⃦ . From now on, we assume that this
is the distance being considered, although the subscript 𝑃 𝐶𝐴 is removed for the sake of simplicity. It is worth
remarking that the use of PCA has been shown to improve the accuracy of clustering algorithms in several contexts
[9, 10, 11, 12].
The metric 𝑑𝑄𝐶𝐷 is used to develop the so-called QCD-based Fuzzy 𝐶-Medoids clustering model (QCD-FCMd),
whose goal is to find the subset of Ψ of size 𝐶, Ψ
̃︀ = {Ψ̃︀ (1) , . . . , Ψ
̃︀ (𝐶) }, and the 𝑛 × 𝐶 matrix of fuzzy coefficients,
𝑈 , solving the minimization problem
𝑛 ∑︁
𝐶 𝐶
⃦ (𝑖) ̃︀ (𝑐) ⃦2
∑︁ ⃦ ⃦ ∑︁
min 𝑢𝑚
𝑖𝑐 ⃦Ψ − Ψ ⃦ with respect to 𝑢𝑖𝑐 = 1 ∀𝑖 and 𝑢𝑖𝑐 ≥ 0, (4)
Ψ,𝑈
̃︀
𝑖=1 𝑐=1 𝑐=1
where 𝑢𝑖𝑐 ∈ [0, 1] represents the membership degree of the 𝑖-th series in the 𝑐-th cluster, Ψ̃︀ (𝑐) is the vector of
QCD-based features with regards to the medoid series for the cluster 𝑐, and 𝑚 > 1 is a parameter controlling the
fuzziness of the partition, usually referred to as fuzziness parameter.
Several hyperparameters have to be selected in the model QCD-FCMd. In particular, the number of clusters,
𝐶, the fuziness parameter, 𝑚 and the number of retained principal components, 𝑝, need to be set in advance. We
propose to perform hyperparameter selection through a grid search by considering the four internal clustering
quality indexes given in [13]. It is worth mentioning that the model QCD-FCMd was compared in [7] with several
alternative procedures, clearly outperforming all of them.
Despite its good behaviour, The QCD-FCMd model suffers from a major limitation: all the selected principal
components receive the same importance concerning the distances in (4), thus ignoring that some principal compo-
nents often contain more information than others about the underlying clustering structure. Solving the limitations
of the QCD-FCMd model is one of the main motivations of the present work.
2.2. A weighted fuzzy clustering model based on QCD and PCA
(1) (𝑛)
Consider the set of MTS {𝑋 𝑡 , . . . , 𝑋 𝑡 } and denote by Ψ = {Ψ(1) , . . . , Ψ(𝑛) } the set of QCD-based features.
Assume that the subset of 𝑝 first principal componens was selected. In this way, each Ψ(𝑖) is a 𝑝-dimensional vector,
(𝑖) (𝑖)
Ψ(𝑖) = (Ψ1 , . . . , Ψ𝑝 ), 𝑖 = 1, . . . , 𝑛. We introduce now the Weighted QCD-based Fuzzy 𝐶-Medoids clustering
model (W-QCD-FCMd), which intends to get around the limitations of the QCD-FCMd model by considering the
minimization problem
𝑛 ∑︁
𝐶 𝑝 (︂ (︁ )︁)︂2
∑︁ ∑︁ (𝑖)
min 𝑢𝑚
𝑖𝑐 𝑤𝑘 Ψ𝑘 − Ψ̃︀ (𝑐)
𝑘
Ψ,𝑤,𝑈
̃︀
𝑖=1 𝑐=1 𝑘=1
𝑝
(5)
𝐶
∑︁ ∑︁
subject to 𝑢𝑖𝑐 = 1 and 𝑢𝑖𝑐 ≥ 0, 𝑤𝑘 = 1, 𝑤𝑘 ≥ 0, for 𝑘 = 1, . . . , 𝑝,
𝑐=1 𝑘=1
where 𝑤 = {𝑤1 , . . . , 𝑤𝑝 } is a set of 𝑝 weights and the remaining terms are the same as in (4). The W-QCD-FCMd
model associates the weight 𝑤𝑘 to the 𝑘-th principal component when computing the distance in (5). Since the
weights are variables in the objective function, the set 𝑤 gets properly estimated. Basically, W-QCD-FCMd allows
for an appropriate tuning of the influence of the various principal components when computing the dissimilarity
between MTS. Hence, the algorithm can give more importance to the principal components containing the most
valuable information about the structure of the dataset. The iterative solutions of (5) can be easily obtained through
the Lagrange multipliers method and the procedure is quite similar to the one shown in the Appendix for computing
the solutions of the model presented in Section 3.
In order to show the advantages of the W-QCD-FCMd model over QCD-FCMd model, we constructed a simple
simulation study as follows. We considered a scenario involving four different generating processes. 10 series
were simulated from each process and the corresponding set of 40 MTS was subject to clustering by means of both
models QCD-FCMd and W-QCD-FCMd. The generating processes were bivariate vector moving average processes
with vectorized matrices of coefficients (−0.3, 0, 0, 0), (0.5, 0, 0, 0), (0, 0, 0.3, 0) and (0, 0, 0, 0.7), and Gaussian
innovations. The number of clusters was set to 𝐶 = 4. The series length was set to 𝑇 = 300. Several values for the
fuzziness parameter, 𝑚, and the retained number of principal components, 𝑝, were taken into account. Specifically,
we considered 𝑚 = 1.4, 1.6, 1.8, 2 and 𝑝 = 1, 2, 3, 4, 5.
According to our clustering purpose, the partition defined by the generating processes was assumed to be the true
partition (the ground truth) and the assessment of both approaches was carried out with the fuzzy extension of the
Adjusted Rand Index [14], denoted by FARI. The simulation procedure was repeated 200 trials and average values of
FARI were calculated for each combination of 𝑚 and 𝑝. Table 1 contains the results. It is clear that, overall, the model
W-QCD-FCMd substantially outperformed QCD-FCMd in most of the settings. The only exception was when 𝑝 = 1,
when both models are the same (only one weight is present in W-QCD-FCMd for the first principal component).
We have performed additional numerical experiments, achieving similar results. Therefore, we conclude that the
W-QCD-FCMd model substantially improves the QCD-FCMd model.
𝑚 = 1.4 𝑚 = 1.6 𝑚 = 1.8 𝑚=2
𝑝=1 QCD-FCMd 0.58 0.53 0.50 0.40
W-QCD-FCMd 0.55 0.55 0.49 0.45
𝑝=2 QCD-FCMd 0.70 0.53 0.36 0.26
W-QCD-FCMd 0.86 0.82 0.72 0.55
𝑝=3 QCD-FCMd 0.78 0.53 0.38 0.31
W-QCD-FCMd 0.91 0.84 0.65 0.52
𝑝=4 QCD-FCMd 0.45 0.33 0.23 0.20
W-QCD-FCMd 0.60 0.48 0.38 0.30
𝑝=5 QCD-FCMd 0.32 0.23 0.17 0.13
W-QCD-FCMd 0.28 0.32 0.25 0.18
Table 1
Average FARI obtained by QCD-FCMd and W-QCD-FCMd in the simulation scenario. For each pair (𝑝, 𝑚), the best result is
shown in bold.
3. The QCD-based exponential weighted fuzzy C-medoids
clustering model with spatial component
(1) (𝑛)
In the context of previous sections, we propose to perform partitional fuzzy clustering on {𝑋 𝑡 , . . . , 𝑋 𝑡 } by
using the Weighted QCD-based Exponential Fuzzy C-Medoids clustering model with Spatial component (EW-QCD-
FCMd-S), whose aim is to find the subset of Ψ of size 𝐶, Ψ ̃︀ = {Ψ ̃︀ (1) , . . . , Ψ
̃︀ (𝐶) }, the set of 𝑝 weights 𝑤 =
{𝑤1 , . . . , 𝑤𝑝 }, and the 𝑛 × 𝐶 matrix of fuzzy coefficients, 𝑈 = (𝑢𝑖𝑐 ), solving the minimization problem:
𝐶
𝑛 ∑︁ 𝑝 𝑛 𝐶 𝑛
∑︁ [︁ ∑︁ (𝑖) (𝑐) 2 )︀
]︁ 𝛾 ∑︁ ∑︁ 𝑚 ∑︁ ∑︁
𝑢𝑚 2
𝑝𝑖𝑖′ 𝑢𝑚
(︀
min 𝑖𝑐 1 − exp − 𝛽 𝑤 𝑘 (Ψ𝑘 − Ψ
̃︀
𝑘 ) + 𝑢 𝑖𝑐 𝑖′ 𝑐′
Ψ,𝑤,𝑈
̃︀
𝑖=1 𝑐=1
2 𝑖=1 𝑐=1 ′ ′
𝑘=1 𝑖 =1 𝑐 ∈𝐶𝑐
subject to: (6)
𝐶 𝑝
∑︁ ∑︁
𝑢𝑖𝑐 = 1, 𝑢𝑖𝑐 ≥ 0, for 𝑖 = 1, . . . , 𝑛; 𝑐 = 1, . . . , 𝐶, 𝑤𝑘 = 1, 𝑤𝑘 ≥ 0, for 𝑘 = 1, . . . , 𝑝,
𝑐=1 𝑘=1
where 𝛽 > 0 is a constant tuning the membership degrees to gain robustness to outlying series; 𝑃 = (𝑝𝑖𝑖′ ) is
a symmetric 𝑛 × 𝑛 matrix introducing information on spatial proximity of the statistical units, with 𝑝𝑖𝑖′ ≥ 0 for
𝑖 ̸= 𝑖′ and 𝑝𝑖𝑖 = 0, 𝑖 = 1, . . . , 𝑛; 𝛾 ≥ 0 is a coefficient regulating the effect of the spatial proximity (such as
described below); 𝐶𝑐 = {1, . . . , 𝑐 − 1, 𝑐 + 1, . . . 𝐶} denotes the set of all the clusters except cluster 𝑐; and the
remaining elements are the same as in (5).
The objective function in (6) is formed by two specific terms, which are described in detail below.
𝑛 ∑︁𝐶 [︁ 𝑝
∑︁ ∑︁ (𝑖) )︀]︁
[1] The term 𝑢𝑚 𝑤𝑘2 (Ψ𝑘 − Ψ ̃︀ (𝑐) )2 corresponds, in essence, to the standard term in
(︀
𝑖𝑐 1 − exp − 𝛽 𝑘
𝑖=1 𝑐=1 𝑘=1
the classical fuzzy 𝐶-medoids clustering algorithm but including two main modifications. The Euclidean distance is
replaced by an exponential-type distance to endow the clustering procedure with higher robustness against outliers
[15]. The exponential distance assigns small weights to anomalous points and larger weights to elements from well-
defined and compact clusters (for more details, see [15, 16]). Moreover, this term also considers different weights for
each principal component in order to improve the accuracy of the standard non-weighted version such as argued
in Section 2.
𝑛 𝐶 𝑛
𝛾 ∑︁ ∑︁ 𝑚 ∑︁ ∑︁
[2] The second term, 𝑢𝑖𝑐 𝑖′ 𝑐′ , is a spatial penalty term, which can be seen as a regulariza-
𝑝𝑖𝑖′ 𝑢𝑚
2 𝑖=1 𝑐=1 ′ ′
𝑖 =1 𝑐 ∈𝐶𝑐
tion element. The matrix 𝑃 incorporates the spatial information of the 𝑛 units where the MTS were recorded and
it is often restricted to indicate if two different units are or not contiguous, i.e., 𝑃 is usually constructed as follows:
𝑝𝑖𝑖′ = 1 if MTS 𝑖 is contiguous to MTS 𝑖′ , 0 otherwise. (7)
The goal of the spatial regularization term is penalizing the fact that two contiguous units are located in different
clusters with high membership degrees, since it is expected that neighbouring sites tend to have similar features.
The constant 𝛾 regulates the trade-off between internal cohesion based on the set Ψ and the fact that the clusters
are constituted by adjacent MTS. In this way, large values of 𝛾 force neighbouring spatial units to belong to the
same cluster, whereas when 𝛾 = 0, the spatial information is not taken into account.
The optimal iterative solutions of the minimization problem (6) are given in the Appendix.
Remark 1 (Selection of 𝛽). An appropriate choice of the hyperparameter 𝛽 is totally crucial for a good performance
of the EW-QCD-FCMd-S procedure. The value of 𝛽 is determined in this work as the inverse of the variability in
the data, which is a common choice in the literature. For more details, see [16].
Remark 2 (Selection of 𝛾). The selection of the optimal value for 𝛾 is a complex problem and frequently relies on
heuristic procedures. To that aim, we adapt a procedure given in [17], which is based on maximizing the so-called
within cluster spatial autocorrelation.
A brief simulation study was carried out in order to assess the behaviour of the EW-QCD-FCMd-S model. The
simulation mechanism involved two vector autoregressive processes or order 1, denoted by VAR1 and VAR2 , with
vectorized matrices of coefficients (−0.2, 0.2, 0.2, 0.2) and (0.3, −0.2, 0, 0), respectively, and Gaussian innova-
tions. Ten different series were generated from each process. The first 8 MTS and MTS 19 and 20 pertained to
VAR1 . On the other hand, MTS from 9 to 18 were associated with VAR2 . We encapsulated the spatial information
by means of a matrix 𝑃 indicating that the first 10 MTS belonged to the same area, the same holding for the last
10 MTS. The matrix 𝑃 was constructed as
𝑝𝑖𝑗 = 1, 𝑖, 𝑗 = 1, . . . , 10, 𝑖 ̸= 𝑗, 𝑝𝑘𝑙 = 1, 𝑘, 𝑙 = 11, . . . , 20, 𝑘 ̸= 𝑙. (8)
The remaining entries of the matrix 𝑃 were set to zero. The number of clusters was set to 𝐶 = 2. The series
length was set to 𝑇 = 500. The hyperparameter 𝛽 was chosen to be 𝛽 = 1 for illustrative purposes. Concerning
the number of principal components, 𝑝, we selected the optimal value of 𝑝 ∈ {2, . . . , 6} according to the FARI
attained when 𝛾 = 0 and the true partition is given by the true generating processes, which resulted 𝑝 = 2.
The simulation mechanism was performed for all pairs (𝑚, 𝛾), with 𝑚 = 1.4, 1.6, 1.8, 2 and 𝛾 = 0, 0.01, 0.05,
0.1, 0.2, 0.5. Note that, when 𝛾 = 0, there is no spatial penalization term, whereas when increasing the values of 𝛾,
more importance is given to this term, which is, the clusters are more required to be formed by series in the same
area to a higher extent.
To gain insights into the behaviour of EW-QCD-FCMd-S from a spatial point of view, we recorded, for each pair
(𝑚, 𝛾), the sum of the membership degrees of the MTS 9, 10, 19 and 20 with respect to the cluster in which they
must be located according to the matrix 𝑃 . In other words, we took the membership degrees of MTS 9 and 10 in
the cluster where MTS 1-8 showed a higher membership degree. Analogously, we took the membership degrees of
MTS 19 and 20 in the cluster where MTS 11-18 showed a higher membership degree. The average of the 4 previous
quantities was taken into account. Note that this measure indicates to what extent the spatial requirements defined
by 𝑃 are fulfilled. Results are given in Table 2 concerning 200 simulation trials. As expected, when 𝛾 = 0, the
corresponding quantities are close to 0 as the grouping is made uniquely based on underlying generating processes.
When the value of 𝛾 increases, the MTS 9, 10, 19 and 20 are more affected by the penalization term, constituting
clusters along with MTS in the same areas, and the corresponding measures also increase. Indeed, for 𝛾 = 0.5, the
𝑚 = 1.4 𝑚 = 1.6 𝑚 = 1.8 𝑚=2
𝛾=0 0.05 0.07 0.08 0.08
𝛾 = 0.01 0.06 0.08 0.11 0.14
𝛾 = 0.05 0.17 0.23 0.27 0.29
𝛾 = 0.1 0.44 0.45 0.44 0.43
𝛾 = 0.2 0.78 0.69 0.64 0.58
𝛾 = 0.5 0.90 0.86 0.81 0.76
Table 2
Average membership degree of MTS 9, 10, 19 and 20 with respect to the cluster they must belong according to the matrix 𝑃 .
spatial requirement gets so stringent that the series are located in their spatial clusters with very high membership
degrees.
4. Application. Spatial clustering of Spanish regions based on
mobility time series during COVID-19 pandemic
In this section we develop a study case related to the non-supervised classification of geographical zones in terms
of their temporal records of some mobility indicators concerning the COVID-19 pandemic.
In particular, we considered mobility time series observed during COVID-19 outbreak in the 17 autonomous
communities of Spain. Specifically, we recorded daily data about mobility in (i) retail and recreation places, (ii)
groceries and pharmacies and (ii) transit stations. The three variables are measured as a percentage of change with
respect to a baseline level associated with the pre-pandemic stage. The sample period spans from 15th February 2020
to 22nd May 2021, thus resulting serial realizations of length 𝑇 = 1096. Each community is described by means of
a trivariate series indicating the daily percent of change in mobility with respect to the mentioned locations. Note
that, from an epidemological point of view, it is reasonable to think that the joint behaviour of these three quantities
is different depending on the geographical location of each region.
The EW-QCD-FCMd-S model was applied to the set of 17 series. Note that 5 hyperparameters had to be chosen,
namely 𝑝, 𝐶, 𝑚, 𝛽 and 𝛾. To choose 𝑝, 𝐶 and 𝑚, we applied a grid search based on the four internal clustering
quality indexes presented in [13]. To this aim, we considered the W-QCD-FCMd clustering model, which does not
require the selection of 𝛽 and 𝛾. The optimal combination was (𝑝, 𝑚, 𝐶) = (2, 1.7, 4). Given the selected value
𝑝 = 2, the hyperparameter 𝛽 was chosen according to the procedure indicated in [18]. The corresponding value
was 𝛽 = 0.432. The matrix 𝑃 was constructed so that 𝑝𝑖𝑗 = 1 if communities 𝑖 and 𝑗 are adjacent and 𝑝𝑖𝑗 =0
otherwise. In this way, the only communities lacking neighbours are the Canary Islands and the Balearic Islands.
The optimal value of the tuning parameter 𝛾 was chosen according to the procedure described in [17]. The optimal
value of 𝛾 was 𝛾 = 1.76. The optimal weights were 0.82 and 0.18 for the first and second principal component,
respectively.
Table 3 shows the membership degrees concerning the 4 clustering solution given by EW-QCD-FCMd-S. For
each community, the highest value for the membership degree was highlighted in bold. According to Table 3, there
exist a small cluster, 𝐶4 , including the autonomous communities of Asturias (AS) and Cantabria (CA). None of the
remaining communities show a high membership degree in this cluster. There are three more clusters, 𝐶1 , 𝐶2
and 𝐶3 , each one containing 5, 4, and 6 communities, respectively. Interestingly, the community of Castile and
León (CL) presents membership degrees which are spread out between the 4 clusters. Hence, this region could be
considered an outlier.
In order to clarify the results in Table 3, we have depicted in Figure 1 a map of Spain where the communities
were coloured according to the underlying crisp clustering partition (blue, red, yellow and green colours for Clus-
ters 𝐶1 , 𝐶2 , 𝐶3 and 𝐶4 , respectively). The abbreviation of each region given in Table 3 was incorporated. It is
clear from Figure 1 that the resulting partition is highly interpretable from a spatial point of view, suggesting that
the climate and other geographical conditions could have influenced how people moved during the COVID-19 pan-
demic. Clusters 𝐶3 and 𝐶4 are formed mainly by communities located in the northern part of the country, with
prototypes Galicia (GA) and AS, respectively. On the other hand, Cluster 𝐶1 is constituted by communities close
to the Mediterranean Sea. The medoid of Cluster 𝐶1 is the community of Castile-La Mancha (CM). It is worth
Community 𝐶1 𝐶2 𝐶3 𝐶4
Andalusia (AN) 0.71 0.19 0.08 0.03
Aragon (AR) 0.25 0.54 0.21 0.01
Asturias (AS) 0.00 0.00 0.00 1.00
Balearic Islands (BI) 0.77 0.13 0.04 0.05
Basque Country (BC) 0.21 0.30 0.44 0.06
Canary Islands (CI) 0.00 1.00 0.00 0.00
Cantabria (CB) 0.12 0.14 0.17 0.58
Castile and León (CL) 0.26 0.29 0.35 0.10
Castile-La Mancha (CM) 1.00 0.00 0.00 0.00
Catalonia (CT) 0.62 0.18 0.10 0.10
Community of Madrid (MD) 0.09 0.85 0.04 0.02
Extremadura (EX) 0.04 0.17 0.74 0.05
Galicia (GA) 0.00 0.00 1.00 0.00
La Rioja (RI) 0.03 0.07 0.89 0.01
Navarre (NC) 0.06 0.12 0.81 0.01
Region of Murcia (MC) 0.23 0.70 0.03 0.04
Valencian Community (VC) 0.76 0.16 0.04 0.04
Table 3
Membership degrees for the 17 Spanish communities by considering the EW-QCD-FCMd-S model and a 4 cluster partition. For
each community, the highest value for the membership degree was highlighted in bold.
Figure 1: Map of Spain highlighting the crisp clustering partition given by the model EW-QCD-FCMd-S. Each community
is represented in a different colour according to its cluster (blue, red, yellow and green colour for cluster 𝐶1 , 𝐶2 , 𝐶3 and 𝐶4 ,
respectively). A darker colour was used for the medoid communities.
remarking that the only Mediterranean area not included in Cluster 𝐶1 from a crisp point of view, the region of
Murcia (MC), shows a relatively high membership in this group, 0.23. Finally, Cluster 𝐶2 is the more heterogeneous
cluster from a geographical perspective. This cluster is composed by a northern community, Aragon (AR), a central
community, Madrid (MD), a southern zone, Murcia (MC), and the insular area of the Canary Islands (CI), which
corresponds to the medoid. These islands are located far away from the Iberian Peninsula although they are shown
near Andalusia (AN) in Figure 1 for the sake of simplicity. It is important to emphasize that, although the elements
in Cluster 𝐶2 are not spatially connected, the strong similarity shown by the corresponding MTS offsets the spatial
penalization term and provokes the formation of this group. Further investigations based on epidemiological data
will be carried out so as to explain the formation of this group.
5. Concluding remarks
In this work, we have extended a fuzzy clustering model for multivariate time series based on QCD and PCA.
The former model employs a set of features obtained through the smoothed CCR-periodogram and performs the
traditional fuzzy 𝐶-medoids clustering algorithm by considering the squared Euclidean distance in the principal
components space.
The proposed extension is based on two tools. On the one hand, a system of weights is introduced in the objective
function so that more importance is given to the principal components having more discriminative ability in terms
of the clustering structure. On the other hand, a penalization term which permits to take into account spatial
information is considered. The resulting model uses the exponential distance, thus attaining robustness against
outlying series.
We have showed that the consideration of the weighting system is totally advantageous in terms of clustering
effectiveness. The behaviour of the model according to the spatial penalization term was also analysed by means of
a brief example. Finally, the proposed technique was applied to cluster a dataset containing COVID-19 data where
the MTS come from different geographic locations, leading to illuminating conclusions.
Appendix
Here we derive the iterative solutions of the constrained minimization problem (6) via the Lagrangian multiplier method.
First, consider the corresponding Lagrangian function taking the form:
𝐿(𝑈 , 𝑤, 𝜆, 𝜌) =
𝑛 ∑︁
𝐶 𝑝 𝑛 ∑︁𝐶 𝑛
̃︀ (𝑐) )2 + 𝛾
[︁ )︀]︁
(𝑖)
∑︁ ∑︁ ∑︁ ∑︁ ∑︁
𝑢𝑚 𝑤𝑘2 (Ψ𝑘 − Ψ 𝑢𝑚 𝑝𝑖𝑖′ 𝑢𝑚
(︀
𝑖𝑐 1 − exp − 𝛽 𝑘 𝑖𝑐 𝑖′ 𝑐′′
𝑖=1 𝑐=1 𝑘=1
2 𝑖=1 𝑐=1 ′ ′′ 𝑖 =1 𝑐 ∈𝐶𝑐
𝑛
∑︁ 𝐶 𝑝
(︀ ∑︁ (︀ ∑︁
(9)
)︀ )︀
− 𝜆𝑖 𝑢𝑖𝑐 − 1 − 𝜌 𝑤𝑘 − 1 ,
𝑖=1 𝑐=1 𝑘=1
where 𝜆 = {𝜆1 , . . . , 𝜆𝑛 } and 𝜌 stand for the Lagrange multipliers concerning the constraints of the membership degrees and
the weights, respectively.
By fixing 𝑤 and setting equal to zero the partial derivatives of 𝐿 with respect to 𝑢𝑖𝑐 and 𝜆𝑖 , for arbitrary 𝑖 ∈ {1, . . . , 𝑛} and
𝑐 ∈ {1, . . . , 𝐶}, we obtain that
𝜕𝐿(𝑈 , 𝑤, 𝜆, 𝜌) 𝜕𝐿(𝑈 , 𝑤, 𝜆, 𝜌)
= 0, = 0,
𝜕𝑢𝑖𝑐 𝜕𝜆𝑖
is equivalent to
[︁ 𝑝 𝑛 ]︁ 𝐶
(𝑖) ̃︀ (𝑐) )2 + 𝛾
∑︁ ∑︁ ∑︁ ∑︁
𝑚𝑢𝑚−1 2 𝑚
(10)
(︀ )︀
𝑖𝑐 1 − exp − 𝛽 𝑤𝑘 (Ψ𝑘 − Ψ 𝑘 𝑝𝑖𝑖 ′ 𝑢 ′
𝑖 𝑐′′ − 𝜆 𝑖 = 0, 𝑢𝑖𝑐′ − 1 = 0.
𝑘=1 𝑖′ =1 𝑐′′ ∈𝐶𝑐 𝑐′ =1
From the first equation in (10) follows
1 𝑝 𝑛
(︂
𝜆𝑖
)︂
𝑚−1 [︁ ]︁ −1
(𝑖) ̃︀ (𝑐) )2 + 𝛾
∑︁ ∑︁ ∑︁ 𝑚−1
𝑤𝑘2 (Ψ𝑘 − Ψ 𝑝𝑖𝑖′ 𝑢𝑚 (11)
(︀ )︀
𝑢𝑖𝑐 = 1 − exp − 𝛽 𝑘 𝑖′ 𝑐′′ .
𝑚 𝑘=1 𝑖′ =1 𝑐′′ ∈𝐶𝑐
Replacing (11) in the second equation of (10), we obtain
1 𝐶 [︁ 𝑝 𝑛
(︂
𝜆𝑖
)︂
𝑚−1 ′ ]︁ −1
(𝑖) ̃︀ (𝑐 ) )2 + 𝛾
∑︁ ∑︁ ∑︁ ∑︁ 𝑚−1
𝑤𝑘2 (Ψ𝑘 − Ψ 𝑝𝑖𝑖′ 𝑢𝑚
(︀ )︀
1 − exp − 𝛽 𝑘 𝑖′ 𝑐′′ = 1, (12)
𝑚
𝑐′ =1 𝑘=1 𝑖′ =1 𝑐′′ ∈𝐶𝑐′
which yields
(︂ )︂ 1 [︃ 𝐶 [︂ 𝑝 𝑛 ]︂ −1 ]︃−1
𝜆𝑖 𝑚−1 ∑︁ ∑︁ (𝑖) ′
̃︀ (𝑐 ) )2 + 𝛾
∑︁ ∑︁ 𝑚−1
𝑤𝑘2 (Ψ𝑘 − Ψ 𝑝𝑖𝑖′ 𝑢𝑚 (13)
(︀ )︀
= 1 − exp − 𝛽 𝑘 𝑖′ 𝑐′′ .
𝑚
𝑐′ =1 𝑘=1 𝑖′ =1 𝑐′′ ∈𝐶𝑐
Finally, by replacing (13) in (11), we conclude that
[︃ 𝐶
(︃ (︀ ∑︀𝑝 2 (𝑖) ̃︀ (𝑐) 2
)︀ )︃ 1 ]︃−1
∑︁ 1 − 𝑒 −𝛽 𝑘=1 𝑤𝑘 (Ψ𝑘 −Ψ𝑘 ) + 𝑆1 𝑚−1
𝑢𝑖𝑐 = (︀ ∑︀𝑝 (𝑖) ̃︀ (𝑐′ ) 2
)︀ , (14)
2
𝑐′ =1 1 − 𝑒 −𝛽 𝑘=1 𝑤𝑘 (Ψ𝑘 −Ψ𝑘 ) + 𝑆2
𝑛
∑︁ ∑︁ 𝑛
∑︁ ∑︁
with 𝑆1 = 𝛾 𝑖′ 𝑐′′ , 𝑆2 = 𝛾
𝑝𝑖𝑖′ 𝑢𝑚 𝑖′ 𝑐′′ . Expression (14) gives the iterative solution for the member-
𝑝𝑖𝑖′ 𝑢𝑚
𝑖′ =1 𝑐′′ ∈𝐶𝑐 𝑖′ =1 𝑐′′ ∈𝐶𝑐′
ship degrees.
On the other hand, the optimal weights 𝑤𝑘 can be determined in a similar way. Now, we fix 𝑢𝑖𝑐 and set equal to zero the
partial derivatives of 𝐿 with respect to 𝑤𝑘 and the Lagrange multiplier 𝜌 so that an equivalence occurs between
𝜕𝐿(𝑈 , 𝑤, 𝜆, 𝜌) 𝜕𝐿(𝑈 , 𝑤, 𝜆, 𝜌)
= 0, = 0,
𝜕𝑤𝑘 𝜕𝜌
and
𝑛 ∑︁
𝐶 𝑝 𝑝
(𝑖) ̃︀ (𝑐)
)︀ (𝑖)
̃︀ (𝑐) )2 − 𝜌 = 0,
∑︁ ∑︁ ∑︁
𝑢𝑚 𝑤𝑘2′ (Ψ𝑘′ − Ψ )2 × − 2𝑤𝑘 𝛽 (Ψ𝑘 − Ψ (15)
(︀ )︀ (︀
𝑖𝑐 exp − 𝛽 𝑘′ 𝑘 𝑤𝑘′ − 1 = 0.
𝑖=1 𝑐=1 𝑘′ =1 𝑘′ =1
From the first equation in (15), we can write 𝑤𝑘 as follows:
[︂ 𝑛 𝐶 𝑝
)︀ −1
]︂
𝜌 ∑︁ ∑︁ 𝑚 (𝑖) ̃︀ (𝑐) 2 ∑︁ (𝑖) ̃︀ (𝑐)
𝑤𝑘2′ (Ψ𝑘′ − Ψ 2
(16)
(︀
𝑤𝑘 = − 𝑢𝑖𝑐 (Ψ𝑘 − Ψ𝑘 ) × exp − 𝛽 𝑘 ′ ) .
2𝛽 𝑖=1 𝑐=1 ′ 𝑘 =1
By replacing (16) in the second equation of (15), we have
𝑝 𝑛 ∑︁
𝐶 𝑝
)︀ −1
[︂ ∑︁ ]︂
𝜌 ∑︁ (𝑖) ̃︀ (𝑐) 2
∑︁ (𝑖) ̃︀ (𝑐)
𝑢𝑚 𝑤𝑘2′′ (Ψ𝑘′′ − Ψ )2 (17)
(︀
− 𝑖𝑐 (Ψ𝑘′ − Ψ𝑘′ ) × exp − 𝛽 𝑘′′
= 1,
2𝛽 ′
𝑘 =1 𝑖=1 𝑐=1 𝑘′′ =1
which yields
[︃ 𝑝 [︂ ∑︁
𝑛 ∑︁
𝐶 𝑝 ]︂−1 ]︃−1
𝜌 ∑︁ (𝑖) (𝑐) 2
∑︁ (𝑖) (𝑐) 2 )︀
𝑢𝑚 2
(18)
(︀
− = 𝑖𝑐 (Ψ 𝑘′
− Ψ
̃︀
𝑘′
) × exp − 𝛽 𝑤𝑘′′ (Ψ 𝑘′′
− Ψ
̃︀
𝑘′′
) .
2𝛽
𝑘′ =1 𝑖=1 𝑐=1 𝑘′′ =1
By plugging in (16) the expression for −𝜌/2𝛽 in (18), we obtain the iterative solution for the weights, which takes the form:
𝑝 (𝑖)
]︃−1
̃︀ (𝑐) 2
[︃ ∑︀𝑛 ∑︀𝐶 𝑚
𝑐=1 𝑢𝑖𝑐 (Ψ𝑘 − Ψ𝑘 ) 𝑑𝑒𝑥𝑝 (𝑖, 𝑐)
∑︁ 𝑖=1
𝑤𝑘 = (𝑖)
, (19)
∑︀𝑛 ∑︀𝐶 𝑚 ̃︀ (𝑐) 2
𝑘′ =1 𝑖=1 𝑐=1 𝑢𝑖𝑐 (Ψ𝑘′ − Ψ𝑘′ ) 𝑑𝑒𝑥𝑝 (𝑖, 𝑐)
(𝑖 ) ̃︀ (𝑖2 ) )2 .
where for the sake of simplicity we have used the notation 𝑑𝑒𝑥𝑝 (𝑖1 , 𝑖2 ) = exp − 𝛽 𝑝𝑘=1 𝑤𝑘2 (Ψ𝑘 1 − Ψ
(︀ ∑︀ )︀
𝑘
References
[1] T. W. Liao, Clustering of time series data—a survey, Pattern recognition 38 (2005) 1857–1874.
[2] S. Aghabozorgi, A. S. Shirkhorshidi, T. Y. Wah, Time-series clustering–a decade review, Information Systems 53 (2015) 16–38.
[3] B. Lafuente-Rego, J. A. Vilar, Clustering of time series using quantile autocovariances, Advances in Data Analysis and classification 10 (2016) 391–415.
[4] J. A. Vilar, B. Lafuente-Rego, P. D’Urso, Quantile autocovariances: a powerful tool for hard and soft partitional clustering of time series, Fuzzy Sets and Systems 340
(2018) 38–72.
[5] P. D’Urso, E. A. Maharaj, Autocorrelation-based fuzzy clustering of time series, Fuzzy Sets and Systems 160 (2009) 3565–3589.
[6] P. D’Urso, E. A. Maharaj, Wavelets-based clustering of multivariate time series, Fuzzy Sets and Systems 193 (2012) 33–61.
[7] Á. López-Oriona, J. A. Vilar, Quantile cross-spectral density: A novel and effective tool for clustering multivariate time series, Expert Systems with Applications 185
(2021) 115677.
[8] J. Baruník, T. Kley, Quantile coherency: A general measure for dependence between cyclical economic variables, The Econometrics Journal 22 (2019) 131–152.
[9] J. Xue, C. Lee, S. G. Wakeham, R. A. Armstrong, Using principal components analysis (pca) with cluster analysis to study the organic geochemistry of sinking particles
in the ocean, Organic Geochemistry 42 (2011) 356–367.
[10] N. Gaitani, C. Lehmann, M. Santamouris, G. Mihalakakou, P. Patargias, Using principal component and cluster analysis in the heating evaluation of the school building
sector, Applied Energy 87 (2010) 2079–2086.
[11] C. Ding, X. He, K-means clustering via principal component analysis, in: Proceedings of the twenty-first international conference on Machine learning, 2004, p. 29.
[12] S. Raychaudhuri, J. M. Stuart, R. B. Altman, Principal components analysis to summarize microarray experiments: application to sporulation time series, in: Biocom-
puting 2000, World Scientific, 1999, pp. 455–466.
[13] K. Zhou, C. Fu, S. Yang, Fuzziness parameter selection in fuzzy c-means: the perspective of cluster validation, Science China Information Sciences 57 (2014) 1–8.
[14] R. J. Campello, A fuzzy extension of the rand index and other related indexes for clustering and classification assessment, Pattern Recognition Letters 28 (2007)
833–841.
[15] K.-L. Wu, M.-S. Yang, Alternative c-means clustering algorithms, Pattern recognition 35 (2002) 2267–2278.
[16] P. D’Urso, L. De Giovanni, R. Massari, Time series clustering by a robust autoregressive metric with application to air pollution, Chemometrics and Intelligent
Laboratory Systems 141 (2015) 107–124.
[17] R. Coppi, P. D’Urso, P. Giordani, A fuzzy clustering model for multivariate spatial time series, Journal of Classification 27 (2010) 54–88.
[18] P. D’Urso, L. De Giovanni, E. A. Maharaj, R. Massari, Wavelet-based self-organizing maps for classifying multivariate time series, Journal of Chemometrics 28 (2014)
28–51.