<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Automatic Time-Series Clustering via Network Inference</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kohei Obata</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yasuko Matsubara</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Koki Kawabata</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yasushi Sakurai</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>SANKEN, Osaka University</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Given a collection of multidimensional time-series that contains an unknown type and number of network structures between variables, how eficiently can we find typical patterns and their points of variation? How can we interpret important relationships with obtained patterns? In this paper, we propose a new method of model-based clustering, which we call network clustering via graphical lasso (NGL). Our method has the following properties: (a) Interpretable: it provides interpretable network structures and cluster assignments for the data; (b) Automatic: it determines the optimal cut points and the number of clusters automatically; (c) Accurate: it provides reliable clustering performance thanks to the automated algorithm. We evaluate our NGL algorithm on both real and synthetic datasets, obtaining interpretable network structure results and outperforming state-of-the-art baselines in terms of accuracy.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;time-series</kwd>
        <kwd>network structure</kwd>
        <kwd>graphical lasso</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>in most cases, we do not know the optimal number of
clusters in advance.</p>
      <p>
        Many applications generate time-series data including In this paper, we propose an automatic algorithm,
those used in automobiles [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], biology, social networks called network clustering via graphical lasso (NGL),
and in relation to financial data. In most cases, these which enables us to summarize multidimensional
timedata are multidimensional, and it is important to find series into meaningful patterns eficiently based on the
typical patterns, which have a specific network structure. graphical lasso problem. Intuitively, the problem we wish
In practice, real-life data have multiple distinct patterns, to solve is as follows.
which diferentiate their network structures. For
examInformalProblem 1. Given a large collection of
mulple, automobile sensor data from a driving session can be
composed of some basic actions and some abrupt actions tidimensional time-series data with underlying network
structures , Find a compact description of , which
(i.e., going straight, turning right, turning left, slowing
down, sudden braking, sudden turning). The network consists of:
structure is equal to the graph structure. In this case, sen- 1. a set of segments and their cut points
sors can be represented as nodes, and sensor interactions 2. a set of segment groups (i.e., clusters) of similar
can be represented as edges. For a turning action, lateral network structures
acceleration and steering angle may have an edge and
for a braking action, brake pedal stroke and longitudinal 3. the optimal number of clusters
acceleration may have an edge.
      </p>
      <p>
        In this paper, we focus on finding a network structure Contrast with Competitors. We will compare NGL
automatically from multidimensional time-series data. with existing methods from the viewpoint of network
Understanding the structure of these networks is useful inference. Network estimation with time-series
informabecause it allows us to devise models of sensor interac- tion has been studied as a method for analyzing
ecotion, which can be used to analyze such behaviours as nomic data and biological signal data because of the
fossil-eficient driving. However, there are many network high interpretability of its graphical model [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Graphical
structures in the data, which change over time, and it is lasso [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ] is a network estimation method that provides
dificult to find a meaningful segmentation point since no an interpretable sparse inverse covariance matrix due to
one knows how data change. Moreover, the number of the ℓ1-norm. Time varying graphical lasso (TVGL) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
clusters should be selected automatically to find abrupt is a network estimation method that takes time
informachanges or for an extension to online learning, because tion into account. Although this method can find change
points by comparing the network structure before and
V$LDobBa’2ta28,8S@eps0a5n–k9e,n2.0o2s2k,aS-yud.ance.jyp, A(Ku.stOrablaiata); after a change, it can’t find clusters. Toeplitz inverse
yasuko@sanken.osaka-u.ac.jp (Y. Matsubara); covariance-based clustering (TICC) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and time adaptive
koki@sanken.osaka-u.ac.jp (K. Kawabata); Gaussian model (TAGM) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] are clustering methods based
yasushi@©sa20n22kPeronce.eodsinagskoaft-hueV.aLDcB.j2p022(YPh.DSWaokrkushropa,iS)eptember 5, 2022. Sydney, on network structure. TICC uses Markov random fields
CPWrEooUrckReshdoinpgs IhStpN:/c1e6u1r3-w-0s.o7r3g ACCureEsatrtUaivlieRa.CCoWompmyoroirngkshtLs(iCche)no2s0pe2A2Ptftorrribotuhctisioepnea4pd.e0riInbnytgeirtsnsaa(tuiCothnoEarlsU(.CUCRseB-YWpe4r.m0S)i..ttoedrgun)der
(tMionRsFh)iapnsdamTooenpglivtzarmiaabtlreics.esTAtoGcMapitsuarefuthsieoinnhoefraenhitdrdeelanMarkov model (HMM) and a Gaussian mixture model for  in the -th cluster, the full parameter set that we
(GMM). These methods find clusters depending on the want to estimate consists of  = {1, 2, ...,  }
network structure of each subsequence. This provides and the number of clusters .
clusters with interpretability and allows us to discover
patterns that other traditional clustering methods are 2.2. Graphical lasso problem
unable to find. Both incorporate a graphical lasso to
capture the interaction between variables but require the We first consider the static inference that estimates  .
number of clusters to be specified as prior information. The optimization problem is written as follows:
Consequently, only our approach satisfies the need for
interpretability and find the optimal number of clusters minimize ∈+ + − (,  ) +  || ||,1,
automatically.
(,  ) = ||(log det   −  ( )),
Contributions. The main contribution of this work is
the concept and design of NGL, which has the following
desirable properties:
1. Interpretable: NGL provides the underlying
graphical structures and cluster assignments in
data, which help us to interpret important
relationships between variables.
2. Automatic: We formulate data encoding
schemes to reveal distinct patterns/clusters, each
of which captures the network structure. The
proposed method requires no parameter tuning
to specify the number of clusters.
3. Accurate: NGL is a simple yet powerful
algorithm for time-series segmentation using a
graphical lasso, which outperforms state-of-the-art
competitors in terms of accuracy.
      </p>
      <p>where  is the empirical covariance
(1/||) ∑︀||   ,  are the diferent samples,</p>
      <p>=1
+ + is the space of the positive definite matrices,
 determines the sparsity level of the network, and
‖ · ‖ ,1 is the of-diagonal ℓ1-norm. This is a convex
optimization problem, which imposes the ℓ1-norm
restriction.</p>
      <sec id="sec-1-1">
        <title>2.3. TVGL problem</title>
        <p>
          To infer a time-varying sequence of networks, TVGL [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]
extends the above approach, which is designed to infer a
set of inverse covariance matrices Θ . TVGL solves the
problem below:
minimize ∈+ +
 
∑︁ − (,  ) +  || ||,1 +  ∑︁  (  −  − 1),
=1 =2
where  determines how strongly correlated neighboring
2. Preliminary covariance estimations should be. The penalty function
 encourages similarity between   and  − 1. Diferent
In this paper we investigate an automatic network struc- types of  allow us to enforce diferent restrictions in the
ture clustering for large multidimensional time-series time-varying similarity. This problem is solved by
emdata. We will describe a few key concepts and back- ploying the alternating direction method of multipliers
ground materials in this section. (ADMM) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], which is a well-established method for
solving the convex optimization problem. For more details,
2.1. Problem definition please see, e.g., [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Although TVGL can find a changing
point by comparing   and  − 1, it cannot find a cluster
Consider a set of -dimensional time-series data con- simultaneously. Throughout this paper our method uses
sisting of  sequential bundle observations,  = TVGL to optimize the graphical lasso problem.
{1, 2, ...,  } and there are || ≥ 1 diferent
observations at each time .  ∈ R is the -th multidimensional
bundle observation, and each bundle observation vector 3. Algorithms
is sampled from any  multivariate normal distributions
vgarar∼ipahblset(rru0e,cptΣruerse)e.,niTntshweahnniecothdweao,grakinvdsetnrtuhcet uc∼orevaisr(i0eaqn,Σuceal)mt,oaetatrhcihxe (Tianh)veherposrweevctiooovudasersisacenrcitcbieoenmthdaeetrsmiccroeidbs eeΘld,.h(bNo)wohwotowtheetsotqimfinudeastoetipoatnismsetaalroef
Σ  forms an edge. Our goal is to find the cluster assign- cut points, and (c) how to assign segments to optimal
ments of these  bundle observations into  clusters clusters. There are three main ideas behind our model:
ℱ = {ℱ1, ℱ2, ..., ℱ }, where ℱ is a cluster assign- 1. Model description cost: We use the minimum
ment set of  ⊂  (..  = 1, 2, ..., ) represented description length (MDL) principle as a model
by a set of  matrices, i.e., Θ = { 1,  2, ...,   }. There- selection criterion for choosing between
alternafore, letting  = { , ℱ} be a compact description tive segmentation and cluster descriptions. We
propose a novel cost function to estimate the de- 3.2.1. MergeSegment (inner loop)
scription cost of the graphical lasso model.
        </p>
        <p>
          Assuming that neighboring segments tend to belong to
2. CutPointSearch: We modify the generic bottom- the same cluster, we update cut points through
Mergeup algorithm [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] to enhance its ability to han- Segment. We consider given cut points  = {0, 1,
dle time-series data. Initially we adopt short seg- ..., }, and a set of inverse covariance matrices at each
ments and iteratively merge with an adjacent pair segment Θ  = { ,0 ,  0,1 , ...,  ,}, where the
numthat satisfies the cost restriction. ber of segments is  + 1. And the set of inverse
covariance matrices at each segment, which consists of only
even/odd-numbered cut points Θ  = { ,0 ,  0,2 , ...}
3. NGL: We use the EM algorithm to cluster the
segments obtained by CutPointSearch while
determining the optimal number of clusters
automatically.
        </p>
        <p>and Θ  = { ,1 ,  1,3 , ...}. , , , , and  ,
are the data, model, and inverse covariance matrix from
cut points  to  . Our goal is to determine if a
segment should be merged with its neighboring segment. As
3.1. Model description cost shown in Figure 1, we have three candidates as updated
cut points: (a) Solo has three segments all separated, (b)
The MDL explains the model in a parsimonious way that Left and (c) Right have two segments in which one side
calculates the required number of bits. Thus, it follows is merged. We compare the MDL costs using Equation
the assumption that the more we can compress the data, (1), in these three cases, (a) vs. (b) vs. (c), and select the
the more we can generalize its underlying structures. best cut points so that they minimize the local MDL cost.
In a nutshell, we want to find the minimum number of For example, if (b) has the lowest cost, +2 is added to
graphical lasso models needed to express the data. The the updated cut points. If (a) has the lowest cost, there is
goodness of the model  can be described as follows: no change from the previous cut points. We iterate this
process throughout the whole sequence.
&lt; ;  &gt; = &lt;  &gt; + &lt; | &gt;,</p>
        <p>(1)
where &lt;  &gt; shows the cost of describing the model
 , and &lt; | &gt; represents the cost of describing the
data  given the model  .</p>
        <p>Model Coding Cost. The description complexity of
model  is the sum of the following elements: The
number of clusters  requires log* (). 1 The total number of
observations of each cluster requires ∑︀</p>
        <p>=1 log* (|ℱ|).</p>
        <p>The mean value of each cluster which has a size  × 1,
requires ∑︀=1( ×  ). The inverse covariance
matrix of each cluster which has a size  × , requires
∑︀</p>
        <p>
          =1 | |̸=0(2 log()+ )+log* (| |̸=0), where |·| ̸=0
describes the number of non-zero elements in a matrix
and  is the floating point cost. 2
Data Coding Cost. Given a model  , encoding cost of
the data  using Hufman coding [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] is computed by:
&lt; | &gt;= ∑︀=1 (,  ). Our next goal is to find
the best model  that minimizes Equation (1).
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>3.2. Automatic cut point detection</title>
        <p>3.2.2. CutPointSearch (outer loop)
This algorithm finds the optimal cut points. We are now
given bundle  and initial cut points . The user decides
the interval of initial cut points. Since TVGL forces a
time-varying similarity with neighboring network, we
calculate Θ  , Θ , and Θ  using the TVGL graphical
lasso optimization method. After obtaining each Θ , we
run the MergeSegment algorithm to update the cut points.
We iterate this process until the cut points are stable.</p>
      </sec>
      <sec id="sec-1-3">
        <title>3.3. Automatic clustering: NGL</title>
        <p>Now we have optimal cut points, which means that there
are a limited number of segments that have enough
samples with which to estimate the network structure. Next,
we assign segments to a cluster and find the optimal
number of clusters. As Algorithm 1 shows, we use the EM
algorithm to classify each segment. For each iteration we
vary  = 1, 2, 3, ..., and minimize the function below:</p>
        <sec id="sec-1-3-1">
          <title>1Here, log* is the universal code length for integers</title>
          <p>2We used 4 × 8 bits in our setting
So far, we have described how we calculate the MDL 
cost for our model. The next question is how to find arg min ∑︁ − (,  ) +  || ||,1, (2)
optimal cut points that minimize MDL cost eficiently; =1
we still have numerous candidates with which to merge
to summarize similar subsequences into a compact model, In the E-step, we assign each segment to the optimal
and thus we modify the bottom-up algorithm to prevent a cluster, so that the log likelihood is maximized. In the
pattern explosion. We answer this question in two steps, M-step, we calculate the   value of each cluster using a
MergeSegment and CutPointSearch. normal graphical lasso optimization algorithm. Until the
cost function increases, we vary  so as to minimize the
cost function.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4. Experiments</title>
      <sec id="sec-2-1">
        <title>We evaluate our method on both synthetic and real datasets.</title>
        <sec id="sec-2-1-1">
          <title>4.1. Accuracy on synthetic data</title>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>In this section, we demonstrate the accuracy of NGL on</title>
        <p>
          synthetic data. We do so because there are clear ground
truth networks with which to test the accuracy.
Experimental Setup. We randomly generate synthetic
multidimensional data in R5, which follows a
multivariate normal distribution  ∼  (0,  − 1). Each of the
 clusters has a mean of ⃗0, so that the clustering
result is based entirely on the structure of the data. For
each cluster, we generate a random ground truth inverse
covariance as follows [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]:
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Baseline Methods. We compare our method to</title>
        <p>
          three state-of-the-art methods and one ablation method.
TICC [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] and TAGM [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] take network structure into
account. Since both methods need to specify the number
of clusters, we gave the true number of clusters only to
these methods. AutoPlait [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] is multi-level HMM based
automatic method for time-series clustering. NGL no-cps
is our NGL method without CutPointSearch. Our method,
including NGL no-cps, requires us to specify initial cut
points. We set its interval at every 5 points throughout
the synthetic experiments.
        </p>
        <p>Clustering Accuracy. We set each segment in each of
the examples to have 100 observations in R5 (for
example, "1,2,1" has a total of 300 observations). Table 1 shows
the clustering accuracy for the macro-F1 scores for each
dataset. As shown, NGL significantly outperforms the
baselines. Our method consistently achieves the highest
accuracy and lowest standard deviation. AutoPlait does
not consider network structure, so it does not find any
clusters. Although we gave the true number of clusters 
to TICC and TAGM, the average accuracy of our method
is more than 10% higher. NGL no-cps shows that
finding a large segment by CutPointSearch has meaning of
grouping adjacent observations into the same cluster.
Efect of Total Number of Samples. We next focus on
the number of samples required for each method to
accurately find clusters. We take the "1,2,3,4,1,2,3,4" example
and vary the number of samples. As shown in Figure 2,
our method outperforms the baselines for almost all
segment lengths. Our method has a constantly high average,
even for relatively small segment lengths. This is because
our CutPointSearch algorithm correctly find cut points
even if the sample size is small.</p>
        <sec id="sec-2-3-1">
          <title>4.2. Case study</title>
          <p>Here, we show that our NGL provides an interpretable
result with real-world financial data. In general, stocks,
bonds, and currency prices are correlated. By examining
historical financial data, we can infer a financial network
1. Set  ∈ R5× 5 equal to the adjacency matrix of structure to reveal the relationships between them. We
an Erdős-Rényi directed random graph, where use hourly currency exchange rate data 3 of AUD/USD,
every edge has a 20% chance of being selected. EUR/USD, GBP/USD, and USD/CAD from 2005 to 2018.
2. For every selected edge in  set  ∼ Assuming that the underlying network structure is
conUniform([− 0.6, − 0.3] ∪ [0.3, 0.6]). We enforce a sistent for a week, we normalized the data for each week.
symmetry constraint whereby every  = . We also set the initial cut points at a week to capture the
weekly correlation trend. The top of Figure 3 shows the
3. Let  =  min() be the smallest eigenvalue of clustering result obtained with NGL. During the global
, and set  =  + (0.1 + ||), where  is an financial crisis (from mid- 2007 to early 2009), we found
identity matrix. that the network structure changed. There are abrupt
changes on 2016/5/16 ∼ 2016/6/5, the bottom of
FigWe run our experiments on four diferent temporal se- ure 3 shows how the correlation changed during this
quences: "1,2,1","1,2,3,2,1","1,2,3,4,1,2,3,4","1,2,2,1,3,3,3,1". period. As we can see, a correlation related to the United
We generate each dataset 10 times and report the mean
and standard deviation of the macro-F1 score.</p>
          <p>Model
1,2,1
1,2,3,2,1
1,2,3,4,1,2,3,4
1,2,2,1,3,3,3,1
Macro-F1 score of clustering accuracy for four diferent temporal sequences, comparing</p>
          <p>NGL with state-of-the-art methods</p>
          <p>TAGM (KDD’21)</p>
          <p>TICC (KDD’18)</p>
          <p>AutoPlait (SIGMOD’14)</p>
          <p>NGL no-cps
ber of samples for NGL and two other state-of-the-art
methods.</p>
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>Kingdom changed significantly. This was in response to the United Kingdom European Union membership referendum on 2016/6/23, which may have caused public concern.</title>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Conclusion and Future work</title>
      <sec id="sec-3-1">
        <title>In this paper, we presented NGL, which is an interpretable</title>
        <p>clustering algorithm. We focused on the problem of the
interpretable clustering of multidimensional time-series
data with underlying network structures. Our proposed</p>
      </sec>
      <sec id="sec-3-2">
        <title>NGL indeed exhibits all the desirable properties; it is</title>
        <p>Interpretable and Automatic and Accurate.</p>
        <p>In future work, we will focus on the following
direction: Online learning. In several situations, network
inference needs to operate in an online fashion. And to
the best of our knowledge, no study has dealt with online
clustering based on network structure. In this context, we
will develop an extension of our methods by utilizing the
novel sliding window and bottom-up (SWAB) algorithm</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
      <sec id="sec-4-1">
        <title>The authors would like to thank the anonymous referees for their valuable comments and helpful suggestions.</title>
      </sec>
      <sec id="sec-4-2">
        <title>This work was supported by JSPS KAK</title>
      </sec>
      <sec id="sec-4-3">
        <title>ENHI Grant-in-Aid for Scientific Research Number</title>
      </sec>
      <sec id="sec-4-4">
        <title>MIC/SCOPE 192107004, JST-AIP JPMJCR21U4, ERCA</title>
      </sec>
      <sec id="sec-4-5">
        <title>Environment Research and Technology Development</title>
      </sec>
      <sec id="sec-4-6">
        <title>Fund JPMEERF20201R02,</title>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Miyajima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nishiwaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ozawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wakita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Itou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Takeda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Itakura</surname>
          </string-name>
          ,
          <article-title>Driver modeling based on driving behavior and its evaluation in driver identiifcation</article-title>
          ,
          <source>IEEE</source>
          <volume>95</volume>
          (
          <year>2007</year>
          )
          <fpage>427</fpage>
          -
          <lpage>437</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Tomasi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tozzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barla</surname>
          </string-name>
          ,
          <article-title>Temporal pattern detection in time-varying graphical models</article-title>
          ,
          <source>in: ICPR</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>4481</fpage>
          -
          <lpage>4488</lpage>
          . doi:
          <volume>10</volume>
          .1109/ICPR48806.
          <year>2021</year>
          .
          <volume>9413203</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Friedman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hastie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tibshirani</surname>
          </string-name>
          ,
          <article-title>Sparse inverse covariance estimation with the graphical lasso</article-title>
          ,
          <source>Biostatistics</source>
          <volume>9</volume>
          (
          <year>2008</year>
          )
          <fpage>432</fpage>
          -
          <lpage>441</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Tomasi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tozzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Salzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Verri</surname>
          </string-name>
          ,
          <article-title>Latent variable time-varying network inference</article-title>
          ,
          <source>in: KDD</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>2338</fpage>
          -
          <lpage>2346</lpage>
          . URL: https://doi.org/10.1145/3219819. 3220121. doi:
          <volume>10</volume>
          .1145/3219819.3220121.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Hallac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. P.</given-names>
            <surname>Boyd</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leskovec</surname>
          </string-name>
          ,
          <article-title>Network inference via the timevarying graphical lasso</article-title>
          ,
          <source>in: KDD</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>205</fpage>
          -
          <lpage>213</lpage>
          . URL: https://doi.org/10. 1145/3097983.3098037. doi:
          <volume>10</volume>
          .1145/3097983.3098037.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Hallac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vare</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. P.</given-names>
            <surname>Boyd</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leskovec</surname>
          </string-name>
          ,
          <article-title>Toeplitz inverse covariance-based clustering of multivariate time series data</article-title>
          ,
          <source>in: KDD</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>215</fpage>
          -
          <lpage>223</lpage>
          . URL: https://doi.org/10.1145/3097983.3098060. doi:
          <volume>10</volume>
          .1145/3097983.3098060.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>V.</given-names>
            <surname>Tozzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ciech</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Garbarino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Verri</surname>
          </string-name>
          ,
          <article-title>Statistical models coupling allows for complex local multivariate time series analysis</article-title>
          ,
          <source>in: KDD</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1593</fpage>
          -
          <lpage>1603</lpage>
          . URL: https://doi.org/10.1145/3447548.3467362. doi:
          <volume>10</volume>
          .1145/3447548. 3467362.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S. P.</given-names>
            <surname>Boyd</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parikh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Chu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Peleato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Eckstein</surname>
          </string-name>
          ,
          <article-title>Distributed optimization and statistical learning via the alternating direction method of multipliers, Found</article-title>
          .
          <source>Trends Mach. Learn</source>
          .
          <volume>3</volume>
          (
          <issue>2011</issue>
          )
          <fpage>1</fpage>
          -
          <lpage>122</lpage>
          . URL: https://doi.org/10.1561/ 2200000016. doi:
          <volume>10</volume>
          .1561/2200000016.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Keogh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Hart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Pazzani</surname>
          </string-name>
          ,
          <article-title>An online algorithm for segmenting time series</article-title>
          ,
          <source>in: Proceedings of the 2001 IEEE International Conference on Data Mining</source>
          ,
          <volume>29</volume>
          <fpage>November</fpage>
          - 2
          <source>December</source>
          <year>2001</year>
          , San Jose, California, USA, IEEE Computer Society,
          <year>2001</year>
          , pp.
          <fpage>289</fpage>
          -
          <lpage>296</lpage>
          . URL: https://doi.org/10.1109/ICDM.
          <year>2001</year>
          .
          <volume>989531</volume>
          . doi:
          <volume>10</volume>
          .1109/ICDM.
          <year>2001</year>
          .
          <volume>989531</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C.</given-names>
            <surname>Böhm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Faloutsos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-Y.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Plant</surname>
          </string-name>
          , Ric:
          <article-title>Parameter-free noise-robust clustering</article-title>
          ,
          <source>TKDD</source>
          <volume>1</volume>
          (
          <year>2007</year>
          )
          <fpage>10</fpage>
          -
          <lpage>es</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>K.</given-names>
            <surname>Mohan</surname>
          </string-name>
          , P. London,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fazel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Witten</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.-I. Lee</surname>
          </string-name>
          ,
          <article-title>Node-based learning of multiple gaussian graphical models</article-title>
          ,
          <source>J. Mach. Learn. Res</source>
          .
          <volume>15</volume>
          (
          <year>2014</year>
          )
          <fpage>445</fpage>
          -
          <lpage>488</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Matsubara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sakurai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Faloutsos</surname>
          </string-name>
          , Autoplait:
          <article-title>Automatic mining of co-evolving time sequences</article-title>
          ,
          <source>in: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD '14</source>
          ,
          <string-name>
            <surname>Association</surname>
          </string-name>
          for Computing Machinery, New York, NY, USA,
          <year>2014</year>
          , p.
          <fpage>193</fpage>
          -
          <lpage>204</lpage>
          . URL: https://doi.org/10.1145/2588555.2588556. doi:
          <volume>10</volume>
          .1145/2588555.2588556.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>