A novel spatio-temporal clustering technique to study the bike
                   sharing system in Lyon

                              Marta Galvani∗                                                        Agostino Torti†
                           University of Pavia                                                    Politecnico di Milano
                                Pavia, Italy                                                           Milan, Italy
                          marta.galvani@unipv.it                                                 agostino.torti@polimi.it

                      Alessandra Menafoglio‡                                                        Simone Vantini§
                         Politecnico di Milano                                                    Politecnico di Milano
                              Milan, Italy                                                             Milan, Italy
                   alessandra.menafoglio@polimi.it                                              simone.vantini@polimi.it

ABSTRACT                                                                         the bike sharing program has been widely adopted by major
In the last decades cities have been changing at an incredible                   regions like Europe, North America and Asia-Pacific.
rate growing the needs of efficient urban transportation to avoid                    Bike sharing systems (BSSs) have become a common feature
traffic jams and high environmental pollution. In this context,                  of all metropolitan areas and according to a 2019 Global Market
bike sharing systems (BSSs) have been widely adopted by major                    Insights, Inc. report, it has been predicted that the fleet size of
regions like Europe, North America and Asia-Pacific becoming a                   bike sharing market will gain over 8% from 2019 to 2025, leading
common feature of all metropolitan areas. Its fast growing has                   the worldwide industry revenue to surpass a valuation of USD
increased the need of new monitoring and forecasting tools to                    10 billion.
take fast decisions and provide an efficient mobility management.                This fast growth has urged scientists in developing suitable moni-
   In this context we focus on the BSS of Lyon in France, called                 toring and forecasting tools to handle with mobility management
Vélo’v. In particular we analyse a dataset containing the loading                and make feasible and efficient future plans [5]. Many studies
profiles of 345 bike stations over one week, treating the data as                have demonstrated that an efficient analysis of the data collected
continuous functional observations over a period of one day. The                 by BSSs can provide good insights for the service design, i.e. for
aim of this work is to identify spatio-temporal patterns on the                  the reallocation strategies optimization, to underly the causes
usage of bike sharing stations, identifying groups of stations and               of network imbalance and for the adjustment of pricing policies
days with similar behaviour, with the purpose of providing useful                ([10], [14], [1]).
information to the fleet managers. To this scope, we develop a                   Most BSSs provide open access to their data regarding the real-
novel bi-clustering algorithm able to deal with functional data,                 time status on their bike stations. In this context we focus on
extending a nonparametric algorithm developed for multivariate                   the BSS of Lyon, called Vélo’v. Launched in 2005, Vélo’v is the
data. This new algorithm is able to find simultaneously subsets                  first bicycle-sharing system in France, with a network of more
of rows and columns with similar behaviour when the elements                     than 3000 bikes spread over 345 stations in Lyon and neighboring
of the dataset are functional objects.                                           Villeurbanne. The service has been developed by street furniture
   Obtained results show that through this analysis it is possible               company JCDecaux for Lyon Metropole and it counts now more
to identify different usage patterns according to the area of the                than 68.500 subscribers.
city and the day of the week.                                                        In this work we analyse a dataset containing the loading pro-
                                                                                 files of 345 bike stations over one week during the period from
                                                                                 Monday 10th March until Sunday 16th March in 2014. The real
1    INTRODUCTION                                                                time data are available at https://developer.jcdecaux.com/ trough
Due to urbanization and globalization in the last decades, cities                an api key and they were first used in [3]. Specifically, for each
have been changing at an incredible rate. In particular a growing                station the number of available bikes divided by the total number
need of urban transportation has increased the number of vehicle                 of bike docks at each hour is recorded.
usage on roads and ultimately led to heavy traffic jams and high                 The aim of our work is to understand the spatio-temporal pat-
environmental pollution. To alleviate the above growing issues,                  terns of the bike stations usage, providing useful information
                                                                                 for the correct management of the service. We are interested in
∗ Department of Mathematics                                                      understanding how bike sharing stations are used according to
† MOX Laboratory for Modeling and Scientific Computing - Department of Mathe-    their spatial position looking at the variability within and be-
matics.                                                                          tween days.
Center for Analysis Decisions and Society, Human Technopole.
‡ MOX Laboratory for Modeling and Scientific Computing - Department of           Due to the continuous dependence on time of our data, we decide
Mathematics.                                                                     to model them making use of tools from Functional Data Analysis
§ MOX Laboratory for Modeling and Scientific Computing - Department of
                                                                                 (FDA), the branch of statistics dealing with curves, surfaces or
Mathematics.                                                                     anything else varying over a continuum (e.g., [13]). In this way
                                                                                 it is possible to consider the within-day variability.
© 2020 Copyright held by its author(s). Published in the Workshop Proceedings    From a statistical point of view, we are facing with a problem of
of the EDBT/ICDT 2020 Joint Conference, (March 30-April 2, 2020, Copenhagen,
Denmark) on CEUR-WS.org:                                                         clustering both the bike stations and the days of the week, which
Use permitted under Creative Commons License Attribution 4.0 International (CC   is know in the literature as a bi-clustering problem. The main aim
BY 4.0)
                                                                                                      Í |I | Í | J |
of bi-clustering (or co-clustering) algorithms is to simultaneously       where f I J (t) = |I |1| J | i=1 j=1 fi j (t). For easiness of interpre-
group individuals and features into homogeneous sets, see [11]            tation we define the bi-cluster template observing only the aver-
for a complete review of bi-clustering methodologies.                     age function in the bi-cluster.
As for each station and for each day we define the bike station           Consequently, the extended H -score of a bi-cluster A′ (I, J ) is:
loading profile as a continuous functional datum, we have found                                                  ∫
ourselves with a problem of bi-clustering functional data. Differ-                                     HI J =        H I J (t)
                                                                                                                      T
ent methodologies, which extend some well known algorithm
                                                                                                     Í |I | Í | J |              2
                                                                          with H I J (t) = |I |1| J | i=1 j=1 fi j (t) − [f I J (t)] .
                                                                                                                      
for clustering multivariate data, have been proposed in the liter-
ature with the aim of clustering functional data [9]. In the same         The pseudo-code of the Algorithm to find a bi-cluster works as
way, bi-clustering methods can be generalized to functional data          expressed in algorithm 1. The algorithm starts considering the
by defining new algorithms able to detect functional bi-clusters.         whole dataset and try to find the biggest bi-cluster with a H -score
Although the concept of bi-clustering was first introduced by             value lower then a given threshold δ . The rows/columns addition
[7] in the 1970s, [4] are recognized as the first ones to propose         and deletion procedures are a natural extension of the ones in-
a bi-clustering algorithm. Subsequently different model based             troduced in [4]. The procedure follows the main structure of the
approaches have been proposed, among them [6] relies on the               original Cheng and Church algorithm, except for the masking
latent block model. Starting from it, [2] introduce a parametric          procedure. Indeed, instead of this step, after finding a new bi-
model able to deal with functional data. Although, as based on            cluster, a set of all the biggest submatrices containing elements
Latent block model, this approach is able to detect exhaustive            not already assigned is found through the Bimax algorithm ([12]).
bi-clustering maintaining a checkerboard structure that does not          Then, in the next iteration, the new bi-cluster is searched inside
always fit with the real situations.                                      the biggest submatrix found. Each time a new bi-cluster is found
In this work we introduce a novel methodology based on the                the set of the submatrices of not assigned elements is updated;
extension of the Cheng and Church algorithm with the aim of               otherwise a new bi-cluster is searched in the following biggest
detecting functional non exclusive bi-clusters. We propose an it-         submatrix in the set.
erative procedure based on a non parametric approach obtaining
a deterministic strategy that does not have to rely on strong mod-         Algorithm 1: Functional Cheng and Church algorithm
elling assumptions of the data, which are generally not consistent
in the FDA framework, and allows for flexible non exclusive bi-              Input: (n, m) matrix A whose elements are functions
clusters.                                                                           fi j (t)
The rest of this paper is organized as follows: in Section 2 we de-                 δ >0 the maximum acceptable H -score
scribe the functional Cheng and Church bi-clustering, discussing                    maxIter the maximum number of allowed
how the extension of the algorithm is implemented. In Section 3                     iterations
the introduced methodology is applied on the Vélo’v BSS and the              Result: A set of Bi-clusters with H -score< δ
main results are reported. In Section 4 conclusions are presented            Set M=A
and discussed, underlying the spatio-temporal patterns found in              while iter < maxIter and submatrices to search in for
the data employing the novel algorithm proposed in this work.                 bi-clusters are present do
                                                                                Given a submatrix M:
2    METHODOLOGY: THE FUNCTIONAL                                                while H -score> δ and deletion/addition is still possible
                                                                                 do
     CHENG AND CHURCH ALGORITHM
                                                                                       (1) Multiple Node Deletion:
Given a dataset arranged in a matrix A composed by n rows                                  remove groups of rows/cols
and m columns, the aim of a bi-clustering technique is to find                         (2) Single Node Deletion:
a submatrix A′ ∈ A, corresponding to a subset of rows I and a                              remove the row/col that reduce H -score
subset of columns J , with a high similarity score. In particular, in                      the most
the Cheng and Church algorithm ([4]), an ideal bi-cluster is a set                     (3) Node Addition:
of rows I and a set of columns J such that each element ai j in the                        add rows/cols that do not make H -score
bi-cluster can be expressed as ai j = a I J + α i + β j ∀i ∈ I and ∀j ∈                    greater than δ
J , where a I J is the average value in the bi-cluster and α i and
                                                                                end
β j are respectively the residue of rows and columns average
                                                                                if A new bi-cluster is found then
value and the total average value a I J . A particular measure of
                                                                                    Apply Bimax to search for all the biggest
goodness is evaluated for a bi-cluster A′ (I, J ) considering a score
                                                                                     submatrices of not assigned elements and select
H which is the Mean Squared Residue between all the real values
                                                                                     the biggest one as M
ai j ∈ A′ (I, J ) and their representative values in the bi-cluster
                                                                                else
a I J + αi + β j .                                                                  Select as M the following biggest submatrix
Extending these concepts to FDA, each element of the dataset                    end
matrix A is a function fi j (t) defined on a continues domain T . In         end
such framework we define an ideal bi-cluster A′ as a subset of
rows I and columns J such that each function belonging to the
bi-cluster A′ (I, J ) can be defined as follows:
                                                                             As in the classical Cheng and Church, the results are sensitive
                                                                          to the choice of the input parameter δ . Indeed, a too high value
                                                                          of δ would imply a unique big bi-cluster, while a too low value
                  fi j (t) = f I J (t) ∀i ∈ I and ∀j ∈ J
                                                                          would imply a really large number of bi-clusters or even the
                              Figure 1: Functions belonging on each bi-cluster with their templates


impossibility to find a bi-cluster. To tune the parameter δ , we        of the weekends (e.g. bi-clusters 4, 5, 6), while some other con-
perform a sensitivity analysis on the number of obtained clus-          sidered stations that have the same pattern during working and
ters, the number of not assigned observations and the sum of            weekend days (e.g. bi-clusters 1, 15, 18).
the H scores over all the found bi-clusters. Then, following the         Turning our attention on the found bi-clusters, Figure 1, it is
same approach used for many other clustering techniques as the          possible to interpret results as a way of segment the city in differ-
classical k-means, we choose the value of δ where an elbow or a         ent activity areas according to the day of the week and the hour
peak are found.                                                         of the day. Observing the usage profiles of the bi-clusters, three
                                                                        main groups can be identified: the constant profile, the residential
                                                                        profile and the working profile.
3   DATA ANALYSIS AND RESULTS                                           The constant profile bi-clusters show flat functions of usage dur-
The first step of our analyses is to treat the available raw data as    ing the whole day implying a no usage or a continuous replace-
continuous functions. Specifically, for each station and for each       ment of bikes in these stations. Among these, the always Full
day we define, through a kernel density estimation smoothing            (e.g. bi-clusters 3, 6 and 9) and Empty stations (e.g. bi-clusters 13,
method [8], the bike station loading profile as a continuous func-      16 and 20), which necessarily imply user dissatisfaction as they
tional datum representing the number of available bikes divided         respectively cannot drop-off or pick-up a bike, are of particular
by the total number of bike docks at each timestamp. In this way        interest.
we obtain 2415 curves, i.e. 345 stations per 7 days, representing       The working profile (e.g. bi-clusters 4, 10 and 14) and the residen-
all the elements fi j (t) (with t ∈ [0, 24]) of a dataset matrix A      tial profile (e.g. bi-clusters 2 and 37) instead, are characterised by
with the same dimensions (345x7). Functional Cheng and Church           a huge activity during rush hours in the morning, around 8a.m.,
algorithm, presented in the previous section, can be applied on         and in the evening, around 7p.m.. However, the two groups show
this dataset.                                                           an opposite behaviour since while the first one fills up in the
To find the set of the best bi-clusters, a threshold δ must be pro-     morning and empties out in the evening, instead, the second one
vided to the algorithm. After performing a sensitivity analysis to      empties out in the morning and fills up in the evening. More-
choose the threshold parameter δ we fix δ equal to 0.025.               over, looking at the days inside the working profile and residential
Results are shown in Figure 1. There are in total 94 bi-clusters        profile groups, it appears that these bi-clusters are composed by
while the not assigned observations are artificially assigned to        working days. The peculiarity of these two groups reveal a clear
bi-cluster 0. For each bi-cluster all the functions contained in that   commuting behaviour of the bike sharing users which move dur-
bi-cluster are shown together in colors; the template function, de-     ing working days in the morning and evening rush hours.
fined as the average curve of the bi-cluster, is displayed in black.    To better explore these two behaviours, we focus, as explanatory
Looking at the bi-cluster dimensions (i.e. the number of curves in      example, on bi-clusters 2 and 4. In Figure 2 and 3 results on these
the bi-cluster), the obtained results are able to explain the 75% of    two bi-clusters are reported; in particular all the functions be-
the data, while the 25% of the functions are not assigned to any        longing to the bi-cluster with the bi-cluster template (in black),
bi-clusters. Note that the found bi-clusters have been ordered          the corresponding days and bike stations location are shown.
from the biggest one to the smallest one, considering the num-          Observing Figure 2 it is possible to say that Bi-cluster 2 is a block
ber of included rows and columns. Evaluating the percentage of          composed from 34 stations and 5 days (from Monday to Friday),
working and weekend days for each bi-cluster, we notice that            covering almost the 7% of all the data. It is characterised from full
some bi-clusters cover specific patterns of the working days or         stations before 8a.m. and after 8p.m. and empty stations during
Figure 2: Functions in bi-cluster 2 with calendar and bike            Figure 3: Functions in bi-cluster 4 with calendar and bike
stations position on the map of Lyon                                  stations position on the map of Lyon


                                                                      4   CONCLUSION
                                                                      The aim of our work is to study the spatio-temporal patterns
the rest of the day. This peculiar behaviour is justified by the      of the Vélo’v BSS usage profile during a one week period in
fact that these stations are mostly located in residential areas      Lyon providing useful information to the fleet managers. To this
in the East of the city. An opposite behaviour is instead present     end, we model the usage profiles of the different bike stations
in all the stations belonging to bi-cluster 4 (Figure 3) which are    around the city day by day as continuous functions with the
full between 8a.m.-8p.m. and empty in the rest of the day. This       aim of discovering subgroups of stations and days with similar
behaviour is easily explainable by the fact that these stations are   behaviour, which is know in the literature as a bi-clustering
located in parts of the city with many companies where people         problem.
are used to commute during the day. This bi-cluster is composed       To build our analyses, we introduce a novel non parametric bi-
by 17 stations and again the 5 working days from Monday to            clustering algorithm extending the Cheng and Church algorithm
Friday, covering the 3.5% of the total observations in the data.      in the FDA framework. From a methodological point of view
Another small group of bi-clusters, almost covering weekend           the concept of ideal bi-cluster is extended for functional data
days, can be described as weekend profile (e.g. bi-clusters 6, 7      and a new score evaluation for found bi-clusters is introduced.
and 73). For instance, bi-cluster 73 (Figure 4) contains the daily    In addition the new introduced algorithm overcomes the main
usage profiles of 3 stations for the entire weekend. The peculiar-    weaknesses of the original Cheng and Church, avoiding the usage
ity of this bi-cluster is that the concerned bike stations are in     of the masking procedure and introducing a greedy search in the
the city center, very closed to River Sâone banks, where there       not already assigned elements. The developed algorithm allows
are many shops and bars especially active during the weekends.        to find non exclusive bi-clusters with a H -score lower than a
It is possible to see, observing Figure 4, that these stations are    given value δ , through a non parametric procedure. This has the
filled up during evening until they become almost totally full        advantage of avoiding to rely on strong modelling. A sensitivity
before midnight and then they slowly empty out during the night.      analysis for the δ parameter tuning is also presented.
This behaviour can be explained considering that people go out           From a practical point of view, the developed approach is
clubbing during evening and then they return back home late in        applied to study the daily usage profiles of all the 345 stations of
the night.                                                            the Vélo’v BSS in Lyon for one week in March 2014. Results show
                                                                      REFERENCES
                                                                       [1] Pierre Borgnat, Celine Robardet, Jean-Baptiste Rouquier, Patrice Abry, Patrick
                                                                           Flandrin, and Eric Fleury. 2011. Shared Bicycles in a City: A Signal Processing
                                                                           and Data Analysis Perspective. Advances in Complex Systems 14 (06 2011).
                                                                           https://doi.org/10.1142/S0219525911002950
                                                                       [2] Charles Bouveyron, Laurent Bozzi, Julien Jacques, and François-Xavier Jollois.
                                                                           2018. The functional latent block model for the co-clustering of electricity
                                                                           consumption curves. Journal of the Royal Statistical Society: Series C (Applied
                                                                           Statistics) 67, 4 (2018), 897–915.
                                                                       [3] Charles Bouveyron, Etienne Côme, Julien Jacques, et al. 2015. The discrimi-
                                                                           native functional mixture model for a comparative analysis of bike sharing
                                                                           systems. The Annals of Applied Statistics 9, 4 (2015), 1726–1760.
                                                                       [4] Yizong Cheng and George M Church. 2000. Biclustering of expression data..
                                                                           In Ismb, Vol. 8. 93–103.
                                                                       [5] Elliot Fishman. 2016. Bikeshare: A Review of Recent Literature. Transport
                                                                           Reviews 36, 1 (2016), 92–113. https://doi.org/10.1080/01441647.2015.1033036
                                                                       [6] Gérard Govaert and Mohamed Nadif. 2013. Co-clustering: models, algorithms
                                                                           and applications. John Wiley & Sons.
                                                                       [7] John A Hartigan. 1972. Direct clustering of a data matrix. Journal of the
                                                                           american statistical association 67, 337 (1972), 123–129.
                                                                       [8] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. 2001. The Elements of
                                                                           Statistical Learning. Springer New York Inc., New York, NY, USA.
                                                                       [9] Julien Jacques and Cristian Preda. 2014. Functional data clustering: a survey.
                                                                           Advances in Data Analysis and Classification 8, 3 (2014), 231–255.
                                                                      [10] Neal Lathia, Saniul Ahmed, and Licia Capra. 2012. Measuring the impact of
                                                                           opening the London shared bicycle scheme to casual users. Transportation
                                                                           Research Part C: Emerging Technologies 22 (2012), 88 – 102. https://doi.org/10.
                                                                           1016/j.trc.2011.12.004
                                                                      [11] Beatriz Pontes, Raúl Giráldez, and Jesús S Aguilar-Ruiz. 2015. Biclustering
                                                                           on expression data: A review. Journal of biomedical informatics 57 (2015),
                                                                           163–180.
                                                                      [12] Amela Prelić, Stefan Bleuler, Philip Zimmermann, Anja Wille, Peter Bühlmann,
                                                                           Wilhelm Gruissem, Lars Hennig, Lothar Thiele, and Eckart Zitzler. 2006. A
                                                                           systematic comparison and evaluation of biclustering methods for gene ex-
                                                                           pression data. Bioinformatics 22, 9 (2006), 1122–1129.
                                                                      [13] J. O. Ramsay and B. W. Silverman. 2005. Functional data analysis. Springer,
                                                                           New York.
                                                                      [14] Patrick Vogel, Torsten Greiser, and Dirk Christian Mattfeld. 2011. Understand-
                                                                           ing Bike-Sharing Systems using Data Mining: Exploring Activity Patterns.
                                                                           Procedia - Social and Behavioral Sciences 20 (2011), 514 – 523.


Figure 4: Functions in bi-cluster 73 with calendar and bike
stations position on the map of Lyon


clear patterns of usage allowing to segment the city in different
activity areas according to the day of the week and the hour of the
day. For instance, a commuting behaviour is observed revealing
that stations next to residential areas and working areas have an
opposite behaviour during working days. It is interesting to notice
that despite no apriori information about the spatial distribution
of the stations are taken into account by the model, it appears
that stations belonging to the same bi-cluster are actually located
in neighborhoods with the same socio-economic characteristics.
Moreover, groups of stations always full or always empty are
highlighted, revealing some criticalities of the service.
In conclusion, our work contributed to implement the study of a
bike sharing system in two ways: from a methodological point of
view, we defined a novel non parametric bi-clustering technique
for functional data; from an applied point of view, we analysed
the bike sharing system in the city of Lyon providing useful
information for the correct management of the service.