=Paper= {{Paper |id=Vol-1743/ks2 |storemode=property |title=Spatio-Temporal Data Mining: From Big Data to Patterns |pdfUrl=https://ceur-ws.org/Vol-1743/ks2.pdf |volume=Vol-1743 |authors=Maguelonne Teisseire |dblpUrl=https://dblp.org/rec/conf/simbig/Teisseire16 }} ==Spatio-Temporal Data Mining: From Big Data to Patterns== https://ceur-ws.org/Vol-1743/ks2.pdf
               Spatio-Temporal Data Mining: From Big Data to Patterns

                                      Maguelonne Teisseire
                       UMR TETIS (Cirad, Irstea, AgroParisTech, CNRS) – France
                            maguelonne.teisseire@irstea.fr
                                 Web: www.textmining.biz



Abstract

Technological advances in data acquisition make it possible to better monitor dynamic phenomena in various domains, including the environment. The collected data is increasingly complex: spatial, temporal, heterogeneous and multi-scale. Exploiting this data requires new data analysis and knowledge discovery methods. In that context, approaches aimed at discovering spatio-temporal patterns are particularly relevant. This paper¹ focuses on spatio-temporal data and associated data mining methods.

¹ The content of the paper was prepared in collaboration with H. Alatrista Salas, S. Bringay, F. Flouvat, and N. Selmaoui.

1   Spatio-temporal Data

In recent years, technological advances in data acquisition (satellite images, sensors, etc.) have enabled numerous applications in surveillance and environmental monitoring: detection of abrupt changes (natural disasters, etc.), evolution tracking of natural phenomena (coastal erosion, desertification, wildfires, etc.) or development of models (hydrology, agriculture, etc.). The collected data is usually heterogeneous, multiscale, spatial and temporal (time series of satellite images, aerial or terrestrial photos, digital terrain models, physical ground measurements, qualitative observations, etc.). This data is used to understand and predict phenomena generated by processes that are complex and of multidisciplinary origin (climatic, geological, etc.). For experts to exploit these huge volumes of complex data (big data), the data must not only be structured as well as possible but, above all, be supported by dedicated data analysis and knowledge discovery methods. In that context, approaches involving pattern mining are particularly relevant.

With the dramatic growth of spatial information and Geographic Information Systems (GIS), many studies have been carried out in the context of spatiotemporal pattern mining. Early work in this area dealt with the spatial and temporal dimensions separately. Extraction of temporal sequences aims at identifying features that are frequent over time without taking spatial relationships into account. Colocation mining methods extract sets of features that frequently appear in nearby objects without taking the temporal aspect into account. More recently, this work has been extended to integrate the spatial and temporal dimensions simultaneously. Examples include the detection of sequences of located events and trajectory mining. A review has been published by the GeoPKDD consortium (Giannotti and Pedreschi, 2008). However, in those approaches, the mined patterns do not match the spatial complexity encountered when dealing with satellite images. Similarly, the primitive constraints usually used (typically minimum frequency) are not sufficient to express criteria of interest for experts, such as geologists.

A spatiotemporal database contains information characterized by a spatial and a temporal dimension. Two types of spatiotemporal databases are mainly considered: databases containing trajectories of moving objects located in both space and time (e.g. bird or aircraft trajectories); and databases storing the spatial and temporal dynamics of events (e.g. erosion evolution in a region or epidemic spread in a city).

2   Mining moving object trajectories

The emergence of new mobile technologies has facilitated the collection of large amounts of spatiotemporal data dedicated to the localization of mobile objects in space and time (Perera et al., 2015). These new databases provide opportunities for new applications. The GeoPKDD project (Giannotti and Pedreschi, 2008), for example, studied
the development of traffic planning in large cities according to vehicle flows. Other application domains include socio-economic geography, sports (e.g. football players), fishing control and weather forecasting (e.g. hurricanes). In most of these applications, the number of paths is high. One of the objectives of trajectory analysis is to find the most relevant paths according to the targeted problem (e.g. the most frequent, the most unexpected, periodic, etc.). Several approaches have recently been proposed in the literature, for instance (Orakzai et al., 2015).

3   Spatial patterns and spatiotemporal patterns for located event mining

The extraction of spatial and spatiotemporal patterns has been studied extensively in recent years in geographic data and GIS. There are two families of approaches: colocations (Shekhar and Huang, 2001), which identify events that are frequently close; and spatiotemporal patterns, which identify the evolution of events in both space and time (Alatrista-Salas et al., 2016). Sequences and, more generally, graphs have often been used and extended to the spatiotemporal context in order to represent the propagation of phenomena in space and time. Colocations focus on objects and their spatial relationships, for instance (Shekhar and Huang, 2001; Celik et al., 2008).

4   Conclusion

The challenges associated with spatial and spatio-temporal databases are numerous. Firstly, the semantics of extracted patterns must be considered in order to present experts with patterns that actually meet their application needs. Patterns with more complex structures, such as attributed graphs, can be really effective in spatial databases, as shown by Pasquier’s promising work (Pasquier et al., 1998) and (Sanhes et al., 2013). In addition, methods of spatio-temporal data mining often generate a lot of patterns, sometimes exceeding the size of the original data. It is therefore important to define measures of interest that enable experts to select the most relevant patterns. As highlighted in the method based on colocations, it is also necessary to include domain knowledge (e.g. metadata, semantic descriptions, ontologies, etc.) in the extraction process to improve the scalability as well as the quality of the extracted patterns and their interpretation. A definition of relevant visualizations for those patterns would further facilitate their interpretation. Many application areas remain to be explored, for example image mining, where large amounts of data are available but few effective and scalable methods have been developed so far. Finally, there is a real need for collaboration between domain experts and data mining experts. Collaboration is the key to success for the knowledge extraction process.

References

Hugo Alatrista-Salas, Sandra Bringay, Frédéric Flouvat, Nazha Selmaoui-Folcher, and Maguelonne Teisseire. 2016. Spatio-sequential patterns mining: Beyond the boundaries. Intell. Data Anal., 20(2):293–316.

Mete Celik, Shashi Shekhar, James P. Rogers, and James A. Shine. 2008. Mixed-drove spatiotemporal co-occurrence pattern mining. IEEE TKDE, 20(10):1322–1335.

Fosca Giannotti and Dino Pedreschi, editors. 2008. Mobility, Data Mining and Privacy - Geographic Knowledge Discovery. Springer.

Christian S. Jensen, Markus Schneider, Bernhard Seeger, and Vassilis J. Tsotras, editors. 2001. Advances in Spatial and Temporal Databases, 7th International Symposium, SSTD 2001, Redondo Beach, CA, USA, July 12-15, 2001, Proceedings, volume 2121 of Lecture Notes in Computer Science. Springer.

Faisal Orakzai, Thomas Devogele, and Toon Calders. 2015. Towards distributed convoy pattern mining. In Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems, Bellevue, WA, USA, November 3-6, 2015, pages 50:1–50:4.

Nicolas Pasquier, Yves Bastide, Rafik Taouil, and Lotfi Lakhal. 1998. Pruning closed itemset lattices for associations rules. In BDA’98.

Kushani Perera, Tanusri Bhattacharya, Lars Kulik, and James Bailey. 2015. Trajectory inference for mobile devices using connected cell towers. In Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems, Bellevue, WA, USA, November 3-6, 2015, pages 23:1–23:10.

Jérémy Sanhes, Frédéric Flouvat, Claude Pasquier, Nazha Selmaoui-Folcher, and Jean-François Boulicaut. 2013. Weighted path as a condensed pattern in a single attributed DAG. In IJCAI 2013, Beijing, China, August 3-9, 2013, pages 1642–1648.

Shashi Shekhar and Yan Huang. 2001. Discovering spatial co-location patterns: A summary of results. In Jensen et al. (Jensen et al., 2001), pages 236–256.
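
As a small illustration of the colocation idea from Section 3 (event types that are frequently close to each other), the sketch below computes a participation index for a pair of event types. The participation index is a prevalence measure commonly used in the colocation literature; the paper itself does not spell it out, so the event format, the function name `participation_index`, the distance threshold and the toy data are all assumptions made for illustration only.

```python
from itertools import product
from math import hypot


def participation_index(events, feat_a, feat_b, max_dist):
    """Participation index of the candidate colocation {feat_a, feat_b}.

    events   : list of (feature, x, y) tuples (hypothetical toy format)
    max_dist : two events are neighbours if their Euclidean distance
               is at most this value (assumed neighbourhood relation)
    """
    a_pts = [(x, y) for f, x, y in events if f == feat_a]
    b_pts = [(x, y) for f, x, y in events if f == feat_b]
    if not a_pts or not b_pts:
        return 0.0

    # Indices of the instances that take part in at least one neighbour pair.
    a_hit, b_hit = set(), set()
    for (i, (ax, ay)), (j, (bx, by)) in product(enumerate(a_pts),
                                                enumerate(b_pts)):
        if hypot(ax - bx, ay - by) <= max_dist:
            a_hit.add(i)
            b_hit.add(j)

    # Participation ratio per feature; the index is the smaller of the two.
    return min(len(a_hit) / len(a_pts), len(b_hit) / len(b_pts))


# Invented toy events, not data from the paper.
events = [
    ("fire", 0, 0), ("fire", 10, 10),
    ("drought", 1, 0), ("drought", 0, 1), ("drought", 50, 50),
    ("flood", 100, 100),
]

print(participation_index(events, "fire", "drought", max_dist=2.0))
print(participation_index(events, "fire", "flood", max_dist=2.0))
```

A pair would be reported as a colocation when its index reaches a user-chosen minimum threshold. This brute-force comparison of all pairs is quadratic in the number of events, so it only suits small examples; a spatial index would be needed at scale.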