<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Discover spatio-temporal cluster from trajectory data enhance by heterogeneous contextual knowledge using FCA and the NextPriorityConcept Algorithm</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jérémy Richard</string-name>
          <email>jeremy.richard2@univ-lr.f</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guillaume Savarit</string-name>
          <email>guillaume.savarit1@univ-lr.f</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Salah Eddine Boukheta</string-name>
          <email>salah.boukhetta@univ-lr.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cyril Faucher</string-name>
          <email>cyril.faucher@univ-lr.f</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Karell Berte</string-name>
          <email>karell.bertet@univ-lr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>tand ChristopheDemko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Laboratory L3i, La Rochelle University</institution>
          ,
          <addr-line>La Rochelle</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Published in Pablo Cordero</institution>
          ,
          <addr-line>Ondrej Kridlo (Eds.): The</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>hTe rising number of diferent kinds of data that can be used to describe a human trajectory (Such as GPS Coordinate, GSM, RFID, RSSI…) put in the spotlight the semantically rich trajectory. A semantic trajectory annotates semantic knowledge directly into raw data based on features of the studied area such as point of interest woreather conditions. One of the challenges of mobility studies nowadays is to find the right data model to shape all those data coming from diferent source into a framework lfexible enough to multiply the contextual data that can be used; where contextual data are knowledge coming from external data source (public city dataset, web pages, natwioenatahler services etc….). Such data models are the key component of mobility studies, but oftentimes lose the computational aspect of trajectories. In this paper, we will use the semantically rich trajectory as a way to analyse behavioral data enriched by contextual knowledge as this issue has rarely been addressed in the state of the art. We will study the use of formal concept analysis and patern mining as a way to compute complex sequential paterns in a dataset of semantic trajectories by usingNtehxetPriorityConcept algorithm. This kind of formal concept analysis allows an interactive analysis between individuals path and contextual data resulting in a hierarchy of spatio-temporal clusters where each cluster contains a specific patern depicting the trajectories within.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        A trajectory represents the path of a moving object in space. Most of raw trajectories are
formalised as a sequence of time related coordina(t, e)s at a tim e, noted&lt; (  ,   ),   &gt;. The
most common type of data used to reconstruct such paths is GPS coordinates. Yet, with the
exponential growth of the internet of Things, we collect more and more data from various ways
(such as GSM, radio frequency etc….) that are related to a person’s path (or animals, vehicles
etc ..). Therefore, making mobility studies more and more complexes taking into account a wide
range of data that can be used to reconstitute the movement of an object in space. This question
the way we can use and model this information. To address this problem, researchers consider
the use of semantic trajectori1e]s[[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. A semantic trajectory adds contextual knowledge to
a path of a mobile object; where contextual knowledge can be defined as every information
bringing a beter understanding of an event, person or object. This information brings to
trajectories another point of view avoiding to rely only on movemenWtdeatthae.r condition
or events such as protests or all kinds of “background” information can be used to discover
trajectory-shaping decisions. This data needs to be added onto those trajectories and are often
time shaped as an episode. An episode is a time interv(,a l , ) where ,  are timestamps and
A a semantic annotation. Using a succession of episodes such as the sequencedisotfri“ct”
crossed by a trajectory can be interpreted as a “discretization” of this one. Nevertheless, not
only the “context-aware” trajectory has been hard to analyze, but also the temporal aspects of a
semantic trajectory are largely ignored and only a few studies address this problem directly. In
this paper, we propose a way to analyse such enhanced trajectories by using the similarities
between semantic trajectories and temporal sequences. Recent advances in the field of patern
mining and Formal Concept Analysis (FCA) show promising ways to analyze heterogeneous
data coming from multiple datasets simultaneously; method that we are going to apply to
such semantic trajectories. Processing complex semantic trajectories usGinAgLtAhCe TIC
framework and theNextPriorityConcept[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] algorithm shows that it is possible to extract
such mobility paterns. We also can add any type of contextual data during the process enabling
an interactive analysis where we can follow the mobility of an object enriched with all kinds of
contextual data evolving on the edges of its trajectory. Also, adding contextual knowledge is a
mean to avoid the deluge of patern by focusing on specific behaviors. We propose a method for
movement data analysis focusing on the semantic aspect of trajectories alongside interactive
analysis with contextual data. As of today, the number of methods for analysing the relationship
between a moving object and background information is relatively low.
      </p>
      <p>hTe paper is organized as follows; in section 2 we will present related work both from the
ifeld of trajectory modelling and formal concept analysis. Section 3 will explain the library
GALACTIC. In section 4 we will explore the computational properties of such a tool with
experimentations on heterogeneous data. We are using real life semantic trajectories captured
by our team and composed by touristic path in the city of La Rochelle (France) and enriched
with contextual knowledge.</p>
    </sec>
    <sec id="sec-2">
      <title>2. From trajectories to temporal sequences: Related work</title>
      <sec id="sec-2-1">
        <title>2.1. Trajectory processing :</title>
        <p>In this section, we will present an overview of existing state-of-the-art models for semantic
trajectories. First, we will detail models and solutions to add more contextual knowledge into
trajectories. Next, we will study methods and analysis techniques for movement data. Based on
a dataset of semantic trajectories, we will describe the existing solutions used to retrieve similar
sub-trajectory common behaviors or movement paterns.</p>
        <sec id="sec-2-1-1">
          <title>2.1.1. The semantic aspect and the knowledge management.</title>
          <p>
            Semantic trajectory is defined bPyarent et al. 2013 [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ] as : “ … a [raw] trajectory that has been
enhanced with annotations and/or one or several complementary segmentation.”. The addition
of semantic knowledge directly into a raw trajectory enables researchers to beter interpret
such data. A simple GPS coordinate can become “street A”, and a series of radio frequency from
multiple sensors in a system of indoor localization can become “room A”. The definition and
the formalisation of the semantic trajectory will difer from a data model to another but the
main idea is to provide a framework generic enough to be used in many contexts. To do so,
such models are usually structured as a sequence of time related episodes where an episode is a
group of positions from a raw trajectory over a time period according to a preset defined by the
data-scientist5][. This preset can be a POI (Place Of Interest), a behavior or any other atribute
that the researcher chooses to focuses on in his model which will be shaped as an alphabet.
hTe first model for semantic trajectories is the “stop and move” mode6l][. By semantically
annotating when an individual stops and moves again, S. Spaccapietra et al. (along L. O Alvares
also based on stop and move7[]) built their semantic trajectories as a successiostnopoafnd move.
A stop is a non-empty time interv[a,l] where the travelling object does not move anmdoave is
a non-empty time interv[a,l ] where the travelling object is in motion. In this mostdoepl,and
move are episodes. This work made the first step in the field of semantic trajectory inspiring
several works such as 8[] where a framework for semantic annotation and enhancement is
proposed. Shortly after, semantic trajectories began to use ontology-based models suc9h] as [
still based on the firststop and move model. Ontology formalism is then used in those models as
a mean to convert the raw trajectory into a high-level semantic representation. The “stop” and
“move” model can be seen as a discretization of the speed whilediast“rict” is a discretization of
space. Those ontology-based data models for semantic trajectories are only “queryable” and
cannot be used in patern mining processes. As a mater of fact, the analysis of these data
models is a major issue in the field of semantic trajectories.
          </p>
        </sec>
        <sec id="sec-2-1-2">
          <title>2.1.2. Trajectory analysis.</title>
          <p>Mobility paterns tend to vectorize raw GPS trajectories such as the trajectory clustering
paterns where the goal is to determine common trajectories in order to group similar ones.
Using a similarity measure (1[0] where authors proposed a small overview), several studies
proposed clustering algorithms for trajectories suc1h1a]sw[here authors used those metrics
to extract common sub-trajectories. Nevertheless, these studies used raw GPS trajectories
and similarity measures can be dificult to adapt to other types of trajectories such as indoor
semantic trajectories. In a notable uncommon trajectory clustering study very close to our
work [12], J. Nyoman et al. experiment the use of sequential patern mining for analysing
visitor path in a museum. By shaping trajectories as sequences, the article combines sequence
mining with algorithms such as MFCS ( forM“iningFrequentContiguousSubsequences” )
and MRGS ( for “MiningRareGeneralSubsequences” ) with movement data analysis; authors
were able to identify four visiting behaviors. The use of patern mining techniques to process
trajectories shows promising results for analysing and clustering movement data with semantic
trajectories13[]. These works are closely related to what has been done in sequential patern
analysis. This is an emerging issue in the field of movement data analysis, but still we can
cite [14] where authors proposed Spliter, a framework capable of eficiently mines sequential
patern in semantic trajectories with as a classification approach. I1n5][, C. Huiping et al.
proposed to use anApriori-like algorithm to solve this problem in a dataset of trajectories.
Several studies have addressed this problem in the larger field of sequence mining and they will
be discussed next section.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Introduction to FCA, patern mining and sequence mining:</title>
        <sec id="sec-2-2-1">
          <title>2.2.1. The basis of patern mining.</title>
          <p>A sequence is defined as a succession of elements from an alphabeΣt, often in the form of
 =&lt;   &gt;≤ , where  ∈ Σ. Each element of the dictionary can be text, action, item, protein and
so on. The ordering of elements in a sequence is very important and is a key element in this data
modelling. This form of representation gained popularity in the 90s, in particular with the work
of Agrawal Srikant with the Mining sequential pater1n6[]. Given a dataset of customers, they
build sequences as a succession of items purchased by an individual and managed to get several
lists of items that are frequently bought together through a patern mining algorithm and more
specifically the Apriori algorithm. By using patern mining techniques, we allow a semantic
kind of representation to be computed and the goal is then to find common paterns in our
dataset. Sequence mining is a subfield of data mining which aims at finding paterns in a dataset
of sequences that appear more frequently. Paterns can be subsequences, prefixes, sufixes,
subsequences according to a sliding windo w17[]. The first generation of algorithms emerged
in the 90s with the article of Agrawal and Srika1n6t] w[hich extends the well-knowAnpriori
algorithm to sequence mining. All these algorithms take as input a dataset of sequences and a
minimum support threshold, and generate the set of frequent subsequences. For big datasets
and a short minimum support, these algorithms take a huge computing time and generate a
too large number of paterns that are dificult to interpret and contain redundancy. A second
generation of algorithms focuses on maximal paterns also know as closed paterns because
they verify a well-known property of closure in order to limit the number of paterns extracted
- called the deluge of patern. Many algorithms directly address this problem (CloSpa1n8][,
ClaSP [19]).</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>2.2.2. The basis of formal concept analysis.</title>
          <p>Some algorithms also appear within Formal Concept Analysis (FCA) framewor20k]s a[nd their
extensions to patern structure2s1[], where the latice property of closed paterns is promoted.
We can mention an article for mining medical care trajectories using patern struc2t2u].res [
Patern concepts are built as maximal sets of individuals with their maximal common
subsequences, the whole set of concepts is equipped with a specialization/generalization relation
from a partially ordered set with the latice property. This latice represents the initial data
where concepts are clusters of “similar” sequences, and the concept latice is a hierarchy of
clusters (regrouping objects with their associated common paterns). The use of the Formal
Concept Analysis while working with semantic sequences is a promising way to find common
paterns of behavior [ 23].</p>
        </sec>
        <sec id="sec-2-2-3">
          <title>2.2.3. Temporal sequence mining.</title>
          <p>
            To the best of our knowledge, Yoshida et al.24][ where the firsts introduce the notion of
temporal patern mining, called “Delta paterns” where a patern (, [
            <xref ref-type="bibr" rid="ref3">0, 3</xref>
            ], ) is a sequential
patern (, ) that frequently occurs in the dataset and has a transition tim etforo mof [
            <xref ref-type="bibr" rid="ref3">0, 3</xref>
            ],
a time interval. This transition time between two elements has known a notable extension in
work such as [25]. In [
            <xref ref-type="bibr" rid="ref8">26</xref>
            ], researchers focus on mining chronicles, where a chronicle is a couple
(ℰ ,  ) withℰ is a set of temporal events a nda set of temporal constraints on theℰs.eAt
temporal constraint no t1e[d −,  +] 2 represent the time gap between two ev en1tasnd  2.
          </p>
          <p>
            hTus, a parallel can be made between semantic trajectories and temporal sequences. By
extracting the semantic annotations and the time information of all episodes of the semantic
trajectories, we end up with a sequence of time intervals denot e=d&lt;by  ,   ,   &gt;. As defined
in “Extracting temporal paterns from Interval-Based Sequence2s7”],[T. Guyet and R. Quiniou
defined a way of representing temporal sequences “A temporal sequence S is an ordered set
of events, where an even t= (, [, ]) is composed of a symbol A and a nonempty interval
[, ] , where l and u are dates.” Based on this definition, semantic annotations of episodes
can then be considered as temporal events, making the whole semantic trajectory a temporal
sequence. The goal of temporal patern mining is then to find sequential paterns from a dataset
of temporal sequences, such a(s{, }, [
            <xref ref-type="bibr" rid="ref1 ref5">1, 5</xref>
            ]) , where the eventa and b are sharing a common
time interval. Several temporal patern mining algorithms showed good results in extracting
sequential paterns, such as NeGPSan [
            <xref ref-type="bibr" rid="ref10">28</xref>
            ] to extract negative sequential paterns. Nevertheless,
these approaches cause a deluge of paterns and remain dedicated to a sequence dataset on a
same alphabet.
3. Description of theNextPriorityConcept algorithm
hTe NextPriorityConcept algorithm 3[] computes concepts from complex and heterogeneous
data for a set of objects. We first introduce the notiodnescription  , which is an application to
provide the smallest set of predicates describing a set of obj⊆ec ts , based on their
characteristics. A characteristic describes our objects, such as numerical, discrete, semantic, temporal etc
… A concept(, ()) is composed of a subset of A and a set of predicat(e)s describing the
objects it contains. An example of predicate canlebses “than c” or “match subsequence s” where
c is the max of the numerical values ansdis a maximal common subsequence of the sequence .
Depending on the data type, generation algorithms will difer. Each predicate is specific to one
type of characteristics. The final description  is the union of all those predicates.
() = ⋃   () (1)
          </p>
          <p>∈1…</p>
          <p>Where(  ()) ∈1… is a family of descriptions. This is the exact principle that allows us to
perform heterogeneous analyses. The generation of a latice withNextPriorityConcept is
inspired by Bordat’s algorithm29[]. It recursively computes the immediate successors of a
and strategies.
generated.
concept, starting at the botom concept.TheNextPriorityConcept focuses on objects G and
starts the computation at the top con(c, e(p)t),</p>
          <p>containing the whole set of objectasnd
their common description by predicat()es until no more concepts can be generated. These
concepts are computed on demand and do not need any kind of preprocessing.</p>
          <p>In order to generate the immediate predecessors of a con(, c(e)p)t
, the algorithm
introduces strategies  that select predecessors of such a concept based on each characteristic. The
strategy refines the descriptio(n)</p>
          <p>to a reduced set ′ ⊂  . We call cardinality the number
of individuals withi n ′.  is an application such tha t∶ 2 
→ 2 where is the set of all
possible predicates. Several strategies are available to generate predecessors of a concept such
as the naive strategy (considering all possible predecessors of a concept), or strategies reducing
the number of predecessors.</p>
          <p>hTe use of predicates regardless of the data allows a generic implementation of algorithms
which mixed with a system of plugins, enables for an easy integration of new data types
(description and strategies), hence, taking into account a wide range of dataGtAypLeAsC.TIC1
(GAlois LAtices, ConceptTheory,Implicational systems andClosures) is a development
platform for our algorithm allowing easy integration of new plugins for characteristics, descriptions</p>
          <p>NextPriorityConcept maintains the latice structure using queue for a generation level
by level, and a mechanism of propagation of constraints to ensure the meet and join will be</p>
          <p>
            WithGALACTIC and theNextPriorityConcept algorithm, we contribute to the field of
temporal sequences by creating two descript ioannsd a set of strategieswhere we consider
temporal sequences 3[0] [
            <xref ref-type="bibr" rid="ref13">31</xref>
            ] as:
          </p>
          <p>between two items of S.
• A temporal sequence such th at= (⟨, ⟩</p>
          <p>)∈1… with ∈ Σ is an item from a dictionarΣy
and t is a timestamp. Then, we compute the distance (either transition time or duration)
• A temporal sequence such tha t= (⟨, [</p>
          <p>,   ]⟩ )∈1… where  ∈ Σ is an item from a
dictionaryΣ and [ ,   ] the interval of time during the element a occurs. We will then

compute common sub-sequences of items during a maximal similar interval of time.
4. Experimentations
4.1. The Geoluciole dataset
hTe Geoluciole dataset is composed of GPS trajectories in
the city of La Rochelle France. This dataset was collected
during the summer of 2019, from tourists visiting the city
during their vacation. It contains 192 trajectories of
different individuals. Their positions were obtained by an
application we developed, giving their position every 30
seconds during their stay. Before activating their applFicigau-re 1: Segmentation of the city of La Rochelle
tion, we also had them fill out a survey, giving contextual
intodistrict
knowledge, such as thestaying status or thearrival means.</p>
          <p>By matching the GPS coordinatesdtiostricts of the city such as shown in figure1, we can then
transform these raw data into semantic trajectories. The segmentation of the trajectory into a
sequence of districts has been done by using a GIS (Geographic Information Systems). We can
also add all kinds of knowledge we have webscrapped, such as thbeeach they were and parks
which is spatial information anwdeather and tide information which is more of a temporal
information (figure2). This data representation is very similar to the one we can find in32[] a
“Multi-Level and Multiple Aspect Semantic Trajectory Model”.</p>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>4.2. The semantic trajectory model</title>
        <p>from webscrapping.</p>
        <p>In Table 2, each “aspect” (or side) of those semantic trajectories keeps the specificity of the
data within. The district, weather, tide, green space (parks) and beach are sequences of temporal
intervals. In each “temporal” aspect of a sequence, the temporal information represents the
hour of the day. While thdeistrict, green space and beach depends on the localisation of the user
between  and   , the weather and tide depends on the temporal information and are issued
set of predicate defined by :</p>
        <p>We formalise the semantic trajectory model for a trajectaosrfoyllows :  ∶ ⟨ 1,  2 …   ⟩
where  is described by aspects and each aspects  , with ≤ 
can either be a temporal
sequence  = (⟨  ,   ,   ⟩)≤ , on an alphabetΣ and   ∈ Σ , a semantic value  =  on an
alphabetΣ and  ∈ Σ</p>
        <p>, or a numeric valu e  =  . Our dataset of trajector iesis given by
⟩. The description space used with GALACTIC for any trajecto r y ∈  is a
(  ) ∶ ⟨ 1( 1 ),  2( 2 ) …   (   )⟩
(2)</p>
      </sec>
      <sec id="sec-2-4">
        <title>4.3. Experiment description</title>
        <p>In this section, we usGeALACTIC to extract concepts from the Geoluciole dataset. First, we will
study the impact of a heterogeneous analysis on the number of concepts (ie. clusters) generated.
hTen, we will focus on specific parts of the dataset with selected semantic information in order
to beter define touristic behaviors. Table1 shows the descriptions and strategies used during
this experimentation. Tab2leshow a description of each dataset.</p>
        <p>Chain</p>
        <p>Numeric
descriptio n :</p>
        <p>NextPriorityConcept uses descriptions to define a set of predicates describing the
attributes. As predicates, the descriptions can then be seen as binary table. We also makes use of
strategies to choose next predecessors at each iteration.</p>
        <p>In this paper, we will use the following descriptions and strategies :
• Numeric : We describe a se t= { 1, ..  } of numerical values by the “simple numerical”
(3)
(4)
(5)
(6)
(7)
(8)
  () = { is greather than min(A), is lesser than max}(A)
hTe strategy is the quantile description (, )</p>
        <p>where k is the number of quantiles:
 () = {
is greater than , is lesser than 
:   is a k-quantile}
subsequence relation:
• Chain : For a set = {&lt;   &gt;} of sequences, we use the “chain matching” description
 
(, )
that compute the maximal number of subsequences of size k, w⊆ithis the
  (, ) = { ∈ Σ
∗ ∶ ∀ ∈ ,  ⊑</p>
        <p>and || = }
  () = { ∶  ∈ 
 ( ′),  ′ = \{},
for all ∈ }</p>
        <p>We use the “complete match” strat e g y that computes all the possibles subse ts′ ⊂  :
• Temporal sequence : We consider a set of temporal sequenc e=s {&lt; 
 ,   &gt;} where
  =   ,   is an interval and  an itemset. For a sequen ce∈  , the projection () selects
all the itemsets o f included in the interval. We use the “maximal common interval”
description that computes the set of maximal common sub-intervals:
hTe “minimal cardinality” strategy  ()
subsequences of the description:</p>
        <p>adds element of minimal cardinality to the
  
() = {⟨( ,  )⟩ ∶ ∀ ∈ ,  ⊆ Φ</p>
        <p>()}
 
() = {⟨( ,  )⟩ ∶ ∀
(,  , ) = ||
 ∈ ,</p>
        <p>() ⊆  and ∀ ∈ 
and (,  ,  ) = 
 (,  )}</p>
      </sec>
      <sec id="sec-2-5">
        <title>4.4. Impact of heterogeneous data on the paterns</title>
        <p>Running a heterogeneous analysis with multiple description and strategy types can reduce the
number of concepts in the final latice. While a “naive” strategy selects all predicates within a
descriptio n, such as   , other strategies such as will select specific predicates and skip
other to refine the analysis.</p>
        <p>Example: Table 3 shows a datase t of five individuals, characterized by a numeric value
and a sequence of intervals. In this example, we will be using the simple numerical description
and quantile strategy for the numerical aspects and the maximal common interval description
alongside the minimal cardinality strategy for the sequence of intervals.</p>
        <p>Individual
a
b
c
d
e</p>
        <p>Numeric ( )
1
2
3
4
5</p>
        <p>Interval
(8.3, 11): “P”, (13, 15): “M”, “H”
(10, 12): “P”, ”H”, (14, 16): “M”
(8.3, 12): “P”, (14, 16): “M”, “C”
(7, 9): “P”, “H”, (12, 13): “M”
(10, 11): “P”, (12, 12): “M”, “C”</p>
        <p>Let be the latice shown in Figure3 as the result of a numeric description and quantile
strategy, taking into account only the numeric characteristics. By adding an interval description
and strategy, it results in the latice′ with a fewer number of concepts. Figu4reshows that
concept $1  (≤ 4 ) and $2 ( ≥ 2) from  do not appear in the latice ′ as they have not
been selected by the minimal cardinality strategy. The concepts contained′ianre the most
representative of the two description spaces.
4.4.1. Naive analysis using time interval.
ifrst, we will only considerdistricts since they are a geographical segmentation of trajectories
intodistricts of the city. By doing so, we obtain 416 concepts with a low cardinality value. The
huge number of concepts generated involving only one or two trajectories is an indicator that
there are no significant places where multiple individuals were at a same time of the day. To
sharpen our research, we then choosed to add other aspects of semantic trajectories in order to
discover new significant behaviors depending on external factors. The next section will present
$0:#5
$1:#4 $2:#4
x&lt;=4 x&gt;=2
$3:#3 $4:#3 x&gt;=3</p>
        <p>$5:#3
x&lt;=3
experimentations by taking into account multiple aspects of semantic trajectories, such as the
weather or thestaying status. With only theweather and the same approach, we obtain 379
concepts.</p>
        <sec id="sec-2-5-1">
          <title>4.4.2. Analysis with heterogeneous data</title>
          <p>400
350
T300
P
E
CN250
O
C
FO200
S
R
E150
B
M
U
N100
50</p>
          <p>In this experiment, we exploit the abilityGoAfLACTIC to deal with heterogeneous data
by adding successively the other atributes to thwee“ather” atributes (the ones that generate
the largest number of concepts). Hence it is possible to mix several sequences (such as shown
in Equation1 and 2) together with other data in a heterogeneous analysis. With nearly all
atributes added successively, we diminish the number of concepts to only 36. Fig5urgeives
the number of concepts obtained for each added atributes that decreases from 379 (only the
weather) to 36 (with all our atributes).</p>
          <p>hTe table 6 in appendix gives the descriptions by common predicates of three concepts
(concept number 0 is the top concept for all atributes). We can observe that the addition of
more knowledge gives a more precise description of concepts, and a reduction of the number of
paterns.</p>
        </sec>
      </sec>
      <sec id="sec-2-6">
        <title>4.5. Focus on some particular behaviours when reducing patern</title>
        <p>One way for the data analyst to analyse the dataset is to focus on particular behaviors, where a
behavior can be a subset of objects (people going to tbehaech) or a part of the description (an
action during the night).</p>
        <sec id="sec-2-6-1">
          <title>4.5.1. The detection ofbeach trip related to thweeather.</title>
          <p>In order to focus on visitors going tobtehaceh, we choose to analysebeach with theweather
that is semantically meaningful for this experiment. The dataset is composed of 108 trajectories,
and generates 93 concepts as a result of the formal concept analysis. The Tab7le(found in
Appendix) shows some concepts containing valuable information, such as a strong correlation
between “raining” and “no beach”, meaning that people won’t go tobetahche if it’s raining .</p>
          <p>In the dataset, 31 trajectories contain rain.</p>
          <p>On that part of the trajectory we can see that 8 of them can be described with a predicate
which matchs “raining” between 11 and 12 am and “no beach”. Other predicates are also
generated which also describe that bwadeather has an impact on abeach trip and this predicate
seems to focus on badweather in the morning.</p>
        </sec>
        <sec id="sec-2-6-2">
          <title>4.5.2. The detection of sleeping habits during touristic stay.</title>
          <p>With the time information, it is also possible to analyse any part of the day. One possible
user driven analysis could be to see if sleeping habits are dependent onsttahyeing status of
individuals. To do so, we will make an analysis with two atributes, thsteaying status, which is
a chain and locations which are temporal. With 138 trajectories, we have in total 340 concepts,
and 108 for the sleeping period of time (between 22 and 10).</p>
          <p>Some of the results can be seen in Tab8le(see Appendix). As we can see, predicates are
generated and are supported by nearly 10% of “family” trajectories and more than 10% of “with
friends” trajectories to locations subsequences, which by comparing it with other predicates of
supersequence seem to represent sleeping places (no movement during a long period of time
between 0 and 10 am).</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Conclusion and future works</title>
      <p>In this paper, we propose a way to process heterogeneous data usingNtehxetPriorityConcept
algorithm alongside thGe ALACTIC platform. In the experiments, we illustrate the advantage
of mixing heterogeneous data while keeping their semantic and readability with two examples.
First, by reducing the number of paterns by adding contextual knowledge onto trajectories
and afterwards by focusing on specific aspects of the data to detect and identify distinctive
behaviors. The specificity of this approach is to keep the raw data within and avoid any kind
vectorization technique. Also, it allow us to enrich the dataset with semantic information
from the space and/or temporal knowledge directly from the trajectory (sduicsthriactss in this
paper), surveys filled out by individuals (staying status, arrival means) or even web scrapping
(weather, tide). The plugins system of GALACTIC allows to easily integrate descriptions and
strategies for heterogeneous data and ofering a wide range of data types to work with. The
NextPriorityConcept and its ability to run interactive, heterogeneous analysis is a big step
in data science and ofers new ways to deal with complex data structures such as semantic
trajectories.</p>
      <p>For future works, we would like to measure the quality of generated concepts to beter
represent and sort the relevance of their predicates (such satsatbihliety measure). These results
should become available very soon. Also, we are currently working on an incremental tool in
order to reinforce the interactivity with a user driven tool where the data-analyst can select
data to analyse it in an interactive way. By doing so, we allow the data scientist to select which
side of the data he wants to explore, because only he knows best the semantic of the data he
manipulates. In particular, it gives the possibility to change the strategy or description during
every step of the process (where new concepts are generated), in an intuitive and simple way.
For exampleweather and tide information are not interesting during the night. We wish that by
giving the opportunity to the data-scientist to interactively explore the data in a user-driven
approach, we will be able to optimize data mining processes.
Annual ACM International Symposium on Advances in Geographic Information Systems.
ser. GIS ’07. ACM, 2007 (2007) 22:1–22:8.
[8] V. Bogorny, C. Renso, A. R. de Aquino, F. de Lucca Siqueira, L. O. Alvares, Constant – a
conceptual data model for semantic trajectories of moving objects, Transactions in GIS 18
(2014) 66–88. doi:10.1111/tgis.12011.
[9] Z. Yan, J. Macedo, C. Parent, S. Spaccapietra, Trajectory ontologies and queries,
Transactions in GIS 12 (2008) 75–91. do1i:0.1111/j.1467-9671.2008.01137.x.
[10] K. Toohey, M. Duckham, Trajectory similarity measures, Sigspatial Special 7 (2015) 43–50.
[11] S. J. Camargo, A. W. Robertson, S. J. Gafney, P. Smyth, M. Ghil, Cluster analysis of typhoon
tracks. part i: General properties, Journal of Climate 20 (2007) 3635–3653.
[12] N. Juniarta, M. Couceiro, A. Napoli, C. Raïssi, Sequential patern mining using fca and
patern structures for analyzing visitor trajectories in a museum, in: CLA 2018-The 14th
International Conference on Concept Latices and Their Applications, 2018.
[13] Y. Chen, P. Yuan, M. Qiu, D. Pi, An indoor trajectory frequent patern mining algorithm
based on vague grid sequence, Expert Systems with Applications 118 (2019) 614–624.
[14] C. Zhang, J. Han, L. Shou, J. Lu, T. La Porta, Spliter: Mining fine-grained sequential
paterns in semantic trajectories, Proceedings of the VLDB Endowment 7 (2014) 769–780.
[15] H. Cao, N. Mamoulis, D. W. Cheung, Mining frequent spatio-temporal sequential paterns,
in: Fifth IEEE International Conference on Data Mining (ICDM’05), IEEE, 2005, pp. 8–pp.
[16] R. Agrawal, R. Srikant, Mining sequential paterns, in: Proceedings of the Eleventh</p>
      <p>International Conference on Data Engineering, 1995, pp. 3–14.
[17] H. Mannila, H. Toivonen, A. Inkeri V., Discovery of frequent episodes in event sequences,</p>
      <p>Data mining and knowledge discovery 1 (1997) 259–289.
[18] X. Yan, J. Han, R. Afshar, Clospan: Mining: Closed sequential paterns in large datasets,
in: Proceedings of the 2003 SIAM international conference on data mining, SIAM, 2003,
pp. 166–177.
[19] A. Gomariz, M. Campos, R. Marin, B. Goethals, Clasp: An eficient algorithm for mining
frequent closed sequences, in: Pacific-Asia Conference on Knowledge Discovery and Data
Mining, Springer, 2013, pp. 50–61.
[20] B. Ganter, Atribute exploration with background knowledge, Theoretical Computer</p>
      <p>Science 217 (1999) 215–234.
[21] B. Ganter, S. O. Kuznetsov, Patern structures and their projections, in: International
conference on conceptual structures, Springer, 2001, pp. 129–142.
[22] A. Buzmakov, E. Egho, N. Jay, S. O. Kuznetsov, A. Napoli, C. Raïssi, On projections of
sequential patern structures (with an application on care trajectories), in: The Tenth
International Conference on Concept Latices and Their Applications-CLA’13, 2013.
[23] N. Juniarta, M. Couceiro, A. Napoli, C. Raïssi, Sequential Patern Mining using FCA and
Patern Structures for Analyzing Visitor Trajectories in a Museum, in: CLA 2018 - The 14th
International Conference on Concept Latices and Their Applications, Olomouc, Czech
Republic, 2018. URL: https://hal.inria.fr/hal-0188791. 4
[24] M. Yoshida, T. Iizuka, H. Shiohara, M. Ishiguro, Mining sequential paterns including time
intervals, in: Data Mining and Knowledge Discovery: Theory, Tools, and Technology II,
volume 4057, SPIE, 2000, pp. 213–220.
[25] S. Yen, Y. Lee, Mining non-redundant time-gap sequential paterns, Applied Intelligence</p>
    </sec>
    <sec id="sec-4">
      <title>Appendix</title>
      <p>Concepts
0
1
2
district
weather</p>
      <p>tide
green space
beach
district
weather</p>
      <p>tide
green space
beach</p>
      <p>Paterns (common predicates)</p>
      <p>∅
temperature = 20.75
stay status: ’family’
arrival mean: ’personnal car’</p>
      <p>Dor</p>
      <p>cloudy</p>
      <p>Square Valin low tide
Parc Charruyer</p>
      <p>No beach
12 13 14 15 16 17 18 19 20 t
25.0417 ≥ temperature ≥ 20.75
stay status: ’family’</p>
      <p>Dor
cloudy</p>
      <p>low tide
Parc Charruyer</p>
      <p>No beach
Individuals</p>
      <p>All
2
2</p>
      <p>Support “raining”
Individuals
weather match [“[11;12):[’raining’]”]
beach match [“[0;24):[’No beach’]”]
weather match [“[1;2):[’raining’]”]
beach match [“[0;24):[’No beach’]”]
weather match [“[8;9):[’raining’]”]
beach match [“[0;24):[’No beach’]”]
weather match [“[1;4):[’raining’]”]
beach match [“[0;24):[’No beach’]”]
2
8
6
5
5
4
4</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kontarinis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zeitouni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Marinica</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vodislav</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kotzinos</surname>
          </string-name>
          ,
          <article-title>Towards a Semantic Indoor Trajectory Model</article-title>
          , in: 2nd International Workshop on ”
          <article-title>Big Mobility Data Analytics” (BMDA) with EDBT 2019</article-title>
          , Lisbon, Portugal,
          <year>2019</year>
          . URLh:ttps://hal.archives-ouvertes.fr/ hal-02314572.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ilarri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Stojanovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ray</surname>
          </string-name>
          ,
          <article-title>Semantic management of moving objects: A vision towards smart mobility</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>42</volume>
          (
          <year>2015</year>
          )
          <fpage>1418</fpage>
          -
          <lpage>1435</lpage>
          .
          <year>1d0o</year>
          .i:1016/ j.eswa.
          <year>2014</year>
          .
          <volume>08</volume>
          .057.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Demko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bertet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Faucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Viaud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. O.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          ,
          <article-title>Nextpriorityconcept: A new and generic algorithm computing concepts from complex and heterogeneous data</article-title>
          ,
          <source>Theoretical Computer Science</source>
          <volume>845</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Parent</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Spaccapietra</surname>
          </string-name>
          ,
          <article-title>Semantic trajectories modeling and analysis</article-title>
          ,
          <source>ACM Comput. Surv</source>
          .
          <volume>45</volume>
          (
          <year>2013</year>
          )
          <fpage>1</fpage>
          -
          <lpage>32</lpage>
          .
          <year>doi1</year>
          :
          <fpage>0</fpage>
          .1145/2501654.2501656.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Mountain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Raper</surname>
          </string-name>
          ,
          <article-title>Modelling human spatio-temporal behaviour: A challenge for location-based services</article-title>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Spaccapietra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Parent</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Damiani</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. A. de Macedo</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Porto</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Vangenot</surname>
          </string-name>
          ,
          <article-title>A conceptual view on trajectories</article-title>
          ,
          <source>Data Knowledge Engineering</source>
          <volume>65</volume>
          (
          <year>2008</year>
          )
          <fpage>126</fpage>
          -
          <lpage>146</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L. O.</given-names>
            <surname>Alvares</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Bogorny</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kuijpers</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. A. F. de Macedo</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Moelans</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Vaisman</surname>
          </string-name>
          ,
          <article-title>A model for enriching trajectories with semantic geographical information</article-title>
          ,
          <source>Proc. of the 15th 39</source>
          (
          <year>2013</year>
          )
          <fpage>727</fpage>
          -
          <lpage>738</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>D.</given-names>
            <surname>Cram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mathern</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mille</surname>
          </string-name>
          ,
          <article-title>A complete chronicle discovery approach: application to activity analysis</article-title>
          ,
          <source>Expert Systems</source>
          <volume>29</volume>
          (
          <year>2012</year>
          )
          <fpage>321</fpage>
          -
          <lpage>346</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>T.</given-names>
            <surname>Guyet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Quiniou</surname>
          </string-name>
          ,
          <article-title>Extracting temporal paterns from interval-based sequences</article-title>
          ,
          <source>in: International Join Conference on Artificial Intelligence (IJCAI)</source>
          , Barcelone, Spain,
          <year>2011</year>
          . URL: https://hal.inria.fr/inria-006184.
          <fpage>44</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>T.</given-names>
            <surname>Guyet</surname>
          </string-name>
          , R. Quiniou,
          <article-title>NeGPSpan: eficient extraction of negative sequential paterns with embedding constraints</article-title>
          ,
          <source>Data Mining and Knowledge Discovery</source>
          <volume>34</volume>
          (
          <year>2020</year>
          )
          <fpage>563</fpage>
          -
          <lpage>609</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bordat</surname>
          </string-name>
          ,
          <article-title>Calcul pratique du treillis de galois d'une correspondance</article-title>
          ,
          <source>Mathématiques et Sciences humaines 96</source>
          (
          <year>1986</year>
          )
          <fpage>31</fpage>
          -
          <lpage>47</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Boukheta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Demko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bertet</surname>
          </string-name>
          , J. Richard, C. Cayèré,
          <article-title>Temporal sequence mining using fca and galactic</article-title>
          , in: International Conference on Conceptual Structures, Springer,
          <year>2021</year>
          , pp.
          <fpage>185</fpage>
          -
          <lpage>199</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Boukheta</surname>
          </string-name>
          , J. Richard,
          <string-name>
            <given-names>C.</given-names>
            <surname>Demko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bertet</surname>
          </string-name>
          ,
          <article-title>Interval-based sequence mining using fca and the nextpriorityconcept algorithm</article-title>
          .,
          <source>in: FCA4AI@ ECAI</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>91</fpage>
          -
          <lpage>102</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>C.</given-names>
            <surname>Cayèré</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sallaberry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Faucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bessagnet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Roose</surname>
          </string-name>
          , M. Masson, J. Richard,
          <article-title>Multilevel and multiple aspect semantic trajectory model: Application to the tourism domain</article-title>
          ,
          <source>ISPRS International Journal of Geo-Information</source>
          <volume>10</volume>
          (
          <year>2021</year>
          )
          <fpage>592</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>