=Paper=
{{Paper
|id=Vol-2322/BMDA_2
|storemode=property
|title=Towards A Semantic Indoor Trajectory Model
|pdfUrl=https://ceur-ws.org/Vol-2322/BMDA_2.pdf
|volume=Vol-2322
|authors=Alexandros Kontarinis,Karine Zeitouni,Claudia Marinica,Dan Vodislav,Dimitris Kotzinos
|dblpUrl=https://dblp.org/rec/conf/edbt/KontarinisZMVK19
}}
==Towards A Semantic Indoor Trajectory Model==
* Alexandros Kontarinis, ETIS UMR 8051, University of Paris-Seine, University of Cergy-Pontoise, ENSEA, CNRS / DAVID Lab, University of Versailles Saint-Quentin, University of Paris-Saclay, alexandros.kontarinis@ensea.fr
* Karine Zeitouni, DAVID Lab, University of Versailles Saint-Quentin, University of Paris-Saclay, Versailles, France, karine.zeitouni@uvsq.fr
* Claudia Marinica, ETIS UMR 8051, University of Paris-Seine, University of Cergy-Pontoise, ENSEA, CNRS, Cergy-Pontoise, France, claudia.marinica@u-cergy.fr
* Dan Vodislav, ETIS UMR 8051, University of Paris-Seine, University of Cergy-Pontoise, ENSEA, CNRS, Cergy-Pontoise, France, dan.vodislav@u-cergy.fr
* Dimitris Kotzinos, ETIS UMR 8051, University of Paris-Seine, University of Cergy-Pontoise, ENSEA, CNRS, Cergy-Pontoise, France, Dimitrios.Kotzinos@u-cergy.fr

© 2019 Copyright held by the author(s). Published in the Workshop Proceedings of the EDBT/ICDT 2019 Joint Conference (March 26, 2019, Lisbon, Portugal) on CEUR-WS.org.

===ABSTRACT===
In this paper we present a Semantic Indoor Trajectory Model aimed at supporting the design and implementation of context-aware mobility data mining and statistical analytics methods. Motivated by a compelling museum case study, and by what we perceive as a lack in indoor trajectory research, we are interested in combining aspects of state-of-the-art semantic outdoor trajectory models, with a semantically-enabled hierarchical symbolic representation of the indoor space, which abides by OGC’s IndoorGML standard. We drive the discussion on those modeling issues and details that have been overlooked so far or where our approach deviates from typical practices. We illustrate the modeling part with instantiations from the Louvre Museum in an effort to provide a pragmatic view of what a Semantic Indoor Trajectory Model ought to represent and ideally also how.

===1 INTRODUCTION===
It has long been of paramount importance for museums to “know” their visitors, meaning to study and understand their motivations, expectations, engagement, and satisfaction. In this regard, multimedia guides offering Location-Based Services (e.g. way-finding, contextualized content delivery) are becoming an invaluable tool for museums, since they provide them with access to an unprecedented wealth of visitor movement data. Similar opportunities have appeared in other domains of indoor human mobility such as retail stores, arenas, hospitals, airports, universities [16].

So far, trajectory-based human mobility data analytics research has solely focused on outdoor trajectories, driven by the fact that Geographic Information Science (GIS) has traditionally only supported outdoor spatial information. This type of research differs considerably in indoor environments, mainly due to interior architectural components constraining (or otherwise affecting) the way people can move. For example, an indoor trajectory model has to consider multiple ways of entering a room, floor changes, specific locations of entrance/exit to/from the building, sensor coverage gaps and/or sensor detection area overlaps, movement data of varying spatial granularity, and other peculiarities. In addition, indoor trajectory analytics may gain from avoiding cumbersome calculations over geometric representations of space and objects within it, that are typical of outdoor environments. Instead, operations such as intersection, containment, and proximity can be simplified in order to prioritize the non-geometric aspects of movement [15], instead of metric aspects often focused on Euclidean distances from potential targets. In fact, reasoning about space without precise quantitative information has been at the core of Qualitative Spatial Relations research [9].

Moreover, in order to reason about movement in informationally rich domains, a trajectory model must also account for multiple types of contextual and semantic information. As identified by [22] and further explored in [4, 5], there are three fundamental sets pertinent to movement, representing the “where” (set of locations), “when” (set of instants or intervals), and “what” (set of objects) of spatiotemporal data. This is true across applications. Distinguishing between semantics of time, semantics of places, and semantics of moving objects, in addition to the semantics of movement itself could empower a synergistic interplay between different types of semantics. Such semantic information can be derived either from the moving object’s environment or from external data sources. It can then be used to add a meaningful dimension to “raw” trajectories. Unfortunately, semantic trajectory models have, to a large extent, targeted outdoor settings. This has resulted in an emphasis on the enrichment of GPS data, the identification of stops and moves, the identification of transportation means, and other conceptual modeling issues that are either not interesting or not transferable in indoor settings. On the other hand, the adoption of some modeling approaches, such as the segmentation of trajectories into episodes and the use of semantic annotations, seems to be promising.

In this paper, we present a new model for spatiotemporal indoor trajectories enriched with semantic annotations. The proposed model makes use of an indoor space modeling framework, instead of assuming 2D coordinate data as is the norm. To this end, on the one hand, we identify certain limitations of state-of-the-art conceptual semantic (outdoor) trajectory models and propose ways to overcome them, and on the other hand we discuss different indoor space modeling approaches and the choices that we made. Equally important, the new model is developed in order to support mining and analysis tasks.
The rest of this paper is divided as follows: Section 2 presents an overview of the related work and its limitations with regards to indoors. Section 3 introduces our trajectory model. Section 4 introduces the Louvre case study and the corresponding model instantiation. Finally, Section 5 concludes with the key issues addressed in our model and a brief description of the types of analytical tasks that it supports.

===2 RELATED WORK AND BACKGROUND===
In this section, we describe the state-of-the-art in modeling indoor spaces and (outdoor) semantic trajectories.

====2.1 Indoor Space Models====
In order to represent movement phenomena in terms of trajectories, first a formal spatial model is needed to provide an abstraction of their physical environment. Every trajectory model (TM) proposed in the literature, either explicitly or more usually implicitly, uses a certain model of location and therefore space. In this regard, a fundamental distinction exists between quantitative and qualitative spatial representation approaches. The former are preferable when precise spatial information is important, while the latter when it is unnecessary or unavailable [9].

A qualitative spatial representation formalism, coupled with qualitative relations between spatial objects and qualitative reasoning about spatial knowledge, constitutes what is known as Qualitative Spatial Reasoning (QSR) [23]. Two of the most widespread qualitative spatial calculi are RCC (Region Connection Calculus) [10] and n-intersection [13]. Specifically, RCC-8 and 4-intersection (as well as other variants) result in the definition of eight binary topological relations: “disjoint”, “touch” (“meet”), “overlap”, “contains”, “insideOf”, “covers”, “coveredBy”, “equal” [14]. From a more applied perspective, most indoor spatial data models can be classified into geometric ones and symbolic ones [1]. The former focus on representing the geometry of indoor features using primitives such as points, lines, areas, and volumes. The latter focus on representing the ontological aspects of spatial units and the topological relationships between them, maintaining a more abstract view of indoor space [2]. Hybrid models represent both symbolic concepts and geometric properties.

Furthermore, a line of research works on indoor space modeling (e.g. [6]) has culminated in the development of IndoorGML [19], an OGC standard aimed at representing and allowing the exchange of geoinformation for indoor navigational systems. IndoorGML’s core module considers an indoor space as a set of non-overlapping cells that represent its smallest organizational/structural units: <math>S = \{c_1, c_2, \ldots, c_n\}</math>, with <math>c_i \cap c_j = \emptyset</math> for <math>i \neq j</math>. Technically, IndoorGML describes a hybrid indoor space model and not a TM, but it can be used in support of one. More specifically, the cell space and the topological relationships between its objects are represented by one or more Node-Relation Graphs (NRGs). In particular, the Poincaré duality provides the means of mapping the physical indoor space (embedded in a 2D/3D Euclidean primal space) into an adjacency NRG (in the corresponding dual space). Therefore, a cell (e.g. room) becomes a node and a cell boundary (e.g. a thin wall) becomes an edge. The respective formal terminology is summarized in Table 1. If cell boundary semantics are also taken into account (e.g. doors vs. walls, ramps) then a connectivity and/or an accessibility NRG may be derived as well. Connectivity suggests that there exists an opening in the common boundary of two cells. Accessibility additionally suggests that the opening is traversable by the moving object (MO).

Figure 1: A 2-level hierarchical graph representing the central part of the 1st floor of the Louvre’s Denon Wing.

Moreover, IndoorGML’s Multi-Layered Space Model (MLSM) is the description of multiple interpretations of the same physical indoor space, through the instantiation of multiple cell decompositions and corresponding NRGs. Each NRG is treated as a separate graph layer. Nodes belonging to different layers are connected via inter-layer “joint” edges. While intra-layer edges represent either adjacency, connectivity, or accessibility relations between non-overlapping cells, joint edges represent potential locations where a physical object might actually reside. Therefore, given that a physical object may be in only one cell of each layer at any given point in time (called the “active” state), joint edges express all the valid active state combinations (called “overall” states) and are derived by pairwise cell intersection. Equivalently, a joint edge represents any of the eight binary topological relationships derived by the n-intersection model [13], except for “disjoint” and “meet”. In Figure 1 for example, if a visitor is inside the hall represented as node 5 in layer i + 1, then the joint edges suggest that he can only be in either 5a, 5b, or 5c in layer i.

The MLSM can be used to represent spatial hierarchies but it is unclear how its flexible cell subdivision mechanism ought to be used: each node may be split independently of the rest, which favors ad-hoc hierarchical modeling approaches. For instance, in the Louvre example of Figure 1, we may want to split nodes 4 and 5 into smaller cells to take advantage of more precise localization data available there. It is however unclear whether or not we should also split 1, 2, 3 correspondingly, or whether or not we should split 4 and 5 in the same layer (as depicted). These indoor space modeling issues have been identified in [17] and [11], but the former only provides some general partitioning criteria (e.g. splitting cells that have multiple properties or that are too big), while the latter categorizes such criteria (geometry-driven, topology-driven, semantics-driven, navigation-driven) but is more interested in furnished 3D indoor spaces, rather than 2D multi-floor spaces. However, such space modeling issues will eventually affect the spatial granularity of the symbolic TM.

{| class="wikitable"
|+ Table 1: Closely related terms, often used interchangeably under the context of indoor space modeling and IndoorGML.
! N-intersection !! Primal Space (2D) !! Dual Space (NRG) !! Dual Space (Navigation)
|-
| (spatial) region || cell/“cellspace” || node || state
|-
| (region) boundary || (cell/“cellspace”) boundary || (intra-layer) edge || transition
|-
| “overlap” / “coveredBy” / “inside” / “covers” / “contains” / “equal” || binary topological relationship (between cells/“cellspaces”) || (inter-layer) joint edge || valid active state combination / valid overall state
|}

====2.2 Semantic Trajectory Models====
In the last decade, accounting for the semantics of movement has received a lot of attention in the trajectory data modeling and analytics literature. Pivotal to this has been the proposal to view a trajectory as “the user-defined record of spatiotemporal evolvement of the position of a MO, during a given time interval of its lifespan, and in order to achieve a certain goal”: <math>[t_{begin}, t_{end}] \rightarrow space</math> [24]. In the same work, a purposefully generic way of semantically segmenting a trajectory into stops and moves was also established, leaving its implementation to be specified at the application level. For example, [3] adopted the conceptual TM of [24] and defined stops based on temporal stay value thresholds. Similarly, [7] adopted the conceptual TM of [24] and associated stops with important visited places, before extending it with fundamental data mining concepts in order to support frequent/sequential patterns and association rules.

More recently, in [4, 5], the authors propose a general conceptual modeling framework aimed at connecting the analysis of movement data with its spatiotemporal context, which is defined as the physical space and time where movement takes place together with the objects and events that co-exist in it. Their framework exhaustively categorizes the types of information that can be represented by movement data. First, it breaks movement down to its most essential elements: the set of locations S (space), the set of time units T (time instants or intervals), and the set of objects O (physical and abstract entities). Their elements may have properties represented as spatial, temporal, or thematic attribute values, which in turn may involve other elements of S, T, O. The framework does not address semantic modeling, apart from proposing dynamic thematic attributes, said to represent any attribute available in the movement data or “any other existing or conceivable thing”, which can be thought of as the equivalent of semantic annotations used in other semantic TMs.

SeMiTri [25] is an application-independent framework for the semantic enrichment of raw GPS trajectories in the form of annotations based on spatial and temporal properties of raw data streams. The enrichment happens either at a low level via the notion of a “semantic place” <math>sp_i \in P = P_{region} \cup P_{line} \cup P_{point}</math>, which represents a meaningful geographic object (with a Region Of Interest (ROI), a Line Of Interest (LOI), or a Point Of Interest (POI) as its extent), or at a high level via the notion of an “episode”, the abstraction of a subsequence of the spatiotemporal trajectory’s points that are highly correlated with respect to some identifiable spatiotemporal feature (e.g. velocity, time interval).

The conceptual semantic TM proposed by [21] is similarly structured as a sequence of potentially annotated timestamped coordinate positions or episodes. An annotation is defined as any additional data (captured or inferred) that enrich the knowledge about a trajectory or any part thereof. It can be an attribute value, a link to an object, or a complex value composed of both. Also, an “episode” is defined verbatim from [25] as “a maximal subsequence of a semantic trajectory, such that all its spatiotemporal positions comply with a given predicate, bearing on the spatiotemporal characteristics of the positions”. Lastly, temporal gaps in the movement track greater than the sampling rate of raw data are said to be either accidental (“holes”) or intentional (“semantic gaps”), in which case their list forms part of the main TM.

Finally, CONSTAnT [8] is a conceptual semantic TM that resembles the TM of [21], but supports more strictly defined types of trajectory semantics. A trajectory T is defined as an ordered list of timestamped (x, y) coordinate points. Enriched with contextual information, a semantic trajectory is defined as the tuple <math>(tid, oid, S, g, d)</math>, where oid is the MO identifier, S is a list of semantic subtrajectories, g is the general goal of the trajectory (i.e. the reason/objective of the movement), and d is the device that generated the trajectory. g is required and S must contain at least one semantic subtrajectory, which means that a semantic trajectory must have exactly one goal and at least one meaningful part. Moreover, a semantic subtrajectory s ⊂ T is defined as a list of consecutive semantic points, that corresponds to at least a goal, or a means of transportation, or a behavior, if not to multiple ones. Lastly, a semantic point p ⊂ s is defined as a coordinate point, annotated with a set of environments related to where it was collected and/or with a set of places where it is located.

More generally, in the earlier semantic TM literature, semantics were largely exhausted in the names and types of the geographic places of interest related to the MO’s physical stops. Efforts have since been undertaken to integrate movement ontologies, linked open data, information extracted from social network platforms, or complementary case-specific datasets, with spatiotemporal trajectory data. But they have largely concerned outdoor contexts, as made evident by the terminology (e.g. “traveling objects” [24]) and definitions introduced. On the contrary, a Semantic Indoor Trajectory Model (SITM) needs to at least consider the building’s topology and space semantics. The interior of buildings is typically divided into clearly delimited spatial entities such as rooms, halls, floors. Thus, its physical segmentation already holds a considerable amount of semantic information.

===3 SEMANTIC INDOOR TRAJECTORY MODEL===
In this section, we define a semantic indoor trajectory model (SITM) aimed at supporting:
* all types of indoor settings;
* both human and inanimate moving objects (from hereon both referred to as MOs);
* mining and analysis applications using both statistical and reasoning approaches in order to provide insight both at the individual and collective level.

====3.1 Model Components====
The proposed SITM mainly consists of a semantically enriched sequence of an individual MO’s spatiotemporal presence, but also makes use of a semantically enriched representation of indoor space.

The semantically enriched representation of indoor space that we propose is represented as a layered multigraph. Its nodes symbolically represent indoor spatial regions, and its edges represent topological relationship information between those regions. Static semantic information about the regions is represented through node classes and attributes as well as node-edge grouping into layers. The proposed representation is compatible with OGC’s IndoorGML standard and can be viewed as an extension of it. It is described in Subsection 3.2.

The semantically enriched representation of an individual MO’s trajectory that we propose is a couple consisting of a trace of consecutive presence intervals inside the indoor regions represented by the graph’s nodes, and a set of semantic annotations describing the trajectory in its entirety. It is semantically enriched and uses the above indoor space representation. It is described in Subsection 3.3.

====3.2 Indoor Space Modeling====
Based on the modeling framework provided by the IndoorGML standard and in particular its Multi-Layered Space Model (MLSM), we represent a 2D multiple floor (i.e. 2.5D) indoor space as a layered multigraph <math>G = (V, E)</math> where
:<math>V = \bigcup_{i=0}^{m} V_i</math>
and
:<math>E = \bigcup_{i=0}^{m} E_i^{acc} \cup E^{top}</math>
G comprises m + 1 different layers of nodes and edges, each constituting an accessibility Node-Relation Graph (NRG):
:<math>G_i = (V_i, E_i^{acc}) \quad (0 \le i \le m)</math>
that corresponds to a different decomposition of the indoor space. On the one hand, node <math>v \in V_i</math> represents a cell belonging to the i-th layer and an edge <math>e \in E_i^{acc} \subseteq V_i \times V_i</math> represents the accessibility between two cells of the i-th layer. On the other hand, a joint edge <math>e' \in E^{top} \subseteq V_i \times V_j</math> represents a binary topological relationship between two cells of different layers (<math>i \neq j</math>). Figure 2 illustrates an example of such an indoor space graph representation consisting of five hierarchical layers (detailed in Section 4), but in general G need not be strictly hierarchical.

In the proposed indoor space model, we adopt IndoorGML’s implicit assumption that each node belongs to only one layer: <math>\bigcap_{i=0}^{m} V_i = \emptyset</math>. If a node is relevant to multiple layers then it is essentially replicated in each one and all the copies are connected to each other via “equal” joint edges. Moreover, assuming that cells represent the physical reality of planar space (instead of a conceptual space) and that same-layer cells do not overlap at all, an intra-layer edge <math>e \in E_i^{acc}</math> actually presupposes the “meet” relation between its two cells, because they need to share a common surface for the MO to be able to physically transition between them. At the same time, as explained in Section 2, a joint edge <math>e' \in E^{top}</math> signifies that either one of the “overlap”, “contains”, “insideOf”, “covers”, “coveredBy”, or “equal” topological relations holds between the two cells that it connects. Thus, intra-layer edges and inter-layer edges are always of a different type, and therefore G can be considered as an edge-coloured multigraph which can be mapped to a multilayer network [18].

An important modeling decision is whether G is directed or not. Although IndoorGML does not explicitly assume either case, it considers undirected edges in all of its examples. As far as intra-layer edges go, we can think of “adjacency” and “connectivity” as being symmetric relations. However, “accessibility” is not symmetric since often indoor movement is only unidirectionally possible due to technical, safety or other limitations. In Figure 1 for example, room 4 (“Salle des Etats”) houses the “Mona Lisa” and accommodates a vast number of visitors on a daily basis. To facilitate their flow, entering it from room 2 is often prohibited by the museum personnel while exiting it that way is allowed. Therefore, we assume directed accessibility NRGs. As far as joint edges go, while “overlap” and “equal” can be thought of as symmetric binary relations, “contains” and “covers” cannot. Therefore, we also assume directed joint edges (Figure 2). If we wanted to simply model intersection non-emptiness, instead of the specific nature of the relation, then undirected joint edges would suffice.

In our model, we define a layer hierarchy as k ≥ 2 ordered layers <math>G_i</math> (0 ≤ i ≤ k) of G that are only consecutively connected by joint edges. Similar to [17], we exclude “overlap” relations from layer hierarchies, but contrary to it, we also exclude “equal” relations to prohibit node repetition and instead favor a proper hierarchy. Instead of [17]’s “inside” and “coveredBy”, we assume “contains”, “covers”, and a corresponding top to bottom joint edge direction. Furthermore, we account for the fact that virtually any indoor environment is characterized by a basic three-layer hierarchy consisting of: a “Building” layer, a “Floor” layer, and a “Room” layer. The latter is loosely named as it may actually contain any type of room-level navigable spatial cell, such as rooms, chambers, halls, lobbies, cellars, terraces, corridors, hallways, big staircases. Therefore, G includes k layers representing static hierarchical levels of spatiosemantic granularity (3 ≤ k ≤ m). Other layers are optional and may also integrate with this core layer hierarchy.

It is thus evident that there can be layer hierarchies that comprise either topographic layers, or semantic layers, or both. Our core hierarchy is indeed a mixed one. The “Building” and “Floor” layers are spatially defined, since the architectural structure alone is mostly enough to determine which space constitutes a building and which space constitutes a floor. The “Room” layer is predicated both spatially and semantically: it should not contain cells of vastly different sizes, but it may contain cells whose boundaries are not necessarily physical (e.g. functionally independent subspaces of a big hall or of a great room).

Additionally, two optional layers are proposed for typical cases: a “Building Complex” root layer and a “Region of Interest (RoI)” leaf layer (Figure 2). We define the “Building Complex” layer to represent the indoor space of a site comprised of multiple buildings, such as a hospital spanning multiple attached wings or a university campus spanning multiple independent edifices. We define the “RoI” layer to represent navigable sub-room level spatial cells of application-specific interest, such as “you are here” map installations in a shopping mall or individual exhibit displays in a museum (Figure 4). The “Building Complex” and “RoI” layers are only relevant per case. When present, they can be properly integrated into the aforedescribed core layer hierarchy: “Building Complex” → “Building” → “Floor” → “Room” → “RoI”. Then, a “Floor” object describes a single building’s floor level (e.g. FloorA1, FloorB1 in Figure 2). Ad-hoc refinements of the hierarchy are still possible in extremely particular cases (e.g. architectures with indistinguishable floor levels) as long as joint edges represent “contains” or “covers” relations and do not skip layers.

Figure 2: The required core layer hierarchy extended with a multi-building root layer and an intra-room region layer.

A static predefined layer hierarchy (e.g. Figure 2), as opposed to local ad-hoc node subdivisioning (e.g. Figure 1), allows a structured reasoning about the trajectories at multiple levels of granularity. By only allowing “proper part” types of relationships, we allow inference of a MO’s location at all levels of granularity above the detection data level. This in turn allows developing reasoning mechanisms to cope with missing or uncertain location information. It also enables the identification of certain types of movement patterns at the “room” level for instance, and at the same time of other types of patterns at the “floor” level, from the same trajectory dataset. Finally, hierarchies simplify the conceptual indoor space data model thanks to the transitivity of parthood (isomorphic to set inclusion) in classical mereology: a layer hierarchy only needs to connect to other layers or layer hierarchies at the lowest possible level, since a relation (e.g. “overlap”) between two nodes will also hold between their predecessors.

====3.3 Semantic Indoor Trajectory Modeling====
Automatically collected raw movement data typically consist of spatiotemporal records, out of which individual trajectories can be extracted. Depending on the application and on the type of MO, only the evolution of its representative location may be relevant (e.g. museum visit analysis) or perhaps also its shape and parts’ movements (e.g. sports performance analysis). In the former case, a trajectory is typically represented as a sequence of timestamped spatial points, as explained in Subsection 2.2. Due to a building’s clearly separated spaces, we consider regions (instead of points) as our primary primitive spatial entities, in the spirit of Qualitative Spatial Representation [10] and IndoorGML’s cellular space [19], both described in Section 2.

Definition 3.1 (semantic trajectory). A semantic trajectory is defined as the couple of its spatiotemporal trace and the set <math>A_{traj}</math> of semantic annotations describing it in its entirety, as given by the following equation:
:<math>T_{ID_{mo},\, t_{start},\, t_{end} } = (trace_{ID_{mo},\, t_{start},\, t_{end} },\ A_{traj})</math>
where <math>ID_{mo}</math> is the identifier of the MO concerned, and <math>t_{start}</math> and <math>t_{end}</math> are the trajectory’s starting and ending timestamps. Moreover, <math>trace_{ID_{mo},\, t_{start},\, t_{end} }</math> represents the spatiotemporal aspect of the trajectory defined as a sequence of timestamped semantically annotated presence periods/intervals at states of the indoor space graph G.

The second element of the couple in Def. 3.1 is a non-empty set of semantic annotations characterizing the trajectory in its entirety. A trajectory semantic annotation <math>a_{traj} \in A_{traj}</math> is not confined within specific types of information, but would typically be chosen to represent an activity, a behavior, or a goal showcased by the complete trajectory. These terms are often ambiguously used in trajectory literature. Here, we consider an “activity” to concern more targeted/conscious actions than a “behavior”, which concerns less intentional actions or reactions. Both describe the actuality of movement. A “goal” might instead concern the potentiality of movement (e.g. a disrupted activity).

Definition 3.2 (semantic trajectory trace). A semantic trajectory trace is defined as follows:
:<math>trace_{ID_{mo},\, t_{start},\, t_{end} } = (e_i, v_i, t_i^{start}, t_i^{end}, A_i)_{i \in [1,n]}</math>
where <math>e_i = (v_{i-1}, v_i) \in \bigcup_{i=0}^{m} E_i^{acc}</math> is the transition, i.e. boundary crossed, that led the MO from state <math>v_{i-1}</math> to state <math>v_i</math> at time <math>t_i^{start}</math>, where it stayed until time <math>t_i^{end}</math>. Moreover, <math>A_i</math> is a potentially empty set of semantic annotations describing that specific stay. Given that each layer’s NRG is a multigraph, it is generally useful to know the specific transition <math>e_i</math> (e.g. which door, staircase, or elevator was used), albeit optional<sup>2</sup>.

<sup>2</sup> For applications where individual transitions bear a dynamic semantic load (e.g. setting off an alarm with some probability), we can extend the TM with semantic transition annotations, effectively substituting <math>e_i</math> with <math>e_i^{sem} = (e_i, A_i^{trans})</math>.

For example, the spatiotemporal trace of a museum visitor’s 3-hour visit (on a given day) might resemble the following:
:<math>trace_{ID_{vis},\, 11{:}30{:}00,\, 14{:}28{:}00} = \{</math>
:: (_, room001, 11:30:00, 11:32:35, ∅),
:: (door012, hall003, 11:32:31, 11:40:00, ∅),
:: ...
:: (door005, room006, 14:12:00, 14:28:00, ∅) }

We define a semantic subtrajectory as being for all practical purposes a semantic trajectory (similar to how a mathematical subsequence is itself a sequence) but necessarily referable to some other main semantic trajectory:

Definition 3.3 (semantic subtrajectory). Given a semantic trajectory
:<math>T_{ID_{mo},\, t_{start},\, t_{end} } = (trace_{ID_{mo},\, t_{start},\, t_{end} },\ A_{traj})</math>
a semantic subtrajectory of it is defined as:
:<math>T'_{ID_{mo},\, t'_{start},\, t'_{end} } = (trace'_{ID_{mo},\, t'_{start},\, t'_{end} },\ A'_{traj})</math>
iff <math>trace'</math> is a proper subsequence of <math>trace</math>: <math>t_{start} \le t'_{start} < t'_{end} < t_{end}</math> or <math>t_{start} < t'_{start} < t'_{end} \le t_{end}</math>.

A subtrajectory’s set of semantic annotations <math>A'_{traj}</math> may or may not be the same as that of its main trajectory <math>A_{traj}</math>, contrary to [8], where a subtrajectory is enriched with different types of semantic information than its main trajectory.

Moreover, in the following, we define an episode of a semantic trajectory as any particularly meaningful part of it.

Definition 3.4 (episode). Given a semantic trajectory
:<math>T_{ID_{mo},\, t_{start},\, t_{end} } = (trace_{ID_{mo},\, t_{start},\, t_{end} },\ A_{traj})</math>
an episode of it is defined as
:<math>T'_{ID_{mo},\, t'_{start},\, t'_{end} } = (trace'_{ID_{mo},\, t'_{start},\, t'_{end} },\ A'_{traj})</math>
iff (1) <math>T'_{ID_{mo},\, t'_{start},\, t'_{end} }</math> is a semantic subtrajectory of <math>T_{ID_{mo},\, t_{start},\, t_{end} }</math>, (2) <math>A'_{traj} \neq A_{traj}</math>, and (3) it satisfies a given spatiotemporal and/or semantic predicate:
:<math>P_{ep} : T'_{ID_{mo},\, t'_{start},\, t'_{end} } \rightarrow \{true, false\}</math>
where <math>P_{ep}</math> is domain-dependent and user-defined.

Moreover, an episodic segmentation of a semantic trajectory is simply any subset of its episodes that covers it time-wise. Contrary to typical literature practice, we allow an episodic segmentation to contain episodes that overlap in time, since the exact same movement part may have multiple meanings depending on the broader context. An example illustrative of the museum domain is given in the next Section.

Finally, the SITM is event-based in the sense that only a change of the spatial cell that the MO is located in, or a change of the semantic information regarding the MO’s presence in that cell, needs to be accompanied by a new tuple and a corresponding timestamp. Hence, in the previous museum visit example the last presence interval could be split if the visitor changes his goal while in room006 (which hosts both exhibits and the gift shop): (door005, room006, 14:12:00, 14:21:45, {goals:[“visit”]}) and (_, room006, 14:21:46, 14:28:00, {goals:[“visit”,“buy”]}). This modeling approach allows us to integrate different data sources in order to semantically enrich the trajectory.

===4 THE LOUVRE CASE STUDY===
In this section, we present a compelling trajectory dataset from the world’s most frequented museum, the Louvre Museum.

====4.1 Visitor Movement Dataset====
[...] version of the museum map for navigation purposes. The Louvre has already been the object of visitor mobility research in the past leading to interesting conclusions [27], but the current beacon infrastructure offers improved tracking coverage and continuity.

In the obtained dataset, raw geometric positions have already been spatially aggregated into 52 non-overlapping zones. Each zone corresponds to a large polygonal area of the museum (Figures 3 and 5) specified by the museum administration in such a way so as to reflect a single exhibition theme (e.g. Italian paintings) but also only extend within a single floor. In more detail, our dataset consists of 4,945 visits (continuously collected from 19-01-2017 to 29-05-2017), where each visit consists of a sequence of timestamped “zone detections”, i.e. detections of the visitor’s smartphone inside a certain zone. The duration of a visit ranges from 0 sec (potential error) to 7 hours, 41 min and 37 sec, whereas the duration of a zone detection ranges from 0 sec (potential error) to 5 hours, 39 min and 20 sec. The visits were performed by 3,228 different visitors using both the iPhone and Android app versions. Out of them, 1,227 were “returning” visitors who made 1,717 second/third visits, although not necessarily on different days. The dataset includes 20,245 zone detections and 15,300 (intra-visit) zone transitions in total.

Unfortunately, the trajectories obtained from the dataset are sparse, since a visitor may delay launching the app or stop using it early in the visit for a variety of reasons, ranging from battery depletion to lack of engagement or sporadic navigation-only usage. Moreover, around 10% of the zone detections have a duration of zero value, forcing us to filter them out as detection errors.

====4.2 Model Instantiation====
In order to instantiate the SITM presented in Section 3 for the Louvre case, we need first to represent the Louvre’s indoor spaces according to the proposed graph-based model. This is done in Figure 2. Although the Louvre’s multi-layered graph is prohibitively large to be included in this paper, we cite hereafter its correspondences with respect to Figures 3 and 5: Layer 4 is instantiated as the whole “Louvre Museum”, Layer 3 as its three wings (“Richelieu”, “Denon”, and “Sully”) as well as the “Napoleon” area (under the Pyramide), Layer 2 as a wing’s five different floors (-2, -1, 0, +1, +2), Layer 1 as a floor’s rooms and halls (hundreds in total), and Layer 0 as a room’s exhibits (several hundreds of the most important ones). In addition, we add a semantic layer that happens to fall right between Layer 2 and Layer 1, representing the thematic zones of our dataset as described in Subsection 4.1 (Figure 3). Layer 4 actually represents a level above any specific building, denoting whether a visitor is at the Louvre in general. Layer 3 treats each wing of the museum as a separate building because its spaces and usage are practically equivalent to that of a typical building. In Layer 0, we opted to define a RoI as the predefined spatial area of engagement with the corresponding exhibit, outside of which a visitor is certainly not paying attention to it.
For simplicity, a RoI includes the area physically taken In July 2016, the Louvre launched its official “My Visit to the up by the exhibit itself and its display installation (i.e. no holes). Louvre” smartphone application, which takes advantage of a Finally, an interesting space modeling decision concerns whether large Bluetooth Low Energy (BLE) beacon infrastructure3 and or not to assume that the spatial region represented by a node the smartphone’s accelerometer and compass, in order to estimate in layer i + 1 is fully covered by the union of the spatial regions the visitor’s (lat,long) coordinate position within the museum. represented by its child nodes in layer i. For example, is a floor This is accomplished via BLE Received Signal Strength Indicator fully covered by the rooms it contains (Figure 2)? Although not (RSSI)-based trilateration, extended Kalman and particle filtering explicitly stated, the IndoorGML standard and related works (e.g. techniques. The app visualizes the position over a locally stored [17]) seem to adhere to a full-coverage hypothesis. This has the 3 Around 1800 beacons installed across all five floors. advantage that accessibility relations need only be captured at Figure 3: Choropleth map of visitor detections in the Louvre’s 11 ground floor polygonal zones. the lowest possible level of the hierarchy, from where they can be inferred for the higher levels. However, it is often an unre- alistic assumption [20]. In Figure 4 for instance, the RoIs of the displayed exhibits do not completely cover their room’s surface. Figure 5: A Louvre visit trajectory may contain two over- lapping “exit museum” and “buy souvenir” episodes. hosts the temporary exhibition of the Louvre which requires a separate ticket to enter. Thus, we would expect that δt 1 ≫ δt 2 . There are many such interesting examples, where cell semantics could help, not only explain the results of, but potentially even redesign, existing sequential pattern mining methods. 
It is now more apparent why our SITM allows for overlap- ping episodes instead of requiring mutually exclusive episode Figure 4: Indicative representation of the RoIs contained predicates (as in [26] for example). For instance, if a given visitor within zones 60854 and 60853 of the Louvre. (Figure 5) has visited the temporary exhibition (hosted in E) and wishes to leave the museum, he may take the path E→P→S→C Having instantiated the Louvre’s indoor space representation, before his trace disappears, as he is leaving the museum through the SITM is used to extract (from the zone detection data) the the Carousel exit (C). However, he may also want to first buy Louvre visit trajectories as sequences of presence intervals in something from the souvenir shops (hosted in S). Hence, when the museum’s thematic zones. Figure 6 depicts the accessibil- considering a goal-related episodic segmentation of his trajectory, ity topology of the 30 zones present in the dataset, which was we may tag the whole E→P→S→C part with the “exit museum” extracted by hand on site and can therefore also assist in filter- goal and its E→P→S subsequence with the “buy souvenir” tag. ing out data errors. The figure’s lower part corresponds to the More generally, any part of the MO’s trajectory may correspond −2 floor of the museum, and a short sub-visit of a random visi- to multiple episodes (goal-related or otherwise). tor in February 2017 is drawn over it: at time t 1 the visitor was detected in Zone60887 (i.e. E in Figure 5) for a duration of δt 1 , 5 CONCLUSIONS AND FUTURE WORK and at time t 2 he was detected in Zone60890 (i.e. S in Figure 5) In this work, we presented an indoor space representation based for a duration of δt 2 . From the zone layer NRG (Figure 6) we on the IndoorGML standard [19] and using a hierarchical graph can infer that although never detected there, the visitor must structure similar to [17]. The main difference is that we require have passed from Zone60888 (i.e. 
P in Figure 5). In our SITM, a static hierarchy of three basic layers (building, floor, room) this would be captured with the addition of an extra tuple in and propose two more typical layers (building complex, intra- the sequence, e.g.: (checkpoint002, zone60888, 17:30:21, 17:31:42, room region of interest), thus avoiding ad-hoc subdivisions of {goals:[“cloakroomPickup”,“souvenirBuy”,“museumExit”]}) space. Motivated by our case study involving a museum visitor The semantics of places also offer us valuable insight about mobility dataset, containing spatially aggregated timestamped the visitor’s trajectory. For instance, we know that the visitor detections, we instantiated the space representation, also adding disappearing after Zone60890 is normal because it is one of the a case-specific semantic layer of “thematic zones” that matches Louvre’s exit zones (through the Carrousel Hall). Also, Zone60887 the granularity of our data. We also explained how a sequence the 15th Annual ACM International Symposium on Advances in Geographic Information Systems (GIS ’07). ACM, New York, NY, USA, 22:1–22:8. [4] Gennady Andrienko, Natalia Andrienko, Peter Bak, Daniel Keim, Slava Kisile- vich, and Stefan Wrobel. 2011. A conceptual framework and taxonomy of techniques for analyzing movement. Journal of Visual Languages & Computing 22, 3 (2011), 213 – 232. [5] Gennady Andrienko, Natalia Andrienko, and Marco Heurich. 2011. An Event- based Conceptual Model for Context-aware Movement Analysis. International Journal of Geographical Information Science 25 (2011), 1347–1370. [6] Thomas Becker, Claus Nagel, and Thomas H. Kolbe. 2009. Supporting Con- texts for Indoor Navigation Using a Multilayered Space Model. In 2009 Tenth International Conference on Mobile Data Management: Systems, Services and Middleware. 680–685. [7] Vania Bogorny, Carlos Alberto Heuser, and Luis Otavio Alvares. 2010. A Conceptual Data Model for Trajectory Data Mining. 
In Proceedings of the 6th International Conference on Geographic Information Science (GIScience’10) (Lecture Notes in Computer Science). Springer-Verlag, Berlin, Heidelberg, 1–15. [8] Vania Bogorny, Chiara Renso, Artur Ribeiro de Aquino, Fernando de Lucca Siqueira, and Luis Otavio Alvares. 2014. CONSTAnT – A Concep- tual Data Model for Semantic Trajectories of Moving Objects. Transactions in GIS 18, 1 (2014), 66–88. [9] Juan Chen, Anthony G. Cohn, Dayou Liu, Shengsheng Wang, Jihong Ouyang, and Qiangyuan Yu. 2015. A survey of qualitative spatial representations. The Knowledge Engineering Review 30, 1 (2015), 106–136. [10] Anthony Cohn, Brandon Bennett, John Gooday, and Mark Gotts. 1997. Qual- itative Spatial Representation and Reasoning with the Region Connection Calculus. GeoInformatica 1 (1997), 275–316. [11] Abdoulaye Diakité, Sisi Zlatanova, and Ki Joune Li. 2017. About the subdivision Figure 6: Based on the chain topology of zones, a visitor’s of indoor spaces in IndoorGML. In 12th 3D Geoinfo Conference. 41–48. presence in Zone 60888 (blue zone) can be inferred. [12] Martin Doerr, Christian-Emil Ore, and Stephen Stead. 2007. The CIDOC Conceptual Reference Model: A New Standard for Knowledge Sharing. In Tutorials, Posters, Panels and Industrial Contributions at the 26th International Conference on Conceptual Modeling (ER ’07). Australian Computer Society, Inc., of presence intervals in symbolic indoor areas, coupled with se- 51–56. mantic annotations, and flexible concept definitions, can produce [13] Max Egenhofer and John Herring. 1992. Categorizing binary topological relations between regions, lines and points in geographic databases. The a Semantic Indoor Trajectory Model (SITM) that adopts good 9-Intersection, Formalism and Its Use For Natural-Language Spatial Predicates practices from state-of-the-art semantic outdoor TMs. 94 (1992), 1–28. We will next focus on developing new data mining methods [14] Max J. Egenhofer and Franzosa Robert D. 1991. 
Point-set topological spatial relations. International Journal of Geographical Information Systems 5, 2 (1991), that exploit the expressiveness of the SITM, and on proposing 161–174. semantic similarity metrics for trajectories (e.g. for visitor profil- [15] Stephen C. Hirtle and John Jonides. 1985. Evidence of hierarchies in cognitive ing). In the future, it would be interesting to integrate the indoor maps. Memory & Cognition 13, 3 (1985), 208–217. [16] Christian Søndergaard Jensen, Hua Lu, and Bin Yang. 2010. Indoor - A New space representation with formal ontologies of cultural heritage Data Management Frontier. IEEE Computer Society Data Engineering Bulletin information (e.g. CIDOC Conceptual Reference Model [12]). Also, 33, 2 (2010), 12–17. [17] Hae-Kyong Kang and Ki-Joune Li. 2017. A Standard Indoor Spatial Data Model modeling conceptual instead of physical trajectories could be - OGC IndoorGML and Implementation Approaches. ISPRS International compelling in the museum domain, where an interpretation of Journal of Geo-Information 6, 4 (2017), 1–25. visitor movement based on “focus of attention” is sometimes [18] Mikko Kivelä, Alex Arenas, Marc Barthelemy, James P. Gleeson, Yamir Moreno, and Mason A. Porter. 2014. Multilayer networks. Journal of Complex Networks even more important than one based on physical presence. With 2, 3 (2014), 203–271. regards to the Louvre case, it would be of interest to account [19] Jiyeong Lee, Ki-Joune Li, Sisi Zlatanova, Thomas H. Kolbe, Claus Nagel, and for the problem of data sparsity by restructuring longer indica- Thomas Becker. 2018. OGC IndoorGML. Technical Report. Open Geospatial Consortium. tive visits from the actual fragmented zone sequences. However, [20] Hua Lu, Chenjuan Guo, Bin Yang, and Christian S. Jensen. 2016. Finding the data can already provide some interesting insight albeit at a Frequently Visited Indoor POIs Using Symbolic Indoor Tracking Data. 
In Pro- ceedings of the 19th International Conference on Extending Database Technology coarse level of granularity (e.g. floor-switching patterns). (EDBT2016). 449–460. [21] Christine Parent, Stefano Spaccapietra, Chiara Renso, Gennady Andrienko, ACKNOWLEDGMENTS Natalia Andrienko, Vania Bogorny, Maria Luisa Damiani, Aris Gkoulalas- Divanis, Jose Macedo, Nikos Pelekis, Yannis Theodoridis, and Zhixian Yan. The authors would like to thank Anne Krebs for her cooperativity 2013. Semantic Trajectories Modeling and Analysis. ACM Comput. Surv. 45, 4 and her multifaceted help and Artus Gosselin for his contribution (2013), 42:1–42:32. [22] Donna J. Peuquet. 2002. Representations of Space and Time. Guilford Press. in developing the code for Figure 6. [23] Jochen Renz. 2002. Qualitative Spatial Reasoning with Topological Information. This work is supported by the TRAJECTOIRES project funded Springer-Verlag, Berlin, Heidelberg. by the French Heritage Science Foundation (EUR-17-EURE-0021). [24] Stefano Spaccapietra, Christine Parent, Maria Luisa Damiani, Jose Antonio de Macedo, Fabio Porto, and Christelle Vangenot. 2008. A Conceptual View on Karine Zeitouni’s work in this paper has been supported by the Trajectories. Data and Knowledge Engineering 65, 1 (2008), 126–146. MASTER project that has received funding from the European [25] Zhixian Yan, Dipanjan Chakraborty, Christine Parent, Stefano Spaccapietra, and Karl Aberer. 2011. SeMiTri: A Framework for Semantic Annotation of Union’s Horizon 2020 research and innovation programme under Heterogeneous Trajectories. In Proceedings of the 14th International Conference the Marie-Slodowska Curie grant agreement N. 777695. on Extending Database Technology (EDBT/ICDT ’11). ACM, New York, USA, 259–270. [26] Zhixian Yan, Christine Parent, Stefano Spaccapietra, and Dipanjan REFERENCES Chakraborty. 2010. A Hybrid Model and Computing Platform for Spatio- [1] Imad Afyouni, Cyril Ray, and Claramunt Christophe. 2012. 
Spatial models semantic Trajectories. In The Semantic Web: Research and Applications (Lecture for context-aware indoor navigation systems: A survey. Journal of Spatial Notes in Computer Science). Springer, Berlin, Heidelberg, 60–75. Information Science 1, 4 (May 2012), 85–123. [27] Yuji Yoshimura, Anne Krebs, and Carlo Ratti. 2017. Noninvasive Bluetooth [2] Imad Afyouni, Cyril Ray, and Christophe Claramunt. 2017. Representation: Monitoring of Visitors’ Length of Stay at the Louvre. IEEE Pervasive Computing Indoor Spaces. American Cancer Society, 1–12. 16, 2 (2017), 26–34. [3] Luis Otavio Alvares, Vania Bogorny, Bart Kuijpers, Jose Antonio Fernandes de Macedo, Bart Moelans, and Alejandro Vaisman. 2007. A Model for Enrich- ing Trajectories with Semantic Geographical Information. In Proceedings of
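As an illustration of Definitions 3.1 and 3.3 and of the event-based splitting described in Section 3, the following is a minimal Python sketch rather than the authors' implementation. The class and function names (`PresenceInterval`, `SemanticTrajectory`, `is_subtrajectory`, `split_interval`) are hypothetical, the MO identifier `"vis"` is a placeholder, and only the three tuples actually shown in the museum visit example are used (the elided middle part stays omitted). The subtrajectory check covers only the timestamp condition of Definition 3.3, on the assumption that both traces belong to the same MO.

```python
from dataclasses import dataclass, field
from datetime import time
from typing import List, Optional


@dataclass
class PresenceInterval:
    """One tuple of a trace: (transition, cell, t_start, t_end, annotations)."""
    transition: Optional[str]  # e.g. "door012"; None stands for "_"
    cell: str                  # symbolic cell, e.g. "room006"
    t_start: time
    t_end: time
    annotations: dict = field(default_factory=dict)  # {} stands for the empty set


@dataclass
class SemanticTrajectory:
    """Definition 3.1: a trace plus trajectory-level annotations A_traj."""
    mo_id: str
    trace: List[PresenceInterval]
    annotations: dict = field(default_factory=dict)

    @property
    def t_start(self) -> time:
        return self.trace[0].t_start

    @property
    def t_end(self) -> time:
        return self.trace[-1].t_end


def is_subtrajectory(sub: SemanticTrajectory, main: SemanticTrajectory) -> bool:
    """Timestamp condition of Definition 3.3 (same MO, strictly shorter window)."""
    if sub.mo_id != main.mo_id:
        return False
    a, b = main.t_start, main.t_end
    c, d = sub.t_start, sub.t_end
    return (a <= c < d < b) or (a < c < d <= b)


def split_interval(p, t_change, t_resume, before, after):
    """Event-based split: the cell is unchanged, but the annotations change."""
    first = PresenceInterval(p.transition, p.cell, p.t_start, t_change, before)
    second = PresenceInterval(None, p.cell, t_resume, p.t_end, after)
    return first, second


# Tuples taken from the museum visit example of Section 3.
p1 = PresenceInterval(None, "room001", time(11, 30, 0), time(11, 32, 35))
p2 = PresenceInterval("door012", "hall003", time(11, 32, 31), time(11, 40, 0))
p3 = PresenceInterval("door005", "room006", time(14, 12, 0), time(14, 28, 0))
visit = SemanticTrajectory("vis", [p1, p2, p3])
tail = SemanticTrajectory("vis", [p3])

first, second = split_interval(
    p3, time(14, 21, 45), time(14, 21, 46),
    {"goals": ["visit"]}, {"goals": ["visit", "buy"]},
)
```

Here `tail` qualifies as a subtrajectory of `visit`, whereas `visit` is not a subtrajectory of itself, which matches the "proper subsequence" requirement. The split reproduces the two room006 tuples given in the text.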
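The inference of an undetected zone from the zone-layer NRG (Section 4.2, Figure 6) can be sketched as a shortest-path search over an accessibility graph. This is a hedged illustration, not the paper's procedure: the adjacency below is an assumption that encodes only the chain E, P, S, C mentioned in the text, the label `"C"` is a placeholder because no zone identifier is given for the Carrousel exit, and `shortest_path` and `fill_gaps` are hypothetical helper names.

```python
from collections import deque

# Assumed accessibility graph for the chain E - P - S - C of Figure 5.
ZONE_GRAPH = {
    "Zone60887": {"Zone60888"},               # E: temporary exhibition
    "Zone60888": {"Zone60887", "Zone60890"},  # P
    "Zone60890": {"Zone60888", "C"},          # S: souvenir shops
    "C": {"Zone60890"},                       # Carrousel exit (placeholder label)
}


def shortest_path(graph, src, dst):
    """Breadth-first search; returns the list of zones from src to dst, or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in sorted(graph[node]):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


def fill_gaps(graph, detected):
    """Insert the zones that must lie between consecutive detections."""
    if not detected:
        return []
    filled = [detected[0]]
    for a, b in zip(detected, detected[1:]):
        path = shortest_path(graph, a, b)
        if path is None:
            raise ValueError(f"no accessible route from {a} to {b}")
        filled.extend(path[1:])
    return filled


inferred = fill_gaps(ZONE_GRAPH, ["Zone60887", "Zone60890"])
```

On a chain topology the route between two zones is unique, so the inserted zone is certain. On a general graph the shortest path is only one plausible route, and an unreachable pair of detections would instead point to a data error.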
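The overlapping "exit museum" and "buy souvenir" episodes of Figure 5 can be illustrated with a small sketch of an episodic segmentation that tolerates overlap. As a simplifying assumption, episodes are expressed here as index ranges over the zone sequence E, P, S, C rather than as timestamp windows, and the names `Episode`, `covers` and `overlapping` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    tag: str
    start: int  # index of the first zone of the episode in the path
    end: int    # index of the last zone of the episode (inclusive)


def covers(episodes, n):
    """True if the episodes together cover positions 0..n-1 of the path."""
    covered = set()
    for ep in episodes:
        covered.update(range(ep.start, ep.end + 1))
    return covered >= set(range(n))


def overlapping(a, b):
    """True if two episodes share at least one position."""
    return a.start <= b.end and b.start <= a.end


path = ["E", "P", "S", "C"]
exit_museum = Episode("exit museum", 0, 3)    # E -> P -> S -> C
buy_souvenir = Episode("buy souvenir", 0, 2)  # E -> P -> S
segmentation = [exit_museum, buy_souvenir]
```

The two episodes overlap and together cover the whole path, so they form a valid segmentation in the sense used in Section 3, whereas "buy souvenir" alone would leave C uncovered.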