<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">A Shape-Based Map Matching Approach for Geographic Transferability of Discriminative Subtrajectories</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Cristiano</forename><surname>Landi</surname></persName>
							<email>cristiano.landi@phd.unipi.it</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Pisa</orgName>
								<address>
									<settlement>Pisa</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">ISTI-CNR</orgName>
								<address>
									<settlement>Pisa</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Riccardo</forename><surname>Guidotti</surname></persName>
							<email>riccardo.guidotti@unipi.it</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Pisa</orgName>
								<address>
									<settlement>Pisa</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">ISTI-CNR</orgName>
								<address>
									<settlement>Pisa</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">A Shape-Based Map Matching Approach for Geographic Transferability of Discriminative Subtrajectories</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">56594342527A915FA26D93ED08140E81</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:17+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Map Matching</term>
					<term>Geographic Transferability</term>
					<term>Machine Learning</term>
					<term>Discriminative Subtrajectories</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper addresses the challenge of map matching and geographic transferability in trajectory analysis. Existing methods often face limitations tied to specific coordinates or road networks. In response, we propose GASM, a shape-based map matching method that relies solely on trajectory shapes, irrespective of geographic origin. GASM introduces a symbolic road network representation, facilitating efficient searches based solely on trajectory shapes. Our experimentation, spanning over 5,000 km of roads, demonstrates GASM's ability to accurately position trajectories with an impressive accuracy exceeding 90%. Notably, GASM stands as the first in the literature to perform shape-based symbolic map matching without prior knowledge of the geographic region.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In recent years, the widespread adoption of cutting-edge technologies equipped with Global Positioning System (GPS) devices has enabled the recording of positions for various moving objects, ranging from cars and transportation vehicles to phones and wearables. Unfortunately, the coordinates captured by these sensors often fail to accurately reflect real positions due to physical constraints and/or legal regulations. Nevertheless, in various applications, it is imperative to accurately align GPS trajectories with a road network. For instance, in navigation services, map matching empowers drivers to monitor their exact locations and receive optimal routes to specified destinations. Conversely, in machine learning tasks, map matching enhances users' mobility information by incorporating knowledge related to the territory, such as Points Of Interest (POI), feature engineering <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4]</ref>, or the identification of discriminatory subsequences, such as mobility shapelets <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6,</ref><ref type="bibr" target="#b6">7]</ref>. Without an appropriate map-matching procedure, reliance on an expert becomes necessary to determine which features can be extracted from trajectories concerning the territory. However, the reliance on ad-hoc features restricts the applicability of machine learning methods and amplifies sensitivity to input changes <ref type="bibr" target="#b0">[1]</ref>, rendering it unsuitable for geographic transferability. This implies the challenge of extracting mobility patterns from one geographical region and effectively applying them in another region <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b8">9]</ref>.</p><p>Particularly noteworthy are recent advancements in machine learning leveraging shapelet-based subtrajectories <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6,</ref><ref type="bibr" target="#b6">7]</ref>. Originating from the domain of time series analytics, shapelets represent discriminatory subsequences that encapsulate a collection of distinctive shapes, crucial for discerning specific classes <ref type="bibr" target="#b9">[10]</ref>. Various approaches exist for defining discriminative subtrajectories. In <ref type="bibr" target="#b5">[6]</ref>, the Movelet method is introduced-an approach for extracting discriminative subtrajectories selected through a rigorous statistical test. During the discovery phase, Movelet generates candidate subtrajectories by extracting all possible subsequences with more than two contiguous observations, utilizing a sliding window. Building upon the foundation laid by Movelet, Geolet is introduced in <ref type="bibr" target="#b6">[7]</ref>. This extension incorporates a normalization step after the discovery phase. The normalization step is designed to ease the comparison of discriminative subsequences with trajectories recorded in diverse geographical regions. The underlying rationale is that a subtrajectory pinpointing a sudden break in a road segment in one city should exhibit similarities to a subtrajectory associated with the same event in another city. This normalization enhances the method's adaptability across various geographic contexts.</p><p>While Geolet successfully addresses the limitation of Movelet by providing normalized subtrajectories, thereby enhancing geographic transferability independent of specific GPS coordinates, it introduces a potential vulnerability tied to the road network.</p><p>Our underlying hypothesis is that the less frequently a trajectory occurs, the greater the likelihood that shapebased methods utilizing it as a discriminative subse-quence may not capture the intrinsic features of the trajectory but rather only its geographic position. In essence, if a discriminative subtrajectory is intrinsically linked to a particular road network due to its distinctive shape, it becomes unsuitable for geographic transferability. This limitation arises from the fact that the discriminatory aspect is not rooted in the movements themselves but rather in the structural characteristics of the road network. Consequently, evaluating the geographic transferability of discriminative subtrajectories necessitates a shape-based map-matching approach that exclusively relies on shapes without prior knowledge of the position. Regrettably, to the best of our knowledge, such an approach is currently unavailable. This underscores the need for innovative solutions in the realm of shape-based map matching to comprehensively assess the adaptability of discriminative subtrajectories across diverse geographical contexts.</p><p>To overcome this limitation, our paper introduces GASM, an Geographic Automaton Shape-based map Matching approach. GASM relies solely on the shape of a trajectory to accurately determine its position within the road network. Specifically, GASM employs a symbolic representation to transform the road network, constructing a spatial index independent of coordinates. This unique approach allows for efficient trajectory searches based solely on their shapes. To the best of our knowledge, GASM is the first proposal in the literature that exclusively utilizes a discretized representation of a trajectory's shape, devoid of any knowledge of the geographic region, for shape-based map matching. Our experimentation with GASM on a novel comprehensive geographical dataset spanning over 5,000 km of roads in Tuscany, central Italy, demonstrates its capability to identify correct alignments with an impressive accuracy exceeding 90%. Furthermore, GASM exhibits efficiency, as it can construct the necessary representation for the entire dataset in less than 1.5 hours, maintaining a linear complexity at query time.</p><p>The paper is organized as follows. Section 2 summarises the related works concerning map-matching methods and the challenges posed by geographic transferability. In Section 3, we encapsulate the technical concepts essential for comprehending the algorithm delineated in Section 4. The outcomes of experiments conducted with GASM are detailed in Section 5. Finally, Section 6 encapsulates our findings and delves into potential avenues for future developments.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Works</head><p>In the following, we provide a concise overview of the literature concerning map matching methods and elucidate the geographic transferability problem, introducing key strategies employed to tackle this challenge.</p><p>In the literature of trajectory analysis, a multitude of strategies exists for mapping trajectories onto a road network. For high-frequency sampled trajectories, the simplest approach involves associating each spatio-temporal point with the nearest street segment <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b11">12]</ref>. However, these techniques, while fast and straightforward, have exhibited inaccuracies, particularly at intersections and parallel roads. To address these limitations, enhancements have been introduced, incorporating heading direction or employing a Kalman filter to eliminate outlier points in trajectories <ref type="bibr" target="#b12">[13]</ref>. Alternatively, some approaches leverage probabilistic-based map-matching algorithms, integrating hidden Markov models to identify the most likely sequence of road segments aligning with the trajectory <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b14">15]</ref>. On the other hand, in the context of low-sampled trajectories, much of the existing literature presupposes that the most probable route connecting two successive points is also the shortest or fastest <ref type="bibr" target="#b15">[16]</ref>. However, in <ref type="bibr" target="#b16">[17]</ref>, is introduced a map-matching algorithm that exploits temporal intervals between GPS points. This method identifies the optimal match between two GPS points by selecting the route with the most similar travel time. Also, in <ref type="bibr" target="#b17">[18]</ref> is proposed a method for map matching low-sampled trajectories based on supplementary information such as speed and moving direction, typically collected alongside spatial locations.</p><p>Geographic transferability encapsulates the challenge of extending knowledge gleaned from one geographic region to another. This entails constructing machine learning models capable of adeptly executing tasks in regions distinct from their original training grounds. The core of this challenge emerges from pronounced disparities in data distributions, patterns, and characteristics across diverse geographical locations. As articulated in <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b8">9]</ref>, models trained on data from one region may encounter difficulties in generalization when confronted with the unique variations inherent in another region. In pursuit of a global model, a fundamental strategy is employed as demonstrated in <ref type="bibr" target="#b7">[8]</ref>, where diverse data sources are aggregated on a global scale to build a transferable model. Conversely, in <ref type="bibr" target="#b8">[9]</ref>, city indicators are identified pertaining to road networks, traffic flows, and individual mobility, to facilitate the assessment of similarities between geographical regions. Subsequently, an ensemble classifier is devised, computing the output as a weighted average of outputs generated by individual local classifiers. Notably, the importance of local models is determined by their higher similarity to the target regions compared to others in the ensemble with respect to the city indicators. Alternate methodologies involve adapting a model initially trained in a data-rich region and transplanting it to a target region characterized by limited data availability. This adaptation process entails integrating additional data to compute sub-region similarities, subsequently enabling the remapping of the model <ref type="bibr" target="#b18">[19,</ref><ref type="bibr" target="#b19">20]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Problem Setting</head><p>In this section, we articulate the fundamental concepts essential for comprehending our proposal. Initially, we establish a shared language and framework by introducing notation that serves as a basis for discussing key elements. Subsequently, we delve into the transformation facilitated by Geolet, a catalyst for the motivation behind this work. Ultimately, we present a formal introduction to the problem at hand.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Definition 1 (Trajectory</head><formula xml:id="formula_0">). A trajectory 𝑋 is a sequence of spatio-temporal points 𝑋 = {⃗ x 𝑡 0 , … , ⃗ x 𝑡 𝑚 } ∈ ℝ 𝑚×3</formula><p>where the spatial vectors ⃗</p><p>x 𝑡 𝑗 = (lat 𝑡 𝑗 , long 𝑡 𝑗 ) are sorted by increasing time 𝑡 𝑗 , i.e., ∀1 ≤ 𝑗 &lt; 𝑚 we have 𝑡 𝑗 &lt; 𝑡 𝑗+1 .</p><p>In a sense, trajectories can be viewed as multivariate time series containing two signals, i.e., the latitude and longitude, recorded at non-constant sampling rates <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b20">21,</ref><ref type="bibr" target="#b5">6]</ref>. In order to simplify notation, we will use 𝑗 instead of 𝑡 𝑗 every time. A trajectory classification dataset is a set of trajectories with a vector of labels attached. Formally:</p><formula xml:id="formula_1">Definition 2 (Trajectory Dataset). A trajectory dataset 𝒳 ∈ ℝ 𝑛×𝑚×3 is a set of 𝑛 trajectories, 𝒳 = {𝑋 0 … , 𝑋 𝑛 }.</formula><p>For simplicity, we use a single symbol 𝑚 to denote the lengths of the trajectories, even if a dataset can contain trajectories with a different number of observations. Similarly, we emphasize that there is no constraint on the sampling rate, i.e., we can have a non-constant sample in the same trajectory. Furthermore, we define a subtrajectory as: Definition 3 (Subtrajectory). Given a trajectory 𝑋 of length 𝑚, a subtrajectory 𝑆 = {⃗ 𝑠 𝑗 , … , ⃗ 𝑠 𝑗+𝑙 } ⊂ 𝑋, of length 𝑙 ≤ 𝑚, is an ordered sequence of consecutive values such that 0 ≤ 𝑗 ≤ 𝑚 − 𝑙.</p><p>As previously mentioned, Movelet <ref type="bibr" target="#b5">[6]</ref> and Geolet <ref type="bibr" target="#b6">[7]</ref> are shapelet-inspired <ref type="bibr" target="#b9">[10]</ref> trajectory approaches that identify discriminative subtrajectories for classification purposes. They both select the most discriminative subtrajectories w.r.t. the target label using the mutual information <ref type="bibr" target="#b21">[22]</ref>. Like shapelet-based approaches, Movelet and Geolet extract discriminative subtrajectories that can be used to train any machine learning model <ref type="bibr" target="#b22">[23]</ref>. Indeed, once the most discriminative subtrajectories are identified, a trajectory dataset can be transformed into a tabular representation capturing the distance between trajectories and discriminative subtrajectories through the subtrajectory transform function: </p><p>On the other hand, Geolet computes the best fitting in the same way, but geographically shifting 𝑆 to overlap each subsequence of 𝑋 of length 𝑙. In particular, Geolet extends the best fitting function of Movelet by adding a pre-processing function, shift, that subtracts the value of the first vector of the subsequence from all the others, bestfit Geolet (𝑋 , 𝑆) =</p><formula xml:id="formula_3">𝑚−𝑙 min 𝑗=0 {𝐸𝐷(shift(𝑋 𝑗∶𝑗+𝑙 ), shift(𝑆))} (2)</formula><p>where shift(𝑋</p><formula xml:id="formula_4">) = { ⃗ 𝑥 0 − ⃗ 𝑥 0 , … , ⃗ 𝑥 𝑚 − ⃗ 𝑥 0 }.</formula><p>The shift function makes Geolet suitable for geographic transferability not being tide to the territory.</p><p>Finally, in order to present map matching we define a road network as follows:</p><p>Definition 5 (Road Network). A road network 𝐺 = ⟨𝑉 , 𝐸⟩ is a directed graph where 𝑉 = {𝑣 1 , … , 𝑣 𝑝 } is the set of 𝑝 road junction (or nodes), and 𝐸 = {𝑒 1 , … , 𝑒 𝑞 } is the set of 𝑞 road segments (or edges), where</p><formula xml:id="formula_5">𝑒 𝑖 = (𝑣 𝑖 1 , 𝑣 𝑖 2 ).</formula><p>We underline that we rely on an enhanced road network representation where, for each edge, we also have access to the road segment geometries expressed as a sequence of 𝑘 latitude and longitude, formally shape(𝐸) = { ⃗ 𝑥 0 , … , ⃗ 𝑥 𝑘 } for some 𝐸 ∈ ℰ. As for trajectories, for simplicity of notation, we use a single symbol 𝑘 to denote the lengths of the points describing the geometry of the road segment, even if, in real-case scenarios, the shape can be described using an arbitrary number of points.</p><p>We are now able to formalize the shape-based map matching problem as follows:</p><p>Definition 6 (Shape-based Map Matching). Given a road network 𝒢 = ⟨𝑉 , 𝐸⟩ and a trajectory 𝑋, the shapebased map matching problem consists in finding the best sequence of edges 𝑌 = {𝑒 1 , … , 𝑒 𝑧 } ⊆ 𝐸 such that does not exist another sequence of edges 𝑌 ′ ⊂ 𝐸 different from 𝑌, i.e., 𝑌 ′ ≠ 𝑌, where bestfit Geolet (𝑋 , shape(𝑌 ′ )) &lt; bestfit Geolet (𝑋 , shape(𝑌 )).</p><p>In other words, the shape-based map matching problem involves determining the optimal alignment for a (sub)trajectory 𝑋 within a designated road network 𝐺, relying on the configuration of the edges comprising the road segments. It is essential to emphasize that this map matching endeavor necessitates resolution without any reliance on GPS coordinates as the usage of the shift operator normalizes the trajectory 𝑋 rendering state-of-the-art map matching methods unsuitable for this particular task. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Shape-based Map Matching</head><p>To tackle the shape-based map-matching problem, our aim is to design a map-matching method with the capability to accurately deduce the original GPS coordinates of a trajectory within a designated road network. Crucially, this precision is sought exclusively through an examination of the trajectory's shape and the configurations of the edges within the road network, entirely independent of any reliance on GPS coordinates.</p><p>A brute-force approach to address the problem involves map matching all conceivable alignments of 𝑋 within every segment 𝐸 of the road network 𝐺, employing the bestfit Geolet function. However, this naive strategy is only viable for small road networks due to the algorithmic complexity being 𝑂(|𝐸|(𝑚 − 𝑘)𝑘), where 𝑚 and 𝑘 represent the number of points characterizing the trajectory 𝑋 and the number of points describing each road segments in 𝐸, respectively <ref type="foot" target="#foot_0">1</ref> . This limitation also extends to other map-matching algorithms that rely on latitude and longitude coordinates to confine the matching scope to the nearest roads.</p><p>We overcome this limitation by proposing GASM a Geographic Automaton Shape-based map Matching approach that is able to significantly reduce the number of road segment alignments to test with the brute force method. In essence, GASM comprises two key steps. Initially, leveraging the Aho-Corasick algorithm <ref type="bibr" target="#b23">[24]</ref>, GASM constructs a shape-based index for all road segments in 𝐸, portraying it as a geographic finite state automaton. Subsequently, GASM facilitates querying the automaton to pinpoint a set of candidate partial matches between 𝑋 and {shape(𝑌 )| ∀ 𝑌 ⊆ 𝐸}. Further elucidation of these two steps is provided in the subsequent sections.</p><p>Geographic Automaton Construction. GASM leverages the Aho-Corasick algorithm to construct a geographic automaton, serving as a spatial index for expedited query processing <ref type="bibr" target="#b23">[24]</ref>. The Aho-Corasick algorithm, renowned for string searching, takes a set 𝒲 = {𝑊 1 , … , 𝑊 𝑛 } as input, where each 𝑊 𝑖 represents a finite sequence of symbols over an alphabet Σ. Subsequently, it builds a finite-state automaton based on the sequences in 𝒲 within a given finite symbol sequence defined over the alphabet Σ. Consequently, the automaton, constructed using the dictionary 𝒲, identifies a subset 𝒲 ′ ⊂ 𝒲 wherein each sequence in 𝒲 ′ is contained in 𝑄. Figure <ref type="figure" target="#fig_1">1</ref> provides an illustration of the automaton created by the Aho-Corasick algorithm, using the sequences 𝒲 = {𝐴𝐶, 𝐵, 𝐵𝐶𝐴, 𝐶} over the alphabet Σ = {𝐴, 𝐵, 𝐶, 𝐷}. The Aho-Corasick algorithm initiates by constructing a suffix trie <ref type="bibr" target="#b24">[25]</ref>, depicted in black in Figure <ref type="figure" target="#fig_1">1</ref>. Subsequently, it designates all leaves of the trie as final states of the automaton and introduces edges to complete the automaton. Two types of edges are incorporated, connecting their respective suffixes: suffix edges, depicted in blue, are utilized in the case of a mismatch, without guaranteeing that the suffix is also a sequence in the dictionary. In contrast, dictionary-suffix edges, portrayed in green, guarantee that the suffix is a sequence present in the dictionary. These operations unfold linearly concerning the total number of symbols in the input dictionary 𝒲. The automaton enables the search for all sequences contained in a query by traversing the automaton, achievable in linear time relative to the query's length.</p><p>Algorithm 1 delineates the procedural steps requisite of GASM for constructing the Aho-Corasick automaton. The algorithm accepts, as input, the road segments 𝐸 of the road network 𝐺, the resampling distance 𝑑, the maximum allowed number of symbols 𝛼, and the number of hops ℎ, producing a geographical automaton 𝐴 as output. The GASM-build algorithm begins by aggregating the road network, concatenating ℎ times a road segment to linked road segments in 𝐸 ℎ to extend the length of existing segments and enhance their representativeness (line 1). Subsequently, it initializes an empty dictionary 𝒲 (line 2). The following steps are applied for each road segment in 𝐸 ℎ (denoted as 𝑒). Given that the shape of a road segment 𝑒 may be described by varying numbers of points based on its length and sinuosity, GASM initially resamples the geometries into a series of evenly spaced points 𝑋. This ensures that the symbolic representation's length, crucial for Aho-Corasick automaton construction, is proportional solely to the road length. To fulfill the prerequisite of representing each road segment 𝑒 in a discretized space, a sequence of symbols is generated (line 5). Subsequently, GASM determines the heading direction ⃗ 𝑋 between consecutive points along the resampled road segment, transforming the shape of each road sequence 𝑒 into a univariate time series of directions ⃗ 𝑋 with a consistent length-based sampling rate 𝑑 (line 6). This facilitates the utilization of Symbolic Aggregate approXimation (SAX) <ref type="bibr" target="#b25">[26]</ref> to obtain a symbolic representation of each road segment over an alphabet Σ (line 7). These representations are added to the dictionary 𝒲. Finally, the dictionary of discretized representations of the road segments is employed to construct the Aho-Corasick automaton, which is then returned as the output (line 8).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Shape-based Matching.</head><p>Once the construction of the geographic automaton is complete, GASM can execute shape-based map matching over the automaton following the steps outlined in Algorithm 2. GASM-search takes as input the query trajectory 𝑋, the geographical automaton 𝐴, the road segments 𝐸, the resampling distance 𝑑, and the maximum allowed number of symbols 𝛼. It yields the sequence of edges 𝑌 ⊆ 𝐸 that minimizes the bestfit Geolet function, as per the ensuing procedure. The initial three steps of Algorithm 2, aligning with Algorithm 1, involve resampling the query trajectory 𝑋, extracting its direction, and transforming it into a symbolic representation 𝑄. Indeed, the same preprocessing applied to the road segments 𝐸 is applied to the query trajectories. Subsequently, the geographic automaton 𝐴 is utilized to perform a linear search for the best matches 𝒴 among all possibilities offered by 𝐸 (line 4). This implementation enables GASM to identify an "initial best match", presenting a set of best match candidates 𝒴 = {𝑌 1 , … , 𝑌 𝑛 }. From this set, the final selection of the optimal alignment 𝑌 * is determined through a naive approach (line 5). </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 1</head><p>Tested hyperparameters with their values.</p><p>Figure <ref type="figure" target="#fig_2">2</ref> visually summarise GASM. On the left side, the geographic automaton construction phase is depicted, wherein each road within an arbitrary large road network is indexed according to its heading direction. On the right side, the shape-based matching phase is illustrated. Here, given a trajectory with known shape but unknown origin point, GASM computes the set of potential partial map matches. Subsequently, it selects the match that minimizes the bestfit Geolet function.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Experiments</head><p>In this section, we evaluate the effectiveness of GASM <ref type="foot" target="#foot_3">2</ref> . First, we present the experimental setting, then we report and discuss the best performance achieved. Finally, we illustrate details of the hyperparameter tuning and the result of a sensitivity analysis w.r.t. some data properties.</p><p>Experimental Setting. Regrettably, only a handful of mobility datasets, such as GeoLife and Porto Taxi<ref type="foot" target="#foot_4">3</ref> , are available as open access <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b6">7]</ref>. However, these datasets possess limited geographic coverage, rendering them unsuitable for our study. Thus, we introduce a novel high-sample rate dataset derived from the publicly acces-  sible 2013 GPS traces on OpenStreetMap 4 . Although the initial OpenStreetMap dataset encompasses GPS trajectories spanning the entire globe, our analysis concentrates on the ten provinces in Tuscany, a region encompassing 22, 985𝑘𝑚 2 in central Italy. The Mappymatch python package 5 was employed to map-match each trajectory, retaining only those trajectories with an average error of less than 10𝑚. The final dataset encompasses 358 distinct trajectories, covering a total travel distance of 5, 024𝑘𝑚 and described by 300, 049 GPS points. Additional information on the types of roads traversed in each province in Tuscany, as per the OpenStreetMap taxonomy 6 , is presented in Table <ref type="table">6</ref>.</p><formula xml:id="formula_6">Length</formula><p>Within the framework of our shape-based mapmatching formulation, we aim to address the following questions. First, to what extent can GASM infer the original GPS coordinates without utilizing them for map matching? Second, how effectively can GASM reduce the number of potential alignments compared to the entire road network? The first question is evaluated through 4 OpenStreetMap 2013 public GPS traces: https://t.ly/q7u2N 5 Mappymatch: https://t.ly/RHafS 6 Highway taxonomy: https://t.ly/NpxZv the metric of accuracy.On the other hand, the evaluation of the second question relies on the metric of selectivity, commonly employed in database literature <ref type="bibr" target="#b26">[27]</ref>. Selectivity measures the reduction of potential alignments between a query result and the entire dataset. In our context, selectivity is defined as the ratio of matched road segments (|𝒴 |) to the total number of road segments (|𝐸|). For accuracy, higher values indicate better results, while for selectivity, lower values indicate better outcomes. GASM Performance. Table <ref type="table">5</ref> presents the performance metrics of GASM across individual provinces. To determine the optimal hyperparameters, a grid search was conducted over the values outlined in Table <ref type="table">6</ref>, specifically for the province of Grosseto. This process yielded the following hyperparameters: a resampling distance of 𝑑 = 10 meters, an alphabet size of 𝛼 = 8 symbols, and a street aggregation of ℎ = 2 hops. GASM showcases an impressive ability to deduce the original GPS coordinates, achieving an average light accuracy of 90.1%. Furthermore, it significantly narrows down the potential alignments, as indicated by the selectivity factor, reducing it to just 10.8% of the original road network. These commendable results are attained while maintaining a  Hyperparameters Tuning. In this section, we present the results of experiments conducted on the province of Grosseto while varying the hyperparameters detailed in Table <ref type="table">6</ref>. Initially, we compute the Pearson correlation between the method's hyperparameters and two key performance metrics: the selectivity factor and accuracy. Figure <ref type="figure" target="#fig_3">3</ref> visually depicts the changes in performance metrics, emphasizing variations in the top three most influential hyperparameters-those exhibiting the highest absolute values of Pearson correlation. Notably, the most influential hyperparameter is the number of road segment aggregations (ℎ), demonstrating a correlation of −0.52 with the selectivity. Thus, increasing ℎ proves beneficial for GASM as it helps select fewer candidate road segments without significantly impacting accuracy. The alphabet size (𝛼) displays correlations of −0.44 and 0.40 with respect to the selectivity and accuracy, respectively. This hyperparameter introduces a trade-off, as increasing the number of symbols reduces selectivity but may lead to a slight decrease in accuracy. Finally, the resampling distance (𝑑) exhibits a correlation of 0.22 with accuracy. Interestingly, decreasing 𝑑 slightly enhances accuracy according to our observations. Sensitivity Analysis. We delve here into the variations in performance with respect to the length of the query trajectory 𝑋. Additionally, we explore the dis-criminative nature of trajectories based on the type of road. Our hypothesis posits that straight streets, such as motorways, exhibit lower discriminative characteristics. Consequently, trajectories observed on such roads are more likely to avoid re-identification, suggesting enhanced geographic transferability. To investigate this, we identify the type of road traveled within each segment. In cases where multiple types of roads are encountered, we perform a majority vote weighted by road length. Additionally, to examine the influence of changes in the input data, we create random subtrajectories of varying lengths, including 100m, 150m, 500m, 1km, 1.5km, 5km, and 10km, derived from our OpenStreetMap dataset. In order to assess our hypothesis, we evaluate the straightness <ref type="bibr" target="#b3">[4]</ref> of each subtrajectory by calculating the ratio between the shortest path from the origin to the destination and the actual trajectory.</p><p>Figure <ref type="figure" target="#fig_0">4</ref> encapsulates these results. The initial plot on the left highlights a notable trend: an increase in subtrajectory length correlates with a rapid elevation in both accuracy and selectivity. In simpler terms, as the subtrajectory length extends, the model's precision improves. The central plot reveals that trajectory straightness has a negligible impact on the number of candidate matches. However, as trajectories become more linear, the accuracy experiences a decline. Finally, the rightmost plot illustrates the method's performance across various road types. This plot validates the findings of the straightness plot: roads with greater straightness, like motorways, pose the greatest challenge for re-identification. Con-versely, more sinuous roads present a slightly higher difficulty in re-identification, reflected in a higher selectivity but with a concomitant boost in accuracy.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>In this paper we have introduced GASM, a map matching method capable of determining a trajectory's position solely based on its shape. Our experiments showcase that GASM significantly reduces the number of potential alignments and deduces the original GPS coordinates with remarkable accuracy. Further analysis reveals that longer and less linear trajectories are more straightforward to map match. However, this observation raises concerns about the potential for shape-based methods to inadvertently learn geographic positions instead of focusing on other intrinsic features. As a part of future work, as outlined at the beginning, we aim to assess the geographic transferability of shape-based methods, such as Geolet, by incorporating GASM. Specifically, we propose giving more weight to the selection of discriminative subsequences with higher selectivity rather than basing the decision solely on a statistical test.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Definition 4 (</head><label>4</label><figDesc>Subtrajectory Transform). Given a dataset 𝒳 and a set 𝒮 containing ℎ subtrajectories, the subtrajectory transform converts 𝒳 ∈ ℝ 𝑛×𝑚×3 into a real-valued matrix 𝑇 ∈ ℝ 𝑛×ℎ , obtained by taking the best fitting of each trajectory 𝑋 ∈ 𝒳, and each subtrajectory 𝑆 ∈ 𝒮. Movelet computes the best fitting as the minimal Euclidean Distance (ED) between 𝑆 in each subsequence of length 𝑙 = |𝑆| of 𝑋, formally, bestfit Movelets (𝑋 , 𝑆) = 𝑚−𝑙 min 𝑗=0 {𝐸𝐷(𝑋 𝑗∶𝑗+𝑙 , 𝑆)}</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Aho-Corasick automaton using symbolic sequences {𝐴𝐶, 𝐵, 𝐵𝐶𝐴, 𝐶}. Blue dashed arches are suffix arches, while green dotted arches are the dictionary suffix arches.</figDesc><graphic coords="4,145.14,84.19,91.67,71.08" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Summary of GASM, depicting geographic automaton construction (left) and shape-based matching (right).</figDesc><graphic coords="5,302.62,84.19,208.34,166.17" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Hyperparameters influence: ℎ-hop aggregation (left), 𝛼 alphabet size (center), and 𝑑 resampling distance (right).</figDesc><graphic coords="7,89.33,84.78,137.51,96.13" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Influence of trajectory length (left), trajectory straightness (center), and kind of road (right) on performance metrics.</figDesc><graphic coords="7,368.44,216.92,137.51,95.13" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>(km) Length (#points) Kind of Road (%) Province #Trj Totoal Average (𝜎) Average (𝜎) Motorway Trunk Primary Secondary Minor</head><label></label><figDesc></figDesc><table><row><cell>Arezzo</cell><cell>17</cell><cell>341</cell><cell>17.2 (18.38)</cell><cell>853 (851)</cell><cell>0.019</cell><cell>0.227</cell><cell>0.205</cell><cell>0.344</cell><cell>0.196</cell></row><row><cell>Firenze</cell><cell>58</cell><cell>1041</cell><cell>30.4 (37.31)</cell><cell>1526 (1427)</cell><cell>0.122</cell><cell>0.032</cell><cell>0.392</cell><cell>0.205</cell><cell>0.249</cell></row><row><cell>Grosseto</cell><cell>35</cell><cell>215</cell><cell>6.7 (9.35)</cell><cell>578 (545)</cell><cell>0.020</cell><cell>0.133</cell><cell>0.054</cell><cell>0.327</cell><cell>0.466</cell></row><row><cell>Livorno</cell><cell>53</cell><cell>410</cell><cell>18.5 (25.42)</cell><cell>916 (645)</cell><cell>0.410</cell><cell>0.043</cell><cell>0.089</cell><cell>0.144</cell><cell>0.313</cell></row><row><cell>Lucca</cell><cell>39</cell><cell>804</cell><cell>17.5 (15.49)</cell><cell>1128 (1133)</cell><cell>0.225</cell><cell>0.000</cell><cell>0.056</cell><cell>0.442</cell><cell>0.278</cell></row><row><cell>M. Carrara</cell><cell>46</cell><cell>267</cell><cell>5.1 (3.54)</cell><cell>625 (430)</cell><cell>0.187</cell><cell>0.173</cell><cell>0.160</cell><cell>0.267</cell><cell>0.212</cell></row><row><cell>Pisa</cell><cell>35</cell><cell>831</cell><cell>26.0 (23.04)</cell><cell>1347 (1178)</cell><cell>0.000</cell><cell>0.002</cell><cell>0.219</cell><cell>0.200</cell><cell>0.578</cell></row><row><cell>Pistoia</cell><cell>20</cell><cell>468</cell><cell>53.1 (31.07)</cell><cell>929 (777)</cell><cell>0.000</cell><cell>0.186</cell><cell>0.251</cell><cell>0.163</cell><cell>0.399</cell></row><row><cell>Prato</cell><cell>31</cell><cell>146</cell><cell>4.5 (2.37)</cell><cell>660 (672)</cell><cell>0.141</cell><cell>0.050</cell><cell>0.395</cell><cell>0.146</cell><cell>0.269</cell></row><row><cell>Siena</cell><cell>24</cell><cell>497</cell><cell>22.4 (16.54)</cell><cell>983 (660)</cell><cell>0.557</cell><cell>0.032</cell><cell>0.340</cell><cell>0.026</cell><cell>0.046</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Dataset description. Besides average values are reported standard deviations (𝜎).</figDesc><table><row><cell></cell><cell cols="2">Method Performances</cell><cell></cell><cell></cell><cell cols="2">Road Network Characteristics</cell><cell></cell></row><row><cell></cell><cell>Selectivity</cell><cell cols="2">Building</cell><cell>#Road</cell><cell></cell><cell>Avg node</cell><cell></cell></row><row><cell>Province</cell><cell cols="5">Factor ↓ Accuracy ↑ Time (s) ↓ Segments #Intersections</cell><cell cols="2">Degree Length (km)</cell></row><row><cell>Arezzo</cell><cell>0.096</cell><cell>0.875</cell><cell>708.32</cell><cell>133028</cell><cell>54485</cell><cell>4.883</cell><cell>26852</cell></row><row><cell>Siena</cell><cell>0.068</cell><cell>1.000</cell><cell>516.37</cell><cell>175088</cell><cell>72079</cell><cell>4.858</cell><cell>28949</cell></row><row><cell>Pistoia</cell><cell>0.080</cell><cell>0.950</cell><cell>433.24</cell><cell>81870</cell><cell>34492</cell><cell>4.747</cell><cell>12363</cell></row><row><cell>Lucca</cell><cell>0.070</cell><cell>0.846</cell><cell>658.03</cell><cell>141149</cell><cell>59274</cell><cell>4.763</cell><cell>20328</cell></row><row><cell>Firenze</cell><cell>0.399</cell><cell>0.263</cell><cell>157.17</cell><cell>299312</cell><cell>119068</cell><cell>5.028</cell><cell>39025</cell></row><row><cell>Grosseto</cell><cell>0.060</cell><cell>0.823</cell><cell>421.66</cell><cell>121045</cell><cell>50014</cell><cell>4.841</cell><cell>26818</cell></row><row><cell>Livorno</cell><cell>0.069</cell><cell>1.000</cell><cell>277.38</cell><cell>96631</cell><cell>39613</cell><cell>4.879</cell><cell>11269</cell></row><row><cell>M. Carrara</cell><cell>0.056</cell><cell>0.978</cell><cell>294.50</cell><cell>71917</cell><cell>300071</cell><cell>4.783</cell><cell>10809</cell></row><row><cell>Pisa</cell><cell>0,081</cell><cell>0.857</cell><cell>1007.06</cell><cell>150954</cell><cell>62580</cell><cell>4.824</cell><cell>22396</cell></row><row><cell>Prato</cell><cell>0.105</cell><cell>0.936</cell><cell>190.77</cell><cell>45060</cell><cell>18794</cell><cell>4.795</cell><cell>5224</cell></row><row><cell cols="3">Macro Avg (𝜎) 0.108 (0.103) 0.853 (0.217)</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Selectivity factor, accuracy, automaton construction runtime, and other road network informations.</figDesc><table /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">For each 𝑌 ⊆ 𝐸, we compute the Euclidean Distance (linear complexity), for all the possible 𝑚 − 𝑘 alignments of shape(𝑌 ) in 𝑋.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_1">𝑌 * ← arg min 𝑌 ∈𝒴 bestfit Geolet (𝑌 , 𝑋 ′ );</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_2">return 𝑌;// return the best match</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_3">Python code: https://t.ly/wVlXS. We ran our experiments on a 2xIntel Xeon Gold 6342 24-core CPU, limiting each test to use at most 12 cores.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_4">GeoLife: https://t.ly/6VJ-E. Porto: https://t.ly/0GMR9.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work is partially supported by the EU NextGener-ationEU programme under the funding schemes PNRR-"SoBigData.it -Strengthening the Italian RI for Social Mining and Big Data Analytics" -Prot. IR0000013, H2020-INFRAIA-2019-1: Res. Infr. G.A. 871042 SoBigData++, and GreenDatAI G.A. 101070416.</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0" />			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">A survey and comparison of trajectory classification methods</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L</forename><surname>Da Silva</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">BRACIS, IEEE</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="788" to="793" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Inferring hybrid transportation modes from sparse GPS data using a moving window SVM classification</title>
		<author>
			<persName><forename type="first">A</forename><surname>Bolbol</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CEUS</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="page" from="526" to="537" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Revealing the physics of movement: Comparing the similarity of movement characteristics</title>
		<author>
			<persName><forename type="first">S</forename><surname>Dodge</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CEUS</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="419" to="434" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Indices of movement behaviour: conceptual background, effects of scale and location errors</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">J</forename><surname>Almeida</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Zoologia</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="page" from="674" to="680" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">I</forename><surname>Kontopoulos</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2205.13880</idno>
		<title level="m">Traclets: Harnessing the power of computer vision for trajectory classification</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">MOVELETS: exploring relevant subtrajectories for robust trajectory classification</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">A</forename><surname>Ferrero</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SAC, ACM</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="849" to="856" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Geolet: An interpretable model for trajectory classification</title>
		<author>
			<persName><forename type="first">C</forename><surname>Landi</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2023">2023</date>
			<publisher>Springer</publisher>
			<biblScope unit="volume">13876</biblScope>
			<biblScope unit="page" from="236" to="248" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Assessing accuracy and geographical transferability of machine learning algorithms for wind speed modelling</title>
		<author>
			<persName><forename type="first">F</forename><surname>Veronesi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AGILE, LNGC</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">City indicators for geographical transfer learning: an application to crash prediction</title>
		<author>
			<persName><forename type="first">M</forename><surname>Nanni</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GeoInformatica</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="page" from="581" to="612" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">A shapelet transform for time series classification</title>
		<author>
			<persName><forename type="first">J</forename><surname>Lines</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ACM SIGKDD</title>
				<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="289" to="297" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Some map matching algorithms for personal navigation assistants</title>
		<author>
			<persName><forename type="first">C</forename><surname>White</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">TRC</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">An introduction to map matching for personal navigation assistants</title>
		<author>
			<persName><forename type="first">D</forename><surname>Bernstein</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1996">1996</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">A general map matching algorithm for transport telematics applications</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Quddus</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">GPS solutions</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="157" to="167" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">A novel algorithm of low sampling rate GPS trajectories on map-matching</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">EURASIP</title>
		<imprint>
			<biblScope unit="volume">2017</biblScope>
			<biblScope unit="page">30</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">A st-crf map-matching method for low-frequency floating car data</title>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">TITS</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="page" from="1241" to="1254" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">From driving trajectories to driving paths: a survey on map-matching algorithms</title>
		<author>
			<persName><forename type="first">L</forename><surname>Jiang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CCF TPCI</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="252" to="267" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>Cintia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Nanni</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1603.07376</idno>
		<title level="m">An effective time-aware map matching process for low sampling gps data</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">If-matching: Towards accurate mapmatching with information fusion</title>
		<author>
			<persName><forename type="first">G</forename><surname>Hu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">TKDE</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="page" from="114" to="127" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Smart city development with urban transfer learning</title>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer</title>
		<imprint>
			<biblScope unit="volume">51</biblScope>
			<biblScope unit="page" from="32" to="41" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Cross-city transfer learning for deep spatio-temporal prediction</title>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IJCAI, ijcai.org</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="1893" to="1899" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Myway: Location prediction via mobility profiling</title>
		<author>
			<persName><forename type="first">R</forename><surname>Trasarti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Inf. Syst</title>
		<imprint>
			<biblScope unit="volume">64</biblScope>
			<biblScope unit="page" from="350" to="367" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m" type="main">Introduction to data mining</title>
		<author>
			<persName><forename type="first">P.-N</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Steinbach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kumar</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016">2016</date>
			<publisher>Pearson Education India</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">The great time series classification bake off</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">J</forename><surname>Bagnall</surname></persName>
		</author>
		<idno>CoRR abs/1602.01711</idno>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Efficient string matching: An aid to bibliographic search</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">V</forename><surname>Aho</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="page" from="333" to="340" />
			<date type="published" when="1975">1975</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">A space-economical suffix tree construction algorithm</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">M</forename><surname>Mccreight</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">JACM</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<biblScope unit="page" from="262" to="272" />
			<date type="published" when="1976">1976</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">A symbolic representation of time series, with implications for streaming algorithms</title>
		<author>
			<persName><forename type="first">J</forename><surname>Lin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">DMKD, ACM</title>
				<imprint>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="2" to="11" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Selectivity estimation in spatial databases</title>
		<author>
			<persName><forename type="first">S</forename><surname>Acharya</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SIGMOD, ACM</title>
				<imprint>
			<date type="published" when="1999">1999</date>
			<biblScope unit="page" from="13" to="24" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
