Short Paper: Citizen Sensing within a Real-Time
        Passenger Information System

           David Corsar, Peter Edwards, Chris Baillie, Milan Markovic,
                   Konstantinos Papangelis, and John Nelson

                           dot.rural Digital Economy Hub,
                        University of Aberdeen, Aberdeen, UK
                    {dcorsar,p.edwards,c.baillie,m.markovic,
                       k.papangelis,j.d.nelson}@abdn.ac.uk
                            http://www.dotrural.ac.uk


        Abstract. GetThere is a real-time passenger information system (RTPI)
        for rural areas that uses a citizen sensing approach to acquire information
        from public transport users. This paper describes the use of ontologies in
        GetThere to represent and integrate citizen sensors with data required
        to provide RTPI (e.g. timetable and route descriptions). The service ar-
        chitecture used to manage semantic sensor data is also described.

        Keywords: Citizen sensing, semantic sensors, ontology, quality, prove-
        nance, transport


1     Introduction
We are developing an information ecosystem within the Informed Rural Pas-
senger project1 to support GetThere, a real-time passenger information (RTPI)
system for rural areas. This ecosystem is based on an ontological framework
that describes the datasets necessary to support the provision of RTPI, such
as estimated vehicle arrival times or notification of delays. A key aspect is the
integration of sensor data (e.g. vehicle locations from GPS) with other data (e.g.
timetable and route descriptions). However, it is not uncommon in rural areas
to experience situations in which appropriate sensors are unavailable (e.g. rural
buses are not typically equipped with GPS devices). To address this, we have
adopted a citizen sensing approach [7], i.e. enabling humans (in our case, public
transport users) to act as data sensors using the GetThere smartphone app2.
However, this introduces new issues associated with the quality of such observa-
tions due to malicious users, inaccurate devices, and erroneous observations.
    This paper discusses the citizen sensing aspects of GetThere. Section 2 de-
scribes the ontologies used to represent and integrate citizen sensor data, and
the service architecture built to manage semantic sensor data; section 3 discusses
its deployment within the GetThere system; section 4 discusses the performance
of the architecture; and section 5 outlines conclusions and future work.
1 http://www.dotrural.ac.uk/irp
2 http://www.gettherebus.com

2     Supporting RTPI via Citizen Sensing
During the initial design of GetThere and from experience of trialling the sys-
tem, a number of requirements were identified to enable citizen sensing to be
used to support RTPI provision using the ecosystem. These include: semantic
integration of sensor descriptions and observations with other data within the
ecosystem; following best practice and reusing existing (sensor) ontologies to
describe (citizen) sensor data; recording the provenance of observations; man-
agement of observations and sensor descriptions via a RESTful API (create,
retrieve, update, and delete); and assessing the quality of user observations.

2.1   Ontologies
Fig. 1 outlines the ontological framework used to integrate citizen sensors within
the ecosystem. This framework is designed to support a range of transport appli-
cations in different geographic areas. The W3C Semantic Sensor Network (SSN)
Incubator Group ontology3 forms the basis of the framework. This ontology
provides a generic model for describing Sensors, the Sensing methods they
implement, Observations (values for a property of a phenomenon), Sensor Outputs
generated by sensors, Observation Values, and Features Of Interest (real world
phenomena being observed) [5].
    The Travel Sensors ontology4 extends SSN to represent the sensor concepts
present in GetThere. This includes defining users (FOAF5 Agents) as sensors and
several types of observation that they can provide (e.g. vehicle occupancy level,
vehicle temperature). Mobile devices running the GetThere app are represented
as platforms, with several attached sensing devices, each with their own sensing
and observation classes allowing the representation of observations produced by
the GetThere app (e.g. location, ambient noise level, presence of Wi-Fi) on behalf
of the user. Along with classes (such as those shown in Fig. 1), the ontology also
defines cardinality constraints on the SSN properties to, for example, ensure
OccupancyLevelObservations are only observedBy a FOAF Agent.
    Observations from users are integrated within the ecosystem via the Journey
class6, which represents a trip taken by a user on public transport and is used
as an observation’s feature of interest. The Journey class references the public
transport Route7 being travelled, from which further information such as the
location of roads the vehicle should travel along and details of stop points (e.g.
their location8) can be determined. This provides contextual information about
the observation that is required to determine RTPI (e.g. finding the locations of
vehicles on a specific route) and to support quality assessment (e.g. calculating
how far a reported location is from the expected route of travel).
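
The distance check mentioned above can be illustrated with a minimal sketch. It assumes the expected route is available as a list of (latitude, longitude) points and uses a flat-earth approximation that is reasonable over a few kilometres; the function names and coordinates are hypothetical and are not taken from the GetThere implementation.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; adequate for short distances


def _to_local_xy(lat, lon, ref_lat, ref_lon):
    """Project (lat, lon) to metres on a plane centred on the reference point."""
    x = math.radians(lon - ref_lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS_M
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_M
    return x, y


def distance_to_route_m(point, route):
    """Shortest distance in metres from a (lat, lon) point to a route polyline."""
    if len(route) < 2:
        raise ValueError("a route needs at least two points")
    plat, plon = point
    best = math.inf
    for (alat, alon), (blat, blon) in zip(route, route[1:]):
        # With the reported point as origin, find the closest point on segment AB.
        ax, ay = _to_local_xy(alat, alon, plat, plon)
        bx, by = _to_local_xy(blat, blon, plat, plon)
        dx, dy = bx - ax, by - ay
        seg_len_sq = dx * dx + dy * dy
        t = 0.0 if seg_len_sq == 0 else max(0.0, min(1.0, -(ax * dx + ay * dy) / seg_len_sq))
        best = min(best, math.hypot(ax + t * dx, ay + t * dy))
    return best


# Hypothetical route running north along a meridian, and two reported locations.
route = [(55.60, -2.80), (55.62, -2.80)]
on_route = distance_to_route_m((55.61, -2.80), route)
off_route = distance_to_route_m((55.61, -2.79), route)
```

Here the second location lies roughly 0.01 degrees of longitude east of the route, which at this latitude is more than 500 metres away, so a check of this kind would flag it as far from the expected route.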
3 http://www.w3.org/2005/Incubator/ssn/ssnx/ssn
4 http://www.dotrural.ac.uk/irp/uploads/ontologies/sensors.owl
5 http://xmlns.com/foaf/spec/
6 Defined by the User ontology - http://www.dotrural.ac.uk/irp/uploads/ontologies/user.owl
7 Defined by the Transit ontology, which describes public transport timetables.
8 Defined by the NaPTAN dataset - http://data.gov.uk/dataset/naptan

[Fig. 1 diagram: layers for the PROV-O, Sensor Provenance, SSN, Travel Sensors, Quality, Users, Transit, Infrastructure, and NaPTAN ontologies, showing classes such as prov:Entity, ssn:Observation, irps:AndroidLocationObservation, foaf:Agent, irpu:Journey, trn:Route, and naptan:StopPoint, and the properties and rdfs:subClassOf / rdfs:subPropertyOf axioms linking them.]


Fig. 1. Ontologies integrating sensor data with other data to support RTPI provision.


    Further contextual information is provided by recording and inferring the
provenance of observations. Provenance, a record of the agents and activities
involved in producing, influencing, or delivering a piece of data, can be used to
form assessments about its quality, reliability or trustworthiness [6]. To capture
observation provenance, we have defined a Sensor Provenance ontology9, which
aligns the SSN ontology with the W3C PROV-O ontology10 through subclass and
subproperty axioms. PROV-O is based around the concepts of Entity, a thing
that wasGeneratedBy some Activity (something that occurs and acts upon or
with entities), which in turn wasAssociatedWith an Agent (something that has
some responsibility for an activity or entity). PROV-O is also used to capture
the relationship between the sensors on a mobile device and the device’s user
through the actedOnBehalfOf property. This enables, for example, a record to
be kept of any processing performed on observations within the ecosystem, and
retrieval of the user associated with observations produced by a sensing device.
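
How subclass and subproperty axioms allow provenance to be inferred can be sketched as follows. This is an illustration only: triples are plain Python tuples, the alignment axioms are our reading of Fig. 1 (the published Sensor Provenance ontology is authoritative), and the individuals `:obs1`, `:locationSensor1`, and `:user1` are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"

# Assumed SSN-to-PROV-O alignment axioms (as read from Fig. 1).
ALIGNMENT = {
    ("ssn:Observation", SUBCLASS, "prov:Entity"),
    ("ssn:Sensing", SUBCLASS, "prov:Activity"),
    ("ssn:Sensor", SUBCLASS, "prov:Agent"),
    ("ssn:observedBy", SUBPROP, "prov:wasAttributedTo"),
    ("ssn:sensingMethodUsed", SUBPROP, "prov:wasGeneratedBy"),
}

# Hypothetical sensor data: an observation made by a phone's location sensor.
DATA = {
    (":obs1", RDF_TYPE, "ssn:Observation"),
    (":obs1", "ssn:observedBy", ":locationSensor1"),
    (":locationSensor1", RDF_TYPE, "ssn:Sensor"),
    (":locationSensor1", "prov:actedOnBehalfOf", ":user1"),
}


def infer(triples):
    """Apply the RDFS subclass and subproperty rules until nothing new is derived."""
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            for s2, p2, o2 in closed:
                if p2 == SUBPROP and s2 == p:
                    new.add((s, o2, o))  # s p o, p subPropertyOf q => s q o
                elif p2 == SUBCLASS and p == RDF_TYPE and s2 == o:
                    new.add((s, RDF_TYPE, o2))  # s a C, C subClassOf D => s a D
        if new <= closed:
            return closed
        closed |= new


def users_responsible_for(observation, triples):
    """Follow prov:wasAttributedTo, then prov:actedOnBehalfOf, to reach the user."""
    sensors = {o for s, p, o in triples if s == observation and p == "prov:wasAttributedTo"}
    return {o for s, p, o in triples if s in sensors and p == "prov:actedOnBehalfOf"}


closed = infer(ALIGNMENT | DATA)
users = users_responsible_for(":obs1", closed)
```

Only SSN statements are asserted for `:obs1`; the PROV-O statements needed to reach the user are derived from the alignment axioms. A deployed system would normally leave this to an ontology reasoner rather than hand-written rules.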

2.2        Services
The ecosystem also features a general-purpose web service architecture11 suit-
able for applications that require management of sensor data expressed using the
SSN ontology. Five services each provide RESTful APIs for creating, getting, up-
dating, and deleting RDF descriptions of sensors, sensing methods, observations,
sensor outputs, and observation values. Upon receiving a request, each service
generates a SPARQL v1.1 update or query12 based on the parameter values included
with the request13. This update or query is executed on a SPARQL endpoint; the
service processes the endpoint’s response and then sends a response to the client.
9 http://www.dotrural.ac.uk/irp/uploads/ontologies/sensorprov.owl
10 http://www.w3.org/ns/prov-o
11 Available from https://github.com/dcorsar/sensor-service
12 http://www.w3.org/TR/sparql11-query/
13 Parameter values are type checked before use and an error thrown if checking fails.

This delegates storage to the technology backing the endpoint, allowing the use
of, for example, a database and R2RML [3] if only storage and publication are
required, or a triplestore and ontology reasoner if materialisation is required.
This also allows the data to be published as Linked Data by, for example, using
Pubby14.
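
The request handling described above (check the parameter values, then generate a SPARQL 1.1 update) can be sketched roughly as below. This is not the code of the published sensor services: the function names are hypothetical, only three SSN properties are written, and the IRI check is a deliberately simple stand-in for whatever type checking the services perform.

```python
import re
from urllib.parse import urlparse

SSN = "http://purl.oclc.org/NET/ssnx/ssn#"
# Characters that must not appear inside an IRI written between < and >.
_FORBIDDEN = re.compile(r'[\s<>"{}|\\^`]')


def check_iri(value):
    """Return value if it looks like an absolute http(s) IRI, otherwise raise."""
    if not isinstance(value, str):
        raise TypeError("IRI parameters must be strings")
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https") or not parsed.netloc or _FORBIDDEN.search(value):
        raise ValueError(f"not a usable IRI: {value!r}")
    return value


def build_insert_observation(observation, sensor, feature):
    """Build a SPARQL 1.1 Update that records one observation description."""
    obs, sen, foi = (check_iri(v) for v in (observation, sensor, feature))
    return (
        f"PREFIX ssn: <{SSN}>\n"
        "INSERT DATA {\n"
        f"  <{obs}> a ssn:Observation ;\n"
        f"    ssn:observedBy <{sen}> ;\n"
        f"    ssn:featureOfInterest <{foi}> .\n"
        "}"
    )


# Hypothetical identifiers for an observation, its sensor, and a user's journey.
update = build_insert_observation(
    "http://example.org/observation/1",
    "http://example.org/sensor/1",
    "http://example.org/journey/1",
)
```

Rejecting values that contain angle brackets or whitespace before they are placed in the update text keeps a malformed or malicious parameter from altering the generated statement; the resulting string would then be sent to whichever SPARQL endpoint backs the service.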
    These services can also be extended for different use cases; for example, we
have extended them to create a Location Observation service which manages
real-time vehicle locations obtained from the GetThere app. This includes addi-
tional methods for creating and storing observations (and the associated sensor
output and observation value) for a given geolocation and user’s journey, and
for retrieving the latest real-time locations for a particular bus route.
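
The "latest real-time locations for a route" operation can be pictured with a small in-memory sketch. The actual service answers this through its SPARQL endpoint; the record type, field names, and sample values below are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LocationObservation:
    journey_id: str      # the user's journey, used as the feature of interest
    route: str           # bus route the journey is on
    lat: float
    lon: float
    observed_at: datetime


def latest_locations(observations, route):
    """Return the most recent observation per journey on the given route."""
    latest = {}
    for obs in observations:
        if obs.route != route:
            continue
        current = latest.get(obs.journey_id)
        if current is None or obs.observed_at > current.observed_at:
            latest[obs.journey_id] = obs
    return latest


# Hypothetical observations from two journeys on route 95 and one on route 72.
observations = [
    LocationObservation("j1", "95", 55.60, -2.80, datetime(2013, 6, 1, 9, 0)),
    LocationObservation("j1", "95", 55.61, -2.80, datetime(2013, 6, 1, 9, 1)),
    LocationObservation("j2", "95", 55.70, -2.85, datetime(2013, 6, 1, 9, 0)),
    LocationObservation("j3", "72", 55.55, -2.75, datetime(2013, 6, 1, 9, 1)),
]
latest = latest_locations(observations, "95")
```

Grouping by journey rather than by user or device is a choice of this sketch; it reflects the role of the Journey class as the link between an observation and its route.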
    As discussed earlier, employing citizen sensing introduces the potential for
low quality observations; therefore we have developed a sensor data quality ser-
vice, underpinned by our data quality ontology15 [1]. The service employs a SPIN
reasoner guided by a number of SPARQL rules [4] to examine the metadata as-
sociated with the location observations provided by users. These rules describe
a number of quality metrics that define how data should be evaluated against a
number of quality dimensions [2]. Location observations are currently evaluated
against four quality dimensions chosen from our experience of testing and deploying
the system16. These are: accuracy (accurate location observations have an
associated error of less than 25 metres); availability (considers any delay, due
to mobile network lag, between the observation being made by the device and
received by the ecosystem); timeliness (timely observations were produced by the
sensor less than one minute ago); and relevance (relevant observations are no
farther than 500 metres from the expected route of travel, which supports detecting
potentially malicious or erroneous locations).
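
The thresholds for the four dimensions can be written out as a short sketch. It is not the SPIN/SPARQL rule set used by the quality service: the record type and field names are hypothetical, the checks are simple pass/fail tests rather than scores, and availability is reported only as a delay because the text gives no threshold for it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LocationReport:
    error_m: float                # error estimate reported with the location fix
    observed_at: datetime         # when the device produced the observation
    received_at: datetime         # when the ecosystem received it
    distance_from_route_m: float  # distance to the expected route of travel


def assess(report, now):
    """Evaluate one location observation against the four quality dimensions."""
    return {
        "accurate": report.error_m < 25.0,
        "timely": now - report.observed_at < timedelta(minutes=1),
        "relevant": report.distance_from_route_m <= 500.0,
        "availability_delay_s": (report.received_at - report.observed_at).total_seconds(),
    }


# Hypothetical observation: precise and recent, but far from the expected route.
report = LocationReport(
    error_m=10.0,
    observed_at=datetime(2013, 6, 1, 12, 0, 0),
    received_at=datetime(2013, 6, 1, 12, 0, 4),
    distance_from_route_m=620.0,
)
result = assess(report, now=datetime(2013, 6, 1, 12, 0, 30))
```

In this example the observation passes the accuracy and timeliness checks but fails relevance, the kind of result that would point to a potentially erroneous or malicious location.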


3    Deployment

The ecosystem and GetThere app17 are currently deployed for nine bus routes in
the Scottish Borders, UK (the First Group routes 62/62A, 72, 73, 95/95A/X95,
396, and 397). Fig. 2 (a) outlines the sensing components of the ecosystem for
this deployment. This includes datasets for observations, user profiles, timetable,
road infrastructure, and bus stop details (NaPTAN). Services reasoning with
these datasets (accessed via SPARQL endpoints) provide RTPI functionalities
for the GetThere app. These include the aforementioned Sensor, Location Ob-
servation, and Quality services, along with the User service which handles reg-
istering new users and managing user profiles, and the Timetable service which
provides route and timetable information. Deployment in other areas simply
requires creating the timetable and infrastructure datasets for that area.
14 http://wifo5-03.informatik.uni-mannheim.de/pubby/
15 http://sensornet.abdn.ac.uk/onts/Qual-O.ttl
16 The ontology containing these rules is available at http://sensornet.abdn.ac.uk/onts/GetThereQ.ttl
17 For a video showing the GetThere app see http://www.gettherebus.com/ssn2013

[Fig. 2 diagram: (a) layered view with the GetThere Android App as client; the Quality, Sensor, Location Observation, User, and Timetable web services; the Observation, User, Timetable, and transport.data.gov SPARQL endpoints; and the Observations, Users, Timetable, Infrastructure, and NaPTAN datasets. (b) and (c) app screenshots.]


Fig. 2. The ecosystem’s sensing components (a), and screenshots of the GetThere
smartphone app showing vehicle locations (b), and results of quality assessment (c).


    The app allows users to register, view the available bus routes, view both
the timetabled and real-time bus locations (from other users) on a particular
route (Fig. 2 (b)), and upload their location automatically every minute during
journeys. Users can tap the icon representing a real-time location to invoke the
quality assessment service for that observation, the results of which are visualised
using a colour-coded bar representing the quality score for each dimension. For
example, in Fig. 2 (c), the green filled bar under “Availability” indicates a high
quality score, while the nearly empty red bar under “Relevance” indicates a low
score for that quality dimension. As of June 2013, there were 47 registered users,
of whom 17 had contributed 1008 location observations during 167 bus journeys.


4      Performance Evaluation
We have conducted a simulation in order to gain an insight into the performance
of the sensor architecture when providing real-time vehicle locations with mul-
tiple users uploading locations. The simulation was based on buses travelling in
two directions on six routes, with five passengers per bus (which we believe is
a realistic maximum figure), each providing a location every minute. New buses
were introduced every 30 minutes (reflecting the frequency of buses in the de-
ployment area). A further 12 users were introduced to query for real-time vehicle
locations every minute (one user per route per direction). The response times
for all requests were recorded. The simulation ran for a period of 18 hours,
reflecting a full day of bus operations in the Scottish Borders; it was executed on
a single machine with a 3.2 GHz Intel Core i3 processor and 4 GB of 1333 MHz
DDR3 memory, with a Sesame MySQL triplestore used for data storage.
    Lack of space prevents a detailed presentation and discussion of the results
in terms of the response times for real-time locations18. However, in summary,
for the 12960 requests made for real-time vehicle locations, 95% of the responses
were generated in under 4.1 seconds; 99% of responses were generated in under
16.2 seconds. Given that the app uploads locations every minute, we believe
these are acceptable response times for providing real-time vehicle locations.
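
The request total follows directly from the simulation parameters, as the short calculation below shows. The variable names are ours; the figures are those stated in the text.

```python
routes = 6
directions = 2
hours = 18

# One querying user per route per direction, each asking once a minute.
query_users = routes * directions            # 12 users
minutes = hours * 60                         # 1080 minutes of simulated operation
location_requests = query_users * minutes    # 12960 requests

# 95% of responses were under 4.1 seconds, so about 5% of requests were not.
slow_requests = location_requests * 5 // 100  # 648 requests
```

The result matches the 12960 requests reported, and it implies that roughly 648 of them took 4.1 seconds or longer.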
18 Full results are available at http://homepages.abdn.ac.uk/dcorsar/pages/ssn2013

5    Conclusions & Future Work
We have presented the sensor architecture developed to support citizen sens-
ing within GetThere. This includes describing extensions of the SSN ontology to
model citizen sensors. We found that extending the SSN ontology simplified the design
process for our sensor model, as SSN defines the concepts (sensors, observations,
etc.) we required and outlines how to extend them for our own model. The SSN
ontology also provides a method for integrating sensor data with other datasets
in the ecosystem, and could be aligned with PROV-O to facilitate recording
provenance of observations. Two design issues faced were: how to handle sensor
capabilities (for example, should all location sensors on Android smartphones
use the same or different individuals to represent their location sensing
capability?); and how to ensure that observations created by a sensor on a smartphone
link to the phone’s user, which we opted to capture in the provenance record. Aligning SSN
with PROV-O enables provenance to be automatically inferred for each obser-
vation, providing data that can be useful for services such as quality assessment.
Using ontologies is also beneficial when defining quality metrics, as they allow
separate metrics assessing the same dimension for different types of observation.
For example, they allow separate metrics for assessing timeliness of location and
occupancy level observations, which will require different criteria.
    As part of future work, we plan to increase the types of observations that the
GetThere app acquires from users to capture other aspects of their journeys (e.g.
presence of Wi-Fi, vehicle temperature), and extend our sensor architecture to
accommodate such observations. We are also developing a model to determine
the trustworthiness and reputation of citizen sensors, which incorporates aspects
of the ecosystem, such as the quality evaluation of their previous contributions.
Acknowledgements The research described here is supported by the award
made by the RCUK Digital Economy programme to the dot.rural Digital Economy
Hub; award reference: EP/G066051/1.

References
1. Baillie, C., Edwards, P., Pignotti, E., Corsar, D.: Short paper: Assessing the quality
   of semantic sensor data. In: Proc. of the 6th International Workshop on Semantic
   Sensor Networks (October 2013), to appear
2. Bizer, C., Cyganiak, R.: Quality-driven information filtering using the WIQA policy
   framework. Journal of Web Semantics 7, 1–10 (2009)
3. Das, S., Sundara, S., Cyganiak, R.: R2RML: RDB to RDF mapping language. W3C
   Recommendation (September 2012)
4. Fürber, C., Hepp, M.: SWIQA - a semantic web information quality assessment
   framework. In: 19th European Conference on Information Systems. pp. 922–933 (2011)
5. Lefort, L., Henson, C., Taylor, K.: Semantic Sensor Network XG final report. W3C
   Incubator Group Report (June 2011)
6. Moreau, L., Missier, P.: PROV-DM: The PROV data model. W3C Recommendation
   (April 2012), http://www.w3.org/TR/prov-dm/
7. Sheth, A.: Citizen sensing, social signals, and enriching human experience. Internet
   Computing, IEEE 13(4), 87–92 (2009)