=Paper= {{Paper |id=None |storemode=property |title=D2RCrime: A Tool for Helping to Publish Crime Reports on the Web from Relational Data |pdfUrl=https://ceur-ws.org/Vol-966/STIDS2012_T09_TavaresEtAl_D2RCrime.pdf |volume=Vol-966 |dblpUrl=https://dblp.org/rec/conf/stids/TavaresFSF12 }} ==D2RCrime: A Tool for Helping to Publish Crime Reports on the Web from Relational Data == https://ceur-ws.org/Vol-966/STIDS2012_T09_TavaresEtAl_D2RCrime.pdf
D2RCrime: A Tool for Helping to Publish Crime Reports on the Web from Relational Data

Júlio Tavares, University of Fortaleza - UNIFOR, Fortaleza/CE, Brazil, julio.at@gmail.com
Vasco Furtado, University of Fortaleza - UNIFOR, Fortaleza/CE, Brazil, furtado.vasco@gmail.com
Henrique Santos, University of Fortaleza - UNIFOR, Fortaleza/CE, Brazil, hensantos@gmail.com
Eurico Vasconcelos, University of Fortaleza - UNIFOR, Fortaleza/CE, Brazil, euricovasconcelos@gmail.com

Abstract: In the Law Enforcement context, more and more data about crime occurrences are becoming available to the general public. For an effective use of open data, it is desirable that the different sources of information follow a pattern, which allows reliable comparisons. In addition, the task of creating a correspondence between the pattern and the internal representations of each source of information should not involve a steep learning curve. These two conditions are rarely met at the current stage, where open data about crime occurrences are disclosed by each police department in its own way. This paper proposes an interactive tool, called D2RCrime, that assists the designer/DBA of relational crime databases in making the correspondence between the relational data and the classes and properties of a crime ontology. The ontology plays the role of a pattern to represent the concepts of crime and report of crime, and is also the interface for publishing relational crime data on the fly. This correspondence allows the automatic generation of mapping rules between the two representations, which in turn allows relational data to be accessed from SPARQL. D2RCrime is evaluated with DBAs/system analysts who used the tool to establish correspondences between relational data and the ontology.

Index Terms: Internet, Semantic Web, Knowledge Engineering, Law Enforcement, Open Government.

I. INTRODUCTION

The culture of participation and collaboration on the Web could not leave out the public sector. New forms of relationships between citizens and governments are also emerging from a new attitude toward the treatment of government information and public services on the Internet. This new approach, understood here as Government 2.0 (in line with Web 2.0), relies on governments as open platforms to provide information [1].

In the Law Enforcement context, more and more data about crime occurrences are becoming available to the general public. In the U.S. and Britain in particular, police departments quickly realized that they should open data to encourage participation by the population. For an effective use of open information, it is desirable that the different sources of information follow a pattern, which allows, for instance, reliable comparisons. Here, when we mention a pattern, we refer to a language with the power to represent information about both the provenance and the meaning of the concepts that should be available. Moreover, the task of creating a correspondence between the pattern and the internal representations of each source of information should not involve a steep learning curve. These two conditions are rarely met at the current stage of opening data about crime occurrences. The usual process is for each police department to define its own way to disclose its data by creating intermediary representations (typically spreadsheets; see http://www.atlantapd.org/crimedatadownloads.aspx for Atlanta) that must constantly be updated. Alternatively, the police departments develop their own APIs (see http://sanfrancisco.crimespotting.org/api for San Francisco), which are characterized by their specificity. In brief, each department spends time and resources to define its own way to disclose its data.

This paper proposes a method to guide the process of opening crime data that aims to mitigate the aforementioned problems. This method relies on ontologies for representing the concepts of crime and crime report. The crime ontology defines the basic concepts and properties used in the context of Law Enforcement to define a crime occurrence. The crime report ontology defines the basic information necessary to characterize the report of a crime occurrence, such as the source of the report, the date and time of the report, its description, and so on.

We have designed an interactive tool that assists the designer/DBA in making the correspondence between the relational data and the classes and properties of the crime ontology. This correspondence allows us to automatically generate the mapping rules between the two representations, which drive the process of accessing relational data from SPARQL. Unlike the majority of approaches that replicate the relational data into another repository, we based our proposal
on the D2R Server [2]. D2R is a system for publishing relational data on the Web. The D2R Server enables Resource Description Framework (RDF) and HTML browsers to navigate the content of non-RDF databases, and allows applications to query a database using the SPARQL query language over the SPARQL protocol. This approach relieves the data owner of concerns about the integrity and consistency of replicated data. Finally, D2RCrime is evaluated with DBAs/system analysts who used the tool to establish correspondences between relational data and the ontology.

II. REPRESENTING CRIME REPORTS

Two ontologies are at the core of our proposal. They are intended to represent the concepts of crime and report of crime. Our representation of crime is not restricted to the information that police departments worldwide currently disclose. However, some information is mandatory to define a unique instance. A crime has at least a type, a date and time (imported from the time ontology [3]), a precise address (geographical coordinates), and a description. Information about the people involved, such as the perpetrator(s), the witnesses and the victim(s), may also be inserted, but it is not mandatory.

The crime ontology is basically a hierarchy for inferential purposes. It was modeled so that it is possible to map the various classifications of crime type. We define the crime events as specializations of the Event class, from the Event Ontology [4]. According to the Event Ontology, “an event is an arbitrary classification of a space/time region, by a cognitive agent. An event may have a location, a time, active agents, factors and products.” To describe where a crime occurred geographically, we use the WGS84 ontology (http://www.w3.org/2003/01/geo/) to express location in terms of latitude and longitude.

Typically, a detailed identification of the people involved is not open information due to privacy concerns. However, this varies across countries, sources and cultures. In Brazil, for instance, the media routinely discloses homicide victims. In the US, raw crime data does not include the victim’s name.

We defined a crime ontology inspired by the Criminal Act Ontology in the context of the OpenCyc Project, and also took into consideration the FBI Uniform Crime Report standard (http://www.fbi.gov/about-us/cjis/ucr/ucr). The report of crime refers to a particular crime and has information about the reporting itself. The identification of the reporter, the time and date of the report, and links to external sources are examples of this kind of information. As a report of crime contains basic provenance information, we imported the Provenance Model Language 2 (PML2) ontology [5] to represent these features. Even though the Open Provenance Model (OPM) [6] and its Open Provenance Model Ontology (OPMO) are becoming widely used for provenance exchange, we have chosen to use PML2 because it includes classes and properties to represent the trustworthiness of the sources and the credibility of the information. These properties are important because our ultimate goal is to combine open crime data from a large variety of sources, some of which may even be anonymous. The CrimeReport class is a subclass of pmlp:Information. We have also used some specific properties to describe a report, such as pmlp:hasCreationDateTime (time of the report), pmlp:hasDescription (text of the report), and pmlp:hasSource (entity that published the report).

The complete ontology is described in [15]. Figure 1 shows a piece of this ontology describing a particular crime (homicide). This is the most refined level of detail that we have proposed. In doing so, we aim to balance simplicity and generality while providing good coverage.

Fig. 1. Piece of the crime ontology for the description of homicide

III. ASSISTING THE MAP BETWEEN RELATIONAL DATA AND THE CRIME ONTOLOGY

The definition of a language to be used as a pattern for opening data on criminal incidents is only the first step of the proposed method. Patterns require community acceptance; therefore a key aspect is how friendly the pattern is to use. Thus it is essential that the correspondence between information represented in the pattern and information represented in the databases of the police departments be easily established. In this section we describe how the proposed method seeks to accomplish this. It relies on two assumptions: i) as crime data are originally stored in relational databases, their Web publication should not require data replication, and ii) the task of associating the original data with the ontology should not require learning another programming language.

A. Publishing Relational Data on the Web

To achieve the first requirement, we have chosen to base our method on systems that map relational data to RDF on demand, such as Asio Semantic Bridge for Relational Databases (http://www.bbn.com/technology/knowledge/asio_sbrd), D2R (http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/) [2], SquirrelRDF (http://jena.sf.net/SquirrelRDF), and UltraWrap (http://www.cs.utexas.edu/~miranker/studentWeb/UltrawrapHomePage.html) [7]. In these methods, an application (typically a Web server) takes requests from the Web and rewrites them to SQL queries. This on-the-fly translation allows the content of large
databases to be accessed with acceptable response times without requiring data replication.

The World Wide Web Consortium (W3C) has recognized the importance of mapping relational data to the Semantic Web by starting the RDB2RDF incubator group (XG) to investigate the need for standardization. In particular, we have chosen to use an approach based on the D2R Server. D2R is an open and free system for publishing relational data on the Web. It enables RDF and HTML browsers to navigate the content of non-RDF databases, and allows applications to query a database using the SPARQL query language over the SPARQL protocol.

D2R operates by interpreting and executing rules, described in the Data to Relational Query language (D2RQ [8]), that map the equivalence between an ontology and a relational database.

D2RQ consists of a mapping language between relational database schemas and RDFS/OWL ontologies. The D2RQ platform creates an RDF view of the relational database, which can be accessed through Jena, Sesame, and the SPARQL query language. D2RQ’s main elements are ClassMap and PropertyBridge. The ClassMaps represent the classes of an ontology and associate them with a table or a view of a database. The PropertyBridges are linked to one or more ClassMaps and are mainly used to connect the columns in a table with the properties (attributes) present in an ontology. Usually, they are filled with literal values, but they can also make references to URIs that designate other resources.

With PropertyBridges it is possible to specify conditional restrictions that can be used to filter a specific domain or range of information. Using the Join structure, it is also possible to specify the mapping between multiple tables and a class or a property in the ontology. Another commonly used feature is the TranslationTable structure, which allows 1 to n mapping (table to classes).

More complex mappings, whereby it may be necessary to access a Web service or to use conditional structures and external sources of data, can be performed through the javaClass structure, which allows the use of Java classes to perform the mapping.

In practice, it is very difficult to implement a mapping with only simple correspondences such as one-to-one table to class. There is often the need to handle more complex structures, including the javaClass, which requires an effort that the designer is not always able to make. For instance, a tuple of a table that describes crime data must be mapped into instances of different classes such as robbery, theft, homicide, etc. Our idea then was to provide a tool that facilitates this mapping process for the case of criminal data.

B. The D2RCrime Tool

D2RCrime provides resources to support the publication of reports of crimes in RDF from relational databases. In particular, the goal is to help designers and/or DBAs who do not have extensive knowledge of semantic technologies. The ontology of crimes described above is used to guide an interactive process with a designer/DBA. The basic premise is that the D2RCrime mapping between the ontology classes and the database tables can be obtained interactively by asking the designer to write SQL queries that retrieve the tuples from the database that describe a particular class (or property) of the ontology. The aim is thus to use a language that designers/DBAs have largely mastered and that allows them to easily describe the concepts represented in the ontology of crimes. Figure 2 shows an example of how this dialog occurs in D2RCrime.

Fig. 2. Example of a SELECT clause to define the concept of THEFT
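The dialog just described can be pictured with a minimal sketch. It is not the authors' implementation: it only assumes that the designer's SELECT carries a WHERE condition identifying one crime type, and that this condition can be dropped into a D2RQ ClassMap of the shape shown in Frame 1. The table tb_cri_crime, its key CRI_IDCRIME and the condition on tcr_idtipo_crime are taken from Frame 1; the helper names `check_select` and `class_map`, the date/time/location/description column names, and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

# WHERE condition for thefts (type values 1 and 4 as in Frame 1).
THEFT_CONDITION = (
    "tb_cri_crime.tcr_idtipo_crime=1 or tb_cri_crime.tcr_idtipo_crime=4"
)

# Hypothetical SELECT a designer might write; every column except
# CRI_IDCRIME and tcr_idtipo_crime is an assumed name.
THEFT_SELECT = (
    "SELECT CRI_IDCRIME, cri_date, cri_time, cri_lat, cri_long, cri_description "
    "FROM tb_cri_crime WHERE " + THEFT_CONDITION
)


def check_select(sql):
    """Run the SELECT on a tiny in-memory table and return the matching ids."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tb_cri_crime (CRI_IDCRIME INTEGER PRIMARY KEY, "
        "tcr_idtipo_crime INTEGER, cri_date TEXT, cri_time TEXT, "
        "cri_lat REAL, cri_long REAL, cri_description TEXT)"
    )
    # Made-up rows: two theft types (1 and 4) and one other type (2).
    con.executemany(
        "INSERT INTO tb_cri_crime VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1, 1, "2012-03-01", "22:10", -3.73, -38.52, "bicycle stolen"),
            (2, 2, "2012-03-02", "01:30", -3.74, -38.50, "other crime type"),
            (3, 4, "2012-03-05", "14:00", -3.72, -38.54, "phone stolen"),
        ],
    )
    ids = [row[0] for row in con.execute(sql + " ORDER BY CRI_IDCRIME")]
    con.close()
    return ids


def class_map(class_name, table, id_column, condition=None, prefix="crime"):
    """Fill a D2RQ (Turtle) template for one class; no SQL parsing is done."""
    lines = [
        f"map:{class_name} a d2rq:ClassMap;",
        "    d2rq:dataStorage map:database;",
        f'    d2rq:uriPattern "{class_name}/@@{table}.{id_column}@@";',
        f"    d2rq:class {prefix}:{class_name};",
    ]
    if condition:
        # The condition restricts which tuples become instances of the class.
        lines.append(f'    d2rq:condition "{condition}";')
    lines += [
        f'    d2rq:classDefinitionLabel "{class_name}".',
        "",
        f"map:{class_name}__label a d2rq:PropertyBridge;",
        f"    d2rq:belongsToClassMap map:{class_name};",
        "    d2rq:property rdfs:label;",
        f'    d2rq:pattern "{class_name} #@@{table}.{id_column}@@".',
    ]
    return "\n".join(lines)


theft_ids = check_select(THEFT_SELECT)
theft_mapping = class_map("Theft", "tb_cri_crime", "CRI_IDCRIME", THEFT_CONDITION)
print(theft_ids)
print(theft_mapping)
```

The sketch keeps the SQL check and the template filling separate on purpose: the first shows what the designer verifies (which tuples count as thefts), the second shows the only piece of the query that the ClassMap needs, namely the condition.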
In this dialog, the tool asks the designer to complete a SELECT clause to retrieve all the thefts from the database of crime occurrences (tb-crime in Figure 2). The tool also asks that the response contain the date, time, location and description of each theft. For each SELECT clause written by a designer/DBA, D2RCrime transforms the query into an N3 rule. The process is iterative, and new questions are asked until all the classes and properties of the ontology have been described in terms of SELECT clauses. At the end of the process, the entire mapping is expressed in D2RQ and can therefore be executed on the D2R Server. Frame 1 illustrates the mapping between tables and classes. The crime report and theft classes are mapped there.

D2RCrime transforms the SQL into D2RQ elements. To do this, the following mapping is done: aiming to accelerate the elicitation of the requirements for the mapping, D2RCrime identifies which database field is associated with the type of crime. It then proposes a customized interface in which it is possible to associate the values of crime type with the corresponding ontology classes.

    # CrimeReport - In the ClassMap below it is defined that the
    # instances are generated with the class "crime:CrimeReport"

    map:CrimeReport a d2rq:ClassMap;
        d2rq:dataStorage map:database;
        d2rq:uriPattern "crimereport/@@tb_cri_crime.CRI_IDCRIME@@";
        d2rq:class crime:CrimeReport;
        d2rq:classDefinitionLabel "CrimeReport".

    map:CrimeReport__label a d2rq:PropertyBridge;
        d2rq:belongsToClassMap map:CrimeReport;
        d2rq:property rdfs:label;
        d2rq:pattern "CrimeReport #@@tb_cri_crime.CRI_IDCRIME@@".

    # Theft [OCURRENCE_TYPE] - In the ClassMap below, it is defined
    # that the instances are generated with the class "crime:Theft".
    # Note the d2rq:condition for selecting the adequate type of crime

    map:Theft a d2rq:ClassMap;
        d2rq:dataStorage map:database;
        d2rq:uriPattern "Theft/@@tb_cri_crime.CRI_IDCRIME@@";
        d2rq:class crime:Theft;
        d2rq:condition "tb_cri_crime.tcr_idtipo_crime=1 or tb_cri_crime.tcr_idtipo_crime=4";
        d2rq:classDefinitionLabel "Theft".

    map:Theft__label a d2rq:PropertyBridge;
        d2rq:belongsToClassMap map:Theft;
        d2rq:property rdfs:label;
        d2rq:pattern "Theft #@@tb_cri_crime.CRI_IDCRIME@@".

Frame 1. Example of the code in D2RQ generated by D2RCrime

During the dialog process, D2RCrime lets the designer see how the instances of the classes (crime reports) have been built. A widget that plots crimes on the spot where they occurred shows the values of each report. Figure 3 shows an example of this.

Fig. 3. Preview of the instances of crime reports plotted on the map

IV. EVALUATION

Our approach proposes a new method of mapping between relational databases and structured data in RDF. We are not aware of similar tools or approaches that are able to perform the RDB2RDF mapping intuitively using SQL clauses. Because of this, we had difficulty choosing the most appropriate way to validate our hypothesis through comparison and experiments. To alleviate this issue, we decided to compare D2RCrime with the D2R Server tool itself, which automates the generation of D2RQ code for mapping the relational data into RDF.

In order to analyze the hypotheses raised in this paper, an empirical study was conducted aimed at assessing: 1) the representational power of the proposed ontology to represent criminal events; 2) whether the task of creating the correspondence by means of the proposed tool avoids a “steep learning curve” and whether the tool is user friendly and intuitive, enabling and facilitating the proposed mapping process.

A. Methodology

The study was conducted in two stages. In the first stage, a battery of tests of “translation” of information on crimes was conducted in the laboratory, based on the proposed ontology. The battery was based on non-probabilistic and intentional samples (50 each) from police agencies. The choice of samples was based on two factors: the requirement that the police
agencies have their information about crimes published, and the interest in evaluating the ontology in different countries (criminal law) and in different languages.

In the second stage, tests were conducted with users to analyze whether the D2RCrime tool softens the “steep learning curve” found in the data-opening process. To this end, a sample of 10 users (five analysts and five DBAs, all with experience in DBMSs and the SQL language) was invited to publish data on crimes in two sessions.

The first session used the D2RCrime tool in conjunction with the proposed ontology. The second session was conducted without introducing the tool, encouraging users to perform the publication without its support. To do so, we used the automatic mapping generation resource (generate-mapping) available in the D2R Server software. This procedure automatically generates a mapping file expressed in the D2RQ language, which reflects the structure of the relational database to be mapped.

All the users who took part in the tests had good knowledge of the SQL language and little or no knowledge of semantic technologies, representing the scenario usually found among IT staff. The proposed method takes this fact into account, utilizing the system analysts’ and DBAs’ prior knowledge of SQL and not exposing them to the need to learn the set of tools required for publishing content on the Semantic Web.

As a methodology for performing the test, users were given a document with different data models, which were aimed at representing the tables related to the storage of criminal occurrences. Different data models were distributed among the user groups, so that there would be a significant representation of the main scenarios found in the databases of police departments. The use of different models was aimed at assessing the generality of this approach. The following performance factors were used for the tests conducted:

    1) Success in the mapping activities, which indicates whether it was possible to complete the mapping test within the allotted time (30 minutes);
    2) RDF mapping, which reflects the quantity of concepts and properties of the ontology that were successfully mapped to RDF for those users who finished the tasks (item 1);
    3) Correctness of the generated vocabulary, which reflects whether the published data met the main concepts described in the ontology;
    4) Autonomy, which is the number of users that finished the activities without human guidance at the time (only with the specification of the activity).

B. Results: Ontology Coverage

As mentioned before, the proposed crime ontology was based on the current initiatives of open crime data. For the purpose of evaluating the completeness of the ontology coverage, we compared the concepts represented therein with four samples of crime datasets in different countries: Oakland, US; FBI, US; London, UK; and Fortaleza, BR. A table describing the main concepts used in this comparison is available at http://www.wikicrimes.org/ontology/table.htm. In general, the main concepts were correctly mapped. Most of the types of reports open to the public refer to crimes against property (robbery, theft, burglary, etc.) and crimes against life (murder, attempted murder, etc.). Problematic cases refer to types of crimes that are generic, such as “anti-social behavior” or “disturbing the peace.” Typically this involves several types of crimes that differ from country to country. In the US, for instance, prostitution is a crime that could be classified as anti-social behavior. In Brazil, prostitution is not a crime. We decided not to drill down into each one of these cases; instead, we created generic classes to represent them.

C. Results: User Interaction

Figure 4a shows the results obtained from the tests in which D2RCrime was used, according to the indicators outlined in Section IV.A. Figure 4b shows the results for the case in which the D2R tool was used.

Taking into account that the users had no prior knowledge of the tool or of semantic technologies, the tests showed that the tool is a viable alternative for easily opening data. This strengthened our hypothesis that the use of the SQL metaphor is a good heuristic for the success of the method. The high percentage obtained in the “RDF mapping” and “Correctness of vocabulary” indicators can be used to demonstrate the effectiveness of the method. During the experiments, it was also observed that this approach obtained good acceptance because it is not necessary to invest time in semantic technologies/tools that are often not of direct interest to such users.

Regarding the “number of activities done in the time constraint” indicator, we found that each concept of the ontology was mapped, with the aid of the tool, in one minute on average. We also observed that mapping the last concepts was always faster than mapping the initial concepts: after mapping the first concepts, the users acquire the minimum experience with the tool, enough to perform the subsequent tasks even more quickly.

Regarding the “RDF mapping” indicator, there were slight indications of mapping and usability failures. In one of the tests, the tool did not properly format a string provided by the user for the “date” field, causing the respective property of the ontology not to be mapped successfully. The “date” field is more prone to situations such as this, because several SQL functions (e.g., substring) are applied to it to format the data.

In order to make a comparative analysis, we conducted the same test with other users, but this time using a different methodology. We chose to use the tool provided by D2R itself, in which, given a relational database, the automated mapping functionality (generate-mapping) is responsible for generating the mapping file starting from the structure of the database. In order to do so, the tool generates an RDF vocabulary according to the database, taking the table names as the ontology class names and the table columns as the ontology properties. The following aspects drove the choice of the D2R tool:

    1) No paid license required;
    2) Ease of use;
    3) Availability on the market;
    4) Ability to be used in a 30-minute test without the need for special infrastructure.

Approaches such as the Asio Semantic Bridge for Relational Databases (ASBRD)⁹, SquirrelRDF¹⁰, and RDBToOnto [9] are methods that are close to our approach, but they require a considerable learning curve, due largely to the need for specific configurations and the need to manipulate the mapping file manually. Tools such as Oracle Semantic Technologies and the Asio SBRD itself require paid software licenses.

Fig. 4. Results of the evaluation (a) with the use of D2RCrime and (b) with the D2R standard tool

As the methodology for conducting this second phase of testing, a document containing the information needed to install the D2R Server software was made available to the users, as well as the procedures to generate the automatic mapping of the relational database and test whether the publication of the data was successful. Before beginning the tests, the basic operation of the D2RQ mapping file was

automatic mapping to be generated, confirming the fact that, even for a task that is simple to perform, a higher learning/difficulty curve is already present for the completion of the mapping tasks due to the need to learn about semantic tools.

D. Discussion

As a general result, the data obtained showed the proposed method to be a viable alternative for easily opening data on the Semantic Web. The D2RCrime tool is shown to be an effective alternative to lessen the steep learning curve required in this process.

It is important to stress that the automatic mapping generated by the D2R Server software does not provide integration with standardized ontologies accepted by the community (e.g., GeoNames, Time, PMLP, SIOC, etc.), which somewhat hinders data integration and reuse of information. Using the D2RCrime tool, the data are published using a proposed ontology that foresees this entire scenario of integration/mash-up of information.

It is also important to highlight that, in order for semantic applications to be integrated more deeply with the published data, it is necessary to replace the automatically generated vocabulary with RDF vocabularies that are standardized, accepted by the community, widely known, and publicly accessible. The generated mapping can be freely edited. However, in order to do so, the user must know how the mapping method and its syntax work.

V. RELATED WORK

Metatomix’s Semantic Platform¹¹ and RDBToOnto¹² [9] are
explained to the users, detailing its main structures and              examples of automatic tools that generate a populated ontology
compulsory components (ClassMaps and PropertyBridges).                 in RDF. In the case of the first, the mapping is done through a
After these procedures, the users then began the tasks related to      graphical eclipse plugin. Other structured sources can map to
publication of the data.                                               the same ontology allowing data integration under the same
                                                                       ontology. DB2OWL [10] automatically generates ontologies
    Figure 4b reflects the results of the testing, according to the
                                                                       from database schemas, but it does not populate the ontology
same aforementioned indicators. The “RDF mapping” (100%)
                                                                       with instances. The mapping process is performed from the
demonstrates that the approach is stable and is able to perform
                                                                       detection of particular cases for conceptual elements in the
the mapping of the various types of data among the tables and
                                                                       database, then the conversion is realized through the mappings
columns involved. The “Correctness of vocabulary” indicator,
                                                                       from these components present in the database to their
however, got a very low percentage (0%). This is obviously
                                                                       counterparts in the ontology.
due to the fact that using only the D2R, the classes and fields of
the ontology cannot be generated. The D2R tool generates its               Triplify [11] is a lightweight plug-in that exposes relational
own vocabulary created in an ad hoc way. This reflects a               database data as RDF and Linked Data on the Web. There is no
common fragility found in automated mapping approaches:                SPARQL support. The desired data to be exposed is defined in
although the data are mapped to RDF, in order for them to be           a series of SQL queries. Triplify is written only in PHP but has
able to actually represent the local domain and its respective         been adapted to several popular web applications (WordPress,
relationships to be mapped, the mapping device must undergo a          Joomla, osCommerce, etc.).
series of customizations to relate the generated instances                 ODEMapster13 is a plugin for the NeOn toolkit, which
efficiently.                                                           provides a GUI to manage mappings between the relational
    The “the number of activities done in the time constraint”         database and RDFS/OWL ontologies. The mappings are
indicator (40%) shows that not all tests could be completed in         expressed in the R2O language.
the stipulated time. This is due to the fact that users had to learn
how to configure the D2RServer software in order for the
                                                                                        11
                                                                                             http://www.metatomix.com
                                                                                                       12
                                                                                                          http://www.tao-
         9
             http://www.bbn.com/technology/knowledge/asio_sbrd          project.eu/researchanddevelopment/demosanddownloads/RDBToOnto.html
                   10                                                                    13
                      http://jena.sourceforge.net/SquirrelRDF                               http://neon-toolkit.org/wiki/ODEMapster
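    The query-driven style of publication described above for Triplify can be illustrated with a minimal Python sketch that turns the rows returned by a SQL query into N-Triples. This is not Triplify's actual PHP configuration format. The table crime_report, the base URI http://example.org/crime/, the vocabulary namespace, and the convention that the first column identifies the resource are all assumptions made for illustration.

```python
import sqlite3

BASE = "http://example.org/crime/"    # hypothetical base URI for resources
VOCAB = "http://example.org/vocab#"   # hypothetical vocabulary namespace

# One SQL query per exposed resource type (assumed convention):
# the first column is the row identifier, and the alias of every
# other column becomes the local name of a property.
QUERIES = {
    "report": "SELECT id, crime_type AS category, city "
              "FROM crime_report ORDER BY id",
}


def escape_literal(value):
    """Escape a value for use inside an N-Triples string literal."""
    text = str(value)
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def rows_to_ntriples(conn, queries):
    """Run each query and return its result rows as N-Triples lines."""
    lines = []
    for kind, sql in queries.items():
        cursor = conn.execute(sql)
        columns = [d[0] for d in cursor.description]
        for row in cursor:
            subject = f"<{BASE}{kind}/{row[0]}>"
            for name, value in zip(columns[1:], row[1:]):
                if value is None:  # NULL columns produce no triple
                    continue
                lines.append(
                    f'{subject} <{VOCAB}{name}> "{escape_literal(value)}" .'
                )
    return lines


def demo():
    """Build a tiny in-memory database and publish it as triples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE crime_report "
        "(id INTEGER PRIMARY KEY, crime_type TEXT, city TEXT)"
    )
    conn.executemany(
        "INSERT INTO crime_report VALUES (?, ?, ?)",
        [(1, "robbery", "Fortaleza"), (2, "theft", None)],
    )
    return rows_to_ntriples(conn, QUERIES)


if __name__ == "__main__":
    print("\n".join(demo()))
```

    In this sketch, renaming a column alias in the query is enough to change the published property name. No SPARQL endpoint is involved: only the data selected by the queries are exposed.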
    Asio’s SBRD (Semantic Bridge for Relational Databases) enables integration of relational databases with the Semantic Web by allowing SPARQL queries over the relational database. An initial OWL ontology is generated from the database schema, which can then be mapped to a defined domain OWL ontology. The refinement of the ontology is done by means of Snoogle [12]. Snoogle converts the initial mappings to SWRL/RDF or SWRL/XML. It also displays two ontologies on screen so that the correspondence between their classes, and between the attributes of those classes, can be generated. This whole mapping process is carried out through a visual interface.

    This two-step approach followed by Asio requires significant effort from the user compared with the approach we have proposed. For non-experts, it requires learning two sets of tools. SquirrelRDF⁸ is a tool that allows relational databases to be queried using SPARQL. This tool takes a simplistic approach: unlike D2RQ, it does not perform any complex model mapping. One of the most significant limitations of this approach is that SPARQL queries that search for properties cannot be used.

                         VI. CONCLUSION

    In this paper we have described a method that relies on ontologies as a pattern to represent the concepts of crime and crime reports. Besides serving as a pattern, the ontologies are the interface for publishing relational crime data on-the-fly. We have also proposed an interactive tool, called D2RCrime, which assists the designer/DBA in establishing the correspondence between the relational data and the classes and properties of the crime ontology. This correspondence allows automatic generation of the mapping rules between the two representations, which drive access to the relational data from SPARQL.

    Open issues persist and will drive our future research. Open data may come from different sources. It will be necessary to have mechanisms to compare reports and check whether they refer to the same fact. Creating mechanisms to identify these repetitions automatically is a challenge to be pursued. Another challenge, which also arises because information comes from different sources, is the need to account for the credibility of information automatically. When sources are known, such as official sources, the attribution of credibility is natural. However, the credibility of non-official information sources is difficult to assign. Methods for computing the reputation and trustworthiness of sources, as in [13] [14], are examples of how this can be addressed.

    Finally, it is important to point out that the main advantage of having open crime data is the possibility that it will be used to provide services to citizens. Examples are alerts about how dangerous a certain place is and suggestions of safe routes. Such information can be enriched with data coming from popular participation, for example, via collaborative mapping. An example of collaborative mapping in Law Enforcement is WikiCrimes¹⁴ [13]. WikiCrimes aims to offer a common interaction space for the general public, so that people are able to report criminal facts as well as keep track of the locations where such crimes occur. We have integrated D2RCrime with WikiCrimes, so that the instances retrieved by WikiCrimes from the Police Department’s relational databases via D2RCrime are plotted directly on the digital map (for further details see [15]). In this way, a set of services provided by WikiCrimes is available to citizens. It is possible to receive alerts about dangerous places, including by email. Apps for iPhones and Android smartphones also exist.

    ¹⁴ http://www.wikicrimes.org

                         ACKNOWLEDGMENT

    This work was supported in part by the CNPq under Grants 55977/2010-7 and 304347/2011-6.

                         REFERENCES

[1] D. Lathrop, L. Ruma, “Open government: Collaboration, transparency, and participation in practice”, O’Reilly Media, 2010.
[2] C. Bizer, R. Cyganiak, “D2R Server - Publishing Relational Databases on the Semantic Web”, in Poster at the 5th International Semantic Web Conference, 2006.
[3] J.R. Hobbs, F. Pan, “An ontology of time for the semantic web”, in ACM Transactions on Asian Language Information Processing (TALIP), 66–85, ISSN 1530-0226.
[4] Y. Raimond, S. A. Abdallah, “The event ontology”, 2006. Available: http://purl.org/NET/c4dm/event.owl.
[5] D. McGuinness, L. Ding, P. Pinheiro da Silva, C. Chang, “PML2: A Modular Explanation Interlingua”, in Proceedings of the AAAI 2007 Workshop on Explanation-Aware Computing, Vancouver, British Columbia, Canada, July 22-23, 2007.
[6] L. Moreau, B. Clifford, J. Freire, J. Futrelle, Y. Gil, P. Groth, N. Kwasnikowska, S. Miles, P. Missier, J. Myers, B. Plale, Y. Simmhan, E. Stephan, and J. Van Den Bussche, “The Open Provenance Model Core Specification (v1.1)”, in Future Generation Computer Systems, 2010.
[7] J. F. Sequeda, R. Depena, D. Miranker, “Ultrawrap: Using SQL Views for RDB2RDF”, in Poster at the 8th International Semantic Web Conference (ISWC2009), Washington DC, US, 2009.
[8] C. Bizer, A. Seaborne, “The D2RQ Platform v0.7 - Treating Non-RDF Relational Databases as Virtual RDF Graphs”. Available: http://www4.wiwiss.fu-berlin.de/bizer/d2rq/spec/20090810.
[9] F. Cerbah, “RDBToOnto: un logiciel dédié à l’apprentissage d’ontologies à partir de bases de données relationnelles”, Strasbourg, 2009.
[10] N. Cullot, R. Ghawi, K. Yétongnon, “DB2OWL: A Tool for Automatic Database-to-Ontology”, in Proceedings of the 15th Italian Symposium on Advanced Database Systems (SEBD), 2007.
[11] S. Auer, “Triplify – Light-Weight Linked Data Publication from Relational Databases”, in Proceedings of the 18th World Wide Web Conference (WWW2009).
[12] H. Wang, C. Tan, Q. Li, “Snoogle: A search engine for the physical world”, in IEEE Infocom, 2008.
[13] V. Furtado, L. Ayres, L. de Oliveira, M. Vasconcelos, C. Caminha, J. D’Orleans, “Collective Intelligence in the Law Enforcement: The WikiCrimes System”, in Information Science, 2010.
[14] I. Pinyol, J. Sabater-Mir, G. Cuni, “How to talk about reputation using a common ontology: From definition to implementation”, in Proceedings of the Ninth Workshop on Trust in Agent Societies, Hawaii, USA, pp. 90-101, 2007.
[15] J. Tavares, V. Furtado, H. Santos, “Open Government in Law Enforcement: Assisting the publication of Crime Occurrences in RDF from Relational Data”, in AAAI Fall Symposium on Open Government Knowledge: AI Opportunities and Challenges, Arlington, VA, 2011.