<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Case Study of an Australian Crime Dataset</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jessica Liebig</string-name>
          <email>jessica.liebig@rmit.edu.au</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Asha Rao</string-name>
          <email>asha@rmit.edu.au</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Proc. of the 3rd Annual Conference of Research@Locate</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Mathematical and Geospatial Sciences, RMIT University</institution>
          ,
          <addr-line>Melbourne, VIC</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Analysis of crime data is crucial for the prevention and assessment of illegal activity. This paper is one of the first case studies of a crime dataset collected in New South Wales, Australia. We apply methods from complex network analysis to identify key aspects of criminal activity in the state of New South Wales. We further detect groups of local government areas and examine their dynamics over time. We present our results using several visualisation techniques.</p>
        <p>The analysis of crime datasets is necessary in order to prevent and assess criminal activity [17]. Information about different types of crimes can often be found in the form of annual reports published by government bodies, but rarely in the form of publicly available datasets that may be used for research. In contrast, the New South Wales Bureau of Crime Statistics and Research in Australia has published data on criminal activity in the state of New South Wales (NSW) [4]. The dataset contains information collected between 1995 and 2012, recording several types of offences and the local government area where they occurred.</p>
      </abstract>
      <kwd-group>
        <kwd>Crime data</kwd>
        <kwd>Visualisation</kwd>
        <kwd>Networks</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>The dataset analysed in this paper is publicly available [<xref ref-type="bibr" rid="ref4">4</xref>] and contains information about the different types
of crime that took place in New South Wales between 1995 and 2012. It records the local government
area where each crime occurred, along with its offence category and the month and year of the crime. The
New South Wales Bureau of Crime Statistics and Research also provides a helpful visualisation tool for the dataset on
its website. The tool allows the user to explore basic statistics of the local government areas and
offence categories.</p>
      <p>As outlined in the introduction, we use tools from complex network analysis to analyse
this data and represent the given information as a network. In the case of the NSW crime data, there
are 155 local government areas and 49 offence categories, which can be represented by two different
types of nodes. A government area can never be linked directly to another government area. Similarly,
a connection cannot be established between two offence categories; hence, links are found solely
between areas and offences. For example, the scenario of a person stealing from a retail store in Bourke
and two people escaping custody in the local government areas Wagga Wagga and Upper Hunter Shire
may be represented as the network depicted in Figure 1.</p>
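      <p>The two-node-type structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the record layout, the function name build_bipartite, and the choice to store a crime count on each link are all hypothetical.</p>

```python
from collections import defaultdict

# Hypothetical records, one per recorded crime: (local government area, offence category).
records = [
    ("Bourke", "Stealing from a retail store"),
    ("Wagga Wagga", "Escaping custody"),
    ("Upper Hunter Shire", "Escaping custody"),
]


def build_bipartite(records):
    """Return the two node sets and the area-offence links with crime counts."""
    areas, offences = set(), set()
    links = defaultdict(int)  # (area, offence) -> number of recorded crimes (assumed attribute)
    for area, offence in records:
        areas.add(area)
        offences.add(offence)
        links[(area, offence)] += 1  # links only ever join an area to an offence
    return areas, offences, dict(links)


areas, offences, links = build_bipartite(records)
print(sorted(areas))
print(sorted(offences))
print(links)
```

      <p>Because every link is keyed by an (area, offence) pair, two areas or two offence categories can never be joined directly, which mirrors the constraint stated in the text.</p>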
    </sec>
    <sec id="sec-6">
      <title>-</title>
      <p>Figure 1 (node labels): offence nodes representing stealing from a retail store and escaping custody; local government area nodes representing Upper Hunter Shire, Bourke, and Wagga Wagga.</p>
      <p>We are particularly interested in changes in the data between 2000 and 2012 and hence have divided
the dataset into 156 networks, each covering a period of one month. Analysing each network separately
and comparing the results gives valuable insights into the dynamics of criminal activity with respect to
the local government areas.</p>
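      <p>The division into monthly networks can be illustrated as follows. The sketch assumes hypothetical records of the form (year, month, area, offence) and a hypothetical helper named split_by_month; 13 years of 12 months give the 156 networks mentioned above.</p>

```python
def split_by_month(records, first_year=2000, last_year=2012):
    """Group records into one list of (area, offence) links per calendar month."""
    monthly = {
        (year, month): []
        for year in range(first_year, last_year + 1)
        for month in range(1, 13)
    }
    for year, month, area, offence in records:
        if (year, month) in monthly:  # records outside the study period are ignored
            monthly[(year, month)].append((area, offence))
    return monthly


# Hypothetical sample records; the 1999 record falls outside the period.
records = [
    (1999, 12, "Bourke", "Trespass"),
    (2000, 1, "Bourke", "Stealing from a retail store"),
    (2000, 10, "Wagga Wagga", "Trespass"),
    (2000, 10, "Bourke", "Trespass"),
]
monthly = split_by_month(records)
print(len(monthly))
print(monthly[(2000, 10)])
```

      <p>Each value in the resulting dictionary can then be analysed as a separate network and the monthly results compared.</p>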
      <p>IDENTIFYING CENTRAL ASPECTS TO CRIME ACTIVITY</p>
      <p>The identification of central crime locations and offences is highly beneficial in preventing future criminal
activity [<xref ref-type="bibr" rid="ref17">17</xref>]. Knowledge of critical areas allows government agencies to target illegal activities more
efficiently. By applying a combination of the two methods introduced in [<xref ref-type="bibr" rid="ref11">11</xref>] and [<xref ref-type="bibr" rid="ref12">12</xref>], we find the most
important areas and offences in the NSW crime network.</p>
      <p>
average concentration in the network. A location or offence type whose concentration is very
different from the average plays an important role in the dynamics of the network. To compare each
concentration with the average, we calculate a score based on the mean and standard deviation
of the concentrations. For more detail on this method see [<xref ref-type="bibr" rid="ref11">11</xref>] and [<xref ref-type="bibr" rid="ref12">12</xref>]. The local government
areas and offence categories can then be ranked accordingly. We have ranked the local government
areas and offence categories for every month between January 2000 and December 2012. Note that the
rankings of the local government areas are based on all 49 offence categories, and the rankings of the
offence categories are based on all 155 local government areas.
      </p>
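      <p>To make the idea of a score based on the mean and standard deviation concrete, the following sketch standardises a set of concentrations and maps each standardised value to the interval from 0 to 1. The mapping through the normal distribution function is purely an illustrative assumption, chosen so that an average concentration gives 0.5 and a high concentration gives a value near 0; the actual score is defined in [11] and [12]. The function name and the sample values are hypothetical.</p>

```python
import math
from statistics import mean, pstdev


def ranks_from_concentrations(concentrations):
    """Map each node's concentration to a rank in [0, 1] (illustrative assumption)."""
    mu = mean(concentrations.values())
    sigma = pstdev(concentrations.values())
    ranks = {}
    for node, c in concentrations.items():
        # Standardised distance from the mean; all zero if there is no spread.
        z = 0.0 if sigma == 0 else (c - mu) / sigma
        # 1 - Phi(z): above-average concentration gives a rank below 0.5.
        ranks[node] = 0.5 * math.erfc(z / math.sqrt(2))
    return ranks


# Hypothetical concentrations for three nodes.
ranks = ranks_from_concentrations({"A": 1.0, "B": 2.0, "C": 3.0})
print(ranks)
```

      <p>In this toy input node B sits exactly at the mean and receives 0.5, node C (highest concentration) receives the lowest rank, and node A the highest.</p>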
      <sec id="sec-6-1">
        <title>Ranking of local government areas</title>
        <p>A total of 155 local government areas form the Australian state of New South Wales. Their ranks range
between 0 and 1 and are inversely related to the concentration. Thus, a rank close to zero shows
that the concentration of a particular area was higher than the mean concentration. A rank close to 0.5
shows a concentration similar to the average, and a high rank (close to one) represents a concentration
much lower than the average. Examination of our results shows that the rank of any individual area never
fell below 0.3, meaning that the concentrations are skewed, with many areas exhibiting concentrations
below the average. We found that isolated and sparsely populated areas received extremely high ranks
and showed little variation over time. We have plotted the ranks of four government areas over
time in Figure 3. Making a clear connection between the rate of certain crimes in particular
areas and their rank requires further work.</p>
        <p>Figure 3: Rank over time for four local government areas: Leichhardt, Kogarah, Unincorporated Far West, and Lord Howe Island (x-axis: time, ticks from 20 to 140).</p>
      </sec>
      <sec id="sec-6-2">
        <title>Ranking of offence categories</title>
        <p>As with the government areas, the ranks of the 49 offence categories range between 0 and 1. We
observe that more common, often less serious, crimes are ranked low (close to 0), while offences that are
less common, but more serious, are given a high rank (close to 1).</p>
        <p>The ranking of offence categories changes from month to month; however, the observed difference in
the ranking of each category is generally small. Two of the lowest ranked categories in the NSW crime
dataset in the years between 2000 and 2012 are possession and use of cannabis and sexual offences.
Some offences that fall under disorderly conduct and certain offences against justice procedures were
also ranked low throughout the 13-year period. Figure 4 shows the change in ranking of these offences
together with other similar offences that fall within the same super-category.</p>
        <sec id="sec-6-2-9">
          <title>-</title>
          <p>Figure 4 (recovered panel titles and legend entries): Sexual offences: Sexual assault; Indecent assault, other sexual offences. Offences against justice procedures: Escaping custody; Breaching Apprehended Violence Order; Breaching bail conditions; Failing to appear; Resisting or hindering officer; Other offences against justice procedures.</p>
          <p>
            Looking at the first plot in Figure 4, we can see that the rank of the offence category use or possession
of cannabis is much lower than that of the other listed drugs. Although Australia has seen a significant
decline in the use of drugs after the tightening of drug strategies in 1998, cannabis is still one of the most
commonly used drugs [<xref ref-type="bibr" rid="ref5">5</xref>].
          </p>
          <p>
            Both sexual offence categories recorded in the dataset received very low rankings throughout the
13-year period. Sexual offences are a major issue throughout Australia, with New South Wales having the
highest total number of sexual assaults reported to police [<xref ref-type="bibr" rid="ref16">16</xref>]. According to the Australian Bureau of
Statistics, 20% of women and 5% of men over the age of 15 experience sexual violence [<xref ref-type="bibr" rid="ref1">1</xref>].
          </p>
          <p>
            Disorderly conduct is another common offence in NSW, specifically on weekends and in connection
with alcohol consumption [<xref ref-type="bibr" rid="ref15">15</xref>]. Interestingly, the category criminal intent is ranked higher than other
acts of disorderly conduct. This suggests that in many cases the police do not detect the planning
of criminal activity.
          </p>
          <p>
            On the other hand, homicide and the dealing of cocaine are two of the highest ranked categories (see
Figure 5). According to the Australian Institute of Criminology [<xref ref-type="bibr" rid="ref3">3</xref>], homicide currently has one
of the lowest rates of any crime in Australia, and it is unlikely that a homicide remains unreported, as is often the
case with domestic violence. With regard to cocaine dealing, between 2003 and 2012 cocaine arrests
accounted for less than 1.5% of national illicit drug arrests [<xref ref-type="bibr" rid="ref2">2</xref>].
          </p>
          <p>Clearly, the rank of an offence reflects the severity of the crime and not the rate at which it occurs. The
data indicate that petty crimes such as trespassing occur more often than serious crimes such as
murder.</p>
          <p>DETECTION OF GROUPS</p>
          <p>
            The detection of groups of entities within a system has been another field of great interest in the area of
complex networks in recent years [<xref ref-type="bibr" rid="ref14">14</xref>]. Being able to divide the local government areas of NSW into
groups may further aid in the prevention of crime. Certain strategies of crime prevention that are already
in place in some areas may be applied to other areas. However, a prevention scheme that works in one
location is not guaranteed to be successful in another. If two local government areas are classified to be in
the same group, they have many things in common. Therefore, a prevention strategy that works
in one area is more likely to work in another area that is part of the same group.
          </p>
          <p>Figure 5 (recovered legend; y-axis: rank, from 0 to 1): Dealing or trafficking cocaine; Dealing or trafficking narcotics; Dealing or trafficking cannabis; Dealing or trafficking amphetamines; Dealing or trafficking ecstasy; Dealing or trafficking other drugs.</p>
          <p>
            Determining groups of local government areas, or of offence categories, requires a
simplification of the network. Without loss of generality, we describe the process in terms of the local government
areas. The network is simplified in the following manner: the nodes representing the 49 different offence
categories are dropped, and two areas are linked if one or more crimes of the same category occurred in
both areas. For instance, if an attempted murder occurred in each of the two areas Bourke and Wagga Wagga,
a connection is established between these regions. Each connection carries an attribute that
records the number of crimes in common. Once the network is simplified, it is possible to determine the
most significant connections. Dropping all insignificant connections reveals the different groups, because
insignificant connections often occur between groups, whereas significant links usually occur within groups. Details
on how to determine the significance of a connection can be found in [<xref ref-type="bibr" rid="ref13">13</xref>]. Note that the identified groups
depend on significant connections to all 49 offence categories.
          </p>
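          <p>The simplification step can be sketched as follows. This is an illustration under stated assumptions rather than the published procedure: the number of crimes in common is interpreted here as the number of shared offence categories, and a plain weight threshold (min_weight) stands in for the significance test of [13]. The function names and sample links are hypothetical.</p>

```python
from itertools import combinations


def project_onto_areas(links):
    """Drop offence nodes; link two areas that share one or more offence categories."""
    offences_by_area = {}
    for area, offence in links:
        offences_by_area.setdefault(area, set()).add(offence)
    weights = {}
    for a, b in combinations(sorted(offences_by_area), 2):
        shared = len(offences_by_area[a].intersection(offences_by_area[b]))
        if shared > 0:
            weights[(a, b)] = shared  # assumed attribute: shared category count
    return weights


def find_groups(nodes, weights, min_weight):
    """Groups are the connected components left after dropping light links."""
    neighbours = {n: set() for n in nodes}
    for (a, b), w in weights.items():
        if w >= min_weight:  # stand-in for the significance test (assumption)
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, result = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            n = stack.pop()
            if n in component:
                continue
            component.add(n)
            stack.extend(neighbours[n] - component)
        seen |= component
        result.append(component)
    return result


# Hypothetical (area, offence) links for a single month.
links = [
    ("Bourke", "Attempted murder"),
    ("Wagga Wagga", "Attempted murder"),
    ("Wagga Wagga", "Escaping custody"),
    ("Upper Hunter Shire", "Escaping custody"),
    ("Bourke", "Escaping custody"),
]
weights = project_onto_areas(links)
area_groups = find_groups({a for a, _ in links}, weights, min_weight=2)
print(weights)
print(area_groups)
```

          <p>With the threshold set to 2, Bourke and Wagga Wagga stay connected while Upper Hunter Shire is left as a group of its own, which shows how single-area groups can arise once weaker links are dropped.</p>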
          <p>We are interested not solely in finding different groups but also in their development over time and
hence have explored the dynamics in group structure of the local government areas.</p>
          <p>We identified groups of local government areas in all 156 networks and always found two main
groups as well as some smaller groups. The two largest groups usually contained government areas in the
north-east and south-west, respectively. We often found groups that consisted of only a single government
area. Interestingly, such areas always received one of the highest ranks during the corresponding month.</p>
          <p>Figure 6 shows a map of NSW and its local government areas. Areas are coloured according to group
membership. The largest group is coloured in blue and the second largest in green, with the size of a group
determined by the number of its members and not the total area covered. Grey represents
missing data for that month. Examination of data from October 2000 (third map in Figure 6) reveals
that the areas in the largest group, coloured in green, experienced higher rates of trespassing than
the average for NSW during that month. Trespassing was also the lowest ranked crime during
that month. Whether this pattern continues throughout the dataset requires more
research and is left for future work.</p>
          <p>CONCLUSION</p>
          <p>This paper has shown how tools from complex network analysis can be applied to crime data in order to
describe its dynamics. We have ranked the different offence categories and local government areas in
the state of New South Wales in order to gain an understanding of the underlying mechanics of criminal
activity. Several visualisation techniques were used to present the results. Drawing clear
conclusions and identifying the causes behind the results presented in this paper requires further research.</p>
          <p>Proc. of the 3rd Annual Conference of Research@Locate</p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] <collab>Australian Bureau of Statistics</collab>, viewed <date-in-citation content-type="access-date">10 December 2015</date-in-citation>, &lt;http://www.abs.gov.au/&gt;.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] <collab>Australian Crime Commission</collab> <year>2014</year>, <article-title>Illicit drug data report</article-title>, viewed <date-in-citation content-type="access-date">10 December 2015</date-in-citation>, &lt;https://www.crimecommission.gov.au/sites/default/files/290414-IDDR-2012-13.pdf&gt;.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3] <collab>Australian Institute of Criminology</collab>, viewed <date-in-citation content-type="access-date">10 December 2012</date-in-citation>, &lt;http://www.aic.gov.au/&gt;.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] <collab>NSW Bureau of Crime Statistics and Research</collab> <year>2013</year>, <article-title>NSW crime data</article-title>, viewed <date-in-citation content-type="access-date">6 December 2015</date-in-citation>, &lt;http://data.gov.au/dataset/nsw-crime-data&gt;.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] <collab>United Nations Office on Drugs and Crime</collab>, viewed <date-in-citation content-type="access-date">10 December 2015</date-in-citation>, &lt;www.unodc.org&gt;.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] <string-name><surname>ALZAHRANI</surname>, <given-names>T.</given-names></string-name>, AND <string-name><surname>HORADAM</surname>, <given-names>K. J.</given-names></string-name> <article-title>Analysis of two crime-related networks derived from bipartite social networks</article-title>. In <source>Advances in Social Networks Analysis and Mining (ASONAM), 2014 IEEE/ACM International Conference on</source> (<year>2014</year>), pp. <fpage>890</fpage>-<lpage>897</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] <string-name><surname>ARAL</surname>, <given-names>S.</given-names></string-name>, AND <string-name><surname>WALKER</surname>, <given-names>D.</given-names></string-name> <article-title>Identifying influential and susceptible members of social networks</article-title>. <source>Science</source> <volume>337</volume>, <issue>6092</issue> (<year>2012</year>), <fpage>337</fpage>-<lpage>341</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] <string-name><surname>CHEN</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>LÜ</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>SHANG</surname>, <given-names>M.-S.</given-names></string-name>, <string-name><surname>ZHANG</surname>, <given-names>Y.-C.</given-names></string-name>, AND <string-name><surname>ZHOU</surname>, <given-names>T.</given-names></string-name> <article-title>Identifying influential nodes in complex networks</article-title>. <source>Physica A</source> <volume>391</volume>, <issue>4</issue> (<year>2012</year>), <fpage>1777</fpage>-<lpage>1787</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9] <string-name><surname>CHEN</surname>, <given-names>D.-B.</given-names></string-name>, <string-name><surname>GAO</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>LÜ</surname>, <given-names>L.</given-names></string-name>, AND <string-name><surname>ZHOU</surname>, <given-names>T.</given-names></string-name> <article-title>Identifying influential nodes in large-scale directed networks: The role of clustering</article-title>. <source>PloS One</source> <volume>8</volume>, <issue>10</issue> (<year>2013</year>), <elocation-id>e77455</elocation-id>.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10] <string-name><surname>KITSAK</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>GALLOS</surname>, <given-names>L. K.</given-names></string-name>, <string-name><surname>HAVLIN</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>LILJEROS</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>MUCHNIK</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>STANLEY</surname>, <given-names>H. E.</given-names></string-name>, AND <string-name><surname>MAKSE</surname>, <given-names>H. A.</given-names></string-name> <article-title>Identifying influential spreaders in complex networks</article-title>. <source>Nature Physics</source> <volume>6</volume>, <issue>11</issue> (<year>2010</year>), <fpage>888</fpage>-<lpage>893</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11] <string-name><surname>LIEBIG</surname>, <given-names>J.</given-names></string-name>, AND <string-name><surname>RAO</surname>, <given-names>A.</given-names></string-name> <article-title>Identifying influential nodes in bipartite networks using the clustering coefficient</article-title>. In <source>2014 Tenth International Conference on Signal-Image Technology and Internet-Based Systems</source> (<year>2014</year>), pp. <fpage>323</fpage>-<lpage>330</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12] <string-name><surname>LIEBIG</surname>, <given-names>J.</given-names></string-name>, AND <string-name><surname>RAO</surname>, <given-names>A.</given-names></string-name> <article-title>Predicting item popularity: Analysing local clustering behaviour of users</article-title>. <source>Physica A</source> <volume>442</volume> (<year>2016</year>), <fpage>523</fpage>-<lpage>531</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13] <string-name><surname>LIEBIG</surname>, <given-names>J.</given-names></string-name>, AND <string-name><surname>RAO</surname>, <given-names>A.</given-names></string-name> <article-title>Fast extraction of the backbone of projected bipartite networks to aid community detection</article-title>. <source>Europhysics Letters</source> (To appear, accepted: 25 January <year>2016</year>).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14] <string-name><surname>NEWMAN</surname>, <given-names>M. E. J.</given-names></string-name> <article-title>Finding community structure in networks using the eigenvectors of matrices</article-title>. <source>Physical Review E</source> <volume>74</volume>, <issue>3</issue> (<year>2006</year>), <elocation-id>036104</elocation-id>.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15] <string-name><surname>SWEENEY</surname>, <given-names>J.</given-names></string-name>, AND <string-name><surname>PAYNE</surname>, <given-names>J.</given-names></string-name> <year>2012</year>, <article-title>Alcohol and disorderly conduct on Friday and Saturday nights</article-title>, viewed <date-in-citation content-type="access-date">10 December 2015</date-in-citation>, &lt;http://www.aic.gov.au/publications/current%20series/rip/1-10/15.html&gt;.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16] <string-name><surname>TARCZON</surname>, <given-names>C.</given-names></string-name>, AND <string-name><surname>QUADARA</surname>, <given-names>A.</given-names></string-name> <year>2012</year>, <article-title>The nature and extent of sexual assault and abuse in Australia</article-title>, viewed <date-in-citation content-type="access-date">10 December 2015</date-in-citation>, &lt;http://www3.aifs.gov.au/acssa/pubs/sheets/rs5/&gt;.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17] <string-name><surname>WHITE</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>YEHLE</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>SERRANO</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>OLIVEIRA</surname>, <given-names>M.</given-names></string-name>, AND <string-name><surname>MENEZES</surname>, <given-names>R.</given-names></string-name> <article-title>The spatial structure of crime in urban environments</article-title>. In <source>Social Informatics</source>. <publisher-name>Springer</publisher-name>, <publisher-loc>New York</publisher-loc>, <year>2014</year>, pp. <fpage>102</fpage>-<lpage>111</lpage>.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>