<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Framework for Occupational Fraud Detection by Social Network Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sanni Lookman</string-name>
          <email>lookman.sanni@malix.univ-paris1.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Selmin Nurcan</string-name>
          <email>nurcan@univ-paris1.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centre de Recherche en Informatique, Université Paris I Panthéon-Sorbonne</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper explores issues related to occupational fraud detection. Over the past years, network research has been widely used across the social and physical sciences, including but not limited to social sharing and filtering, recommendation systems, marketing and customer intelligence, counterintelligence and law enforcement. However, the adoption of social network analysis for insider fraud detection, whether by control professionals in organizations or by academics, remains very low. Taking a design science research perspective, this paper introduces the Occupational Fraud Detection (OFD) framework, which is based on formal social network analysis and semantic reasoning principles.</p>
      </abstract>
      <kwd-group>
        <kwd>Design science</kwd>
        <kwd>ontology</kwd>
        <kwd>data mining</kwd>
        <kwd>fraud detection</kwd>
        <kwd>social network analysis</kwd>
        <kwd>internal control</kwd>
        <kwd>governance</kwd>
        <kwd>risk</kwd>
        <kwd>compliance</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Fraud draws in part on the imaginative nature of human beings. Over the years,
fraudsters’ methods have evolved from opportunistic approaches to more
sophisticated and traceless deception schemes, in a business environment that is
becoming ever more automated yet ever more complex. In recent years, several cases of
unethical behavior within organizations have received significant attention. Notorious cases
range from financial scandals (Pechiney - 1988, Elf - 1994, Enron - 2001, Kerviel -
2008) to data theft (Windows 8 Beta - 2012, Korea Credit Bureau - 2014, Sony -
2014) and have shown that fraud can happen at any level of an organization.
The Association of Certified Fraud Examiners (ACFE), in its 2014 Report to the Nations on
Occupational Fraud and Abuse [ACFE, 2014], estimates a global loss of 5% of
revenues to fraud
        <xref ref-type="bibr" rid="ref1">(3.7 trillion dollars if applied to the 2013 Gross World Product)</xref>.
The ACFE additionally reported that fraud cases were mostly uncovered by tips or by chance (40%).
In other words, a simple anonymous fraud hotline would still prevent a large share of fraud damage,
even though knowledge discovery and data mining techniques abound.
      </p>
      <p>Detection innovations include automated rules, watch-list matching, supervised
and unsupervised classification, data fusion and link analysis. Such techniques have
attracted growing, industry-specific interest for the detection of external fraud (i.e. fraud committed by
people outside the organization). This includes cybercrime through
computer or network intrusion, as well as credit card, insurance, telecommunication and credit
application fraud [Phua et al., 2004], [Yufeng et al., 2004], [Cox et al., 1997],
[Wheeler et al., 2000]. Meanwhile, internal or occupational fraud, defined by the
ACFE as the use of one’s occupation for personal enrichment through the deliberate
misuse or misapplication of the employing organization’s resources or assets, has
proved to be more prevalent than external fraud. PricewaterhouseCoopers’ 2014
Global Economic Crime Survey reports that, in France, internal fraud accounts for 56% of cases on average
[PWC, 2014].</p>
      <p>This paper identifies the problems investigators face when detecting occupational
fraud and proposes a solution that helps address them.
Following the design science research paradigm [Wieringa, 2009], formal social
network analysis and semantic modeling concepts are reused to suggest a new
perspective on the architecture of an effective fraud detection system.</p>
      <p>The remainder of this paper is organized as follows. Section 2 introduces the
properties and formal analysis of social networks, and the motivation for using them to address fraud
detection issues. Section 3 then presents the design of the OFD framework and
the validation of that design in the context of fraud detection from journal entries. The
last section discusses related work, concluding remarks and their implications for further
research on social network analysis.</p>
    </sec>
    <sec id="sec-2">
      <title>Social Networks</title>
      <sec id="sec-2-1">
        <title>What is Social Network Analysis (SNA)?</title>
        <p>A social network is a structure made of social actors who share
interests, activities, etc. Jacob Moreno is cited by most research papers on
social network analysis as the first to introduce methods and tools for
formal analysis. In the 1930s, in a study aiming to explain a spate of runaways among girls,
he was the first to combine all four properties that
characterize SNA: (1) it is motivated by the intuition that links among social actors are important; (2) it is based on
data that record the social relations linking actors; (3) it draws heavily on graphic
imagery to reveal and display the patterning of those links; and (4) it develops mathematical and
computational models to describe and explain those patterns [Freeman, 2011].
Basically, SNA aims at understanding relationships between network participants by
means of mapping and measuring. SNA has received increased attention from
organizations seeking to understand the connections between patterns of interaction. It applies
to a wide range of business problems, including collaboration in the workplace, team
building after a merger, measurement of employee engagement, online
reputation, customer intelligence, business strategy, disease contagion, counter-terrorism,
etc. It was an SNA that led the US military to the capture of Saddam Hussein in
December 2003 [PSU, 2007]. The tool InFlow, for example, is credited with
contributing to the analysis of the terrorist networks surrounding the September 11th events and
to contact tracing for HIV transmission in a state prison [INFLOW, 2010].</p>
      </sec>
      <sec id="sec-2-2">
        <title>Formal Social Network Analysis</title>
        <p>Whether used for modeling the spread of infectious diseases, analyzing professional relations,
or identifying concentrations of resources or power, SNA follows one of two
approaches. Researchers distinguish between egocentric and socio-centric analysis
of networks [Chung et al., 2006]. The former focuses on the
local structure of a network, i.e. the network around a given node, while the latter
considers the network as a whole, looking at interaction patterns and the overall network
structure by quantifying relationships between people. This distinction affects
the SNA process during data collection and graph visualization.</p>
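        <p>The difference between the two perspectives can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the actors, the ties and the helper names <monospace>neighbours</monospace> and <monospace>ego_network</monospace> are hypothetical and do not come from the study. The egocentric view keeps a chosen actor, its direct contacts and the ties among them, whereas the socio-centric view keeps every tie.</p>
        <preformat>
```python
# Hypothetical undirected ties between actors (illustrative data only).
ties = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]


def neighbours(edges, ego):
    """Return the actors directly tied to ego."""
    found = set()
    for source, target in edges:
        if source == ego:
            found.add(target)
        elif target == ego:
            found.add(source)
    return found


def ego_network(edges, ego):
    """Egocentric view: ego, its direct contacts, and the ties among them."""
    members = neighbours(edges, ego) | {ego}
    return [(s, t) for s, t in edges if s in members and t in members]


# Egocentric analysis looks at the neighbourhood of one actor ...
ego_view = ego_network(ties, "A")
# ... while socio-centric analysis works on the whole network.
whole_view = ties
```
        </preformat>
        <p>Here <monospace>ego_view</monospace> contains only the three ties among A, B and C, while <monospace>whole_view</monospace> retains all five ties, which is why the choice of perspective already matters at data collection time.</p>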
        <p>SNA provides both a visual and a mathematical analysis of the relationships between
the entities participating in the network. From the visual perspective, social networks
are represented as “sociograms” [Scott et al., 2011], i.e. graphs showing actors as nodes
tied by one or more types of interdependency (values, ideas, visions, sex,
friendship, kinship, collaboration, trade, antagonism, etc.). From the mathematical
perspective, the social relations dataset translates into a matrix underlying the
visualized graph. This perspective serves to uncover the graph-theoretic properties of the network
(e.g. number of edges, number of vertices, degree, multiplexity, centrality, density,
closeness, betweenness, etc.), supported by metrics computed from the matrix, which
help characterize and even query the network at hand.</p>
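        <p>As a minimal sketch of the mathematical perspective, the Python fragment below turns a small, hypothetical list of ties into an adjacency matrix and derives three of the metrics listed above (degree, density and closeness) from it. The data and function names are assumptions made for illustration; the definitions used are the standard ones for a connected, undirected graph.</p>
        <preformat>
```python
from collections import deque

# Hypothetical undirected network (illustrative data only).
actors = ["A", "B", "C", "D"]
ties = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]


def adjacency_matrix(nodes, edges):
    """Return the symmetric 0/1 matrix underlying an undirected sociogram."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
        matrix[index[target]][index[source]] = 1
    return matrix


def degrees(matrix):
    """Degree of each node: the row sums of the adjacency matrix."""
    return [sum(row) for row in matrix]


def density(matrix):
    """Share of possible ties that are present: 2m / (n(n-1))."""
    n = len(matrix)
    m = sum(degrees(matrix)) // 2
    return 2 * m / (n * (n - 1))


def closeness(matrix, start):
    """Closeness of one node: (n-1) / sum of shortest-path distances.

    Distances come from a breadth-first search; the graph is assumed
    to be connected.
    """
    n = len(matrix)
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in range(n):
            if matrix[current][neighbor] and neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return (n - 1) / sum(distance.values())


matrix = adjacency_matrix(actors, ties)
actor_degrees = dict(zip(actors, degrees(matrix)))
network_density = density(matrix)
closeness_of_a = closeness(matrix, 0)
```
        </preformat>
        <p>In this toy network, actor A is tied to every other actor, so it has the highest degree and a closeness of 1.0, while four of the six possible ties are present.</p>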
      </sec>
      <sec id="sec-2-3">
        <title>Why Social Network Analysis for Addressing Insider Fraud Detection?</title>
        <p>Social Network Analysis can bring value to occupational fraud detection in at least
three ways. First, today’s organizations are networked (staff, management,
customers, suppliers, etc.) and fraud can originate from any part of the network. SNA
makes it possible to analyze behaviors and reveal hidden connections that would not have
been visible in raw text format.</p>
        <p>Secondly, the dynamic nature of fraud makes detection challenging for
traditional rule-based algorithms. Fraudsters constantly adapt to circumvent
existing controls, and a new pattern is not covered by such static algorithms.
People excel at detecting patterns, and their judgment when reviewing anomalous
activities or transactions is very valuable. We therefore believe that combining this human ability with the
computer’s capability to search iteratively and tirelessly for defined instances would
improve the overall detection process.</p>
        <p>Thirdly, SNA can help save time during manual investigation, which is a
necessary step for validating any potential fraud case uncovered by a tool. Traditional
computer-aided audit tools are transaction-oriented [ACL, 1987]: they output rows of
incriminated transactions without a view of the other related transactions performed by the
same entities, and thus make the manual investigation process labor-intensive. With
SNA, the entities involved and their overall activities are readily available in a graph
view for fraud examiners, who in turn can quickly identify false positives and
focus on riskier cases.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>The Occupational Fraud Detection Framework - OFD</title>
      <sec id="sec-3-1">
        <title>Framework design</title>
        <p>the rationale behind how fraudulent cases are uncovered, but also the display of
the different visualizations available.</p>
        <p>The risk assessment engine is made of a data parser, which ensures proper
integration of the raw data collected, and a semantic reasoning system. The reasoner is
meant to infer logical consequences from the rules specified in the ontology
designer. It analyzes the parsed data from both socio-centric and ego-centric
perspectives. On the one hand, ego-centric analysis highlights individual
interactions that violate the specified rules; on the other hand,
socio-centric analysis enables the identification of internal control deficiencies (e.g.
no segregation of duties) and the detection of fraud that does not pertain to a specific
transaction or entity (e.g. conflicts of interest, management fraud, etc.).</p>
        <p>The reporting component, like the tools that exist in the industry today, reports
cases of violation of the specified rules by outputting rows of potentially
fraudulent transactions.</p>
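        <p>The two kinds of analysis performed by the risk assessment engine can be illustrated with the Python sketch below. It is not the semantic reasoner described in this paper: plain Python predicates stand in for rules that would be specified in an ontology, and the interactions, the approval limit and the pair of conflicting interaction types are all hypothetical. The ego-centric check flags individual interactions that break a per-transaction rule, while the socio-centric check looks across the network for a segregation-of-duties gap.</p>
        <preformat>
```python
from collections import defaultdict

# Hypothetical parsed interactions:
# (employee, third_party, interaction_type, amount).
interactions = [
    ("emp1", "supplierA", "purchase_invoice", 900.0),
    ("emp1", "supplierA", "outgoing_payment", 900.0),
    ("emp2", "supplierB", "purchase_invoice", 12000.0),
    ("emp3", "supplierB", "outgoing_payment", 12000.0),
]

# Hypothetical rules standing in for what an ontology would specify.
APPROVAL_LIMIT = 10000.0
CONFLICTING_TYPES = frozenset({"purchase_invoice", "outgoing_payment"})


def ego_centric_flags(rows, limit=APPROVAL_LIMIT):
    """Flag individual interactions whose amount exceeds the limit."""
    return [row for row in rows if row[3] > limit]


def socio_centric_flags(rows, conflicting=CONFLICTING_TYPES):
    """Find (employee, third party) pairs where a single employee performs
    every conflicting interaction type: a segregation-of-duties gap."""
    types_by_pair = defaultdict(set)
    for employee, third_party, kind, _amount in rows:
        types_by_pair[(employee, third_party)].add(kind)
    return sorted(
        pair
        for pair, kinds in types_by_pair.items()
        if conflicting.issubset(kinds)
    )


# Rows a reporting component could output as potentially fraudulent.
transaction_flags = ego_centric_flags(interactions)
# Control deficiencies that no single transaction reveals on its own.
duty_flags = socio_centric_flags(interactions)
```
        </preformat>
        <p>With this data, the two 12000.0 entries are flagged individually, whereas the pair (emp1, supplierA) is flagged only by the network-level check, since neither of its transactions is anomalous in isolation.</p>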
        <p>The visualization component, with its set of actionable sociograms, includes a
multidimensional social network view showing several interaction types in the
same network, which goes along with the socio-centric perspective mentioned
earlier. Drill-down and roll-up capabilities help zoom into the transactions
pertaining to a specific interaction type or a specific actor of the network
(ego-centric analysis). On a reduced set of interactions, the time dimension is also
viewable. This component is critical to the overall detection process because,
through it, fraud examiners can uncover new, unforeseen patterns to be
specified in the ontology editor, thus paving the way for a continuously improving
fraud detection engine.</p>
      </sec>
      <sec id="sec-3-2">
        <title>OFD framework evaluation by early prototyping</title>
        <p>Before moving on to the development of a generic, sound and theoretically
grounded tool supporting the framework introduced earlier, we reviewed the design.
The aim of this evaluation was threefold:</p>
        <list list-type="alpha-lower">
          <list-item><p>Assess the extent to which graphs can fairly and faithfully represent the diversity
of interactions happening between actors of the same or different organizations.</p></list-item>
          <list-item><p>Measure the expressiveness of a social network in terms of red-flagging
fraudulent interactions or transactions.</p></list-item>
          <list-item><p>Gain insight into the complexity perceived by fraud examiners when using such
visualizations to support fraud detection.</p></list-item>
        </list>
        <p>To this end, we ran a case study using extracts of accounting journal entries as
input, for two organizations of different sizes. The case study was conducted
in collaboration with a group of internal auditors, who were surveyed on
various multidimensional social networks generated from the accounting journal
entries (actionable visualization component). The number of auditors involved in the
evaluation cannot be disclosed because of a non-disclosure agreement with the
cooperating organization. The R project and the network analysis package igraph
0.7.1 [Csárdi et al., 2006] were used for scripting the raw data parsing and the business and
design logic. Figure 2 illustrates the overall multidimensional network for one of the
entities studied.</p>
        <p>Each edge in Figure 2 corresponds to a type of interaction between
an employee (orange nodes) and a third party (other nodes: customer or supplier in
this case). Red edges correspond to outgoing payments, orange ones to purchase
invoices, etc. Different drill-downs or subsets of what is shown in Figure 2 were
submitted to the auditors, like the one in Figure 3, which shows supplier-only
interactions for the same entity.</p>
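        <p>The construction of such a multidimensional network and its drill-downs can be sketched as follows. The prototype itself was scripted in R with igraph; the fragment below is a Python illustration only, and the journal entries, field names and the colour assigned to sales invoices are assumptions. Each journal entry becomes one typed, coloured edge between an employee and a third party, and a drill-down is a filter on those edges.</p>
        <preformat>
```python
# Hypothetical journal-entry extract:
# (employee, third_party, third_party_role, interaction_type).
journal_entries = [
    ("emp1", "supplierA", "supplier", "purchase_invoice"),
    ("emp1", "supplierA", "supplier", "outgoing_payment"),
    ("emp2", "customerX", "customer", "sales_invoice"),
    ("emp2", "supplierA", "supplier", "purchase_invoice"),
]

# Red and orange follow the description of the figure;
# the colour for sales invoices is an assumption.
EDGE_COLOURS = {
    "outgoing_payment": "red",
    "purchase_invoice": "orange",
    "sales_invoice": "blue",
}


def build_network(entries):
    """Build one typed, coloured edge per journal entry (a multigraph
    stored as an edge list)."""
    return [
        {
            "employee": employee,
            "third_party": third_party,
            "role": role,
            "type": kind,
            "colour": EDGE_COLOURS[kind],
        }
        for employee, third_party, role, kind in entries
    ]


def drill_down(edges, role=None, interaction_type=None, actor=None):
    """Keep only the edges matching the given role, type, or actor."""
    selected = []
    for edge in edges:
        if role is not None and edge["role"] != role:
            continue
        if interaction_type is not None and edge["type"] != interaction_type:
            continue
        if actor is not None and actor not in (
            edge["employee"],
            edge["third_party"],
        ):
            continue
        selected.append(edge)
    return selected


network = build_network(journal_entries)
# Supplier-only subset, comparable to the drill-down shown to auditors.
supplier_view = drill_down(network, role="supplier")
# Ego-centric subset for a single employee.
emp2_view = drill_down(network, actor="emp2")
```
        </preformat>
        <p>Keeping the interaction type on each edge, rather than merging parallel edges, is what lets the same network serve both the overall multidimensional view and the narrower subsets.</p>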
        <p>The key takeaways from this evaluation exercise are as follows:</p>
        <list list-type="bullet">
          <list-item><p>Not all journal entries involve a third party (customer, supplier, etc.), which could
be perceived as a threat to the validity of our social-network-oriented approach.
Fortunately, such entries (depreciation, amortization, miscellaneous income,
etc.) are usually subject to rules that can be reasoned over by the risk assessment
engine, with atypical entries highlighted solely in the reporting engine.</p></list-item>
          <list-item><p>Manipulating the proposed visualizations is not intuitive for auditors,
even with a help document attached. Training should not be neglected: 20% of
the surveyed auditors perceived the visualization as too complex, embedding
too much information at once, and did not answer the rest of
the questionnaire.</p></list-item>
          <list-item><p>The remaining participants’ high-level observations, or socio-centric conclusions,
were identical (e.g. ineffective segregation of duties), which indicates that graphs
are sufficiently expressive for this purpose.</p></list-item>
          <list-item><p>On the other hand, the ego-centric findings were diverse and varied from one auditor
to another, without being contradictory. The variability in the red flags of interest
might be explained by differences in the auditors’ past experience.
Auditors tend to focus their testing procedures on the types of anomalies they
expect to come across (which is quite aligned with traditional rule-based static
detection algorithms); the visualization can help them go beyond that by
expanding the range of possibilities and suggesting further lines of investigation.</p></list-item>
        </list>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion and future works</title>
      <p>To the best of our knowledge, only a few research papers have tackled
occupational fraud, and even fewer have integrated visual analytics concepts into their approach.
The latter include unsupervised approaches such as graph pattern matching techniques
[Eberle et al, 2011], which focus strongly on identifying structural anomalies but
unfortunately forgo real-world business specificities and rules, leading to a high
rate of false positives and complex maintenance for end users. Other approaches, such as
[Luell, 2010] or [Argyriou et al, 2013], rely on innovative but tailor-made
visualizations that cannot be applied to other business processes. The framework presented
in this paper extends existing data mining techniques used for occupational fraud
detection by offering not only visualizations that auditors can use to uncover new
fraud patterns, but also semantic reasoning capabilities for integrating those new
patterns into the fraud detection engine. The targeted architecture is therefore scalable and
extensible, requiring only that the specified ontologies be maintained. Our assessment of the
serviceability of sociograms on accounting journal entries delivered promising
results, and future work will focus on the design and
evaluation of a full prototype supporting the framework. The generic nature of the
framework presented herein and its network-oriented approach also open perspectives
for investigation beyond the scope of occupational fraud detection. Cybercrime
in an environment where information systems are increasingly interoperable may
also be investigated with a similar approach.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[ACFE, 2014] <article-title>ACFE 2014 Report to the nations on occupational fraud and abuse</article-title>. http://www.acfe.com/uploadedFiles/ACFE_Website/Content/documents/2004RttN.pdf</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[ACL, 1987] www.acl.com</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[Argyriou et al, 2013] <string-name><given-names>Evmorfia N.</given-names> <surname>Argyriou</surname></string-name>, <string-name><given-names>Aikaterini A.</given-names> <surname>Sotiraki</surname></string-name>, <string-name><given-names>Antonios</given-names> <surname>Symvonis</surname></string-name>. <article-title>Occupational Fraud Detection Through Visualization</article-title>. <source>In Proc. of the 11th IEEE Intelligence and Security Informatics (ISI 2013)</source>, pages <fpage>4</fpage>-<lpage>7</lpage>, <year>2013</year>.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[Chung et al., 2006] <string-name><given-names>Kenneth K</given-names> <surname>Chung</surname></string-name>, <string-name><given-names>Liquat</given-names> <surname>Hossain</surname></string-name>, <string-name><given-names>Joseph</given-names> <surname>Davis</surname></string-name>. <article-title>Exploring sociocentric and egocentric approaches for social network analysis</article-title>. <source>KMAP 2005: Second International Conference on Knowledge Management in Asia Pacific</source> (pp. <fpage>1</fpage>-<lpage>8</lpage>). New Zealand: Victoria University of Wellington.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[Cox et al., 1997] <string-name><given-names>Kenneth C.</given-names> <surname>Cox</surname></string-name>, <string-name><given-names>Stephen G.</given-names> <surname>Eick</surname></string-name>, <string-name><given-names>Graham J.</given-names> <surname>Wills</surname></string-name>, <string-name><given-names>Ronald J.</given-names> <surname>Brachman</surname></string-name>. <article-title>Visual data mining: Recognizing telephone calling fraud</article-title>. <source>Data Mining and Knowledge Discovery</source> <year>1997</year>, Volume <volume>1</volume>, Issue <issue>2</issue>, pp <fpage>225</fpage>-<lpage>231</lpage>.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[Eberle et al, 2011] <string-name><given-names>William</given-names> <surname>Eberle</surname></string-name>, <string-name><given-names>Jeffrey</given-names> <surname>Graves</surname></string-name>. <article-title>Insider Threat Detection Using a Graph-Based Approach</article-title>, <source>Journal of Applied Security Research</source>, <volume>6</volume>:<fpage>32</fpage>-<lpage>81</lpage>, <year>2011</year>.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[Freeman, 2011] <string-name><given-names>Linton C.</given-names> <surname>Freeman</surname></string-name>. <article-title>The development of social network analysis - with an emphasis on recent events</article-title>. <source>The SAGE Handbook of Social Network Analysis</source>, <publisher-name>SAGE Publications Ltd</publisher-name>.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[Csárdi et al., 2006] <string-name><given-names>Gábor</given-names> <surname>Csárdi</surname></string-name>, <string-name><given-names>Tamás</given-names> <surname>Nepusz</surname></string-name>. <article-title>The igraph software package for complex network research</article-title>. <source>InterJournal Complex Systems</source>, <volume>1695</volume>, <year>2006</year>.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[INFLOW, 2010] http://www.orgnet.com/cases.html</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[Luell, 2010] <article-title>Employee fraud detection under real world conditions</article-title>, <source>Ph.D. dissertation</source>, <year>2010</year>. [Online]. Available: http://www.zora.uzh.ch/44863/</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[Phua et al., 2004] <string-name><given-names>Clifton</given-names> <surname>Phua</surname></string-name>, <string-name><given-names>Vincent</given-names> <surname>Lee</surname></string-name>, <string-name><given-names>Kate</given-names> <surname>Smith</surname></string-name> &amp; <string-name><given-names>Ross</given-names> <surname>Gayler</surname></string-name>. <article-title>A comprehensive Survey of Data Mining-based Fraud Detection Research</article-title>. Arxiv preprint arXiv:1009.6119.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[PSU, 2007] https://courseware.e-education.psu.edu/courses/bootcamp/lo09/08.html</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[PWC, 2014] PricewaterhouseCoopers. <source>2014 Global economic crime survey. La fraude continue à être une vraie menace pour les entreprises</source>.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[Scott et al., 2011] <string-name><given-names>John</given-names> <surname>Scott</surname></string-name>, <string-name><given-names>Peter J.</given-names> <surname>Carrington</surname></string-name>. <source>The SAGE handbook of social network analysis</source>. <publisher-name>SAGE Publications Ltd</publisher-name>.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[Wheeler et al, 2000] <string-name><given-names>Richard</given-names> <surname>Wheeler</surname></string-name>, <string-name><given-names>Stuart</given-names> <surname>Aitken</surname></string-name>. <article-title>Multiple algorithms for fraud detection</article-title>. <source>Knowledge-Based Systems</source> <volume>13</volume>(<issue>3</issue>):<fpage>93</fpage>-<lpage>99</lpage>.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[Wieringa, 2009] <string-name><given-names>Roel</given-names> <surname>Wieringa</surname></string-name>. <article-title>Design science as nested problem solving</article-title>. <source>Proceedings of the 4th International Conference on Design Science Research in Information Systems and Technology (DESRIST '09)</source>, Philadelphia, Pennsylvania, USA.</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[Yufeng et al., 2004] <string-name><given-names>Yufeng</given-names> <surname>Kou</surname></string-name>, <string-name><given-names>Chang-Tien</given-names> <surname>Lu</surname></string-name>, <string-name><given-names>Sirirat</given-names> <surname>Sirwongwattana</surname></string-name>, <string-name><given-names>Yo-Ping</given-names> <surname>Huang</surname></string-name>. <article-title>Survey of fraud detection techniques</article-title>. <source>Proceedings of the 2004 IEEE International Conference of Networking, Sensing &amp; Control</source>, Taipei, Taiwan, March 21-23, <year>2004</year>.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>