<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Data Loss Prevention and Challenges Faced in their Deployments</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Victor O. Waziri</string-name>
          <email>victor.waziri@futminna.edu.ng</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ismaila Idris</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>John K. Alhassan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bolaji O. Adedayo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Cyber Security Science, Federal University of Technology</institution>
          ,
          <addr-line>Minna</addr-line>
          ,
          <country country="NG">Nigeria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Federal University of Technology</institution>
          ,
          <addr-line>Minna</addr-line>
          ,
          <country country="NG">Nigeria</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <fpage>90</fpage>
      <lpage>96</lpage>
      <abstract>
        <p>The technology world has evolved greatly over the past three decades, to the point where an average user's laptop can hold up to a terabyte of data, a tiny SD card can store an organization's entire database, file transfer has become less complex, and users can easily connect to any wireless network (private or public) within range of their devices to exchange sensitive information. This evolution has led to one of the greatest challenges organizations face: adequately protecting their sensitive information from being lost or leaked. Data Loss Prevention (DLP) techniques were created to prevent such breaches from occurring in an organization. DLP systems have gained popularity over the last decade and are now regarded as a mature technology, and with the alarming rate at which digitally stored assets are growing, the need for DLP systems has also increased. This paper discusses some DLP concepts and trends, as well as some of the challenges that DLP systems face, and proffers a solution for a successful implementation.</p>
      </abstract>
      <kwd-group>
        <kwd>Data loss prevention</kwd>
        <kwd>Data loss</kwd>
        <kwd>Data protection</kwd>
        <kwd>Data security</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>Data loss can be defined as the unauthorized transfer of sensitive or confidential information about an organization from a workstation or from the organization’s data center to the outside world or to an untrusted environment. This can be achieved through various communication channels, by using storage devices, or simply by memorizing the information displayed on a screen [1]. This information could be either regular data (debit card data, bank verification numbers and health care data) or organizational secrets (financial information, intellectual property and trade secrets) [2].</p>
      <p>Over the last decades there have been major data losses with serious impacts on organizations, and these losses have been increasing in recent years. According to a DataLossDB report [3], 2015 surpassed 2012’s all-time record for the number of reported data loss incidents worldwide. Over 736 million records were exposed in the 3,930 incidents reported in 2015. An estimated 50% of those records were lost in the business sector, 20% in the government sector and the remaining 30% in the education and health sectors. Private users are also victims of data loss, although the extent of their losses is hard to determine.</p>
      <p>There have been some notable data loss incidents in recent years that have cost organizations millions of dollars. One forecast estimates that the average cost of a data loss will exceed $150 million by 2020, with a global annual cost of $2.1 trillion [4]. In March 2016, LulzSec Pilipinas uploaded COMELEC’s entire database to Facebook after hacking its website [5], while in October 2015 TalkTalk, a British telecommunications provider, lost the details of over 4 million of its customers, causing its stock to fall drastically [6]. In February 2015, over 80 million records, including social security numbers and other very sensitive information, were lost in a breach at Anthem [7]. Adobe Systems revealed in October 2013 that over 130 million user records had been lost to a hacking group as a result of insider assistance. Incidents of this kind have caused organizations major financial losses, reputational damage, loss of customer confidence, legal prosecution, reduced employee productivity and morale, and loss of business opportunities [8].</p>
      <p>One of the biggest challenges in mitigating data loss is that it has many causes, and no single tool or simple solution adequately addresses all of them. To address the risks faced, a solution must take into account the causes of data loss, which can be classified as people, processes and technology [9].</p>
      <list list-type="bullet">
        <list-item><p>People: Data loss can be caused by people who lack awareness of the security issues relating to the sensitive information they are meant to secure, and who are often not held accountable for protecting it.</p></list-item>
        <list-item><p>Process: Data loss can result from inadequate data usage policies, the absence of a proper data transmission process and a lack of monitoring of data usage.</p></list-item>
        <list-item><p>Technology: A lack of flexibility and of a communication platform in the technology deployed to protect data makes it difficult to use, prompting users to look for alternatives.</p></list-item>
      </list>
      <p>Data loss is one of the major problems faced by organizations and, if not properly managed, can cost an organization millions. This problem can be mitigated by using various Data Loss Prevention (DLP) methods and techniques. DLP can be defined as a system designed to detect and prevent any potential data breach, whether intentional or unintentional [<xref ref-type="bibr" rid="ref5">10</xref>]. Most organizations combine two or more DLPs to effectively control the potential data loss they might face. DLP systems differ from conventional security controls in that they can analyze both the content of confidential data and the context surrounding it, and can protect confidential data in all data states.</p>
      <p>A basic DLP system consists of three stages: discover, monitor and protect [<xref ref-type="bibr" rid="ref6">11</xref>]. These stages are vital in setting up an effective DLP system. The discovery stage locates where confidential data are stored by taking a detailed inventory of the classified data and then grouping the sensitive data by priority. The monitoring stage tracks how the confidential data are used, by understanding the content and context of the sensitive data and by analyzing when a breach occurs. The last stage, protection, prevents data loss either by proactively protecting the confidential data or by enforcing the data loss policies that have been created.</p>
      <p>For a DLP system to be effectively deployed in an organization, the data life cycle of the organization must be considered. A data life cycle is a detailed outline of the phases involved in preserving and managing data so that it can be used and reused. These phases include data at rest (data in storage), data in use (data being accessed) and data in transit (data flowing through the network). Figure 1 summarizes these phases and how DLPs prevent data loss in each of them. In protecting targeted data, a DLP can take many forms during its deployment, mostly depending on the data state [<xref ref-type="bibr" rid="ref7">12</xref>].</p>
      <p>We compared some of the top DLP tools from various security vendors: CA Technologies (A), Code Green Networks (B), Digital Guardian (C), Forcepoint (D), McAfee (E), Palisade Systems (F), RSA (G), Trend Micro (H), Trustwave (I) and Symantec (J). These vendors offer protection for various data states. Though some of these tools are specialized in their design (they do not perform other security tasks), they can be integrated to support other technologies such as identity and access management or encryption. Table I, the Data Loss Prevention Product Matrix, summarizes the features offered by each of these DLP vendors, and Figure 2 shows their performance based on these features.</p>
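      <table-wrap>
        <label>TABLE I.</label>
        <caption><title>DATA LOSS PREVENTION PRODUCT MATRIX</title></caption>
        <table>
          <thead>
            <tr><th>Vendor</th><th>A</th><th>B</th><th>C</th><th>D</th><th>E</th><th>F</th><th>G</th><th>H</th><th>I</th><th>J</th></tr>
          </thead>
          <tbody>
            <tr><td>Mobile/tablet</td><td>X</td><td>X</td><td>✓</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>✓</td><td>X</td><td>✓</td></tr>
            <tr><td>Laptop/Desktop/Workstation</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td></tr>
            <tr><td>Local network</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td></tr>
            <tr><td>Server</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>✓</td><td>✓</td><td>X</td><td>✓</td><td>X</td><td>X</td></tr>
            <tr><td>Cloud/SaaS</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>✓</td><td>✓</td><td>X</td><td>✓</td></tr>
            <tr><th colspan="11">Detection technologies</th></tr>
            <tr><td>biometric signatures</td><td>X</td><td>X</td><td>X</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td></tr>
            <tr><td>classification</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td></tr>
            <tr><td>context analysis</td><td>✓</td><td>X</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>X</td></tr>
            <tr><td>data matching</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td><td>✓</td><td>✓</td></tr>
            <tr><td>flagging</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td></tr>
            <tr><td>dictionaries/lexicons</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td></tr>
            <tr><td>data discovery</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td></tr>
            <tr><td>file type detection/classification</td><td>X</td><td>X</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>X</td><td>✓</td></tr>
            <tr><td>machine learning/pattern recognition</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>X</td><td>X</td><td>✓</td><td>✓</td><td>X</td><td>✓</td></tr>
            <tr><td>Optical Character Recognition</td><td>X</td><td>X</td><td>X</td><td>✓</td><td>✓</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td></tr>
            <tr><td>regular expressions/pattern matching</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>✓</td><td>X</td><td>✓</td><td>X</td><td>✓</td></tr>
            <tr><th colspan="11">Enforcement technologies</th></tr>
            <tr><td>block</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td></tr>
            <tr><td>encrypt</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>✓</td><td>✓</td></tr>
            <tr><td>fingerprinting</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>X</td><td>X</td><td>✓</td></tr>
            <tr><td>move/remove</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td></tr>
            <tr><td>notify/alert</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>X</td><td>✓</td><td>✓</td><td>X</td></tr>
            <tr><td>quarantine</td><td>✓</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>X</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td></tr>
            <tr><th colspan="11">Software integration</th></tr>
            <tr><td>Databases (e.g. SQL Server)</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>✓</td></tr>
            <tr><td>Email client</td><td>X</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td></tr>
            <tr><td>file sharing</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>✓</td><td>✓</td></tr>
            <tr><td>instant messaging</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>✓</td><td>X</td><td>✓</td><td>✓</td><td>✓</td></tr>
            <tr><td>web 2.0</td><td>X</td><td>✓</td><td>X</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>✓</td></tr>
            <tr><td>webmail</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>X</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>X</td></tr>
            <tr><th colspan="11">Hardware integration</th></tr>
            <tr><td>CD/DVD</td><td>X</td><td>X</td><td>X</td><td>X</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>✓</td></tr>
            <tr><td>Printer</td><td>X</td><td>✓</td><td>✓</td><td>X</td><td>✓</td><td>X</td><td>X</td><td>✓</td><td>X</td><td>X</td></tr>
            <tr><td>USB drives</td><td>X</td><td>X</td><td>X</td><td>X</td><td>✓</td><td>X</td><td>✓</td><td>✓</td><td>X</td><td>X</td></tr>
            <tr><td>external/removable HD</td><td>X</td><td>✓</td><td>X</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>✓</td><td>X</td><td>✓</td></tr>
            <tr><td>wireless devices</td><td colspan="10"/></tr>
            <tr><th colspan="11">Monitoring</th></tr>
            <tr><td>centralized</td><td colspan="10"/></tr>
            <tr><td>offline</td><td colspan="10"/></tr>
            <tr><td>real-time</td><td colspan="10"/></tr>
          </tbody>
        </table>
      </table-wrap>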
      <p>This does not necessarily imply that Code Green Networks has the best DLP system; it simply means that it compensates in the areas where it is lacking. When choosing a DLP vendor for your organization, it is best to check the features that best suit the implementation, such as ease of installation, scalability, control features and maintenance.</p>
      <p>III. CHALLENGES ON DLP SYSTEMS DEPLOYMENT</p>
      <p>In protecting sensitive data from loss, DLP systems face many challenges and, as with other security mechanisms, these challenges can render the system ineffective. A review of both industrial and academic DLP systems identified seven common challenges [<xref ref-type="bibr" rid="ref8">13</xref>]: Leaking Channels, The Human Factor, Access Rights, Encryption and Steganography, Data Modification, Scalability and Integration, and Data Classification. For an effective DLP system to be implemented, these challenges must be addressed. In the following sections we discuss each challenge and suggest possible solutions.</p>
    </sec>
    <sec id="sec-2">
      <title>A. Leaking Channels</title>
      <p>Every day there is a need to share and access data between different media and users, and this is done with the assistance of intermediate channels. Ideally these channels are used to exchange data legitimately from one end to another; however, they can also pose a major threat of sensitive data leakage. The channels cannot be blocked entirely, because data sharing requires some or even all of them to be open. As technology grows at a fast pace and more channels become available, it has become hard to keep up with securing them [<xref ref-type="bibr" rid="ref9">14</xref>]. Figure 3 shows some of the channels commonly used for data exchange. Not all of these channels are easy to secure; some require a great number of techniques and considerable monitoring to secure adequately.</p>
      <p>Sensitive data that are either ‘at rest’ or ‘in use’ can be compromised through these channels, which include USB ports, CD/DVD drives, printed documents and even web services. Although leakage through CD/DVD drives and USB ports can be mitigated using host DLPs, this is not enough to prevent leakage through other channels, such as Instant Messaging (IM) and email, that are always available [<xref ref-type="bibr" rid="ref10">15</xref>]. Even when access rights to confidential data are restricted, some of the data can still be accessed in a printable format. Channels associated with data ‘in transit’, such as file sharing and web services, have been among the hardest to protect, as they cannot be blocked and they serve as the backbone of the organization’s data exchange. To maintain maximum security on these channels, intensive traffic filtering is required. The DLP system deployed should always try to balance security against the interconnectivity these channels provide.</p>
    </sec>
    <sec id="sec-3">
      <title>B. The Human Factor</title>
      <p>Humans are complex beings whose behaviors and motives are hard to predict or determine, as they are influenced by many factors, which may be psychological or sociological. Decisions such as granting access to a set of users, defining the confidentiality level of data and setting a threshold level for a DLP system all depend on human actions. Even when an organization’s security policies are in place to mitigate data loss, they are not guaranteed to solve the problem. Almost all human interaction with data occurs in the data ‘in use’ state, which means that a user needs an endpoint terminal to access confidential data before it can be leaked [<xref ref-type="bibr" rid="ref11">16</xref>]. A typical DLP system restricts data leakage by users by disabling parts of the system, such as CD/DVD drives, USB ports and removable drives. These restrictions can easily be bypassed when users share their access rights, either intentionally (out of trust) or unintentionally (through social engineering), thereby compromising the confidentiality of the data accessed. Users can also use mobile devices to photograph sensitive information, or hidden cameras to record entire classified documents and transmit them remotely. The human factor will remain a major challenge when deploying a DLP system for as long as humans interact with the system [<xref ref-type="bibr" rid="ref12">17</xref>].</p>
    </sec>
    <sec id="sec-4">
      <title>C. Access Rights</title>
      <p>Access rights have always been a key feature in the deployment of any security mechanism, including DLP systems. It is therefore important to categorize access rights properly and to separate each category of users according to their level of permission. A DLP system cannot prevent illegitimate users from accessing confidential information if access rights are not properly categorized. Access rights are of great importance in preventing data loss in an organization and should be updated regularly [<xref ref-type="bibr" rid="ref13">18</xref>], because an obsolete access right leaves the entire system vulnerable to data loss. For instance, when a user is demoted or dismissed from the organization and the access rights are not updated accordingly, the DLP system will not detect a breach when that user accesses information he or she is no longer permitted to access.</p>
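      <p>The demotion and dismissal example can be sketched as a periodic review of access grants. The Python below is only a minimal sketch under stated assumptions: the Grant record, the stale_grants helper, the user names and the role mapping are all hypothetical, and a real deployment would read them from the organization’s directory service rather than from literals.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """Hypothetical record of one access right."""
    user: str
    resource: str
    required_role: str  # role that justified the grant when it was issued


def stale_grants(grants, current_roles):
    """Return grants whose holder has left or no longer holds the required role.

    current_roles maps active user names to their present role; users who
    were dismissed are simply absent from the mapping.
    """
    return [g for g in grants if current_roles.get(g.user) != g.required_role]


grants = [
    Grant("amina", "payroll.xlsx", "finance"),
    Grant("bello", "payroll.xlsx", "finance"),
    Grant("chidi", "payroll.xlsx", "finance"),
]
# Assumed current state: bello moved to another department, chidi was dismissed.
current_roles = {"amina": "finance", "bello": "support"}

to_revoke = stale_grants(grants, current_roles)
stale_users = [g.user for g in to_revoke]
print(stale_users)
```

      <p>Running such a review whenever staff records change is one way to keep the access rights that a DLP system relies on from becoming obsolete.</p>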
      <p>A data leak can also be caused by a legitimate user with the correct access rights, either intentionally (for reasons such as financial gain or whistleblowing) or accidentally. To be effective, a DLP system should be able to maintain and control the access rights of the organization while also protecting data from intentional and unintentional leakage.</p>
    </sec>
    <sec id="sec-5">
      <title>D. Encryption and Steganography</title>
      <p>Encryption is another major challenge faced by network-based DLP systems, as these systems use various analytical techniques to identify copies of sensitive data by comparing traffic with the original data classified as confidential. When a user applies strong encryption to confidential data, the DLP system can no longer analyze the content, which creates a major vulnerability. The implication is that a confidential document can bypass the DLP detection mechanism once the user encrypts it, allowing the user to send it as an email attachment [<xref ref-type="bibr" rid="ref14">19</xref>]. Steganography is a challenge similar to encryption, but it is harder to mitigate and can be practically impossible to detect. The user employs steganography tools to hide classified documents within other media, such as digital photos, audio files and video files, and the DLP system is unable to detect the confidential data inside those media [<xref ref-type="bibr" rid="ref15">20</xref>]. In some instances a document can be compressed or converted to a different format, which likewise prevents the system from analyzing and detecting it.</p>
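      <p>The effect of compression on content inspection can be illustrated with a small sketch. It assumes a hypothetical network DLP rule that scans outgoing payloads for a card-like number using a regular expression; the pattern, the sample record and the helper name are illustrative only and are not taken from any product.</p>

```python
import re
import zlib

# Hypothetical rule: 16 digits in groups of four, a card-like pattern.
CARD_RULE = re.compile(rb"\b(?:\d{4}[ -]?){3}\d{4}\b")


def rule_matches(payload: bytes) -> bool:
    """Return True when the illustrative DLP rule finds a match in payload."""
    return CARD_RULE.search(payload) is not None


# Made-up record standing in for confidential content.
record = b"Card on file: 4111 1111 1111 1111. " * 4
compressed = zlib.compress(record)

plain_hit = rule_matches(record)                        # rule sees the digits
compressed_hit = rule_matches(compressed)               # digits are no longer visible
restored_hit = rule_matches(zlib.decompress(compressed))  # recipient recovers them

print(plain_hit, compressed_hit, restored_hit)
```

      <p>The same reasoning applies, more strongly, to encrypted payloads: unless the DLP system can decompress or decrypt the traffic before inspection, a rule of this kind has nothing to match against.</p>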
    </sec>
    <sec id="sec-6">
      <title>E. Data Modification</title>
      <p>Some DLP systems are designed to compare the original sensitive data with the inspected traffic flowing through the system, using data signatures and patterns to prevent data leakage. In such a system, detection occurs whenever the signatures and patterns in the traffic match those of the confidential data or show a high percentage of similarity to it. The major challenge with this design is that confidential data are rarely sent in their original form, and the system may not recognize a modified copy. Data can be modified using various techniques that are readily available online. A confidential document can easily be altered by removing or adding some lines, creating an entirely different document before it is sent over the allowed channels. The user can also change the structure or format of the document entirely, rendering it undetectable by the DLP system [<xref ref-type="bibr" rid="ref16">21</xref>].</p>
      <p>In other designs, DLP systems use data hashing to analyze outgoing traffic, comparing its hash values (computed with algorithms such as SHA-1 and MD5) with those of the original confidential data. A data leak is detected the moment the two values match. The problem with the hashing design is that it becomes ineffective once the confidential document is modified, since any change yields a different hash value [<xref ref-type="bibr" rid="ref17">22</xref>].</p>
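      <p>A minimal sketch of this weakness, using Python’s standard hashlib module, is given here. The document text and the digests helper are hypothetical; the point is only that an exact copy reproduces the registered MD5 and SHA-1 values, while a one-character change does not.</p>

```python
import hashlib


def digests(data: bytes) -> dict:
    """Return the MD5 and SHA-1 hex digests of data."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
    }


# Hypothetical confidential document registered with the DLP system.
original = b"Q3 acquisition target: Example Corp, offer 120M USD"
registered = digests(original)

# An exact copy in outgoing traffic produces identical digests ...
exact_copy_detected = digests(bytes(original)) == registered

# ... but appending a single character changes both values.
modified = original + b"."
modified_detected = digests(modified) == registered

print(exact_copy_detected, modified_detected)
```

      <p>This is why whole-file hashing is usually suited only to detecting unaltered copies, and why similarity-based techniques are needed for modified documents.</p>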
    </sec>
    <sec id="sec-7">
      <title>F. Scalability and Integration</title>
      <p>The volume of data processed can affect the performance of any security mechanism deployed to secure an organization’s assets. DLP systems are no exception: whether deployed on a host, on the network or in storage, a DLP system should perform its function effectively and be incorporated smoothly, without delaying the organization’s workflow. Factors affecting the scalability of a DLP system, such as its computational capability and analysis techniques, should therefore be considered when deploying it [<xref ref-type="bibr" rid="ref18">23</xref>].</p>
      <p>There are usually challenges in integrating a DLP system during deployment, as similar functions are already handled by other security mechanisms such as firewalls and intrusion detection systems. Before deployment, the entire system must therefore be carefully analyzed so that functions are not duplicated: running two similar functions on the same system can delay its processing and reduce its performance.</p>
    </sec>
    <sec id="sec-8">
      <title>G. Data Classification</title>
      <p>Data classification is the process of organizing data into categories or levels for effective and efficient use [<xref ref-type="bibr" rid="ref19">24</xref>]. DLP systems rely entirely on well-defined data classification to differentiate confidential data from normal data. The main purpose of classifying data is to determine the baseline of security controls to be used in safeguarding it. Data can be classified in different ways depending on the organization: the military uses terms such as confidential, secret and top secret, while an institution may classify data as restricted, private and public [<xref ref-type="bibr" rid="ref20">25</xref>]. With such a classification it becomes easy to identify confidential data, so that the system is adequately equipped to protect it. The problem with data classification, however, lies in determining the level of secrecy of sensitive data. For secrecy levels to be assigned properly, the owner of the data should be responsible for the classification process. Most of the time, however, classification is left to people who do not have enough knowledge about all the data. This creates a vulnerability in the DLP system, as people who are not permitted to see certain information may gain access to it. A good classification is therefore essential, as without it the DLP system becomes ineffective.</p>
      <p>There are many security systems and security vendors in the market. These security systems can be grouped as network security systems, antivirus systems, monitoring systems, scanning systems, data controlling systems and transaction systems. The systems are distinct and separated by their functionality. For example, an antivirus system cannot encrypt data, but it works well in monitoring the source code of data. For this reason, corporate organizations require many types of security systems to protect their data. Because each of these systems has a specific function, the DLP system has an edge over the rest, as it can perform various functions. This reduces the cost of purchasing many security systems to monitor the various security gaps. Table 2 summarizes the features of a DLP system compared with similar protection systems (Intrusion Prevention System (IPS) and firewall system).</p>
      <p>Table 2. Security gaps compared for firewall, IPS and DLP systems: network partitioning of network security; load balancer integration; accelerated program delivery; TCP connection pooling; SSL offloading; built-in authentication engine; validation of encrypted sessions; single sign-on for multiple applications; injection attack protection (XSS, SQL); normalization of encoded traffic; inspection of HTTPS traffic; session tampering/hijacking/riding protection; forceful browsing prevention; data theft protection and cloaking; brute-force protection; Trojan/worm/virus/malware upload protection; rate control protection; request and response rewrite; application access logging and user audit trails.</p>
      <p>V. THE WAYS DLP SYSTEMS ANALYZE DATA</p>
      <p>Although DLP systems analyze data in different ways, these analyses can be grouped into two major categories: context analysis and content analysis. Context analysis focuses on the surroundings of the data, while content analysis focuses on the actual data [<xref ref-type="bibr" rid="ref21">26</xref>].</p>
      <list list-type="bullet">
        <list-item><p>Context analysis: This method analyzes the metadata associated with the confidential data. It examines information about the data and keeps track of it using attributes such as the size of the document, the source, the destination, and when the document was created or modified. From these metadata attributes, patterns and signatures can be derived and used to define policies for detecting data loss [<xref ref-type="bibr" rid="ref22">27</xref>].</p></list-item>
        <list-item><p>Content analysis: This method focuses on the content of the confidential data, which could be text or any multimedia material. It compares the transmitted data with the original confidential data and detects a breach if there is a high percentage of similarity [<xref ref-type="bibr" rid="ref23">28</xref>]. This can be done using three basic techniques: data fingerprinting (which identifies exact or partial matches), regular expressions (which identify patterns based on words or text) and statistical analysis (which uses prerecorded information) [<xref ref-type="bibr" rid="ref24">29</xref>].</p></list-item>
      </list>
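      <p>Partial-match fingerprinting, one of the content analysis techniques listed here, can be sketched as follows. This is an illustrative assumption rather than the method of any particular DLP product: the text is split into overlapping runs of four words (shingles), each shingle is hashed with SHA-1, and the similarity score is the fraction of the confidential document’s fingerprints that reappear in the outgoing text. The function names, the shingle length and the sample sentences are all hypothetical.</p>

```python
import hashlib
import re


def fingerprints(text: str, k: int = 4) -> set:
    """Hash every run of k consecutive words (a shingle) in text."""
    words = re.findall(r"\w+", text.lower())
    count = max(len(words) - k + 1, 1)
    return {
        hashlib.sha1(" ".join(words[i:i + k]).encode()).hexdigest()
        for i in range(count)
    }


def similarity(confidential: str, outgoing: str) -> float:
    """Fraction of the confidential document's shingles found in outgoing."""
    registered = fingerprints(confidential)
    shared = registered.intersection(fingerprints(outgoing))
    return len(shared) / len(registered)


confidential = "The merger with Example Corp will be announced on the first of March"
full_leak = "FYI: the merger with Example Corp will be announced on the first of March, keep quiet"
partial_leak = "The merger with Example Corp will be delayed"
unrelated = "Lunch menu for next week is attached"

scores = {
    "full": similarity(confidential, full_leak),
    "partial": similarity(confidential, partial_leak),
    "unrelated": similarity(confidential, unrelated),
}
print(scores)
```

      <p>A policy would then compare the score against a threshold chosen by the organization; a lower threshold catches more partial copies at the cost of more false positives.</p>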
      <p>DLP systems can be either preventive or detective, depending on the methods used by the organization. The preventive methods include Policy and Access Rights, Virtualization and Isolation, Cryptographic Approaches, and Quantifying and Limiting, while the detective methods include Data Identification, Social and Behavior Analysis, Data Mining and Text Clustering, and Quantifying and Limiting.</p>
    </sec>
    <sec id="sec-9">
      <title>A. Policy and Access Rights</title>
      <p>This method suits most organizations, provided that their data are properly classified and a well-defined access rights system is in place. It is easy to manage because the procedures are clearly stated, and it is well suited to data ‘at rest’ and data ‘in use’. The method is constrained mainly by improper data classification and ineffective access controls. As a preventive method, it cannot detect when a breach has occurred [<xref ref-type="bibr" rid="ref13">18</xref>].</p>
    </sec>
    <sec id="sec-10">
      <title>B. Virtualization and Isolation</title>
      <p>This method is based on isolating the user’s activities virtually and allowing only trusted functions or data to pass through the system. It usually requires dedicated hardware, and it reduces the amount of administration needed because it makes use of the data classification already existing on the system. However, it is not cost-effective and does not detect data leakage when it occurs [<xref ref-type="bibr" rid="ref25">30</xref>].</p>
    </sec>
    <sec id="sec-11">
      <title>C. Cryptographic Approaches</title>
      <p>This approach involves encrypting confidential information with strong encryption tools to provide a maximum level of security. It is used in almost all DLP systems, as there are various options for encrypting files, and it is effective for data ‘at rest’. The major challenge is that encryption does not hide the existence of confidential documents, even though their contents are encrypted. It is not a detective method, which leaves it vulnerable when a data leakage occurs [<xref ref-type="bibr" rid="ref26">31</xref>].</p>
    </sec>
    <sec id="sec-12">
      <title>D. Quantifying and Limiting</title>
      <p>This method has the added advantage that it also monitors the channels through which data travel and blocks sensitive data from passing through them. It can be implemented effectively for data ‘in transit’, ‘in use’ and ‘at rest’, which makes it easy to deploy against a specific attack on the organization’s systems. As with the other preventive methods, it does not readily detect data leakages and, if not properly deployed, it can disrupt the workflow of the entire system. It is also limited to specific leakage scenarios, leaving it vulnerable to other forms of data leakage [<xref ref-type="bibr" rid="ref27">32</xref>].</p>
    </sec>
    <sec id="sec-13">
      <title>E. Social and Behavior Analysis</title>
      <p>This method involves analyzing and measuring the level of interaction between people, in this case the users in the organization, and creating adequate guidelines for the protection of sensitive data. When adequately implemented, it prevents leakages by detecting relationships with malicious intent, and it is effective in all data states. However, human behavior is difficult to predict, which leads to a high percentage of false positives and requires the administrator to interact regularly with the DLP system. The method also requires a huge amount of time to profile the various users and index each of their behavioral patterns [<xref ref-type="bibr" rid="ref28">33</xref>].</p>
    </sec>
    <sec id="sec-14">
      <title>F. Data Identification</title>
      <p>
        This method uses a mechanism that compares data
traffic flowing through the system with the original
confidential documents and tries to prevent such data from
being leaked when there is a match. It produces a very low
false-positive rate when fingerprinting is used in the
analysis. However, it can easily be bypassed by heavily
modifying the data, which makes the leak impossible to
detect [
        <xref ref-type="bibr" rid="ref29">34</xref>
        ].
      </p>
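      <p>
        The fingerprint comparison described above can be sketched as follows.
This is one simple way to fingerprint text, assuming word-level shingles
hashed with SHA-256; the function names and the shingle length are
hypothetical, and commercial DLP products may use different schemes.
      </p>
      <preformat><![CDATA[
```python
import hashlib
import re


def fingerprints(text: str, k: int = 5) -> set:
    """Hash every run of k consecutive words (a 'shingle') in the text."""
    words = re.findall(r"\w+", text.lower())
    shingles = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    return {hashlib.sha256(s.encode("utf-8")).hexdigest() for s in shingles}


def overlap(protected: set, outbound_text: str, k: int = 5) -> float:
    """Fraction of the outbound text's shingles that match protected ones."""
    outbound = fingerprints(outbound_text, k)
    if not outbound:
        # Text shorter than k words yields no shingles to compare.
        return 0.0
    return len(outbound.intersection(protected)) / len(outbound)
```
]]></preformat>
      <p>
        A message that copies a long passage verbatim shares most of its
shingles with the protected set, whereas a heavily reworded version shares
few or none, which is the bypass described above.
      </p>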
    </sec>
    <sec id="sec-15">
      <title>G. Data Mining and Text Clustering</title>
      <p>
        This method involves predicting when a data leakage
will occur by learning about data processes and data leakage
patterns over time. It is effective in detecting unstructured
documents and is less dependent on administrative
intervention, which makes it easy to integrate. The method
suffers from a very high false-positive rate, as it requires a
learning phase to work, and it also requires a large amount
of processing power [
        <xref ref-type="bibr" rid="ref30">35</xref>
        ].
      </p>
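      <p>
        The learning phase mentioned above can be illustrated with a small
sketch. It uses a supervised multinomial naive Bayes classifier with add-one
smoothing rather than a clustering algorithm, purely because it is short
enough to show in full; the class name, helper function and training
sentences are invented for illustration and do not come from the cited work.
      </p>
      <preformat><![CDATA[
```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of words seen
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, text, label):
        words = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def build_demo_model():
    """Train on four invented sentences (hypothetical data)."""
    model = NaiveBayes()
    model.train("quarterly salary figures and payroll account numbers", "sensitive")
    model.train("customer account numbers and contract pricing", "sensitive")
    model.train("team lunch on friday at noon", "public")
    model.train("office closed for the public holiday", "public")
    return model
```
]]></preformat>
      <p>
        Such a model can only be as good as its training data: with few or
unrepresentative examples it will mislabel documents, which is consistent
with the false-positive problem noted above.
      </p>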
      <p>VI. SOLUTION FOR A SUCCESSFUL DLP IMPLEMENTATION</p>
      <p>For any DLP system to be implemented properly, we have
identified ten key steps which, if followed, would help an
organization to adequately implement a DLP system for the
protection of its confidential data. These steps are as follows:</p>
      <list list-type="simple">
        <list-item><p>Step 1: Implement a universal technique and value proposition for DLP centered on a risk assessment</p></list-item>
        <list-item><p>Step 2: Involve the right people with the right organization model</p></list-item>
        <list-item><p>Step 3: Identify sensitive data and understand how they are handled</p></list-item>
        <list-item><p>Step 4: Provide a phased implementation based on progress</p></list-item>
        <list-item><p>Step 5: Minimize the impact on system performance and business operations</p></list-item>
        <list-item><p>Step 6: Create meaningful DLP policies and policy management processes</p></list-item>
        <list-item><p>Step 7: Implement effective event review and investigation mechanisms</p></list-item>
        <list-item><p>Step 8: Provide analysis and meaningful reporting</p></list-item>
        <list-item><p>Step 9: Implement security and compliance measures</p></list-item>
        <list-item><p>Step 10: Implement an organizational data flow and oversight mechanism</p></list-item>
      </list>
      <p>VII. CONCLUSION</p>
      <p>Many organizations have given a great deal of
attention to protecting their sensitive data from being lost
accidentally or intentionally. DLP systems cannot function
effectively in isolation; for a DLP system to function
effectively, it must be linked with other information security
processes. However, before implementing any DLP system, an
organization needs to understand adequately what
confidential data it wants to hold, where those data are to be
stored (since the storage location is vital to their
protection), and the destinations and channels through which
the information will pass.</p>
      <p>There are several challenges associated with DLP
systems. Before such systems are deployed, it is important to
understand and analyze these challenges in depth. It is also
important to make the system easy to use and manage so as to
avoid complexity, since the more complex a DLP system is, the
more likely it is to be compromised by the user.</p>
      <p>As new technologies are developed and the ways in which
they communicate change, it is of great importance that
organizations keep pace with these advancements by
identifying new and better ways of protecting data from being
lost to unauthorized users.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>N.</given-names>
            <surname>Kumaresan</surname>
          </string-name>
          ,
          <article-title>"Key consideration in protecting sensitive data leakage using Data Loss Prevention Tools,"</article-title>
          <source>ISACA Journal</source>
          , vol.
          <volume>1</volume>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>E.</given-names>
            <surname>Bergstrom</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Ahlfedt</surname>
          </string-name>
          ,
          <article-title>"Information Classification Issues,"</article-title>
          <source>Springer International Publishing</source>
          , pp.
          <fpage>27</fpage>
          -
          <lpage>41</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <collab>DataLossDB</collab>
          . (
          <year>2016</year>
          ).
          <article-title>2015 Reported data breaches surpasses all previous years</article-title>
          . Available: http://blog.datalossdb.org
        </mixed-citation>
        <mixed-citation>
          [4]
          <collab>IBM and Ponemon Institute LLC</collab>
          ,
          <article-title>"2015 Cost of Data Breach Study: Global Analysis,"</article-title>
          <source>Ponemon Institute LLC Research Department</source>
          , 2308 US 31 North, Traverse City, Michigan 49686 USA,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <collab>Trend Micro</collab>
          . (
          <year>2016</year>
          ).
          <article-title>Data Protection Mishap Leaves 55M Philippine Voters at Risk</article-title>
          . Available: http://blog.trendmicro.com/treandlabs-security-intelligence/55mregistered-voters-risk-philippine-commission-elections-hacked
        </mixed-citation>
        <mixed-citation>
          <collab>BBC NEWS</collab>
          . (
          <year>2015</year>
          ).
          <article-title>TalkTalk hack 'affected 157,000 customers'</article-title>
          . Available: http://www.bbc.com/news/business-34743185
        </mixed-citation>
        <mixed-citation>
          <string-name>
            <given-names>C.</given-names>
            <surname>Osborne</surname>
          </string-name>
          . (
          <year>2015</year>
          ).
          <article-title>Health insurer Anthem hit by hackers, up to 80 million records exposed</article-title>
          . Available: http://www.zdnet.com/article/health-insurer-anthem-hit-by-hackersup-to-80-million-records-exposed
        </mixed-citation>
        <mixed-citation>
          <string-name>
            <given-names>T.</given-names>
            <surname>Seals</surname>
          </string-name>
          . (
          <year>2016</year>
          ).
          <article-title>Data Breach Trends to Evolve in 2016</article-title>
          . Available: http://infosecurity-magazine.com/news/data]breache-trends-toevolve-in
        </mixed-citation>
        <mixed-citation>
          <collab>EYGM Limited</collab>
          . (
          <year>2011</year>
          ).
          <article-title>Data Loss Prevention: Keeping your sensitive data out of the public domain</article-title>
          . Available: http://www.ey.com
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Tahboub</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Saleh</surname>
          </string-name>
          ,
          <article-title>"Data Leakage/Loss Prevention Systems (DLP),"</article-title>
          <source>ResearchGate</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Jonathan</given-names>
            <surname>Jesse</surname>
          </string-name>
          and
          <string-name>
            <given-names>ITS</given-names>
            <surname>Partners</surname>
          </string-name>
          . (
          <year>2015</year>
          ).
          <article-title>Symantec DLP Overview</article-title>
          . Available: http://www.symantec.com/en/uk/business/theme.jsp?th
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [12]
          <collab>Price Waterhouse Coopers</collab>
          ,
          <article-title>"Data Loss Prevention: Keeping sensitive data out of the wrong hands*,"</article-title>
          pp.
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>N.</given-names>
            <surname>Lord</surname>
          </string-name>
          ,
          <article-title>"Experts on the Data Loss Prevention (DLP) Market in 2016 &amp; Beyond,"</article-title>
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>V.</given-names>
            <surname>Shaj</surname>
          </string-name>
          and
          <string-name>
            <given-names>K. P.</given-names>
            <surname>Kaliyamurthie</surname>
          </string-name>
          ,
          <article-title>"A review of Data Leakage Detection,"</article-title>
          <source>IJCSMC Journal</source>
          , vol.
          <volume>2</volume>
          , pp.
          <fpage>577</fpage>
          -
          <lpage>581</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>T. T. T.</given-names>
            <surname>Huong</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Corner</surname>
          </string-name>
          ,
          <article-title>"The impact of communication channels on mobile banking adoption,"</article-title>
          <source>International Journal of Banking Marketing</source>
          , vol.
          <volume>34</volume>
          , pp.
          <fpage>78</fpage>
          -
          <lpage>109</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>I.</given-names>
            <surname>Ponemon</surname>
          </string-name>
          ,
          <article-title>"The Human Factor in Data Protection,"</article-title>
          <source>Trend Micro</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>27</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>T.</given-names>
            <surname>Pepper</surname>
          </string-name>
          . (
          <year>2016</year>
          ).
          <article-title>The people problem: How to manage the human factor to shore up security</article-title>
          . Available:
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>D.</given-names>
            <surname>Gibson</surname>
          </string-name>
          ,
          <article-title>"What's missing from Data Loss Prevention,"</article-title>
          <source>Data Center Journal</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Raj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cherian</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Abraham</surname>
          </string-name>
          ,
          <article-title>"A Survey on Data Loss Prevention Techniques,"</article-title>
          <source>International Journal of Science and Research</source>
          , vol.
          <volume>2</volume>
          , pp.
          <fpage>240</fpage>
          -
          <lpage>241</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>N. B.</given-names>
            <surname>Pamula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Naga</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Deepthi</surname>
          </string-name>
          ,
          <article-title>"Preventing Data Leakage in Distributive Strategies by Steganography Technique,"</article-title>
          <source>International Journal of Computer Science and Information Technologies</source>
          , vol.
          <volume>4</volume>
          , pp.
          <fpage>220</fpage>
          -
          <lpage>223</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>S. W.</given-names>
            <surname>Ahmad</surname>
          </string-name>
          and
          <string-name>
            <given-names>G. R.</given-names>
            <surname>Bamnote</surname>
          </string-name>
          ,
          <article-title>"Data Leakage Detection and Data Prevention using Algorithm,"</article-title>
          <source>International Journal of Computer Science and Application</source>
          , vol.
          <volume>6</volume>
          , pp.
          <fpage>394</fpage>
          -
          <lpage>399</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Manadhata</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Johnson</surname>
          </string-name>
          ,
          <article-title>"Text Classification for Data Loss Prevention,"</article-title>
          in
          <source>Privacy Enhancing Technologies</source>
          . Waterloo, ON, Canada: Springer Berlin Heidelberg,
          <year>2011</year>
          , pp.
          <fpage>18</fpage>
          -
          <lpage>37</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J.</given-names>
            <surname>Thorkelson</surname>
          </string-name>
          . (
          <year>2010</year>
          ).
          <article-title>Data Loss Prevention: Simplified</article-title>
          . Available: http://www.codegreennetworks.com
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rouse</surname>
          </string-name>
          . (
          <year>2015</year>
          ).
          <article-title>Data Classification</article-title>
          . Available: http://searchdatamanagement.techtarget.com/data-classification
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>R.</given-names>
            <surname>Bragg</surname>
          </string-name>
          ,
          <article-title>"Data Classification,"</article-title>
          in
          <source>CISSP Training Guide</source>
          , 1st ed. 800 East 96th Street, Indianapolis, Indiana: Pearson IT Certification,
          <year>2002</year>
          , pp.
          <fpage>48</fpage>
          -
          <lpage>51</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bryman</surname>
          </string-name>
          ,
          <source>Social Research Methods</source>
          , 2nd ed. Great Clarendon Street, Oxford, United Kingdom: Oxford University Press,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Kale</surname>
          </string-name>
          and
          <string-name>
            <given-names>S. V.</given-names>
            <surname>Kulkari</surname>
          </string-name>
          ,
          <article-title>"Data Leakage Detection,"</article-title>
          <source>International Journal of Advanced Research in Computer and Communication Engineering</source>
          , vol.
          <volume>1</volume>
          , pp.
          <fpage>668</fpage>
          -
          <lpage>678</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>K. A.</given-names>
            <surname>Neuendorf</surname>
          </string-name>
          ,
          <article-title>The Content Analysis Guidebook</article-title>
          . Thousand Oaks, Ca.: Sage Publication Inc.,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>K.</given-names>
            <surname>Krippendorf</surname>
          </string-name>
          ,
          <source>Content Analysis: An introduction to methodology</source>
          . Thousand Oaks, Ca.: Sage Publication Inc.,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>J. N.</given-names>
            <surname>Mathews</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hapuarachchi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Deshane</surname>
          </string-name>
          ,
          <article-title>"Quantifying the Performance Isolation Properties of Virtualization Systems,"</article-title>
          <source>ACM</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>K.</given-names>
            <surname>Scarfone</surname>
          </string-name>
          . (
          <year>2013</year>
          ).
          <article-title>How to help DLP and Encryption Coexist</article-title>
          . Available: http://www.statetechmagazine.com/article/2013/11/howhelp-dlp-and-encryption-coexist-state
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>S.</given-names>
            <surname>Vavilis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Petkovic</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Zannone</surname>
          </string-name>
          ,
          <article-title>"Data Leakage Quantification,"</article-title>
          presented at the Data Applications Security and Privacy XXVIII: 28th Annual IFIP WG 11.3, Vienna, Austria,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Kizza</surname>
          </string-name>
          ,
          <source>Computer Network Security and Cyber Ethics</source>
          , 4th ed. Jefferson, North Carolina: McFarland &amp; Company, Inc.,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>M.</given-names>
            <surname>Tu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Spoa-Harty</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <article-title>"Data Loss Prevention Management and Control: Inside Activity Incident Monitoring, Identification, and Tracking in Healthcare Enterprise Environments,"</article-title>
          <source>The Journal of Digital Forensics, Security and Law</source>
          , vol.
          <volume>10</volume>
          , pp.
          <fpage>27</fpage>
          -
          <lpage>44</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>I. H.</given-names>
            <surname>Witten</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Frank</surname>
          </string-name>
          ,
          <article-title>"Classification rule,"</article-title>
          in
          <source>Data Mining: Practical Machine Learning Tools and Techniques</source>
          , 2nd ed. 500 Sansome Street, Suite 400, San Francisco, CA 94111: Morgan Kaufmann Publisher,
          <year>2005</year>
          , pp.
          <fpage>200</fpage>
          -
          <lpage>213</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>