<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Leveraging Multi-task Learning for Unambiguous and Flexible Deep Neural Network Watermarking</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fangqi Li</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lei Yang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shilin Wang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alan Wee-Chung Liew</string-name>
          <email>a.liew@grif</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Information and Communication Technology, Griffith University</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Deep neural networks are playing an important role in many real-life applications. An important prerequisite in commercializing deep neural networks is the identification of their genuine owners. Therefore, watermarking schemes that embed the owner's identity information into the models have been proposed. However, current schemes cannot meet all the security requirements such as unambiguity and are inflexible since most of them focus on classification models. To meet the formal definitions of the security requirements and increase the applicability of deep neural network watermarking schemes, we propose a new method, MTLSign, based on multi-task learning. By treating the watermark embedding as an extra task, the security requirements are explicitly formulated and met with well-designed regularizers and components from cryptography. Experiments have demonstrated that MTLSign is flexible and robust for practical security in machine learning applications.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Deep neural networks (DNNs) are spearheading artificial
intelligence, with broad applications in assorted fields. Training a
DNN is expensive: a large amount of data has to be collected
and preprocessed, and data preparation is followed by
parameter tuning and optimization of the DNN structure. In contrast,
using a DNN is easy: a user simply propagates the input
forward. This imbalance between DNN production and
deployment calls for protecting DNN models as intellectual
property (IP) against piracy. Moreover, identifying
a DNN’s owner forms the basis of the accountability of AI
systems.</p>
      <p>Watermarking is an influential method for DNN IP
protection (Uchida et al. 2017). Some information is embedded
into the neural network as the watermark. If adversaries
steal the model and pretend to have built it
themselves, an ownership verification (OV) process reveals the
hidden information and identifies the authentic owner.</p>
      <p>*S. Wang is the corresponding author. This work was supported
by National Natural Science Foundation of China (61771310). Part
of the work appeared as https://arxiv.org/pdf/2108.09065.pdf
Copyright © 2022 for this paper by its authors. Use
permitted under Creative Commons License Attribution 4.0 International
(CC BY 4.0).</p>
      <p>[Figure 1: The structure of MTLSign. Labels recovered from the figure: $D_{\mathrm{primary}}$, $D_{\mathrm{WM}}^{\mathrm{key}}$, key, the primary branch (ending in $c_p$), and the watermark branch.]</p>
      <p>
        If the pirated model is deployed as an API then the owner
has to adopt backdoor-based watermarking schemes (Zhang
et al. 2018; Adi et al. 2018), where special triggers evoke
certain outputs. Triggers can be generated from an
autoencoder (Li et al. 2019b; Li and Wang 2021), adversarial
samples (Le Merrer, Perez, and Trédan 2020), or exceptional
samples (Li et al. 2019a). Backdoor-based watermarking
schemes are fragile given backdoor clearance methods (Liu
et al. 2020; Li et al. 2021; Namba and Sakuma 2019). Model
tuning such as fine-pruning
        <xref ref-type="bibr" rid="ref1">(Liu, Dolan-Gavitt, and Garg
2018)</xref>
        can also block some backdoors and hence the
watermark.
      </p>
      <p>If the entire suspicious model is accessible, e.g., in model
competitions and project certifications, then weight-based
watermarks can incorporate the owner’s identity information
into the weights of a DNN (Uchida et al. 2017), or the
statistics of the intermediate feature maps (Darvish, Chen, and
Koushanfar 2019). These white-box schemes usually carry
more information and have greater forensic value.</p>
      <p>To date, most watermarking methods have been designed
and examined only for image classification DNNs, or
depend on specialized layers. Such inflexibility hinders
the broader application of DNN watermarking schemes
as a commercial standard. Moreover, some basic security
requirements against adversarial attacks have been
overlooked. The robustness of watermarks against new adaptive
attacks such as the spoil attack (Li, Wang, and Liew 2021)
also requires more attention.</p>
      <p>
        To overcome these difficulties, we propose a new
white-box DNN watermarking scheme based on multi-task
learning (MTL)
        <xref ref-type="bibr" rid="ref1">(Sener and Koltun 2018)</xref>,
MTLSign, as shown
in Fig. 1. By modeling the watermark embedding
procedure as an extra task, security requirements are satisfied with
well-designed regularizers. This extra task has an
independent backend classifier, hence it can verify the ownership of
arbitrary models. Cryptographic primitives are adopted to
instantiate the watermarking task, making MTLSign provably
secure against the ambiguity attack. The major contributions
of our work are three-fold:
        <list list-type="bullet">
          <list-item><p>We examine the security requirements for DNN
watermarks, especially unambiguity, in a formal manner.</p></list-item>
          <list-item><p>A DNN watermarking scheme based on MTL is
proposed. It can be applied to DNNs for tasks other than
image classification, the major focus of previous works.</p></list-item>
          <list-item><p>Experiments show that MTLSign is more robust,
flexible, and secure than several state-of-the-art
schemes.</p></list-item>
        </list>
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 Security Requirements</title>
      <p>We assume that the adversary possesses less data than the
owner (otherwise piracy would be unnecessary), but has full
knowledge of the watermarking scheme and can tune the
model adaptively. The pirated deep learning model fulfils
a primary task, $T_{\mathrm{primary}}$, with dataset $D_{\mathrm{primary}}$, data space $\mathcal{X}$,
label space $\mathcal{Y}$, and a metric $d$ on $\mathcal{Y}$. We study four crucial
security requirements confronting DNN IP protection.</p>
      <sec id="sec-2-1">
        <title>2.1 Unambiguity</title>
        <p>A DNN watermarking scheme WM is composed of a key
generation module Gen and an embedding module Embed. It first
generates a key for the owner with security parameter $N$:
          <disp-formula><tex-math>\mathrm{key} \leftarrow \mathrm{Gen}(1^{N});</tex-math></disp-formula>
then it embeds key into a clean model $M_{\mathrm{clean}}$:
          <disp-formula><tex-math>(M_{\mathrm{WM}}, \mathrm{verify}) \leftarrow \mathrm{Embed}(M_{\mathrm{clean}}, \mathrm{key}),</tex-math></disp-formula>
where $M_{\mathrm{WM}}$ is the watermarked DNN model and verify
is the (possibly publicly available) ownership verifier (Li,
Wang, and Liew 2021). To accurately verify the ownership,
it is necessary and sufficient that:
          <disp-formula><tex-math>\Pr\{\mathrm{verify}(M_{\mathrm{WM}}, \mathrm{key}) = 1\} \geq 1 - \epsilon(N), \tag{1}</tex-math></disp-formula>
          <disp-formula><tex-math>\Pr\{\mathrm{verify}(M_{\mathrm{WM}}, \mathrm{key}') = 0\} \geq 1 - \epsilon(N), \tag{2}</tex-math></disp-formula>
where $\epsilon$ declines exponentially in $N$ and $\mathrm{key}' \neq \mathrm{key}$ is a
random key. Claiming ownership with verify and $\mathrm{key}'$ is
the ambiguity attack, hence Eq. (2) is defined as the
unambiguity property, which is demonstrated in Fig. 2(a).
Unambiguity has been examined for certain models such as GANs (Ong
et al. 2021), but its formal connection with the security
parameter has not been established.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2 Functionality-preserving and covertness</title>
        <p>The watermarked DNN should perform only slightly worse than,
if not as well as, the clean model. The formal definition is:</p>
        <sec id="sec-2-2-1">
          <title>$\Pr_{(x,y) \sim T_{\mathrm{primary}}} \{ d(M_{\mathrm{clean}}(x), M_{\mathrm{WM}}(x)) \leq \delta \} \geq 1 - \epsilon$,</title>
          <p>which can be examined a posteriori. However, it is hard to
explicitly incorporate this definition into the watermarking
scheme. Instead, we resort to the following definition:
            <disp-formula><tex-math>\forall x \in \mathcal{X}, \quad d(M_{\mathrm{clean}}(x), M_{\mathrm{WM}}(x)) \leq \delta. \tag{3}</tex-math></disp-formula>
To meet Eq. (3), we only have to ensure that the
parameters of $M_{\mathrm{WM}}$ do not deviate too much from those of $M_{\mathrm{clean}}$.
Meanwhile, such a small deviation is also the requirement of
covertness, i.e., the secrecy of the watermark (Ganju et al.
2018). The owner should be able to control the level of this
difference. Let $\lambda$ be a parameter within WM that regulates
such difference. It is desirable that in the extreme case where
$\lambda$ approaches zero, the watermarked model converges to the
clean model:</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>$M_{\mathrm{WM}} \to M_{\mathrm{clean}}$ when $\lambda \to 0$. (4)</title>
          <p>So the owner can select the optimal level of
functionality/covertness by modifying $\lambda$.</p>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>2.3 Robustness against tuning</title>
        <p>
          An adversary can tune $M$ by running backpropagation
on a local dataset, pruning unnecessary neurons (NP), or
pruning and then fine-tuning $M$ (FP). FP has been reported to
efficiently eliminate backdoors, and hence the watermarks within,
from image classification models
          <xref ref-type="bibr" rid="ref1">(Liu, Dolan-Gavitt, and
Garg 2018)</xref>.
After being tuned on the adversary’s dataset
$D_{\mathrm{adversary}}$, the model’s parameters shift and the verification
of the watermark might fail. Let $M' \xleftarrow{D_{\mathrm{adversary}}} M_{\mathrm{WM}}$ denote
a model $M'$ obtained by tuning $M_{\mathrm{WM}}$ with $D_{\mathrm{adversary}}$. As
shown in Fig. 2(b), a watermarking scheme is robust against
tuning if:
          <disp-formula><tex-math>\Pr_{D_{\mathrm{adversary}},\; M' \xleftarrow{D_{\mathrm{adversary}}} M_{\mathrm{WM}}} \{\mathrm{verify}(M', \mathrm{key}) = 1\} \geq 1 - \epsilon(N). \tag{5}</tex-math></disp-formula>
To meet (5), the owner has to make $\mathrm{verify}(\cdot, \mathrm{key})$
insensitive to tuning in the neighbourhood of $M_{\mathrm{WM}}$.
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4 Flexibility</title>
        <p>Many white-box DNN watermarking schemes rely on extra
modules such as passport layers or specialized network
architectures (Fan et al. 2021). Therefore, they cannot be readily
applied to arbitrary DNN models. To ensure generality, it
is desirable that the watermarking scheme neither depends
on specific modules incorporated within the DNN nor
explicitly modifies the product’s structure.</p>
        <p>A comprehensive summary of established watermarking
schemes judged according to the enumerated security
requirements is given in Table 1.</p>
        <p>Remark. Apart from these major requirements, there are
secondary security demands such as security against
overwriting and the redeclaration attack shown in Fig. 2(c),
removal, privacy concerns, etc. We defer the examination and
discussion of these demands to the empirical studies.</p>
        <p>[Figure 2: (a) Security against the ambiguity attack. (b) Robustness against tuning. (c) Redeclaration attack.]</p>
        <p>We leverage multi-task learning to design a white-box
watermarking framework for DNN IP protection. The watermark
embedding is modeled as an additional task $T_{\mathrm{WM}}$. A
classifier for $T_{\mathrm{WM}}$ is built independently of the backend for $T_{\mathrm{primary}}$,
so common tunings such as fine-tuning the last layer (FTLL) or
re-training the last layers (RTLL) (Adi et al. 2018) have no impact
on our watermark. After training and watermark embedding,
only the network structure for $T_{\mathrm{primary}}$ is published.</p>
        <p>Under this formulation, the functionality-preserving
property and the security against tuning can be formally
addressed. A properly designed $T_{\mathrm{WM}}$ ensures security
against ambiguity attacks as well, making MTLSign a
secure and flexible option for DNN IP protection. To better
handle the forensic difficulties involving watermark
redeclaration, we adopt a decentralized consensus protocol to
authorize the timestamp associated with the watermarks.</p>
      </sec>
      <sec id="sec-2-5">
        <title>3.2 The watermarking scheme MTLSign</title>
        <p>The structure of the watermarking scheme MTLSign is
illustrated in Fig. 1. The entire network consists of the
backbone network and two independent backends: $c_p$ and $c_{\mathrm{WM}}$.
The published watermarked model $M_{\mathrm{WM}}$ is the backbone
followed by $c_p$, and $f_{\mathrm{WM}}$ is the watermarking branch, in
which $c_{\mathrm{WM}}$ takes the outputs of different layers of the
backbone as its input. Because $c_{\mathrm{WM}}$ monitors the outputs of
different layers of the backbone network, it is harder to
invalidate the watermark completely than with
passport-layer-based schemes.</p>
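        <p>To produce a watermarked model, the owner should:
          <list list-type="order">
            <list-item><p>Generate $N$ samples $D_{\mathrm{WM}}^{\mathrm{key}} = \{(x_i, y_i)\}_{i=1}^{N}$ using a
pseudo-random algorithm with key as the seed.</p></list-item>
            <list-item><p>Optimize the DNN to jointly minimize the loss on $D_{\mathrm{WM}}^{\mathrm{key}}$
and $D_{\mathrm{primary}}$. During the optimization, a series of
regularizers are designed to meet the security requirements
enumerated in Section 2.</p></list-item>
            <list-item><p>Publish $M_{\mathrm{WM}}$.</p></list-item>
          </list>
        </p>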
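        <p>To prove its ownership over a model $M$ to a third-party
customer, the owner and the customer proceed as follows:
          <list list-type="order">
            <list-item><p>The owner submits $M$, $c_{\mathrm{WM}}$, and key.</p></list-item>
            <list-item><p>The customer checks whether $c_{\mathrm{WM}}$ is consistent with
$M$’s architecture.</p></list-item>
            <list-item><p>The customer generates $D_{\mathrm{WM}}^{\mathrm{key}}$ from key and combines
$c_{\mathrm{WM}}$ with $M$’s backbone to reproduce $f_{\mathrm{WM}}$.</p></list-item>
            <list-item><p>If $f_{\mathrm{WM}}$ statistically fits $D_{\mathrm{WM}}^{\mathrm{key}}$, then the customer confirms
the owner’s ownership over $M$.</p></list-item>
          </list>
        </p>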
        <p>The implementation of $T_{\mathrm{WM}}$. The watermark task $T_{\mathrm{WM}}^{\mathrm{key}}$
is instantiated as a binary classification task. To generate $D_{\mathrm{WM}}^{\mathrm{key}}$,
key is used as the seed of a pseudo-random generator (e.g.,
a stream cipher) to generate $\pi^{\mathrm{key}}$, a sequence of $N$ different
integers from $\{0, \ldots, 2^{m} - 1\}$, and a binary string $l^{\mathrm{key}}$ of
length $N$, where $m = 3\lceil \log_2(N) \rceil$.</p>
        <p>For each type of data space $\mathcal{X}$, a deterministic and
injective function is adopted to map each integer in $\pi^{\mathrm{key}}$ into an
element in $\mathcal{X}$. For example, when $\mathcal{X}$ is the image domain,
the mapping could be a QR code encoder. When $\mathcal{X}$ is the set of
sequences of English words, the mapping could map an
integer $n$ to the $n$-th word of the dictionary. Without loss of
generality, let $\pi^{\mathrm{key}}[i]$ denote the mapped data from the $i$-th
integer in $\pi^{\mathrm{key}}$. Both the pseudo-random generator and the
functions that map integers into the specialized data space are
accessible to all parties. Now we set:
          <disp-formula><tex-math>D_{\mathrm{WM}}^{\mathrm{key}} = \left\{ \left( \pi^{\mathrm{key}}[i],\; l^{\mathrm{key}}[i] \right) \right\}_{i=1}^{N},</tex-math></disp-formula>
where $l^{\mathrm{key}}[i]$ is the $i$-th bit of $l^{\mathrm{key}}$. The security
requirements raised in Section 2 are merged into MTLSign as
analyzed below.</p>
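        <p>The generation of the watermarking dataset described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: Python's random.Random stands in for the stream cipher (it is not cryptographically secure), SHA-256 is used only to turn the key into a seed, and encode_as_bits is a hypothetical injective map that plays the role of the QR code encoder or the dictionary lookup.</p>
        <preformat>
```python
import hashlib
import math
import random


def generate_watermark_dataset(key: bytes, n: int):
    """Derive n distinct integers and n label bits from `key`.

    Illustrative only: random.Random is a stand-in for the stream cipher
    mentioned in the text and is not cryptographically secure.
    """
    m = 3 * math.ceil(math.log2(n))          # bit width m = 3 * ceil(log2 N)
    seed = int.from_bytes(hashlib.sha256(key).digest(), "big")
    rng = random.Random(seed)                # deterministic for a given key
    integers = rng.sample(range(2 ** m), n)  # N different integers in [0, 2^m - 1]
    labels = [rng.getrandbits(1) for _ in range(n)]
    return m, integers, labels


def encode_as_bits(value: int, m: int):
    """Hypothetical injective map into a data space: an m-bit 0/1 vector,
    least significant bit first."""
    return [(value >> i) % 2 for i in range(m)]
```
        </preformat>
        <p>With the illustrative value N = 600, the sketch gives m = 30, so the 600 integers are drawn from a range of roughly one billion values, which is why the datasets induced by two unrelated keys can hardly overlap.</p>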
        <p>Unambiguity. To attribute the ownership of a model $M$ to an
owner with key, given $c_{\mathrm{WM}}$, verify operates as in Algo. 1.</p>
        <sec id="sec-2-5-1">
          <title>Algorithm 1: $\mathrm{verify}(\cdot, \cdot \mid c_{\mathrm{WM}}, \gamma)$</title>
        </sec>
      </sec>
      <sec id="sec-2-6">
        <title>Require: $M$, key.</title>
        <p>Ensure: The verification of $M$’s ownership.
1: Build the watermarking branch $f$ from $M$ and $c_{\mathrm{WM}}$;
2: Generate $D_{\mathrm{WM}}^{\mathrm{key}}$ from key;
3: If $f$ correctly classifies at least $\gamma \cdot N$ terms within $D_{\mathrm{WM}}^{\mathrm{key}}$</p>
      </sec>
      <sec id="sec-2-7">
        <title>4: Then return 1.</title>
      </sec>
      <sec id="sec-2-8">
        <title>5: Else return 0.</title>
        <p>If $M = M_{\mathrm{WM}}$, then $M$ has been trained to minimize the
binary classification loss on $T_{\mathrm{WM}}$, hence the test is likely to
succeed; this justifies the correctness requirement in (1). For
an arbitrary $\mathrm{key}' \neq \mathrm{key}$, the induced watermark training
data $D_{\mathrm{WM}}^{\mathrm{key}'}$ and $D_{\mathrm{WM}}^{\mathrm{key}}$ can hardly overlap. It can be proven
that if $m \geq \log_2(N^{3})$ and $\gamma$ is selected to be significantly
higher than $\frac{1}{2}$, then the probability of a successful ambiguity
attack declines exponentially with $N$; details are given in
Appendix A. This justifies the unambiguity condition (2).</p>
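        <p>Algorithm 1 and the intuition behind condition (2) can be illustrated with a short sketch. The function names and the representation of the watermarking branch as a Python callable are assumptions made for illustration. The second function further assumes that labels derived from an unrelated key agree with the branch independently with probability 1/2, which gives an exact binomial tail rather than the Chernoff bound derived in Appendix A.</p>
        <preformat>
```python
import math


def verify(predict, dataset, gamma):
    """Return 1 if `predict` matches at least gamma * N labels, else 0."""
    n = len(dataset)
    correct = sum(1 for x, y in dataset if predict(x) == y)
    return 1 if correct >= gamma * n else 0


def ambiguity_tail(n, gamma):
    """Pr[Binomial(n, 1/2) >= gamma * n].

    Simplifying assumption: each of the n labels from an unrelated key
    matches the branch output independently with probability 1/2.
    """
    k_min = math.ceil(gamma * n)
    hits = sum(math.comb(n, k) for k in range(k_min, n + 1))
    return hits / 2 ** n
```
        </preformat>
        <p>Under this simplified model, the tail probability shrinks as N grows for any fixed threshold above 1/2, which matches the qualitative claim that a successful ambiguity attack becomes exponentially unlikely.</p>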
      </sec>
      <sec id="sec-2-9">
        <title>The functionality-preserving regularizer</title>
        <p>Denote the trainable parameters of the DNN model by $W$. The
optimization target for $T_{\mathrm{primary}}$ takes the form:
          <disp-formula><tex-math>\mathcal{L}_0(W \mid D_{\mathrm{primary}}) = \sum_{(x,y) \in D_{\mathrm{primary}}} l\left(M_{\mathrm{WM}}^{W}(x), y\right) + \lambda_0 \cdot u(W), \tag{6}</tex-math></disp-formula>
where $l(\cdot, \cdot)$ is the loss defined by $T_{\mathrm{primary}}$ and $u(\cdot)$ is a
regularizer reflecting the prior knowledge on $W$.</p>
        <p>Since $D_{\mathrm{WM}}^{\mathrm{key}}$ is much smaller than $D_{\mathrm{primary}}$, $T_{\mathrm{WM}}$ might
not converge properly when being learned simultaneously
with $T_{\mathrm{primary}}$. Hence we first optimize $W$ w.r.t. the loss on
the primary task (6) to obtain $M_{\mathrm{clean}}$ with parameter
$W_0 = \arg\min_{W} \{\mathcal{L}_0(W, D_{\mathrm{primary}})\}$.</p>
        <p>Then the model is tuned for $T_{\mathrm{WM}}$ by minimizing:
          <disp-formula><tex-math>\mathcal{L}_1(W \mid D_{\mathrm{primary}}, D_{\mathrm{WM}}^{\mathrm{key}}) = \sum_{(x,y) \in D_{\mathrm{WM}}^{\mathrm{key}}} l_{\mathrm{WM}}\left(f_{\mathrm{WM}}^{W}(x), y\right) + \lambda_1 \cdot R_{\mathrm{func}}(W), \tag{7}</tex-math></disp-formula>
where $l_{\mathrm{WM}}(\cdot, \cdot)$ is the cross-entropy loss, and
          <disp-formula><tex-math>R_{\mathrm{func}}(W) = \| W - W_0 \|_2^2. \tag{8}</tex-math></disp-formula>
The regularizer $R_{\mathrm{func}}$ in (8) confines $W$ to the neighbourhood of
$W_0$. Then the continuity of $M_{\mathrm{WM}}$ as a function of $W$ ensures
the functionality-preserving property defined in (3).</p>
        <p>Remark on covertness. Note that $\lambda = 1/\lambda_1$ regarding
Eq. (4) regulates the parameter deviation of $M_{\mathrm{WM}}$ from
$M_{\mathrm{clean}}$. If the owner adopts a large $\lambda_1$, then it obtains a high
level of covertness. Meanwhile, a smaller $\lambda_1$ trades
covertness for faster convergence of the watermarking task.</p>
        <p>The tuning regularizer. To be robust against adversarial
tuning, it is sufficient to make $c_{\mathrm{WM}}$ robust against tuning
according to the definition in (5). We assume that $D_{\mathrm{adversary}}$
shares a similar distribution with $D_{\mathrm{primary}}$. Otherwise, the stolen
model would not have state-of-the-art performance on the
adversary’s task. A subset $D'_{\mathrm{primary}}$ of $D_{\mathrm{primary}}$ is first sampled as an
estimation of $D_{\mathrm{adversary}}$. Let $W$ be the current configuration
of the model’s parameters. Tuning is tantamount to
minimizing the empirical loss on $D'_{\mathrm{primary}}$ by starting from $W$, which
results in the updated parameter $W^{t} \xleftarrow{D'_{\mathrm{primary}}} W$. In practice,
$W^{t}$ is obtained by replacing $D_{\mathrm{primary}}$ in (6) by $D'_{\mathrm{primary}}$ and
training for a few epochs.</p>
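        <p>The two ingredients discussed above, the proximity penalty that keeps W close to W0 and the simulated tuning that produces a tuned parameter, can be sketched as follows. This is an illustrative sketch only: it assumes the penalty is the squared Euclidean distance, treats the parameters as flat Python lists, and replaces training on a subset of the primary data by a few plain gradient steps with a hypothetical gradient function. A real implementation would rely on a deep learning framework.</p>
        <preformat>
```python
def r_func(w, w0):
    """Squared Euclidean distance between two flat parameter lists."""
    return sum((a - b) ** 2 for a, b in zip(w, w0))


def l1_objective(wm_loss, w, w0, lambda_1):
    """Watermark loss plus lambda_1 times the proximity penalty."""
    return wm_loss(w) + lambda_1 * r_func(w, w0)


def simulate_tuning(grad_primary, w, steps=3, lr=0.1):
    """Hypothetical stand-in for the adversary's tuning: take a few
    gradient steps on the primary loss, starting from a copy of w."""
    w_t = list(w)
    for _ in range(steps):
        g = grad_primary(w_t)
        w_t = [a - lr * b for a, b in zip(w_t, g)]
    return w_t
```
        </preformat>
        <p>A larger lambda_1 makes any departure from W0 more expensive, which is the trade-off between covertness and convergence speed noted in the remark above.</p>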
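        <p>To achieve the security in (5), for any $D_{\mathrm{adversary}}$ and
$(x, y) \in D_{\mathrm{WM}}^{\mathrm{key}}$, the parameter $W$ should meet:
          <disp-formula><tex-math>f_{\mathrm{WM}}^{W^{t}}(x) = y, \quad \forall\, W^{t} \xleftarrow{D'_{\mathrm{primary}}} W.</tex-math></disp-formula>
This condition, together with Algo. 1, implies (5).</p>
        <p>To impose this constraint during training, we
design a new regularizer:
          <disp-formula><tex-math>R_{\mathrm{DA}}(W) = \sum_{W^{t} \xleftarrow{D'_{\mathrm{primary}}} W,\; (x,y) \in D_{\mathrm{WM}}^{\mathrm{key}}} l_{\mathrm{WM}}\left(f_{\mathrm{WM}}^{W^{t}}(x), y\right). \tag{9}</tex-math></disp-formula>
Then the loss to be minimized is updated from (7) to:
          <disp-formula><tex-math>\mathcal{L}_2(W \mid D_{\mathrm{primary}}, D_{\mathrm{WM}}^{\mathrm{key}}) = \mathcal{L}_1(W, D_{\mathrm{primary}}, D_{\mathrm{WM}}^{\mathrm{key}}) + \lambda_2 \cdot R_{\mathrm{DA}}(W). \tag{10}</tex-math></disp-formula>
$R_{\mathrm{DA}}$ defined by (9) can be understood as one kind of data
augmentation for $T_{\mathrm{WM}}$. Data augmentation aims to improve
the model’s robustness against some specific perturbation in
the input domain (Shorten and Khoshgoftaar 2019). This is
usually done by adding an extra regularizer:</p>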
        <p>
          <disp-formula><tex-math>\sum_{(x,y) \in D,\; x' \xleftarrow{\mathrm{perturb}} x} l\left(f^{W}(x'), y\right). \tag{11}</tex-math></disp-formula>
Unlike in the data domain of $T_{\mathrm{primary}}$, it is hard to explicitly
define augmentation for $T_{\mathrm{WM}}$ against tuning. A regularizer
with the form of (11) can be derived from (9) by
interchanging the order of summation. Concretely, a perturbation in
the watermarking task of the form
          <disp-formula><tex-math>x' \in \left(f_{\mathrm{WM}}^{W}\right)^{-1}\left(f_{\mathrm{WM}}^{W^{t}}(x)\right) \xleftarrow{\mathrm{perturb}} x</tex-math></disp-formula>
can increase the watermarked model’s robustness against
tuning.</p>
        <p>To regulate the OV process against watermark overwriting
and piracy, one option is to use a trusted authorization
center, which is vulnerable and expensive. Therefore, we resort
to decentralized consensus protocols such as Raft (Ongaro and
Ousterhout 2014) or PBFT (Castro, Liskov et al. 1999),
under which messages are responded to and recorded by clients
within the community. By storing the necessary information
on the servers of a distributed community, the watermark
becomes unforgeable (Li, Wang, and Liew 2021).</p>
        <p>To conduct an OV, the owner submits the evidence to the
entire community, so each member can independently
conduct the verification. The final result is obtained through
voting; the process is illustrated in Fig. 3. The key
generation process can be tied to the owner’s digital
signature (e.g., by a CPA-secure encryption) so that revealing key does not
violate privacy or lead to further threats.</p>
        <p>To publish a model. An owner B signs and broadcasts the
following message to the entire community:
          <disp-formula><tex-math>\langle \mathrm{Publish{:}} \,\|\, \mathrm{time} \,\|\, \mathrm{hash}(\mathrm{key}) \,\|\, \mathrm{hash}(c_{\mathrm{WM}}) \,\|\, \mathrm{hash}(\mathrm{info}) \rangle,</tex-math></disp-formula>
where $\|$ denotes string concatenation, time is the
timestamp, info explains how $c_{\mathrm{WM}}$ connects to the backbone
model, and hash is a preimage-resistant hash function that
maps an object to a string and is accessible to all parties.
Once B has confirmed that the majority of clients have recorded
its broadcast (e.g., when B receives a confirmation from the
current leader under the Raft protocol), it publishes $M_{\mathrm{WM}}$.</p>
        <p>To prove the ownership over a model. For a model $M$, B
signs and broadcasts the following message:
          <disp-formula><tex-math>\langle \mathrm{OV{:}} \,\|\, l_{M} \,\|\, \mathrm{hash}(M) \,\|\, l_{c_{\mathrm{WM}}} \,\|\, \mathrm{key} \rangle,</tex-math></disp-formula>
where $l_{M}$ and $l_{c_{\mathrm{WM}}}$ are pointers to $M$ and $c_{\mathrm{WM}}$. Upon
receiving this request, any client within the consensus community
can independently conduct the ownership proof. It first
downloads the model from $l_{M}$ and examines its hash. Then
it downloads $c_{\mathrm{WM}}$ and retrieves the corresponding message
from B by $\mathrm{hash}(c_{\mathrm{WM}})$. The last steps follow Section 3.2.
After finishing the verification, this client broadcasts its
result as the proof for B’s ownership over the model in $l_{M}$.</p>
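        <p>The message format of the publishing step can be sketched with Python's standard library. The sketch makes several assumptions that are not fixed by the text: SHA-256 is chosen as the hash function, the fields are serialized as strings joined by a two-character separator, and the helper names are hypothetical. Signing and broadcasting over Raft or PBFT are omitted.</p>
        <preformat>
```python
import hashlib
import time


def h(data: bytes) -> str:
    """SHA-256 hex digest, standing in for the protocol's hash function."""
    return hashlib.sha256(data).hexdigest()


def publish_message(key: bytes, c_wm: bytes, info: bytes, timestamp=None):
    """Concatenate the fields of the Publish message with '||' separators."""
    ts = int(time.time()) if timestamp is None else timestamp
    return "||".join(["Publish:", str(ts), h(key), h(c_wm), h(info)])


def matches_record(message: str, c_wm: bytes) -> bool:
    """Check that a downloaded c_WM hashes to the value recorded earlier."""
    fields = message.split("||")
    return fields[3] == h(c_wm)
```
        </preformat>
        <p>Because only hashes are recorded at publication time, the key itself stays private until the owner chooses to reveal it in an ownership proof, at which point any client can recompute the hashes and compare them with the recorded message.</p>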
      </sec>
      <sec id="sec-2-10">
        <title>Security of the OV protocol</title>
        <p>To pirate a model under this
protocol, an adversary must obtain a legal key, the hash of
a $c_{\mathrm{WM}}$, and the correct info earlier than the owner. This
is hard since the adversary has to correctly guess the pirated
DNN’s architecture and embed its key into it without
modifying its $c_{\mathrm{WM}}$. Otherwise, such piracy can be falsified by
examining the timestamp.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4 Experiments and Discussions</title>
      <sec id="sec-3-1">
        <title>4.1 Experiment Setup</title>
        <p>To illustrate the flexibility of MTLSign, we considered three
primary tasks: image classification (IC), sentiment
analysis (SA) of discourse, and image semantic segmentation
(SS). We adopted five datasets for IC, two datasets for SS,
and two datasets for SA. The descriptions of these datasets
and the corresponding DNN structures are listed in Table 2.</p>
        <p>
          ResNet (He et al. 2016) is a classical model for image
processing. For the VirusShare dataset, we compiled a
collection of 26,000 malware samples into images and adopted ResNet
as the classifier. GloVe (Pennington, Socher, and Manning
2014) is a pre-trained word embedding, while bidirectional
long short-term memory (Bi-LSTM) (Huang, Xu, and Yu
2015) is commonly used in NLP. Cascade Mask R-CNN
(CMRCNN)
          <xref ref-type="bibr" rid="ref1">(Cai and Vasconcelos 2018)</xref>
          is a DNN
specialized for semantic segmentation.
        </p>
        <p>[...] $2.69 \times 10^{-8}$ with [...] $= 0.34$ in the Chernoff bound according
to Appendix A. $D'_{\mathrm{primary}}$ took 10% of the samples randomly from
the training dataset. For the tuning attacks, we considered FP
and NP. As for adaptive attacks, we adopted the overwriting
attack and the spoil attack (Li, Wang, and Liew 2021).</p>
        <p>To examine the efficacy of $R_{\mathrm{func}}$ and $R_{\mathrm{DA}}$, we compared the
performance of the watermarked DNN $M_{\mathrm{WM}}$ under
different configurations. Three metrics are of interest: (i) the
performance of $M_{\mathrm{WM}}$ on $T_{\mathrm{primary}}$; (ii) the decline of the
performance of $M_{\mathrm{WM}}$ on $T_{\mathrm{primary}}$ when NP made $f_{\mathrm{WM}}$’s
accuracy on $T_{\mathrm{WM}}$ lower than $\gamma$; (iii) the performance of $f_{\mathrm{WM}}$
on $T_{\mathrm{WM}}$ after FP. The models were trained by minimizing
the MTL loss defined by (10), where we adopted fine-tuning
and NP and chose the optimal $\lambda_1$ and $\lambda_2$ by grid search in
$\{0.02, 0.04, \ldots, 0.2\}$. The results are collected in Fig. 5. We
observe that $R_{\mathrm{func}}$ preserves the model’s performance on the
primary task. On the other hand, $R_{\mathrm{DA}}$ makes the
watermarking branch robust against FP: its accuracy on $T_{\mathrm{WM}}$ is
significantly higher than that of the models without $R_{\mathrm{DA}}$. Meanwhile,
with $R_{\mathrm{DA}}$, the performance on the primary task has to decrease much
more during NP before the watermark is invalidated,
so the adversary has to sacrifice more in order to
invalidate the original ownership. Therefore, we suggest that both
regularizers be incorporated in watermarking the model.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>4.3 Comparative Studies and Discussion</title>
        <p>
          For comparison, several SOTA watermarking schemes
          <xref ref-type="bibr" rid="ref2">(Zhu
et al. 2020; Li et al. 2019a; Darvish, Chen, and Koushanfar
2019; Fan et al. 2021)</xref>
          that are secure against the ambiguity
attack and tuning were considered. Yet they cannot be
readily generalized to semantic segmentation and NLP tasks. We
generated 600 backdoor/passport/feature map triggers and
assigned them with proper labels for each candidate scheme.
        </p>
        <p>To compare the levels of covertness, we measured the
average deviation of parameters after watermarking. For the
functionality-preserving property and the robustness against
tuning, we recorded the performance of the watermarked
models on the primary task, the verification accuracy of
watermarks after FP, and the relative decline of the performance
on the primary task when NP invalidated the watermarks.</p>
        <p>Finally, we conducted the spoil attack, an improved
watermark removal attack (Li, Wang, and Liew 2021), on the
watermarked model. The spoil attack can always eliminate
the watermark, so, as with NP, the statistic of interest is the
relative decrease of the performance on $T_{\mathrm{primary}}$, which reflects
the adversary’s expense. We measured these values for all
compared schemes on five classification datasets. The results
are summarized in Fig. 6, and detailed implementations of the
spoil attacks are provided in Appendix B.</p>
        <p>Our method resulted in only a slight difference in
parameters compared with other candidates, in particular the
white-box competitors. It is therefore harder for an adversary to distinguish
a model watermarked by MTLSign from a clean one.
Regarding robustness and functionality-preserving, our method
uniformly outperformed the other competitors. This is due to two reasons: (1)
MTLSign does not incorporate backdoors into the model,
so adversarial modifications such as FP, which are designed
to eliminate backdoors, can hardly weaken our watermark. (2)
MTLSign relies on an extra module, $c_{\mathrm{WM}}$, as a verifier. As
an adversary cannot tamper with this module, universal
tunings such as NP have less impact. MTLSign can also adapt
to new tuning operators by incorporating them into $R_{\mathrm{DA}}$.
Moreover, MTLSign imposes only weak conditions on both the
task (e.g., NLP) and the DNN architecture, and is therefore more
flexible. Finally, we consider the overwriting attack, where the
adversary embeds its own watermark into the pirated DNN.
Although the adversary’s ownership declaration can be
falsified by the OV protocol, it is necessary that such
overwriting does not invalidate the owner’s watermark. The decrease
of the accuracy of the watermarking branch with the number of
overwriting epochs is recorded in Table 3. Since the decrease
is uniformly bounded by 5%, overwriting does not pose a
threat to MTLSign.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5 Conclusion</title>
      <p>This paper presents MTLSign, an MTL-based DNN
watermarking scheme. We examine the basic security
requirements for DNN watermarks, especially unambiguity,
and propose to embed the watermark as an additional task.
The proposed scheme explicitly meets the security requirements
through corresponding regularizers. With a decentralized
consensus protocol, MTLSign is secure against adaptive attacks.
Like any other white-box DNN
watermarking scheme, MTLSign remains vulnerable to functionality
equivalence attacks such as neuron permutation. This is
one of the aspects that require further effort to increase the
applicability of DNN watermarks.</p>
      <sec id="sec-4-1">
        <title>References</title>
        <p>Adi, Y.; Baum, C.; Cisse, M.; Pinkas, B.; and Keshet, J.
2018. Turning your weakness into a strength: Watermarking
deep neural networks by backdooring. In 27th USENIX
Security Symposium (USENIX Security 18), 1615–1631.</p>
        <p>Cai, Z.; and Vasconcelos, N. 2018. Cascade r-cnn:
Delving into high quality object detection. In Proceedings of the
IEEE conference on computer vision and pattern
recognition, 6154–6162.</p>
        <p>Castro, M.; Liskov, B.; et al. 1999. Practical byzantine fault
tolerance. In OSDI, volume 99, 173–186.</p>
        <p>Darvish, R. B.; Chen, H.; and Koushanfar, F. 2019.
DeepSigns: an end-to-end watermarking framework for
ownership protection of deep neural networks. In Proceedings
of the Twenty-Fourth International Conference on
Architectural Support for Programming Languages and Operating
Systems, 485–497.</p>
        <p>Fan, L.; Ng, K. W.; Chan, C. S.; and Yang, Q. 2021. DeepIP:
Deep Neural Network Intellectual Property Protection with
Passports. IEEE Transactions on Pattern Analysis and
Machine Intelligence.</p>
        <p>Ganju, K.; Wang, Q.; Yang, W.; Gunter, C. A.; and Borisov,
N. 2018. Property inference attacks on fully connected
neural networks using permutation invariant representations. In
Proceedings of the 2018 ACM SIGSAC Conference on
Computer and Communications Security, 619–633.</p>
        <p>Guan, X.; Feng, H.; Zhang, W.; Zhou, H.; Zhang, J.; and Yu,
N. 2020. Reversible Watermarking in Deep Convolutional
Neural Networks for Integrity Authentication. In
Proceedings of the 28th ACM International Conference on
Multimedia, 2273–2280.</p>
        <p>He, K.; Zhang, X.; Ren, S.; and Sun, J. 2016. Deep
residual learning for image recognition. In Proceedings of the
IEEE conference on computer vision and pattern
recognition, 770–778.</p>
        <p>Huang, Z.; Xu, W.; and Yu, K. 2015. Bidirectional
LSTM-CRF models for sequence tagging. arXiv preprint
arXiv:1508.01991.</p>
        <p>Le Merrer, E.; Perez, P.; and Trédan, G. 2020. Adversarial
frontier stitching for remote neural network watermarking.
Neural Computing and Applications, 32(13): 9233–9244.</p>
        <p>Li, F.; Wang, S.; and Liew, A. W.-C. 2021. Regulating
Ownership Verification for Deep Neural Networks: Scenarios,
Protocols, and Prospects. IJCAI Workshop.</p>
        <p>Li, F.-Q.; and Wang, S.-L. 2021. Persistent Watermark For
Image Classification Neural Networks By Penetrating The
Autoencoder. In 2021 IEEE International Conference on
Image Processing (ICIP), 3063–3067.</p>
        <p>Li, H.; Willson, E.; Zheng, H.; and Zhao, B. Y. 2019a.
Persistent and unforgeable watermarks for deep neural
networks. arXiv preprint arXiv:1910.01226.</p>
        <p>[Figure panels: (a) MNIST. (b) Fashion-MNIST. (c) CIFAR-10. (d) CIFAR-100. (e) VirusShare.]</p>
        <p>Li, Y.; Koren, N.; Lyu, L.; Lyu, X.; Li, B.; and Ma, X.
2021. Neural Attention Distillation: Erasing Backdoor
Triggers from Deep Neural Networks. arXiv preprint
arXiv:2101.05930.</p>
        <p>Li, Z.; Hu, C.; Zhang, Y.; and Guo, S. 2019b. How to
prove your model belongs to you: a blind-watermark based
framework to protect intellectual property of DNN. In
Proceedings of the 35th Annual Computer Security Applications
Conference, 126–137.</p>
        <p>Liu, H.; Weng, Z.; and Zhu, Y. 2021. Watermarking Deep
Neural Networks with Greedy Residuals. In International
Conference on Machine Learning, 6978–6988. PMLR.</p>
        <p>Liu, K.; Dolan-Gavitt, B.; and Garg, S. 2018. Fine-pruning:
Defending against backdooring attacks on deep neural
networks. In International Symposium on Research in Attacks,
Intrusions, and Defenses, 273–294. Springer.</p>
        <p>Liu, X.; Li, F.; Wen, B.; and Li, Q. 2020. Removing
Backdoor-Based Watermarks in Neural Networks with
Limited Data. arXiv preprint arXiv:2008.00407.</p>
        <p>Namba, R.; and Sakuma, J. 2019. Robust watermarking of
neural network with exponential weighting. In Proceedings
of the 2019 ACM Asia Conference on Computer and
Communications Security, 228–240.</p>
        <p>Ong, D. S.; Chan, C. S.; Ng, K. W.; Fan, L.; and Yang, Q.
2021. Protecting Intellectual Property of Generative
Adversarial Networks From Ambiguity Attacks. In Proceedings of
the IEEE/CVF Conference on Computer Vision and Pattern
Recognition, 3630–3639.</p>
        <p>Ongaro, D.; and Ousterhout, J. 2014. In search of an
understandable consensus algorithm. In 2014 USENIX Annual
Technical Conference (USENIX ATC 14), 305–319.</p>
        <p>Pennington, J.; Socher, R.; and Manning, C. D. 2014. GloVe:
Global vectors for word representation. In Proceedings of
the 2014 conference on empirical methods in natural
language processing (EMNLP), 1532–1543.</p>
        <p>Sener, O.; and Koltun, V. 2018. Multi-task learning as
multi-objective optimization. In Advances in Neural Information
Processing Systems, 527–538.</p>
        <p>Shorten, C.; and Khoshgoftaar, T. M. 2019. A survey on
image data augmentation for deep learning. Journal of Big
Data, 6(1): 60–107.</p>
        <p>Uchida, Y.; Nagai, Y.; Sakazawa, S.; and Satoh, S. 2017.
Embedding watermarks into deep neural networks. In
Proceedings of the 2017 ACM on International Conference on
Multimedia Retrieval, 269–277.</p>
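        <p>Zhang, J.; Gu, Z.; Jang, J.; Wu, H.; Stoecklin, M. P.; Huang,
H.; and Molloy, I. 2018. Protecting intellectual property of deep
neural networks with watermarking. In Proceedings of the 2018 on
Asia Conference on Computer and Communications Security, 159–172.</p>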
      </sec>
    </sec>
    <sec id="sec-5">
      <title>A: Derivation for the unambiguity condition</title>
      <p>To formulate this intuition, consider the event where $D_{\mathrm{WM}}^{\mathrm{key}'}$
shares $q \cdot N$ terms with $D_{\mathrm{WM}}^{\mathrm{key}}$, $q \in (0, 1)$. With a
pseudorandom generator, it is computationally infeasible to
distinguish key from a sequence of N randomly selected
integers. The same argument holds for lkey and a random binary
string of length N. Therefore the probability of this event
can be upper bounded by:</p>
      <p>$$\binom{N}{qN} \cdot r^{qN} \cdot (1-r)^{(1-q)N} \leq \left[ \left(1 + (1-q)N\right) \cdot \frac{r}{1-r} \right]^{qN},$$
where $r = \frac{N}{2^{m+1}}$. For an arbitrary q, let $r &lt; \frac{1}{2 + (1-q)N}$; then
the probability that $D_{\mathrm{WM}}^{\mathrm{key}'}$ overlaps with $D_{\mathrm{WM}}^{\mathrm{key}}$ with a portion
of q declines exponentially.</p>
      <p>For numbers that do not appear in key, the watermarking
branch is expected to output a random guess. Therefore, if
q is smaller than a threshold, then $D_{\mathrm{WM}}^{\mathrm{key}'}$ can hardly pass
the statistical test in Algo. 1 when N is big enough. Letting m
and N be large enough therefore makes an effective collision
in the watermark dataset almost impossible. For simplicity,
setting $m = 3 \lceil \log_2(N) \rceil \geq \log_2(N^3)$ is sufficient.</p>
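      <p>As a numerical sanity check, the following Python sketch evaluates, in log space, the collision probability C(N, qN) · r^(qN) · (1 − r)^((1 − q)N) and the bound ((1 + (1 − q)N) · r / (1 − r))^(qN) for m = 3⌈log2(N)⌉, with r = N / 2^(m+1) as in the derivation. The value N = 600, the values of q, and the function names are illustrative assumptions, not settings taken from the experiments.</p>
      <preformat>
```python
import math

def log_collision_probability(N, q, m):
    """Natural log of C(N, qN) * r^(qN) * (1 - r)^((1 - q)N), with r = N / 2^(m + 1)."""
    k = round(q * N)                      # number of shared terms, qN
    r = N / 2 ** (m + 1)
    # log of the binomial coefficient via log-gamma, to avoid huge integers
    log_binom = math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
    return log_binom + k * math.log(r) + (N - k) * math.log1p(-r)

def log_upper_bound(N, q, m):
    """Natural log of ((1 + (1 - q)N) * r / (1 - r))^(qN)."""
    k = round(q * N)
    r = N / 2 ** (m + 1)
    return k * (math.log(1 + (N - k)) + math.log(r) - math.log1p(-r))

N = 600                                   # assumed watermark dataset size
m = 3 * math.ceil(math.log2(N))           # the choice suggested for simplicity
for q in (0.1, 0.5, 0.9):
    exact = log_collision_probability(N, q, m)
    bound = log_upper_bound(N, q, m)
    print(f"q={q}: log-probability={exact:.1f}, log-bound={bound:.1f}")
```
      </preformat>
      <p>Both quantities are reported as natural logarithms; a negative log-bound means the base of the bound is below one, so the bound shrinks exponentially in qN.</p>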
      <p>To select the threshold $\gamma$, assume that the random guess
strategy achieves an average accuracy of at most $p = 0.5 + \epsilon(N)$,
where $\epsilon$ is a negligible function. The verification
process returns 1 iff the watermark classifier achieves a binary
classification accuracy no less than $\gamma$. The demand for
security is that, by randomly guessing, the probability that an
adversary passes the test declines exponentially with N. Let
X denote the number of correct guesses with average
accuracy p; an adversary succeeds only if $X \geq \gamma \cdot N$. By the
Chernoff theorem:</p>
      <p>$$\Pr\{X \geq \gamma \cdot N\} \leq \left( \frac{1 - p + p \cdot e^{\lambda}}{e^{\gamma \cdot \lambda}} \right)^{N},$$
where $\lambda$ is an arbitrary nonnegative number. If $\gamma$ is larger
than p by a constant independent of N, then $\frac{1 - p + p \cdot e^{\lambda}}{e^{\gamma \cdot \lambda}}$ is
less than unity with a proper $\lambda$, reducing the probability of a
successful attack to a negligible level.</p>
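      <p>The effect of the free parameter in the Chernoff bound can be illustrated numerically. The Python sketch below writes the threshold as gamma and the free nonnegative parameter as lam, and uses the minimiser lam = ln(gamma(1 − p) / (p(1 − gamma))), which follows from setting the derivative of the per-sample factor to zero when 0 &lt; p &lt; gamma &lt; 1. The values p = 0.5 and gamma = 0.7 and all function names are assumptions chosen for illustration only.</p>
      <preformat>
```python
import math

def chernoff_base(p, gamma, lam):
    """Per-sample factor (1 - p + p * e^lam) / e^(gamma * lam) of the bound."""
    return (1 - p + p * math.exp(lam)) / math.exp(gamma * lam)

def best_lambda(p, gamma):
    """Minimiser of chernoff_base over lam >= 0; requires p below gamma below 1."""
    return math.log(gamma * (1 - p) / (p * (1 - gamma)))

def log_success_bound(p, gamma, N):
    """Natural log of the bound on Pr{X >= gamma * N} at the best lam."""
    return N * math.log(chernoff_base(p, gamma, best_lambda(p, gamma)))

p, gamma = 0.5, 0.7                       # illustrative values only
for N in (100, 300, 600):
    print(N, log_success_bound(p, gamma, N))
```
      </preformat>
      <p>At lam = 0 the factor equals one, so the bound is only informative for a suitably chosen positive lam; once the factor is below one, the log of the bound decreases linearly in N.</p>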
    </sec>
    <sec id="sec-6">
      <title>B: Implementation of the spoil attacks</title>
      <p>During the spoil attack, the adversary has full knowledge
of key and verify, and has obtained $M_{\mathrm{WM}}$. The adversary’s
objective is to tune $M_{\mathrm{WM}}$ into $M_{\mathrm{spoiled}}$ in order to escape IP
regulation, which means the following condition holds with
a large probability:</p>
      <p>$$\texttt{verify}(M_{\mathrm{spoiled}}, \mathrm{key}) = 0.$$</p>
      <p>For the backdoor-based watermarking schemes, key is
uniquely correlated with a collection of labelled triggers
$\{t_n, y_n\}_{n=1}^{N}$. The spoil attack is tantamount to fitting the
watermarked model on the same triggers with adversarially
shuffled labels.</p>
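      <p>One way to read "adversarially shuffled labels" is that every trigger receives a label different from its original one. The Python sketch below builds such a relabelling; this interpretation, the helper name adversarial_labels, and the example labels are assumptions, and the subsequent fitting of the model on the relabelled triggers is omitted.</p>
      <preformat>
```python
import random

def adversarial_labels(labels, num_classes, seed=0):
    """Return one new label per trigger, each different from the original label."""
    rng = random.Random(seed)
    relabelled = []
    for y in labels:
        # A nonzero offset modulo num_classes can never map y back to itself.
        offset = rng.randrange(1, num_classes)
        relabelled.append((y + offset) % num_classes)
    return relabelled

trigger_labels = [3, 1, 4, 1, 5, 9, 2, 6]           # hypothetical labels y_n
spoiled_labels = adversarial_labels(trigger_labels, num_classes=10)
print(list(zip(trigger_labels, spoiled_labels)))
```
      </preformat>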
      <p>For the weight-based watermarking schemes, key reveals
the places where information is hidden. So the adversary
only has to replace these parameters (which are usually a
small part of the entire model) with random values.</p>
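      <p>A minimal Python sketch of this replacement, assuming the model parameters are flattened into a list and that key exposes the indices of the watermarked positions. The helper name spoil_weights, the index list, and the Gaussian replacement values are all hypothetical choices made for illustration.</p>
      <preformat>
```python
import random

def spoil_weights(weights, key_positions, seed=0, scale=0.1):
    """Copy the weights and overwrite the keyed positions with random values."""
    rng = random.Random(seed)
    spoiled = list(weights)               # leave the original list untouched
    for idx in key_positions:
        spoiled[idx] = rng.gauss(0.0, scale)
    return spoiled

weights = [0.5, -0.25, 0.75, 1.0, -1.5]   # toy flattened parameters
key_positions = [1, 3]                    # hypothetical positions revealed by key
print(spoil_weights(weights, key_positions))
```
      </preformat>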
      <p>For hybrid white-box watermarking schemes with a
complex verify module such as MTLSign, the adversary has
to tune the watermarking branch to fit shuffled labels with
the backend fixed. The loss function to be minimized can be
written as:</p>
      <p>$$L(W_{\mathrm{backbone}}) = \sum_{(x, y) \in D_{\mathrm{WM}}^{\mathrm{key}}} l_{\mathrm{WM}}\left(y', c_{\mathrm{WM}}(M(x \mid W_{\mathrm{backbone}}))\right),$$
in which $y'$ is a randomly assigned label independent from
y. This attack usually results in a large-scale shift of the
parameters within the backbone DNN. If the adversary cannot
properly fine-tune the model afterward (which is always the
case in practice since otherwise the adversary would have
already acquired enough data and can train its DNN from
scratch) then the DNN’s SOTA performance is at risk as
demonstrated in the empirical studies.</p>
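      <p>The structure of this loss, in which only the backbone parameters are updated while the watermarking backend stays fixed, can be sketched on a toy problem. In the Python sketch below the backbone is a single linear map, the backend is a fixed linear head followed by a sigmoid, and the per-sample loss is binary cross-entropy; these modelling choices, the sizes, the learning rate, and all names are assumptions for illustration and do not reproduce the MTLSign architecture.</p>
      <preformat>
```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def head_logit(W, head, x):
    """Fixed head applied to the backbone features M(x | W); both are linear here."""
    features = [sum(w * x_i for w, x_i in zip(row, x)) for row in W]
    return sum(h * f for h, f in zip(head, features))

def spoil_loss(W, head, data):
    """Sum over the watermark set of the cross-entropy against the labels y'."""
    total = 0.0
    for x, y_new in data:
        prob = sigmoid(head_logit(W, head, x))
        total -= y_new * math.log(prob) + (1 - y_new) * math.log(1 - prob)
    return total

def spoil_step(W, head, data, lr=0.05):
    """One gradient step on the backbone W only; the head is never modified."""
    grad = [[0.0] * len(W[0]) for _ in W]
    for x, y_new in data:
        err = sigmoid(head_logit(W, head, x)) - y_new   # d(loss)/d(logit)
        for j, h in enumerate(head):
            for i, x_i in enumerate(x):
                grad[j][i] += err * h * x_i
    return [[w - lr * g for w, g in zip(row, g_row)] for row, g_row in zip(W, grad)]

rng = random.Random(0)
W = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]   # toy backbone
head = [rng.uniform(-1, 1) for _ in range(3)]                    # fixed backend
inputs = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(8)]
data = [(x, rng.randint(0, 1)) for x in inputs]                  # random labels y'

before = spoil_loss(W, head, data)
for _ in range(200):
    W = spoil_step(W, head, data)
after = spoil_loss(W, head, data)
print(before, after)
```
      </preformat>
      <p>Because every update lands in the backbone, the same parameters that serve the primary task are shifted, which is the mechanism behind the performance risk described above.</p>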
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
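          <string-name><surname>Zhang</surname>, <given-names>J.</given-names></string-name>;
          <string-name><surname>Gu</surname>, <given-names>Z.</given-names></string-name>;
          <string-name><surname>Jang</surname>, <given-names>J.</given-names></string-name>;
          <string-name><surname>Wu</surname>, <given-names>H.</given-names></string-name>;
          <string-name><surname>Stoecklin</surname>, <given-names>M. P.</given-names></string-name>;
          <string-name><surname>Huang</surname>, <given-names>H.</given-names></string-name>;
          and
          <string-name><surname>Molloy</surname>, <given-names>I.</given-names></string-name>
          <year>2018</year>
          .
          <article-title>Protecting intellectual property of deep neural networks with watermarking</article-title>
          .
          <source>In Proceedings of the 2018 on Asia Conference on Computer and Communications Security</source>
          ,
          <fpage>159</fpage>
          -
          <lpage>172</lpage>
          .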
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name><surname>Zhu</surname>, <given-names>R.</given-names></string-name>;
          <string-name><surname>Zhang</surname>, <given-names>X.</given-names></string-name>;
          <string-name><surname>Shi</surname>, <given-names>M.</given-names></string-name>;
          and
          <string-name><surname>Tang</surname>, <given-names>Z.</given-names></string-name>
          <year>2020</year>
          .
          <article-title>Secure neural network watermarking protocol against forging attack</article-title>
          .
          <source>EURASIP Journal on Image and Video Processing</source>
          ,
          <volume>2020</volume>
          (<issue>1</issue>):
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>