<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Digital Watermarking Scheme of Aerial Video Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Margarita N. Favorskaya</string-name>
          <email>favorskaya@sibsau.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vladimir V. Buryachenko</string-name>
          <email>buryachenko@sibsau.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Konstantin A. Gusev</string-name>
          <email>kgusev@sibsau.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Reshetnev Siberian State University of Science and Technology</institution>
          ,
          <addr-line>Krasnoyarsk, Russian Federation</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>The paper describes a classification of methods for digital watermarking of video sequences, as well as a classification of Internet attacks, which are divided into intentional and accidental ones. The approach to multilevel protection is based on embedding fragile and informative watermarks. The informative watermark, which contains flight information and may be encrypted, is embedded into the textural regions of a frame. The developed method for embedding and extracting digital watermarks is invariant to global and local geometric distortions.</p>
        <p>Digital watermarking of video materials has recently become increasingly important due to the great volume of multimedia data transmitted through unprotected Internet networks. Digital watermarking implies the embedding of hidden digital watermarks in the form of images or textual information, depending on the task being solved. The goals of digital watermarking also differ, and various algorithms for embedding and extracting the watermarks are used depending on the purpose. The paper considers digital watermarking of aerial video materials in order to provide copyright protection. A more complicated task is embedding annotation results in video materials, for example the trajectory coordinates during surveillance of an object of interest, the number of moving objects, or the detection of a forest wildfire. The structure of this paper is as follows. Section 2 briefly reviews the classification of watermarking methods. Section 3 describes multilevel protection of aerial video data. The proposed method for embedding and extraction of watermarks in aerial videos is given in Section 4. The experimental results are reported in Section 5. Section 6 concludes the paper.</p>
      </abstract>
      <kwd-group>
        <kwd>digital watermarking scheme</kwd>
        <kwd>aerial video data</kwd>
        <kwd>copyright protection</kwd>
        <kwd>security</kwd>
        <kwd>capacity</kwd>
        <kwd>robustness</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>attack), and robust (generally preserving their original appearance, although the degree of uniform degradation depends on the unknown
parameters of the attack).</p>
      <p>The watermarks are embedded in the spatial or frequency domain of an image or frame. Spatial methods are
simpler to implement, but they make the watermarks more visible to a human and are less robust
than frequency methods. Examples of spatial methods include the least significant bit method, the difference
of pixel values, histogram shifting, and methods based on bit planes, quantization,
patterns, and modulation. Frequency methods demonstrate greater robustness
to attacks and more reliable hiding of the embedded information. However, the volume of information they can embed is
significantly lower. Frequency methods for embedding hidden information are based on transforms (discrete
Fourier transform, discrete cosine transform, polar harmonic transform, discrete wavelet transform, complex wavelet
transform, discrete curvelet transform, discrete shearlet transform), as well as on moments (Zernike moments, pseudo-Zernike
moments, Chebyshev moments).</p>
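      <p>As an illustration of the simplest spatial method named above, the following minimal sketch writes watermark bits into the least significant bits of pixel intensities. It is not the scheme proposed in this paper; the function names <monospace>embed_lsb</monospace> and <monospace>extract_lsb</monospace>, and the representation of a frame as a flat list of 8-bit intensities, are assumptions made for the example.</p>
      <preformat preformat-type="code">
```python
def embed_lsb(pixels, bits):
    """Return a copy of pixels whose first len(bits) values carry the bits in their LSB."""
    if len(bits) > len(pixels):
        raise ValueError("not enough pixels to hold the watermark bits")
    marked = list(pixels)
    for index, bit in enumerate(bits):
        # Clear the least significant bit, then write the watermark bit into it.
        marked[index] = (marked[index] // 2) * 2 + bit
    return marked


def extract_lsb(pixels, count):
    """Read the first count watermark bits back from the least significant bits."""
    return [value % 2 for value in pixels[:count]]


frame = [120, 121, 200, 255, 0, 37]   # hypothetical 8-bit intensities
watermark = [1, 0, 1, 1]              # hypothetical watermark bits
marked = embed_lsb(frame, watermark)
recovered = extract_lsb(marked, len(watermark))
```
      </preformat>
      <p>Each intensity changes by at most one level, which keeps the mark hard to see, but any processing that perturbs the low-order bits (noise, filtering, lossy compression) is likely to destroy it. This is consistent with the lower robustness of spatial methods noted above.</p>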
      <p>Blind and non-blind watermarking schemes are also possible. A blind watermarking scheme supposes that
the host image or frame is not transmitted through the channel; only a secret key is transmitted. In this case, the
algorithms for watermark extraction and for estimating the quality of its reconstruction become significantly more complicated.</p>
      <p>
        It should be noted that image watermarking schemes have been studied more extensively than video watermarking
schemes. If a video sequence is compressed according to one of the existing standards, then the embedding
algorithms become more complicated due to the limitations of the compression standards. Another new area of digital
watermarking, which has strong limitations on the volume of embedded information and is characterized by a large
number of attacks, is the watermarking of 3D visual objects. Whenever possible, the task of 3D digital watermarking is
reduced to the task of 2D digital watermarking [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>Multilevel Protection of Aerial Video Data</title>
      <p>Multilevel protection is increasingly used in practice in response to more sophisticated Internet attacks applied to
multimedia content. There is a great variety of attacks on video materials.
The attacks can be intentional or accidental, as shown in table 1. Intentional attacks are aimed at
distorting a whole video sequence or a single frame. They are divided into common image
processing attacks and geometric attacks. Accidental attacks, in contrast, involve common image processing only.</p>
      <p>
        Frame dropping means the removal of one or several frames from the watermarked video sequence. Frame averaging
distorts the motion in a scene. Frame swapping changes the order of frames. If the number of removed, averaged, or
swapped frames is large, then the quality of the video sequence becomes low. Several attacks, such as MPEG/JPEG
compression, color distortion, contrast distortion, and noise adding, are applied both to a whole video sequence and to
separate frames. The copying attack is used to fake frames based on a textual analysis [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. A single frame can be distorted
randomly by an affine transform (rotation, scaling, and shift) and also by flipping, cropping, or local random bending.
Composition attacks imply the simultaneous application of several types of attacks to a frame. Also, the geometric attacks
can be global or local. Glossy copying, MPEG compression, and change of frame rate or resolution are ordinary
accidental attacks.
      </p>
      <p>It should be noted that random manipulations of video sequences are a very simple editing process. At the same
time, restoring a distorted watermarked video sequence is a problem because the parameters of the
distortion are unknown. At present, no existing blind watermarking method can withstand most of the widespread types of
attacks, while non-blind watermarking requires a re-transmission of the original video sequence.</p>
      <table-wrap>
        <label>Table 1</label>
        <table>
          <tbody>
            <tr>
              <td rowspan="4">Intentional attacks</td>
              <td rowspan="2">Against video sequence</td>
              <td>Common image processing</td>
              <td>Frame dropping, frame averaging, frame swapping, MPEG compression, color distortions, contrast distortions, noise adding</td>
            </tr>
            <tr>
              <td>Geometric</td>
              <td>Cropping, random bending</td>
            </tr>
            <tr>
              <td rowspan="2">Against single frame</td>
              <td>Common image processing</td>
              <td>Median filtering, blurring, copying, JPEG compression, color distortions, contrast distortions, noise adding</td>
            </tr>
            <tr>
              <td>Geometric</td>
              <td>Cropping, rotation, scaling, translation, flipping, random bending, composition</td>
            </tr>
            <tr>
              <td>Accidental attacks</td>
              <td colspan="2">Common image processing</td>
              <td>Glossy copying, change of frame rate, change of resolution, MPEG compression</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-2-24">
        <p>Therefore, blind watermarking schemes are being developed towards multilevel protection and the use of
video content transforms that are invariant to several types of attacks.</p>
        <p>
          We apply three levels of frame protection. At the first level, a fragile watermark (visible, semi-visible, or
invisible) WMFR is embedded in a predetermined region. Usually, a fragile watermark is a company logotype. For
its embedding, it is reasonable to use the discrete Hadamard transform, which does not require high computational costs
[
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Note that most manipulations of the content lead to partial or full destruction of this watermark.
        </p>
        <p>
          The second level of protection is the embedding of the main watermark, for example one carrying the flight information
WMFI, using one of the frequency transforms that is invariant to most of the expected attacks. If the aerial video data
are not compressed, then it is reasonable to apply the discrete wavelet transform or the discrete shearlet transform with a
singular value decomposition [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. The selection of regions for embedding the hidden information is also significant.
The main recommendations are to choose highly textured regions, which do not attract
human attention, and regions where the blue component prevails, because human vision is less sensitive to
this wavelength range [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
        <p>
          The third level of protection is the encryption of the main watermark before its embedding in the video content. If the main
watermark is an image, then reversible chaotic transforms are usually applied. Among them, the Arnold transform
is widely used [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. The Arnold transform is a periodic reversible mapping. The number of iterations after which
the initial image reappears is called the Arnold period. The chosen number of iterations is written in a
secret key. Applying the Arnold transform to the encrypted image the required number of times (the Arnold period minus the value in the
secret key) leads to the full reconstruction of the initial image. Such a procedure is called scrambling. If the
main watermark is digital data, then we can apply a typical text encryption procedure (substitution,
permutation), whose parameters are also written in the secret key.
        </p>
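        <p>The scrambling procedure described above can be sketched as follows. The sketch assumes the commonly used form of the Arnold transform on an N×N image, in which pixel (x, y) moves to ((x + y) mod N, (x + 2y) mod N); the paper does not state which variant is used, and the function names and the 4×4 test image are hypothetical.</p>
        <preformat preformat-type="code">
```python
def arnold_step(image):
    """Apply one Arnold iteration to a square image given as nested lists."""
    n = len(image)
    scrambled = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            # Pixel (x, y) moves to ((x + y) mod n, (x + 2y) mod n).
            scrambled[(x + 2 * y) % n][(x + y) % n] = image[y][x]
    return scrambled


def arnold(image, iterations):
    """Apply the Arnold transform the given number of times."""
    for _ in range(iterations):
        image = arnold_step(image)
    return image


def arnold_period(n):
    """Smallest number of iterations that restores any n x n image (n at least 2)."""
    a, b, c, d = 1, 1, 1, 2   # entries of the map matrix [[1, 1], [1, 2]]
    period = 1
    while (a % n, b % n, c % n, d % n) != (1, 0, 0, 1):
        # Multiply the current matrix power by [[1, 1], [1, 2]] modulo n.
        a, b, c, d = (a + b) % n, (a + 2 * b) % n, (c + d) % n, (c + 2 * d) % n
        period += 1
    return period


size = 4
watermark = [[row * size + col for col in range(size)] for row in range(size)]
key = 2   # hypothetical number of iterations stored in the secret key
encrypted = arnold(watermark, key)
restored = arnold(encrypted, arnold_period(size) - key)
```
        </preformat>
        <p>Decryption here reuses the forward transform: applying it (period minus key) more times completes one full period, so no inverse mapping is needed. The period depends on the image size, so the receiver must know the watermark dimensions as well as the key.</p>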
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Method for Embedding and Extraction of Watermarks</title>
      <p>The main processes of the digital watermarking scheme are listed below:</p>
      <list list-type="bullet">
        <list-item>
          <p>Process GN: preparation of the watermark containing flight information WMFI (converted to the required
format and encrypted if necessary), the fragile watermark WMFR, and the secret key K. If any
event, for example object surveillance, signs of an ecological disaster, or a forest wildfire, is detected using additional
software tools, then an event watermark WMEV is also formed.</p>
        </list-item>
        <list-item>
          <p>Process EM: embedding of the watermarks in the preliminarily selected regions of a host image (frame).</p>
        </list-item>
        <list-item>
          <p>Process EX: extraction of all watermarks from an image after its transmission through Internet networks,
using the secret key K.</p>
        </list-item>
        <list-item>
          <p>Process RC: quality estimation of the reconstructed watermarks and reconstruction of the watermarked image
if necessary.</p>
        </list-item>
      </list>
      <p>
        Each process has its own characteristics and deserves separate consideration. The processes of embedding and
extraction are reversible. However, the embedding process is the more significant one in terms of information
hiding and robustness to Internet attacks. We propose an original method of adaptive watermarking based on
feature points, which is robust to global and local geometric attacks. It is well known that feature points are robust
to affine transforms. If a function describing the neighborhood of a feature point on a unit circle is transformed into a
function invariant to rotations (for example, using exponential moments), then the coordinates of this feature point
can be embedded in this neighborhood. We recommend applying this procedure to a restricted number of feature
points, fewer than 10, uniformly distributed in a frame. This allows us to calculate the parameters of the
affine transform and normalize the image before extraction of the watermarks. Note that the fragile watermark is
checked at the beginning of the extraction process. If the fragile watermark has not changed, then it is assumed that no
Internet attacks have been applied. For fragile watermark embedding, we apply the discrete Hadamard transform [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ],
while the main watermarks are embedded using the discrete wavelet transform. A blind watermarking scheme is utilized.
After compensation of the global geometric distortions, the corresponding feature points are analyzed for
local geometric distortions. In the case of local geometric distortions, we apply bicubic interpolation in order to
increase the quality of the frame after watermark extraction.
      </p>
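      <p>To illustrate how the parameters of an affine transform can be calculated from feature points, the sketch below fits x' = a·x + b·y + c and y' = d·x + e·y + f by least squares to pairs of original and observed coordinates. It assumes that the original coordinates have already been read from the data embedded in the feature-point neighborhoods; feature detection, exponential moments, and the extraction itself are outside its scope. All function names, the sample points, and the choice of a pure-Python solver based on normal equations are assumptions made for the example, not the authors' implementation.</p>
      <preformat preformat-type="code">
```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(m)
    if abs(d) == 0.0:
        raise ValueError("feature points are collinear or too few")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def estimate_affine(original_pts, observed_pts):
    """Least-squares fit; returns the rows (a, b, c) and (d, e, f)."""
    normal = [[0.0] * 3 for _ in range(3)]
    rhs_x = [0.0] * 3
    rhs_y = [0.0] * 3
    for (x, y), (u, v) in zip(original_pts, observed_pts):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                normal[i][j] += row[i] * row[j]   # accumulate A^T A
            rhs_x[i] += row[i] * u                # accumulate A^T u
            rhs_y[i] += row[i] * v                # accumulate A^T v
    return solve3(normal, rhs_x), solve3(normal, rhs_y)


# Hypothetical attack: rotation by 15 degrees, scaling by 1.15, and a shift.
theta = math.radians(15.0)
scale = 1.15
true_x = (scale * math.cos(theta), -scale * math.sin(theta), 4.0)
true_y = (scale * math.sin(theta), scale * math.cos(theta), -3.0)
original = [(10.0, 8.0), (50.0, 12.0), (30.0, 40.0), (15.0, 35.0), (45.0, 44.0)]
observed = [(true_x[0] * x + true_x[1] * y + true_x[2],
             true_y[0] * x + true_y[1] * y + true_y[2]) for x, y in original]
row_x, row_y = estimate_affine(original, observed)
```
      </preformat>
      <p>Three non-collinear points determine the six parameters exactly; using a few more, as with the fewer than 10 points recommended above, lets the least-squares fit average out small localization errors. Inverting the estimated transform then gives the normalization applied before watermark extraction.</p>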
    </sec>
    <sec id="sec-4">
      <title>Experimental Results</title>
      <p>For the experiments, 12 video sequences captured by a DJI Mavic Pro drone under different shooting conditions [9]
were employed. The main parameters of some video sequences from this dataset are given in table 2.</p>
      <p>
        In each of the video sequences listed in table 2, we embedded the semi-visible fragile watermark and a watermark
in the form of a logotype, and then estimated the quality of the reconstructed watermark after simulating different types of
attacks. For quality estimation, the Peak Signal-to-Noise Ratio (PSNR) and Normalized Correlation Coefficient (NCC)
metrics were used [
        <xref ref-type="bibr" rid="ref9">10</xref>
        ].
      </p>
      <p>PSNR values are calculated by the following equation:
<disp-formula><tex-math>\mathrm{PSNR} = 10 \log_{10} \left( \frac{MAX_I^2}{MSE} \right),</tex-math></disp-formula>
where MAX<sub>I</sub> is the maximum possible pixel value of the frame and MSE is the mean squared error between the original
and watermarked frames:
<disp-formula><tex-math>MSE = \frac{1}{m n} \sum_{i=1}^{m} \sum_{j=1}^{n} \left[ I(i, j) - I_w(i, j) \right]^2,</tex-math></disp-formula>
where m and n are the width and height of the frame, respectively, and I and I<sub>w</sub> are the intensity values of the original frame
and the watermarked frame at coordinates (i, j), respectively.</p>
      <p>The larger the PSNR value, the smaller the losses during the embedding process.</p>
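      <p>The PSNR and MSE definitions above translate directly into code. The following is a minimal sketch, assuming grayscale frames stored as nested lists of intensities and a default maximum value of 255 for 8-bit data; the function names and sample frames are hypothetical.</p>
      <preformat preformat-type="code">
```python
import math


def mse(original, watermarked):
    """Mean squared error between two frames of equal size."""
    total = 0.0
    count = 0
    for row_o, row_w in zip(original, watermarked):
        for a, b in zip(row_o, row_w):
            total += (a - b) ** 2
            count += 1
    return total / count


def psnr(original, watermarked, max_value=255):
    """PSNR in decibels; infinite when the frames are identical."""
    error = mse(original, watermarked)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


host = [[100, 100], [100, 100]]     # hypothetical original frame
marked = [[101, 100], [100, 99]]    # hypothetical watermarked frame
frame_mse = mse(host, marked)
frame_psnr = psnr(host, marked)
```
      </preformat>
      <p>Identical frames give a zero MSE, for which the ratio is undefined; the sketch returns infinity in that case rather than raising an error.</p>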
      <p>The quality of watermark reconstruction can be estimated using the NCC metric:
<disp-formula><tex-math>NCC = \frac{\sum_{i=1}^{k} \sum_{j=1}^{l} \left( w(i, j) - \mu_w \right) \left( w'(i, j) - \mu_{w'} \right)}{\sqrt{\sum_{i=1}^{k} \sum_{j=1}^{l} \left( w(i, j) - \mu_w \right)^2 \sum_{i=1}^{k} \sum_{j=1}^{l} \left( w'(i, j) - \mu_{w'} \right)^2}},</tex-math></disp-formula>
where k and l are the width and height of the watermark, respectively, w(i, j) and w′(i, j) are the intensity values of
the original and reconstructed watermarks, respectively, and μ<sub>w</sub> and μ<sub>w′</sub> are the mean values of the original and
reconstructed watermarks, respectively.</p>
      <p>The NCC values lie in the range [–1, +1]. A value close to 1 indicates a high degree of correlation between the watermarks.
A value close to 0 means strong differences between the reconstructed and original watermarks, caused by the
negative impact of attacks on the video sequence.</p>
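      <p>A minimal sketch of the NCC computation is given below, assuming the zero-mean normalized form with a square root in the denominator, which is the form whose values lie in [–1, +1]. The watermarks are assumed to be nested lists of intensities; the function name and the sample watermarks are hypothetical.</p>
      <preformat preformat-type="code">
```python
import math


def ncc(original, reconstructed):
    """Normalized correlation coefficient between two watermarks of equal size."""
    a = [value for row in original for value in row]
    b = [value for row in reconstructed for value in row]
    if len(a) != len(b) or not a:
        raise ValueError("watermarks must be non-empty and of equal size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    numerator = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denominator = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                            * sum((y - mean_b) ** 2 for y in b))
    if denominator == 0:
        raise ValueError("NCC is undefined for a constant watermark")
    return numerator / denominator


logo = [[0, 255], [255, 0]]        # hypothetical original watermark
damaged = [[0, 255], [255, 255]]   # one pixel changed by an attack
inverted = [[255, 0], [0, 255]]    # every pixel inverted
score_same = ncc(logo, logo)
score_damaged = ncc(logo, damaged)
score_inverted = ncc(logo, inverted)
```
      </preformat>
      <p>Because the means are subtracted, a uniform brightness shift of the reconstructed watermark does not change the score, while an inverted watermark gives a value close to –1 rather than 0.</p>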
      <p>Examples of attacks applied to the video sequence Creux du Van Flight.mp4 are depicted in figure 1. The attacks
simulated typical distortions, such as noise adding (salt-and-pepper and Gaussian noise), contrast distortions,
blurring, median filtering, JPEG compression, scaling, rotation, and cropping.</p>
      <p>Table 3 shows the quality estimates of the watermarks extracted from aerial video sequences of different quality, viz.
Creux du Van Flight.mp4, Bluemlisal Flyover.mp4, and Berghouse Leopard.mp4. The best results were obtained for the
video sequence Berghouse Leopard.mp4, which is explained by the good shooting quality and the simple structure of the scene.
In addition, the background in this video sequence contains highly textured regions that are suitable for better embedding.</p>
      <table-wrap>
        <label>Table 3</label>
        <table>
          <thead>
            <tr>
              <th>Types of attacks</th>
            </tr>
          </thead>
          <tbody>
            <tr><td>No attack</td></tr>
            <tr><td>Rotation (15°)</td></tr>
            <tr><td>Salt and Pepper (0.01)</td></tr>
            <tr><td>Gaussian noise (0.01)</td></tr>
            <tr><td>Intensity correction (1.2)</td></tr>
            <tr><td>Blurring on motion (10, 45)</td></tr>
            <tr><td>Intensity distortion, Gaussian noise, blurring</td></tr>
            <tr><td>Median filtering (3×3)</td></tr>
            <tr><td>Gaussian noise (0.01) and median filtering (3×3)</td></tr>
            <tr><td>Scaling (1.15)</td></tr>
            <tr><td>Cropping (25%)</td></tr>
          </tbody>
        </table>
        <table-wrap-foot>
          <p>Recoverable cell values: 0.93, 0.91 (row and column not recoverable).</p>
        </table-wrap-foot>
      </table-wrap>
      <sec id="sec-4-3">
        <p>Geometric distortions such as scaling, cropping, and rotation have the greatest impact on the quality of the extracted
watermarks: around 19% of the information is lost. At the same time, the algorithm provides high robustness to common
image processing (noise, blurring, and compression). In these cases, the losses are less than 10%.</p>
        <p>In this research, we propose a method for embedding hidden and fragile digital watermarks in the frames of
video sequences. The method can be applied to digital watermarking of aerial videos captured by the cameras of
unmanned aerial vehicles and drones in order to protect the data or embed additional data, for example flight
information. We developed an algorithm that provides a high level of data protection using watermark encryption and
an embedded fragile watermark, which allows us to obtain information about Internet attacks. The conducted
experiments simulating intentional and accidental attacks show high robustness of the digital watermarks to
geometric transforms and other types of attacks that are possible during the transmission of video materials.</p>
        <p>Acknowledgements. The reported study was funded by the Russian Foundation for Basic Research according to the
research project No. 19-07-00047.</p>
        <p>[9] Drone Videos DJI Mavic Pro Footage in Switzerland. https://www.kaggle.com/kmader/drone-videos</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Cheddad</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Condell</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Curran</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mc Kevitt</surname>
            <given-names>P.</given-names>
          </string-name>
          <article-title>Digital image steganography: survey and analysis of current methods</article-title>
          // <source>Signal Processing</source>.
          <year>2010</year>
          . Vol.
          <volume>90</volume>
          , No. 3. P.
          <fpage>727</fpage>
          -
          <lpage>752</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Shih</surname>
            <given-names>F.Y.</given-names>
          </string-name>
          <source>Digital Watermarking and Steganography: Fundamentals and Techniques</source>
          . 2nd edn. Boca Raton, London, New York: CRC Press,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Favorskaya</surname>
            <given-names>M.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Savchina</surname>
            <given-names>E.I.</given-names>
          </string-name>
          <article-title>Digital watermarking of 3D medical visual objects</article-title>
          // <source>Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci.</source>, XLII-2/W12,
          <year>2019</year>
          . P.
          <fpage>61</fpage>
          -
          <lpage>67</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Lu</surname>
            <given-names>C.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hsu</surname>
            <given-names>C.Y.</given-names>
          </string-name>
          <article-title>Near-optimal watermark estimation and its countermeasure: antidisclosure watermark for multiple watermark embedding</article-title>
          // <source>IEEE Trans. Circuits and Systems for Video Technology</source>.
          <year>2007</year>
          . Vol.
          <volume>17</volume>
          , No. 4. P.
          <fpage>454</fpage>
          -
          <lpage>467</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Favorskaya</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Savchina</surname>
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Popov</surname>
            <given-names>A.</given-names>
          </string-name>
          <article-title>Adaptive visible image watermarking based on Hadamard transform</article-title>
          // <source>IOP Conference Series: Materials Science and Engineering</source>, MIST Aerospace,
          <year>2018</year>
          . Vol.
          <volume>450</volume>
          :
          <elocation-id>052003</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Favorskaya</surname>
            ,
            <given-names>M.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jain</surname>
            ,
            <given-names>L.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Savchina</surname>
            ,
            <given-names>E.I.</given-names>
          </string-name>
          <article-title>Perceptually tuned watermarking using non-subsampled shearlet transform</article-title>
          // <source>Computer Vision in Control Systems-3</source>: Springer International Publishing Switzerland.
          <year>2018</year>
          . ISRL, Vol.
          <volume>136</volume>
          . P.
          <fpage>41</fpage>
          -
          <lpage>69</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Favorskaya</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pyataeva</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Popov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <article-title>Texture analysis in watermarking paradigms</article-title>
          // Procedia Computer Science.
          <year>2017</year>
          . Vol.
          <volume>112</volume>
          . P.
          <fpage>1460</fpage>
          -
          <lpage>1469</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Arnol'd</surname>
            ,
            <given-names>V.I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Avez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <article-title>Ergodic problems of classical mechanics</article-title>
          .
          <source>Mathematical physics monograph series</source>
          . New York, Benjamin,
          <year>1968</year>
          . 286 P.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Sachin</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vinay</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <article-title>A RDWT and Block-SVD based dual watermarking scheme for digital images</article-title>
          // <source>Int. J. Advanced Computer Science and Applications (IJACSA)</source>.
          <year>2017</year>
          . Vol.
          <volume>8</volume>
          , No. 4. P.
          <fpage>211</fpage>
          -
          <lpage>219</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>