<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>COLINS-</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Redistribution of the Compressed Data Between Modified DEFLATE-Blocks in the Image Compression Process Without Losses</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andrii Bomba</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander Shportko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Veronika Postolatii</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Academician Stepan Demianchuk International University of Economics and Humanities</institution>
          ,
          <addr-line>4, Acad. S. Demianchuk Str, 33000, Rivne</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National University of Water and Environmental Engineering</institution>
          ,
          <addr-line>11, Soborna Str, 33028, Rivne</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>National University “Lviv Polytechnic”</institution>
          ,
          <addr-line>12, St. Bandera Str, 79000, Lviv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>8</volume>
      <fpage>12</fpage>
      <lpage>13</lpage>
      <abstract>
        <p>The expediency of each stage of lossless image compression is substantiated. The advantages and disadvantages of the DEFLATE data compression format are described. The feasibility of using arithmetic coding instead of Huffman compression in modifications of this format is argued. The principles of redistributing compressed data between DEFLATE-blocks in order to increase their homogeneity and thereby reduce data compression coefficients are presented and substantiated. A software implementation of this redistribution of compressed data between DEFLATE-blocks in the C++ language is given. On the commonly known ACT test set, applying the proposed redistribution of coded data between modified DEFLATE-blocks in the process of lossless progressive hierarchical compression reduces the total size of its compressed images by 3 KB and speeds up decoding by an average of 2.14%, although it slows down encoding by 6.48%.</p>
      </abstract>
      <kwd-group>
        <kwd>Lossless image compression</kwd>
        <kwd>DEFLATE compressed data format</kwd>
        <kwd>LZ77 dictionary compression algorithm</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Today, files transmitted by communication channels most often contain data of one of four main
types: texts, images, video or sound. And if 4 KB is enough to store one page of unformatted text,
then a photo-realistic raster image of 10×15 cm size without compression can take several MB,
song recordings - tens of MB, and movie recordings - several GB. Fortunately, most of these files
have a high level of redundancy. It is the reduction of redundancy that allows you to compress
files and thereby increase the speed of information exchange over the network and reduce the
amount of disk space used.</p>
      <p>Most of the well-known algorithms that allow you to compress images dozens of times (JPEG
[1][2], fractal and wavelet transforms [3]) lead to slight loss of image quality. Such losses are
invisible for photorealistic high-resolution images, but affect the quality of low-resolution
images with fragments of business graphics or discrete-tone transitions. In addition, there are
classes of images for which distortions are unacceptable (for example, X-rays). Algorithms for
image compression without loss [4], although they compress images much weaker, do not lead
to deterioration of their quality. Currently, lossy image compression algorithms are developing
most dynamically, although lossless image compression algorithms are also gradually
increasing their compression rates. Therefore, in the era of information civilization, when most
information is stored electronically, the problem of lossless image compression by reducing
redundancy remains relevant now and will be relevant in the future.</p>
    </sec>
    <sec id="sec-1-1">
      <title>2. Related works. Stages of lossless image compression</title>
      <p>
As it is known, any data compression is possible precisely due to the reduction or elimination of
redundancies [5]. Three main types of redundancies are distinguished in images [6]: visual
(consisting in the presence of information that is not perceived by the human visual system),
inter-element or spatial (manifested in the correlation of brightness of adjacent pixels or
components of the color model) and code (revealed when using codes of the same length for
items with different probabilities). The more redundancies of each type are processed
in a graphic format, the more effective the compression is. In the process of lossless
compression, no information is lost, so the first type of redundancy is not reduced.</p>
      <p>To reduce redundancies of the second and third types in archivers and graphic formats
lossless compression of images most often occurs in a maximum of four stages:
1. At the first stage, context-dependent coding reduces redundancies between the same
fragments or fragments with the same structure (reduces inter-element redundancy, can be
also used before the fourth stage). For example, the popular context-dependent algorithm
LZ77 [7] is based on the replacement in the output stream of the sequence of consecutive
literals of the buffer with a reference to a similar pre-encoded sequence of dictionary literals
in the form of a pair of numbers &lt;length, offset from the end of the dictionary&gt; in the process
of coding. If there is no similar sequence of literals in the dictionary longer than two literals,
the first literal of the buffer is transferred to the output stream without changes. After that,
the encoded literals are transferred from the beginning of the buffer to the end of the
dictionary and the encoding continues similarly until the end of the literals of the input
stream. An example of the step-by-step formation of the LZ77 algorithm schedule for the
stream 3, 4, 6, 3, 4, 6, 3, 2, 6, 3, 4, 4 from [8] is given in Table 1. In encoded form this
stream will be written as 3, 4, 6, &lt;4, 3&gt;, 2, &lt;3, 6&gt;, 4;
2. At the second stage, the transition to an alternative color model is performed. For
example, in a popular archiver WinRAR [9] before lossless image compression, the transition
from the RGB color model to the R-G, G, B-G color model is performed;
3. On the third, the brightness of the pixel components is transformed using predictors
[10], which during image traversal predict the brightness value of each component of the
next pixel (for example, for the most common 24-bit images, these are the brightness of the
red, green, and blue components, written as integers in separate bytes), using the brightness
values of the same components of previously processed adjacent pixels. In the process of
using this approach, the deviations Δij of the brightness value brightnessij of the next
pixel component from the value predictij predicted by the selected predictor are
calculated and further coded, i.e.</p>
      <p>
        Δij = brightnessij − predictij
(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )
(i and j run respectively through all the rows and columns of the pixel components of the
image). Adjacent image pixels most commonly have similar colors, which means that they
also have close brightness values of the relevant components; therefore the predicted value
often matches the brightness value of the next component, is most often close to it, and only
rarely differs from it significantly (Figure 1 from [10]). That is, most of the values Δij are
close to zero. The use of predictors is similar to delta coding [2];
4. At the fourth stage, context-independent coding forms element codes with lengths that
depend on their probabilities (it processes code redundancy). Identical fragments or
fragments of the same structure targeted by context-dependent coding rarely occur in
photorealistic images, so the only universal stage of lossless image compression is
context-independent coding itself. It can even be applied directly to the luminance codes
of individual pixels instead of context-dependent codes, if this further reduces the
compression ratio (the ratio of the size of the compressed to the uncompressed image file,
expressed in bpb, hereafter CR). According to our calculations, the application of a
context-independent algorithm reduces the CR of images by 33% on average. Therefore,
let us consider the ideas of this coding in more detail.
      </p>
      <p>The main principle of context-independent coding is formulated as follows: the length of the
code of an arbitrary element with a higher probability should not exceed the length of the
code of any element with a lower probability. This principle is based on the fundamental
position of information theory, according to which, in order to minimize the length of the
sequence code, an element si with a probability of occurrence p(si) should be coded with
−log2 p(si) bits [6]. Then the average code length of a block element after applying any
context-independent algorithm cannot, according to Shannon's formula [11], be less than
the entropy of the source</p>
      <p>
        H = −Σi p(si) log2 p(si). (
        <xref ref-type="bibr" rid="ref1">1</xref>
        )
      </p>
      <p>In practice today, two alternative context-independent methods are most often used:
Huffman coding and arithmetic coding.</p>
      <p>Huffman coding (HUFF) matches each element, depending on its probability (frequency of
occurrence), with a fixed prefix code [12][13][14][15]. This coding is most often performed in
the following sequence: the probabilities (frequencies) of the individual elements are
calculated; the elements are arranged in descending order of probability; the two elements
with the lowest probabilities (most often the last ones in the created list) are iteratively
combined into one element, with the code 0 appended to the first one and 1 to the second one;
the probabilities of the selected elements are summed to calculate the probability of the
formed element, which is inserted into the sorted list of probabilities; finally, the HUFF codes
are formed by writing out the accumulated bits of each element in reverse order, from the top.</p>
      <p>The average length of the HUFF code coincides with the entropy of the source only when, for
all elements si, the lengths of their optimal codes −log2 p(si) are integers. In addition, the length
of the HUFF code of even the most probable element cannot be less than one bit.</p>
      <p>Arithmetic compression (ARIC), in contrast to HUFF coding, matches each successive
element with an interval with a length proportional to its probability (frequency) [16]
[17][18][19]. At the same time, a subinterval corresponding to the first element of the flow is
selected from the initial interval (most often [0; 1)) and further considered. This subinterval is
again divided in proportion to the frequencies of individual elements, and the subinterval
corresponding to the second element of the flow is selected from it (see, for example, Figure 2
from [20], on which the subintervals are ordered by the increasing values of the elements). The
selection of subintervals continues in the same way until the end of the flow elements. Thus, the
final length of the selected subinterval is equal to the product of the probabilities of all
elements, and its beginning depends on the sequence of these elements in the stream. The result
of ARIC is any number from the last subinterval received. Taking into account the finiteness of
the numbers, to record the resulting number, the first significant digits of the intervals are
monitored each time, and if they match, these digits are recorded in the output stream and
excluded from further consideration. Decoding of the next ARIC element is performed by
determining the corresponding next subinterval to which the read number belongs.</p>
      <p>From the description of the ARIC principles, it follows that in order to perform such
arithmetic compression, the coder and, even more so, the decoder, must know the probabilities
(frequencies) of individual elements. Today, two strategies for forming these frequencies are
widely used: static and adaptive. In the case of a static strategy, the frequencies of individual
elements are transmitted to the decoder in compressed data explicitly. When using an adaptive
strategy, the intervals of the elements are formed synchronously by the encoder and decoder in
the process of data processing. The adaptive strategy provides, as a rule, better CR (since it does
not need to store the frequencies of individual elements in the compressed data and takes into
account the unevenness of the distribution of the current sequence of elements, and not the
flow in general), but it significantly slows down the work of the decompressor, because it
requires recalculation of the probabilities of the elements in the decoding process. Graphics file
formats should, first of all, provide fast decoding, so they often use a static strategy for forming
element intervals.</p>
      <p>To speed up the development of graphic formats, data compression formats are created,
improved and used, which combine proven context-dependent and context-independent
algorithms. One of the successful examples of such a combination is the DEFLATE dictionary
compression format [21], which processes the incoming flow with the context-dependent LZ77
algorithm [7] and then compresses the results of its work with dynamic Huffman codes [12]. This
format is used in many popular archivers (for example, GZIP [22]) and graphic formats (for
example, PNG [23]) and does not require a license for use in applied software. We
use this compressed data format to develop a lossless progressive hierarchical image
compression format [8][10][20].</p>
      <p>It is clear that implementations of the decoding algorithm of LZ77 codes must distinguish
between individual literals and groups &lt;length; displacement&gt;. In the DEFLATE format, for this
purpose, the lengths of the substitutions and the individual literals of the LZ77 algorithm are
encoded together as numbers in the range [0; 285]. At the same time, in each compressed block
numbers from the range [0; 255] correspond to the codes of individual literals, 256 marks the
end of the block, and numbers from the range [257; 285] indicate the basic values of lengths.
After the basic length values, there is a number of bits defined by the format, which, together
with the basic value, uniquely determines the replacement length [21]. The offset is stored after
the corresponding replacement length similarly – in the form of the base value and additional
bits. The basic displacement value is within [0; 29]. In the DEFLATE format, the maximum
encoded sequence length can be 258, the offset can be 32768, and different dynamic codes
HUFF are used to encode the base literals/lengths and offsets.</p>
      <p>The purpose of this article is to describe and justify the algorithm for redistribution of
compressed data between modified DEFLATE blocks to improve the compression performance
of this format.</p>
    </sec>
    <sec id="sec-1-2">
      <title>3. Options for reducing the size of compressed data in the DEFLATE format</title>
      <p>
Let's first emphasize the reasons for improving compression by context-independent
algorithms of the fourth stage in cases of applying the algorithms of the second and third stages.
Let us investigate the dependence of the entropy for the case of two elements (Figure 3 and [24]).
Let the probability of the appearance of the first element be p1. Then the probability of the
second element will be p2 = 1 − p1, and the entropy will be equal to
H = −p1 log2 p1 − (1 − p1) log2(1 − p1). We see that the entropy is maximal when p1 = p2 = 0.5.
The more the probabilities of the elements differ among themselves, that is, the greater the
unevenness of the distribution is, the lower the entropy is [24]. For M different elements the
entropy is likewise maximal when the probabilities of these elements are the same, and it is
calculated by Hartley's formula [24]:</p>
      <p>
        H = log2 M.
(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )
Difference color models of the second and predictors of the third stage of lossless image
compression do not compress the image, but increase the unevenness of the brightness
distribution around zero (Figure 1b) and therefore reduce the entropy (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ), i.e. increase the code
redundancy by reducing the inter-element one.
      </p>
      <p>
        Taking into account the fact that arithmetic coding encodes more accurately those elements
whose probabilities of occurrence are not equal to powers of two [5], we modified the DEFLATE
compressed data format by applying E. Shelvin's range coder in it, the code of which can be
found, for example, in [25]. This encoder reads/writes bytes instead of bits of data and thus
significantly speeds up encoding/decoding. For each element, it uses an interval with an integer
length proportional to its probability. In order to store the length of the interval for each
element in the header of the modified DEFLATE-block of compressed data, instead of the length
of the HUFF code, we write the number of bits for storing the corresponding length of the
interval in the general interval [0; 32768), and after the header – a binary record of this length
without the first bit (which is always equal to one) [20]. The DEFLATE format uses a maximum
of 285 different elements to encode literals/lengths replacement. If there are M different
elements in the next modified DEFLATE block, then, taking into account (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ), the number of
additional bits for recording their interval lengths in the block header will be M(log2 M − 1),
if these elements are equally likely. In general, the header size of the modified DEFLATE block
will not exceed 2635 bits.
      </p>
      <p>
        The size of the compressed data of each block is estimated by the length of its entropy code [20]
LH = N·H = N log2 N − Σi Ni log2 Ni,
(
        <xref ref-type="bibr" rid="ref3">3</xref>
        )
where Ni is the frequency of the element si in the next block, and N = Σi Ni.
      </p>
      <p>At first glance, it seems that combining all the compressed data into one modified DEFLATE
block should reduce CR, because then only one header of the compressed data block needs to be
stored instead of many. But it was shown in work [20] that the combination of blocks of
compressed data does not reduce (most often – increases) the length of their entropy code. The
greater the difference between the relative frequencies of each element in the two input blocks
is, the greater the additional increase in the length of the entropy code (Figure 4 from [20]) will
be. This increase is zero only when the relative frequencies of the element in the two input
blocks match.</p>
      <p>
        Taking into account (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ), the increase in the length of the entropy code due to the
combination of two adjacent blocks will be calculated using the formula
dLH = (N′ + N″) log2(N′ + N″) − N′ log2 N′ − N″ log2 N″ −
− Σi (N′i + N″i) log2(N′i + N″i) + Σi N′i log2 N′i + Σi N″i log2 N″i,
(
        <xref ref-type="bibr" rid="ref4">4</xref>
        )
where ′ indicates the corresponding characteristic of the first block, and ″ indicates the
characteristic of the second.
      </p>
      <p>
        Therefore, to reduce the size of compressed data in the DEFLATE format, it is advisable to
implement the following options for redistributing compressed data between its blocks:
1. Combine two adjacent blocks of compressed data, if the increase in the length of the
entropy code from their combination (
        <xref ref-type="bibr" rid="ref4">4</xref>
        ) does not exceed the savings from storing the
header of the combined block instead of the two headers of these adjacent blocks;
2. Split blocks of compressed data into homogeneous blocks, if the decrease in the length
of the entropy code from such a split exceeds the increase in the size of their headers.</p>
    </sec>
    <sec id="sec-1-3">
      <title>4. Software implementation of redistribution of compressed data between modified DEFLATE-blocks</title>
      <p>
Let us first provide the text of the function in the C++ language for calculating the predicted
length of the DEFLATE entropy code of the compressed data block (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) based on the frequencies
of its elements, taking into account the predicted length of its header:
UBYTE4 countBitCodeFreqLL(UBYTE4 *masFreq, UBYTE4 freqElementMas)
{ double s = 0; UBYTE2 countElement = 0;
  for (UBYTE2 i = 0; i &lt; MaxLengthCodes; i++)   // calculation of the subtractor from (3)
    if (masFreq[i] &gt; 0)
    { countElement++;                           // number of elements with non-zero frequency
      if (masFreq[i] &gt; 1) s += masFreq[i]*log2(masFreq[i]); }
  return 70 + countElement*log2(countElement) + // predicted header length
    freqElementMas*log2(freqElementMas) - s; }  // length of the block entropy code</p>
      <p>The function for calculating the predicted length of the combination of two blocks functions
similarly:
UBYTE4 countBitCodeSumisnFreqLL(UBYTE4 *masFreq1, UBYTE4 *masFreq2,
                                UBYTE4 freqElementMas)
{ double s = 0; UBYTE2 countElement = 0;
  for (UBYTE2 i = 0; i &lt; MaxLengthCodes; i++)
    if (masFreq1[i] + masFreq2[i] &gt; 0)
    { countElement++;
      if (masFreq1[i] + masFreq2[i] &gt; 1)
        s += (masFreq1[i] + masFreq2[i])*log2(masFreq1[i] + masFreq2[i]); }
  return 70 + countElement*log2(countElement) + freqElementMas*log2(freqElementMas) - s; }</p>
      <p>Using these functions, a possible combination of two adjacent DEFLATE blocks (the first
option of redistribution) is realized by postponing the data storage of the previous block of
compressed data. If the sum of the sizes of the previous and current blocks does not exceed the
maximum allowable and the combination of blocks reduces the size of the compressed data,
then we will join the current block to the previous one. Otherwise, we will output the previous
block, and postpone the current one for further analysis:
if (countPrevElement &amp;&amp; countPrevElement + countCurrentElement &lt;= OutputBufferSize)
// combination of the previous and current DEFLATE blocks is possible
{ // calculate the frequencies of literals/lengths of substitutions of the previous block
  memset(freqPrevBD, 0, MaxLengthCodes * sizeof(UBYTE4));
  for (j = 0; j &lt; countPrevElement; j++)
    if (codePrevLL[j] &lt; 256) freqPrevBD[codePrevLL[j]]++;  // literal frequency
    else
    { LengthToCode(codePrevLL[j]-256, code, extra, value);
      freqPrevBD[code]++; }  // frequency of the base value of the replacement length
  // calculation of the size reduction from combining the previous and current blocks
  countBitEconomUnion = countBitCodeFreqLL(freqCurrentBD, countCurrentElement) +
    countBitCodeFreqLL(freqPrevBD, countPrevElement) -
    countBitCodeSumisnFreqLL(freqCurrentBD, freqPrevBD, countCurrentElement + countPrevElement); }
// we combine the previous and current blocks, if it is appropriate and possible
if (countBitEconomUnion &gt;= 0 &amp;&amp;
    countPrevElement + countCurrentElement &lt;= OutputBufferSize)
{ UBYTE2 *codeUnionLL = new UBYTE2[countPrevElement + countCurrentElement];
  UBYTE2 *codeUnionD = new UBYTE2[countPrevElement + countCurrentElement];
  memcpy(codeUnionLL, codePrevLL, countPrevElement * sizeof(UBYTE2));
  memcpy(codeUnionLL + countPrevElement, codeCurrentLL, countCurrentElement * sizeof(UBYTE2));
  memcpy(codeUnionD, codePrevD, countPrevElement * sizeof(UBYTE2));
  memcpy(codeUnionD + countPrevElement, codeCurrentD, countCurrentElement * sizeof(UBYTE2));
  UBYTE4 countUnionElement = countPrevElement + countCurrentElement;
  // after combining, we destroy the previous and current blocks
  if (countPrevElement)
  { delete [] codePrevLL; codePrevLL = NULL;
    delete [] codePrevD; codePrevD = NULL; }
  delete [] codeCurrentLL; codeCurrentLL = NULL;
  delete [] codeCurrentD; codeCurrentD = NULL;
  countCurrentElement = 0;
  // postpone the combined data as the previous block
  codePrevLL = codeUnionLL; codePrevD = codeUnionD;
  countPrevElement = countUnionElement;
  if (!lastblock) return; }  // if not the last block, then we wait for the next block
// otherwise, we output the previous block and postpone the current one</p>
      <p>To perform the division of input blocks of compressed data into homogeneous blocks (the
second variant of redistribution), we implemented the fragmentation algorithm described in
[26], for dividing the input block into homogeneous blocks with different statistical
characteristics. Having these homogeneous blocks, we iteratively choose and combine two
adjacent ones that maximally reduce the length of the compressed data as long as such
reduction is expected. Each of the remaining blocks after iterative combination is output as a
separate DEFLATE block. Here is a fragment of the program for iteratively combining
homogeneous blocks into output DEFLATE blocks:
for (i = 0; i &lt; countRBlock; i++)  // determine the predicted size of each homogeneous block
{ startPozRBlock = startRBlock[i];
  endPozRBlock = startRBlock[i+1] - 1;
  // determine the frequencies of literals/lengths of substitutions and offsets for the next block
  for (j = startPozRBlock; j &lt;= endPozRBlock; j++)
    if (codeLL[j] &lt; 256) freqLLRBlock[i][codeLL[j]]++;  // literal frequency
    else  // frequency of the replacement length and the offset
    { LengthToCode(codeLL[j]-256, code, extra, value);
      freqLLRBlock[i][code]++;  // accumulate the frequency of the literal/length of the substitution
      DistanceToCode(codeD[j], code, extra, value);
      freqDRBlock[i][code]++; }  // accumulate the offset frequency
  countBitCodeRBlock[i] = countBitCodeFreqLL(freqLLRBlock[i],
    endPozRBlock - startPozRBlock + 2); }  // predicted length, taking the block terminator into account
// determine the lengths of combinations of adjacent homogeneous blocks
for (i = 0; i &lt; countRBlock - 1; i++)
  countBitCodeSRBlock[i] = countBitCodeSumisnFreqLL(freqLLRBlock[i], freqLLRBlock[i+1],
    startRBlock[i+2] - startRBlock[i] + 1);  // also taking the block terminator code into account
// combine adjacent homogeneous blocks to improve compression
while (countRBlock &gt; 1)
{ economBit = countBitCodeRBlock[0] + countBitCodeRBlock[1] - countBitCodeSRBlock[0];
  indexMaxEconom = 0;
  for (i = 1; i &lt; countRBlock - 1; i++)
    if ((int)(countBitCodeRBlock[i] + countBitCodeRBlock[i+1] - countBitCodeSRBlock[i]) &gt; economBit)
    { economBit = countBitCodeRBlock[i] + countBitCodeRBlock[i+1] - countBitCodeSRBlock[i];
      indexMaxEconom = i; }  // index of the first of the adjacent blocks improving compression
  if (economBit &lt; 0) break;  // combining blocks no longer reduces the code length
  for (j = 0; j &lt; MaxLengthCodes; j++)  // combine the frequencies of the chosen adjacent blocks
    freqLLRBlock[indexMaxEconom][j] += freqLLRBlock[indexMaxEconom+1][j];
  // shift the frequencies of the following blocks to the left
  memcpy(freqLLRBlock + indexMaxEconom + 1, freqLLRBlock + indexMaxEconom + 2,
    (countRBlock - indexMaxEconom - 2) * sizeof(freqLL));
  for (j = 0; j &lt; MaxDistanceCodes; j++)  // process the offset frequencies of the blocks in the same way
    freqDRBlock[indexMaxEconom][j] += freqDRBlock[indexMaxEconom+1][j];
  memcpy(freqDRBlock + indexMaxEconom + 1, freqDRBlock + indexMaxEconom + 2,
    (countRBlock - indexMaxEconom - 2) * sizeof(freqD));
  // shift the beginnings of the following blocks to the left
  memcpy(startRBlock + indexMaxEconom + 1, startRBlock + indexMaxEconom + 2,
    (countRBlock - indexMaxEconom - 1) * sizeof(UBYTE4));
  // determine the size of the combined block
  countBitCodeRBlock[indexMaxEconom] = countBitCodeFreqLL(freqLLRBlock[indexMaxEconom],
    startRBlock[indexMaxEconom+1] - startRBlock[indexMaxEconom] + 1);
  memcpy(countBitCodeRBlock + indexMaxEconom + 1, countBitCodeRBlock +
    indexMaxEconom + 2, (countRBlock - indexMaxEconom - 2) * sizeof(UBYTE4));
  // it is necessary to recalculate the length of the combination with the previous block
  if (indexMaxEconom &gt; 0)
    countBitCodeSRBlock[indexMaxEconom-1] = countBitCodeSumisnFreqLL(
      freqLLRBlock[indexMaxEconom-1], freqLLRBlock[indexMaxEconom],
      startRBlock[indexMaxEconom+1] - startRBlock[indexMaxEconom-1] + 1);
  if (indexMaxEconom + 2 &lt; countRBlock)  // recalculate the combination with the next block
    countBitCodeSRBlock[indexMaxEconom] = countBitCodeSumisnFreqLL(
      freqLLRBlock[indexMaxEconom], freqLLRBlock[indexMaxEconom+1],
      startRBlock[indexMaxEconom+2] - startRBlock[indexMaxEconom] + 1);
  // shift the lengths of the following combinations to the left
  if (indexMaxEconom + 3 &lt; countRBlock)
    memcpy(countBitCodeSRBlock + indexMaxEconom + 1, countBitCodeSRBlock +
      indexMaxEconom + 2, (countRBlock - indexMaxEconom - 3) * sizeof(UBYTE4));
  countRBlock--; }
// then each received block is output as a separate DEFLATE block</p>
    </sec>
    <sec id="sec-1-4">
      <title>5. Results of applying the redistribution of compressed data between modified DEFLATE-blocks in the process of progressive hierarchical lossless image compression</title>
      <p>
We present the results of the impact of the redistribution of compressed data between
DEFLATE-blocks (Table 2 – Table 4) on the compression of the images of the well-known Archive
Comparison Test set (ACT, Figure 4) in the process of the progressive hierarchical bypass
[8][10][20]. You can download these images, for example, from [27]. This set contains both
synthesized (№ 1, 2, 7) and photorealistic (the rest) images. The choice of this particular test set
is determined by the diversity of its images and the availability in open sources of the results of
testing algorithms on it by other researchers. Testing was conducted on a computer with an
Intel Pentium 4 processor with a clock frequency of 3 GHz and 4 GB of RAM.</p>
    </sec>
    <sec id="sec-2">
      <title>6. Discussions</title>
      <p>From the data in Table 2-Table 4 we can see, that the application of redistribution of
compressed data between modified DEFLATE blocks made it possible to reduce the total size of
compressed images of the ACT set by 3 Kb, although on average it slowed down the encoding by
6.48%. Moreover, a reduction in the size of compressed images is observed both for
photorealistic images (№ 4) and for synthesized images (№ 1, 7). Other compressed images are
hundreds of bytes smaller. Such modest results are explained by the fact that, firstly, the
maximum gain from the combination of adjacent modified DEFLATE blocks is equal to the block
header (up to 380 bytes), and secondly, after using difference color models and predictors,
these images have a high uneven distribution of element frequencies (Figure 1b) and
partitioning does not significantly increase their homogeneity and, thirdly, the data of different
layers and passes of progressive hierarchical compression are stored in different compressed
blocks by default. On the other hand, the redistribution of compressed data between DEFLATE
blocks made it possible to speed up decoding by an average of 2.14% due to discrete-tone
images, which indicates the feasibility of its use. For example, we use the redistribution of
compressed data between modified DEFLATE-blocks in our developed HBF-LS lossless
progressive hierarchical image compression format [8][10][20] when the encoding time is
insignificant.
7. Conclusions
1. Using compressed data formats, which provide for their division into blocks, it is worth
analyzing the expediency of combining adjacent blocks and dividing blocks into
homogeneous fragments to reduce CR before writing these blocks into a file;
2. The combination of DEFLATE blocks reduces the size of compressed data by a maximum
of the size of the header of one such block, and splitting increases the uneven distribution of
the relative frequencies of their elements, so these actions should be implemented in image
compression programs;
3. The combination of DEFLATE blocks of compressed data makes it possible to speed up
decoding by reducing the number of generation intervals or codes of individual elements for
each such block.</p>
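      <p>The block-header overhead described above can be observed directly with a generic DEFLATE implementation. The following minimal Python sketch (our illustration using the standard zlib module, not the paper's encoder) compresses the same buffer once as a single raw DEFLATE stream and once as two blocks separated by a full flush, which repeats the block header and resets the LZ77 window:</p>

```python
import zlib

# Compare one DEFLATE stream against the same data split into two blocks.
data = bytes(range(256)) * 64          # 16 KiB of highly repetitive data

one = zlib.compressobj(9, zlib.DEFLATED, -15)   # -15: raw DEFLATE, no zlib wrapper
single = one.compress(data) + one.flush()

two = zlib.compressobj(9, zlib.DEFLATED, -15)
half = len(data) // 2
split = two.compress(data[:half])
split += two.flush(zlib.Z_FULL_FLUSH)  # forces a block boundary, resets the window
split += two.compress(data[half:]) + two.flush()

print(len(single), len(split))         # the split stream is larger: it pays for an
                                       # extra header and loses cross-block matches
```

      <p>Both streams decode to the same data; the size difference is the per-block overhead that merging adjacent DEFLATE blocks recovers.</p>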
      <p>In the future, we plan to improve the "lazy" scheduling mechanism of the LZ77
algorithm [8] by memorizing and using smaller offsets to shorter identical sequences
encountered while the schedule is being formed, and to investigate the effectiveness of using
the previous schedule of this algorithm to predict the lengths of literals, the lengths of
replacements, and the offsets.
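As a baseline for this planned work, the standard one-step "lazy match" heuristic of LZ77/DEFLATE encoders can be sketched as follows (a simplified illustration, not the modified scheduler of [8]; the helper names longest_match and lz77_lazy are ours):

```python
# Illustrative sketch (not the authors' modified scheduler): classic LZ77
# with the one-step lazy evaluation used by DEFLATE encoders.
def longest_match(data, pos, window=32768, max_len=258):
    """Return (length, offset) of the longest match for data[pos:] in the window."""
    best_len, best_off = 0, 0
    limit = min(max_len, len(data) - pos)
    for cand in range(max(0, pos - window), pos):
        length = 0
        while length != limit and data[cand + length] == data[pos + length]:
            length += 1
        if length > best_len:
            best_len, best_off = length, pos - cand
    return best_len, best_off

def lz77_lazy(data, min_len=3):
    """Produce a schedule of ('lit', byte) and ('match', offset, length) tokens."""
    out, pos = [], 0
    while pos != len(data):
        length, off = longest_match(data, pos)
        if length >= min_len:
            # Lazy step: if starting one byte later gives a longer match,
            # emit a literal now and decide again at the next position.
            nxt_len = longest_match(data, pos + 1)[0] if pos + 1 != len(data) else 0
            if nxt_len > length:
                out.append(("lit", data[pos]))
                pos += 1
                continue
            out.append(("match", off, length))
            pos += length
        else:
            out.append(("lit", data[pos]))
            pos += 1
    return out
```

Here the encoder defers a match whenever starting one byte later yields a longer one; the planned improvement would additionally memorize the smaller offsets to shorter sequences that this search encounters and reuse them while forming the schedule.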
</p>
      <p>[7] J. Ziv, A. Lempel, "A universal algorithm for sequential data compression," IEEE
Transactions on Information Theory, 23(3) (1977) 337-343.
[8] A. Shportko, A. Bomba, V. Postolatii, Rejection of the Inefficient Replacements while
Forming the Schedule of the Modified Algorithm LZ77 in the Process of Progressive
Hierarchical Compression of Images without Losses, in: Proceedings of the 6th
International Conference Computational Linguistics and Intelligent Systems (COLINS
2022), Gliwice, Poland, May 12-13, 2022, volume 3171, pp. 1594-1605. URL:
http://ceur-ws.org/Vol-3171/paper113.pdf.
[9] WinRAR download free and support, version 7.00, 2024. URL:
https://www.winrar.com/start.html?&amp;L=0.
[10] A. Shportko, V. Postolatii, Development of Predictors to Increase the Efficiency of
Progressive Hierarchic Context-Independent Compression of Images Without Losses, in:
Proceedings of the 5th International Conference Computational Linguistics and Intelligent
Systems (COLINS 2021), Kharkiv, Ukraine, Apr. 22-23, 2021, volume 1, pp. 1026-1038. URL:
http://ceur-ws.org/Vol-2870/paper77.pdf.
[11] C. E. Shannon, "A Mathematical Theory of Communication," Bell System Technical Journal,
27 (1948) 379-423, 623-656. doi:10.1002/j.1538-7305.1948.tb00917.x.
[12] D. Huffman, "A Method for the Construction of Minimum Redundancy Codes," Proceedings
of the IRE, 40(9) (1952) 1098-1101. doi:10.1109/JRPROC.1952.273898.
[13] T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein, Introduction to Algorithms, Fourth
Edition, MIT Press, 2022, pp. 431-439. URL:
http://mitpress.mit.edu/9780262046305/introduction-to-algorithms.
[14] J. Duda, K. Tahboub, N. J. Gadil and E. J. Delp, "The use of asymmetric numeral systems as an
accurate replacement for Huffman coding," Picture Coding Symposium, Cairns, QLD,
Australia, Jul. 30, 2015. doi:10.1109/PCS.2015.7170048.
[15] M. A. Rahman, M. Hamada, "Lossless image compression techniques: A state-of-the-art
survey," Symmetry, 11 (2019) 1274. doi:10.3390/sym11101274.
[16] J. Rissanen, "Generalized Kraft Inequality and Arithmetic Coding," IBM Journal of Research
and Development, 20(3) (1976) 198-203. doi:10.1147/rd.203.0198.
[17] J. Rissanen, G. G. Langdon, "Arithmetic coding," IBM Journal of Research and Development,
23(2) (1979) 149-162. doi:10.1147/rd.232.0149.
[18] I. H. Witten, R. M. Neal and J. G. Cleary, "Arithmetic Coding for Data Compression,"
Communications of the ACM, 30(6) (1987) 520-540. doi:10.1145/214762.214771.
[19] A. Moffat, R. M. Neal and I. H. Witten, "Arithmetic coding revisited," ACM Transactions on
Information Systems, 16(3) (1998) 256-294. doi:10.1145/290159.290162.
[20] A. Ya. Bomba, A. V. Shportko and A. V. Shportko, "Features of the application of arithmetic
coding in the process of lossless progressive hierarchical compression of images,"
Proceedings of the National University "Lviv Polytechnic", Series: Information Systems and
Networks, 783 (2014) 12-22. URL: http://nbuv.gov.ua/UJRN/VNULPICM_2014_783_4.
[21] P. Deutsch, DEFLATE Compressed Data Format Specification, version 1.3, RFC 1951,
Network Working Group, 1996. URL: https://www.rfc-editor.org/rfc/rfc1951.
[22] P. Deutsch, J.-L. Gailly, ZLIB Compressed Data Format Specification, version 3.3, RFC 1950,
Network Working Group, 1996, 10 p. URL: http://www.ietf.org/rfc/rfc1950.txt.
[23] J. Miano, Compressed Image File Formats: JPEG, PNG, GIF, XBM, BMP, Addison Wesley, New
York, 1999, 264 p. ISBN 0201604434.
[24] Yu. P. Zhurakovskyi, V. P. Poltorak, Teoriia informatsii ta koduvannia [Information Theory
and Coding], Vyshcha shkola, Kyiv, 2001, p. 33. URL: https://b.eruditor.link/file/348269.
[25] Compression by PPM + Arithmetic Coder C# implementation (based on the C++ algorithm
from the book "Data Compression Methods"), 2022. URL:
https://gist.github.com/Geographus/a1beae713c337243c478cb5575a22f75.
[26] A. V. Shportko, Rise of efficiency of compression of coloured images in the PNG format,
Ph.D. thesis, Rivne State University of the Humanities, Rivne, 2010, pp. 83-85. URL:
https://dspace.megu.edu.ua:8443/jspui/handle/123456789/1665.
[27] ACT – Test Files, 2002. URL: http://www.compression.ca/act/act-files.html.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] G. Wallace, "The JPEG still picture compression standard," Communications of the ACM, 34 (1991) 30-44. doi:10.1145/103085.103089.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] O. Shehata, "Unraveling the JPEG," Parametric Press, 1 (2019). URL: https://parametric.press/issue-01/unraveling-the-jpeg.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] T. Guo, T. Zhang, E. Lim, M. López-Benítez, F. Ma, L. Yu, "A review of wavelet analysis and its applications: Challenges and opportunities," IEEE Access, 10 (2022) 58869-58903. doi:10.1109/ACCESS.2022.3179517.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] H. D. Kotha, M. Tummanapally and V. K. Upadhyay, "Review on Lossless Compression Techniques," Journal of Physics: Conference Series, 1228 (2019). doi:10.1088/1742-6596/1228/1/012007.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] D. Salomon, A Guide to Data Compression Methods, Springer, New York, 2002, 295 p. doi:10.1007/978-0-387-21708-6.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] R. Gonzalez, R. Woods, Digital Image Processing, Fourth Edition, Pearson, London, 2017, 1192 p. ISBN 9353062985.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>