<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Research of Methods for Image Sharpness Evaluation in Photos of People</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Victoria Vysotska</string-name>
          <email>victoria.a.vysotska@lpnu.ua</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nataliia Sharonova</string-name>
          <email>nvsharonova@ukr.net</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mariya Shirokopetleva</string-name>
          <email>marija.shirokopetleva@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oleksandr Dolhanenko</string-name>
          <email>oleksandr.dolhanenko@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anastasiya Chupryna</string-name>
          <email>anastasiya.chupryna@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Serhii Smelyakov</string-name>
          <email>serhii.smeliakov@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kharkiv National University of Radio Electronics</institution>
          ,
          <addr-line>Nauky Ave. 14, Kharkiv, 61166</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Lviv Polytechnic National University</institution>
          ,
          <addr-line>Stepan Bandera, Lviv, 79013</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>National Technical University "KhPI"</institution>
          ,
          <addr-line>Kyrpychova str. 2, Kharkiv, 61002</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
        <p>The subject matter of the article is image sharpness evaluation in photos of people. The goal of the work is to analyse the existing methods of image sharpness evaluation, compare their performance and quality of results, and suggest improvements for the use case of sharpness classification of photos of people, where large quantities of background blur are present due to the aperture effect. In this article the methods for image sharpness evaluation were described and tested on a set of selected images. The images contained different subject sizes and types of blur. The following methods were used: Fast Fourier Transform (FFT), Variance of the Laplacian, appearance-based face detection algorithms, metadata analysis, and linear trendline analysis. As a result, the problem of a naturally blurred background was demonstrated and conclusions were made. An alternative method of sharpness evaluation was described, which solves the mentioned problem. The suggested improved algorithm was tested to determine if it satisfies expectations and solves the identified problems. To implement the Fast Fourier Transform and Variance of the Laplacian methods, the OpenCV library was used. The following results were obtained: when using the default implementation, the FFT and Variance of the Laplacian methods are not reliable for evaluating the sharpness of images containing large and unstable quantities of naturally blurred background (due to the open aperture) and when different settings of the lens and camera are used. The following conclusions were made: steps need to be taken to eliminate the factors of unstable quantities of naturally blurred background and camera preferences, and in this way improve the accuracy and reliability of sharpness evaluation. This means evaluating the sharpness of parts of the images as well as the images as a whole. These steps include but are not limited to face position detection, identifying faces that were supposed to be in focus when the photo was taken, and sharpness evaluation of only the areas that were intended to be in focus. Plans were set for further research and improvements of the suggested algorithm.</p>
      </abstract>
      <kwd-group>
        <kwd>Sharpness</kwd>
        <kwd>image</kwd>
        <kwd>aperture</kwd>
        <kwd>depth of field</kwd>
        <kwd>focus</kwd>
        <kwd>FFT</kwd>
        <kwd>variance of the Laplacian</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Photography is very subjective. There are many ways to take a photo, since all of us see the world in very different ways. There are, however, some basic principles for achieving a generally well-composed photograph. You may or may not follow these principles and still get a great photo that tells a story, but statistically speaking, the best-looking photos, and those most often recognised as professional and pleasing, are the ones which have the subject in critical sharp focus.</p>
      <p>Achieving focus on a photo is a very complicated process, which has multiple factors
influencing the final result. A general smartphone user takes this process for granted, as the phone
does all decision making and photo processing for us in real time, most of the time achieving
pretty good results for general social media use. The algorithms used there are fine-tuned, subject
oriented. Most of the time our phones use machine learning to detect the scene and tune the
settings to achieve better results.</p>
      <p>Our smartphones are getting better and better every year in terms of photography; there is even the ability to imitate expensive lenses by adding a fake depth of field to the photos [1]. However, any well-established professional photographer will say how important it is to be in control of the manual settings of the camera, being the decision maker and scene establisher, getting exactly the result the creator wants rather than what the tool produces on its own.</p>
      <p>Moreover, they use professional hardware, which may contain some "smart" features but generally is very open to manual overriding, giving the operator more flexibility. However, where humans are involved, mistakes are made.</p>
      <p>Having photographed an event, say, a wedding, the photographer usually spends hours and sometimes days looking through more than 2000 photos, filtering out the ones that are to be deleted and highlighting the best ones for further editing. This is a very long and complicated process of comparing what seem to be identical photographs at first glance (but which actually differ in slight ways), which can be a big factor in the final result. The most important thing the photographer looks at is the sharpness of the subjects. It is a general rule that, when photographing humans or animals, the eyes need to be in critical sharp focus. Everything else in the photo can be changed - the lighting can be increased or decreased, some elements can be added or removed - but sharpening a blurry subject is a very destructive process which is highly discouraged.</p>
      <p>The goal of this work is to analyse and compare different algorithms and software solutions for automatically identifying the sharpness of a photo (containing humans as subjects) and to propose improvements to the existing algorithms to solve the problems described further on.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>When it comes down to solving a scientific problem, it is required to operate with objective terms and calculations. That is why it is important to define what a sharp photo is. The common way of measuring sharpness is by the "rise distance" of edges within the image, that is, the distance over which the pixel level rises from 10% to 90% of its final value. Therefore, it can be stated that sharpness is measured by analysing the intensity of edge gradients [2]. However, the intensity threshold needed to classify a photo as sharp or not is very circumstantial and cannot be standardised. During the following experiments, a set of reference photos will be used to determine the minimum and maximum values of sharpness and thus provide the threshold. Apart from that, the problem of image quality assessment [3] is well known and many have attempted to solve it in general.</p>
      <p>In order to start detecting and classifying the sharpness of photos it is first needed to clarify
which camera and lens settings influence it. The camera in combination with the lens has 3 main
settings: aperture, shutter speed and ISO.</p>
      <p>Aperture is the main setting of the lens. It is an opening that can be made bigger or smaller and thus let in more or less light. It is usually preferred to let in as much light as possible, therefore "opening up", or "making the aperture wide open", but this has a side effect that directly influences the sharpness of the subject: the more open the aperture, the shallower the depth of field. For cameras that can only focus on one object distance at a time, depth of field is the distance between the nearest and the furthest objects that are in acceptably sharp focus.</p>
      <p>By knowing the DOF (Depth Of Field) we can understand what depth of the image had to be in
focus. To give a more clear understanding of why this is important an example image is provided,
shot with a very shallow DOF (see Figure 1).</p>
      <p>Figure 1 (photo by Dolhanenko O.) demonstrates a shot with a very open aperture of f/2.8 (the lower the number, the more open the aperture). The subject that lies within the shallow depth of field is in critical sharp focus; however, the secondary subjects before and behind the field boundaries are not in focus at all (in this case only before the subject, on the right).</p>
      <p>By means of traditional photo sharpness detection algorithms, this is not a sharp photo, as the
area in critical sharp focus is very small compared to the blurry part (this will be experimentally
tested further on). However, if the subject is correctly identified among the two and is exclusively
checked, then the photo is in fact in critical sharp focus and is acceptable for further editing.</p>
      <p>Another setting that influences the sharpness of a photo is the shutter speed. The rule for general photography is "the faster, the better". The quicker the curtain travels across the sensor, the less blur there will be in the photo, as any movement of the subjects during the shot causes their sharpness to decrease. Especially when shooting in low-light conditions, when fast shutter speeds are not available (otherwise the photo would be too dark), this motion blur is quite noticeable even with minor movements.</p>
      <p>The pursuit of accurate image sharpness assessment has led to the exploration of various
approaches, encompassing spatial domain-based methods, spectral domain-based methods,
learning-based methods [4], and a combination of these techniques, each presenting unique
advantages and challenges, including utilization of methods such as Local Phase Coherence [5, 6],
Edge Information analysis approach [7], Normal-Gradient-Based approach [8], Gradient
Neighbourhood-Weighted approach [9] and others.</p>
      <p>Zhu et al. (2023) [10] conducted a comprehensive review, offering insight into the current
trends and performance comparisons of notable algorithms, revealing a landscape of ongoing
innovation aimed at overcoming the shortcomings of existing methods.</p>
      <p>Bielievtsov et al. (2018) [11] investigated network technology for the transmission of visual
information, highlighting the importance of maintaining image quality in the context of digital
communication and storage. This work is foundational, setting the stage for further research into
image quality assessment methods that are critical for various applications, including but not
limited to, facial recognition, social media, and professional photography.</p>
      <p>In the domain of image search and retrieval, Smelyakov et al. (2020) [12] introduced an
innovative approach to image engine search for big data warehouses, emphasizing the necessity
of high-quality image processing for efficient and accurate image retrieval. This development is
particularly relevant to our research as it underscores the significance of image sharpness in
enhancing the performance of search engines, which often rely on visual content analysis to
function effectively.</p>
      <p>Furthermore, the effectiveness of preprocessing algorithms for natural language processing
applications was explored by Smelyakov et al. (2020) [13], illustrating the broad applications of
image and signal processing techniques across various fields of computer science. Although
focused on natural language processing, the principles of preprocessing and quality enhancement
are applicable to the domain of image processing, providing insights into methods that could
potentially improve image sharpness evaluation algorithms.</p>
      <p>The development of no-reference (NR) [14] sharpness metrics has been particularly
noteworthy, with Duan et al. (2021) [15] introducing an efficient NR objective sharpness
assessment metric designed for images with shallow depth of field, a common characteristic in
portraits and photos emphasizing human subjects. This metric, which calculates sharpness based
on bidirectional pixel intensity differences, addresses the limitations of traditional sharpness
assessment tools when applied to such images.</p>
      <p>Research by Her and Yang (2019) [16] on image sharpness assessment algorithms for
autofocus systems further exemplifies the field's evolution. Their work evaluates the performance
of several spatial domain functions, highlighting the scene adaptability and anti-jamming
capabilities of the Benner algorithm and the sensitivity of the Laplace algorithm, among others.
This research underscores the critical role of sharpness evaluation functions in enhancing the
quality of images captured by various imaging systems.</p>
      <p>The development of advanced artificial intelligence systems, as explored by Kyrychenko, Tereshchenko, Proniuk, and Geseleva (2023) [17] through the use of predicate clustering methods, presents potential avenues for refining image sharpness evaluation techniques.</p>
      <p>Moreover, advancements in image quality assessment for zoom photos, as investigated by Han
et al. (2023) [18], reveal the challenges posed by small sensor sizes and fixed focal lengths in
smartphones. Their novel no-reference zoom quality metric incorporates traditional sharpness
estimation with image naturalness concepts, demonstrating significant improvements in
assessing image quality over traditional metrics.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <sec id="sec-3-1">
        <title>3.1. Existing methods</title>
        <p>The Fast Fourier Transform is a convenient mathematical algorithm for computing the
Discrete Fourier Transform (DFT). It is used for converting a signal from one domain into another
[19].</p>
        <p>The FFT is used in different areas, such as mathematics, music, engineering, etc. This method is widely used because some calculations are much easier to perform after time-series signals are converted into the frequency domain. The inverse transform can likewise convert the frequency domain back to the original format. When talking about the FFT in image processing and computer vision, it is important to note that the image can be represented in both the spatial and the Fourier domains, i.e., with both real and imaginary components.</p>
        <p>The obtained values can be analysed to perform blurring or blur detection, edge detection,
analysis of textures, etc.</p>
        <p>There is a sampled Fourier Transform, which is called DFT. It contains only the set of image
samples which are enough to fully represent the spatial domain image [20] (which is often used
for further quality metrics extraction).</p>
        <p>Given an image of size N×N, the resulting DFT can be defined as follows:
          F(k,l) = ∑_{a=0}^{N−1} ∑_{b=0}^{N−1} f(a,b) · e^{−i2π(ka/N + lb/N)}    (1)
where f(a,b) is the image in the spatial domain and the exponential term is the basis function corresponding to each point F(k,l) in the Fourier space.</p>
        <p>The equation can be interpreted as follows: the value of each point F(k,l) is obtained by multiplying the spatial image with the corresponding basis function and summing the result. The basis functions are sine and cosine waves with increasing frequencies. For example, F(0,0) represents the DC component of the image (the average image brightness) and F(N−1,N−1) represents the highest frequency of the image.</p>
        <p>The ordinary one-dimensional DFT has O(N²) complexity. If the Fast Fourier Transform (FFT) is used, the complexity can be reduced to O(N·log₂N). For computing large images this improvement is crucial. However, some forms of the FFT may restrict the maximum size of the input to N = 2^n.</p>
        <p>The result is an output image represented with complex numbers. This image can be displayed in two ways: either with the real and imaginary parts, or with magnitude and phase (see Fig. 2).</p>
        <p>When solving problems in the area of image processing, commonly only the magnitude of the
Fourier Transform is displayed. The example image of the result of such transformation is
illustrated above.</p>
        <p>For clear and reliable results, a contrast detection threshold must be calculated beforehand
[21] in order to understand if the image has enough contrast for further analysis. In case of
contrast availability and FFT algorithm completion, a floating point value of the mean of the
magnitude indicates the relative sharpness of the whole image. Of course, since this value is
relative, conclusions cannot be made without a reference sharpness value.</p>
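        <p>To make this concrete, the following minimal Kotlin sketch (using the OpenCV Java bindings; the function name and the log-scaling step are illustrative assumptions, not the exact implementation used here) computes the mean of the log-magnitude spectrum as a relative sharpness value:</p>
        <p>import org.opencv.core.*
import org.opencv.imgcodecs.Imgcodecs

fun fftSharpness(path: String): Double {
    val gray = Imgcodecs.imread(path, Imgcodecs.IMREAD_GRAYSCALE)
    val floatImg = Mat()
    gray.convertTo(floatImg, CvType.CV_32F)
    // Forward DFT with a zero-filled second channel for the imaginary part
    val complex = Mat()
    Core.merge(arrayListOf(floatImg, Mat.zeros(floatImg.size(), CvType.CV_32F)), complex)
    Core.dft(complex, complex)
    // Split into real/imaginary planes and take the magnitude
    val planes = arrayListOf(Mat(), Mat())
    Core.split(complex, planes)
    val magnitude = Mat()
    Core.magnitude(planes[0], planes[1], magnitude)
    // Log scale compresses the dynamic range before averaging
    Core.add(magnitude, Scalar.all(1.0), magnitude)
    Core.log(magnitude, magnitude)
    // The mean magnitude serves as the relative sharpness value
    return Core.mean(magnitude).`val`[0]
}</p>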
        <p>Another option is to use the variance of the Laplacian.</p>
        <p>The Laplacian of a function f at a point p is (up to a factor) the rate at which the average value of f over spheres centered at p deviates from f(p) as the radius of the sphere shrinks towards 0.</p>
        <p>The Laplacian operator is defined as the divergence of the gradient of the function f, as shown in formula (2):
          ∆f(x,y) = ∇ · (∇f(x,y)) = ∂²f/∂x² + ∂²f/∂y²    (2)
        </p>
        <p>In this definition, the gradient is the slope of steepest ascent, and it gives information about the point and direction of the highest ascent in local maxima and, likewise, the local minima.</p>
        <p>In the case of image sharpness detection, the divergence relates to the vector field associated with blurriness from subject motion or natural background blurriness. The calculated Laplacian matrix is demonstrated in Figure 3.</p>
        <p>Through convolution with the Laplacian kernel, the source image is transformed so as to reveal areas of rapid intensity change (this works well only if there is no noise in the image [22]).</p>
        <p>Using the OpenCV library this method will be tested on sample images which contain both
blurry areas and sharp subjects. This partial blurriness was caused by a very open aperture, which
made a very distinct background separation.</p>
        <p>// Load the source image in grayscale (OpenCV 2.4 Java API)
val imageMat = Highgui.imread(image.absolutePath, Highgui.CV_LOAD_IMAGE_GRAYSCALE)
val destination = Mat()
// Apply the Laplacian operator; a 64-bit depth avoids clipping negative responses
Imgproc.Laplacian(imageMat, destination, CvType.CV_64F)
val mean = MatOfDouble()
val std = MatOfDouble()
Core.meanStdDev(destination, mean, std)
// The variance of the Laplacian is the squared standard deviation
val variance = Math.pow(std.get(0, 0)[0], 2.0)</p>
        <p>To get a single floating-point number representing the overall sharpness of a photo, the source image is first loaded in grayscale, after which the basic Laplacian function is called and the variance of the result is computed.</p>
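        <p>Assuming the snippet above is wrapped into a hypothetical laplacianVariance(path) helper, a classification step against a reference threshold (obtained from the reference photos described in Section 2) might look as follows:</p>
        <p>// Hypothetical usage: the raw variance is only meaningful relative to a baseline,
// so the threshold must be derived from reference photos beforehand
val sharpnessScore = laplacianVariance("photo.jpg")
val isAcceptablySharp = sharpnessScore >= referenceThreshold</p>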
      </sec>
      <sec id="sec-3-2">
        <title>3.2. The alternative algorithm</title>
        <p>Looking at the results of the two popular algorithms it can be stated that neither one is
optimised enough for the problem stated at the beginning. During photo sessions most portrait
photos will be taken with an open aperture which will result in the background having many areas
with low frequencies [23]. If the area with naturally low frequencies is greater than the area with
high frequencies it will result in the image being labelled as "not sharp", which is not necessarily
true. The question arises – how to make judgements on the photo subject sharpness if analysing
the whole image is not optimal?</p>
        <p>As a top level solution, the sharpness detection algorithm should be modified in such a way, so
that only areas that are supposed to be in focus are evaluated and the background/foreground is
ignored. This is the general principle of evaluating visible errors in specific areas of the photo [24].
An example of such approach can be found in the framework for measuring sharpness in natural
images [25]. Also, there is research about image quality assessment based on regions of interest,
which are identified as features which are highly spatially nonstationary [26].</p>
        <p>The solution will be developed based on the limitation that the subjects that need to be in focus
are human faces. This limitation, however, can be eliminated by modifying the algorithm and
providing support for more subjects.</p>
        <p>The following list describes the steps of the algorithm (a top-level sketch is given after the list):
1. Find the coordinates and bounding boxes of every face in the frame
2. Extract the focus distance from the photo
3. Calculate the distance to every face in the frame
4. Calculate the ideal depth of field
5. Select the faces that are within the intended focus plane (focus distance ± ideal depth of field)
6. Apply the sharpness detector only to the boxes containing the selected faces
7. Calculate the average sharpness score based on the individual sharpness scores
Next, the steps of the algorithm will be described in more detail with propositions and variants for implementation.</p>
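        <p>A top-level sketch of this pipeline in Kotlin may look as follows; every helper here (detectFaces, readFocusDistance, distanceToFace, idealFocusPlanes, sharpnessOf), as well as the Photo type, is a hypothetical placeholder for the corresponding step detailed in the subsections below:</p>
        <p>// Hypothetical top-level pipeline; all helpers are placeholders
fun subjectSharpness(photo: Photo): Double {
    val boxes = detectFaces(photo)                                    // step 1
    val focusDistance = readFocusDistance(photo.metadata)             // step 2
    val distances = boxes.associateWith { distanceToFace(photo, it) } // step 3
    val (r1, r2) = idealFocusPlanes(photo.metadata, focusDistance)    // step 4
    val selected = boxes.filter { distances.getValue(it) in r1..r2 }  // step 5
    val scores = selected.map { sharpnessOf(photo, it) }              // step 6
    return scores.average()                                           // step 7
}</p>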
      </sec>
      <sec id="sec-3-3">
        <title>3.2.1. Finding the boundaries and coordinates of faces in the frame</title>
        <p>There are quite a few methods and options when it comes to face detection in images. Given
an arbitrary image, the goal of face detection is to determine whether or not there are any faces
in the image and, if present, return the image location and extent of each face.</p>
        <p>Some methods for detecting faces include:
• Knowledge-Based Top-Down Methods;
• Feature invariant approaches;
• Template matching methods;
• Appearance-based methods.</p>
        <p>Knowledge-based methods are developed with the scientific knowledge about human faces as
the primary source of information. The problem with this approach is the difficulty in translating
human knowledge into well-defined rules. It is very hard to find a perfect balance of strictness in
these rules. If the rules are defined too strictly – the method may fail to detect faces that do not
pass all rules at once. On the other hand, if the rules are too general – the method may result in
false positives.</p>
        <p>The feature-based methods are based on the fact that humans have the natural ability to
recognize a face in different lighting conditions, under different angles and circumstances,
meaning we are trained to detect certain features. This method builds on this fact and many
variants were proposed, when first the features are extracted and analysed. One problem with
these feature-based algorithms is that the image features can be severely corrupted due to
illumination, noise, and occlusion.</p>
        <p>In template matching methods a face pattern is manually predefined. When analyzing an input image, the correlation is computed for separate parts of the presumable face: the contour, the nose, the eyes, the mouth. The result is positive if the mean correlation value of these components is above a certain threshold. This method, however, is not ideal due to its lack of flexibility when it comes to different face shapes, poses and scales.</p>
        <p>Lastly, the appearance-based methods are closer to real life than the previous ones. The templates are not generated by experts (as in template matching) but rather taken from samples of actual image databases. This approach relies heavily on statistical analysis and machine learning. Appearance-based methods have many different implementations, including distribution-based methods, support vector machines, hidden Markov models, cascade classifiers, cascaded convolutional networks [27], etc.</p>
        <p>Taking into account the advantages and disadvantages of the abovementioned face detection methods, it was decided that appearance-based methods are well suited for the task. When implementing the improved sharpness detection algorithm, cascade classifiers can be used. OpenCV contains pre-trained open classifiers that can be freely downloaded. To retrieve face coordinates using the OpenCV library one can use the CascadeClassifier.detectMultiScale function. The arguments for this function are: the input image (in grayscale), scaleFactor and minNeighbors. The scaleFactor specifies how much the image size is reduced at each image scale; minNeighbors specifies how many neighbors each candidate rectangle should have for it to be retained. These parameters will be fine-tuned during the development process.</p>
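        <p>A minimal face-detection sketch with one of OpenCV's pre-trained Haar cascades may look as follows; the classifier file name and the parameter values are assumptions to be fine-tuned:</p>
        <p>import org.opencv.core.Mat
import org.opencv.core.MatOfRect
import org.opencv.imgcodecs.Imgcodecs
import org.opencv.imgproc.Imgproc
import org.opencv.objdetect.CascadeClassifier

// Returns the bounding rectangles of the detected faces (a list of Rect)
fun detectFaceBoxes(path: String) = run {
    // The classifier expects a grayscale input image
    val gray = Mat()
    Imgproc.cvtColor(Imgcodecs.imread(path), gray, Imgproc.COLOR_BGR2GRAY)
    val classifier = CascadeClassifier("haarcascade_frontalface_default.xml")
    val faces = MatOfRect()
    // scaleFactor = 1.1 and minNeighbors = 5 are common starting values
    classifier.detectMultiScale(gray, faces, 1.1, 5)
    faces.toList()
}</p>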
        <p>Of course, everything that involves image processing and subject recognition will always take
up some valuable processing time, so there is another way of extracting the faces in the frame.
The other method relies heavily on the camera’s integrated ability to detect faces in real time and
write this information to the metadata. Of course, the algorithm should not be strictly reliant on
using this optional metadata, but definitely should utilize it if available – that would dramatically
improve the performance of the method.</p>
        <p>As a result of image scanning with classifiers or face extraction using metadata, the coordinates
and boundaries of all faces in the frame will be retrieved for further analysis.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.2.2. Extracting the focus distance and other useful parameters from the photo</title>
        <p>The focusing plane is the plane of the camera's image sensor. The focusing distance is the distance from the focusing plane to the subject.</p>
        <p>When using automatic lenses (lenses with autofocus, that do not require manual input to put
the subject in critical sharp focus) there are complex algorithms running in order to produce
optimal results in terms of focus. The camera constantly monitors the image, understands the
distance to the subject in real time and adjusts the focus motors accordingly. To achieve the
autofocus, the camera lens moves to a position where the clearest image is obtained. The
maximum clarity is measured from the histograms of the images on which a filter with the role of
highlighting the edges was initially applied. Knowing the focal length of the lens, the distances can
be found from the lens equation [28]. So, we can see that this is clearly possible and cameras
perform these calculations all the time. Moreover, cameras can utilize different focusing
algorithms [29] for accuracy. The distance that the focus motor travelled in the lens before the
photo was taken corresponds to the estimated focusing distance which is needed for the modified
sharpness detection algorithm.</p>
        <p>It is possible to retrieve the calculated focusing distance by looking at the photo metadata. EXIF
(Exchangeable Image File Format) is a standard that allows adding information (metadata) to
photos and videos. This format is quite flexible, meaning that all users can modify it by adding
new data entries with original names. This means that not all camera bodies from different
manufacturers produce the same metadata. That said, many cameras contain valuable information that can be used for the purpose of this research. By viewing the metadata using an EXIF reader we can find the relevant information summarized in Table 1.</p>
        <p>Table 1. Relevant metadata parameters include: Lens Spec, Min Focal Length, Max Focal Length, Focal Length, and Focus Distance 2.</p>
        <p>The parameter "Focus Distance 2" contains the calculated distance (in meters) to the focused subject. This calculation was performed by the camera by analysing the rotation angle of the focus motor at the moment the edges of the image were in focus. This parameter can be used as a ready-made solution and will be very useful for further calculations. There is no other reliable way of retrieving/calculating the focus distance of a photo, especially when the photo is not guaranteed to be in critical focus initially. This is why the method relies heavily on this metadata parameter.</p>
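        <p>As an illustration, focus-distance extraction could be sketched with the open-source metadata-extractor library (the library choice and the exact tag name are assumptions; such maker-note entries are vendor-specific):</p>
        <p>import com.drew.imaging.ImageMetadataReader
import java.io.File

// Returns the focus distance in meters, or null if the camera did not write it
fun readFocusDistanceMeters(file: File): Double? {
    val metadata = ImageMetadataReader.readMetadata(file)
    for (directory in metadata.directories) {
        for (tag in directory.tags) {
            // Maker-note tag names vary by vendor, e.g. "Focus Distance 2"
            if (tag.tagName.startsWith("Focus Distance")) {
                // Descriptions typically look like "1.58 m"; parse the number
                return tag.description?.removeSuffix(" m")?.toDoubleOrNull()
            }
        }
    }
    return null
}</p>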
        <p>Some other useful parameters include the number of detected faces and the detected face positions. It is a "luxury" to have these parameters in the metadata, and only modern cameras provide such information. Relying heavily on the presence of these parameters would limit the method to modern high-tech camera bodies. So, the method will utilise these metadata shortcuts if they are present, but will still be able to detect face boundaries as described in the section above.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.2.3. Calculating the distance to every face in the frame</title>
        <p>It is required to find distances to all faces in the frame because the method needs to determine
which faces in particular need to be in focus. The "need to be in focus" criteria is described in the
next point.</p>
        <p>At this step the focus distance, as well as all face boundaries (and thus, sizes), are already known. To find the distance to a human face in the photo, it is necessary to first estimate the height of the faces in real life.</p>
        <p>To find the real-world height of each face in the frame, two methods can be used:
• the average statistical face height value (21.8 cm - 23.9 cm);
• calculation from a proportionality formula based on the distance between the eyes and the mouth.</p>
        <p>The second option is very complicated, involves extra operations such as edge detection to find the location of the eyes and mouth, depends on the pose (it will not work for side portraits) and does not guarantee significantly better accuracy than the first option. This is why the first option, assuming that the height of the face is somewhere between 21.8 cm and 23.9 cm, will be used during the implementation.</p>
        <p>When the real-world height of all faces in the frame is known, the real-world distance to the faces can be determined in two ways:
• the predefined proportions at 1 meter distance method (see Figure 4, photo by Dolhanenko O.);
• using an alleged "focused face" from the focus point coordinates as a reference.</p>
        <p>The first method can be described as follows: practically or mathematically find the percentage height (relative to the frame height) of a 23.9 cm high object at a distance of 1 meter at a focal length of 50mm (which gives 1.0× magnification) on a 35mm full-frame sensor.</p>
        <p>As seen from the photo, the height of the face in the image is 1857px (46.4%) on a horizontal photo with 4000×6000 resolution. On a vertical photo of the same resolution, the facial height is 30.95% of the full height. This gives a point of reference which can be proportionally scaled based on the given (non-full-frame) sensor size and/or focal length.</p>
        <p>To convert the reference measurement to a different focal length on a different format camera sensor, the following formula can be used:
          h = H · f / 50    (3)
where h is the new reference height of the face, f is the 35mm-equivalent focal length of the selected lens, and H is the initial reference height of the face (taken via a 50mm full-frame lens from 1 meter).</p>
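        <p>Formula (3) expressed as code (the function name is illustrative; h50 is the reference fraction of the frame height measured at 1 meter with a 50mm full-frame lens, e.g. 0.3095 for a vertical frame as measured above):</p>
        <p>// Rescales the 50mm full-frame reference height to the shooting focal length
// (given as a 35mm-equivalent value)
fun referenceFaceHeight(h50: Double, focalLength35mmEq: Double): Double =
    h50 * focalLength35mmEq / 50.0</p>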
        <p>When the reference height of the 23.9 cm face is calculated taking into account the sensor
format and the focal length, the distance D to a particular face in the frame can be calculated by
comparing the height of the face to the reference height and determining proportionally the
distance to the sensor (see formula 4).</p>
        <p>
          D = h / g    (4)
where D is the distance (in meters) to the face in the frame, g is the height of the face of interest in pixels, and h is the reference height of the face in pixels (corresponding to a distance of 1 meter).
        </p>
        <p>The method described above is simplified. In future work the accuracy can be improved by adding support for lens distortions and other factors that may influence the measurements.</p>
        <p>The second method heavily relies on metadata. The coordinates of the focus point in the metadata indicate the precise area of focus interest on the photo. Since the coordinates and boxes of all faces are available at this point, the face within the area of interest can be selected as the reference. Since the distance to this face is already available, once again from the metadata, the proportional distance to the other faces in the frame can be calculated using formula (4).</p>
        <p>The second method can be reliable if the image does not have stacked, slightly shifted faces, but since this is not guaranteed, the first method can be preferred. However, if the conditions are ideal, the second method will produce more precise results, since it uses fewer estimated measurements. There are other methods of object distance extraction that use reference targets [30]; however, such methods are not suitable for general photography.</p>
      </sec>
      <sec id="sec-3-6">
        <title>3.2.4. Calculating the ideal depth of field</title>
        <p>Depth of field (DOF) is the distance between the nearest and the farthest objects that are in acceptably sharp focus in an image.</p>
        <p>To understand whether or not a subject at a specific distance is supposed to be sharp in the photo (based on the camera and lens settings), it is required to first calculate the depth of field using the captured camera and lens settings. The formulas for calculating the distance to the front plane of the focus area and the distance to the back plane of the focus area are shown in formula (5):
          R1 = (R · f²) / (f² + K · Z · (R − f));  R2 = (R · f²) / (f² − K · Z · (R − f))    (5)
where R1 is the distance to the front edge of the critical focus plane, R is the distance of focus (can be retrieved from metadata), R2 is the distance to the back edge of the critical focus plane, f is the focal length of the lens in meters (can be retrieved from metadata), K is the f-stop of the lens (can be retrieved from metadata), and Z is the circle of confusion (can be retrieved from metadata).</p>
        <p>By subtracting R1 from R2 the depth of field can be found. However, it will be easier to use the raw R1 and R2 values for further calculations.</p>
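        <p>Formulas (4) and (5) can be sketched in Kotlin as follows (the function and type names are illustrative; units follow the definitions above):</p>
        <p>// Formula (4): the reference height h corresponds to a face 1 meter away,
// so the pixel-height ratio directly yields a distance in meters
fun distanceToFace(gPixels: Double, hReferencePixels: Double): Double =
    hReferencePixels / gPixels

data class FocusPlanes(val r1: Double, val r2: Double)

// Formula (5): r = focus distance, f = focal length (meters),
// k = f-stop, z = circle of confusion
fun focusPlanes(r: Double, f: Double, k: Double, z: Double): FocusPlanes {
    val front = r * f * f / (f * f + k * z * (r - f)) // R1
    val back = r * f * f / (f * f - k * z * (r - f))  // R2
    return FocusPlanes(front, back)
}</p>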
      </sec>
      <sec id="sec-3-7">
        <title>3.2.5. Applying sharpness detection only for faces required to be in focus</title>
        <p>To understand which faces to check for sharpness it is first required to understand the criteria
of expected portrait framing depth (EPFD).</p>
        <p>The EPFD is a generalized assumption of the maximum intended distance between the subjects when taking photos of multiple rows of people at once. For example, a photographer is taking a group portrait photo with 10 people placed in two rows. In this case, the expected portrait framing depth is within 0.8 meters, since there are two rows, each about 0.3 meters deep, plus a margin in between. For such a portrait framing depth, the photographer needs to set a particular aperture (f-stop), so that the depth of field is equal to or greater than the framing depth.</p>
        <p>The EPFD is a very subjective parameter and cannot be calculated by analyzing the photo. That is because it is extremely hard for a computer to determine whether a photo of a group of people was intended to be shot with an open aperture (to have only the front subjects in focus), or whether it was intended to have both rows of people in focus and the low aperture value was chosen by mistake.</p>
        <p>The subjective nature of this parameter leaves this part of the method to be fine-tuned by the
end user, selecting one of two options on a collection of photos before the sharpness detection:
single/couple styled photos or intended multi-row group photos.</p>
        <p>For the second option, where a group photo (composed in two or more rows) is selected, the
faces can be classified as "intended to be sharp" by the following sequence of formulas:</p>
        <p>Given the measured light value (EV) from the metadata, it is required to calculate the maximum
aperture, using which the EV value would be the same. The bigger the aperture value, the greater
the depth of field is, meaning more faces need to be in critical sharp focus. The photographer could
make the mistake of setting a low aperture, which results in only one row of people being in focus.
This is why the maximum aperture value needs to be determined and the photo needs to be
analysed as if these "ideal" settings were set in camera.</p>
        <p>
          EV = log₂(100 · K² / (I · S))    (6)
where EV is the exposure value, K is the f-stop value, I is the ISO, and S is the shutter speed.
        </p>
        <p>Given the maximum acceptable ISO value by the photographer (the greater the ISO, the more noise there is in the photo, so it is down to personal photographer preference), the maximum value of the aperture to achieve the same light value can be determined as follows:
          Kmax = √(2^EV · Imax · Smax / 100)    (7)
where Kmax is the maximum f-stop number for the pre-defined EV and ISO, Imax is the maximum ISO value acceptable by the photographer, and Smax is the maximum duration of the shutter speed:
          Smax = 1 / Fl    (8)
where Fl is the focal length used for the photo, taken from the metadata.</p>
        <p>However, there is an informal rule stating that when shooting portraits, one should not select shutter speeds slower than 0.008 of a second (1/125). This may also be an external fine-tuning setting available to the end user.</p>
        <p>Next, the maximum depth of field needs to be calculated. To achieve this, the formulas (5) can be used, inserting the maximum aperture f-stop value (Kmax). As a result, R1 (the distance to the front edge of the critical focus plane) and R2 (the distance to the back edge of the critical focus plane) with the "ideal" camera settings can be found.</p>
        <p>Given the calculated distance D to the subject's face, the criterion of in-focus intention can be formulated as follows: the face of a subject was intended to be in critical sharp focus if the distance D satisfies the condition:
          R1 ≤ D ≤ R2    (9)
        </p>
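        <p>Formulas (6), (7) and (9) can be sketched as follows (FocusPlanes is the illustrative type from the depth-of-field sketch above, here recomputed with Kmax in place of K):</p>
        <p>import kotlin.math.log2
import kotlin.math.pow
import kotlin.math.sqrt

// Formula (6): recover the exposure value from the captured settings
fun exposureValue(k: Double, iso: Double, s: Double): Double =
    log2(100.0 * k * k / (iso * s))

// Formula (7): the largest f-stop that keeps the same EV at the ISO/shutter limits
fun maxAperture(ev: Double, isoMax: Double, sMax: Double): Double =
    sqrt(2.0.pow(ev) * isoMax * sMax / 100.0)

// Formula (9): a face was intended to be in focus if it lies between the planes
fun intendedInFocus(d: Double, planes: FocusPlanes): Boolean =
    d in planes.r1..planes.r2</p>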
      </sec>
      <sec id="sec-3-8">
        <title>3.2.6. Sharpness detection only for regions containing the selected faces</title>
        <p>At this stage of the algorithm, the most valuable information has already been obtained - the understanding of which faces in the frame were most likely intended to be in focus. Having this information, the sharpness detection algorithm of choice can be applied to these regions exclusively. This will produce very accurate results, as the background and other subjects will not be evaluated.</p>
        <p>It is worth mentioning that, having this information, a whole window of possibilities for photo categorization opens up. Not only can the sharpness be evaluated, but also the state of the eyes (open/half-open/closed) and the facial expressions of the focused subjects; even the poses can be classified as appealing or not.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiment</title>
      <p>The purpose of the following experiment is to demonstrate how the Laplacian and FFT sharpness evaluation algorithms relate to different factors of the photo. Revealing these dependencies will lead to conclusions about the steps that need to be taken in order to improve the effectiveness of the algorithms.</p>
      <p>In this experiment the following set of images will be used:
1. A reference representation of a completely blurry photo
2. A reference representation of a very sharp photo in all parts (no background blurriness)
3. Photos which contain sharp and blurred parts due to the aperture effect
The first and second images are used to determine the extreme maximum and minimum sharpness results both algorithms provide.</p>
      <p>The other 7 test photos contain human subjects and have the same resolution and identical lighting conditions for optimal evaluation results. No colour or sharpness corrections were done before the experiment. All photos from the experiment are similar to Figure 1 and contain the same subjects and scenes in different configurations.</p>
      <p>The following Table 2 verbally describes the photos that were used for the experiment.</p>
      <p>The first photo was made with a very closed aperture (f22) and has an extremely "busy" foreground, which leads to extreme sharpness results from the Laplacian method. The second photo is fully out of focus and produces very low values with both algorithms. No conclusions can be made from these results yet.</p>
      <p>The photos used for the experiment (file names 4.JPG-9.JPG, among others) cover the following configurations:
• A sharp photo containing one in-focus subject which takes up 24.7% of the frame;
• A sharp photo containing one in-focus subject which takes up 37.5% of the frame;
• A sharp photo containing two in-focus subjects, high aperture (background blurriness is high), collectively taking up 38.38% of the frame;
• A sharp photo containing one in-focus and one out-of-focus subject, collectively taking up 36.09% of the frame;
• A sharp photo containing two in-focus subjects, low aperture (background blurriness is low), taking up 33.13% of the frame;
• A sharp photo containing one in-focus and one in-motion subject, collectively taking up 37.87% of the frame;
• A sharp photo containing one in-focus and one out-of-focus subject, collectively taking up 57.24% of the frame;
• A very "busy" and sharp photo, shot with a closed aperture (f22);
• A completely blurry, out-of-focus photo of the same scene as in 8.JPG.</p>
      <p>The following Table 3 presents the results of the analysis of the main 7 photos (in the Laplacian and FFT columns), alongside additional calculated parameters that will be needed for the dependency analysis. It should be noted that the Laplacian and FFT methods produce values on unrelated scales and thus should not be compared directly.</p>
      <p>The "background quantity" is the percentage of the photo that is taken up by the background. The "total subject size" is the complementary value, describing how much of the photo's area is taken up by the subjects.</p>
      <p>The next phase of the experiment utilizes parts of the improved algorithm described in the previous section. Within this experiment the dependencies which lead to unstable results were determined. One of the main parts of the described algorithm is face detection. The following test was conducted on a set of images that were previously cropped to contain only faces. Images that contained two subjects are represented by two individual cropped images.</p>
      <p>The test images are faces extracted from the frames (see Figure 5, photos by O. Dolhanenko) and the test conditions and results are stated in Table 4.</p>
      <p>The methods were implemented in Kotlin and launched on the JVM, running on macOS with a Core i9 CPU.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>By plotting the sharpness result values obtained during the experiment against the background quantity, the following chart can be obtained (see Figure 6).</p>
      <p>As seen from the chart, a clear linear trendline describes the relation of the quantity of background to the calculated sharpness (for both the Laplacian and FFT methods). This means that the more background is visible in a photo (even if the subject is in critical sharp focus), the less likely the photo is to be classified as sharp.</p>
      <p>By plotting the sharpness result values obtained during the experiment against the f-stop values (see Figure 7), the following chart can be obtained.</p>
      <p>The relation on this chart is not as obvious as the previous one, since the data set is rather
small, but even here a visible linear trendline for both Laplacian and DFT methods is noticeable.
This trend implies that the higher the f-stop value, the more likely the photo will be classified as
sharp.</p>
      <p>Having analyzed both relations, it can be stated that the results of the DFT and Laplacian methods are dependent on the scene and camera settings, which means that they are not reliable for generic automatic photo filtering. Furthermore, comparing the average Laplacian sharpness value for the subject photos (45.12) to the average sharpness value for the 2 reference photos (325.54) reveals a major difference. This difference demonstrates the errors that can appear when an incorrect reference value is selected.</p>
      <p>Table 4 lists, for each analyzed image, the expectation, the Laplacian and FFT results, their normalized values, and the processing times. The analyzed images are: 10-2.JPG, 10.JPG, 4-2.JPG, 4.JPG, 3-2.JPG, 3.JPG, 7-2.JPG, 7.JPG, 6-2.JPG, 6.JPG, 5-2.JPG, 5.JPG, 1.JPG, 2.JPG.</p>
      <p>The "Image(s) name" is the name of the analyzed photo. If the photo contained multiple
subjects, it was split into separate images and visually merged in the table.</p>
      <p>The "Expectation" is an unbiased subjective rating of image quality and usability given by the photographer, where 0 is unusable (blurry) and 1 is usable (sharp). The "Laplacian" and "FFT" columns are the actual results of sharpness evaluation. The "Laplacian normalized" and "FFT normalized" columns are conversions of the actual results to a 0-1 scale, where 1 corresponds to the maximum value in the algorithm's output. This way the normalized values of the different algorithms are comparable to the "Expectation" values. The "time" columns represent the time taken to process the image in milliseconds.</p>
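      <p>The normalization described above is a simple division by the maximum observed value, for example:</p>
      <p>// Maps an algorithm's raw outputs onto a 0-1 scale, where 1 corresponds
// to the maximum value in that algorithm's output
fun normalize(scores: DoubleArray): DoubleArray {
    val max = scores.maxOrNull() ?: return scores
    return scores.map { it / max }.toDoubleArray()
}</p>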
      <p>After the data was collected, the normalized columns were analyzed for visual trends. To fulfil
the first objective of the experiment, the sorted input values trend (the expectation) must meet
the trend of the sorted normalized output values. By viewing the plotted results (see Figure 8) it
can be stated, that both algorithms generally fulfil this requirement.</p>
      <p>Only one data point is either incorrectly classified as sharp (by both algorithms), or its subjective usability rating was defined inaccurately.</p>
      <p>The second objective of the experiment was to determine which algorithm is more stable and should be used in the actual implementation. Comparing the two charts (see Figure 8), a more linear trend line is obtained from the FFT algorithm than from the Laplacian. Moreover, the Laplacian method produces very unbalanced results.</p>
      <p>The third and final objective was to determine which algorithm works faster. Comparing the average speed of both algorithms - 17.7 ms for the Laplacian and 241.25 ms for the FFT - it can be concluded that the FFT method is about 14 times slower.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Discussions</title>
      <p>Two methods were analyzed and compared – FFT (Fast Fourier Transform) and the Variance
of Laplacian for image sharpness evaluation. During the first experiment it was determined that
both algorithms produce results that are highly dependent on the photo scene, composition and
camera settings – photos with large quantities of background (produced by a low f-stop value)
were classified as blurry.</p>
      <p>An improved algorithm was proposed which is based on subject detection [31, 32] and further
sharpness evaluation exclusively of the subject box. This way all dependencies were eliminated
and more natural evaluation results were obtained.</p>
      <p>The improved algorithm uses the metadata of the photo to simplify calculations if the relevant data "shortcuts" are available. In some cases, the metadata may contain the detected face coordinates, which greatly optimizes the performance of the algorithm. The method is built on the principle of data analysis and does not require much user input to function properly. The only parameter that cannot be obtained automatically is the intended photography style - whether the user was shooting groups of people to get everyone in focus, or intended to focus on only one or two subjects in the frame while isolating the rest of the background. Another parameter that the user may input is the maximum acceptable ISO value. This parameter has a default value but is very subjective and so depends on user preference.</p>
      <p>The goals of the experiment were to identify whether the improved algorithm results meet the
expectation and to determine which of the sharpness evaluation methods work best for the task.
As a result it was found that the improved algorithm meets the expectations and works best with
the FFT sharpness evaluation method as it produces more linear trending results. The Laplacian
method, however, is 14 times faster and may be preferred for very large datasets, where accuracy
is not as important as time efficiency.</p>
      <p>The results of the research were used to implement a software solution prototype for
automated image files sorting by subject sharpness (see Figure 9).</p>
      <p>The resulting software was configured to operate with JPG files, built primarily using Kotlin and OpenCV, and deployed and run in a macOS environment.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusions</title>
      <p>The purpose of this research was to analyze the existing methods of image sharpness evaluation and suggest improvements for the use case of sharpness evaluation of photos of people. The intended end result is an automated photo sorting solution which classifies images from sharp and usable to blurry and unusable.</p>
      <p>The comparison of the FFT and Variance of the Laplacian methods for image sharpness evaluation showed that their accuracy is affected by the photo's scene, composition, and camera settings, leading to photos with large amounts of naturally blurred background being misclassified as blurry.</p>
      <p>An enhanced algorithm leveraging subject detection for focused sharpness evaluation showed
significant improvements by removing biases and yielding more accurate results. It utilizes photo
metadata to streamline processes, especially when such metadata includes coordinates of
detected faces, enhancing performance efficiency. The method minimizes user input, requiring
only the photography style and an optional maximum acceptable ISO value to adjust for personal
preference. Final testing confirmed the algorithm's effectiveness, particularly in conjunction with
the FFT method for its linear results, though the faster Laplacian method may be chosen for large
datasets where speed trumps precision.</p>
      <p>The results of the research and experiments have been used to implement a prototype of an automation system for sorting photos by subject sharpness, detecting unwanted motion within the image, and subject blurriness.</p>
      <p>[7] Z. Liu, H. Hong, Z. Gan, J. Wang, Y. Chen, An Improved Method for Evaluating Image Sharpness Based on Edge Information, Applied Sciences 12.13 (2022): 6712. doi:10.3390/app12136712.
[8] Y. Zhang et al., Image sharpness evaluation method based on normal gradient feature, in: Proceedings of the 3rd International Symposium on Robotics &amp; Intelligent Manufacturing Technology (ISRIMT), Changzhou, China, 2021, pp. 308-314. doi:10.1109/ISRIMT53730.2021.9596808.
[9] X.Y. Yan, J. Lei, Z. Zhao, Multidirectional gradient neighborhood-weighted image sharpness evaluation algorithm, Mathematical Problems in Engineering 7864024 (2020). doi:10.1155/2020/7864024.
[10] M. Zhu, L. Yu, Z. Wang, Z. Ke, C. Zhi, Review: A Survey on Objective Evaluation of Image Sharpness, Applied Sciences 13.4 (2023): 2652. doi:10.3390/app13042652.
[11] S. Bielievtsov, I. Ruban, K. Smelyakov, D. Sumtsov, Network technology for transmission of visual information, in: Selected Papers of the XVIII International Scientific and Practical Conference "Information Technologies and Security" (ITS 2018), Kyiv, Ukraine, 2018. CEUR Workshop Proceedings, Vol-2318, pp. 160-175.
[12] K. Smelyakov, A. Chupryna, D. Sandrkin, M. Kolisnyk, Search by Image Engine for Big Data Warehouse, in: Proceedings of the 2020 IEEE Open Conference of Electrical, Electronic and Information Sciences (eStream), Vilnius, Lithuania, 2020, pp. 1-4. doi:10.1109/eStream50540.2020.9108782.
[13] K. Smelyakov, D. Karachevtsev, D. Kulemza, Y. Samoilenko, O. Patlan, A. Chupryna, Effectiveness of Preprocessing Algorithms for Natural Language Processing Applications, in: Proceedings of the 2020 IEEE International Conference on Problems of Infocommunications. Science and Technology (PIC S&amp;T), Kharkiv, Ukraine, 2020, pp. 187-191. doi:10.1109/PICST51311.2020.9467919.
[14] J. Qian, H. Zhao, J. Fu, W. Song, J. Qian, Q. Xiao, No-reference image sharpness assessment via difference quotients, Journal of Electronic Imaging 28.1 (2019). doi:10.1117/1.JEI.28.1.013032.
[15] Z. Duan, G. Li, G. Fan, An Effective Sharpness Assessment Method For Shallow Depth-Of-Field Images, in: Proceedings of the IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 2021, pp. 1449-1453. doi:10.1109/ICIP42928.2021.9506498.
[16] L. Her, X. Yang, Research of Image Sharpness Assessment Algorithm for Autofocus, in: Proceedings of the 2019 IEEE 4th International Conference on Image, Vision and Computing (ICIVC), Xiamen, China, 2019, pp. 93-98. doi:10.1109/ICIVC47709.2019.8980980.
[17] I. Kyrychenko, G. Tereshchenko, G. Proniuk, N. Geseleva, Predicate Clustering Method and its Application in the System of Artificial Intelligence, in: Proceedings of the 7th International Conference on Computational Linguistics and Intelligent Systems (COLINS-2023), 3396 (2023), pp. 395-406.
[18] Z. Han, Y. Liu, R. Xie, G. Zhai, Image Quality Assessment for Realistic Zoom Photos, Sensors 23.10 (2023): 4724. doi:10.3390/s23104724.
[19] T. Sieberth, Automatic detection of blurred images in UAV image sets, ISPRS Journal of Photogrammetry and Remote Sensing 122 (2016). doi:10.1016/j.isprsjprs.2016.09.010.
[20] E. Fry, Bridging the Gap Between Imaging Performance and Image Quality Measures, Electronic Imaging 12 (2018). doi:10.2352/issn.2470-1173.2018.12.iqsp-231.
[21] S. Triantaphillidou, Contrast sensitivity in images of natural scenes, Signal Processing: Image Communication 75 (2019): 67. doi:10.1016/j.image.2019.03.002.
[22] O. van Zwanenberg, S. Triantaphillidou, R. Jenkin, A. Psarrou, Edge Detection Techniques for Quantifying Spatial Imaging System Performance and Image Quality, in: Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 2019, pp. 1871-1879. doi:10.1109/cvprw.2019.00238.
[23] P. Marziliano, Perceptual blur and ringing metrics: application to JPEG2000, Signal Processing: Image Communication 19.2 (2004): 167. doi:10.1016/j.image.2003.08.003.
[24] Z. Wang, Image Quality Assessment: From Error Visibility to Structural Similarity, IEEE Transactions on Image Processing 13.4 (2004): 601. doi:10.1109/tip.2003.819861.
[26] W. Yang, Method of image quality assessment based on region of interest, Journal of Computer Applications.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] N. Wadhwa, Synthetic depth-of-field with a single-camera mobile phone, ACM Transactions on Graphics 37.4 (2018): 64. doi:10.1145/3197517.3201329.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] M.-J. Chen, A. C. Bovik, No-reference image blur assessment using multiscale gradient, J Image Video Proc 2011, 3 (2011). doi:10.1186/1687-5281-2011-3.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] H. R. Sheikh, Image information and visual quality, IEEE Transactions on Image Processing 15.2 (2006): 430-444. doi:10.1109/tip.2005.859378.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] W. Zhang, K. Ma, J. Yan, D. Deng, Z. Wang, Blind Image Quality Assessment Using a Deep Bilinear Convolutional Neural Network, IEEE Transactions on Circuits and Systems for Video Technology 30.1 (2020): 36-47. doi:10.1109/TCSVT.2018.2886771.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] S. Chattopadhyay, Image Sharpness Assessment Based On Local Phase Coherence, NeuroQuantology 20.5 (2022). doi:10.48047/nq.2022.20.5.nq22800.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] R. Hassen, Z. Wang, M. M. A. Salama, Image Sharpness Assessment Based on Local Phase Coherence, IEEE Transactions on Image Processing 22.7 (2013): 2798-2810. doi:10.1109/TIP.2013.2251643.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>