Estimation of Optimal Number of Rays in the Bidirectional
Photon Mapping Method
Sergey Ershov1 and Alexey Voloboy1
1
    Keldysh Institute of Applied Mathematics RAS, Miusskaya sq. 4. Moscow 125047, Russia

                Abstract
                The classic Monte-Carlo ray tracing is a powerful technique for simulating almost all effects
                in ray optics, but it may be prohibitively slow for, for example, calculation of images seen by
                a lens camera. Therefore, in practice there are often used its various modifications, in
                particular, bi-directional stochastic ray tracing with photon maps. The well-known flaw of all
                stochastic methods is their noise. The noise level, that is, the root mean square of pixel
                brightness calculated during given time, depends, besides all, on the number of rays traced
                from the light source and from the camera. The choice of the optimal parameters must provide
                the lowest noise level in a fixed time. This article is devoted to the choice of the optimal number
                of rays that minimizes the noise. It is proved that this minimal noise is in the same time
                homogeneous over the image. We produce the formulae to calculate the optimal number of
                rays from several coefficients which can be obtained from a bi-directional ray tracing of several
                auxiliary variants. It happens that this optimum is rather wide i.e. the noise level changes with
                the number of rays slowly, which allows to choose it including other factors e.g. limit this
                number to save memory.

                Keywords 1
                realistic rendering, bi-directional Monte-Carlo ray tracing, photon maps, denoising

1. Introduction
    Nowadays light simulation is widely used in realistic computer graphics and for design of new
materials and optical systems [1]. If wave effects may be neglected in simulation, then stochastic ray
tracing methods are preferable. This group of methods includes the simulation of light transport using
the Metropolis method [2] and stochastic ray tracing [3]. The classical forward Monte Carlo ray tracing
(which starts tracing from the light source) is inefficient for image generation, and therefore it is
replaced by bidirectional modifications of it [4–6]. One of the most popular methods among them is the
bidirectional stochastic ray tracing with photon maps (BDPM – Bidirectional Photon Mapping) [5, 7].
A well-known flaw of all stochastic methods is that they produce noisy results. Therefore, the noise
reduction problem always has primary importance. It is considered in many works, e.g., see [8–10].
    The level of noise in BDPM mainly depends on the random scattering of the forward and backward
rays, on the choice of the vertex for their merging (or, in other words, on the vertex of the camera ray
trajectory at which luminance is estimated from photon maps), and also on the number of forward and
backward rays traced in one iteration step. The majority of studies is devoted to the first two issues
(e.g., [9–12]), and the number of rays got less attention. However, this is an important factor, and it
often happens that the number of forward rays is already redundant and its further increase only
increases the computation time but does not reduce the noise. In other cases, it may happen that the
number of forward rays is indeed critical, while the number of traced backward rays is redundant. It is
usually difficult to predict which fraction of the forward and backward rays is optimal: however, a good
choice can speed up the computations by several times.


GraphiCon 2021: 31st International Conference on Computer Graphics and Vision, September 27-30, 2021, Nizhny Novgorod, Russia
EMAIL: ersh@gin.keldysh.ru (S. Ershov); voloboy@gin.keldysh.ru (A. Voloboy)
ORCID: 0000-0002-5493-1076 (S. Ershov); 0000-0003-1252-8294 (A. Voloboy)
             ©️ 2021 Copyright for this paper by its authors.
             Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
             CEUR Workshop Proceedings (CEUR-WS.org)
   In this paper, we propose a method which allows to estimate the optimal number of rays. A general
rule determining noise in the BDPM was derived in [8].
   One must realize that the rate of convergence of stochastic methods like BDPM is not actually
determined the variance of the image calculated in one iteration (or another number of them), but the
variance after the fixed time of calculations. Indeed, if the variance of one iteration is decreases twice
while the time to calculate it increases tenfold, the situation obviously worsens.
   Therefore, calculation of the optimum i.e. those number of camera and light paths that provide “the
best image” (or the fastest convergence) we must account for both the variance from one iteration and
the time spent on one iteration.
   In this paper we produce the simple approximating law for that time and, combining it with the law
of dependence of the variance from one iteration on the number of rays, we derive the formulae for the
optimal number of rays. It happens that those distribution of camera rays per pixel that makes the noise
homogeneous is just the one which makes it minimal. This is especially convenient because eliminates
the problem of which spatial aggregate to treat as the “noise amplitude”.

2. Expression for the noise level
   Generally the number of camera rays traced through a pixel may depend on this pixel, i.e. vary across
the image. The variance of the contribution of one iteration of BDPM can be calculated individually for
each pixel and is [11, 13]

            𝐶(𝑝)       1 − 𝑁𝐹−1           (1 − 𝑁𝐵−1 (𝑝))
    𝑉(𝑝) =           +            𝐵(𝑝) +                 𝐹(𝑝)𝐶(𝑝) ≡ ⟨⟨𝐶 2 ⟩⟩(𝑝) − ⟨⟨𝐶⟩⟩2 (𝑝)𝐵(𝑝)
           𝑁𝐹 𝑁𝐵 (𝑝)    𝑁𝐵 (𝑝)                  𝑁𝐹
                   ≡ ⟨⟨𝐶⟩2𝐹 ⟩𝐵 (𝑝) − ⟨⟨𝐶⟩⟩2 (𝑝)𝐹(𝑝) ≡ ⟨⟨𝐶⟩2𝐵 ⟩𝐹 (𝑝) − ⟨⟨𝐶⟩⟩2 (𝑝)

where 𝑁𝐹 is the number of light paths per iteration, 𝑁𝐵 (𝑝) is the number of camera paths through the
pixel p per iteration, 𝐶(𝑋 (𝐹) , 𝑋 (𝐵) ) is the contribution to the pixel’s luminance from merging of the light
path 𝑋 (𝐹) with camera path 𝑋 (𝐵) , its average ⟨⟨𝐶⟩⟩ obviously equals the limiting (exact) pixel’s
luminance 𝐿(𝑝), and ⟨⋅⟩𝐹 and ⟨⋅⟩𝐵 denote the averaging over (the ensemble of) light and camera paths,
respectively.
   The variance of the image obtained during time Т is obviously
                                                              𝜏
                                              𝑉𝑇 (𝑝) ≡ 𝑉(𝑝)
                                                              𝑇

where 𝜏 is the (average) time spent on one iteration. Therefore the value that determines convergence
rate and which, therefore, must be minimized to obtain the best image, is

                                            𝑉(𝑝) ≡ 𝑉(𝑝)𝜏                                                     (1)

    Let’s estimate the average time per iteration. The iteration in BDPM consists of three successive
parts:
    ● BMCRT (from camera);
    ● FMCRT (from light source);
    ● “Merging” of the above sub-paths, when each light path is (attempted to) join with each camera
    path.
    The total iteration time is the sum of these partial timings denoted as 𝜏𝐵 , 𝜏𝐹 , 𝜏𝑚𝑒𝑟𝑔𝑒 respectively.
The time spent on the first parts is obviously linear in the number of rays. For the third part, it is linear
in the number of their pairs. The general functional form is then


           𝜏𝐵 = ∑       𝛽(𝑝)𝑁𝐵 (𝑝),        𝜏𝐹 = 𝛼𝑁𝐹 ,       𝜏𝑚𝑒𝑟𝑔𝑒 = 𝑁𝐹 ∑        𝛾(𝑝)𝑁𝐵 (𝑝)                  (2)
                  𝑝                                                         𝑝
where 𝛽(𝑝) has the sense of the average time spent on tracing of one camera ray through the pixel p, 𝛼
has the sense of the average time spent on tracing of one light ray and 𝛾(𝑝) is the average time spent
on “merging” one pair of light ray and one camera ray (through pixel p). This 𝛾(𝑝) can depend on the
pixel because for different pixels the camera rays go different parts of the scene where the density of
light rays (and thus the probability of the merging) is different.
   Summing, we have


           𝜏 = 𝜏𝐵 + 𝜏𝐹 + 𝜏𝑚𝑒𝑟𝑔𝑒 = 𝛼𝑁𝐹 + ∑             𝛽(𝑝)𝑁𝐵 (𝑝) + 𝑁𝐹 ∑       𝛾(𝑝)𝑁𝐵 (𝑝)                 (3)
                                                  𝑝                     𝑝

   Substituting (3) in (1) we see that the value to be minimized to obtain the best image is

                 𝐶(𝑝)       1 − 𝑁𝐹−1        1 − 𝑁𝐵−1 (𝑝)
        𝑉(𝑝) = (          +          𝐵(𝑝) +              𝐹(𝑝)) (𝛼𝑁𝐹 + ∑                  𝛽(𝑝)𝑁𝐵 (𝑝)
                𝑁𝐹 𝑁𝐵 (𝑝)    𝑁𝐵 (𝑝)             𝑁𝐹
                                                                                 𝑝


                         + 𝑁𝐹 ∑        𝛾(𝑝)𝑁𝐵 (𝑝))
                                𝑝


   Usually the number of rays is much greater than 1, so we can neglect the terms 𝑁𝐵−1 (𝑝) и 𝑁𝐹−1 :

                1    𝐶(𝑝)           𝐹(𝑝)
   𝑉(𝑝) = (         (     + 𝐵(𝑝)) +     ) (𝛼𝑁𝐹 + ∑                𝛽(𝑝)𝑁𝐵 (𝑝) + 𝑁𝐹 ∑        𝛾(𝑝)𝑁𝐵 (𝑝))
              𝑁𝐵 (𝑝) 𝑁𝐹              𝑁𝐹
                                                           𝑝                         𝑝


3. Minimization of noise
    Usually the noise is not homogeneous over the image, i.e. it depends on the pixel. It is not very
convenient because it is difficult to decide, which is better to have less noise on the floor yet stronger
in the ceiling or vice versa? Therefore usually we use a single measure of noise, e.g. the maximum over
the image, which better corresponds to the visual perception of “noisy” or “clear” images.
    Therefore, what we need to minimize is 𝑉(𝑝) .
    It happens that (it will be so should 𝑁𝐵 (𝑝) be continuous instead of integers) in this case the noise
is homogeneous i.e.

                                             𝑉(𝑝) = 𝑉(𝑝) ≡ 𝑉

   Indeed, let us suppose the opposite: that in some pixel 𝑝∗ the noise is below the maximum i.e.
𝑉(𝑝∗ ) < 𝑉(𝑝) . Now let us decrease (infinitesimally as now we treated it as a continuous parameter!)
𝑁𝐵 (𝑝∗ ). Obviously, in all the other pixels but 𝑝∗ the noise will decrease after it because of decrease of
the time spent on one iteration 𝜏:

                                                             𝛿𝜏
                                              𝛿𝑉(𝑝) = 𝑉(𝑝)
                                                             𝜏

while 𝑉(𝑝) for these pixels won’t change. In the same time in pixel 𝑝∗ the change of noise level will be

                                       1      𝐶(𝑝∗ )                               𝛿𝜏
                     𝛿𝑉(𝑝∗ ) = −              (      + 𝐵(𝑝∗ )) 𝜏𝛿𝑁𝐵 (𝑝∗ ) + 𝑉(𝑝∗ )
                                    𝑁𝐵2 (𝑝∗ )  𝑁𝐹                                  𝜏
  In view of our supposition 𝑉(𝑝∗ ) < 𝑉(𝑝) the maximum is achieved outside 𝑝∗ where for each pixel
              𝛿𝜏
𝛿𝑉(𝑝) = 𝑉(𝑝) . Therefore, 𝛿𝑉(𝑝) = 𝑉(𝑝) and
                𝜏


                                                    𝛿𝜏     1   𝐶(𝑝∗ )
           𝛿𝑉(𝑝) − 𝛿𝑉(𝑝∗ ) = (𝑉(𝑝) − 𝑉(𝑝∗ ))           + 2    (       + 𝐵(𝑝∗ )) 𝜏𝛿𝑁𝐵 (𝑝∗ )
                                                    𝜏 𝑁𝐵 (𝑝∗ ) 𝑁𝐹

   By supposition, 𝑉(𝑝) − 𝑉(𝑝∗ ) > 0 and since 𝛿𝑁𝐵 (𝑝∗ ) < 0, 𝛿𝜏 < 0, then

                                            𝛿𝑉(𝑝) − 𝛿𝑉(𝑝∗ ) < 0

    In other words, after our change 𝑉(𝑝) − 𝑉(𝑝) decreases, i.e. the distribution of noise becomes
more homogeneous. Simultaneously the maximum of noise decreases i.e. we approach the optimum.
    Therefore, if the distribution of noise is not homogeneous, then varying the number of camera rays
(through those pixels where the noise is below the maximum) we can make it more optimal (smaller).
In other words, an inhomogeneous distribution cannot be the optimal (since it admits improvement).
Therefore, the optimal noise distribution is homogeneous.
    In fact the limitation that 𝑁𝐵 (𝑝) must be integer distort this simple solution, but in case 𝑁𝐵 (𝑝) is
large enough, the effect must be weak. One may not though hope to reach the homogeneous distribution
with small number of camera rays, so it is better to have it large enough so that it be at least 3 in the
“worse” pixels (recall that the roundoff error is below half unit!).
    So we should estimate the number of camera rays that yields a homogeneous distribution of noise.
                                                          1
For the sake of simplicity let us consider the case when 𝐹(𝑝) is negligible (as compared to the other
                                                         𝑁𝐹
noise components) which for the case of BDD>0 is if not always then very frequently. Then

                 1    𝐶(𝑝)
      𝑉(𝑝) =         (     + 𝐵(𝑝)) (𝛼𝑁𝐹 + ∑             𝛽(𝑝)𝑁𝐵 (𝑝) + 𝑁𝐹 ∑           𝛾(𝑝)𝑁𝐵 (𝑝))         (4)
               𝑁𝐵 (𝑝) 𝑁𝐹
                                                   𝑝                       𝑝


and one can see that the homogeneous distribution of noise is achieved when

                                      𝐶(𝑝)            𝐶(𝑝)                     𝑁𝐵
                    𝑁𝐵 (𝑝) = 𝑐𝑜𝑛𝑠𝑡 × (     + 𝐵(𝑝)) = (     + 𝐵(𝑝))                                      (5)
                                       𝑁𝐹              𝑁𝐹                   𝐶
                                                                          𝐵+𝑁
                                                                                    𝐹

where 𝑛𝑥 , 𝑛𝑦 denote the image size in pixels and

                                   1                            1
                            𝐵≡         ∑       𝐵(𝑝),   𝐶≡           ∑   𝐶(𝑝)
                                 𝑛𝑥 𝑛𝑦                        𝑛𝑥 𝑛𝑦
                                        𝑝                          𝑝


   Now we have just two free parameters: 𝑁𝐹 and the scale factor 𝑁𝐵 for the number of camera rays
(notice the actual average number of camera rays per pixel will deviate from it because of rounding)
which must be varies to minimize the (now homogeneous!) noise.


4. The optimal 𝑵𝑭 and 𝑵𝑩

   Substituting (5) in (4) and writing 𝑉 instead of 𝑉(𝑝) because now the noise is homogeneous, one
has
                                     1        𝑁𝐹   1
                           𝑉 = 𝛼𝐶̂      + 𝛼𝐵̂    +   ̂ + 𝑁𝐹 𝛾𝐵
                                                     𝛽𝐶         ̂ + 𝛾𝐶
                                                            ̂ + 𝛽𝐵  ̂
                                     𝑁𝐵       𝑁𝐵 𝑁𝐹

where the overhat denotes the sum:


                                            𝑓̂ ≡ ∑      𝑓(𝑝)
                                                  𝑝
     One easily finds that for the given 𝑁𝐵 , minimum in 𝑁𝐹 is achieved for

                                                    ̂
                                                   𝛽𝐶
                                         𝑁𝐹 =                                                      (6)
                                                √𝛼𝐵 ̂
                                                 𝑁𝐵 + 𝛾𝐵

     Denoting

                                                𝛼𝐵
                                             𝜂≡√      ̂
                                                    + 𝛾𝐵
                                                 𝑁𝐵

we can write this minimum as

                   ̂          ̂ √𝛽𝐶̂
               𝐶 √𝛽𝐶          𝛾𝐵              𝐶             ̂
                                                           𝐶𝛾𝐵
          ̂)
          2
 𝑉 = (𝜂 − 𝛾𝐵     +         ̂+
                       + 𝜂√𝛽𝐶          ̂ + 𝛾𝐶
                                     + 𝛽𝐵               ̂−
                                           ̂ = 𝜂 2 + 2𝜂√𝛽𝐶       ̂ + 𝛾𝐶
                                                               + 𝛽𝐵  ̂
               𝐵   𝜂             𝜂            𝐵             𝐵
             (       )

   Obviously, this is a monotone increasing function of 𝜂 and thus a monotone decreasing function of
𝑁𝐵 . Therefor there is no extremum in 𝑁𝐵 and formally the greater this value, the better. However too
many rays per pixels is bad as concerning memory etc. Therefore it is reasonable to take such 𝑁𝐵 that
the noise is only (1 + 𝜖), where 𝜖 is a small number, times the limiting noise value for 𝑁𝐵 → ∞ i.e.

                𝐶 2          𝐶
                          ̂ − 𝛾𝐵
                  𝜂 + 2𝜂 √𝛽𝐶       ̂ + 𝛾𝐶
                               ̂ + 𝛽𝐵  ̂ = (1 + 𝜖) (2√𝛾𝐵 ̂ + 𝛽𝐵
                                                      ̂ √𝛽𝐶  ̂ + 𝛾𝐶
                                                                 ̂)
                𝐵            𝐵
or
                  𝐶 2           𝐶
                           ̂ − ( 𝛾𝐵
                    𝜂 + 2𝜂√𝛽𝐶           ̂ + 𝛾𝐶
                                  ̂ + 𝜖(𝛽𝐵  ̂ ) + 2(1 + 𝜖)√𝛾𝐵 ̂) = 0
                                                           ̂ √𝛽𝐶
                  𝐵             𝐵

which is achieved for
               −√𝛽𝐶   ̂ + 𝐶 (𝐶 𝛾𝐵
                 ̂ + √𝛽𝐶             ̂ + 𝛾𝐶
                               ̂ + 𝜖(𝛽𝐵  ̂ ) + 2(1 + 𝜖)√𝛾𝐵 ̂)
                                                        ̂ √𝛽𝐶
                          𝐵 𝐵
          𝜂=
                                                 𝐶
                                                 𝐵

                                                             ̂ ̂         ̂ ̂
                                   ̂      ̂   𝐶 ̂          𝐶 𝛽𝐵 + 𝛾𝐶 + 2√𝛾𝐵√𝛽𝐶
                                 √      √      √
                                − 𝛽𝐶 + ( 𝛽𝐶 + 𝐵 𝛾𝐵) √1 + 𝜖 𝐵                2
                                                                 ̂    𝐶 ̂
                                                              (√𝛽𝐶 + 𝐵 √𝛾𝐵)
                            =
                                                                𝐶
                                                                𝐵
                                        ̂ ̂         ̂ ̂
                                      𝜖 𝛽𝐵 + 𝛾𝐶 + 2√𝛾𝐵√𝛽𝐶
                               ̂+
                            ≈ √𝛾𝐵
                                      2
                                             ̂ + 𝐶 √𝛾𝐵
                                           √𝛽𝐶       ̂
                                                  𝐵

(We naturally discarded the negative root) or

           𝛼𝐵̂                                             𝛼𝐵
 𝑁𝐵 =            =
              ̂
        𝜂 2 − 𝛾𝐵                                                                           2

                           ̂ + √𝛽𝐶
                                ̂+      𝐶 𝐶̂        ̂ + 𝛾𝐶
                                                        ̂ ) + 2(1 + 𝜖)√𝛾𝐵 ̂)
                                                                       ̂ √𝛽𝐶
                         −√𝛽𝐶                𝛾𝐵 + 𝜖(𝛽𝐵
                                        𝐵 (𝐵
                                                                                                 ̂
                                                                                               − 𝛾𝐵
                                                       𝐶
                                                       𝐵                                              (7)
                     (                                                                 )
                                ̂ + 𝐶 √𝛾𝐵
                           𝛼𝐵 (√𝛽𝐶     ̂)
                     1              𝐵
                 ≈
                     𝜖
                             ̂ + 𝛾𝐶
                          ̂ (𝛽𝐵
                         √𝛾𝐵     ̂ + 2√𝛾𝐵 ̂)
                                       ̂ √𝛽𝐶


   Substituting it in (6) we obtain the optimal number of the light (forward) rays:

                                                 ̂
                                                √𝛽𝐶      ̂
                                                        𝛽𝐶                                            (8)
                                       𝑁𝐹 =           ≈√
                                                 𝜂       ̂
                                                         𝛾𝐵

5. How to estimate the timings
    The value of 𝛼, i.e. time spent on average on one light (forward) ray is rather easy to measure. We
trace some large number of FMCRT rays 𝑁𝐹 , measure the time spent and divide it by 𝑁𝐹 .
    But 𝛽(𝑝) and 𝛾(𝑝) are more difficult to obtain. In principle, we can perform a series of BMCRT
simulations for single pixel each, tracing a large number of rays through this pixel and then dividing
the time spent by their number. This is expensive, though, because for an accurate measurement of time
requires that it be at least milliseconds, while there usually is millions of pixels.
    The value of 𝛾(𝑝) also can be measured straightforwardly, but this is even more difficult.
    Happily, we do not need all this because the expressions (7) and (8) include 𝛽(𝑝) and 𝛾(𝑝) only in
the following combinations
          ∑     𝛽(𝑝)𝐵(𝑝),      ∑        𝛾(𝑝)𝐵(𝑝),     ∑     𝛽(𝑝)𝐶(𝑝),       ∑      𝛾(𝑝)𝐶(𝑝)
           𝑝                       𝑝                   𝑝                     𝑝
   By definition, the two former are BMCRT time and merging time when there are 𝐵(𝑝) camera rays
per pixel and one FMCRT ray. The two latter are for 𝐶(𝑝) camera rays per pixel.
   These values can be measured easily just by doing a few iterations of bi-directional ray tracing for
some specially chosen 𝑁𝐵 (𝑝)and some (more or less arbitrary) medium 𝑁𝐹 about, say, 10000.
   Namely, let us take 𝑁𝐵 (𝑝) = 𝑛𝐵 × 𝐵(𝑝) where the constant 𝑛𝐵 is chosen so that the average number
of rays per pixels be about, say, 10. Then we measure the time spent on the backward ray tracing 𝜏𝐵 ,
on the forward ray tracing 𝜏𝐹 and time spent on merging the paths 𝜏𝐶 . In view of (2),

                         𝜏𝐹                         𝜏𝐵                    𝜏𝑚𝑒𝑟𝑔𝑒
                    𝛼=      ,∑         𝛽(𝑝)𝐵(𝑝) =      ,∑   𝛾(𝑝)𝐵(𝑝) =
                         𝑁𝐹                         𝑛𝐵                     𝑛𝐵 𝑁𝐹
                               𝑝                       𝑝


   Similarly, choosing 𝑁𝐵 (𝑝) = 𝑛𝐶 × 𝐶(𝑝) we can measure the two remaining sums.

6. Results
   We use the famous Cornell Box with an isotropic point light source slightly below the center of the
ceiling. All scene surfaces are gray Lambert with albedo kd=50%. Both direct and indirect illumination
was taken from photon maps. “Diffuse depth” i.e. the maximal allowed number of diffuse events for
camera ray [11] was BDD=1. Scene image is shown in Figure 1.


Figure 1: Camera image of the test scene.


   There are two variants which differ in the radius of integration sphere R: “large” one equals 0.0083
of the scene size, and the “small” is 0.0015 of the scene size. Naturally, one can expect that smaller
radius requires more FMCRT rays.
   Calculation of 𝐵̂ and 𝐶̂ was done by the usual bi-MCRT, using radius 0.0083 of the scene size,
10000 iteration had been done with 𝑁𝐹 = 10000 light paths and 𝑁𝐵 (𝑝) = 25 camera rays per pixel.
The matrices 𝐵(𝑝) and 𝐶(𝑝) were calculated as described in [13]. The sums are presented in Table 1.

Table 1
Averages of the BMCRT term and cross-term for the two test variants
                                   R = 0.0083 of scene size              R = 0.0015 of scene size
                𝐵                            16.47                                16.47
                𝐶                            2599                                 80556
    Calculation of time-related terms was done like described in Section 5 for 10000 light paths and
average of 50 camera paths per pixel. Obviously the pure BMCRT and FMCRT terms are independent
of radius. Since measurement of time is not very accurate (because of loading by background processes,
adaptive changing of CPU clock etc), the figures are rounded (Table 2).

Table 2
Time-related sums for the two test variants
                                      R = 0.0083 of scene size             R = 0.0015 of scene size
                𝛼                            6.35∙10–7                            6.35∙10–7
               ̂
               𝛽𝐵                               1.75                                 1.75
               ̂
               𝛽𝐶                               247                                 9000
               ̂
               𝛾𝐵                             3.3∙10–5                             1.3∙10–6
               ̂
               𝛾𝐶                             5.7∙10–3                             6.5∙10–3

   Approximation of the iteration time (3) for the number of camera rays given by (5) becomes

                                       ̂ + 𝑁𝐹 (𝛽𝐵
                                       𝛽𝐶      ̂ + 𝛾𝐶̂ ) + 𝑁𝐹2 𝛾𝐵
                                                               ̂
                          𝜏 = 𝛼𝑁𝐹 + 𝑁𝐵                                                        (9)
                                               𝐶 + 𝑁𝐹 𝐵
   Tables 3 and 4 presents comparison of this approximation of the real iteration time with the
coefficients taken from Table 2.

Table 3
Iteration time for radius = 0.0083 of scene size. The first number is real value, the second one is
calculated according to (9).
                                              𝑁𝐵 = 5                           𝑁𝐵 = 50
          𝑁𝐹 = 1000                           0.71 vs 0.53                         5.3 vs 5.3
          𝑁𝐹 = 3000                           0.79 vs 0.56                         5.5 vs 5.6
          𝑁𝐹 = 10000                          0.86 vs 0.63                         6.3 vs 6.3

Table 4
Iteration time for radius – 0.0015 of scene size. The first number is real value, the second is from (9).
                                                𝑁𝐵 = 5                              𝑁𝐵 = 50
          𝑁𝐹 = 10000                          0.78 vs 0.54                        5.3 vs 5.4
          𝑁𝐹 = 60000                          0.82 vs 0.56                        5.85 vs 5.6
          𝑁𝐹 = 200000                         0.98 vs 0.61                        6.3 vs 6.1

   One can see that approximation is quite good for large 𝑁𝐵 but not for small 𝑁𝐵 . Mainly this is
because (9) did not include rounding of the number of camera rays which increase 𝑁𝐵 (𝑝) and thus the
real run time is larger. Finally the optimal numbers of rays are in Table 5.
   REMARK. Rather large times are because we did not optimized the calculations and did not use
SIMD and even multithreading.

Table 5
The number of rays for the two test variants
                                      R = 0.0083 of scene size             R = 0.0015 of scene size
        𝑁𝐵 for 𝜖 = 0.1                         0.115                                 3.26
        𝑁𝐵 for 𝜖 = 0.01                         1.6                                  43.7
        𝑁𝐹 for 𝑁𝐵 = 5                          2653                                 51153
        𝑁𝐹 for 𝑁𝐵 = 50                         2727                                 77224
    Obviously, the small estimates for 𝑁𝐵 is rather absurd since it yields less than one ray per pixel
everywhere. Small 𝑁𝐵 (𝑝) results in strong rounding and the error of rounding which can reach ±0.5 is
a serious distortion when 𝑁𝐵 (𝑝) is about 3. So we used 𝑁𝐵 = 5 as the low bound. As explained above,
there is no formal optimum i.e. the greater the better, but too large values result in unjustified memory
consumption. So for we adopted 𝑁𝐵 = 50 for the high bound. The distribution of 𝑁𝐵 (𝑝) for low and
high bounds is presented in Figure 2. The effect of this distortion is maximal where 𝑁𝐵 (𝑝) is small, i.e.
in the blue image areas, mainly the ceiling and the small box. Later we shall see that it is these areas
where the noise deviates from the mean value.


Figure 2: The distribution of 𝑁𝐵 (𝑝) for larger radius, 𝑁𝐹 = 3000 and 𝑁𝐵 = 5 (left) and for 𝑁𝐵 = 50
(right). The actual average is 5.6 and 54, respectively, because rounding to integers increases the
values as compared to (5).


Table 6
Noise invariant √𝑉 for the test variant with larger radius of integration sphere = 0.0083 of scene size
                                                𝑁𝐵 = 5                            𝑁𝐵 = 50


           𝑁𝐹 = 1000


                                    walls: 1.75; ceiling: 1.8; small box: 0.6   walls: 1.4; ceiling: 1.4; small box: 1.4


           𝑁𝐹 = 3000


                                    walls: 1.7; ceiling: 1.1; small box: 0.37   walls: 1.4; ceiling: 1.4; small box: 1.0


          𝑁𝐹 = 10000


                                    walls: 1.6; ceiling: 0.7; small box: 0.22   walls: 1.45; ceiling: 1.5; small box: 0.6
    As explained in Section 2, what determines the speed of convergence is the product (1) of the
variance in one iteration and 𝜏 is time per iteration. We produce its root √𝑉 (notice it relates to the
absolute (not relative!) stochastic error) and visualize it in Table 6 (for larger radius of integration
sphere) and Table 7 (for smaller radius). According to toolbar the red color corresponds to value 2,
yellow color to 1 and blue color to 0. The middle row is for 𝑁𝐹 close to optimal.
    One can see that the noise image roughly has three distinct areas: walls of the main box, ceiling and
the small box, the noise being nearly homogeneous throughout each area. The corresponding values are
printed below each image for quantitative comparison.
    The areas where the noise considerably deviates from the mean level coincide with areas where
𝑁𝐵 (𝑝) is below 3 or 5, see Figure 2. Meanwhile prediction of noise level (4) itself is quite good as it
can be seen from comparison of images in Figure 3, so the inhomogeneity is due to the deviation of the
integer 𝑁𝐵 (𝑝) from the continuous values (5):


Figure 3: The distribution of √𝑉(𝑝) for 𝑁𝐵 = 50 and 𝑁𝐹 = 3000: actual (left) and prediction (4) (right).

Table 7
Noise invariant √𝑉 for the test variant with larger radius of integration sphere = 0.0015 of scene size
                                                 𝑁𝐵 = 5                            𝑁𝐵 = 50


         𝑁𝐹 = 10000


                                  walls: 1.85; ceiling: 3.0; small box: 1.0     walls: 1.7; ceiling: 1.7; small box: 1.6


         𝑁𝐹 = 60000


                                  walls: 1.7; ceiling: 1.3; small box: 0.45   walls: 1.44; ceiling: 1.44; small box: 1.2


         𝑁𝐹 = 200000


                                  walls: 1.8; ceiling: 0.9; small box: 0.3    walls: 1.47; ceiling: 1.46; small box: 0.75
   The situation is qualitatively similar to the case of larger radius (Table 6). Again one sees the same
three distinct areas: walls of the main box, ceiling and the small box, the noise being nearly
homogeneous throughout each area. The corresponding values are printed below each image for
quantitative comparison. Again the areas where the noise level deviates from mean are those where
𝑁𝐵 (𝑝) is below 2 or 3 (cf. Figure 2) and so rounding to integers deviated them from (5).

7. Conclusion
   From comparison of the calculated variants one can see that
   1. Noise is the more homogeneous the larger the 𝑁𝐵 (natural: because of the rounding 𝑁𝐵 (𝑝)
   becomes less essential) and the larger the ratio 𝑁𝐵 /𝑁𝐹 (which looks strange).
   2. The noise for the most of the area is not sensitive to the 𝑁𝐵 and 𝑁𝐹 . In other words, the optimum
   is “very wide” and deviating from it even threefold does not change the mean noise level much. This
   is good in many aspects because in real simulation of complex scenes it may be difficult to find the
   exact values of the number of rays (as they use coefficients that must be ray-traced themselves).
   Also time per iteration is difficult to measure as it is a random value and additionally it may have
   regular trends because of repeating caching etc.
   3. The areas where the noise considerably deviates from the mean level coincide with areas where
   𝑁𝐵 (𝑝) is below 2 or 3, see Figure 2. Therefore it is reasonable to choose 𝑁𝐵 (which all the same
   does not have a precise optimum value) large enough so that 𝑁𝐵 (𝑝) exceeds, say, 3 in the most of
   the image.
   This paper operates the absolute noise, while sometimes it is better to work with the relative noise
(divided by the luminance in this pixel). Our approach can be easily adapted to this case as well.

8. Acknowledgements
   We should like to thank Elissey Birukov for his help with computations.

9. References
[1] D. Zhdanov, V. Galaktionov, A. Voloboy, A. Zhdanov, A. Garbul, I. Potemin, V. Sokolov,
    Photorealistic rendering of images formed by augmented reality optical systems, Programming and
    Computer Software 44 (2018) 213–224. doi:10.1134/S0361768818040126.
[2] M. Sik, J. Krivanek, Survey of Markov Chain Monte Carlo methods in light transport simulation,
    IEEE Transactions on Visualization and Computer Graphics 26(4) (2018) 1821–1840.
[3] M. Pharr, G. Humphreys, Physically Based Rendering. Second Edition: From Theory To
    Implementation, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2010.
[4] N. Dodik, Implementing probabilistic connections for bidirectional path tracing in the Mitsuba
    Renderer 2017 URL: https://www.cg.tuwien.ac.at/research/publications/2017/dodik-2017-pcbpt
[5] H.W. Jensen, P. Christensen, High quality rendering using ray tracing and photon mapping, in:
    ACM SIGGRAPH 2007 Courses, ser. SIGGRAPH ’07. ACM, New York, NY, USA, 2007.
    doi:10.1145/1281500.1281593
[6] E. Veach, A dissertation: Robust Monte-Carlo methods for light transport simulation, 1997. URL:
    http://graphics.stanford.edu/papers/veach_thesis/thesis.pdf
[7] J. Vorba, Bidirectional photon mapping, in:Proceedings of CESCG 2011: The 15th Central
    European Seminar on Computer Graphics. Prague: Charles University, 2011, pp. 25–32. URL:
    https://cgg.mff.cuni.cz/~jaroslav/papers/2011-bdpm/vorba2011-bdpm.pdf
[8] S.V. Ershov, D.D. Zhdanov, A.G. Voloboy, Estimation of noise in calculation of scattering
    medium luminance by MCRT, Mathematica Montisnigri 45 (2019) 60–73.
    doi:10.20948/mathmontis-2019-45-5
[9] I. Georgiev, J. Krivánek, T. Davidovic, P. Slusallek, Light transport simulation with vertex
     connection and merging, ACM Trans. Graph., 31(6) (2012) 192:1–192:10.
     doi:10.1145/2366145.2366211
[10] T. Hachisuka, J. Pantaleoni, H.W. Jensen, A path space extension for robust light transport
     simulation, ACM Trans. Graph., 31(6) (2012) 191:1–191:10.
[11] S.V. Ershov, A.G. Voloboy, Calculation of MIS weights for bidirectional path tracing with photon
     maps in presence of direct illumination, Mathematica Montisnigri 48 (2020) 86–102.
     doi:10.20948/mathmontis-2020-48-8
[12] M. Sbert, V. Havran, and L. Szirmay-Kalos, Multiple importance sampling revisited: breaking the
     bounds, EURASIP Journal on Advances in Signal Processing, 15 (2018) 1–15.
[13] S.V. Ershov, E.D. Birukov, A.G. Voloboy, V.A. Galaktionov, Noise Dependence on the Number
     of Rays in Bidirectional Stochastic Ray Tracing with Photon Maps, Programming and Computer
     Software 47(3) (2021) 194-200. doi:10.1134/S036176882103004X