
Direct calculation of the optimal weight for MIS

Keldysh Institute of Applied Mathematics, Moscow, Russia; National Research University of Information Technologies, Mechanics and Optics, St. Petersburg, Russia

Sergey V. Ershov, PhD, senior researcher at the Keldysh Institute of Applied Mathematics of RAS


Monte-Carlo ray tracing is nowadays the standard approach to lighting simulation and generation of realistic images. A widely used method of noise reduction in Monte-Carlo ray tracing is combining different means of sampling, known as Multiple Importance Sampling (MIS). In bi-directional Monte-Carlo ray tracing with photon maps (BDPM) the join paths are obtained by merging camera and light sub-paths. Since several light paths are checked against the same camera path and vice versa, the join paths obtained are not statistically independent. Thus the noise in this method does not obey the laws that hold for simple classic Monte-Carlo with independent samples, and the MIS weights that minimize that noise must correspondingly be calculated differently. In this paper we calculate these weights for a simple model scene by directly minimizing the noise of the calculation. This is a pure direct numerical minimization that does not involve any doubtful hypotheses or approximations. We show that the weights obtained are qualitatively different from those calculated from the classic “balance heuristic” for Monte-Carlo with independent samples. They depend on the distances in the scene, and not only on the scattering properties of the surfaces and the distribution of light source emission.

Keywords: Monte-Carlo ray tracing, bi-directional ray tracing, photon maps, reduction of noise, multiple importance sampling

1. Introduction

A powerful method of solving the rendering equation is Monte Carlo ray tracing (MCRT). It is widely used in the calculation of global illumination [1, 2]. Its main problem is noise, which strongly depends on the method of generating random points. Therefore many papers have been devoted to the optimal choice of the probability distribution of ray scattering [3–9]. One of the powerful approaches here is the so-called Multiple Importance Sampling (MIS). Its idea is to generate several random samples (rays) according to different “strategies”, i.e. probability distributions, and then sum their contributions to the image luminance with weights.

The mathematics behind it was developed in the famous thesis by E. Veach [3], where a theorem was proved for several simple schemes of weight calculation: the resulting noise is close to its minimal value. This theorem applies to the classic MCRT method, in which successive random points are completely independent.
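As a minimal illustration of MIS in this classic setting with independent samples, the sketch below combines two sampling strategies for a one-dimensional toy integral using the balance heuristic of [3]. The integrand, the two densities and all function names are assumptions chosen for illustration; they are not taken from the scene studied in this paper.

```python
import math
import random

def mis_balance_estimate(f, strategies, rng):
    """One MIS estimate: draw one point per strategy and weight it by the balance heuristic.

    `strategies` is a list of (sample, pdf) pairs; both names are illustrative.
    """
    total = 0.0
    for sample, pdf in strategies:
        x = sample(rng)
        # Balance heuristic: w_k(x) = p_k(x) / sum_j p_j(x), so the weights sum to 1.
        denom = sum(p(x) for _, p in strategies)
        weight = pdf(x) / denom
        total += weight * f(x) / pdf(x)
    return total

def f(x):
    """Toy integrand on [0, 1]; its exact integral is 1/3."""
    return x * x

# Strategy 1: uniform density p(x) = 1.
uniform = (lambda rng: rng.random(), lambda x: 1.0)
# Strategy 2: density p(x) = 2x, sampled by inversion; 1 - random() keeps x > 0.
linear = (lambda rng: math.sqrt(1.0 - rng.random()), lambda x: 2.0 * x)

rng = random.Random(1)
n = 20000
estimate = sum(mis_balance_estimate(f, [uniform, linear], rng) for _ in range(n)) / n
```

Each call draws independent points, which is exactly the assumption under which the near-optimality of such weights was proved.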

Meanwhile, lighting simulation frequently uses not this simple MCRT but more advanced methods like bidirectional Monte-Carlo path tracing (BDPT), bidirectional Monte-Carlo ray tracing with photon maps (BDPM) [2], their combination sometimes termed BDCM [8, 9], etc. Here the successive trajectories are not quite independent; for example, in BDPM the same forward path is “merged” with all the backward paths. Therefore the resulting joined full trajectories have a common “tail” and thus are not independent.

As a result, the noise in these methods follows other rules than in the simple or classic MCRT [6]. Therefore the weights that minimize this noise are likely different from those which minimize the noise functional in the classic MCRT. We shall demonstrate this on the example of a very simple model scene. In this scene the noise level is dictated by geometric factors rather than by the optical properties of the objects in the form of the bi-directional distribution function (BDF), while the Veach formulae [3, 4] relate the weights to the BDFs along the ray path.

In this paper we calculate the optimal weights directly, i.e. find the minimum of the sample variance of the pixel value. This is performed for a simple model scene. We demonstrate that these weights depend on the geometry of the scene and on the number of light and camera rays per iteration, while the known MIS formulae from [3, 4] include only the BDFs and the distribution of light source emission.

2. BDPM and weights in it

The basic idea of BDPM is that we trace several camera and several light rays. Then for each pair of light and camera paths we try to merge them into a join trajectory that connects the light and the camera. If they do join, we increment the accumulated luminance. Then the next pair is processed. After all light rays have been checked against all camera rays, they are all discarded, new sets of rays are generated, and so on. Generating the sets of rays and then cycling over all pairs constitutes one iteration of the process. The luminance values calculated in different iterations are statistically independent.
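The iteration structure described above can be sketched with a toy model. This is only an illustration under strong assumptions: hit points are drawn uniformly on a plane instead of being traced, all ray energies, BDFs and weights are set to 1, and the function name `bdpm_iteration` and its parameters are hypothetical.

```python
import math
import random

def bdpm_iteration(n_light, n_camera, radius, rng):
    """Luminance increment of one toy iteration: every light hit is tested against every camera hit."""
    light_hits = [(rng.random(), rng.random()) for _ in range(n_light)]
    # Camera hits stay away from the border so that the kernel disc is never clipped.
    camera_hits = [(0.25 + 0.5 * rng.random(), 0.25 + 0.5 * rng.random())
                   for _ in range(n_camera)]
    kernel_value = 1.0 / (math.pi * radius ** 2)  # normalized top-hat kernel
    total = 0.0
    for lx, ly in light_hits:
        for cx, cy in camera_hits:
            if math.hypot(cx - lx, cy - ly) <= radius:  # the two sub-paths merge
                total += kernel_value
    return total / (n_light * n_camera)

rng = random.Random(7)
# Fresh ray sets are generated in every call, so the increments are independent.
increments = [bdpm_iteration(20, 20, 0.1, rng) for _ in range(200)]
mean_luminance = sum(increments) / len(increments)
```

Within one call the same light hit is reused for all camera hits, so the individual pair contributions are correlated even though the increments of different iterations are not.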

So, an iteration which uses $N_C$ camera rays (through the given pixel!) and $N_L$ light rays (for all pixels!) increases the accumulated luminance of a pixel by

$$\ell = \frac{1}{N_L}\,\frac{1}{N_C}\sum_{i=1}^{N_L}\sum_{j=1}^{N_C} C_{i,j}, \qquad (1)$$

where $C_{i,j}$ is the contribution from the pair of the $i$-th light and $j$-th camera rays. It can be written as

$$C_{i,j} = \sum_{m}\sum_{n} w_{m+n,m}(\vec{X})\, E^{(C)}(\vec{x}^{(C)}_m)\, E^{(L)}(\vec{x}^{(L)}_n)\, K(\vec{x}^{(C)}_m - \vec{x}^{(L)}_n)\, f(\cdots; \vec{x}^{(C)}_m), \qquad (2)$$

where $m$ cycles over all camera path vertices and $n$ cycles over all light path vertices, $E^{(C)}$ and $E^{(L)}$ are the energies of the camera and light rays at these vertices, $K$ is the integration kernel and $f$ is the BDF in luminance units at the point $\vec{x}^{(C)}_m$.

Like in [3, 4], $w_{N,m}$ is the weight for the junction at the $m$-th camera vertex when the join path $\vec{X}$ has $N$ vertices (i.e. the light half of the join path has $N - m$ vertices). It must be a function of that full path such that

$$\sum_{m=0}^{N-1} w_{N,m}(\vec{X}) = 1.$$

No constraint relates the weights for join paths of different total length (this is obvious because they are functions of different arguments).

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Direct calculation of optimal weights

It follows from (1) and (2) that the increment of the pixel luminance from one algorithm iteration is also linear in the weights: it is a sum of weights times some random functions:

$$\ell = \frac{1}{N_L N_C}\sum_{i,j}\sum_{N=1}^{\infty}\sum_{m=0}^{N-1} w_{N,m}(\vec{X}_{i,j})\, c_{N,m}(\vec{X}_{i,j}),$$

where $i$ enumerates the light rays in one iteration, $j$ enumerates the camera rays (through this pixel) in one iteration and $\vec{X}_{i,j}$ is the join path from them. $c_{N,m}(\vec{X}_{i,j})$ is the contribution from this pair $(i, j)$ constrained to the conditions

1. The paths merge into a join path
2. They merge at the $m$-th camera vertex
3. The join path has $N$ vertices

If these conditions are not satisfied, $c_{N,m}$ vanishes. This resolves the ambiguity of how to define the join path if the camera and light halves did not merge.

The sets of join paths from different iterations are independent. So for this linear form the average over iterations (= the limiting luminance) is also linear in weights whose “coefficients” are averages that can be calculated in ray tracing.

Therefore we can find the weights that minimize the variance of the pixel luminance, i.e. the noise. These are the optimal weights. Their direct calculation, free of approximations and hypotheses, is regrettably very expensive numerically. So we shall perform it for a simple model scene, but even this example leads to some important conclusions.

Simple model scene and calculations for it

Scene layout

To simplify our calculations we use a model scene where all join paths have the same (and small) length. It consists of 3 parallel planes with diffuse transparency; planes 1 and 3 have the Lambert BDF. The BDF of the middle plane 2 is arbitrary and can be made very sharp (when the direction of a transmitted ray is close to that of the incident ray). The planes are orthogonal to Oz. A join path is $(\vec{x}_1, \vec{x}_2, \vec{x}_3)$, where $\vec{x}_k$ is its vertex on plane $k$; here $\vec{x}_1 = \vec{0}$ is fixed, and the segment between the light source and plane 3 is ignored because it does not affect the path contribution.

Therefore the full path is completely described by its two variable vertices, $\vec{x}_2$ and $\vec{x}_3$.

The camera and light rays can meet at planes 1, 2 and 3, whose contributions are taken with the weights $w_0$, $w_1$ and $w_2 = 1 - w_0 - w_1$.

If the camera and light rays meet at plane $k$, it is ambiguous whether $\vec{x}_k$ is the camera or the light hit (they can differ by the integration kernel radius). We choose the camera hit then.

Calculation of contribution

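The camera path is $(\vec{x}^{(C)}_1, \vec{x}^{(C)}_2, \vec{x}^{(C)}_3)$ and the light path is $(\vec{x}^{(L)}_3, \vec{x}^{(L)}_2, \vec{x}^{(L)}_1)$, where $\vec{x}^{(C)}_k$ is the hit point of the camera ray at the $k$-th plane, $\vec{x}^{(L)}_k$ is the hit point of the light ray at the $k$-th plane (the light ray goes from plane 3 to plane 2, then to plane 1), and $\vec{x}^{(C)}_1 = \vec{0}$ is fixed. As said above, we do not consider the light ray before it hits plane 3; we just start the ray by choosing the point $\vec{x}^{(L)}_3$ at random.

The contribution of these two sub-paths is

$$
\begin{aligned}
C ={}& w_0(\vec{x}_2, \vec{x}_3)\, E^{(C)}(\vec{x}^{(C)}_1)\, E^{(L)}(\vec{x}^{(L)}_1)\, K(\vec{x}^{(C)}_1 - \vec{x}^{(L)}_1)\, f_1(\cdots) \\
&+ w_1(\vec{x}_2, \vec{x}_3)\, E^{(C)}(\vec{x}^{(C)}_2)\, E^{(L)}(\vec{x}^{(L)}_2)\, K(\vec{x}^{(C)}_2 - \vec{x}^{(L)}_2)\, f_2(\cdots) \\
&+ w_2(\vec{x}_2, \vec{x}_3)\, E^{(C)}(\vec{x}^{(C)}_3)\, E^{(L)}(\vec{x}^{(L)}_3)\, K(\vec{x}^{(C)}_3 - \vec{x}^{(L)}_3)\, f_3(\cdots),
\end{aligned}
$$

where $E^{(C)}$ and $E^{(L)}$ are the energies of the camera and light rays, $f_1 = f_3 = \pi^{-1}$, while

$$f_2 = \frac{1}{2\pi\sigma^2}\, e^{-\frac{\theta^2}{2\sigma^2}}\, \frac{1}{\cos\vartheta},$$

where $\theta$ is the angle between the incident and scattered rays, $\vartheta$ is the angle between the scattered ray and the normal and $\sigma$ is the width. As to the integration kernel, we use the simplest one:

$$K(\vec{x}) = \frac{1}{\pi R^2}\begin{cases} 1, & |\vec{x}| \le R \\ 0, & |\vec{x}| > R \end{cases}$$

where $R$ is the integration radius. It is small.

Denoting the camera path as $\vec{X}^{(C)} \equiv (\vec{x}^{(C)}_1, \vec{x}^{(C)}_2, \vec{x}^{(C)}_3)$ and the light path as $\vec{X}^{(L)} \equiv (\vec{x}^{(L)}_3, \vec{x}^{(L)}_2, \vec{x}^{(L)}_1)$ we can write

$$C(\vec{X}^{(C)}, \vec{X}^{(L)}) = \sum_{k=0}^{2} w_k\big(\vec{x}_2(\vec{X}^{(C)}, \vec{X}^{(L)}),\, \vec{x}_3(\vec{X}^{(C)}, \vec{X}^{(L)})\big)\, C_k(\vec{X}^{(C)}, \vec{X}^{(L)}),$$

where $C_k$ is the product of the energies, the kernel and the BDF in the corresponding term above.

The pixel luminance calculated during $T$ iterations, each of which uses $N_L$ light rays and $N_C$ camera rays, is

$$L \equiv \frac{1}{T}\sum_{t=1}^{T} \ell_t, \qquad \ell_t \equiv \frac{1}{N_L}\,\frac{1}{N_C}\sum_{i=1}^{N_L}\sum_{j=1}^{N_C} C(\vec{X}^{(C)}_{t,j}, \vec{X}^{(L)}_{t,i}). \qquad (3)$$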
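The ingredients of the contribution for this scene can be written down as small functions. The sketch assumes that the integration kernel is a top-hat normalized to unit integral and that the BDF of plane 2 is a Gaussian lobe divided by the cosine of the exit angle; the function names are hypothetical and the angles are taken in radians.

```python
import math

def top_hat_kernel(distance, radius):
    """Normalized top-hat kernel: constant inside the disc of the given radius, zero outside."""
    return 1.0 / (math.pi * radius ** 2) if distance <= radius else 0.0

def lambert_bdf():
    """Lambert BDF in luminance units (assumed for planes 1 and 3)."""
    return 1.0 / math.pi

def gaussian_bdf(theta, vartheta, sigma):
    """Gaussian-lobe BDF (assumed for plane 2).

    theta: angle between the incident and scattered rays,
    vartheta: angle between the scattered ray and the normal,
    sigma: width of the lobe.
    """
    lobe = math.exp(-theta ** 2 / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
    return lobe / math.cos(vartheta)

# A width of 3 degrees gives a very sharp lobe compared with the Lambert value.
sigma = math.radians(3.0)
peak_ratio = gaussian_bdf(0.0, 0.0, sigma) / lambert_bdf()
```

For such a narrow lobe the peak of the plane-2 BDF exceeds the Lambert value by about two orders of magnitude, which is why the choice of the meeting plane matters for the noise.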

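The contribution from each iteration (3) is a random variable, and the contributions from different iterations are independent.

Therefore the variance of the luminance $L$ calculated in $T$ iterations is

$$V(L) = \frac{1}{T}\, V(\ell), \qquad V(\ell) = \langle \ell^2 \rangle - \langle \ell \rangle^2,$$

where $\ell$ is the contribution of one iteration.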

The averages are over the ray ensembles. They can be approximately estimated from the sum over iterations (the usual practice called sample mean and sample variance).
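In code, the sample mean and sample variance over iterations amount to a few lines. The sketch below assumes that the per-iteration contributions have already been collected in a list; the function names are illustrative.

```python
def sample_mean_and_variance(increments):
    """Sample mean <l> and sample variance <l^2> - <l>^2 of the per-iteration contributions."""
    n = len(increments)
    mean = sum(increments) / n
    mean_square = sum(v * v for v in increments) / n
    return mean, mean_square - mean * mean

def variance_of_accumulated(increments):
    """Variance of the accumulated luminance: independent iterations reduce it by their number."""
    _, variance = sample_mean_and_variance(increments)
    return variance / len(increments)

mean, variance = sample_mean_and_variance([1.0, 2.0, 3.0, 4.0])
```

For the four values in the usage line the mean is 2.5 and the variance is 1.25.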

Tabulated weights

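For numerical calculations let us subdivide the whole admissible area in $(\vec{x}_2, \vec{x}_3)$ space into cells. The weight $w_k$ is constant within a cell and equals $w_{k,c}$ in the $c$-th cell. Since formally $\vec{x}_2$ and $\vec{x}_3$ can be infinite, we take some finite area and subdivide it in the usual way, the unbounded space outside it constituting “the last cell”. Let $\chi_c(\vec{x}_2, \vec{x}_3)$ be 1 inside the $c$-th cell and 0 outside it. Then the contribution from the $t$-th iteration (3) becomes

$$\ell_t = \frac{1}{N_L N_C}\sum_{i,j}\sum_{k=0}^{2}\sum_{c} w_{k,c}\, \chi_c\, C_k(\vec{X}^{(C)}_{t,j}, \vec{X}^{(L)}_{t,i}),$$

where $\chi_c$ is taken at the join path vertices of the pair and $C_k$ is the contribution of the pair when the rays meet at plane $k+1$. Eliminating $w_2 = 1 - w_0 - w_1$ and combining the tables that relate to the weights $w_0$ and $w_1$ into a single array,

$$|w\rangle \equiv \begin{pmatrix} w^{(0)} \\ w^{(1)} \end{pmatrix}, \qquad |a_t\rangle \equiv \begin{pmatrix} a_t^{(0)} \\ a_t^{(1)} \end{pmatrix}, \qquad a^{(k)}_{t,c} \equiv \frac{1}{N_L N_C}\sum_{i,j} \chi_c\, (C_k - C_2), \qquad b_t \equiv \frac{1}{N_L N_C}\sum_{i,j} C_2,$$

we can write

$$\ell_t = \langle a_t | w \rangle + b_t.$$

Then, since the iterations are independent, the average value and the mean square are

$$\bar{\ell} = \langle \bar{a} | w \rangle + \bar{b}, \qquad \overline{\ell^2} = \langle w | A | w \rangle + 2 \langle B | w \rangle + \overline{b^2},$$

where the overbar denotes the average, and

$$A \equiv \overline{|a\rangle\langle a|}, \qquad \langle B| \equiv \overline{b\, \langle a|}.$$

The mathematical expectation of the pixel luminance is the same for all weights, thus the limiting average $|\bar{a}\rangle = 0$. But for a finite number of iterations, when convergence is incomplete, the sample average can depend slightly on the weights. Thus the sample average $|\bar{a}\rangle \ne 0$, and while calculating the sample variance over a finite number of iterations we must account for this dependence. This sample variance over iterations is

$$V(w) = \langle w | A | w \rangle + 2 \langle B | w \rangle + \overline{b^2} - \big(\langle \bar{a} | w \rangle + \bar{b}\big)^2,$$

so the weights which minimize it satisfy

$$\big(A - |\bar{a}\rangle\langle\bar{a}|\big)\, |w\rangle = -|B\rangle + \bar{b}\, |\bar{a}\rangle,$$

which is just a system of simultaneous linear equations.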
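The minimization reduces to accumulating a few averages over iterations and solving one linear system. The sketch below assumes that every iteration contributes a linear form: a table of coefficients multiplying the unknown weights plus a weight-independent term. The names `a_rows`, `b_values`, `solve_linear` and `optimal_weights` are hypothetical, and the data are synthetic rather than traced.

```python
import random

def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def optimal_weights(a_rows, b_values):
    """Weights minimizing the sample variance of a_t . w + b_t over the iterations t."""
    t, n = len(a_rows), len(a_rows[0])
    a_mean = [sum(row[k] for row in a_rows) / t for k in range(n)]
    b_mean = sum(b_values) / t
    # Left-hand side: average outer product of the tables minus the outer product of their mean.
    lhs = [[sum(row[p] * row[q] for row in a_rows) / t - a_mean[p] * a_mean[q]
            for q in range(n)] for p in range(n)]
    # Right-hand side: minus the average of b_t * a_t plus the product of the two means.
    rhs = [-sum(b * row[p] for row, b in zip(a_rows, b_values)) / t + b_mean * a_mean[p]
           for p in range(n)]
    return solve_linear(lhs, rhs)

# Synthetic check: the data are built so that the variance vanishes at w = (0.3, 0.5).
rng = random.Random(3)
a_rows = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 2.0)] for _ in range(500)]
b_values = [2.0 - (0.3 * a[0] + 0.5 * a[1]) for a in a_rows]
weights = optimal_weights(a_rows, b_values)
```

Subtracting the product of the sample means on both sides is what accounts for the residual dependence of the sample average on the weights.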

However, numerical experiments showed that the solution can be rather ragged. To improve the situation, a regularization term, which is a penalty for high gradients, can be added to the functional being minimized.
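One possible form of such a penalty is sketched below. It is an assumption, since the exact regularization term is not specified here: squared differences between the weights of neighbouring cells are penalized with a strength $\lambda$, and the unregularized linear system is written generically as $M|w\rangle = |r\rangle$.

```latex
V_\lambda(w) = V(w) + \lambda \sum_{c} \left( w_{c+1} - w_c \right)^2
             = V(w) + \lambda \,\langle w | D^{T} D | w \rangle ,
\qquad
\left( M + \lambda D^{T} D \right) | w \rangle = | r \rangle ,
```

where $D$ is the finite-difference matrix with $(D w)_c = w_{c+1} - w_c$. The system stays linear, so the regularized weights cost no more to compute than the unregularized ones.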

Results

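We performed the calculations for the case when the plane positions are $z_2 = 1$ and $z_3 = 3$, the BDF of plane 2 has width $\sigma = 3^\circ$, and the illumination density $I(\vec{x}_3)$ is piecewise constant: it is maximal for $|\vec{x}_3| \le a$, 3333 times lower for $a < |\vec{x}_3| \le R_0$ and zero for $|\vec{x}_3| > R_0$, where $R_0 = 20$ is the radius of the illuminated area and $a = 0.02$ is the “aperture” (radius) of its bright central part. Besides that,

• the integration radius is $R = 0.003$
• the number of rays is 100

The scene is axisymmetrical. Therefore all functions depend only on $r_2 \equiv |\vec{x}_2|$, $r_3 \equiv |\vec{x}_3|$ and the angle $\varphi$ between $\vec{x}_2$ and $\vec{x}_3$. Formally $r_2$ and $r_3$ extend to infinity. So we chose a finite area for each, now $0 \le r_2 \le 2$, $0 \le r_3 \le 0.05$, subdivided it into equal cells and then added the last cell which completes it to the whole infinite domain, e.g. $2 < r_2 < \infty$.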

Trial calculations were done for several numbers of cells. It turned out that although the calculated weights differ, the noise level is nearly the same (as is common in optimization). Since the weights are not needed per se, but only the noise reduction they provide, we can use as few cells as suffice to saturate the noise level.

It turned out that it was enough to have 1 cell in the angle $\varphi$ between $\vec{x}_2$ and $\vec{x}_3$ (i.e. the weights actually do not depend on it!), 2 cells in $r_2 = |\vec{x}_2|$ ($0 \le r_2 \le 2$ and $2 < r_2 < \infty$) and 26 cells in $r_3 = |\vec{x}_3|$ (the first 25 of size 0.002 and the last one $0.05 < r_3 < \infty$).
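The lookup of the weight cell for a given join path can be sketched as follows. The layout mirrors the subdivision just described (equal cells on a finite interval plus one unbounded last cell), while the function names, the flat indexing and the reading of the finite $r_2$ interval as $[0, 2]$ are assumptions.

```python
def radial_cell(r, upper, n_equal):
    """Index of the cell containing r: n_equal equal cells on [0, upper], then one unbounded cell."""
    if r < 0:
        raise ValueError("radius must be non-negative")
    if r > upper:
        return n_equal  # the unbounded "last cell"
    width = upper / n_equal
    return min(int(r / width), n_equal - 1)  # r == upper belongs to the last equal cell

def weight_cell(r2, r3):
    """Flat index into a table with 2 cells in r2 and 26 cells in r3 (no dependence on the angle)."""
    i2 = radial_cell(r2, 2.0, 1)
    i3 = radial_cell(r3, 0.05, 25)
    return i2 * 26 + i3

# A join path far from the axis on plane 2 and outside the finite r3 range.
index = weight_cell(3.0, 0.06)
```

With this layout the table holds 52 weights, which keeps the linear system small enough for a direct solve.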

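The noise was calculated for the following cases:

1. rays meet at plane 1 only
2. rays meet at plane 2 only
3. rays meet at plane 3 only
4. rays meet at plane 2 if $r_3 \le a$ and at plane 3 if $r_3 > a$
5. optimal weights are used

The calculation results are shown in Table 1.

Table 1. The calculation results

| case | $w_0$ | $w_1$ | $w_2$ | $L \times 10^5$ | RMS, % |
|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 26.484 | 208% |
| 2 | 0 | 1 | 0 | 26.352 | 254% |
| 3 | 0 | 0 | 1 | 25.997 | 146% |
| 4 | 0 | 1 for $r_3 \le a$; 0 for $r_3 > a$ | 0 for $r_3 \le a$; 1 for $r_3 > a$ | | |
| 5 | optimal | optimal | optimal | | |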

Optimal weights were calculated from statistics accumulated in 10000 iterations (Fig. 2). The optimization was constrained to $w_0 = 0$.

Fig. 2: The optimal weight $w_1$ as a function of $r_2$; $w_2 = 1 - w_1$; $w_0$ was constrained to 0

Conclusions

We see that even in the case of direct optimization (which gives the best result without false minima, approximations etc.) the gain is moderate: the noise is about 3 times lower than for the best “fixed BDD” strategy. This is not bad, because a 3-fold reduction of noise is equivalent to a 9-fold increase of speed.
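The equivalence between noise and speed follows from the usual Monte-Carlo scaling, sketched here under the assumption that the iterations are independent and identically distributed (the symbols $\sigma_1$ for the noise of one iteration and $T$ for the number of iterations are introduced only for this note):

```latex
\sigma_T = \frac{\sigma_1}{\sqrt{T}},
\qquad
\frac{\sigma_1 / 3}{\sqrt{T}} = \frac{\sigma_1}{\sqrt{9\,T}} ,
```

so reaching the noise level of the optimized weights with the best fixed strategy would take about 9 times more iterations.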

At the qualitative level we see that the optimal weights are not local, i.e. we cannot calculate the weight (which, as we remember, is a function of the vertices of the join path) from that path only. Indeed, in the above calculation the illumination of the rightmost plane was 3333 times lower for $|\vec{x}_3| > a$, i.e. outside the aperture of radius $a$. Let us compare it with the case of uniform illumination. In this case the optimal weight is very close to $w_2 = 1$ (for all paths), so for a join path with $|\vec{x}_3| \le a$ the optimal weight is different for the uniform and non-uniform illumination. In other words, the weight for this path depends on the illumination outside it.

Surely the optimal weight is still a function of the join path but this function depends on the global scene characteristics.

Meanwhile, in the “balance heuristic” or “power heuristic” [3, 4] this function is known in advance. Very roughly, it calculates the weight from the ratio of the BDF at the junction point to the sum of the BDFs at all the vertices of the join path. We therefore conclude that the balance/power heuristic, derived for the usual MCRT, is not truly optimal for BDPM, because there the “samples” (join paths) are correlated: they use the same light and/or camera path several times.

Acknowledgments

The study was carried out within the framework of the RFBR grants 18-01-00569, 18-31-20032 and 20-01-00547.

References

[1] Matt Pharr and Greg Humphreys. 2010. Physically Based Rendering, Second Edition: From Theory to Implementation (2nd ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.

[2] H. W. Jensen. Global illumination using photon maps. In Proceedings of the Eurographics Workshop on Rendering Techniques '96 (London, UK), pp. 21-30. Springer-Verlag, 1996.

[3] Eric Veach. Robust Monte-Carlo methods for light transport simulation. A dissertation, 1997.

[4] Jiri Vorba. Bidirectional photon mapping. In Proceedings of CESCG 2011: The 15th Central European Seminar on Computer Graphics, Prague, 2011.

[5] Georgiev, Křivánek, Davidovič, and Ph. Slusallek. 2012. Light transport simulation with vertex connection and merging. ACM Trans. Graph. 31, 6, Article 192 (November 2012).

[6] Ershov, Zhdanov, and Voloboy. Estimation of noise in calculation of scattering medium luminance by MCRT. Mathematica Montisnigri, XLV: 60-73, 2019.

[7] Ershov, Zhdanov, Voloboy, Sorokin. Treating diffuse elements as quasi-specular to reduce noise in bidirectional ray tracing // Keldysh Institute Preprints. 2018. No. 122. 30 p. doi: 10.20948/prepr-2018-122-e

[8] Popov, Ramamoorthi, Durand, and Drettakis. Probabilistic Connections for Bidirectional Path Tracing. Computer Graphics Forum, 2015.

[9] Dodik. Implementing probabilistic connections for bidirectional path tracing in the Mitsuba Renderer. Sept. 2017.