Implementation of the Radiosity Algorithm for Large Scale Scenes A.S. Shcherbakov1,2, V.A. Frolov1,3 alex.shcherbakov@graphics.cs.msu.ru|vladimir.frolov@graphics.cs.msu.ru 1Lomonosov Moscow State University, Moscow, Russia; 2Gaijin Entertainment, Moscow, Russia; 3Keldysh Institute of Applied Mathematics, Moscow, Russia We propose an upgrade for the Radiosity algorithm that allows to efficiently apply radiosity for large scale scenes. This is achieved by considering only the patches located close to the observer. For each frame we update local form-factor matrix with a little set of patches, effectively reusing information from the previous frame in this way. Our method is completely expressed via matrix-vector operations, thus, it’s GPU implementation is natural and straightforward. We achieve high occupancy for both CPU and GPU versions of the algorithm by thanks to we use special matrix of several reflections for which update operation effectively combine computations with memory operations. Keywords: Global Illumination, Radiosity, Large Scenes, GPU 1. Introduction 2.2 PRT Methods Precomputed Light Transport (PRT) methods al-low The methods of global real-time global illumina- to move the most complex calculations from the tion in general can be divided into two classes: the rendering stage to the precomputation stage. methods that uses preprocessing/precomputation and For example, Radiance regression function [17] is those who does not uses it (i. e. calculate everything in based on neural networks. It is restricted by static dynamic). The main advantage of methods that do not geometry and materials. The second disadvantage of use pre-processing is the fully dynamic content. method is that it works only for point light sources. However, the methods of the first group due to the Spherical harmonics based methods [8, 20] uses precomputation and pre-integration usually are sig- idea of ”relighting”, treating the dynamic light source as nificantly better in terms of speed/quality ratio. a linear combination of pre-processed sources. Our work is devoted to the problem of building These methods suffers from light leaking which was methods of the ”intermediate class” that combines the noted in [20]. In general these methods are well-suited advantages of both classes. for outdoor but not indoor scenes. Despite huge number of papers in real-time global illumination, radiosity [3, 19] to this day remains one of 2. Related works the best method by means of speed/quality ra-tion. It is used by Enlighten and Unity for exam-ple [1, 5, 2.1 Dynamic methods 12, 14] and lighting engineering programs [22, 23]. Radiosity, like the previously considered methods, uses scene discretization (but use patches instead of LPV [9], VCT [4] and Volumetric Irradiance Map voxels). The pre-calculation is done by calculating the [20] discretizes the space around the observer and matrix of form-factors. The form-factor is a value in- model the movement of light along the voxel grid. dicating how much of the lighting is transferred from Main disadvantage of LPV and VCT is the high com- one patch to another. The calculation of the global putational cost and high memory requirements. Volu- illumination is reduced to solving the linear equation metric Irradiance Map [20] is uses for half-static scenes system or several matrix-vector multiplications. By and the secondary lighting is applied to dynamic ob- expanding the calculations at the preprocessing stage, it jects, but these objects do not contribute to secondary is possible to reduce the calculation of the global lighting of static objects. Thus, Volumetric Irradiance illumination method to a single matrix-vector multi- Map [20] is one of those ”intermediate class” exam- plication [18]. ples. However Radiosity still have O(N 2) algorithmic complexity where N is the number of scene patches. Instant Radiosity [11], Reflective Shadow Map [6] This is the main disadvantage of the algorithm, since and analogues creates secondary light sources, calcu- such computational complexity is unacceptable for lating the secondary illumination in the same way as large scenes (with an increase in the number of sites the primary. The most significant disadvantage of N ). Our work is aimed specifically at solving this RSM-based methods is their low precision. As the problem. The existing approaches of hierarchical ra- number of secondary sources increases, the accuracy diosity and analogues [7, 13] are light-dependent (due to increases, but the calculation time also increases sig- they split scene to patches according to the cur- nificantly [2, 15, 21]. Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). rent lighting setup) and does not exploit its the main Therefore, third idea is to combine updating the advantage — precomputation of form-factors. local matrix with the multiple reflection matrix from [18]. This will allow organizing a pipeline in which 3. Proposed approach calculations are immediately performed for each form- For a better explanation of our methods, we will factor obtained from the global matrix with form- begin with a few simple ideas that will be refined fur- factors from the local matrix. That is, we will store ther. and update the local matrix of several reflections, Our first idea is that we can consider only M clos-est rather than the usual local matrix of form-factors. In patches to the observer (where M << N ). We as-sume this case, the subsequent calculation of the global il- lighting from other patches equals to some con-stant or lumination is reduced to a single multiplication of the evaluate it on a rougher approximation (that is, we may matrix by the vector, which, given the small size of the build cascades of discrete representations of the scene in local matrix, is very cheap. the same way as the Cascaded LPV [10] does). This idea Finally, the fourth idea is that with a small leads us to the fact that we can make a submatrix of movement of the observer, it is possible to update the form-factors, on which we will perform basic local matrix less often than to calculate the lighting on calculations. We will call it local matrix (Fig. 1, left). it (for example, once in 2-5 frames). Therefore, it However, the direct implementation of this method makes sense to reduce the cost of the stage of cal- will not be efficient because the matrix of the form- culating the lighting using a local matrix of several factor for the entire scene (even if special methods reflections. are used to effectively store sparse matrices) does not fit into the computer’s RAM. Therefore, reading even M 2 3.1 Baseline of random elements (shown in green in Fig. 1, left) from DRAM or external memory cannot be im- The matrix of form-factors of several reflections plemented efficiently. [18] is based on the idea of the composition of form- Therefore, second idea is that it is necessary to factors. We further describe the principle of composi- reuse information from the previous frames as much as tion on simple examples. possible, updating only patches that were not loaded on Consider three patches with indices i, j, k. Then Fij previous frames (Fig. 1, right). This allows us to and Fjk — form-factors of energy transfer from the j-th reduce the number of form-factors that are loaded into path to the i-th patch and from the k-th patch to the j- the local matrix each frame to only a few readings (or th patch respectively. Colorsj — color of j patch. approximately M if the observer moves quickly, Then value which, nevertheless, is significantly better than M 2). Fij · Colorsj · Fjk However, reading even a few elements from the global matrix at random addresses is still inefficient shows how much of the world will go from the k-th for modern computing systems. During the update of patch to the i-th patch when the multi-reflection from the the local matrix of form-factors, computing system (both j-th patch. CPU and GPU) will be practically idle. At the same After going through all patches as intermediate in time, after updating, when calculating the radi-ance on the reflection, we get the complete second reflection of the local matrix of form-factors, they will be loaded and light from the patch number k to the patch number i: these calculations may become a bottle-neck since you ∑N F · Colors · F , need to perform at least 3 multiplica-tions of the local F=ik 2 ij j jk matrix by a vector (to account for three light bounces). j=0 where N — number of patches in local matrix. Thus, at the preprocessing stage, it is possible to calculate reflections of an arbitrary order. The mul- tiple reflection matrix contains a specified number of reflections. 3.2 Local matrix of multiple-reflection In order to extend this approach to a local ma-trix in the surrounding of an observer, two ways are possible: local matrix matrix update 1. Calculation of the matrix of several reflections for Fig. 1. Local matrix is a submatrix of global form- the entire scene and the use of its parts. factor matrix. It is composed of green-labelled cells 2. ”Online” matrix calculation for the observer’s area (left). Changing sites in the surroundings of the using the usual matrix of form-factors (Fig. 1, observer when moving (right). right). The first option is not applicable for large scenes 3.3 GPU implementation since it requires to store 3N 2 of real numbers. The To implement the proposed method we create 6 proposed method implements the second option, since the matrix of form-factors is sparse and can be effi-ciently compute shaders to update the matrix, one to delete the stored/accessed. patch data and one shader to update the lighting. The To update the matrix of several reflections, two op- following shaders are used for updating: erations are required: (1) adding a site to the matrix and 1. Calculating double_ref lection (1) using parallel (2) removing the site from it. Delete operation is reduction on GPU. implemented by zeroing the row and column of the 2. Calculates triple reflections gcol, grow (2) by mul- matrix of form-factors corresponding to the deleted tiplying double_ref lection with form-factors. patch. 3. Multiplying gcol by a matrix of form-factors to get ′ col (3). 4. Multiplying the form-factor matrix by grow to get g′ row (3). 5. Adding to the matrix of information about reflec- tions in view of the new patch (5). 6. Adding new site form-factors to the matrix. Fig. 2. Scheme of updating form-factors for local matrix of several reflections. To add a new patch, we must consider several types of reflections. First, we need to include the usual form- factors fcolumn and frow (Fig. 2, ”single”). Next, triple reflections are taken into account, in which light is Suggested approach; (2 ms, 128 MB) reflected twice from a new patch. To do this, double reflection is first calculated (Fig. 2, ”double”): double_ref lection = fcolumn ◦ Colors · frow (1) By using double reflection double_ref lection triple reflections can be calculated further (Fig. 2, ”triple”): gcol = fcolumn · Color · double_ref lection (2) grow = frow · Color · double_ref lection Finally, we add information on form-factors that takes into account the light reflections between patches already participating in the matrix (Fig. 2, Multiple reflection matrix from [18]; (300 ms, 30GB) ”interreflections”): ′ gcol = Flocal · gcol ′ (3) grow = grow · Flocal Thus, the form-factors for the new patch are equal to ′ gcol + gcol + fcolumn , ′ (4) grow + grow + frow . Now, in addition to adding the form-factors of the new patch, it is required to update the form-factors of the rest of the matrix. For this we use previously calculated form-factors for new patch (4): Difference image. In average is is less than 0.05. ′ Flocal ← Flocal + (gcol + gcol + fcolumn )· (5) ′ ·(grow + grow + frow ) Fig. 3. Comparison of Radiosity for the full matrix Thus, we can add and remove patches from the local and the proposed method. Since the full matrix 63125 matrix. The size of the matrix is limited by a constant × 63125 does not fit in the memory of the GPU, the specified by the user. time was estimated. The most computational-cost are steps 3-5. How-ever, [4] Crassin, C., Neyret, F., Sainz, M., Green, S., and steps 3 and 4 are just matrix-vector multiplica-tions, and Eisemann, E. (2011, September). Interactive indirect the step 5 implements the addition of two matrices. Both illumination using voxel cone tracing. In Computer of these these operations are to be easily implemented on Graphics Forum (Vol. 30, No. 7, pp. 1921–1930). Ox- GPU. ford, UK: Blackwell Publishing. [5] Cupisz, Kuba, and Kasper Engelstoft, Lighting in Unity, Game Developers Conference, Mar. 2015. 3.4 Details http://www.gdcvault.com/play/1021765/Advanced-Visual- Effects-With-DirectX The matrix of form-factors can be stored as a tex-ture [6] Dachsbacher, C., Stamminger, M. Reflective shadow maps // in the format RGBA16F. At the same time, the matrix Proceedings of the 2005 symposium on Inter-active 3D size is limited to 4096 pixels on some GPU due to the graphics and games. — ACM, 2005. — �. specific hardware limits (mobile GPUs). 203-231. [7] Durand, F., Drettakis, G., and Puech, C. Fast and 4. Results accurate hierarchical radiosity using global visibility. ACM Trans. Graph. 18, 2 (April 1999), 128-170. The proposed method was compared with the com- DOI=http://dx.doi.org/10.1145/318009.318012 mon Radiosity on local form-factor matrix and signif- [8] Green, R. Spherical harmonic lighting: The gritty de- icantly outperforms it (Fig. 4). tails //Archives of the Game Developers Conference. – 2003. – Vol. 56. – p. 4. [9] Kaplanyan, A. Light propagation volumes in cryengine 3 // ACM SIGGRAPH Courses. 2009. Vol. 7. p. 2. [10] Kaplanyan, A., Dachsbacher, C. 2010. Cascaded light propagation volumes for real-time indirect illumina- tion. In Proceedings of the 2010 ACM SIGGRAPH symposium on Interactive 3D Graphics and Games (I3D ’10). ACM, New York, NY, USA, 99-107. [11] Keller A. Instant radiosity. — 1997. [12] Magnusson, Kenny, Lighting You Up with Battle- field 3, Game Developers Conference, Mar. 2011. http://www.frostbite.com/2011/03/lighting-you-up-in- battlefield-3/ [13] Marries van de Hoef. Real-Time Dynamic Radios- ity for High Quality Global Illumination. 2013. Master Fig. 4. Comparison of the proposed method for a local Thesis. ICA-3220516. Utrecht university. matrix of several reflections with an a common [14] Martin, Sam, and Per Einarsson, A Real-Time Ra- Radiosity for different local matrix sizes. diosity Architecture for Video Game, SIGGRAPH Ad- vances in Real-Time Rendering in 3D Graphics and We test our method on the scene of 63125 patches Games course, July 2010. [15] Ou, J. and Pellacini, F. 2011. LightSlice: matrix slice (Fig. 3). We took a Minecraft scene in order to omit sampling for the many-lights problem. ACM Trans. scene approximation algorithms in our work. At high Graph. 30, 6, Article 179 (December 2011), 8 pages. speed, the proposed method gives an image close to DOI: https://doi.org/10.1145/2070781.2024213 the image obtained by the method of full-matrix Ra- [16] Pranckevicius, Aras, Jens Fursund, and Sam diosity on the CPU. Martin, Advanced Lighting Techniques in Unity, Unity DevDay, Game Developers Conference, Mar. 5. Acknowledgments 2014. http://aras-p.info/blog/2014/05/05/shader- This work was sponsored by RFBR 18-31-20032 compilation-in-unity-4-dot-5/ grant. [17] Ren, P., Wang, J., Gong, M., Lin, S., Tong, X., and Guo, B. 2013. Global illumination with 6. References radiance regression functions. ACM Trans. Graph. 32, 4, Article 130 (July 2013), 12 pages. DOI: https:// [1] Akenine-Moller, T., Haines, E., Hoffman, N. Real-time doi.org/10.1145/2461912.246200 rendering. Fourth Edition. – AK Peters/CRC Press, [18] Shcherbakov, A., and Frolov, V.Accelerating radios- 2018. page 482. ity on gpus. In WSCG’2017 Full papers proceedings [2] Budak, V.P., Zheltov, V.S., Kalakutsky, T.K. Local es- (2017), 2701, Computer Science Research Notes 2701 timations of Monte Carlo method with the object spec-tral Pilsen, Czech Republic, pp. 99–105. representation in the solution of global illumina-tion. // [19] Sillion F. X. et al. Radiosity and global illumination. Computer Research and Modeling, 2012, vol. 4, no. 1, pp. San Francisco : Morgan Kaufmann, 1994. Vol. 1. 75-84. [20] Yudintsev, A. Scalable Real-Time Global Illumi- [3] Cohen, M., GreenBurg, D. The Hemi-cube: A Radios-ity nation for Large Scenes. GDC Talk. 2019. URL = solution for complex it is hard to achieve environ- https://www.gdcvault.com/play/1026469/Scalable-Real- ments // Proceedings of SIGGRAPH 85 in Computer Time-Global-Illumination Graphics, 1985. 19. number 3. pp. 31-40. [21] Walter, B., Fernandez, S., Arbree, A., Bala, K., Donikian, M., and Donald P. Greenberg. 2005. Light- cuts: a scalable approach to illumination. ACM Trans. Graph. 24, 3 (July 2005), 1098-1107. DOI: https://doi.org/10.1145/1073204.1073318 [22]DIAL Dialux. 2019. URL = https://www.dial.de/en/dialux/ [23] Relux Informatik AG Relux. 2019. URL = https://reluxnet.relux.com/en/