<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Examination of the Nvidia RTX</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Gubkin Russian State University of Oil and Gas</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Keldysh Institute of Applied Mathematics RAS</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Moscow State University</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>V.V. Sanzharov</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Hardware acceleration of ray tracing is an active research field, but it became widely available only with the release of Nvidia Turing architecture GPUs. Nvidia RTX is a proprietary hardware ray tracing acceleration technology available in the Vulkan and DirectX APIs as well as through Nvidia OptiX. Since the implementation details are not public, there are many open questions about what it actually does under the hood. To answer these questions, we implemented the classic path tracing algorithm using RTX via both DirectX and Vulkan and conducted several experiments to investigate the inner workings of this technology. We tested the actual hardware implementation of RTX on an RTX2070 GPU and the software fallback in the driver on a GTX1070 GPU. In this paper we present the results of these experiments and speculate on the internal architecture of RTX.</p>
      </abstract>
      <kwd-group>
        <kwd>photo-realistic rendering</kwd>
        <kwd>ray tracing</kwd>
        <kwd>hardware acceleration</kwd>
        <kwd>GPU</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Ray tracing is a cornerstone of photo-realistic
image synthesis. Since the first papers on ray tracing
[19], [5], computer graphics researchers have developed a
plethora of techniques to accelerate the
computations associated with ray tracing.</p>
      <p>Hardware acceleration of ray tracing had limited
success outside of research papers until the RTX
technology by Nvidia was released in their Turing
architecture GPUs. It was stated that Turing hardware
contains special so-called «RT cores» which
accelerate ray tracing. The official Turing architecture
whitepaper [22] states that an RT core contains two
units which perform bounding box and ray-triangle
intersection tests. But since RTX is closed source,
we do not know for sure how exactly it is implemented
and whether this is all there is to ray tracing acceleration in
Turing GPUs. In this paper, we present
several experiments we performed with an RTX GPU. We
analyze the results of the experiments and speculate on
possible techniques used in RTX hardware to accelerate
ray tracing. First, however, we review the research on
ray tracing acceleration hardware to understand which
techniques have already been tried in hardware
implementations and how well they performed.</p>
      <p>Related work in ray tracing acceleration
hardware</p>
      <p>
        The first dedicated hardware solutions closely related to
ray tracing were PCI cards for volume data visualization
which implemented ray casting and Phong shading (such
as [9, 12]). Even though this hardware traced only
primary rays, it already implemented techniques to
increase the efficiency of parallel tracing, such as grouping
rays to make use of memory access coherence [9].
Another notable design was the SaarCOR architecture [13]
and its updated version in an FPGA chip [
        <xref ref-type="bibr" rid="ref24">14</xref>
        ]. The
SaarCOR chip implemented the whole
ray tracing algorithm: scene and camera data were
uploaded from the host and the chip produced the
rendered image. Like the ray casting solutions, SaarCOR
used packet tracing (in groups of 64 rays). The
architecture was fully pipelined to further mitigate memory
access latency: it simultaneously traversed one group of
rays, loaded data for the next group and performed the
intersection operation on another group of rays. An
example of ray tracing hardware which was
commercially available is the ART AR250/350 rendering
processor with a custom RISC processor core [4]. The
solution was used to accelerate offline rendering and
was packaged as an x86 PC with 16, 36 or 48
rendering processors on PCI-X cards and a gigabit networking
system. The software side included a RenderMan
compliant renderer, network communication interfaces
and plugins for 3D applications (CATIA, 3ds Max,
Maya). To our knowledge, details about the custom rendering
processor were never published.
      </p>
      <p>All works mentioned up to this point concern fixed
function hardware. One of the first solutions with
programmable stages is the RPU (ray processing unit)
[20]. The traversal and primitive intersection tasks
are implemented in fixed function units. RPU
supported custom shaders with features such as
recursive function calls, a trace instruction to initiate tracing of
an arbitrary ray, and an asynchronous load instruction to hide
memory latency. RPU also featured geometry shaders,
instancing support and shader tables to look up the specific
shader to execute for a particular geometry object.
Like SaarCOR and the ray casting solutions, RPU uses
packet ray tracing, which can result in performance
drops in the case of incoherent rays. The TRaX
architecture [16] implements a different solution:
many identical cores consisting of simple thread
processors. It can be viewed as a general purpose
architecture and is used in other papers to simulate their
hardware [7]. In the ray tracing application TRaX
accelerates single ray performance and features a MIMD
execution model, as opposed to groups of 4 or
more rays and the SIMD model in the previously mentioned
architectures. The authors of [10] aimed to address
problems with incoherent rays by using an N-wide SIMD
processing architecture with filtering of rays to find
coherent groups. The filtering is applied at the traversal,
intersection and shading stages of the ray tracing
algorithm.</p>
      <p>In [1] the authors simulate an architecture close to that of
the Nvidia Fermi GPU. One of its key aspects
(related to ray tracing) is work compaction. When more than
half of the rays in a warp (a group of 32 threads) have
terminated, the warp terminates and the non-terminated rays are
copied to the next warp. This mechanism
mitigates the effect of incoherent rays and preserves
parallelism. Another suggestion in this work is related
to the stack memory layout for threads. In addition, [1]
implements the idea of partitioning the BVH into treelets
(which approximately match cache sizes) and grouping
rays according to the treelets they intersect. Another
architecture, STRaTA [7], is built on top of TRaX
[16] and implements a modified version of the treelet technique of [1]
together with a streaming approach to processing the rays associated
with each treelet. STRaTA adds special small buffers to the
memory hierarchy to store rays.</p>
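      <p>The compaction rule described for [1] can be illustrated with a toy cost model. The sketch below is only an illustration under stated assumptions, not the simulator from [1]: each ray is reduced to a hypothetical number of remaining traversal steps, a warp costs one unit per lockstep step no matter how many of its lanes are active, and the cost of copying rays between warps is ignored. All function names are hypothetical.</p>
      <preformat><![CDATA[
```python
WARP_SIZE = 32


def cost_without_compaction(steps):
    """Each warp runs until its slowest ray finishes; idle lanes still occupy it."""
    warps = [steps[i:i + WARP_SIZE] for i in range(0, len(steps), WARP_SIZE)]
    return sum(max(w) for w in warps)


def cost_with_compaction(steps):
    """Retire a warp once more than half of its rays are done, then repack survivors."""
    pending = [s for s in steps if s > 0]  # remaining steps of unfinished rays
    cost = 0
    while pending:
        warp, pending = pending[:WARP_SIZE], pending[WARP_SIZE:]
        launched = len(warp)
        # Run in lockstep while at least half of the launched rays are still active.
        while warp and 2 * len(warp) >= launched:
            warp = [s - 1 for s in warp if s > 1]  # rays with one step left finish
            cost += 1
        pending.extend(warp)  # survivors are copied to a later warp
    return cost


# Hypothetical workload: in each of two warps, 31 short rays and one long ray.
workload = ([1] * 31 + [100]) * 2
plain = cost_without_compaction(workload)
compacted = cost_with_compaction(workload)
print(plain, compacted)
```
]]></preformat>
      <p>In this hypothetical workload the two long rays end up sharing one warp after compaction, so the long tail is paid once rather than once per original warp.</p>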
      <p>In [15] the authors focus on improvements related to
memory access, in particular on completely avoiding
random memory access during ray traversal. Their
approach is based on presenting the data needed for ray
tracing as two streams: a stream of geometry data split
into segments and a stream of rays collected in a queue
per geometry segment they intersect. This allows
geometry and rays to be fetched from main memory into
caches before they are needed for traversal.</p>
      <p>In addition to the MIMD execution model
and treelets, [6] proposes reduced precision BVH
traversal, which also allows for chip area and power
savings. Another specific point of [6] is that the
authors propose a small solution which can be integrated
into an existing GPU architecture. There are also works
focused on developing mobile ray tracing hardware
(such as [8, 11]). These solutions usually share such
properties as a MIMD execution model and
hardware traversal and intersection units. RayCore [11] has
distinctive properties that separate it from other
architectures: it is a fully fixed function implementation of Whitted-style ray
tracing [19], it uses a kD-tree as the acceleration structure
and it includes a hardware unit for kD-tree construction.</p>
      <p>Summary. Overall, quite a few different
architectures and hardware acceleration techniques for ray
tracing have been proposed over the years. A detailed
review and comparison can be found in [2]. Some of the
mentioned architectures have been implemented in
FPGAs. Production level hardware besides
Nvidia RTX is represented by [4] and by the mobile GPUs of
Imagination Technologies [21]. However, neither of
them has published details, [4] is discontinued
and [21] is not yet available. Therefore, RTX is the
first hardware ray tracing acceleration technology to
reach the wide public. But since the implementation
details are closed (as with [4, 21]), it is unclear how exactly
it works and which acceleration techniques it uses. In
this paper, we aim to understand the principles behind
ray tracing acceleration in Nvidia RTX hardware by
measuring the performance in several scenarios using
the Vulkan and DirectX 12 APIs.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Experimental analysis of Nvidia RTX</title>
      <p>First, let us briefly review the available information
about the inner workings of RTX. RTX ray
tracing functionality is accessible through the Vulkan API,
Microsoft DirectX 12 (DXR) and the Nvidia OptiX API
libraries [23]. We used both Vulkan and DirectX 12
for our experiments.</p>
      <sec id="sec-2-1">
        <title>2.1. Known details</title>
        <p>In summary, for both graphics APIs the
corresponding extensions add functionality to create a ray
tracing pipeline with the corresponding new shader
types, commands and objects for acceleration
structures, and tools to associate shader groups with
acceleration structures (i.e. the shader binding table).</p>
        <p>The acceleration structure is represented as a two-level
tree. Bottom level acceleration structure (BLAS)
objects contain the actual vertices, and the top level acceleration
structure (TLAS) contains instances of BLAS objects, i.e.
transformation matrices. The building process is done on
the GPU, and the acceleration structure is some form of BVH
[17].</p>
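        <p>The two-level layout can be pictured with a small data model. This is a conceptual sketch only: the class and function names are hypothetical, it does not use the Vulkan or DirectX 12 API, and it omits the BVH that the driver builds over each level.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass, field

Vec3 = tuple[float, float, float]
# Row-major 3x4 affine transform: three rows of (x, y, z, translation).
Mat3x4 = tuple[tuple[float, float, float, float], ...]


@dataclass
class Blas:
    """Bottom level: owns the actual geometry (object-space vertices)."""
    vertices: list[Vec3]
    triangles: list[tuple[int, int, int]]  # indices into `vertices`


@dataclass
class Instance:
    """One top-level entry: a transform plus a reference to a shared BLAS."""
    transform: Mat3x4
    blas: Blas


@dataclass
class Tlas:
    """Top level: holds only instances, no vertices of its own."""
    instances: list[Instance] = field(default_factory=list)


def transform_point(m: Mat3x4, p: Vec3) -> Vec3:
    return tuple(r[0] * p[0] + r[1] * p[1] + r[2] * p[2] + r[3] for r in m)


def world_triangles(tlas: Tlas):
    """Yield every instanced triangle with its vertices in world space."""
    for inst in tlas.instances:
        for tri in inst.blas.triangles:
            yield tuple(transform_point(inst.transform, inst.blas.vertices[i]) for i in tri)


def translation(tx: float, ty: float, tz: float) -> Mat3x4:
    return ((1.0, 0.0, 0.0, tx), (0.0, 1.0, 0.0, ty), (0.0, 0.0, 1.0, tz))


# One triangle stored once in a BLAS and placed twice through the TLAS.
unit_triangle = Blas(vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
                     triangles=[(0, 1, 2)])
scene = Tlas([Instance(translation(0.0, 0.0, 0.0), unit_triangle),
              Instance(translation(5.0, 0.0, 0.0), unit_triangle)])
tris = list(world_triangles(scene))
```
]]></preformat>
        <p>The geometry is stored once, and each additional instance only adds a transformation matrix to the top level.</p>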
        <p>The ray tracing pipeline has five shader types: ray
generation, miss, closest hit, any hit and intersection.
Shader programs of the first three types are mandatory
and the last two are optional. All stages of the ray tracing
algorithm are programmable. There is a built-in
ray-triangle intersection shader which is used by default.
The official whitepaper [22] states that the RT core has a
ray-triangle intersection unit inside. In [18] the authors show
a 2-3.5 times improvement in the performance of their
algorithm for point location in tetrahedral meshes when using
the built-in triangle intersection unit on Turing hardware,
while Volta hardware (which has no RT cores, so a
software fallback is used for RTX functionality)
shows a performance loss in the same scenario.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Experiments</title>
        <p>
          To understand how RTX works under the hood we
conducted several experiments. As a base for our
investigations we implemented a basic path tracing
algorithm [5] and compared it to the open source
implementation of path tracing in Hydra Renderer [
          <xref ref-type="bibr" rid="ref29">24</xref>
          ].
        </p>
        <p>Implementing a minimal path tracer using
RTX in Vulkan or DirectX 12 requires the
developer to:</p>
        <list list-type="order">
          <list-item><p>build acceleration structures using the ray tracing extension API;</p></list-item>
          <list-item><p>create a ray tracing pipeline containing at least ray generation, closest hit and miss shader programs;</p></list-item>
          <list-item><p>create a shader table to bind shader programs to acceleration structures;</p></list-item>
          <list-item><p>create and execute command buffers on the created pipeline.</p></list-item>
        </list>
        <p>Even a minimal implementation using RTX leaves several design options
which can potentially affect
performance. For example, the shading and lighting
code can be executed in the ray generation shader, in
a single (closest) hit shader or in several hit shaders.
We tested two different implementations following the
best practices of RTX for Vulkan and DX12:</p>
        <list list-type="order">
          <list-item><p>impl_1 (Vulkan): the ray generation shader creates ray(s) for each pixel in a loop until the specified tracing depth is reached;</p></list-item>
          <list-item><p>impl_2 (DirectX): the ray generation shader spawns the primary ray and the closest hit shader generates further rays until the specified depth is reached.</p></list-item>
        </list>
        <p>To measure performance in all our experiments we
used the Nvidia Nsight Graphics software and 2 GPUs:
GTX1070 and RTX2070. It is known that while
RTX2070 has hardware acceleration for ray tracing,
GTX1070 has a software implementation of RTX. Using
this setup we captured frames from our path tracing
application and logged the time spent by the
vkCmdTraceRaysNV (Vulkan) or DispatchRays (DirectX 12)
function and by the «BVH4TraversalInstKernel» kernel in Hydra
Renderer. In our first set of experiments we ran the
implemented path tracer on three scenes (Sponza,
CrySponza, Hairballs) with different tracing depths.
From the measured time we calculated frames per second
and the approximate number of rays traced per second as:</p>
        <disp-formula><label>(1)</label><tex-math><![CDATA[\mathit{rays} = \mathit{width} \cdot \mathit{height} \cdot \mathit{spp} \cdot \mathit{fps}]]></tex-math></disp-formula>
        <p>where width and height are the rendering resolution, spp is the number of samples per
pixel and fps is the number of frames per second.</p>
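        <p>As a worked instance of formula (1), the sketch below converts a measured frame time into an approximate ray rate. The function name, the parameter frame_time_ms and the 10 ms figure are hypothetical and used only for illustration; they are not measurements from our experiments.</p>
        <preformat><![CDATA[
```python
def rays_per_second(width: int, height: int, spp: int, frame_time_ms: float) -> float:
    """Formula (1): rays = width * height * spp * fps, with fps from the frame time."""
    fps = 1000.0 / frame_time_ms
    return width * height * spp * fps


# Hypothetical input: 1024 x 1024, 1 sample per pixel, 10 ms spent in the trace call.
example = rays_per_second(1024, 1024, 1, 10.0)
print(f"{example / 1e6:.1f} million rays per second")
```
]]></preformat>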
        <p>Scenes and implementations measured: Sponza, Crysponza and Hairballs, each with impl_1, impl_2 and Hydra_SW.</p>
        <p>Fig. 1. Time spent by the ray tracing "draw call" per frame (1
sample per pixel, 1024 x 1024 resolution) depending on the number of
rays traced per depth level. Depth = 3.</p>
        <p>Next, we modified impl_2 to trace several rays at
each depth level, essentially transforming it into an
implementation of branched (recursive) path tracing. As
can be seen in fig. 1, the time grows in proportion
to the number of rays, and in some cases more slowly.
For example, with 4 rays per depth level the total
number of rays is 7 times higher than with 1 ray per
depth level (21 against 3), while the performance drop is
6 times for Sponza and 3.6 times for Hairballs.</p>
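        <p>The ray counts quoted above follow from summing a geometric series over the depth levels. The short sketch below reproduces the 21 against 3 figure under the assumption that one primary ray is traced per pixel and every hit spawns the same number of rays at the next level; the function name is hypothetical.</p>
        <preformat><![CDATA[
```python
def total_rays(rays_per_level: int, depth: int) -> int:
    """Rays traced per pixel: 1 primary ray, then k per hit at each deeper level."""
    return sum(rays_per_level ** level for level in range(depth))


branched = total_rays(4, 3)  # 1 + 4 + 16
single = total_rays(1, 3)    # 1 + 1 + 1
ratio = branched / single
print(branched, single, ratio)
```
]]></preformat>
        <p>Comparing this ratio with the measured slowdowns (6 times for Sponza, 3.6 times for Hairballs) shows that the cost per ray does not grow with branching.</p>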
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results and discussion</title>
      <p>Conclusion #1: Nvidia RTX is primarily aimed at
accelerating random access to memory during ray
tracing, more specifically, traversal of the BVH tree with
sets of random rays. This conclusion stems from fig. 2
(right), where the hardware implementation
on the small scene (Sponza) is only about 2 times faster (477 vs
1140) with «coherent» and «sorted» sets of primary
rays, but 4-5 times faster for the same Sponza scene
with incoherent rays (122 vs 561). Moreover, the large
scene (Hairballs) shows the same 4-5 times speedup for both
primary (58 vs 283) and secondary (50 vs 210) rays.
The fact that the acceleration is preserved on a scene
where memory is the bottleneck confirms our
conclusion.</p>
      <p>Conclusion #2: Nvidia RTX implements some form of
ray grouping or ray sorting, probably in
combination with GPU work creation (see conclusion
#4). This assumption is supported by the fact that on
simple scenes (like Sponza) the hardware implementation
does not show a significant performance drop when we
move from primary to secondary rays (table 1, fig. 1).
At the same time, the performance of the software
implementation degrades much faster. However, on the
scene where ray grouping cannot help (Hairballs),
neither the hardware nor the software implementation shows a
significant performance difference between primary and
secondary rays.</p>
      <p>Conclusion #3: Despite Nvidia's efforts,
placing the whole code in a single kernel («CPU
style» or «uber kernel») is still inefficient for GPUs.
We draw this conclusion for 2 main reasons.
First, the open source implementation with separate kernels
in Hydra Renderer is almost 2 times faster than
Nvidia RTX in the pure software case (fig. 2, left).
Second, when comparing 2 slightly different
implementations of RTX in Vulkan and DX12 we found
dramatic changes in performance caused by a slight
change in the complexity of the shaders in
«impl_1» (more complex) vs «impl_2» (simpler), table
1. This can be explained by an occupancy drop that
depends on code complexity and register pressure.</p>
      <p>Conclusion #4: Nvidia RTX uses GPU work creation
for rays. This conclusion is supported by a simple
observation. When we generated a random number of
rays (10 to 40), rendering was 2 times slower than
with 10 rays. In contrast, when we
calculated Perlin noise with a random number of noise function calls
(10 to 40), it was exactly 4 times slower, which is what we would
expect without GPU work creation. Our experiment
with recursive ray tracing (fig. 1) also confirms the presence of GPU
work creation, since the time is proportional to
the number of rays.</p>
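      <p>The reasoning behind this comparison can be made explicit with an idealized model. The sketch below is an assumption, not a description of how RTX works: threads run in warps of 32, each thread needs a random number of work items between 10 and 40, and the cost is compared with every thread doing 10 items. Without work creation a warp waits for its busiest thread, and with perfect redistribution the cost follows the total amount of work. The function name and the seeded sample are hypothetical.</p>
      <preformat><![CDATA[
```python
import random

WARP_SIZE = 32


def relative_cost(counts, baseline, work_creation):
    """Cost relative to every thread performing `baseline` work items."""
    warps = [counts[i:i + WARP_SIZE] for i in range(0, len(counts), WARP_SIZE)]
    if work_creation:
        # Idealized redistribution: cost follows the total amount of work.
        cost = sum(counts) / WARP_SIZE
    else:
        # Lockstep loop: every warp waits for its thread with the most items.
        cost = sum(max(w) for w in warps)
    return cost / (baseline * len(warps))


rng = random.Random(0)  # fixed seed so the sample is reproducible
counts = [rng.randint(10, 40) for _ in range(WARP_SIZE * 1024)]
lockstep = relative_cost(counts, 10, work_creation=False)
redistributed = relative_cost(counts, 10, work_creation=True)
print(f"lockstep: {lockstep:.2f}x, redistributed: {redistributed:.2f}x")
```
]]></preformat>
      <p>In this model the lockstep case approaches 4 times the baseline, because almost every warp contains a thread near the maximum of 40, while the redistributed case stays near the mean of 25 items, that is, about 2.5 times. The measured factor of about 2 for rays is closer to the mean-driven figure, and the factor of 4 for Perlin noise matches the maximum-driven one.</p>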
    </sec>
    <sec id="sec-4">
      <title>4. Final conclusion</title>
      <p>Our main conclusion is that Nvidia RTX is a
«general» technology oriented towards speeding up
random memory access and irregular work
distribution on GPUs. We can therefore expect that in the near
future other sets of algorithms (at least some spatial
search algorithms) will be hardware accelerated.</p>
      <p>We believe Nvidia puts a lot of effort into their
compiler and into software support for GPU work creation. This
technology is an example showing that «the
golden age of software» has ended and «the golden
age of compilers and HW/SW projects» has started.</p>
      <p>Despite the overall complexity of Vulkan and
DX12, such improvements make a GPU implementation of
a complex rendering engine much simpler for the developer.
On the other hand, this simplicity is achieved at the
cost of tying the project to a fairly heavy technology.
We believe that an efficient software implementation of
RTX would be complex and expensive because of GPU work
creation and the specific compiler that Nvidia puts inside
RTX: even Nvidia’s own software implementation on
GTX1070 essentially loses to the simple and
straightforward open source ray tracing implementation
in Hydra Renderer.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Acknowledgments</title>
      <p>This work was sponsored by RFBR 18-31-20032
grant.</p>
    </sec>
    <sec id="sec-6">
      <title>6. References</title>
      <p>[1] Aila T., Karras T. Architecture considerations for tracing
incoherent rays //High-performance Graphics.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>- Eurographics Association</surname>
          </string-name>
          ,
          <year>2010</year>
          . - p.
          <fpage>113</fpage>
          -
          <lpage>122</lpage>
          . [2]
          <string-name>
            <surname>Deng</surname>
            <given-names>Y.</given-names>
          </string-name>
          et al.
          <article-title>Toward real-time ray tracing: A survey on</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>ACM Computing Surveys (CSUR)</surname>
          </string-name>
          .
          <source>- 2017</source>
          . - .
          <volume>50</volume>
          . -
          <fpage>№</fpage>
          . 4.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          - p.
          <fpage>58</fpage>
          . [3]
          <string-name>
            <surname>Gribble</surname>
            <given-names>C. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramani</surname>
            <given-names>K</given-names>
          </string-name>
          . Coherent ray tracing via
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>stream filtering //2008 IEEE Symposium on Interactive</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Ray</given-names>
            <surname>Tracing</surname>
          </string-name>
          . - IEEE,
          <year>2008</year>
          . - p.
          <fpage>59</fpage>
          -
          <lpage>66</lpage>
          . [4]
          <string-name>
            <surname>Hall</surname>
            .
            <given-names>D.</given-names>
          </string-name>
          <article-title>The AR350: Today's ray trace rendering</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Graphics</surname>
          </string-name>
          hardware -
          <source>Hot 3D Session 1</source>
          ,
          <year>2001</year>
          [5]
          <string-name>
            <surname>Kajiya</surname>
            <given-names>J. T.</given-names>
          </string-name>
          <article-title>The rendering equation //ACM SIGGRAPH</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <source>computer graphics. - ACM</source>
          ,
          <year>1986</year>
          . - .
          <volume>20</volume>
          . -
          <fpage>№</fpage>
          . 4. - p.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          143-
          <fpage>150</fpage>
          . [6]
          <string-name>
            <surname>Keely</surname>
            <given-names>S.</given-names>
          </string-name>
          <article-title>Reduced precision hardware for ray tracing</article-title>
          .//Proc.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>HPG</surname>
          </string-name>
          . -
          <year>2014</year>
          . - p.
          <fpage>29</fpage>
          -
          <lpage>40</lpage>
          . [7]
          <string-name>
            <surname>Kopta</surname>
            <given-names>D.</given-names>
          </string-name>
          et al.
          <article-title>An energy and bandwidth efficient ray</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <year>2013</year>
          . - p.
          <fpage>121</fpage>
          -
          <lpage>128</lpage>
          . [8]
          <string-name>
            <surname>Lee</surname>
            <given-names>W. J.</given-names>
          </string-name>
          et al.
          <article-title>SGRT: A mobile GPU architecture for</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>conference</surname>
          </string-name>
          .
          <source>- ACM</source>
          ,
          <year>2013</year>
          . - p.
          <fpage>109</fpage>
          -
          <lpage>119</lpage>
          . [9]
          <string-name>
            <surname>Meißner</surname>
            <given-names>M.</given-names>
          </string-name>
          et al.
          <article-title>VIZARD II: a reconfigurable inter-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <year>2002</year>
          . - p.
          <fpage>137</fpage>
          -
          <lpage>146</lpage>
          . [10]
          <string-name>
            <surname>Nah J. H</surname>
          </string-name>
          . et al.
          <source>T&amp;I engine: traversal and intersection</source>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <source>Transactions on Graphics (TOG)</source>
          .
          <source>- ACM</source>
          ,
          <year>2011</year>
          . - .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          30. -
          <fpage>№</fpage>
          . 6. - p.
          <fpage>160</fpage>
          . [11]
          <string-name>
            <surname>Nah J. H</surname>
          </string-name>
          . et al.
          <article-title>RayCore: A ray-tracing hardware ar-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Graphics (TOG)</surname>
          </string-name>
          .
          <source>- ACM</source>
          ,
          <year>2014</year>
          . - .
          <volume>33</volume>
          . -
          <fpage>№</fpage>
          . 5. - p.
          <fpage>162</fpage>
          . [12]
          <string-name>
            <surname>Pfister</surname>
            <given-names>H.</given-names>
          </string-name>
          et al.
          <article-title>The VolumePro real-time ray-casting</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>-</surname>
          </string-name>
          <year>1999</year>
          . - p.
          <fpage>251</fpage>
          -
          <lpage>260</lpage>
          . [13]
          <string-name>
            <surname>Schmittler</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wald</surname>
            <given-names>I.</given-names>
          </string-name>
          , Slusallek P. SaarCOR: a hard-
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <year>2002</year>
          . - p.
          <fpage>27</fpage>
          -
          <lpage>36</lpage>
          . [14]
          <string-name>
            <surname>Schmittler</surname>
            <given-names>J.</given-names>
          </string-name>
          et al.
          <article-title>Realtime ray tracing of dy-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <source>- ACM</source>
          ,
          <year>2004</year>
          . - p.
          <fpage>95</fpage>
          -
          <lpage>106</lpage>
          . [15]
          <string-name>
            <surname>Shkurko</surname>
            <given-names>K.</given-names>
          </string-name>
          et al.
          <article-title>Dual streaming for hardware-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <source>- ACM</source>
          ,
          <year>2017</year>
          . - p.
          <fpage>12</fpage>
          . [16]
          <string-name>
            <surname>Spjut</surname>
            <given-names>J.</given-names>
          </string-name>
          et al.
          <article-title>TRaX: A multi-threaded architecture</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Specific</given-names>
            <surname>Processors</surname>
          </string-name>
          . - IEEE,
          <year>2008</year>
          . - p.
          <fpage>108</fpage>
          -
          <lpage>114</lpage>
          . [17]
          <string-name>
            <surname>Stich</surname>
            <given-names>M.</given-names>
          </string-name>
          <article-title>Real-time raytracing with Nvidia RTX,</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <source>GTC EU</source>
          <year>2018</year>
          [18]
          <string-name>
            <surname>Wald</surname>
            <given-names>I.</given-names>
          </string-name>
          et al.
          <source>RTX Beyond Ray Tracing: Exploring</source>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <source>High-Performance Graphics</source>
          <year>2019</year>
          [19]
          <string-name>
            <surname>Whitted</surname>
            <given-names>T.</given-names>
          </string-name>
          <article-title>An improved illumination model for shaded</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          display //ACM SIGGRAPH - ACM,
          <year>1979</year>
          . - .
          <volume>13</volume>
          . -
          <fpage>№</fpage>
          . 2.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          - .
          <year>14</year>
          . [20]
          <string-name>
            <surname>Woop</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmittler</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Slusallek</surname>
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>RPU</surname>
          </string-name>
          <article-title>: a pro-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <year>2005</year>
          .
          <article-title>-</article-title>
          .
          <volume>24</volume>
          . -
          <fpage>№</fpage>
          . 3. - p.
          <fpage>434</fpage>
          -
          <lpage>444</lpage>
          . [21]
          <article-title>Imagination technologies</article-title>
          .
          <source>PowerVR Ray Tracing.</source>
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>2019. URL = https://www.imgtec.com/graphics-</mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          processors/architecture/powervr-ray-tracing/ [22]
          <article-title>Nvidia Turing architecture whitepaper</article-title>
          .
          <source>2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <article-title>architecture/NVIDIA-Turing-Architecture-Whitepaper</article-title>
          .pdf [23]
          <string-name>
            <surname>Nvidia</surname>
            <given-names>RTX</given-names>
          </string-name>
          <article-title>Ray tracing developer resources</article-title>
          .
          <source>2019</source>
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Tracing</surname>
            <given-names>Systems</given-names>
          </string-name>
          , Keldysh Institute of Applied
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          derer.
          <source>Open source rendering system</source>
          .
          <source>2019 URL =</source>
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [25]
          <fpage>specification</fpage>
          . 2019 URL = https://
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>www.khronos.org/registry/vulkan/specs/1.1-extensions/</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>