Launched | September 20, 2018 |
---|---|
Designed by | Nvidia |
Manufactured by | |
Fabrication process | TSMC 12FFC |
Codename(s) | TU10x TU11x |
Product Series | |
Desktop | |
Professional/workstation | |
Server/datacenter | |
Specifications | |
Compute |
|
L1 cache | 96 KB (per SM) |
L2 cache | 2 MB to 6 MB |
Memory support | GDDR6 HBM2 |
PCIe support | PCIe 3.0 |
Supported Graphics APIs | |
DirectX | DirectX 12 Ultimate (Feature Level 12_2) |
Direct3D | Direct3D 12.0 |
Shader Model | Shader Model 6.7 |
OpenCL | OpenCL 3.0 |
OpenGL | OpenGL 4.6 |
CUDA | Compute Capability 7.5 |
Vulkan | Vulkan 1.3 |
Media Engine | |
Encode codecs | |
Decode codecs | |
Color bit-depth |
|
Encoder(s) supported | NVENC |
Display outputs | |
History | |
Predecessor | Pascal |
Variant | Volta (datacenter/HPC) |
Successor | Ampere |
Turing is the codename for a graphics processing unit (GPU) microarchitecture developed by Nvidia. It is named after the prominent mathematician and computer scientist Alan Turing. The architecture was first introduced in August 2018 at SIGGRAPH 2018 in the workstation-oriented Quadro RTX cards,[2] and one week later at Gamescom in consumer GeForce 20 series graphics cards.[3] Building on the preliminary work of Volta, its HPC-exclusive predecessor, the Turing architecture introduces the first consumer products capable of real-time ray tracing, a longstanding goal of the computer graphics industry. Key elements include dedicated artificial intelligence processors ("Tensor cores") and dedicated ray tracing processors ("RT cores"). Turing leverages DXR, OptiX, and Vulkan for access to ray tracing. In February 2019, Nvidia released the GeForce 16 series GPUs, which utilizes the new Turing design but lacks the RT and Tensor cores.
Turing is manufactured using TSMC's 12 nm FinFET semiconductor fabrication process. The high-end TU102 GPU includes 18.6 billion transistors fabricated using this process.[1] Turing also uses GDDR6 memory from Samsung Electronics, and previously Micron Technology.
The Turing microarchitecture combines multiple types of specialized processor core, and enables an implementation of limited real-time ray tracing.[4] This is accelerated by the use of new RT (ray-tracing) cores, which are designed to process quadtrees and spherical hierarchies, and speed up collision tests with individual triangles.
Features in Turing:
The GDDR6 memory is produced by Samsung Electronics for the Quadro RTX series.[6] The RTX 20 series initially launched with Micron memory chips, before switching to Samsung chips by November 2018.[7]
Nvidia reported rasterization (CUDA) performance gains for existing titles of approximately 30–50% over the previous generation.[8][9]
The ray-tracing performed by the RT cores can be used to produce reflections, refractions and shadows, replacing traditional raster techniques such as cube maps and depth maps. Instead of replacing rasterization entirely, however, the information gathered from ray-tracing can be used to augment the shading with information that is much more photo-realistic, especially in regards to off-camera action. Nvidia said the ray-tracing performance increased about 8 times over the previous consumer architecture, Pascal.
Generation of the final image is further accelerated by the Tensor cores, which are used to fill in the blanks in a partially rendered image, a technique known as de-noising. The Tensor cores perform the result of deep learning to codify how to, for example, increase the resolution of images generated by a specific application or game. In the Tensor cores' primary usage, a problem to be solved is analyzed on a supercomputer, which is taught by example what results are desired, and the supercomputer determines a method to use to achieve those results, which is then done with the consumer's Tensor cores. These methods are delivered via driver updates to consumers.[8] The supercomputer uses a large number of Tensor cores itself.
Die | TU102[10] | TU104[11] | TU106[12] | TU116[13] | TU117[14] |
---|---|---|---|---|---|
Die size | 754 mm2 | 545 mm2 | 445 mm2 | 284 mm2 | 200 mm2 |
Transistors | 18.6B | 13.6B | 10.8B | 6.6B | 4.7B |
Transistor density | 24.7 MTr/mm2 | 25.0 MTr/mm2 | 24.3 MTr/mm2 | 23.2 MTr/mm2 | 23.5 MTr/mm2 |
Graphics processing clusters (GPC) |
6 | 6 | 3 | 3 | 2 |
Streaming multiprocessors (SM) |
72 | 48 | 36 | 24 | 16 |
CUDA cores | 4608 | 3072 | 2304 | 1536 | 1024 |
Texture mapping units | 288 | 192 | 144 | 96 | 64 |
Render output units | 96 | 64 | 64 | 48 | 32 |
Tensor cores | 576 | 384 | 288 | — | |
RT cores | 72 | 48 | 36 | ||
L1 cache | 6.75 MB | 4.5 MB | 3.375 MB | 2.25 MB | 1.5 MB |
96 KB per SM | |||||
L2 cache | 6 MB | 4 MB | 4 MB | 1.5 MB | 1 MB |
Turing's development platform is called RTX. RTX ray-tracing features can be accessed using Microsoft's DXR, OptiX, as well using Vulkan extensions (the last one being also available on Linux drivers).[15] It includes access to AI-accelerated features through NGX. The Mesh Shader, Shading Rate Image functionalities are accessible using DirectX 12, Vulkan and OpenGL extensions on Windows and Linux platforms.[16]
Windows 10 October 2018 update includes the public release of DirectX Raytracing.[17][18]