Nvidia Ampere vs Turing GPU Architecture Comparison


Turing and Ampere are two GPU architectures from Nvidia used in its RTX series graphics cards. Both offer a significant improvement over older Nvidia architectures such as Volta and Pascal, and the two share some similarities with each other. Ampere is the newer of the two and powers the RTX 30 Series graphics cards, while Turing powers the RTX 20 Series. Ampere also brings newer features and improvements over Turing. To help you understand the significant differences, here is a general comparison of the two architectures on the important parameters.

Turing GPU Architecture

Turing is the immediate successor to the Volta GPU architecture. It is built on a 12nm fabrication process and supports GDDR5, HBM2, and GDDR6 memory. The Turing GPU architecture combines CUDA Cores, RT Cores, and Tensor Cores in a single GPU chip (except in the GTX 16 Series cards). It is the first Nvidia architecture to support real-time ray tracing, which is used to create lifelike shadows, reflections, and other advanced lighting effects. Turing also supports DLSS (Deep Learning Super Sampling), an AI-based technology that uses the Tensor Cores to increase frame rates in games with little or no visible loss of image quality. Note that a game must itself support ray tracing and DLSS for you to take advantage of them. According to Nvidia, the Turing GPU architecture provides up to a 6x performance increase over the older Pascal GPU architecture, which is a great leap forward.


Graphics cards based on the Turing GPU architecture include the GeForce RTX 20 Series and the GTX 16 Series. However, the GeForce GTX 16 Series cards do not come with RT Cores or Tensor Cores. GeForce RTX 20 Series graphics cards also support VirtualLink, which uses a USB Type-C connector to connect next-generation VR headsets over a single cable. The Turing GPU architecture is also used in workstation graphics cards, including the Quadro RTX 4000, Quadro RTX 5000, Quadro RTX 6000, and Quadro RTX 8000.

Ampere GPU Architecture

Ampere is the successor to the Turing GPU architecture. It is built on an 8nm fabrication process and supports high-speed GDDR6, HBM2, and GDDR6X memory. GDDR6X is currently the fastest graphics memory; it can reach speeds of up to 21Gbps per pin and offer bandwidth of up to 1TB/s. The Ampere architecture provides a significant improvement over Turing and comes with 2nd generation RT Cores and 3rd generation Tensor Cores. These new RT and Tensor Cores deliver about 2x the throughput of the previous generation RT and Tensor Cores used in Turing. This means a significant performance boost in games and other applications that support ray tracing and AI technologies.
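To see how a 21Gbps per-pin speed relates to roughly 1TB/s of bandwidth, here is a minimal Python sketch of the usual peak-bandwidth arithmetic. The 384-bit bus width is an assumption for illustration (it is not stated in this article), and the function name is hypothetical.

```python
def memory_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s.

    Multiplies the per-pin data rate (Gbit/s) by the bus width (bits)
    and divides by 8 to convert bits to bytes.
    """
    return data_rate_gbps * bus_width_bits / 8


# Assumption: a 384-bit memory bus, chosen only for illustration.
gddr6x_peak = memory_bandwidth_gb_s(21.0, 384)
print(f"{gddr6x_peak:.0f} GB/s")  # 1008 GB/s, i.e. about 1 TB/s
```

A card with a narrower bus or slower memory chips lands below this figure, so the 1TB/s number is best read as an upper bound.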


The Ampere architecture supports the PCIe Gen 4 standard, which doubles the bandwidth of the PCIe Gen 3 interface. Ampere GPUs have CUDA compute capability 8.x (8.0 on the data-center A100 and 8.6 on the RTX 30 Series), compared to 7.5 for Turing. Each Ampere Streaming Multiprocessor (SM) can execute twice as many FP32 operations per clock as a Turing SM, which doubles the peak FP32 throughput per SM. The Ampere GPU architecture supports NVLink 3.0 to increase the computing capability of a system that uses more than one GPU. According to Nvidia, Ampere offers up to a 1.9x performance-per-watt improvement over the Turing architecture.
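The PCIe doubling can be checked with a quick calculation. The sketch below assumes an x16 slot, the published per-lane transfer rates (8 GT/s for Gen 3, 16 GT/s for Gen 4), and the 128b/130b line encoding both generations use; the function name is hypothetical and the results are theoretical one-direction peaks, not measured throughput.

```python
def pcie_bandwidth_gb_s(transfer_rate_gt_s: float, lanes: int = 16) -> float:
    """Approximate one-direction PCIe throughput in GB/s.

    Assumes 128b/130b encoding (128 payload bits per 130 bits on the wire)
    and converts bits to bytes by dividing by 8.
    """
    encoding_efficiency = 128 / 130
    return transfer_rate_gt_s * encoding_efficiency * lanes / 8


gen3 = pcie_bandwidth_gb_s(8.0)   # PCIe Gen 3: 8 GT/s per lane
gen4 = pcie_bandwidth_gb_s(16.0)  # PCIe Gen 4: 16 GT/s per lane

print(f"Gen 3 x16: {gen3:.2f} GB/s")  # 15.75 GB/s
print(f"Gen 4 x16: {gen4:.2f} GB/s")  # 31.51 GB/s
print(f"Ratio: {gen4 / gen3:.1f}x")   # 2.0x
```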


Another great addition to Ampere is support for HDMI 2.1, which enables ultra-high resolutions and refresh rates such as 8K@60Hz and 4K@120Hz. HDMI 2.1 also supports Dynamic HDR and has a total bandwidth of 48Gbps. RTX IO is another new feature introduced with the Ampere architecture. It can reduce CPU I/O overhead and dramatically decrease game loading times by decompressing game textures and data on the GPU, directly into GPU memory. This feature works in conjunction with the Microsoft DirectStorage API for Windows. Graphics cards that use the Ampere GPU architecture are the RTX 30 Series graphics cards, which include the GeForce RTX 3090, RTX 3080, and RTX 3070.
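To get a feel for why the 8K@60Hz and 4K@120Hz modes need HDMI 2.1's 48Gbps, the following Python sketch estimates the raw pixel data rate of each mode. It is a simplification under stated assumptions: 24 bits per pixel (8-bit RGB), active pixels only, with blanking intervals and link encoding overhead ignored. The function and constant names are hypothetical.

```python
HDMI_2_1_LINK_GBPS = 48.0  # total HDMI 2.1 bandwidth quoted in the article


def active_pixel_rate_gbps(width: int, height: int, refresh_hz: int,
                           bits_per_pixel: int = 24) -> float:
    """Raw data rate of the visible pixels in Gbit/s.

    Ignores blanking intervals and link encoding overhead, so real
    requirements are higher than this estimate.
    """
    return width * height * refresh_hz * bits_per_pixel / 1e9


modes = {
    "8K@60Hz": (7680, 4320, 60),
    "4K@120Hz": (3840, 2160, 120),
}

for name, (width, height, refresh_hz) in modes.items():
    rate = active_pixel_rate_gbps(width, height, refresh_hz)
    print(f"{name}: {rate:.2f} Gbps of {HDMI_2_1_LINK_GBPS:.0f} Gbps")
# 8K@60Hz: 47.78 Gbps of 48 Gbps
# 4K@120Hz: 23.89 Gbps of 48 Gbps
```

Even with this optimistic estimate, 8K@60Hz leaves almost no headroom on the link, which is why such modes typically rely on compression or chroma subsampling in practice.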

Nvidia Ampere vs Turing GPU Architecture

A quick comparison of the Ampere and Turing GPU architectures from Nvidia.

| GPU Architecture | Ampere | Turing |
| --- | --- | --- |
| Manufacturer | Nvidia | Nvidia |
| Fabrication Process | 8nm (Samsung) | 12nm (TSMC) |
| CUDA Compute Capability | 8.x | 7.5 |
| RT Cores | 2nd Generation | 1st Generation |
| Tensor Cores | 3rd Generation | 2nd Generation |
| Streaming Multiprocessors | 2x FP32 | 1x FP32 |
| DLSS | DLSS 2.0 | DLSS 1.0 at launch, DLSS 2.0 added later |
| Memory Support | GDDR6, HBM2, GDDR6X | GDDR6, GDDR5, HBM2 |
| PCIe Support | PCIe Gen 4 | PCIe Gen 3 |
| NVIDIA Encoder (NVENC) | Gen 7 | Gen 7 |
| NVIDIA Decoder (NVDEC) | Gen 5 | Gen 4 |
| DirectX 12 Ultimate | Yes | Yes |
| VR Ready | Yes | Yes |
| Multi-GPU Support | NVLink 3.0 | NVLink 2.0 |
| Power Efficiency | Better than Turing | Better than Volta |
| Video Outputs | HDMI 2.1, DisplayPort 1.4a | HDMI 2.0b, DisplayPort 1.4a |
| Graphics Cards | RTX 30 Series | RTX 20 Series, GTX 16 Series |
| Applications | Gaming, Workstation, Artificial Intelligence (AI) | Gaming, Workstation, Artificial Intelligence (AI) |
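The 8 and 7.5 values in the CUDA row of the table are what Nvidia calls compute capability, a (major, minor) version pair that software can use to tell architectures apart. Below is a minimal Python sketch of such a lookup. The table and function names are hypothetical, and the mapping covers only the architectures mentioned in this article. The pair itself would come from a GPU query, for example `torch.cuda.get_device_capability()` in PyTorch.

```python
from typing import Optional

# Compute capability (major, minor) -> architecture name.
# Limited to the architectures discussed in this article.
ARCH_BY_COMPUTE_CAPABILITY = {
    (6, 0): "Pascal",
    (6, 1): "Pascal",
    (7, 0): "Volta",
    (7, 5): "Turing",
    (8, 0): "Ampere",  # data-center A100
    (8, 6): "Ampere",  # RTX 30 Series
}


def architecture_name(major: int, minor: int) -> Optional[str]:
    """Return the architecture name, or None if the pair is not in the table."""
    return ARCH_BY_COMPUTE_CAPABILITY.get((major, minor))


print(architecture_name(7, 5))  # Turing
print(architecture_name(8, 6))  # Ampere
```

Returning `None` for unknown pairs keeps the sketch honest about its limited coverage instead of guessing an architecture from the major version alone.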

Final Words

The Ampere GPU architecture offers significant improvements in Ray Tracing and DLSS, and even when these features are not in use, Ampere still delivers higher performance than Turing. The other significant addition in Ampere is PCIe Gen 4 support, which offers much higher bandwidth and could prove quite useful in the future. If you have anything to add, you can do so by leaving a comment below.
