NVIDIA GeForce RTX 4090

NVIDIA GeForce RTX 4090 — 24 Гб GDDR6X, 16,384 ядер, GPI 32.1

Loading...

Performance Rating

The GeForce RTX 4090 is an enthusiast-class graphics card by NVIDIA, launched on September 20th, 2022. Built on the 5 nm process, and based on the AD102 graphics processor, in its AD102-300-A1 variant, the card supports DirectX 12 Ultimate. This ensures that all modern games will run on GeForce RTX 4090. Additionally, the DirectX 12 Ultimate capability guarantees support for hardware-raytracing, variable-rate shading and more, in upcoming video games. The AD102 graphics processor is a large chip with a die area of 609 mm² and 76,300 million transistors. Unlike the fully unlocked TITAN Ada, which uses the same GPU but has all 18432 shaders enabled, NVIDIA has disabled some shading units on the GeForce RTX 4090 to reach the product's target shader count. It features 16384 shading units, 512 texture mapping units, and 176 ROPs. Also included are 512 tensor cores which help improve the speed of machine learning applications. The card also has 128 raytracing acceleration cores. NVIDIA has paired 24 GB GDDR6X memory with the GeForce RTX 4090, which are connected using a 384-bit memory interface. The GPU is operating at a frequency of 2235 MHz, which can be boosted up to 2520 MHz, memory is running at 1313 MHz (21 Gbps effective).

A100 A100
H200 H200
MI325X MI325X

NVIDIA GeForce RTX 4090

32.1

NVIDIA GeForce RTX 4090

32.1

Memory

Memory Size

24 ГБ

Memory Type

Memory Bandwidth

1.01 TB/s

Memory Bus Width

384 бит

ML Performance

FP16 (Half Precision)

82.58 TFLOPS
998.4 TFLOPS
(Intel UHD Graphics 730)

BF16 (Brain Float)

No TFLOPS
311.84 TFLOPS
(NVIDIA A800 SXM4 80 GB)

TF32 (TensorFloat)

Compute Power

FP32 (Single Precision)

82.58 TFLOPS

FP64 (Double Precision)

No TFLOPS
1,204,000.0 TFLOPS
(AMD Radeon RX 7900M)

CUDA Cores

Architecture & Compatibility

GPU Architecture

Ada Lovelace

SM (Streaming Multiprocessor)

PCIe Version

PCIe 4.0 x16

ML Software Support

CUDA Version

Clocks & Performance

Base Clock

Boost Clock

Memory Clock

Power Consumption

TDP/TGP

450 W
unknown
(NVIDIA CMP 70HX)

Recommended PSU

Power Connector

1x 16-pin

Rendering

Texture Units (TMU)

NVENC / NVDEC

NVENC Generation

8th Gen

NVENC Chips

1

Max Encoding Sessions

12

Encoding (NVENC)

H.264: H.265: AV1:

NVDEC Generation

5th Gen

NVDEC Chips

1

Decoding (NVDEC)

H.264: H.265: AV1: VP9:

Benchmarks

MLPerf, llama3.1-8b-edge (fp32)

44.7 tokens/s

llama.cpp, llama 7B Q4_0

154.7 tokens/s
315.3 tokens/s
(NVIDIA GeForce RTX 5090)

llama.cpp, llama-2-7b-Q4_0

189.0 tokens/s
280.7 tokens/s
(NVIDIA H100 SXM5 96 GB)

Geekbench AI, FP16

53 496 points
69 765 points
(NVIDIA GeForce RTX 5090)

Geekbench AI, INT8

29 155 points
34 585 points
(NVIDIA GeForce RTX 5090)

Geekbench AI, FP32

39 033 points
46 834 points
(NVIDIA GeForce RTX 5090)

Additional

Slots

Triple-slot
SXM Module
(NVIDIA H200 SXM 141 GB)

Release Date

Sept. 20, 2022

Display Outputs

1x HDMI 2.1
3x DisplayPort 1.4a
4x mini-DisplayPort 2.0
4x HDMI 2.1
(SPARKLE Arc A310 OmniView)

Renting is cheaper than buying