Nvidia's new Blackwell GPUs boost tiny workstations for AI

The Register

Nvidia has introduced two new compact Blackwell-architecture GPUs, the RTX Pro 4000 Small Form Factor (SFF) and the RTX Pro 2000, designed to deliver high performance for professional visualization and local AI workloads within a stringent 70-watt power envelope. Unveiled at the Siggraph conference in Vancouver, British Columbia, both cards share a half-height, dual-slot cooler design, making them suitable for space-constrained workstations.

Despite their similar physical profiles, the two cards target different performance tiers. The RTX Pro 4000 SFF is the more powerful of the two, with 8,960 CUDA cores, more than double the 4,352 found in the RTX Pro 2000. Nvidia claims the RTX Pro 4000 SFF delivers roughly 1.7 times faster ray tracing and 2.5 times higher AI performance than its predecessors. Its 280 tensor cores, the units specialized for AI math, can deliver up to 770 teraFLOPS of FP4 performance. While this represents a 2.51x improvement in floating-point math, much of the gain comes from the shift to FP4 (4-bit floating point) precision rather than from architectural enhancements: halving the precision roughly doubles peak throughput, so when normalized to FP8 (8-bit floating point), the chip’s speed increase is closer to 25 percent.
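The FP8 normalization above is simple arithmetic, and a minimal sketch of it follows. It assumes that peak tensor throughput at FP4 is twice the FP8 rate, which is the premise behind the comparison; the function and variable names are illustrative and not taken from Nvidia.

```python
# Figures quoted in the article.
FP4_GAIN_OVER_PREDECESSOR = 2.51   # headline floating-point improvement
BLACKWELL_FP4_TFLOPS = 770.0       # RTX Pro 4000 SFF peak at FP4

# Assumption: moving from FP8 to FP4 doubles peak tensor throughput.
FP4_OVER_FP8 = 2.0


def normalize_to_fp8(gain_at_fp4: float, precision_factor: float = FP4_OVER_FP8) -> float:
    """Strip out the share of a speedup that comes only from lower precision."""
    return gain_at_fp4 / precision_factor


like_for_like = normalize_to_fp8(FP4_GAIN_OVER_PREDECESSOR)
uplift_percent = (like_for_like - 1.0) * 100.0
blackwell_fp8_tflops = BLACKWELL_FP4_TFLOPS / FP4_OVER_FP8

print(f"Like-for-like speedup: {like_for_like:.3f}x (about {uplift_percent:.1f} percent)")
print(f"Implied FP8 peak under the 2x assumption: {blackwell_fp8_tflops:.0f} teraFLOPS")
```

Under that assumption the 2.51x headline figure shrinks to roughly 1.25x at matched precision, in line with the 25 percent quoted above.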

Where the RTX Pro 4000 SFF truly shines is in memory bandwidth, a critical factor for local AI inference, particularly with large language models (LLMs), where token generation is typically limited by how quickly model weights can be read from memory. With 24GB of GDDR7 memory providing 432GB/s of bandwidth, the card is projected to generate LLM tokens about 54 percent faster than Nvidia’s previous offerings.
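To see why bandwidth matters so much for token generation, a rough sketch can help. It uses a common rule of thumb, assumed here rather than stated by Nvidia: for single-stream decoding, every generated token requires reading all model weights once, so bandwidth divided by weight size gives a ceiling on tokens per second. The 8-billion-parameter, 4-bit model is hypothetical, and real throughput will land below this ceiling.

```python
def decode_ceiling_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on single-stream decode rate if each token streams all weights once."""
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


# Memory bandwidth figures quoted in the article.
PRO_4000_SFF_GB_S = 432.0
PRO_2000_GB_S = 280.0

# Hypothetical model: 8 billion parameters at 4 bits (0.5 bytes) per weight.
HYPOTHETICAL_WEIGHTS_GB = 8e9 * 0.5 / 1e9

pro_4000_ceiling = decode_ceiling_tokens_per_s(PRO_4000_SFF_GB_S, HYPOTHETICAL_WEIGHTS_GB)
pro_2000_ceiling = decode_ceiling_tokens_per_s(PRO_2000_GB_S, HYPOTHETICAL_WEIGHTS_GB)
ratio = pro_4000_ceiling / pro_2000_ceiling

print(f"RTX Pro 4000 SFF ceiling: {pro_4000_ceiling:.0f} tokens/s")
print(f"RTX Pro 2000 ceiling:     {pro_2000_ceiling:.0f} tokens/s")
print(f"Bandwidth ratio:          {ratio:.2f}x")
```

Because the model size cancels out, the ratio between two cards depends only on their bandwidth. This is why a bandwidth increase translates almost directly into a token-rate increase when decoding is memory-bound.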

The RTX Pro 2000, while less powerful than its sibling, still promises a notable uplift for professional visualization tasks within the same 70W budget. Nvidia says users can expect 1.6 times better performance in 3D modeling, 1.4 times in computer-aided design (CAD), and 1.6 times in rendering compared with its Ada Generation predecessor. For AI workloads, the RTX Pro 2000 is no match for its more power-hungry counterparts but is far from a slouch, offering up to 545 teraFLOPS of sparse FP4 compute and 16GB of GDDR7 memory with 280GB/s of bandwidth.

These new compact GPUs complement Nvidia’s existing Blackwell workstation lineup, which includes the 96GB RTX Pro 6000 announced at the GTC conference in March. At Siggraph, Nvidia also showcased a 2U server platform capable of housing a pair of 600W RTX Pro 6000 Server Edition cards, each delivering up to 4 petaFLOPS of sparse FP4 performance. The RTX Pro 4000 SFF and RTX Pro 2000 will be available later this year from distributors PNY and TD SYNNEX, and in OEM systems from manufacturers such as BOXX, Dell, HP, and Lenovo. The server systems, featuring the more powerful RTX Pro 6000 Server Edition cards, are already available from Cisco, Dell, HPE, Lenovo, and Supermicro, among others.