In raw compute throughput, the RTX 5090 holds a commanding lead. Its 104.9 TFLOPS of peak floating-point performance is more than double the RTX Pro 4000 Blackwell's 46.9 TFLOPS (about 2.2x), a gap that should translate to faster rendering, more headroom for heavy shader workloads, and quicker AI inference. The 5090 also fields more than twice the shading units (21,760 vs. 8,960) and TMUs (680 vs. 280), so it can process shading and texture work at a substantially higher rate, as reflected in its texture rate of 1,638.8 GTexels/s versus 732.8 GTexels/s for the Pro 4000. For parallelism-heavy workloads such as real-time rendering or large generative AI models, this difference is meaningful, not marginal.
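The throughput figures above can be reproduced from the unit counts and boost clocks. The following Python sketch assumes the common convention of two FLOPs per shading unit per clock (one fused multiply-add) and one texel per TMU per clock. The function names are illustrative, and the results are theoretical peaks rather than measured performance.

```python
# Spec values quoted in this comparison (boost clocks in MHz).
SPECS = {
    "RTX 5090": {"shading_units": 21760, "tmus": 680, "boost_mhz": 2410},
    "RTX Pro 4000 Blackwell": {"shading_units": 8960, "tmus": 280, "boost_mhz": 2617},
}


def fp32_tflops(shading_units, boost_mhz):
    # Assumption: 2 FLOPs per shading unit per clock (one fused multiply-add).
    return 2 * shading_units * boost_mhz / 1e6


def texture_rate_gtexels(tmus, boost_mhz):
    # Assumption: one texel per TMU per clock.
    return tmus * boost_mhz / 1e3


for name, spec in SPECS.items():
    tflops = fp32_tflops(spec["shading_units"], spec["boost_mhz"])
    gtexels = texture_rate_gtexels(spec["tmus"], spec["boost_mhz"])
    print(f"{name}: {tflops:.1f} TFLOPS, {gtexels:.1f} GTexels/s")
```

Both results land on the quoted figures (104.9 and 46.9 TFLOPS, 1,638.8 and 732.8 GTexels/s), which suggests the published numbers are peak values computed at boost clock.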
Clock speed tells a more nuanced story. The RTX Pro 4000 Blackwell actually boosts higher under load, at 2,617 MHz turbo versus the 5090's 2,410 MHz, but that roughly 9% advantage is overwhelmed by the 5090's far greater number of execution units. Higher clocks on fewer cores cannot compensate for a parallelism gap of more than 2.4x. Base clocks follow the opposite pattern, with the 5090 running at 2,010 MHz compared to 1,590 MHz on the Pro 4000, which suggests the 5090 holds a higher performance floor in sustained workloads. Memory clock is identical at 1,750 MHz on both cards, so memory speed alone does not differentiate the two; actual memory bandwidth also depends on bus width, which these figures do not cover. Both cards also support double-precision floating point, which is relevant for scientific computing and professional simulation, though neither has a spec-level edge there.
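To see why the turbo-clock advantage does not close the gap, it helps to compare the two ratios directly. This is a rough sketch that assumes peak throughput scales with shading units multiplied by boost clock; the variable names are hypothetical.

```python
# Values quoted in the paragraph above.
boost_mhz = {"RTX 5090": 2410, "RTX Pro 4000 Blackwell": 2617}
shading_units = {"RTX 5090": 21760, "RTX Pro 4000 Blackwell": 8960}

# How much faster the Pro 4000 clocks at boost.
clock_ratio = boost_mhz["RTX Pro 4000 Blackwell"] / boost_mhz["RTX 5090"]

# How many more shading units the 5090 has.
unit_ratio = shading_units["RTX 5090"] / shading_units["RTX Pro 4000 Blackwell"]

# Assumption: peak throughput is proportional to units x clock,
# so the net 5090 advantage is the unit ratio divided by the clock ratio.
net_advantage = unit_ratio / clock_ratio

print(f"Pro 4000 clock advantage: {clock_ratio:.2f}x")
print(f"5090 unit advantage:      {unit_ratio:.2f}x")
print(f"Net 5090 advantage:       {net_advantage:.2f}x")
```

Under that assumption, the Pro 4000's clock edge of about 1.09x offsets only a small part of the 5090's 2.43x unit advantage, leaving a net gap of about 2.24x, in line with the ratio of the quoted TFLOPS figures.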
Overall, the RTX 5090 has a decisive performance advantage on every throughput metric in this comparison (pixel rate, texture rate, and FLOPS), backed by a far larger shader count. The RTX Pro 4000 Blackwell's slightly higher turbo clock is a modest bright spot but does not close the gap in any practical sense. For compute- and graphics-intensive tasks, the 5090 is the stronger performer.