Nvidia's AI Chips Dominate Training Benchmarks, Redefine Scaling Limits
By Netvora Tech News
Nvidia is expanding its AI chip presence in data centers and "AI factories" worldwide, and the company announced that its Blackwell chips are leading artificial intelligence (AI) benchmarks. Together with its partners, Nvidia is accelerating the training and deployment of next-generation AI applications, and the Blackwell architecture is designed to meet their heightened performance requirements.

In the latest round of MLPerf Training – the 12th since the benchmark's introduction in 2018 – the Nvidia AI platform delivered the highest performance at scale on every benchmark and powered every result submitted on the suite's toughest large language model (LLM) test: Llama 3.1 405B pretraining. Nvidia's was the only platform to submit results on every MLPerf Training v5.0 benchmark, underscoring its performance and versatility across a wide array of AI workloads: LLMs, recommendation systems, multimodal LLMs, object detection, and graph neural networks.

The at-scale submissions used two AI supercomputers powered by the Nvidia Blackwell platform: Tyche, built on Nvidia GB200 NVL72 rack-scale systems, and Nyx, based on Nvidia DGX B200 systems. Nvidia also collaborated with CoreWeave and IBM on a GB200 NVL72 submission using a total of 2,496 Blackwell GPUs and 1,248 Nvidia Grace CPUs.
The Basics on Training Benchmarks
- MLPerf Training is an industry benchmark suite that measures how quickly systems can train AI models.
- Each test scores a system by the wall-clock time it takes to train a reference model to a target quality level.
- The results reflect real-world workloads, including natural language processing, computer vision, and recommender systems.
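To make the scoring idea above concrete, here is a minimal, illustrative sketch (not MLPerf code, and far simpler than any real submission): MLPerf Training ranks systems by the wall-clock time needed to reach a fixed quality target, which we can mimic by timing a toy gradient-descent "training" loop on a one-parameter model until its loss falls below a threshold. All names and numbers here are invented for illustration.

```python
import time

def time_to_train(lr=0.1, target_loss=1e-6):
    """Toy analogue of MLPerf's time-to-train metric: run gradient
    descent on f(w) = (w - 3)^2 until the loss hits the quality
    target, and report elapsed wall-clock time and step count."""
    w = 0.0
    start = time.perf_counter()
    steps = 0
    while (w - 3.0) ** 2 > target_loss:
        grad = 2.0 * (w - 3.0)  # derivative of (w - 3)^2
        w -= lr * grad
        steps += 1
    elapsed = time.perf_counter() - start
    return elapsed, steps, w

elapsed, steps, w = time_to_train()
print(f"reached target in {steps} steps ({elapsed:.6f}s), w ~ {w:.4f}")
```

A faster system (here, a better learning rate would play that role) reaches the same quality target sooner; that elapsed time, not raw throughput, is what the benchmark compares.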