
NVIDIA Announces NVIDIA Blackwell Platform to Build and Run Real-Time Generative AI

NVIDIA ushered in a new era of computing with the launch of the NVIDIA Blackwell platform. The new platform enables organizations to build and run real-time generative AI on trillion-parameter large language models at up to 25 times lower cost and energy consumption than its predecessor.

The Blackwell GPU architecture introduces six transformative technologies for accelerated computing. These innovations will fast-track breakthroughs across diverse domains, including data processing, engineering simulation, electronic design automation, computer-aided drug design, quantum computing, and generative AI, and open emerging industry opportunities for NVIDIA.

Prominent organizations expected to adopt Blackwell include Amazon Web Services, Dell Technologies, Google, Meta, Microsoft, OpenAI, Oracle, Tesla, and xAI.

Six Prominent Features of the Blackwell Architecture

  1. World's Most Powerful Chip: Packed with 208 billion transistors, Blackwell-architecture GPUs are manufactured using a custom-built TSMC 4NP process, with two reticle-limited GPU dies connected by a 10TB/s chip-to-chip link into a single, unified GPU.
  2. Second-Generation Transformer Engine: Blackwell boasts a second-generation transformer engine, powered by innovative micro-tensor scaling support and advanced dynamic range management algorithms from NVIDIA. This enhancement enables Blackwell to double compute and model sizes while introducing new 4-bit floating point AI inference capabilities (a sketch of the block-scaling idea follows this list).
  3. Fifth-Generation NVLink Connectivity: Blackwell integrates the latest iteration of NVIDIA NVLink, offering groundbreaking bidirectional throughput of up to 1.8TB/s per GPU. This ensures seamless high-speed communication among up to 576 GPUs, ideal for handling complex large language models (LLMs) with multitrillion-parameter architectures.
  4. Reliability, Availability, and Serviceability (RAS) Engine: Blackwell-powered GPUs feature a dedicated RAS engine, providing enhanced reliability, availability, and serviceability. This includes AI-based preventative maintenance at the chip level, enabling proactive diagnostics and forecasting of reliability issues, maximizing system uptime, and reducing operating costs.
  5. Secure AI Computing: Blackwell prioritizes security with advanced confidential computing capabilities. It supports new native interface encryption protocols, safeguarding AI models and customer data without compromising performance. This feature is especially beneficial for privacy-sensitive industries such as healthcare and financial services.
  6. Decompression Engine for Accelerated Data Processing: Blackwell incorporates a dedicated decompression engine, accelerating database queries and delivering exceptional performance in data analytics and data science tasks. This feature is vital as companies increasingly rely on GPU-accelerated data processing to optimize operations and reduce costs.
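
The micro-tensor scaling mentioned in the second feature refers to keeping a separate scale factor for each small block of values, so that 4-bit numbers can cover a wide dynamic range. The Python sketch below illustrates the general block-scaling idea only; the 32-element block size and the E2M1 value grid are illustrative assumptions, not NVIDIA's actual Transformer Engine implementation.

    # Illustrative sketch of block-scaled ("micro-tensor") 4-bit quantization.
    # NOT NVIDIA's implementation; the block size and the FP4 (E2M1) value
    # grid below are assumptions that mirror common low-precision schemes.
    import numpy as np

    # Magnitudes representable by a 4-bit E2M1 float (sign handled separately).
    FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

    def quantize_fp4_blockwise(x: np.ndarray, block: int = 32):
        """Quantize a 1-D tensor to FP4 values with one scale per block."""
        x = x.reshape(-1, block)                     # split into micro-blocks
        scale = np.abs(x).max(axis=1, keepdims=True) / FP4_GRID[-1]
        scale[scale == 0] = 1.0                      # avoid divide-by-zero
        scaled = x / scale                           # map block into FP4 range
        # Snap each magnitude to the nearest representable FP4 value.
        dist = np.abs(np.abs(scaled)[..., None] - FP4_GRID)
        q = FP4_GRID[dist.argmin(axis=-1)] * np.sign(scaled)
        return q, scale                              # FP4 values + block scales

    def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
        return (q * scale).reshape(-1)

    rng = np.random.default_rng(0)
    w = rng.normal(size=128).astype(np.float32)
    q, s = quantize_fp4_blockwise(w)
    err = np.abs(dequantize(q, s) - w).mean()
    print(f"mean absolute quantization error: {err:.4f}")

Because each block carries its own scale, a block of small weights and a block of large weights both use the full 4-bit grid, which is what lets such narrow formats remain usable for inference.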

About the Superchip

The NVIDIA GB200 Grace Blackwell Superchip connects two NVIDIA B200 Tensor Core GPUs to the NVIDIA Grace CPU over a 900GB/s ultra-low-power NVLink chip-to-chip interconnect.
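
As a rough back-of-the-envelope illustration (not an NVIDIA benchmark), the sketch below estimates how long that 900GB/s chip-to-chip link would need to stream the weights of a trillion-parameter model between Grace memory and the GPUs; the model size and the 4-bit weight format are illustrative assumptions.

    # Back-of-the-envelope estimate, not a measured result: time for the
    # 900GB/s NVLink chip-to-chip link to stream a large weight set.
    PARAMS = 1_000_000_000_000     # a trillion-parameter model (assumed)
    BYTES_PER_PARAM = 0.5          # 4-bit (FP4) weights = half a byte (assumed)
    LINK_BW = 900e9                # NVLink-C2C bandwidth in bytes/s

    weight_bytes = PARAMS * BYTES_PER_PARAM
    print(f"weights: {weight_bytes / 1e12:.1f} TB")
    print(f"ideal transfer time: {weight_bytes / LINK_BW:.2f} s")

Under these assumptions the full weight set moves in roughly half a second, which is why such wide chip-to-chip links matter for trillion-parameter inference.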

For the highest AI performance, GB200-powered systems can be connected with the newly announced NVIDIA Quantum-X800 InfiniBand and Spectrum-X800 Ethernet platforms, which deliver advanced networking at speeds of up to 800Gb/s.

What are the Key Components of the NVIDIA GB200 NVL72?
  • A multi-node, liquid-cooled, rack-scale system for the most compute-intensive workloads.
  • Combines 36 Grace Blackwell Superchips, which include 72 Blackwell GPUs and 36 Grace CPUs interconnected by fifth-generation NVLink.
  • Includes NVIDIA BlueField-3 data processing units to enable cloud network acceleration, composable storage, zero-trust security, and GPU compute elasticity in hyperscale AI clouds.
  • The GB200 NVL72 provides up to a 30x performance increase over the same number of NVIDIA H100 Tensor Core GPUs for LLM inference workloads, and reduces cost and energy consumption by up to 25x (see the arithmetic sketch after this list).
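
The figures above can be sanity-checked with simple arithmetic. In the sketch below, the rack composition and the up-to-25x multiplier come from the article itself; the normalized H100 baseline is a placeholder assumption.

    # Sanity check of the NVL72 figures quoted above. Only the rack
    # composition and the "up to 25x" multiplier come from the article;
    # the H100 baseline is a normalized placeholder.
    SUPERCHIPS = 36
    GPUS_PER_SUPERCHIP = 2   # two B200 GPUs per GB200 superchip
    CPUS_PER_SUPERCHIP = 1   # one Grace CPU per GB200 superchip

    gpus = SUPERCHIPS * GPUS_PER_SUPERCHIP   # 72 Blackwell GPUs
    cpus = SUPERCHIPS * CPUS_PER_SUPERCHIP   # 36 Grace CPUs
    print(f"rack: {gpus} GPUs, {cpus} CPUs")

    # Relative economics vs. an H100 cluster of equal GPU count.
    h100_tokens_per_joule = 1.0                            # baseline (assumed)
    nvl72_tokens_per_joule = 25.0 * h100_tokens_per_joule  # "up to 25x"
    print(f"energy per token vs. H100: {1 / nvl72_tokens_per_joule:.0%}")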

Conclusion

The GB200 NVL72 platform acts as a single GPU with 1.4 exaflops of AI performance and 30TB of fast memory, and is a building block for the newest DGX SuperPOD. For x86-based generative AI platforms, NVIDIA offers the HGX B200, a server board that links eight B200 GPUs through NVLink. HGX B200 supports networking speeds of up to 400Gb/s through the NVIDIA Quantum-2 InfiniBand and Spectrum-X Ethernet networking platforms.
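
Dividing the headline rack-level numbers by the 72 GPUs gives a rough per-GPU picture. This is purely arithmetic on the figures quoted above, not a measured specification, and the 30TB of fast memory likely spans both GPU and Grace CPU memory.

    # Rough per-GPU breakdown of the headline NVL72 numbers above;
    # purely arithmetic on the article's quoted figures.
    RACK_AI_FLOPS = 1.4e18   # 1.4 exaflops of low-precision AI compute
    RACK_FAST_MEM = 30e12    # 30 TB of fast memory (GPU + Grace, assumed)
    GPUS = 72

    print(f"per-GPU AI compute: {RACK_AI_FLOPS / GPUS / 1e15:.1f} PFLOPS")
    print(f"per-GPU fast memory: {RACK_FAST_MEM / GPUS / 1e9:.0f} GB")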

