2024 is set to bring significant advances in data center silicon, with the major chip manufacturers preparing notable refreshes of their CPU and GPU product lines. NVIDIA, Intel, and AMD all have ambitious plans for the year.
NVIDIA has a robust lineup of new offerings, including accelerators, GPU architectures, and networking products. Intel is set to launch what many consider its most compelling Xeons in recent years, alongside new Habana Gaudi AI chips. Riding high on the success of its MI300 series, AMD is expected to bring its 5th-generation Epyc processors to market.
The developments in data center chip technology are poised to bring about substantial advancements, setting the stage for a transformative year in data center hardware. The competitive landscape is heating up, and the race to deliver more powerful, efficient, and innovative solutions is set to drive the industry forward in 2024.
NVIDIA’s Latest Release of H200
NVIDIA’s release of the H200 accelerators is notable for its emphasis on memory performance, specifically utilizing HBM3e memory stacks. The chip offers the same floating-point performance as its predecessor, the H100, but the improved memory capacity and bandwidth, with up to 141 GB of HBM3e memory and 4.8TB/s of bandwidth, contribute to better performance for large language models (LLMs), such as Llama 70B. This aligns with the trend in AI workloads, where memory capabilities are crucial in accommodating larger models and handling multiple requests simultaneously. The focus on memory efficiency over raw FLOPS reflects the nuances of AI inference workloads.
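To make the memory argument concrete, here is a rough sizing sketch in Python. The 141 GB and 4.8TB/s figures come from the text above. The 70-billion parameter count, the bytes-per-parameter values, and the idea that each generated token reads every weight once are simplifying assumptions for illustration, not measured results.

```python
# Back-of-the-envelope sizing. Only the two H200 figures come from the article;
# everything else is an illustrative assumption.
H200_MEMORY_GB = 141         # HBM3e capacity quoted in the article
H200_BANDWIDTH_GB_S = 4800   # 4.8 TB/s quoted in the article


def weight_footprint_gb(params_billions: float, bytes_per_param: float) -> float:
    """Memory needed for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bytes_per_param


def max_decode_tokens_per_s(weights_gb: float, bandwidth_gb_s: float) -> float:
    """Upper bound on single-stream decode rate, assuming every generated
    token requires reading all weights from memory exactly once."""
    return bandwidth_gb_s / weights_gb


# Assumed precisions: 2 bytes per parameter (16-bit) and 1 byte (8-bit).
for label, bytes_per_param in (("16-bit", 2), ("8-bit", 1)):
    weights = weight_footprint_gb(70, bytes_per_param)
    headroom = H200_MEMORY_GB - weights
    ceiling = max_decode_tokens_per_s(weights, H200_BANDWIDTH_GB_S)
    print(f"{label}: weights {weights:.0f} GB, headroom {headroom:.0f} GB, "
          f"bandwidth-bound ceiling ~{ceiling:.0f} tokens/s")
```

Under these assumptions, 16-bit weights for a 70-billion-parameter model nearly fill the card, leaving almost no room for the key-value cache, while 8-bit weights leave roughly half the memory free for concurrent requests. In both cases the ceiling scales with bandwidth rather than FLOPS, which is the point the paragraph above makes.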
‘Blackwell’ Architecture: The Successor to Hopper
NVIDIA’s shift to an annual release cadence for new GPU chips has sparked anticipation for the upcoming B100, which will adopt the Blackwell microarchitecture. Little is known about the B100, but it’s expected to arrive in 2024. The shift to yearly releases reflects NVIDIA’s strategy to maintain its competitive edge, especially in light of AMD’s MI300X GPUs, which boast higher FLOPS and faster memory.
The B100 is anticipated to surpass the H200 in FLOPS and memory capacity by using more HBM3e stacks, aiming to set new benchmarks for memory capacity and bandwidth. NVIDIA’s roadmap also includes CPU-GPU Superchips (GB200 and GB200NVL) and the B40, which targets smaller enterprise workloads that fit within a single GPU and consolidates NVIDIA’s enterprise GPU lineup.
One notable aspect of NVIDIA’s roadmap involves networking, aiming for 800Gb/s connectivity with the Blackwell architecture. However, this ambition runs into the limits of PCIe 5.0, since a single x16 slot cannot carry 800Gb/s, and PCIe 6.0 hardware is not yet available.
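The PCIe constraint can be checked with rough arithmetic. The sketch below assumes a x16 link and the per-lane signalling rates of 32 GT/s for PCIe 5.0 and 64 GT/s for PCIe 6.0. It accounts only for PCIe 5.0's 128b/130b line encoding and ignores packet and protocol overhead, so real throughput would be somewhat lower.

```python
def link_gbps(gt_per_s: float, lanes: int = 16, encoding: float = 1.0) -> float:
    """One-direction link bandwidth in Gb/s, before protocol overhead."""
    return gt_per_s * lanes * encoding


TARGET_GBPS = 800  # networking speed targeted for the Blackwell generation

# PCIe 5.0: 32 GT/s per lane with 128b/130b encoding.
pcie5 = link_gbps(32, encoding=128 / 130)
# PCIe 6.0: 64 GT/s per lane; shown as a raw figure without encoding overhead.
pcie6_raw = link_gbps(64)

print(f"PCIe 5.0 x16: ~{pcie5:.0f} Gb/s per direction")
print(f"PCIe 6.0 x16: ~{pcie6_raw:.0f} Gb/s raw per direction")
print(f"PCIe 5.0 shortfall vs {TARGET_GBPS} Gb/s: ~{TARGET_GBPS - pcie5:.0f} Gb/s")
```

On these numbers a PCIe 5.0 x16 slot tops out at roughly 500Gb/s, well short of an 800Gb/s port, while a PCIe 6.0 x16 link would have the raw headroom.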
The exact release date for the Blackwell-based cards remains uncertain, but NVIDIA has a track record of pre-announcing accelerators well before their availability for purchase. Hence, details about the Blackwell-based parts might surface around events like GTC.
Intel Gears Up for Third-Gen Gaudi AI Chips
Intel is set to introduce its third-generation Gaudi AI chips in 2024, marking a significant leap in its AI training and inference capabilities. With the cancellation of Rialto Bridge, the successor to Ponte Vecchio, Intel’s Habana Labs presents Gaudi3 as its top AI offering until the arrival of Falcon Shores in 2025. Intel has been unusually secretive about Gaudi3. Details have been sparse, coming primarily from a presentation slide the company has shown since its Innovation event in September.
Gaudi3, reportedly a 5nm chip, is claimed to deliver four times the Brain Float 16 (BF16) performance of the 7nm Gaudi2, along with double the network bandwidth and 1.5 times the HBM bandwidth. However, Intel has not disclosed Gaudi2’s BF16 performance, which makes it difficult to translate these relative gains into absolute figures. Intel says it prefers to emphasize real-world performance over benchmark comparisons, but a 4x claim without a published baseline raises questions.
Intel Advances into Cloud CPUs with Sierra Forest
Built on the long-awaited Intel 3 process technology, Intel’s Sierra Forest is set for a 2024 debut. This Xeon processor, sporting a pair of 144-core dies for 288 CPU cores per socket, diverges from past models by using efficiency cores exclusively. Intel drops AVX512 and AMX support to maximize core density for cloud workloads like Nginx, competing with offerings from Ampere, AMD, and custom Arm CPUs.
This strategy contrasts with AMD’s Bergamo Epycs, which rely on a smaller-footprint core design to fit 128 cores per package. Intel addresses migration concerns caused by the missing CPU features through AVX10, which is intended to backport many AVX512 features to AVX2. The two companies thus take distinct approaches to balancing core count and functionality in cloud computing.
Intel Unveils Granite Rapids Xeons for 2024 Release
Scheduled to launch later in 2024, Intel’s Granite Rapids Xeons take a different path from Sierra Forest’s core-density approach. Where Sierra Forest emphasizes large numbers of efficiency cores, Granite Rapids follows a traditional server processor design built around Intel’s performance cores.
While specific core counts and clock speeds remain undisclosed, indications suggest performance surpassing that of Emerald Rapids. The chip uses a modular chiplet architecture with up to five dies per package (three compute dies and two I/O dies), which offers scalability across different SKUs. This gives Intel the kind of modularity that has long been an AMD advantage.
Departing from previous Xeon designs, Granite Rapids disaggregates I/O functionality into separate dies, aiming to narrow the gap with AMD in core count, PCIe lanes, and memory channels. With 12 memory channels and support for 8,800MT/s MCR DIMMs, the chip targets a memory bandwidth of 845GB/s, rivaling AMD’s offerings.
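The 845GB/s figure follows from the channel count and transfer rate quoted above. The short Python sketch below reproduces it, assuming each memory channel moves 8 bytes (a 64-bit data path) per transfer. That width is an assumption used for illustration and is not stated in the text.

```python
def peak_memory_bandwidth_gb_s(channels: int, mt_per_s: int,
                               bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s.

    mt_per_s is in mega-transfers per second, so channels * mt_per_s *
    bytes_per_transfer gives MB/s; dividing by 1000 converts to GB/s.
    """
    return channels * mt_per_s * bytes_per_transfer / 1000


# 12 channels of 8,800 MT/s MCR DIMMs, as quoted for Granite Rapids.
granite_rapids = peak_memory_bandwidth_gb_s(channels=12, mt_per_s=8800)
print(f"Granite Rapids theoretical peak: {granite_rapids:.1f} GB/s")
```

This is a theoretical peak. Sustained bandwidth in real workloads would be lower.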
AMD Launches Zen 5
AMD is gearing up to launch Turin, the much-anticipated fifth generation of Epyc server processors powered by the new Zen 5 cores, slated for a 2024 release. Details are limited, but speculation points to a process technology advance, potentially TSMC’s 4nm or 3nm for the compute tiles. It remains unclear whether the I/O die will also move to a smaller process.
Recent leaks via X (formerly Twitter) hint at a possible increase in core counts across the Epyc lineup, with projections suggesting up to 128 Zen 5 cores or 192 Zen 5c cores. The fundamental architecture of the core complex dies (CCDs) appears akin to Genoa and Bergamo, maintaining eight or 16 cores per chiplet. However, reports suggest a configuration of 16 compute dies for general-purpose platforms and 12 for cloud-centric applications to achieve the touted core counts. These leaks remain unconfirmed.
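The leaked die counts and the touted core counts are at least consistent with each other, as a quick check shows. The pairing of eight-core chiplets with Zen 5 and 16-core chiplets with Zen 5c is inferred from the Genoa and Bergamo comparison above and, like the leaks themselves, is unconfirmed.

```python
def total_cores(compute_dies: int, cores_per_die: int) -> int:
    """Cores per package for a given chiplet configuration."""
    return compute_dies * cores_per_die


# (compute dies, cores per die) as suggested by the unconfirmed leaks.
rumored_configs = {
    "Zen 5 (general purpose)": (16, 8),
    "Zen 5c (cloud)": (12, 16),
}

for name, (dies, cores_per_die) in rumored_configs.items():
    print(f"{name}: {dies} dies x {cores_per_die} cores = "
          f"{total_cores(dies, cores_per_die)} cores")
```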
AMD’s Epyc product line has expanded significantly, catering to diverse applications such as general-purpose computing, high-performance computing, cloud-based operations, and edge applications. Historically, AMD has staggered chip releases over approximately a year. The fourth-generation Epyc (Genoa) debuted in November 2022, Bergamo and Genoa-X followed in June 2023, and the edge-focused Siena parts arrived in September of the same year.
FAQs
1. What are the key advancements in NVIDIA’s H200 accelerators?
NVIDIA’s H200 accelerators mark significant progress, primarily in memory performance. They feature HBM3e memory stacks, offering enhanced memory capacity and bandwidth. While their floating-point performance mirrors the previous H100 model, the H200 boasts up to 141 GB of HBM3e memory and 4.8TB/s of bandwidth. These improvements cater specifically to handling large language models (LLMs), like Llama 70B, in AI workloads. This emphasis on memory efficiency over raw FLOPS aligns with the evolving needs of AI inference tasks.
2. What significant improvements and features are expected in Intel’s Gaudi3 AI chips?
Intel’s third-generation Gaudi AI chips, the Gaudi3, are anticipated to bring substantial advancements in AI training and inference capabilities. It’s claimed that the 5nm Gaudi3 could deliver four times the performance of the 7nm Gaudi2 in Brain Float 16 (BF16) tasks. The chip also promises double the network bandwidth and 1.5 times the HBM bandwidth. Intel, however, prioritizes showcasing real-world performance over benchmark comparisons, emphasizing the chip’s practical application in AI tasks.
3. How does Intel’s Sierra Forest Xeon processor differ from its predecessors, and what market segments does it target?
Sierra Forest represents Intel’s venture into cloud CPUs, deviating from previous models by using efficiency cores exclusively. This Xeon processor is built on the Intel 3 process technology and pairs two 144-core dies for 288 CPU cores per socket. Unlike its predecessors, it lacks AVX512 and AMX support, a trade-off meant to maximize core density for cloud workloads like Nginx. This approach puts it in direct competition with offerings from Ampere, AMD, and custom Arm CPUs.
4. What key features and improvements are anticipated in Intel’s Granite Rapids Xeons, particularly compared to previous models?
Granite Rapids Xeons, set for a late 2024 launch, follow a more traditional server processor design centered around Intel’s performance cores. While specific core counts and clock speeds remain undisclosed, the chip employs a modular chiplet architecture accommodating up to five dies per package: three compute dies and two I/O dies, offering scalability across different SKUs. By disaggregating I/O functionality into separate dies, Intel aims to narrow the gap with AMD in core count, PCIe lanes, and memory channels.
5. How has AMD’s Epyc product line evolved, and what can we infer from its historical release patterns?
AMD’s Epyc product line has evolved significantly, catering to applications such as general-purpose computing, high-performance computing, cloud-based operations, and edge applications. Historically, AMD has staggered chip releases over approximately a year. The fourth-generation Epyc (Genoa) debuted in November 2022, Bergamo and Genoa-X followed in June 2023, and the edge-focused Siena parts arrived in September of the same year. This staggered release pattern suggests a calculated strategy of covering diverse market segments with incremental releases.