Introduction
AWS re:Invent is a pivotal moment each year for unveiling major cloud innovations, and at re:Invent 2023 NVIDIA's announcements took center stage, signaling a new era of AI capabilities within the AWS ecosystem. This article walks through the key announcements NVIDIA made at AWS re:Invent 2023 and what they mean for the future of AI on AWS.
1. AWS-NVIDIA Collaboration: Revamping Supercomputing Infrastructure for Generative AI
At AWS re:Invent, Amazon Web Services, Inc. and NVIDIA announced an expanded collaboration to deliver cutting-edge infrastructure, software, and services for customers' generative AI work. The partnership pairs NVIDIA's latest multi-node systems, with their advanced GPUs, CPUs, and AI software, with AWS's Nitro System virtualization, Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability, designed specifically for training foundation models and building generative AI applications.
Key points of this expanded collaboration include:
NVIDIA GH200 Grace Hopper Superchips on AWS:
AWS will be the first cloud provider to bring NVIDIA GH200 Grace Hopper Superchips with the new multi-node NVLink technology to its cloud infrastructure. This platform, available on Amazon Elastic Compute Cloud (Amazon EC2) instances, leverages AWS's networking (EFA), virtualization (AWS Nitro System), and hyper-scale clustering (Amazon EC2 UltraClusters), enabling customers to scale seamlessly to thousands of GH200 Superchips.
NVIDIA DGX Cloud on AWS:
The collaboration will also bring NVIDIA DGX Cloud—NVIDIA's AI-training-as-a-service—to AWS, making it the first DGX Cloud to incorporate GH200 NVL32 and giving developers access to the largest shared memory available in a single instance. This integration is aimed at accelerating the training of advanced generative AI and large language models exceeding 1 trillion parameters.
Project Ceiba: World’s Fastest GPU-powered AI Supercomputer:
NVIDIA and AWS are jointly working on Project Ceiba, an effort to build the world's fastest GPU-powered AI supercomputer. The system, built on GH200 NVL32 with the Amazon EFA interconnect and hosted on AWS, will feature 16,384 NVIDIA GH200 Superchips and deliver 65 exaflops of AI processing. This powerhouse will drive NVIDIA's next wave of generative AI innovations.
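To put the Project Ceiba figures in perspective, a quick back-of-envelope calculation shows the per-chip throughput these numbers imply (assuming "65 AI exaflops" refers to aggregate low-precision AI throughput across all Superchips):

```python
# Implied per-Superchip throughput for Project Ceiba, from the announced figures.
superchips = 16_384     # NVIDIA GH200 Superchips in Project Ceiba
total_exaflops = 65     # aggregate AI processing, per the announcement

# 1 exaflop = 1,000 petaflops
per_chip_pflops = total_exaflops * 1_000 / superchips
print(f"~{per_chip_pflops:.2f} PFLOPS of AI compute per GH200 Superchip")
```

This works out to roughly 3.97 petaflops per GH200 Superchip.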
Introduction of New Amazon EC2 Instances:
AWS will roll out three new Amazon EC2 instance types: P5e instances, powered by NVIDIA H200 Tensor Core GPUs, for large-scale generative AI and HPC workloads; and G6 and G6e instances, powered by NVIDIA L4 and L40S GPUs respectively, for applications such as AI fine-tuning, inference, graphics, and video workloads. G6e instances, in particular, are tailored for developing 3D workflows and applications using NVIDIA Omniverse.
Adam Selipsky, CEO at AWS, highlighted the longstanding collaboration between AWS and NVIDIA, emphasizing their commitment to making AWS the optimal platform for GPU operations. Jensen Huang, founder and CEO of NVIDIA, emphasized the transformative impact of generative AI and their joint efforts to deliver cost-effective, state-of-the-art solutions to customers.
Groundbreaking Features of Amazon EC2 Instances with GH200 NVL32:
These instances offer up to 20 TB of shared memory per GH200 NVL32 instance, up to 400 Gbps of EFA bandwidth per Superchip, liquid cooling for efficient operation, and enhanced security through the AWS Nitro System.
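A quick sketch of what these per-Superchip numbers imply at the instance level (assuming 32 Superchips per instance, as the NVL32 name suggests):

```python
# Aggregate figures for one GH200 NVL32 instance, derived from the stated specs.
superchips_per_instance = 32       # implied by the "NVL32" designation
efa_gbps_per_superchip = 400       # EFA bandwidth per Superchip
shared_memory_tb = 20              # shared memory per instance

# Total EFA bandwidth across the instance, in terabits per second
aggregate_tbps = superchips_per_instance * efa_gbps_per_superchip / 1_000
print(f"{aggregate_tbps} Tbps aggregate EFA bandwidth, "
      f"{shared_memory_tb} TB shared memory per instance")
```

That is 12.8 Tbps of aggregate EFA bandwidth per NVL32 instance.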
NVIDIA Software Innovations on AWS:
NVIDIA’s software contributions on AWS include NeMo Retriever microservice for chatbots and summarization tools, BioNeMo for accelerated drug discovery, and collaborations with AWS services leveraging NVIDIA’s frameworks for AI training and robotics advancements.