AI Hyperscaler System Architecture for the Rest of Us
Drut Technologies Inc., a growing technology company offering innovative high-performance solutions for the datacenter accelerator market, announced the launch of their new DynamicXcelerator architecture.
As customers deploy large numbers of GPUs and accelerators for AI/ML workloads, they find that accelerator-to-accelerator communication at scale becomes both a bottleneck and an unnecessary source of complexity and expense. Drut’s DynamicXcelerator mitigates GPU scaling challenges by introducing dynamic slicing via photonic cross connects and creating topologies based on the AI/ML workload’s traffic matrix. This allows low-latency direct paths to be built dynamically between any group of accelerators or GPUs within a datacenter.
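To make the idea concrete, here is a minimal sketch of how a topology could be derived from a workload traffic matrix. The greedy policy, function names, and port limits are illustrative assumptions for explanation only, not Drut’s actual algorithm: the heaviest GPU-to-GPU flows are each granted a dedicated photonic circuit, subject to the optical ports available per device.

```python
# Hypothetical sketch: derive a direct-connect topology from a traffic matrix.
# The greedy policy and all names are illustrative assumptions, not Drut's algorithm.

def plan_circuits(traffic, ports_per_gpu):
    """traffic[(a, b)] = expected GB/s between GPU a and GPU b."""
    used = {gpu: 0 for pair in traffic for gpu in pair}
    circuits = []
    # Greedily give the heaviest flows their own photonic cross-connect path.
    for (a, b), load in sorted(traffic.items(), key=lambda kv: -kv[1]):
        if used[a] < ports_per_gpu and used[b] < ports_per_gpu:
            circuits.append((a, b))
            used[a] += 1
            used[b] += 1
    return circuits

# Example: a 4-GPU slice where GPUs 0-1 and 2-3 exchange most of the traffic.
traffic = {(0, 1): 90.0, (2, 3): 80.0, (0, 2): 10.0, (1, 3): 5.0}
print(plan_circuits(traffic, ports_per_gpu=1))  # [(0, 1), (2, 3)]
```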
Jitender Miglani, Founder of Drut Technologies, says, “Bringing reconfigurability to both the link layer and the bandwidth layer is a game-changer, as it opens the system interconnects to adapt to the underlying workload requirements.” Earlier in his career, he helped deliver a similar technology to a cloud hyperscaler for its optical circuit switch architecture.
Targeting the expanding AI market, Drut’s DynamicXcelerator can be built at half the cost of equivalent connectivity options, avoiding vendor lock-in and enabling multi-vendor AI cloud solutions. “The DynamicXcelerator is the culmination of years of work by the Drut team. We are proud of what we have built and believe this architecture will be the foundation of many data centers, as it addresses performance, resource efficiency, and climate concerns,” said William Koss, CEO of Drut Technologies.
Beyond the cost advantage, customers can ride the accelerator and GPU curve as new modules are offered, simply adding them to the DynamicXcelerator and connecting them to general-purpose compute nodes. This changes how users manage and operate datacenter infrastructure over its lifetime.
Utilizing a dynamic photonic fabric makes it possible to create direct paths between accelerators without building a tiered electrical packet-switch hierarchy. This reduces cost, simplifies designs, creates predictable latency, and makes the topologies it builds both dynamic and deterministic. The technology is entirely standards-based, works with any compatible server or accelerator device, and can be deployed in new and existing datacenters.
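A back-of-the-envelope comparison helps show why removing the switch hierarchy makes latency predictable. All figures below are assumed for illustration, not measured or vendor-supplied numbers: each electrical switch hop adds fixed forwarding delay plus variable queuing delay, while a dedicated photonic circuit is a single fixed-delay pass through the cross connect.

```python
# Back-of-the-envelope latency comparison; every figure is an illustrative assumption.
SWITCH_HOP_NS = 500        # assumed per-hop forwarding latency of an electrical packet switch
QUEUING_JITTER_NS = 2000   # assumed extra queuing delay on each congested hop
OPTICAL_PATH_NS = 50       # assumed pass-through delay of a photonic cross connect

def tiered_path_latency(hops, congested_hops=0):
    # Multi-tier electrical fabric: latency grows with hop count and congestion.
    return hops * SWITCH_HOP_NS + congested_hops * QUEUING_JITTER_NS

def direct_path_latency():
    # Dedicated photonic circuit: one fixed delay, no per-packet queuing.
    return OPTICAL_PATH_NS

print(tiered_path_latency(hops=5))                    # 2500 ns, best case
print(tiered_path_latency(hops=5, congested_hops=2))  # 6500 ns, under congestion
print(direct_path_latency())                          # 50 ns, deterministic
```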
“Large-scale reconfigurable supercomputing clusters in the data center will soon no longer be the preserve of the hyperscalers,” says Roy Rubenstein, consultant at LightCounting Market Research. “In the last year, Drut has gone from showing a composable server architecture to the promise of large-scale reconfigurable supercomputing clusters in the data center.” The DynamicXcelerator is geared towards small GPU data centers for enterprise customers but can also scale up to large hyperscaler deployments. It is ideal for a variety of workloads, including machine learning, artificial intelligence, and high-performance computing.
Features of the Drut DynamicXcelerator:
- GPU cloud at scale using standard datacenter devices
- Connects accelerator to accelerator, GPU to GPU, at both small and large scale
- Scales up at lower cost
- Uses software to define connectivity topologies
- Builds many dynamic GPU fabrics
Benefits of the Drut DynamicXcelerator:
- Access to hyperscaler architecture at an enterprise price point
- Better TCO by decoupling GPU resources from server upgrade path
- Deterministic topologies bring the right bandwidth to your workloads
- Easier to deploy and manage than traditional solutions
- Lower latencies by using a direct connect photonic fabric
- Ideal for a variety of workloads, including machine learning, artificial intelligence, and high-performance computing
- Grouping resources via direct connect resolves the stranded resource challenge
- Improves security as workload communication is restricted to defined topologies
The DynamicXcelerator solution will be available in 2024.