d-Matrix Unveils Corsair, the World’s Most Efficient AI Computing Platform for Inference in Datacenters

Enables 30,000 tokens/second at a blazing fast 2 ms/token for Llama3 70B in a single rack

d-Matrix officially launched Corsair, an entirely new computing paradigm designed from the ground up for the next era of AI inference in modern datacenters. Corsair leverages d-Matrix’s innovative Digital In-Memory Compute (DIMC) architecture, an industry first, to accelerate AI inference workloads with industry-leading real-time performance, energy efficiency, and cost savings compared to GPUs and other alternatives.

The emergence of reasoning agents and interactive video generation represents the next level of AI capabilities. These leverage more inference computing power to enable models to “think” more and produce higher quality outputs. Corsair is the ideal inference compute solution with which enterprises can unlock new levels of automation and intelligence without compromising on performance, cost or power.

“We saw transformers and generative AI coming and founded d-Matrix to address inference challenges around the largest computing opportunity of our time,” said Sid Sheth, cofounder and CEO of d-Matrix. “The first-of-its-kind Corsair compute platform brings blazing fast token generation for high interactivity applications, with an emphasis on making Gen AI commercially viable.”

Analyst firm Gartner predicts a 160% increase in data center energy consumption over the next two years, driven by AI and GenAI. As a result, Gartner estimates 40% of existing AI data centers will be operationally constrained by power availability by 2027.(1) Deploying AI models at scale could quickly make them cost-prohibitive.

d-Matrix Industry Firsts and Breakthroughs

d-Matrix combines several world’s first innovations in silicon, software, chiplet packaging and interconnect fabrics to accelerate AI inference.

Generative inference is inherently memory bound. d-Matrix breaks through this memory bandwidth barrier with a novel DIMC architecture that tightly integrates memory and compute. Scaling is achieved using a chiplet-based architecture with DMX Link for high-speed, energy-efficient die-to-die connectivity and DMX Bridge for card-to-card connectivity. For greater inference efficiency, d-Matrix is among the first in the industry to natively support block floating point numerical formats, now an OCP standard called Micro-scaling (MX) formats. These industry-first innovations are seamlessly integrated under the hood by d-Matrix’s Aviator software stack, which gives AI developers a familiar user experience and tooling.
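To illustrate the idea behind block floating point, the following is a minimal sketch in which a block of values shares one power-of-two exponent and each value keeps only a small integer mantissa. It is a simplified illustration, not an implementation of the OCP MX specification or of d-Matrix’s hardware; the function names, the 8-bit mantissa width and the example block are assumptions chosen for clarity.

```python
import math


def bfp_quantize(block, mantissa_bits=8):
    """Quantize a block of floats to signed integer mantissas
    that all share a single power-of-two exponent (hypothetical helper)."""
    qmax = 2 ** (mantissa_bits - 1) - 1  # 127 for 8-bit mantissas
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # Smallest exponent such that max_abs / 2**exp <= qmax.
    # frexp gives ratio = m * 2**e with 0.5 <= m < 1.
    m, e = math.frexp(max_abs / qmax)
    exp = e if m > 0.5 else e - 1
    scale = 2.0 ** exp
    # Clamp defensively so every mantissa fits the chosen width.
    mantissas = [max(-qmax, min(qmax, round(x / scale))) for x in block]
    return exp, mantissas


def bfp_dequantize(exp, mantissas):
    """Reconstruct approximate floats from the shared exponent and mantissas."""
    scale = 2.0 ** exp
    return [mant * scale for mant in mantissas]


if __name__ == "__main__":
    block = [0.5, -1.0, 0.25, 3.0]
    exp, mantissas = bfp_quantize(block)
    print(exp, mantissas, bfp_dequantize(exp, mantissas))
```

Because the exponent is stored once per block rather than once per value, the per-element storage and arithmetic shrink to small integers, which is the general source of the efficiency gain such formats aim for.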

Corsair comes in an industry-standard PCIe Gen5 full-height, full-length card form factor, with two cards connected via DMX Bridge. Each card is powered by DIMC compute cores with 2400 TFLOPs of 8-bit peak compute, 2 GB of integrated Performance Memory, and up to 256 GB of off-chip Capacity Memory. The DIMC architecture delivers ultra-high memory bandwidth of 150 TB/s, significantly higher than HBM. Corsair delivers up to 10x faster interactive speed, 3x better performance per total cost of ownership (TCO), and 3x greater energy efficiency*.
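The headline figures can be related with some back-of-the-envelope arithmetic. The sketch below is a rough illustration, not a vendor benchmark: it converts a per-token latency into a per-stream rate, derives how many concurrent streams an aggregate throughput implies, and computes an idealized memory-bound floor on per-token time. The function names are hypothetical, and the floor assumes every weight is read once per token at the quoted bandwidth with 1 byte per parameter. The release does not say whether 150 TB/s is a per-card or an aggregate figure, so the value is used here only as an illustrative input.

```python
def tokens_per_second_per_stream(ms_per_token):
    """Per-stream generation rate implied by a per-token latency."""
    return 1000.0 / ms_per_token


def concurrent_streams(aggregate_tokens_per_s, ms_per_token):
    """Number of simultaneous streams implied by aggregate throughput
    when each stream runs at the given per-token latency."""
    return aggregate_tokens_per_s / tokens_per_second_per_stream(ms_per_token)


def min_ms_per_token(num_params, bytes_per_param, bandwidth_bytes_per_s):
    """Idealized memory-bound lower limit on per-token time:
    all weights streamed once per token, no other overheads (assumption)."""
    return num_params * bytes_per_param / bandwidth_bytes_per_s * 1000.0


if __name__ == "__main__":
    # 2 ms/token is 500 tokens/s per stream.
    print(tokens_per_second_per_stream(2.0))
    # 30,000 tokens/s at 2 ms/token implies 60 concurrent streams.
    print(concurrent_streams(30_000, 2.0))
    # 70B parameters at 8 bits over 150 TB/s: roughly 0.47 ms floor.
    print(round(min_ms_per_token(70e9, 1.0, 150e12), 3))
```

Real deployments also pay for attention-cache traffic, inter-card communication and scheduling, so measured latency sits above this idealized floor.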

“d-Matrix is at the forefront of a monumental shift in Gen AI as the first company to fully address the pain points of AI in the enterprise,” said Michael Stewart, managing partner of M12, Microsoft’s Venture Fund. “Built by a world-class team and introducing category-defining breakthroughs, d-Matrix’s compute platform radically changes the ability for enterprises to access infrastructure for AI operations and enables them to incrementally scale out operations without the energy constraints and latency concerns that have held AI back from enterprise adoption. d-Matrix is democratizing access to the hardware needed to power AI in a standard form factor to make Gen AI finally attainable for everyone.”

Availability of d-Matrix Corsair inference solutions

Corsair is sampling to early-access customers and will be broadly available in Q2 2025. d-Matrix is proud to be collaborating with OEMs and system integrators to bring Corsair-based solutions to the market.

“We are excited to collaborate with d-Matrix on their Corsair ultra-high bandwidth in-memory compute solution, which is purpose-built for generative AI, and accelerate the adoption of sustainable AI computing,” said Vik Malyala, Senior Vice President for Technology and AI, Supermicro. “Our high-performance end-to-end liquid- and air-cooled systems incorporating Corsair are ideal for next-level AI compute.”

“Combining d-Matrix’s Corsair PCIe card with GigaIO SuperNODE’s industry-leading scale-up architecture creates a transformative solution for enterprises deploying next-generation AI inference at scale,” said Alan Benjamin, CEO at GigaIO. “Our single-node server supports 64 or more Corsairs, delivering massive processing power and low-latency communication between cards. The Corsair SuperNODE eliminates complex multi-node configurations and simplifies deployment, enabling enterprises to quickly adapt to evolving AI workloads while significantly improving their TCO and operational efficiency.”

“By integrating d-Matrix Corsair, Liqid enables unmatched capability, flexibility, and efficiency, overcoming traditional limitations to deliver exceptional inference performance. In the rapidly advancing AI landscape, we enable customers to meet stringent inference demands with Corsair’s ultra-low latency solution,” said Sumit Puri, Co-Founder at Liqid.

d-Matrix is headquartered in Santa Clara, California, with offices in Bengaluru, India; Toronto, Canada; and Sydney, Australia.

Business Wire
