CIO Influence

CEVA Doubles Down on Generative AI with Enhanced NeuPro-M NPU IP Family

CEVA, the leading licensor of wireless connectivity, smart sensing technologies and custom SoC solutions, announced its enhanced NeuPro-M NPU family, directly addressing the processing needs of the next era of Generative AI with industry-leading performance and power efficiency for any AI inferencing workload, from cloud to the edge. The NeuPro-M NPU architecture and tools have been extensively redesigned to support transformer networks in addition to CNNs and other neural networks, as well as future machine learning inferencing models. This enables highly optimized applications leveraging the capabilities of Generative and classic AI to be seamlessly developed and run on the NeuPro-M NPU inside communication gateways, optically connected networks, cars, notebooks and tablets, AR/VR headsets, smartphones, and any other cloud or edge use case.

Ran Snir, Vice President and General Manager of the Vision Business Unit at CEVA, commented: “Transformer-based networks that drive Generative AI require a massive increase in compute and memory resources, which calls for new approaches and optimized processing architectures to meet this compute and memory demand boost. Our NeuPro-M NPU IP is designed specifically to handle both classic AI and Generative AI workloads efficiently and cost-effectively today and in the future. It is scalable to address use cases from the edge to the cloud and is future proof to support new inferencing models. The leap in performance we have achieved with this architecture brings the incredible promise of Generative AI to any use case, from cost-sensitive edge devices all the way up to highly-efficient cloud computing and everything in between.”

ABI Research forecasts that Edge AI shipments will grow from 2.4 billion units in 2023 to 6.5 billion units in 2028, at a compound annual growth rate (CAGR) of 22.4%*. Generative AI is set to play a vital role in underpinning this growth, and increasingly sophisticated and intelligent edge applications are driving the need for more powerful and efficient AI inferencing techniques. In particular, the Large Language Models (LLMs) and vision and audio transformers used in generative AI can transform products and industries but introduce new levels of challenges in terms of performance, power, cost, latency and memory when running on edge devices.
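As a quick check on the forecast above, the compound annual growth rate can be recomputed from the two shipment figures. This is only a sketch: the function name `cagr` is illustrative, and the inputs are the rounded unit counts quoted in the text, so the result lands near, but not exactly on, the published 22.4%. The small gap would be consistent with rounding in the published figures.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: the constant yearly rate that turns
    start_value into end_value after the given number of years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Rounded figures from the forecast quoted above (billions of units).
units_2023 = 2.4
units_2028 = 6.5

rate = cagr(units_2023, units_2028, 2028 - 2023)
print(f"Implied CAGR from rounded figures: {rate:.2%}")
```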

Reece Hayden, Senior Analyst, ABI Research, stated: “The hardware market for Generative AI today is heavily concentrated with dominance by a few vendors. In order to deliver on the promise of this technology, there needs to be a clear path to lower power, lower cost inference processing, both in the cloud and at the edge. This will be achieved with smaller model sizes and more efficient hardware to run it. CEVA’s NeuPro-M NPU IP offers a compelling proposition for deploying generative AI on-device with an impressive power budget, while its scalability also allows NeuPro-M to address more performance-intense use cases in network equipment and beyond.”

By evolving inferencing and modeling techniques, new capabilities for leveraging smaller, domain-specific LLMs, vision transformers and other generative AI models at the device level are set to transform applications in infrastructure, industrial, automotive, PC, consumer, and mobile markets. Crucially, the enhanced NeuPro-M architecture is highly versatile and future proof thanks to an integrated VPU (Vector Processing Unit), supporting any future network layer. Additionally, the architecture supports any activation and any data flow, with true sparsity for data and weights that enables up to 4X acceleration in performance, allowing customers to address multiple applications and multiple markets with a single NPU family. To provide the greater scalability required by diverse AI markets, the NeuPro-M family adds new NPM12 and NPM14 NPU cores, with two and four NeuPro-M engines, respectively, making it easy to migrate to higher-performance AI workloads. The enhanced NeuPro-M family now comprises four NPUs: the NPM11, NPM12, NPM14, and NPM18. This versatility, along with exceptional performance and power efficiency, makes NeuPro-M the leading NPU IP available in the industry today, with peak performance of 350 TOPS/Watt at a 3nm process node and the capability to process more than 1.5 million tokens per second per watt for transformer-based LLM inferencing.
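The claim about sparsity in both data and weights can be made concrete with a toy example. The sketch below is not CEVA's implementation and says nothing about how NeuPro-M schedules work in hardware. It only illustrates the general idea that a multiply-accumulate (MAC) can be skipped whenever either operand is zero. The function name `sparse_dot` and the sample vectors are hypothetical, chosen so that the dense-to-sparse MAC ratio happens to come out at 4.

```python
def sparse_dot(weights, activations):
    """Dot product that skips any pair where either operand is zero.

    Returns (result, macs_performed) so the saved work can be counted.
    """
    total = 0
    macs = 0
    for w, a in zip(weights, activations):
        if w == 0 or a == 0:
            continue  # a zero operand contributes nothing, so skip the MAC
        total += w * a
        macs += 1
    return total, macs


# Hypothetical vectors with zeros in both the weights and the data.
weights = [0, 3, 0, 0, 2, 0, 0, 1]
activations = [1, 4, 0, 5, 0, 7, 0, 2]

result, macs = sparse_dot(weights, activations)
dense_macs = len(weights)  # a dense engine would perform one MAC per element
print(f"result={result}, MACs: {macs} sparse vs {dense_macs} dense")
```

The actual gain on real networks depends on how many zeros the weights and activations contain and on how well the hardware can exploit them, which is why the figure in the announcement is stated as an upper bound.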

Accompanying the enhanced NeuPro-M architecture is a revamped, comprehensive development tool chain based on CEVA’s award-winning neural network AI compiler, CDNN, which is architecture aware so that it can fully utilize the NeuPro-M parallel processing engines and maximize the performance of customers’ AI applications. The CDNN software includes a memory manager for memory bandwidth reduction and optimal load balancing algorithms, and is compatible with common open-source frameworks, including TVM and ONNX.
