CIO Influence

Anyscale Launches New Service Anyscale Endpoints, 10X More Cost-Effective for Most Popular Open-Source LLMs


New service gives application developers the fastest way to fine-tune and deploy powerful open-source LLMs at scale

Anyscale, the AI infrastructure company built by the creators of Ray, the world’s fastest growing open-source unified framework for scalable computing, launched Anyscale Endpoints, a new service enabling developers to integrate fast, cost-efficient, and scalable large language models (LLMs) into their applications using popular LLM APIs.


Unveiled at Ray Summit 2023, the leading conference on LLMs and generative AI for developers, Endpoints is less than half the cost of comparable proprietary solutions for general workloads and up to 10X less expensive for specific tasks.

Previously, developers had to assemble machine learning pipelines, train their own models from scratch, then secure, deploy and scale them. This resulted in high costs and slower time-to-market. Anyscale Endpoints lets developers use familiar API calls to seamlessly add “LLM superpowers” to their operational applications without the painstaking process of developing a custom AI platform.

“Obstacles like infrastructure complexity, compute resources and cost have historically limited AI application developers when it comes to open-source LLMs,” said Robert Nishihara, Co-Founder and CEO of Anyscale. “With seamless access via a simple API to powerful GPUs at a market-leading price, Endpoints lets developers take advantage of open-source LLMs without the complexity of traditional ML infrastructure. As AI innovation continues to accelerate, Endpoints enables developers to harvest the latest developments of the open-source community and stay focused on what matters—building the next generation of AI applications.”


The Power of Open Source for LLMs

Demand for generative AI and high-quality LLM applications is growing rapidly. According to a new report from Bloomberg Intelligence, the generative AI market is poised to grow from $40 billion in 2022 to $1.3 trillion over the next decade.

According to a Gartner® report, “The key benefits of open-source models include customizability, better control over deployment options, privacy and security, ability to leverage collaborative development, model transparency and potential to reduce vendor lock-in.”

Unmatched Price-Performance

As a testament to the unmatched scale and efficiency of the Anyscale Platform, Endpoints is offered at $1 per million tokens for state-of-the-art open-source LLMs like Llama-2 70B, and costs even less for other models. This dramatically expands access to LLM services for application developers. Anyscale is also typically able to add new models in hours, not weeks, so Anyscale Endpoints users have rapid access to the continuous innovation of the open-source community.
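To make the quoted price concrete, the following is a minimal sketch of the arithmetic behind a rate of $1 per million tokens. Only the rate comes from the announcement; the function name `estimate_cost_usd` and the sample workload (50,000 requests per day at 800 tokens each) are hypothetical figures chosen for illustration, not Anyscale data.

```python
# Price quoted in the announcement for Llama-2 70B: $1 per million tokens.
PRICE_PER_MILLION_TOKENS_USD = 1.00


def estimate_cost_usd(total_tokens: int,
                      price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Return the estimated cost in USD for processing total_tokens tokens."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical workload (assumed, not from the announcement):
    # 50,000 requests per day at 800 tokens per request.
    daily_tokens = 50_000 * 800
    print(f"Daily tokens: {daily_tokens:,}")
    print(f"Estimated daily cost: ${estimate_cost_usd(daily_tokens):.2f}")
```

Under these assumed figures the workload comes to 40 million tokens per day, or about $40 per day at the quoted rate. Actual billing may count prompt and completion tokens differently, so treat this as a rough estimate only.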

A Path to an AI Application Platform

LLMs provide significant value to companies because they can be tailored to specific use cases and fine-tuned with additional content and context to serve end users’ specific needs. Fine-tuning helps users get the best combination of price and performance for their use case.

In addition to fine-tuning, Anyscale provides the ability to run and use the Endpoints service within the customer’s existing cloud account on AWS (Amazon Web Services) or GCP (Google Cloud Platform). Not only does that improve security for activities like fine-tuning, it enables customers to reuse existing security controls and policies and use computing resources in their own cloud to process their proprietary data.

Anyscale Endpoints customers can also upgrade to the full Anyscale AI Application Platform, which lets them fully customize an LLM, retain fine-grained control over their data, models, and end-to-end application architecture, and deploy multiple AI applications on the same infrastructure.

The new service integrates with many popular Python and machine learning libraries and frameworks, including Weights & Biases, Arize and Hugging Face, enabling developers to address many types of use cases across any cloud as their AI applications evolve.

Driving User Success

“Realchar.ai is about delivering immersive, realistic experiences for our users, not fighting infrastructure or upgrading open-source models,” said Shaun Wei, CEO and Co-Founder at Realchar.ai, an Endpoints beta user. “Endpoints made it possible for us to introduce new services in hours, instead of weeks, and for a fraction of the cost of proprietary services. It also enables us to seamlessly personalize user experiences at scale.”

“We use Anyscale Endpoints to power consumer-facing services that have reach to millions of Google Chrome and Microsoft Edge users,” said Siddartha Saxena, Co-Founder and CTO at Merlin. “Anyscale Endpoints gives us 5x-8x cost advantages over alternatives, making it easy for us to make Merlin even more powerful while staying affordable for millions of users.”


