This will be the year when innovation and cloud data cost governance merge
It’s a fact that not all cloud data projects succeed; some estimates put the percentage of failed data projects at upwards of 87%. It’s also a fact that every job, workload, and pipeline you run on the way to production success (or failure) costs money. A couple of pipelines here and there that cost more than they need to might not be a big deal. But when you take into account the hundreds (even thousands) of pipelines that companies are running at any given time in the race to deliver innovative products and solutions, inefficiency can be a very big, and very costly, deal indeed.
Now factor in the huge amounts of data needed to train generative AI/ML models, the equally huge number of pipelines it takes to power them, and the fact that some of the very people tasked with building these models aren’t particularly skilled at it, and experimentation becomes exponentially more expensive. What’s needed to succeed in today’s lightning-fast, competitive business environment is a corporate commitment, from the CEO on down, not only to be conscious of costs (and, even more importantly, of the return on investment, or ROI) but also to keep wasted spend in check and to empower staff to act accordingly.
Cost and time-to-value take center stage
Today’s enterprises are faced with myriad challenges, with soaring cloud data costs and time to value taking center stage. Part of this can be attributed directly to the fact that to power their AI/ML models, companies have to gather, retain, and process tremendous amounts of data, involving highly complex systems, and a breadth of users with varying skill sets—all of which are costly and prone to performance issues.
This trend is unlikely to abate.
To remain competitive and ensure that their AI projects scale without breaking the bank, enterprises need to understand the true costs involved in getting value out of their data and how to ensure that their AI pipelines are optimized for speed and performance. And that’s easier said than done.
Your monthly cloud bill doesn’t tell you how much of your cloud data spend is wasted because of over-provisioned and/or underutilized resources, inefficient data tiering and job scheduling, and code that uses more computing than necessary and is more prone to cause issues in production.
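To make the idea of hidden waste concrete, here is a minimal sketch of how spend on over-provisioned compute might be estimated per job. The `JobUsage` fields, the job names, and the flat cost per core-hour are all hypothetical; real billing and utilization data will look different.

```python
from dataclasses import dataclass


@dataclass
class JobUsage:
    """Hypothetical per-job usage record; field names are illustrative only."""
    name: str
    provisioned_core_hours: float  # compute that was reserved and billed
    used_core_hours: float         # compute the job actually consumed
    cost_per_core_hour: float      # assumed flat rate in dollars


def wasted_spend(job: JobUsage) -> float:
    """Dollar cost of compute that was provisioned but never used."""
    idle_core_hours = max(job.provisioned_core_hours - job.used_core_hours, 0.0)
    return idle_core_hours * job.cost_per_core_hour


# Made-up numbers, purely for illustration.
jobs = [
    JobUsage("nightly_etl", 400.0, 100.0, 0.5),
    JobUsage("feature_build", 200.0, 180.0, 0.5),
]

total_waste = sum(wasted_spend(job) for job in jobs)
total_cost = sum(job.provisioned_core_hours * job.cost_per_core_hour for job in jobs)
print(f"Wasted ${total_waste:.2f} of ${total_cost:.2f} billed")
```

A monthly bill would show only the billed total; the idle portion becomes visible only once usage is broken down by job.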
Enterprises also need to consider a project’s ROI. The value, or return, of an AI project depends on its objective: generating new sources of revenue, reducing customer churn, increasing employee productivity, improving operational efficiency, or lowering operational costs. However, the investment part of the equation is more straightforward.
How you determine a project’s value or its gain will vary from project to project, and unfortunately, all too many organizations lack the necessary visibility into their cloud data spending to determine whether that next product or solution is going to cost more money than it’s worth. They need to understand what’s going on underneath the hood when it comes to their data applications and pipelines before they even begin to manage their costs for optimum ROI.
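As a rough illustration, the conventional ROI formula divides net gain by the amount invested. The sketch below assumes that both the gain and the cloud data investment can be expressed in dollars, which is the hard part in practice; the figures are invented for the example.

```python
def roi(gain: float, investment: float) -> float:
    """Return on investment as a fraction: (gain - investment) / investment."""
    if investment <= 0:
        raise ValueError("investment must be positive")
    return (gain - investment) / investment


# Hypothetical project: $100,000 of cloud data spend, $150,000 of estimated gain.
project_roi = roi(gain=150_000, investment=100_000)
print(f"ROI: {project_roi:.0%}")

# The same gain with $200,000 of spend yields a negative ROI,
# meaning the project costs more than it is worth.
overspent_roi = roi(gain=150_000, investment=200_000)
print(f"ROI: {overspent_roi:.0%}")
```

Without visibility into what a pipeline actually costs, the `investment` term is a guess, and so is the result.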
Companies without strong cloud data cost governance will continue to see their cloud data costs skyrocket, while the data leaders who are proactively thinking about (and addressing) governance of their cloud data costs will be the cloud data cost “winners.” This means implementing the proper guardrails to identify and correct bad code, configuration issues, data layout problems, and oversized resources in the development stage—before going into production. After all, it makes more sense to install smoke alarms when you’re building a house than to buy a fire extinguisher when it’s engulfed in flames.
Shift left
Whereas data engineers haven’t had to consider costs up until now, the volume of data—and the very complexity of modern data systems—means that cost awareness has become everyone’s responsibility, including theirs.
Historically, data engineers have been responsible for building, scaling, and monitoring an app, and making sure it runs reliably. But today’s drive for cost optimization is putting pressure on data engineers to ensure not only that pipelines are running smoothly but also that they stay within a reasonable cloud data budget and deliver a strong ROI, and to do so long before they reach production, where the big costs are incurred.
In response, data engineering teams are shifting to the additional responsibility of guiding the business on how to best ensure cloud data cost efficiencies. In doing so, they’re utilizing technology that provides deep observability into specific cloud data costs, leverages AI to identify efficiencies and provide actionable recommendations for improvement, and implements controls to set thresholds and alerts, while making the entire process as easy as a press of a button.
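The thresholds and alerts mentioned above can be as simple as comparing spend to date against a budget. This is a minimal sketch, not a description of any particular product; the function name, the status labels, and the 80% warning level are assumptions chosen for illustration.

```python
def check_budget(spend_to_date: float, monthly_budget: float, warn_at: float = 0.8) -> str:
    """Classify spend against a budget.

    warn_at is the fraction of the budget at which a warning is raised
    (0.8 is an arbitrary illustrative default).
    """
    if monthly_budget <= 0:
        raise ValueError("monthly_budget must be positive")
    ratio = spend_to_date / monthly_budget
    if ratio >= 1.0:
        return "over_budget"
    if ratio >= warn_at:
        return "warning"
    return "ok"


# Hypothetical pipelines with made-up spend figures against a $1,000 budget.
for pipeline, spend in [("ingest", 500.0), ("training", 850.0), ("reporting", 1200.0)]:
    print(pipeline, check_budget(spend, monthly_budget=1000.0))
```

Running a check like this during development, rather than after the bill arrives, is what moves the guardrail to the left.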
Working hand in hand with this shift is the realization that efficient data analytics in the cloud rests on optimizing not just the infrastructure but also the code. How code is written often determines how much compute is used. Deep visibility into the costs of data pipelines, along with recommendations for optimizing code to lower costs and run more workloads for the same spend, will be paramount.
AI and automation
As IT leaders strive to ensure their projects are delivering maximum ROI, they need to be asking the “right” questions (e.g., Which projects are the least performant and take the longest time to market? Which projects cost the most?) and map the answers to a project’s criticality.
To do so accurately, they first need to take into account the fact that the skill sets of those tasked with creating AI/ML products can vary greatly.
In any given company, only 5% of those creating AI/ML models know how to write very efficient Spark or SQL queries and control costs; maybe another 20% can write queries well. That means that only about a quarter of the people building these models are running their data jobs efficiently. In the cloud, the volume of inefficient code can make innovation cost-prohibitive.
For organizations to realize true cost optimization, everyone will need to adopt the practices of the top 5%, and that requires AI and automation. AI is a great up-leveler in that it can tell people not only where and what went wrong but also how to fix inefficiencies in, for example, their code, infrastructure, or configuration.
The pace at which companies are leveraging AI isn’t slowing down, and although AI isn’t going to be a case of “everything for everybody,” companies that want to use it successfully must increasingly monitor both its impact and its ROI.