Aporia, the observability platform for machine learning, announced the launch of Direct Data Connectors (DDC). DDC is a novel technology to monitor machine learning models in production by connecting directly to training and inference datasets, without the need to duplicate any data. With DDC, organizations can monitor billions of predictions without data sampling, data duplication, or hidden cloud costs. Aporia is the first company in the MLOps space to offer this capability.
One of the main drawbacks of traditional ML monitoring solutions is their inability to process large amounts of data. This is especially true for ML use cases that generate large volumes of predictions, such as recommendation systems, search ranking models, fraud detection models, and demand forecasting models. Organizations are often forced to monitor only a small random sample of their data, which can lead to highly inaccurate results: issues go unnoticed, false positive alerts are frequent, and bias and fairness are difficult to monitor properly. Many solutions also rely on databases such as Apache Druid, Elasticsearch, or ClickHouse, with monthly cloud costs exceeding $10,000.
With DDC, organizations can monitor every single data point, rather than relying on a sample, to get a more complete and accurate view of their ML system’s performance. This allows them to identify and address issues in real time, helping ensure the reliability and fairness of their models. Aporia’s solution connects directly to the organization’s existing data lake, avoiding the need for expensive analytics databases and enabling the monitoring of billions of predictions at minimal cloud cost.
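The idea of monitoring data where it already lives can be illustrated with a minimal sketch. This is not Aporia's implementation or API: it assumes an in-memory SQLite database standing in for a data lake or warehouse, hypothetical `training` and `inference` tables with an integer `score` column, and the population stability index (PSI) as an example drift metric. The aggregation runs inside the data store over every row, and only per-bucket counts are returned.

```python
import math
import sqlite3


def bucket_counts(conn, table, column, edges):
    """Count rows per bucket inside the database; only the counts leave it.

    `table` and `column` are trusted identifiers in this sketch (hypothetical names).
    `edges` must be sorted ascending; bucket i holds values below edges[i].
    """
    cases = " ".join(f"WHEN {column} < ? THEN {i}" for i in range(len(edges)))
    sql = (
        f"SELECT CASE {cases} ELSE {len(edges)} END AS bucket, COUNT(*) "
        f"FROM {table} GROUP BY bucket"
    )
    counts = [0] * (len(edges) + 1)
    for bucket, n in conn.execute(sql, edges):
        counts[bucket] = n
    return counts


def psi(expected, actual, eps=1e-6):
    """Population stability index between two lists of bucket counts."""
    e_total, a_total = sum(expected), sum(actual)
    total = 0.0
    for e, a in zip(expected, actual):
        p = max(e / e_total, eps)  # floor at eps so empty buckets do not break the log
        q = max(a / a_total, eps)
        total += (q - p) * math.log(q / p)
    return total


# Stand-in data store with made-up data: inference scores are shifted upward.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE training (score INTEGER)")
conn.execute("CREATE TABLE inference (score INTEGER)")
conn.executemany("INSERT INTO training VALUES (?)", [(i,) for i in range(100)])
conn.executemany(
    "INSERT INTO inference VALUES (?)", [(min(i + 20, 99),) for i in range(100)]
)

edges = [25, 50, 75]
train = bucket_counts(conn, "training", "score", edges)
live = bucket_counts(conn, "inference", "score", edges)
drift = psi(train, live)
print(train, live, round(drift, 3))
```

In this sketch the full tables are never copied out of the store; a real warehouse connector would issue a comparable aggregate query through that warehouse's own client library.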
“In today’s world where AI is being used for everything from scheduling airline flights to hiring employees, the traditional approach to ML monitoring is no longer sufficient,” said Liran Hason, CEO of Aporia. “We can’t rely on data samples anymore. As AI becomes more ubiquitous, it is crucial that organizations have a solution in place to accurately and completely monitor all data. We are proud to be the first to offer this capability with DDC.”
With Aporia’s solution, users can access fully customizable ML dashboards tailored to their specific use case, as well as customizable drift detection, live alerting, and root-cause analysis tools. DDC is currently available for BigQuery, Amazon S3, Athena, Glue Data Catalog, Delta Lake, Postgres, Redshift, and Snowflake, with more connectors being added continuously.
Other benefits of Aporia’s DDC over traditional ML monitoring solutions include:
- Allowing data scientists to integrate new models in under five minutes, without production code changes or an increased risk of crashing workloads.
- Avoiding vendor lock-in and keeping the data safe in the organization’s own format and data store, ensuring the security and accessibility of production data.
- Establishing a single source of truth to ensure that data is accurate and reliable (using two databases for the same production data can lead to discrepancies and make it hard to determine which data is trustworthy).