CDM+AI Redefines the Data Backup Market

The year 2024 has marked a turning point for the Copy Data Management (CDM) sector, which has seen a surge of high-profile activity. Notably, Cohesity has completed its acquisition of Veritas's data backup and management business and received strategic investments from industry giants such as IBM and NVIDIA in a recent funding round. Meanwhile, rival Rubrik has made its debut on the New York Stock Exchange at a valuation of $6.4 billion. Both of these leading CDM firms have captured the market's attention, signaling a new era in data management.

The rise of CDM can be traced back to a single, overarching theme: CDM has finally become a mainstream solution. With AI technologies accelerating, CDM platforms are poised to become pivotal to data management across industries, potentially reshaping the entire landscape of data backup and management.

The evolution of CDM from a "niche" player to a "mainstream" choice illustrates the dramatic shifts within the data backup and management market in recent years.

Data backup is a longstanding component of IT infrastructure, with a legacy that spans more than half a century. Between 1990 and 2016, data recovery relied on a primitive approach built largely on manual labor. Consequently, data preparation and delivery were often clunky and inefficient.

However, the decade from 2010 to 2020 ushered in an era of technological innovation for CDM solutions. Actifio pioneered the use of CDM technology for enterprise backup, while Delphix emerged as a key player in applying CDM to Test Data Management (TDM). By 2016, the influential research firm Gartner had recognized CDM as a vital connective tissue in bimodal IT environments, highlighting its significant potential applications.

The watershed moment for CDM came in 2020, when the technology moved into the market mainstream.

The dual success of Cohesity and Rubrik, both of which were included in Gartner's Magic Quadrant for Backup Solutions, solidified their status as comprehensive CDM platform providers. Cohesity’s acquisition of Veritas's data backup and management business gave it access to a large base of existing clients in large-scale enterprise data centers, effectively coupling CDM capabilities with a strong client backup portfolio. Furthermore, the strategic investments from both IBM and NVIDIA positioned Cohesity at the intersection of generative AI and CDM.

On the other hand, Rubrik capitalized on the booming public cloud ecosystem in the United States by focusing its efforts on CDM Software as a Service (SaaS). Through its Polaris platform, Rubrik integrated various features such as data protection and ransomware safeguards to deliver a comprehensive suite of CDM services, culminating in a successful Initial Public Offering (IPO) on the NYSE.

With the advent of AI technology, CDM solutions are generating unprecedented innovation and steadily gaining traction in the market, with the potential to fundamentally alter the data backup and management landscape.

A landmark event in the CDM realm in 2024 is the appearance of NVIDIA among the investors in Cohesity’s Series F round.

While it may appear coincidental, this relationship underscores the significant impact of advanced AI techniques, including large language models and generative AI, on various sectors, including data management.

The integration of CDM and AI is a clear trend with promising implications for data backup and management. For instance, Cohesity’s recent rollout of Gaia, an AI-driven enterprise data assistant, exemplifies this integration. The tool uses Retrieval-Augmented Generation (RAG) and large language models (LLMs) to help users navigate enterprise data, letting them pose questions about their data assets and receive informed answers. This capability offers substantial improvements in data security, compliance, and management.
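The retrieval step behind a RAG assistant of this kind can be illustrated with a toy example. The sketch below is not Cohesity Gaia's implementation: the documents, file names, and the `retrieve` and `build_prompt` helpers are all hypothetical, and a simple bag-of-words cosine similarity stands in for the embedding search a production system would likely use. It only shows the general pattern of finding the most relevant stored text and placing it in the prompt that would be sent to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical snippets standing in for indexed enterprise backup data.
DOCS = {
    "backup-policy.txt": "Database backups run nightly and are retained for 90 days.",
    "dr-plan.txt": "The disaster recovery site is tested every quarter.",
    "hr-handbook.txt": "Employees accrue vacation days monthly.",
}

def tokenize(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(question, docs, k=1):
    """Return the names of the k documents most similar to the question."""
    q = tokenize(question)
    ranked = sorted(docs, key=lambda name: cosine(q, tokenize(docs[name])), reverse=True)
    return ranked[:k]

def build_prompt(question, docs, k=1):
    """Assemble the augmented prompt; the LLM call itself is out of scope here."""
    context = "\n".join(f"[{name}] {docs[name]}" for name in retrieve(question, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

if __name__ == "__main__":
    print(build_prompt("How long are database backups retained?", DOCS))
```

In this pattern the model's answer is grounded in the retrieved text rather than in whatever it memorized during training, which is why the breadth and freshness of the underlying data store matter.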

Moreover, the synthesis of CDM and AI technology is set to propel advanced applications of large language models and generative AI in various industries.

If one considers computational power to be the "engine" of the AI era and data as the "fuel," then CDM serves as the "fuel tank," essential for storing and delivering this vital resource.

In current practice, enterprises typically bring their own data to large models in one of three ways: Retrieval-Augmented Generation (RAG), fine-tuning, or continuous pre-training. Both RAG and fine-tuning require extensive data preparation, including cleaning and governance, along with complex processes such as Extract, Transform, Load (ETL) before data is stored in a warehouse. In contrast, continuous pre-training can consume raw data directly, minimizing the need for preliminary processing.

Conventional data management frameworks have clear limitations here. Traditional data warehouses often follow a Schema-on-Write approach, which provides minimal value for AI applications. Data lakes were introduced to address the shortcomings of warehouses, but they focus primarily on processing unstructured data and likewise offer limited utility for AI applications.

In contrast, backup data optimized through CDM retains a complete record of enterprise data assets, allowing for the creation of raw-format data lakes without modeling or transformation.

By employing Schema-on-Read techniques, organizations can efficiently train AI models on this backup data at low costs while substantially reducing storage investments in traditional warehouses and platforms.
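The difference between the two approaches can be made concrete with a small sketch. It assumes raw records kept as JSON lines exactly as they were captured. The sample records, the `BILLING_SCHEMA` mapping, and the `read_with_schema` helper are hypothetical and exist only for illustration. Under Schema-on-Write, the second and third records would have been rejected or reshaped at load time; here the schema is applied only when the data is read.

```python
import json

# Hypothetical raw records restored from backup copies and stored as-is,
# with no upfront modeling. Field names and types vary between records.
RAW_LINES = [
    '{"id": 1, "customer": "Acme", "amount": "120.50", "ts": "2024-03-01"}',
    '{"id": 2, "customer": "Globex", "total": 75, "note": "legacy field name"}',
    '{"id": 3, "customer": "Initech"}',
]

# A schema defined by the consumer at read time:
# output field -> (candidate source keys, type cast, default value).
BILLING_SCHEMA = {
    "customer": (["customer"], str, ""),
    "amount": (["amount", "total"], float, 0.0),
}

def read_with_schema(lines, schema):
    """Parse raw JSON lines and project each record onto the given schema."""
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, (sources, cast, default) in schema.items():
            value = default
            for key in sources:
                if key in record:
                    value = cast(record[key])  # coerce only when reading
                    break
            row[field] = value
        yield row

if __name__ == "__main__":
    for row in read_with_schema(RAW_LINES, BILLING_SCHEMA):
        print(row)
```

Because the raw lines are never rewritten, a different consumer can apply a different schema to the same stored copy later, which is the property that makes unmodeled backup data attractive as a training source.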

Consequently, the union of CDM and AI could mark a crucial innovation frontier within data backup and management. Evidence of the growing relationship between CDM and AI is already surfacing in the market. Notably, Cohesity has partnered with NVIDIA to integrate NVIDIA NIM microservices and NVIDIA AI Enterprise with the Gaia platform to build generative AI solutions. Clients using Cohesity Gaia will soon be able to apply generative AI directly to their backup and archival data.

With the continued rise of AI technologies, traditional data infrastructure can no longer afford to remain stagnant. The classic paradigms of data backup are losing their relevance in a rapidly evolving AI landscape.

The seamless integration of CDM and AI holds the promise of unlocking new frameworks for AI applications, creating a transformative force within the data backup and management market.

In China's domestic CDM market, the momentum is equally compelling. Alongside the vibrant capital-market activity abroad, user interest in CDM at home is significant. Key industries in China, including finance, manufacturing, and government, are rapidly transitioning to CDM as organizations seek to upgrade traditional data backup systems. Engagement in CDM is not limited to large enterprises; it encompasses a growing ecosystem of startups and seasoned firms pushing the envelope of CDM innovation.

Despite challenges over the past few years, including the withdrawal of international players such as Actifio and Rubrik due to market dynamics, the seeds of CDM have taken root in China. This has led to the emergence of homegrown CDM startups, igniting a movement that holds the potential for explosive growth.

The success of any technology is contingent upon a blend of market conditions, product innovations, and evolving user needs.

In today's context, the rapid digitization and intelligent transformation of industries across China create fertile ground for the growth of CDM. Enterprises increasingly regard data backup and management as pivotal components of their operations, which supports a positive long-term outlook for CDM.

For instance, IDC's report on China's Copy Data Management (CDM) market reveals that as industries undergo digital transformation and data volumes surge, the demands for data security and operational stability continue to rise, leading to a growth rate in the CDM market that consistently outpaces the broader backup sector. The market has maintained high double-digit growth over the past few years.

While the domestic CDM landscape has not yet produced a definitive market leader, various startup companies and backup service providers have been resolutely dedicated to research and development within the CDM domain.

These innovators are crafting solutions that have found initial success across critical sectors such as finance, manufacturing, energy, healthcare, and government.

From the scale, complexity, and data demands faced by financial enterprises to infrastructure strategies that favor private cloud deployments, there is a burgeoning appetite for the CDM+AI combination. This confluence is set to release the immense latent demand within China's CDM market.

In summary, as we navigate the era of AI, a holistic re-evaluation across all fields is essential. Data backup has been a critical component of traditional IT for over fifty years. As we enter the AI-driven age, CDM, with its unique compatibility with AI technologies, finds itself at a pivotal juncture. The rise of CDM and AI integration heralds a future in which the data backup sector undergoes a remarkable transformation, establishing itself as a resilient backbone for intelligent upgrading and transformation across diverse industries.