NetApp is acquiring Alacritus predominantly to obtain its virtual tape library (VTL) technology. The company is demonstrating quite some alacrity as, in just one seven-day period, it adds multi-vendor virtualisation to its G-filers, now called V-filers; signs a significant and far-reaching alliance with IBM; and buys Alacritus.
The Alacritus VTL technology nicely complements NetApp's product line. NetApp doesn't have in-house VTL products and, in a simplistic Lego-block sense, the Alacritus VTL 'stuff' can be layered onto and integrated into NetApp's storage products - VTL ONTAP we might say.
However, Alacritus has a second technology: continuous data protection (CDP). In this technology a copy of a file is taken and then all byte-level changes to it are recorded with time stamps. These are known as increments. If the file needs to be recovered then it can be restored to any point in the sequence of changes it has undergone. This means that, if there is a system crash or accidental deletion, the file can be restored to its state virtually the second before the working copy was lost, with little or no lost data.
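The mechanism described above (a base copy plus time-stamped byte-level increments) can be sketched roughly as follows. This is an illustrative toy, not Alacritus's or NetApp's implementation: the names `Increment`, `CdpJournal`, `record` and `restore` are hypothetical, timestamps are assumed to be plain seconds, and file truncation is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Increment:
    timestamp: float  # when the change happened (seconds, assumed)
    offset: int       # byte offset where the change starts
    data: bytes       # new bytes written at that offset


@dataclass
class CdpJournal:
    base: bytes                                    # initial full copy of the file
    increments: list = field(default_factory=list)

    def record(self, timestamp, offset, data):
        """Append a time-stamped change (assumed to arrive in time order)."""
        self.increments.append(Increment(timestamp, offset, data))

    def restore(self, as_of):
        """Rebuild the file as it stood at time `as_of` by replaying increments."""
        content = bytearray(self.base)
        for inc in self.increments:
            if inc.timestamp > as_of:
                break  # later changes are ignored
            end = inc.offset + len(inc.data)
            if end > len(content):
                # The write extends the file: pad before applying it.
                content.extend(b"\x00" * (end - len(content)))
            content[inc.offset:end] = inc.data
        return bytes(content)


journal = CdpJournal(base=b"hello world")
journal.record(10.0, 0, b"HELLO")
journal.record(20.0, 6, b"WORLD!")
print(journal.restore(15.0))  # state after the first change only
```

Restoring at a time between the two recorded changes returns the file with only the first change applied, which is the "any point in the sequence" property the text describes.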
Gold standard of protection
This is the gold standard of data protection. Logically it is a form of disk-to-disk (D2D) backup, only, of course, it is not a backup in the sense of a copy of the file backed up at intervals which may range from hours to days. The internal structure of a backup file set is quite different from the internal structure of a CDP file set.
We could say that an incremental backup, as seen in synthetic backups, is quite similar to a CDP file set. Both involve an original file with increments. The backup increments will tend to be larger and fewer in number and the backup application can represent the backed-up file to users as a series of versions, one per backup session.
This is impractical with CDP file sets since there can be thousands of increments. It is quite unfeasible to expect users to look at 1,000 versions of a file and find the specific version number they want. Instead, CDP systems often let users specify a time at which they know the file was good and restore it to that point, for example just before it was overwritten.
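To illustrate why a time is a more workable handle than a version number, the sketch below maps a "known good" time onto a position in a sorted list of increment timestamps using a binary search. The function name `increments_to_replay` and the sample figures (1,000 increments, one every five seconds) are assumptions made for the example, not details of any product.

```python
from bisect import bisect_right


def increments_to_replay(timestamps, known_good_time):
    """Count the increments at or before `known_good_time`.

    `timestamps` must be sorted in ascending order.
    """
    return bisect_right(timestamps, known_good_time)


# Hypothetical history: 1,000 increments, one every 5 seconds.
timestamps = [float(t) for t in range(0, 5000, 5)]

overwrite_time = 3000.0  # the moment the file was overwritten (assumed)
count = increments_to_replay(timestamps, overwrite_time - 1.0)
print(count)  # number of increments to apply on top of the base copy
```

The user supplies only a time; the system works out which of the many increments to apply.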
Against this background Techworld talked to Amit Pandey, VP of NetApp’s enterprise disk unit, about the CDP aspect of the Alacritus acquisition and asked him for his views on it.
Pandey said: "(The Alacritus CDP capability) was absolutely of interest. We think that a lot of the data protection market - significant chunks of it - will move to CDP. We have many customers doing snaps at intervals of an hour or a day or even under an hour. It's near-continuous. Many people thinking of using CDP are thinking of throttling it back at first."
We see here the idea of a spectrum of data protection granularity with traditional weekly full backups and daily incrementals at two points on the spectrum, snapshotting in the hourly-daily area, near-CDP in the under-an-hour area and true CDP in the seconds and minutes area. If the data protection function generally is going to have to increase its granularity and embrace near- or true-CDP then this will affect everyone in it, from recent entrant Microsoft with its near-continuous DPM through to Veritas, EMC Legato, CA, Arkeia, Dantz, Yosemite, etc.
Pandey said: "(CDP) piqued our interest with Alacritus. The VTL was the primary motivation. We're working to get the VTL products into the NetApp (product) family. We're not disrupting the people working on the CDP. They'll talk to customers and it will be integrated with our backup technology."
"Customers will be able to specify a data protection granularity," - a few seconds, minutes or hours.
Referring to the potentially vast number of incremental versions of files, Pandey said: "The challenge will be to manage all those copies. They (users) want to undo a change, to back up dozens of steps, and they sometimes find themselves confused. I think that's a danger with CDP if you don't have management of the copies. The (their) representation and how it appears is key to usability."
Users are used to seeing a Windows Explorer or Unix type representation. There is a tension between the snap/backup representation and the CDP representation, with file versions on the one hand and timed states on the other. How can there be one representation to users of these two abstractions?
Pandey says: "We will have to abstract the increments. It needs to look like a unified image. In many ways, we do that today already; look at synthetics."
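One way to picture "abstracting the increments" is to collapse a long run of CDP timestamps into one recovery point per interval, so that users see a short list of versions at whatever granularity they choose. The sketch below is purely illustrative and is not NetApp's stated design; the function `coarsen` and the sample timestamps are hypothetical.

```python
def coarsen(timestamps, interval):
    """Keep only the latest timestamp in each `interval`-second bucket."""
    latest = {}
    for ts in sorted(timestamps):
        latest[int(ts // interval)] = ts  # later entries overwrite earlier ones
    return [latest[bucket] for bucket in sorted(latest)]


# Hypothetical change times, in seconds from the start of the day.
changes = [3, 10, 59, 61, 3599, 3600, 7300]

hourly = coarsen(changes, 3600)    # snapshot-like view
per_minute = coarsen(changes, 60)  # near-CDP view
print(hourly)
print(per_minute)
```

The same underlying increments yield a coarse, version-like view or a finer, time-based one depending on the interval chosen.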
NetApp is what we might call the first mainstream storage vendor tackling the challenge of integrating CDP and snap/backup data protection modes and representations to users. It will be fascinating to see how it bridges the representation gap between the two technologies and comes up with a unified capability.