Data Domain has announced a disk-less archiving appliance. The company claims outstanding data compression for its archiving appliances, shrinking stored data to between a twentieth and a fiftieth of its original size. How does it achieve this?

The DD400 series of appliances back up data to ATA-class drives. So far so normal: disk-based backup is what we're talking about. Data Domain sells appliances containing both drives and a server running what Arun Taneja has termed 'Capacity-Optimised Storage' (COS). Arun is good at thinking up new storage terminology. I believe he came up with SPAID.

A Taneja report says: "This is achieved by breaking the data into a small number of fundamental parts which, when replicated can be used to rebuild the original data." The parts constituting the data are compared, and duplicates are discarded with pointers left in their place.

(This reminds me of content-addressed storage (CAS) which uses a hashing scheme. But it is not content addressing as such. There may be an extension to wide area file systems (WAFS) in that file parts could be transmitted to the remote site.)

The unique parts, plus a plan detailing how to reconstruct the original data from them, are what is stored on the Data Domain appliance's disks.

Any new files to be backed up have their deconstructed parts compared to all existing parts; the compression algorithm looks for duplicate parts throughout the backup store and not just within the file being backed up. Because the store accumulates parts as it grows, later files are more likely to match parts already held, so compression efficiencies for later files can exceed those of earlier files. Taneja reckons this can achieve compression efficiencies of over twenty times in real-world situations.
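To make the idea concrete, here is a minimal sketch of store-wide de-duplication in Python. It is not Data Domain's method, which the sources quoted here do not describe in detail. It assumes fixed-size parts and SHA-256 fingerprints purely for illustration, and the names `DedupStore`, `backup`, `restore` and `stored_bytes` are hypothetical.

```python
import hashlib


class DedupStore:
    """Illustrative only: fixed-size parts and SHA-256 fingerprints are
    assumptions made for this sketch, not Data Domain's actual scheme."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}   # fingerprint -> unique part, shared by all files
        self.recipes = {}  # file name -> ordered fingerprints (the rebuild plan)

    def backup(self, name, data):
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            part = data[start:start + self.chunk_size]
            fingerprint = hashlib.sha256(part).hexdigest()
            # Keep the part only if no file in the store already holds it.
            self.chunks.setdefault(fingerprint, part)
            recipe.append(fingerprint)
        self.recipes[name] = recipe

    def restore(self, name):
        # Follow the plan, fetching each part from the shared pool.
        return b"".join(self.chunks[fp] for fp in self.recipes[name])

    def stored_bytes(self):
        return sum(len(part) for part in self.chunks.values())


store = DedupStore(chunk_size=4)
store.backup("monday", b"AAAABBBBCCCC")
store.backup("tuesday", b"AAAABBBBDDDD")  # shares two parts with monday
print(store.stored_bytes())               # 24 bytes backed up, 16 bytes held
print(store.restore("tuesday"))
```

The second backup adds only one new part, which is why a later file can compress better than an earlier one. A real product would also have to handle fingerprint collisions and data that shifts position between backups, both of which this sketch ignores.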

Taneja declares that other companies and products using this COS technology include:

- Archivas
- Avamar
- DataCenterTechnologies
- Rocksoft
- Permabit
- HP's RISS
- Peribit (in the WAFS area)
- Riverbed (in the WAFS area)

Data Domain also has a technology endorsement from the Enterprise Storage Group. This talks of storing 20TB of data in 1TB of disk. That is a pretty enticing prospect.
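As a quick sanity check on what these ratios mean, the arithmetic can be sketched as below. The helper names `physical_tb` and `space_saved` are hypothetical, and the ratios are simply the 20:1 and 50:1 figures quoted in this article, not measured results.

```python
def physical_tb(logical_tb, ratio):
    """Disk needed to hold logical_tb of backups at a given reduction ratio."""
    return logical_tb / ratio


def space_saved(ratio):
    """Fraction of the original size that no longer has to be stored."""
    return 1 - 1 / ratio


for ratio in (20, 50):
    print(f"{ratio}:1 -> 20TB needs {physical_tb(20, ratio):.2f}TB of disk, "
          f"{space_saved(ratio):.0%} saved")
```

Put this way, moving from twenty-fold to fifty-fold reduction only lifts the saving from 95 to 98 per cent, so most of the benefit arrives well before the top of the claimed range.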

Disk-less appliance
The Data Domain DD460g Enterprise Restorer compresses data and stores it on external secondary disk, meaning FC-attached Clariion or Nexsan's ATABeast. It provides up to 200TB of capacity.

Data Domain states that its DD460g Enterprise Restorer (g for gateway) 'transforms Fibre Channel-attached external storage arrays into cost-effective storage environments for long-term data protection.' Cost-effective?

Well, yes, in the sense that you don't have to buy a tape library or alternative media to hold the backed-up data. Okay, but does it make sense as data protection? Look at it another way. Where is the data that you are backing up likely to reside? If you have Clariion or Nexsan arrays in a Fibre Channel SAN then you probably have the bulk of your data in a SAN too. Backing up from one part of the SAN to another doesn't have much of a sense of data protection about it. The data may not even leave the rack-set in which it's stored.

It must surely increase the risk to data if source data and backup data are held in a set of drives inhabiting the same rack-set. Why bother to back up at all? Just rely on a RAID set to protect the online data. I'm puzzled by the logic here.

Data Domain CEO Frank Slootman states: "With Fibre Channel support, the gateway makes SAN storage the obvious choice for data protection and tape replacement." Hmmm. Not sure about that, Frank.

What is not being said
COS claims a twenty-fold increase in storage efficiency compared to raw data, and it effectively de-dupes that data. So why isn't anyone suggesting that COS could be used for CAS? Is this marketeers cleverly avoiding a populated product area and going for an under-populated one? Maybe some CAS products already use COS technology.