Data Domain's newest DD580 de-duplication product can process data faster than the latest LTO4 tape drive's 120MB/s raw transfer speed. It shrinks the space needed for files by factors of ten or twenty or even more. The DD580, with its dual-core Intel Woodcrest CPU, doubles the speed of the previous DD560 system, which used Xeon processing.

De-duplication involves looking for repeated bit patterns at the sub-file level and replacing them with pointers to the original bit pattern. With highly redundant data, such as backup files, this can shrink them to a tenth or twentieth of their original size. This makes disk-to-disk backup more useful as a greater amount of backup data can be held on fast disk arrays instead of slow tape cartridges.
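The idea can be illustrated with a minimal Python sketch. It is not Data Domain's algorithm: the fixed 4KB chunk size, the SHA-256 fingerprints and the function names `dedupe`, `restore` and `stored_bytes` are all assumptions made for illustration, and real products may use variable-size segments and different indexing.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


def dedupe(data, store):
    """Split data into chunks and keep each unique chunk once in store.

    Returns a recipe: the ordered list of fingerprints (pointers)
    needed to rebuild the original data.
    """
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        # A repeated chunk is not stored again; only its pointer is recorded.
        store.setdefault(fingerprint, chunk)
        recipe.append(fingerprint)
    return recipe


def restore(recipe, store):
    """Rebuild the original data by following the pointers in the recipe."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


def stored_bytes(store):
    """Total size of the unique chunks actually held on disk."""
    return sum(len(chunk) for chunk in store.values())


if __name__ == "__main__":
    store = {}
    # Hypothetical data set: ten distinct chunks, backed up ten nights running.
    nightly = b"".join(bytes([i]) * CHUNK_SIZE for i in range(10))
    recipes = [dedupe(nightly, store) for _ in range(10)]

    logical = 10 * len(nightly)
    physical = stored_bytes(store)
    print(f"logical {logical} bytes, stored {physical} bytes, "
          f"ratio {logical / physical:.0f}:1")
    assert restore(recipes[-1], store) == nightly
```

In this contrived case ten identical backups occupy the space of one, a ratio of ten to one. Real backup sets change a little each night, so the ratio depends on how much of the data repeats.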

However, de-duping needs a lot of processing power and is often carried out after the raw data has been copied to a disk storage device, typically a virtual tape library (VTL). Such post-processing de-dupe lets backup proceed at disk speed, around 200MB/s or more, rather than at slower tape speed.

By using specialised rather than commodity software, and with the help of the Woodcrest CPUs, the DD580 can de-dupe data inline at 180MB/sec onto its serial ATA drives. That is not as fast as VTL post-processing de-dupe products from Sepaton and Quantum/ADIC, and it is slower than a streaming LTO4 tape drive backing up compressed data at around 240MB/sec.
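To put these rates in terms of a backup window, here is a rough Python sketch. The 5TB data set is a hypothetical figure, the function name `backup_hours` is illustrative, and the calculation assumes decimal units and a sustained rate with no start-up or tape repositioning overhead.

```python
# Transfer rates in MB/sec as quoted in this article.
RATES_MB_PER_S = {
    "LTO4 raw": 120,
    "DD580 inline de-dupe": 180,
    "VTL post-processing (disk)": 200,  # "around 200MB/s or more"
    "LTO4 compressed": 240,
}


def backup_hours(dataset_gb, rate_mb_per_s):
    """Hours needed to move dataset_gb at a sustained rate (1GB = 1000MB)."""
    seconds = dataset_gb * 1000 / rate_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    dataset_gb = 5000  # hypothetical 5TB backup, not a figure from the article
    for name, rate in RATES_MB_PER_S.items():
        print(f"{name}: {backup_hours(dataset_gb, rate):.1f} hours")
```

On these assumptions the hypothetical 5TB backup takes a little under eight hours at the DD580's inline rate, against roughly eleven and a half hours at LTO4's raw speed and just under six at its compressed speed.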

But the DD580 is faster than other inline de-dupe products, such as those from Diligent, Quantum, EMC/Avamar and Symantec (PureDisk). Data Domain's MD for EMEA sales, Kevin Platz, said: "These systems are not designed for performance, volume and scalability; some of these products function well but overall they only work with smaller data loads, i.e. up to approximately 250GB of backup data. Also, they are used in lieu of standard backup software and require considerable migration to get their data under the management of the customer's existing backup software."

The DD580 works readily with existing backup infrastructures and has built-in replication. It can send de-duped data immediately to a remote site, enabling recovery with up-to-date information, which is not possible with a post-processing de-dupe product. Because it sends a reduced amount of data to the remote site, it also saves WAN bandwidth, time, or both.

Fully configured with 16 DD580 controllers, a DDX array can achieve more than 12TB/hour throughput and up to 20PB capacity. It is also available as a DD580g gateway with back-end Fibre Channel or SATA drives. The DD580 will be generally available by August and costs from $120,000 (£80,000 in the UK) for a 15-drive basic appliance. Working out whether a DD580 is better value than a post-processing de-dupe product or a tape library will require putting a value on its replication features and its cost of ownership relative to those two alternatives.
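As a rough check on how the array figure relates to the single-controller rate, the unit conversion can be sketched in Python. This assumes decimal units (1TB = 1,000,000MB) and perfectly linear scaling across controllers, neither of which is stated in the figures quoted above; the function names are illustrative.

```python
CONTROLLERS = 16            # fully configured DDX array
INLINE_RATE_MB_PER_S = 180  # quoted DD580 inline de-dupe rate
ARRAY_RATE_TB_PER_HOUR = 12 # quoted as "more than 12TB/hour"


def mb_per_s_to_tb_per_hour(rate_mb_per_s):
    """Convert MB/sec to TB/hour using decimal units."""
    return rate_mb_per_s * 3600 / 1_000_000


def tb_per_hour_to_mb_per_s(rate_tb_per_hour):
    """Convert TB/hour to MB/sec using decimal units."""
    return rate_tb_per_hour * 1_000_000 / 3600


if __name__ == "__main__":
    per_controller = mb_per_s_to_tb_per_hour(INLINE_RATE_MB_PER_S)
    print(f"One controller at 180MB/sec: {per_controller:.3f} TB/hour")
    print(f"Sixteen controllers, linear scaling: "
          f"{per_controller * CONTROLLERS:.1f} TB/hour")

    implied = tb_per_hour_to_mb_per_s(ARRAY_RATE_TB_PER_HOUR / CONTROLLERS)
    print(f"Rate per controller implied by 12TB/hour: {implied:.0f} MB/sec")
```

Under these assumptions, 180MB/sec per controller scales to about 10.4TB/hour, while 12TB/hour implies roughly 208MB/sec per controller. The two quoted figures may therefore have been measured under different conditions or in different units.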