If you want a thumping great online archive for unstructured data then HDS has a humungous one, a simply huge archive silo called the Hitachi Content Archive Platform (HCAP) version 2.
Version 2 of HDS' unstructured data archive product, the one based on acquired Archivas technology, has blasted away capacity limits by adding HDS' USP-V (Universal Storage Platform - Virtualised) as a supported disk array option and by logically separating the archive application servers from the storage.
There can be up to 80 server nodes clustered together. If HDS' USP-V is used for storage then the beast can scale to 20PB of data - it was 32TB - meaning files, metadata and indices. It means space for up to 400 million objects (files or e-mails) per node, up from 25 million, and a grand total of 32 billion stored objects. (It won't be enough. In 2009 we're bound to see a bigger total.)
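For readers who want to check the sums, here is a quick sketch of the scaling arithmetic using only the figures quoted above. The variable names are ours, and the decimal conversion of 1PB = 1,000TB is an assumption, since HDS does not say which convention it uses.

```python
# Figures quoted in the article; names are illustrative only.
MAX_NODES = 80
OBJECTS_PER_NODE_V2 = 400_000_000   # version 2 limit per node
OBJECTS_PER_NODE_V1 = 25_000_000    # previous limit per node

# Total objects across a fully populated cluster.
total_objects = MAX_NODES * OBJECTS_PER_NODE_V2

# How much the per-node object limit has grown.
per_node_growth = OBJECTS_PER_NODE_V2 // OBJECTS_PER_NODE_V1

# Capacity growth from 32TB to 20PB, assuming 1PB = 1,000TB.
TB_PER_PB = 1000
capacity_growth = (20 * TB_PER_PB) / 32

print(f"Total objects: {total_objects:,}")
print(f"Per-node object growth: {per_node_growth}x")
print(f"Capacity growth: {capacity_growth:g}x")
```

The 32 billion total is simply 80 nodes at 400 million objects apiece.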
There are additional features on several fronts:-
- Encryption - data can be encrypted and the key shared between the nodes. This means that if a thief steals a node or a storage item from HCAP the contents are encrypted and unreadable.
- File-level de-duplication - single instance file storage is ensured by comparing each object with others at a binary level and by hash-addressing. This dual comparison means hash collisions cease to be a concern. Sub-file-level de-duplication is not a feature, so its storage efficiencies remain a possible future addition to HDS' HCAP feature arsenal. HCAP will report on the number of duplicate objects found and the consequent saving in storage capacity.
- Server nodes and storage capacity can be scaled independently of each other. The use of USP-V means non-HDS storage can be used behind the USP-V.
- Replication of compressed data to remote sites is supported for business continuity and disaster recovery.
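The file-level de-duplication described in the list above, a hash lookup backed by a binary comparison, can be sketched roughly as follows. This is not HDS' implementation: the class name, the in-memory dictionary and the choice of SHA-256 are all assumptions made for illustration, as HDS has not said which hash HCAP uses.

```python
import hashlib


class SingleInstanceStore:
    """Hypothetical in-memory single-instance store (illustration only)."""

    def __init__(self):
        # digest -> list of distinct byte strings that share that digest
        self._by_digest = {}
        self.duplicates_found = 0
        self.bytes_saved = 0

    def put(self, data: bytes):
        """Store data once; return a (digest, index) handle for it."""
        # SHA-256 is an assumed hash, not one named by HDS.
        digest = hashlib.sha256(data).hexdigest()
        candidates = self._by_digest.setdefault(digest, [])
        for index, existing in enumerate(candidates):
            # Binary comparison: a matching digest alone is not trusted.
            if existing == data:
                self.duplicates_found += 1
                self.bytes_saved += len(data)
                return (digest, index)
        # New content, or a genuine hash collision: keep a separate copy.
        candidates.append(data)
        return (digest, len(candidates) - 1)

    def stored_objects(self):
        """Number of distinct objects actually held."""
        return sum(len(items) for items in self._by_digest.values())


store = SingleInstanceStore()
first = store.put(b"quarterly report")
second = store.put(b"quarterly report")   # duplicate, not stored again
third = store.put(b"meeting minutes")

print(store.stored_objects(), store.duplicates_found, store.bytes_saved)
```

Because two objects are treated as the same only when both the digest and the bytes match, a hash collision would cost an extra stored copy rather than losing data. The two counters mimic the duplicate and capacity-saving report mentioned above.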
As before, access to and management of HCAP is by standard, not proprietary, interfaces. This means NFS, CIFS, WebDAV and HTTP for access and SMI-S for management. NDMP is also supported, and the SNIA's developing XAM standard will be supported when it is released.
Data is stored in a write-once, read-many (WORM) fashion, and data protection is based on RAID and the SAN array of independent nodes (SAIN) concept as before, with no single point of failure.
HDS is pushing its SOSS - services-oriented storage solutions - concept and says HCAP can be a service delivered on the USP-V, with its storage being on a serial ATA drive tier in the USP-V's virtualised storage pool. HDS makes the point that management simplicity and consistency are both improved by having content archiving managed just like any other storage platform, and not stored in a unique silo and managed in a unique way. This is probably a dig at EMC's Centera, which leads the unstructured data disk-based archive market. HDS talks about 'archaic CAS solutions', which is a bit rich considering that Centera is the CAS market leader and is about due for a major refresh.
HDS says HCAP scales to greater server performance with fewer server nodes than competing content-addressed storage (CAS) products and thus HCAP is greener. There are no numbers supplied to back this up so we will have to take it on trust. (It should be noted that, although HCAP uses hashing for file-level de-duplication, HCAP is not a content-addressable storage platform.)
Supportive HCAP partners include Bus-Tech, CA, Open Text, Princeton Softech and ZANTAZ. The cost is $10-14/GB depending upon the storage choices. At the top of that range a starter 5TB system will cost around $70,000 (c£36,000 at standard conversion rates).
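A back-of-the-envelope sketch of that pricing, assuming decimal units (1TB = 1,000GB) and a flat per-gigabyte rate with no volume discount. The function name is ours, not HDS'.

```python
GB_PER_TB = 1000          # decimal units assumed
PRICE_PER_GB_LOW = 10     # dollars, bottom of the quoted range
PRICE_PER_GB_HIGH = 14    # dollars, top of the quoted range


def system_cost_range(capacity_tb):
    """Return (low, high) dollar cost for a system of the given size."""
    capacity_gb = capacity_tb * GB_PER_TB
    return capacity_gb * PRICE_PER_GB_LOW, capacity_gb * PRICE_PER_GB_HIGH


low, high = system_cost_range(5)
print(f"5TB starter system: ${low:,} to ${high:,}")
```

The $70,000 figure corresponds to the $14/GB end of the range. The same 5TB at $10/GB would come to $50,000.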