The parameters of fixed content disk-based storage have been set by EMC, Network Appliance and StorageTek with Centera, NearStore and BladeStore, respectively. Cheap ATA or Serial ATA drives are used with software providing a write-once, read-many (WORM) capability and using RAID and/or replication-type techniques to safeguard the data against ATA's less-than enterprise-class reliability.
The general idea is that cheaper ATA disks provide high capacity and much faster access than either tape or optical media, yet are significantly cheaper than SCSI or Fibre Channel disk arrays. It's called second tier or nearline storage.
EMC provides a content addressing scheme (CAS) with Centera. This is the basis for ensuring the integrity of the data, avoiding the storage of duplicate items and generally providing an efficient method of showing compliance with the various emerging regulatory regimes, such as Sarbanes-Oxley, that bane of IT storage budgets.
EMC uses a proprietary approach with Centera. Applications on connected servers that need to access the Centera repository have to do so through special software, written by an EMC partner, which uses the proprietary Centera API. This results in quite a high cost, with a cumulative price tag possibly well beyond a quarter of a million dollars. Only large or specialised enterprises need apply.
Network Appliance uses NearStore ATA hardware, and its Data ONTAP software turns the ATA arrays into secure stores for fixed content - which NetApp terms reference data. SnapShot technology provides point-in-time copies for security. Partners, such as EMC's Documentum division, provide software that can 'drive' NearStore. Data ONTAP provides both block- and file-level access to the data in NearStore. File-level access means CIFS, NFS, HTTP or Direct Access File System (DAFS) over Ethernet: no need for a proprietary API here.
MultiStore software provides storage consolidation. NetApp's WAFL (Write Anywhere File Layout) provides data integrity with a block-level checksum capability. There is a lot more software available, but perhaps the most pertinent item here is SnapLock. This creates non-rewritable and non-erasable WORM disk volumes, on both nearline and primary storage. Critical files can be held secure until a set retention date.
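The WORM-with-retention idea can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not SnapLock's actual interface: the `WormVolume` class, its method names and the rule that a file may be deleted once its retention date has passed are all assumptions made for the example.

```python
from datetime import date


class WormVolume:
    """Toy write-once volume (hypothetical; not NetApp's API)."""

    def __init__(self):
        self._files = {}  # name -> (data, retain_until)

    def write(self, name, data, retain_until):
        # Write-once: an existing name can never be rewritten.
        if name in self._files:
            raise PermissionError(f"{name} is already committed")
        self._files[name] = (data, retain_until)

    def read(self, name):
        # Read-many: reads are always allowed.
        return self._files[name][0]

    def delete(self, name, today):
        # Assumption: erasure is refused until the retention date is reached.
        _, retain_until = self._files[name]
        if today < retain_until:
            raise PermissionError(f"{name} is retained until {retain_until}")
        del self._files[name]


volume = WormVolume()
volume.write("audit-2003.pdf", b"quarterly figures", date(2010, 1, 1))
```

Any attempt to overwrite `audit-2003.pdf`, or to delete it before 1 January 2010, raises `PermissionError`; reads succeed at any time.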
With BladeStore, StorageTek also has a nearline storage device. The company's Application Storage Manager (ASM) software is intended to be used to build tiered archival systems. It can automatically migrate data to multiple classes of disk and tape storage. StorageTek provides a frontline disk to archival tape range of storage tiers and thus has a wider and integrated set of hardware tiers upon which to provide information life cycle facilities.
Fixed content and compliance needs are just part of StorageTek's overall Information Lifecycle Management (ILM) approach. ASM can be driven by settable policies so that data, such as fixed content, is stored on the most relevant storage tier and retained for the correct period.
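A policy of this kind boils down to a small rule table. The sketch below is a guess at the general shape only; the tier names, the age thresholds and the `tier_for` function are invented for illustration and are not taken from ASM.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    max_age_days: int  # data up to this age stays on the tier
    tier: str


# Hypothetical policy table, ordered from fastest tier to slowest.
POLICIES = [
    Policy(max_age_days=30, tier="primary disk"),
    Policy(max_age_days=365, tier="nearline ATA disk"),
]
ARCHIVE_TIER = "archive tape"  # fallback for anything older


def tier_for(age_days):
    """Return the first tier whose age limit covers the data."""
    for policy in POLICIES:
        if age_days <= policy.max_age_days:
            return policy.tier
    return ARCHIVE_TIER
```

Under these made-up thresholds, a five-day-old file stays on primary disk, a 90-day-old one moves to nearline disk and anything older than a year goes to tape.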
Permabit has devised Permeon, a fixed content storage vault that combines ATA drive arrays, WORM technology, CAS and CIFS/NFS access, thus combining elements of both the EMC and NetApp approaches. The company says its product is up to 50 percent cheaper than such competition with prices of $20 per stored GB mentioned and entry-level prices in the $40,000 (c£23,000) area.
Permabit's product is currently only available in the USA. However, the approach seems one that is bound to become available in Europe quite soon. Permeon involves a Reference Vault, which is a set of ATA drive arrays (storage servers collectively called a clique) front-ended by a pair of Permeon Portal boxes. These portals link to an organisation's LAN and, via GigE links, to the storage servers. LAN-connected servers will see NFS and CIFS volumes, courtesy of the portals. The portals are paired for reliability, which is also why the GigE switches linking them to the storage servers come in a pair.
The portals, running on Linux X86 hardware as do the storage servers, provide a web-based management console and the CAS intelligence. This is founded on individual files being split into sub-files, termed blocks, which are given a content address based on a cryptographic hash of their contents. On this basis data can be distributed evenly across the storage servers, checked for integrity, retrieved efficiently and stored just once. Duplicate blocks are reduced to a stub pointing to the first block. This is called data coalescence. Permabit says this feature alone can seriously reduce the capacity needed to store fixed content.
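The mechanism is easier to see in code. The following Python sketch shows content addressing in general rather than Permabit's implementation: SHA-256, the four-byte block size, the modulo placement rule and the `BlockStore` class are all assumptions chosen to keep the example short.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, for illustration only


class BlockStore:
    """Toy content-addressed store spread over several 'servers'."""

    def __init__(self, num_servers=3):
        # Each server is modelled as a dict: address -> block bytes.
        self.servers = [dict() for _ in range(num_servers)]

    def _server_for(self, address):
        # Assumption: the hash value itself decides placement,
        # which spreads blocks roughly evenly across servers.
        return self.servers[int(address, 16) % len(self.servers)]

    def put_file(self, data):
        addresses = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            address = hashlib.sha256(block).hexdigest()
            # A block already held under this address is not stored again.
            self._server_for(address).setdefault(address, block)
            addresses.append(address)
        return addresses

    def get_file(self, addresses):
        parts = []
        for address in addresses:
            block = self._server_for(address)[address]
            # Integrity check: the block must still hash to its address.
            if hashlib.sha256(block).hexdigest() != address:
                raise ValueError("block corrupted")
            parts.append(block)
        return b"".join(parts)

    def stored_blocks(self):
        return sum(len(server) for server in self.servers)


store = BlockStore()
addresses = store.put_file(b"ABCDABCDABCDWXYZ")
```

The 16-byte file splits into four blocks, three of them identical, so the file is described by four addresses while only two distinct blocks are actually stored.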
Permeon produces content certificates which contain the content's address and various items of metadata. Such certificates have an obvious compliance aspect, proving that a record has been stored without corruption or alteration.
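As a rough illustration of why such a certificate proves anything, the Python sketch below builds one and checks content against it. The certificate layout, the use of SHA-256 and JSON, and the function names are assumptions for the example; Permabit's actual certificate format is not described here.

```python
import hashlib
import json


def make_certificate(content, metadata):
    """Return a hypothetical certificate: content address plus metadata."""
    certificate = {
        "address": hashlib.sha256(content).hexdigest(),
        "size": len(content),
        "metadata": metadata,
    }
    return json.dumps(certificate, sort_keys=True)


def verify(content, certificate):
    """True only if the content still hashes to the certified address."""
    expected = json.loads(certificate)["address"]
    return hashlib.sha256(content).hexdigest() == expected


cert = make_certificate(b"invoice 42", {"owner": "finance"})
```

Because the address is derived from the content, changing even one byte of the record produces a different hash and `verify` returns `False`.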
For security and self-healing, replica copies of data are distributed throughout the storage servers.
The company produces a fixed content store, called Reference Vault, and a compliance version, called Compliance Vault. Capacity is up to 40TB and I/O can attain 450GB/hour.
Avamar is another US company that reduces needed storage capacity by tagging each item to be stored in its Axion system. This is another content addressing scheme. If the Axion system detects the same tag with a future item it doesn't store it, using a pointer instead. This is similar in effect to Permabit's data coalescence idea, only Avamar calls it PDVR - pervasive data volume reduction.
Avamar uses industry-standard hardware and, again, its product is relevant to fixed content storage and can be extended to meet compliance needs.
It seems abundantly obvious to this writer that industry-standard hardware will be used to provide relatively inexpensive content-addressed storage, and that such products will appear in Europe within a matter of months. This will provide Centera/Data ONTAP-type functionality at price levels that extend affordability down the enterprise ranks. As the ranks of the compliance regimes that affect us all harden, such systems will become increasingly attractive.