IBM surprised most of the storage market with its purchase of secretive Israeli start-up XIV. It indicates a potentially profound new storage direction for IBM. Certainly XIV offers intriguing new SAN-style storage, but that on its own is not enough to justify a purchase. So why did IBM buy the company? There appear to be several reasons: Moshe Yanai, XIV's founder; XIV technology, low-cost Web 2.0 scale storage of non-transactional data; and IBM's reading of the Web 2.0 market.

XIV, which offers block SAN-like storage using SATA drives for Web 2.0 with great scalability and data protection, has received just $3 million in investment money; it doesn't take that much to develop software for commodity hardware components. Yanai apparently provided $2,250,000 of that, with other private investors contributing the rest. No venture capital firms have an interest, as they do at Diligent.

The Israeli press has mentioned a purchase price of more than $300 million, which would give Yanai a very pleasant Christmas present. IBM gains technology, around fifty people, and has committed to set up a major new storage development facility centred on XIV's facilities in Israel.

Moshe Yanai

Moshe Yanai comes from EMC where he invented Symmetrix arrays ten or more years ago, becoming Symmetrix VP of engineering for ten years, as well as an EMC Fellow and reporting directly to CEO Joe Tucci. He owns several patents for Symmetrix technology and today's DMX4 arrays still use that technology.

He left EMC in 2001; he 'was ousted publicly', according to IBM senior storage consultant Tony Pearson. After that he co-founded de-duplication company Diligent with Doron Kempel, and then founded XIV in Israel. Yanai is still a director at Diligent.

EMC bought the Data General Clariion disk operation when it realised that mid-range arrays were needed alongside the top-end Symmetrix. Yanai reportedly opposed this purchase.

Kempel and Yanai both left EMC in 2001 and formed Diligent from EMC's R&D lab in Israel in June, 2002. In exchange for the lab and facilities EMC was given a 24 percent stake in Diligent valued at $5 million. Yanai reportedly invested $10 million from the wealth he'd earned at EMC.

Yanai is a 'big hitter' in storage terms. He now becomes an IBM Fellow.

XIV technology

XIV stores data in storage modules: Intel Linux servers linked across Gigabit Ethernet, each with lots of direct-attached (DAS) serial-ATA (SATA) disk. These are virtualised into one pool of block storage, like a virtualised storage area network (SAN).

It is front-ended by interface modules, again Linux Intel servers, which provide access to users/applications coming in via Fibre Channel or iSCSI. The idea is that the XIV infrastructure can scale very, very well by adding interface or storage modules or both. It can scale interface capacity, processing capacity and storage capacity.

Data is split into 1MB chunks and copies written to two separate disks.

If a storage module crashes then its data is reconstructed from the many copies held in the XIV infrastructure. If a disk crashes the same happens, with a re-build apparently taking only 20 minutes or so. If a second disk crashes before the re-build is complete then the only data lost is that which was held in common by the two drives and had not yet been re-built from the first drive.

This data may, or rather should, be stored in a backup somewhere.
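The chunk placement and double-failure exposure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1MB chunk size and the two copies on separate disks come from the description above, while the hash-based placement and the function names (`split_into_chunks`, `place_chunk`, `chunks_lost`) are hypothetical and are not taken from NEXTRA itself.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1MB, the chunk size given in the article


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut a byte string into fixed-size chunks; the last may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_chunk(chunk_id: int, num_disks: int) -> tuple[int, int]:
    """Pick two distinct disks for the two copies of a chunk.

    Assumption: hash-based placement, chosen only for illustration.
    """
    if num_disks < 2:
        raise ValueError("need at least two disks to hold two copies")
    digest = hashlib.sha256(str(chunk_id).encode()).digest()
    first = int.from_bytes(digest[:4], "big") % num_disks
    # An offset in 1..num_disks-1 guarantees the second disk differs.
    offset = 1 + int.from_bytes(digest[4:8], "big") % (num_disks - 1)
    return first, (first + offset) % num_disks


def chunks_lost(placement, first_failed, second_failed, rebuilt=frozenset()):
    """Chunks with both copies on the two failed disks and not yet re-built."""
    both = {first_failed, second_failed}
    return sorted(
        cid for cid, disks in placement.items()
        if set(disks) == both and cid not in rebuilt
    )


if __name__ == "__main__":
    placement = {cid: place_chunk(cid, num_disks=12) for cid in range(10_000)}
    at_risk = chunks_lost(placement, 0, 1)
    print(f"{len(at_risk)} of {len(placement)} chunks have both copies on disks 0 and 1")
```

Because each chunk's two copies land on a different pair of disks, only the small subset of chunks that happen to share both failed disks is exposed, and that subset shrinks as the re-build proceeds.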

NEXTRA provides thin provisioning and has excellent snapshot facilities.

Web 2.0-scale storage

The XIV architecture, NEXTRA, was designed to store unstructured or semi-structured data: medical images, music, videos, Web pages, presentations, documents, and similar files. IBM has characterised Web 2.0 storage as being used by companies facing phenomenal storage growth who need one storage platform that can scale up from megabytes to petabytes of storage in just a few years.

The Web 2.0 core data storage problem is exponential growth. XIV's NEXTRA fits that because it can scale as needed, and it's also self-optimising and self-healing. That latter trait fits in with IBM's past emphasis on autonomic computing.

IBM is thinking of social networking Web applications such as Facebook and Twitter, ones that grow from nothing to millions of users generating their own content in eighteen to twenty-four months. Sun uses the metaphor of red shift computing for this phenomenon.

Pearson thinks that this will help clarify and simplify the disk market, with online transaction processing (OLTP) data being stored on 15,000 rpm Fibre Channel (FC) drives and unstructured/semi-structured data on SATA drives. The current crop of 10K FC drives will fall away. IBM will offer its existing DS8000 and DS4000 drive arrays for OLTP data and XIV SATA 'arrays' for unstructured data.

IBM's view of XIV and the Web 2.0 market

In IBM's view, the Web 2.0 storage market is real. Existing enterprises are facing unstructured and semi-structured data storage problems. De-duplication technology is a response to that. The unstructured data growth rate has been exponential in social-networking type businesses such as YouTube, Facebook, Twitter, Flickr and others.

This has been referred to as cloud computing where users' data is stored 'in the cloud' on, for example, YouTube's infrastructure or Flickr's. IBM seems to be taking a view that XIV technology can be used for this.

It might also be thinking that it can be used for current enterprise unstructured and semi-structured data storage. That would need good links between DS8000/DS4000 arrays and the XIV ones with data moving software and common management. Enter Tivoli perhaps.

However, an XIV case study, Bank Leumi, has XIV storage being used as a tier-one storage replacement in an Oracle database environment with strong snapshot requirements.

Ester Lior, Bank Leumi's storage director, said: “We are using NEXTRA for applications that previously ran on our tier-1 and tier-2 systems — in short, we’re getting all we need without tier-1 costs.” That sounds like XIV replacing DS8000 and DS4000-type arrays. Could this be why IBM is so excited about XIV? Does it provide a new storage paradigm that shows the way forward from existing monolithic and modular storage arrays?

Currently IBM has a NetApp reselling agreement. The implications of the XIV purchase for NetApp could be profound, given XIV's emphasis on a commodity hardware/OS approach providing lower cost storage.

XIV's NEXTRA is not a network-attached storage (NAS) box although it is for file storage. It isn't a FC SAN although it can accept FC protocol requests, and it isn't DAS, although it is built from networked DAS-heavy servers. It isn't a file area network (FAN) in the Brocade sense either.

This is, for want of a better term, cloud storage. Have a look at Pivot3's offering for similar RAID-like data protection, both within a node and across Gigabit Ethernet-connected nodes.

Also have a look at LeftHand Networks' iSCSI SAN storage offering. Although this is pitched at mid-range enterprises, its construction of an IP SAN from DAS-heavy commodity servers parallels the XIV idea.

It's probable that, with the XIV purchase, IBM has closed the door on a LeftHand acquisition. The possibility of LeftHand being bought has been in the air since Dell acquired its main competitor in the IP SAN space, EqualLogic, last year.