EMC's new MPFSi (multi-path file system for iSCSI) software is a distributed file system which allows an application server to choose whether data is delivered to it in file or block form, without the application being aware of the difference.
It requires a software client running on the application server, which caches metadata and includes an iSCSI client. The metadata gives the client the block addresses of the files it is using, so it can fetch data directly rather than going through the NAS server.
The process goes something like this: the client app server requests a file from the NAS gateway, the NAS gateway gives it the relevant metadata, and the app server then decides (according to rules based on file size, transaction type, and so on) whether to retrieve the data as a file through the NAS gateway, or go direct to the storage array via iSCSI and pull the data over as blocks.
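As a rough illustration of that decision step, here is a minimal Python sketch. EMC has not described the actual rules here, so everything in it is an assumption made for illustration: the names `Request`, `Path` and `choose_path`, the 1MB threshold, and the use of a sequential-versus-random flag as a stand-in for "transaction type".

```python
from dataclasses import dataclass
from enum import Enum


class Path(Enum):
    FILE = "file via NAS gateway"
    BLOCK = "blocks via iSCSI"


@dataclass
class Request:
    size_bytes: int      # size of the file being transferred
    sequential: bool     # stand-in for "transaction type": streaming vs small random I/O
    has_block_map: bool  # True if the gateway's metadata included block addresses


# Hypothetical cut-off for illustration only; not an EMC figure.
BLOCK_MIN_BYTES = 1024 * 1024


def choose_path(req: Request) -> Path:
    """Pick a data path for one request using simple, assumed rules."""
    # Without block addresses the client can only go through the gateway.
    if not req.has_block_map:
        return Path.FILE
    # Large streaming transfers are the assumed sweet spot for block access.
    if req.sequential and req.size_bytes >= BLOCK_MIN_BYTES:
        return Path.BLOCK
    # Everything else stays on the ordinary NAS route.
    return Path.FILE
```

Under these assumed rules, a large streaming read would go direct to the array, while a small random read, or any request for which no block map was returned, would take the normal file route.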
Of course, for block transfers to be possible, the relevant hooks have to be in place on the NAS server, both to publish that metadata and to handle distributed file locking, including updating all the other servers' metadata. So far, those features only exist in EMC's Celerra hardware - MPFSi uses the Celerra API - but one could envisage other NAS servers acquiring them, especially as EMC has offered the technology to the IETF for possible inclusion in the next version of NFS, and has published the client code on Sourceforge to allow users to customise it.
Up to 4x faster?
EMC claims that MPFSi block transfers can be up to four times faster than the NAS route, because they bypass the NAS gateway and are more efficient. The gain does depend on the file being retrieved and the protocol being used (CIFS and NFS have different overheads, for example), hence the alternative of simply using NAS. An app server that lacks the MPFSi software client can still access the same files via NAS alone, of course.
One area it's not aimed at is data access over the WAN, partly because iSCSI is not especially good over long distances. (The use of iSCSI for replication or backup is an exception - this relies on the replication software knowing that the WAN's latency will cause problems for SCSI and compensating for them in advance.)
EMC argues that MPFSi and WAFS are complementary, not in the sense that they might be used together, but in that they address different needs. MPFSi is more of a data centre or campus technology, and it's aimed at the sort of heavy lifting that goes on between an application server and a back-end database over a decent-bandwidth connection - which is probably not the sort of thing you would do over a WAN. It also assumes a server at each end of the link, whereas WAFS is aimed at eliminating remote servers and replacing them with caching devices.
If you want an example of the sort of use that MPFSi might be suitable for, EMC's senior director of NAS marketing, Mark Greenlaw, says the reason for starting with client software for Linux only is that "We see a lot of Linux-based grids out there". He adds, though, that he also sees a lot of demand in the Windows space, and that EMC will have clients for other operating systems later this year. The MPFSi client costs from $500 down to $225 per server, depending on volume.
Other NAS suppliers have already added block access to their arrays, but Greenlaw says he doesn't see them replicating MPFSi just yet.
"Having block and file access to the same array is not new," he says. "What's new is allowing the client to control that. It allows the client to map both block and file paths [to the same data], and that's not a trivial task. The idea is to have the performance of iSCSI with the management simplicity of NAS."