IBM is upgrading its high-performance file systems to help push supercomputing into the mainstream.
The company plans to release a new version of its GPFS (General Parallel File System) that offers improved file management. The file system can search across as many as 1,000 nodes in parallel, said Scott Handy, vice president of marketing and strategy for IBM Power Systems.
Handy said that in a test, IBM used GPFS to scan 1 billion files, a demonstration aimed at customers in fields such as financial services and retail who deal with massive amounts of unstructured files. The scan was completed in just over two and a half hours; Handy said IBM is now working to shorten that to one hour.
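To put those scan times in perspective, the implied throughput can be worked out with simple arithmetic. The sketch below is illustrative only: it treats "just over two and a half hours" as exactly 2.5 hours, so the first figure is a slight overestimate, and the function name is a hypothetical choice, not anything from IBM.

```python
def files_per_second(total_files: float, hours: float) -> float:
    """Average scan rate implied by a file count and an elapsed time in hours."""
    return total_files / (hours * 3600)

# Assumption: "just over two and a half hours" is rounded down to 2.5 hours.
current_rate = files_per_second(1_000_000_000, 2.5)
# The one-hour figure is IBM's stated goal, not a measured result.
target_rate = files_per_second(1_000_000_000, 1.0)

print(f"reported test: about {current_rate:,.0f} files per second")
print(f"one-hour goal: about {target_rate:,.0f} files per second")
```

Under that assumption, reaching the one-hour goal would mean scanning roughly two and a half times as fast as in the reported test.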
The update to GPFS, now at Version 3.2, includes policy-based file management that will allow a user to tell the system how to store and search files. For instance, this upgrade will allow a user to stipulate that files saved in a certain format are to be stored on a particular kind of disk. What that will mean, Handy said, is that users can take a tiered approach to how they distribute data. A user can write a policy telling the system to store certain kinds of data on its fastest and most expensive disk, with other types of data going to lower-cost systems where performance isn't as critical.
That capability would allow users to save money because they could use lower-cost storage where it's appropriate, he said.
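The tiered placement idea Handy describes can be sketched in a few lines of Python. This is not GPFS policy syntax; the tier names, the file extensions, and the `choose_tier` function are all hypothetical, chosen only to show how a rule keyed on file format might route data to different classes of disk.

```python
from pathlib import PurePath

# Hypothetical mapping of file formats to storage tiers (not GPFS defaults).
PLACEMENT_RULES = {
    ".db": "fast",      # performance-critical data on the most expensive disk
    ".log": "economy",  # data where performance isn't as critical
}
DEFAULT_TIER = "standard"  # hypothetical fallback for formats with no rule


def choose_tier(filename: str) -> str:
    """Return the storage tier for a file, based only on its extension."""
    suffix = PurePath(filename).suffix.lower()
    return PLACEMENT_RULES.get(suffix, DEFAULT_TIER)


for name in ("trades.db", "app.log", "notes.txt"):
    print(name, "->", choose_tier(name))
```

Keeping the rules in a table separate from the placement logic mirrors the policy-based approach: the user changes the policy, not the system.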
Another policy, Handy said, could require that files not accessed for 30 days or so be moved to a less expensive system. The previous release of GPFS treated all such files the same, he said. IBM is also adding clustered management features.
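An age-based rule of the kind Handy mentions could look something like the following Python sketch. It is an illustration under stated assumptions, not IBM's implementation: the 30-day threshold comes from his example, while the function name, the constant, and the in-memory dictionary of last-access times are hypothetical stand-ins for whatever metadata the file system actually tracks.

```python
from datetime import datetime, timedelta

# Threshold taken from the example in the article; the constant name is hypothetical.
MAX_IDLE = timedelta(days=30)


def files_to_migrate(last_access: dict[str, datetime], now: datetime) -> list[str]:
    """Return, in sorted order, the files idle for longer than MAX_IDLE."""
    return sorted(
        name for name, accessed in last_access.items()
        if now - accessed > MAX_IDLE
    )


# Hypothetical sample data: file name -> time of last access.
sample = {
    "q1_report.pdf": datetime(2024, 1, 5),
    "live_orders.db": datetime(2024, 3, 1),
}
print(files_to_migrate(sample, now=datetime(2024, 3, 2)))
```

In this sample only `q1_report.pdf` has gone more than 30 days without access, so it alone would be a candidate for the cheaper tier.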
The file system runs on IBM System p and System x hardware, and is supported on AIX as well as some versions of Red Hat and SUSE Linux.