Thousands of users are deploying open source storage software in an effort to avoid pricey proprietary products such as array clustering and disk eraser applications and to get some long-term protection through the availability of source code.
Rafiu Fakunle, CEO of London-based open source vendor Xinit Systems, said that users have downloaded more than 38,000 copies of its Openfiler NAS and SAN software from SourceForge. And Zmanda - the company providing support for the open source backup software product Amanda - says that it supports 20,000 users worldwide.
Open source storage software is available to address a number of user needs. Amanda is a backup software product targeted at small and mid-sized businesses that allows the creation of a single master backup server to back up multiple hosts. DBAN (Darik's Boot and Nuke) allows users to securely wipe the hard drives of their computers.
Other open source storage software includes Lustre, OpenAFS and SAMBA, which are each network file systems used for different tasks. Lustre is used in large scale cluster computing while OpenAFS is deployed to create a single file space across all computers so that any computer can access a file on any other computer. SAMBA allows Linux servers to provide file and print services to Microsoft Windows clients.
Integrators like the Network Resource Group (NRG) say they can deliver substantial savings for their customers using open source storage software.
Terry Hull, a principal network engineer with NRG, recently put together a VLAN for a client using iSCSI and open source storage software. Hull says users can pay up to 50 times more by using an established brand.
However, other experts remain sceptical about the wisdom of implementing open source storage software products. Jacob Farmer, the CTO of Cambridge Computer, has some clients who implemented OpenAFS and Lustre in order to avoid the high cost of clustered file system software.
"Only those with highly skilled personnel were able to pull it off. The rest found that these products were too complex and had deceptively high costs of ownership," says Farmer.
Key questions for users
- What is open source storage software's value proposition?
- What products are available for their specific needs?
- How stable and scalable are the products?
- What risks do they present?
- Under what circumstances should end users consider open source?
- What level of user skill is required to implement and support them?
- What software support options are available?
The three primary value propositions for open source storage software are:
- Minimal or no upfront software costs
- Baseline features comparable to those of proprietary storage software products
- Availability of source code provides some level of long term protection
Open source storage software can be obtained in one of two ways: downloaded free from a website or purchased as a commercial edition. While the underlying source code should be the same in both instances, the purchased enterprise edition should have been fully tested and compiled.
For example, DBAN is an open source storage software product available in both free and commercial versions. DBAN meets the 5220.22-M standard of the Department of Defence (DoD) for data erasure by overwriting all disk locations three times.
However, David Ritchie, an IT manager with an Atlanta-based staffing firm, still finds DBAN is not quite ready. He encountered some quirks when trying to erase data on volumes on external storage. "The amount of storage it displays is different than what is presented by the external storage array and the program runs single-threaded so you need to be strategic in how you deploy it," he says.
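The idea of a three-pass overwrite can be illustrated with a minimal Python sketch. It is not DBAN's code: it works on an ordinary file rather than a raw disk, the function names are hypothetical, and the pass pattern (zeros, then ones, then random bytes) is an assumption based on how three-pass wipes are commonly described.

```python
import os

CHUNK = 1024 * 1024  # write in 1 MiB chunks to bound memory use


def overwrite_pass(path, size, make_chunk):
    """Overwrite the first `size` bytes of `path` with data from make_chunk(n)."""
    with open(path, "r+b") as f:
        remaining = size
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(make_chunk(n))
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push this pass to the device


def three_pass_wipe(path):
    """Overwrite a file in place three times (assumed pattern: 0x00, 0xFF, random)."""
    size = os.path.getsize(path)
    passes = [
        lambda n: b"\x00" * n,  # pass 1: all zeros
        lambda n: b"\xff" * n,  # pass 2: all ones
        os.urandom,             # pass 3: random bytes
    ]
    for make_chunk in passes:
        overwrite_pass(path, size, make_chunk)
    return size
```

Each pass runs one after another on a single target, which matches Ritchie's point about single-threaded operation: wiping several large volumes means planning how the runs are spread across machines.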
The AoE (ATA over Ethernet) protocol gives users a way to connect to external storage, comparable to Fibre Channel, using the common 1Gbit/s Ethernet protocol and network switches. As a registered IEEE protocol, AoE runs at a lower level in the Ethernet stack than TCP/IP, so it does not impact server performance in the same way that the iSCSI protocol does, yet it provides approximately the same level of performance as more expensive Fibre Channel SANs.
Coraid's CEO, Jim Kemp says, "On a 1 Gig Ethernet link, AoE can achieve 110MB of throughput without burdening the host processor."
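To put Kemp's figure in context, a quick calculation compares it with the raw capacity of the link. This sketch assumes "110MB" means 110 megabytes per second and uses decimal megabytes (1MB = 1,000,000 bytes); the function names are illustrative only.

```python
def line_rate_mb_per_s(gbit_per_s):
    """Raw link capacity in megabytes per second (1 MB = 10**6 bytes)."""
    return gbit_per_s * 1000 / 8


def utilisation(throughput_mb_per_s, gbit_per_s):
    """Fraction of the raw link capacity that a given throughput represents."""
    return throughput_mb_per_s / line_rate_mb_per_s(gbit_per_s)


raw = line_rate_mb_per_s(1)   # raw capacity of a 1Gbit/s link
share = utilisation(110, 1)   # Kemp's quoted figure as a share of that capacity
print(f"{raw:.0f} MB/s raw, {share:.0%} used")
```

Under those assumptions, the quoted throughput is close to the most a single 1Gbit/s link can carry once framing overhead is taken into account.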
However, AoE does have a number of downsides. First, while drivers are freely available for Linux, FreeBSD and Solaris, Windows users still need to purchase an AoE driver such as Rocket Division Software's Starport software.
Second, AoE is not a routable protocol so it cannot be used to access storage on other segments of the LAN.
Third, storage products that support this protocol are only available from a few vendors such as Coraid. Finally, AoE requires newer network switches that provide flow control to maximise throughput and limit network collisions.
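The reason AoE cannot be routed shows up in the layout of a frame: the AoE header follows the Ethernet header directly, with no IP header for a router to act on. The Python sketch below packs such a header. The field layout follows the published AoE specification as understood here (EtherType 0x88A2, then version/flags, error, major, minor, command and tag); the function name and example addresses are hypothetical.

```python
import struct

AOE_ETHERTYPE = 0x88A2  # EtherType registered for ATA over Ethernet


def build_aoe_header(dst_mac, src_mac, major, minor, command, tag,
                     version=1, flags=0):
    """Pack a 14-byte Ethernet header followed directly by the 10-byte AoE header."""
    ethernet = dst_mac + src_mac + struct.pack("!H", AOE_ETHERTYPE)
    ver_flags = (version << 4) | (flags & 0x0F)  # version in the high nibble
    # Fields: ver/flags (1), error (1), major (2), minor (1), command (1), tag (4)
    aoe = struct.pack("!BBHBBI", ver_flags, 0, major, minor, command, tag)
    return ethernet + aoe


# Example: a broadcast "query config" request (command 1) for shelf 0, slot 0.
frame = build_aoe_header(
    dst_mac=b"\xff" * 6,
    src_mac=bytes.fromhex("020000000001"),  # made-up locally administered address
    major=0, minor=0, command=1, tag=0x1234,
)
```

The only addresses in the frame are MAC addresses, so it can be delivered only within a single Ethernet broadcast domain.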
The availability and accessibility of the source code is also a major advantage of open source storage software, especially for organisations that archive data for long periods of time.
Charles Wegryzn is a developer for Retriever Technologies, which is working on open source content management and digital archiving software. Wegryzn says it used to be fairly typical for users to buy software from IBM and for IBM to include the source code with it.
"Then Microsoft came along and changed everything. With open source, we are going back to our roots of how computer software sales used to work."
Cambridge Computer's Farmer thinks archived data is the single largest value proposition for open source storage software. Users now have valid concerns about supporting proprietary data formats over the long term and about the vendors that provide those formats going out of business.
Farmer says, "With open source, at least you know you will have support in 25 years since you own the code."
Open Source's Hidden Costs
Despite the benefits open source storage software offers, users need to establish what its hidden costs are, experts say. The major factors that affect the total cost of ownership are:
- Product installation and configuration documentation
- Product support
- Breadth of product functionality
- Hardware and software interoperability
One hidden upfront cost with open source storage software is finding documentation and scripts that ease its installation and configuration. Coraid's Kemp says, "The open source community is rich in information but it is a scavenger hunt to find exactly what you need."
The costs of supporting open source storage software show up in different ways. Open source vendors are in general agreement that managing open source code and changes to it requires, as a rule of thumb, administrators with at least two to four years of experience.
"Users who like the idea of modifying open source code need to take a close look at the code to make certain that they can work with it and that it is within their skill set to modify," Cambridge Computer's Farmer says.
Integrators like Terry Hull of the Network Resource Group (NRG) also encounter other issues with product support.
"Getting to the root of a problem when you have open source layer upon open source layer is rarely easy and the thing we (NRG) know we are giving up with open source storage software is a significant margin of management," Hull says.
Another major concern for open source storage software is the depth of product functionality. Open source products like Amanda and OpenSMS, a policy-driven systems management storage software product, almost always have certain product restrictions. For example, Amanda will not back up Microsoft Windows hosts unless SAMBA, a file and print sharing utility, is first installed on the Windows host, and Amanda offers no media server option so all backups must go through a central server.
OpenSMS officially supports only Linux 2.4 and 2.6 running on an XFS file system, though its developers suggest it should work on other UNIX platforms and, with some porting, on JFS file systems. OpenSMS offers no integration with Microsoft Windows platforms.
The final major concern for enterprise shops is the lack of verifiable interoperability testing between the open source storage software and other hardware and software products in the user's environment.
NRG's Hull notes that while interoperability is not a major concern for over 90 percent of his installs, he still never discounts the possibility of having to troubleshoot interoperability issues. Cambridge Computer's CTO Farmer says, "Unless the software has comprehensive support services behind it such as Amanda does, one needs a really good reason to mess around with it since primary storage is such a vital piece of the IT infrastructure."
Next Steps with Open Source
Open source storage software is still largely a work in progress that requires users to have years of practical experience as well as the time to research and support the products. However, Cambridge Computer's Farmer offers this advice:
1. Find an open source product with a large community behind it and make sure that you have the skills and time to modify and manage the code.
2. Go with a low-cost product with an easy way to migrate your data out if need be.
3. Don't be afraid to pay an enormous premium for a big name vendor to avoid the risk.