It appears that the major SRM vendors are hoping that users have had a bout of amnesia when it comes to storage management.
A few years ago the industry attempted to lock customers into expensive, unwieldy framework products intended to manage the entire environment. Tivoli, Unicenter TNG and OpenView all promised much, including flexibility and an open architecture, but at a price. However, most organisations have managed with a collection of so-called “element managers”, and a few have managed to consolidate large chunks of operations into a consistent management platform. What's more, the cost of migrating or engineering solid operational routines and procedures was, and remains, significant.
Nevertheless, here we are again with SRM tools promising the ability to deliver capacity auditing, storage infrastructure management, chargeback, performance monitoring and even automated provisioning all under the umbrella title of the latest buzz-phrase “utility computing”.
There’s only one flaw in this approach, however. In order to deliver real savings to the business it has to work first time and all of the time. In this, the requirements are no different from those applied to any other product delivering the business processes that generate revenue and profit. Why should IT vendors be given (a lot of) additional leeway because they are delivering to the IT support function and not directly to the business?
Why should I care? It’s because I use SRM tools, for real, in the field with real customers and in live “messy” environments. And guess what? A lot of what is on offer doesn’t work reliably and in some circumstances doesn’t work at all.
How do I know this? My company offers capacity auditing as one of its services and we have relationships with most of the first division storage vendors in Europe. We have found that vendor claims for SRM tools are somewhat over-enthusiastic. My major criticism is that testing of SRM products does not, apparently, consider heavily loaded hosts; remote hosts via limited bandwidth links; firewalls; existing instances of system management agents on hosts; and requirements to re-start hosts after installation.
An SRM tool or framework will typically offer five major components: capacity audit and management, metering / chargeback, device management, performance analysis and storage provisioning. What the vendors tend to ignore is that for most organisations, the capacity auditing function is inevitably the first and possibly the only use for SRM tools. The vendors are trying to recoup their investment in the (flaky) advanced features where the customer (typically) does not need them.
Chargeback is a wonderful example of an SRM feature that does not acknowledge the magnitude of the task of putting it into production. Consider that most organisations pay a fixed contribution to IT provision. Backtracking to assign actual storage utilisation fees is going to be an uphill struggle for even the most persuasive IT Director.
It gets worse. In my experience, a lot of IT provision is project funded and, although it is not impossible, any attempt to get multiple projects to buy into a shared storage architecture with a chargeback scheme is, given the hurdles involved, all but doomed to failure.
Device Management is another example of an expensive but little-used option. Storage Area Networks tend to be highly static in terms of configuration, with the initial switch set-up, zoning and so on performed with the switch vendor’s specific application. I contend that the idea of continuous SAN device management within the SRM framework does not justify the cost over and above the native configuration tools, irrespective of whether these can “snap in” to the framework.
The majority of SANs are unlikely to contain so many devices that the ability to monitor and manage from a single SRM environment, as opposed to discrete configuration tools, will be a major factor in driving down the cost of operating the environment.
With the above points in mind, the only compelling cases that can cost-justify purchase of an SRM solution, at this stage in the maturity of the products on offer, are where:
System outages can genuinely be prevented by more accurately forecasting when disk capacity will be exhausted. (In which case, why am I paying those expensive system administrators?)
Here is where most IT Directors will baulk. Typically, the disorganised scramble to reorganise storage space that results from a key server falling over is “hidden” by unpaid overtime. What this means is that an expensive investment in SRM is required to overcome a nebulous problem that is not necessarily appearing on the balance sheet.
An expensive direct attached or storage array expansion can be deferred until at least the next financial year by identifying currently wasted on-line storage.
The definition of wasted storage is highly flexible. However, it is common in most organisations for everything ever produced by every employee to be saved somewhere, or more realistically in several places. Identifying aged, duplicate and inappropriate files across hundreds of servers is, of course, highly labour-intensive. So, using a capacity auditing tool to uncover information that can be permanently archived, or deleted, becomes a viable and potentially cost-saving measure where a storage upgrade is looming. This is most relevant in borderline cases where a storage array is already in place as part of a SAN and a capacity increase demands an additional drive module, or possibly an additional array, if capacity limits have been reached.
Device monitoring is sufficiently effective and robust that a service outage can be pre-empted.
Again this is a peripheral case where it must be assumed that a fault or configuration error within the SAN environment is degrading performance in such a way that the intermittency would only be more readily detected from a centrally managed console. This approach leads towards risk analysis where larger SAN environments increase the risk of failure, and fault prediction is dependent on accurate configuration of discrete SNMP alerts, for example, within each and every device.
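To make the second case above concrete, here is a minimal sketch of what the core of a capacity audit does on a single host: walk a directory tree, flag files not modified within a given period, and group files with identical content. It is not how any particular SRM product works. The function names `audit_tree` and `file_digest`, and the one-year default age threshold, are my own assumptions for illustration.

```python
import hashlib
import os
import time
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_tree(root, max_age_days=365, now=None):
    """Hypothetical audit helper: returns (aged_files, duplicate_groups).

    aged_files: paths last modified more than max_age_days ago.
    duplicate_groups: lists of paths whose contents are identical.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    aged = []
    by_size = defaultdict(list)

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # ignore symbolic links
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable
            if info.st_mtime < cutoff:
                aged.append(path)
            by_size[info.st_size].append(path)

    # Hash only files that share a size; a unique size cannot be a duplicate.
    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            try:
                by_digest[(size, file_digest(path))].append(path)
            except OSError:
                continue

    duplicates = sorted(sorted(group) for group in by_digest.values() if len(group) > 1)
    return sorted(aged), duplicates
```

Calling `audit_tree("/some/share", max_age_days=730)` would return the files untouched for two years and the sets of identical copies under that path. The hard part in the field is not this logic but running it across hundreds of loaded, remote and firewalled hosts, which is precisely where the tools struggle.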
The challenge associated with SRM is to take a cold hard look at what the operation of fragmented and “unmanaged” storage is costing the business, both in terms of administrative salaries and outages. We'll take a look at these in the next article.