This is the first part of a two-part article. The second part will be published tomorrow.

Q: I recently heard you claim that you could cut power/cooling costs in the data centre by 30 per cent by "simply changing the processes around how data is treated." Can you give me an example? -- P.R., Madison, Wisconsin, USA.

A: I love making claims like that -- and am amazed at how infrequently anyone ever calls me on them. Thankfully, I can justify that particular claim in spades. Madison is a lovely place, by the way.

Let's assume we are Joe IT Guy at XYZ company -- an upper-middle-market firm with a few thousand employees, a dozen sites, and all the problems folks like us deal with. We run our transactional production systems and our distributed Windows stuff. We have big SANs and file servers. We have stuff everywhere. We back things up; we do some disaster recovery. We are tragically overworked and chronically undervalued. We are Joe IT.

Let's pick one little area where process improvement can yield big results -- test and development. Everyone has T&D operations. Most operate in some version of the following:

  1. T&D is used to make sure internal and external software applications, new infrastructure and upgrades work the way they are supposed to before they are rolled out into production.
  2. In order to perform Step 1, T&D has to get real-life data that is complete and current from production systems into their own systems.
  3. In order to perform Step 2, T&D has to beg, borrow and steal -- and lives at the mercy of the production people. It takes time, planning and prayer. The application normally has to come down, and the database must be quiesced; the infrastructure specialists turn knobs and push buttons, and the data is then moved.
  4. Once Step 2 is complete, the actual testing can occur. Usually T&D will make additional copies of the data sets (which are probably already out of date by the time they are used) to test different things.
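
If you want to see that dance as code, a minimal sketch looks something like the following. Every function in it is a hypothetical stand-in for whatever scripts, storage tools and DBA rituals your shop actually uses; the point is the shape of the process, not the tooling.

    # A purely illustrative sketch of the refresh dance above. The helper
    # functions are hypothetical stand-ins, not real tools or APIs.

    def quiesce(db):
        print(f"Quiescing {db}: application comes down, writes are flushed")

    def resume(db):
        print(f"Resuming {db}: production breathes again")

    def full_copy(db, target):
        print(f"Moving a complete copy of {db} to {target}")
        return f"{target}:image-of-{db}"

    def clone(image, name):
        print(f"Cloning {image} as {name} (already going stale)")

    def refresh_test_and_dev(prod_db="orders_db", td_host="td-box-01", extra_copies=3):
        quiesce(prod_db)                          # Step 3: production takes the hit
        try:
            image = full_copy(prod_db, td_host)   # Step 2: complete, current data moved over
        finally:
            resume(prod_db)
        for i in range(1, extra_copies + 1):      # Step 4: more copies to test different things
            clone(image, f"td_copy_{i}")

    refresh_test_and_dev()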

Now, let's talk about the practical inefficiencies of what just happened. First, the production system runs some application(s), most likely on top of a database. We tend to find our production data really important, so normally we would have at least two copies of that data. We tend to keep our production data on very expensive, very big, very power-hungry infrastructure. We create the data on that infrastructure, and then we keep it there. We then make more copies of that data and tend to keep those there too. We might have two, three, four or more copies of the exact same data, at various points in time or at the same point in time. That means our cost of housing that data is two, three, four or more times the cost of housing a single instance -- and not just in capital, but in power, cooling, footprint and so on.

When we make our test copies, we also tend to create and keep those copies on our production systems. Sometimes we delete them when we are done, but sometimes we leave them around for a long time. Sometimes we forget about them altogether. Those copies take up more space, power and brains. They tend to be backed up along with everything else on that mission-critical production system. That means we might be backing up 23 copies of the exact same data in a single backup. If we do a full backup each week, we will create new backups of the exact same 23 copies of the exact same data each time. You see where I'm going with this?
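
To put rough numbers on it, here is a back-of-the-envelope calculation. Every figure in it -- database size, copy count, retention, cost per terabyte -- is an assumption plugged in purely for illustration; swap in your own and watch the multiplier do its work.

    # Back-of-the-envelope arithmetic for data duplication. All figures are
    # illustrative assumptions, not measurements from any real environment.

    db_size_tb = 0.5              # assume a 500 GB production database
    copies_on_prod = 23           # copies of the same data on the production system
    full_backups_retained = 12    # assume weekly fulls kept for 12 weeks
    cost_per_tb_year = 5000.0     # assumed fully loaded cost: capital, power, cooling, floor space

    primary_tb = db_size_tb * copies_on_prod          # what sits on primary storage
    backup_tb = primary_tb * full_backups_retained    # every weekly full backs up every copy again

    print(f"Primary storage holding duplicates:  {primary_tb:.1f} TB")
    print(f"Backup storage holding the same data: {backup_tb:.1f} TB")
    print(f"Cost of carrying one copy per year:   ${db_size_tb * cost_per_tb_year:,.0f}")
    print(f"Cost of carrying the sprawl per year: ${(primary_tb + backup_tb) * cost_per_tb_year:,.0f}")

With those made-up figures, one copy costs a couple of thousand dollars a year to carry and the sprawl costs the better part of a million -- for the exact same data.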

We also run test servers that sit around sucking juice whether they are used or not.

It gets worse. Not only is the process of getting the database copies difficult and expensive, but no one considers the security implications associated with having potentially hundreds of copies of real production data floating around -- in the production system, in the test systems, on the backup systems, at the DR site and on the tape at Iron Mountain. Backup is good, right?

Having two, 10 or 100 copies of the same non-changing data sitting on "production" systems isn't even the real problem. The downstream effects are the issue. Extra stuff on the production system slows down the production application, slows down the database and slows down the user. We tend to combat this by buying more hardware -- bigger, faster hardware that sucks more juice, takes up more room and causes more disruption. More data means that other processes suffer: Networks get clogged, so we need more bandwidth; backup servers get bogged down, so we need bigger machines; backup targets get full faster, etc. The rest of our processes don't know or care that it's the same data -- only that there's more. More causes problems; problems cost money.

The conundrum is that if you are the vendor of the production system, you kind of like it when Joe IT calls for more stuff. It's hard to tell Joe not to buy more and instead to just change some of the behaviour that causes the problems. If we were to try to really help Joe, however, we'd lay it out like this:

  1. Define the objectives for test and development -- in a perfect world
    1. Get a complete, accurate and timely copy of the exact production database
      1. Zero impact -- non-disruptive to production
      2. Non-disruptive to production IT
      3. Automated
  2. Put that copy somewhere else -- NOT on the production system
    1. Don't create work for production systems or people
    2. Do it dynamically
    3. Run virtual machines everywhere you can
  3. Create a protection policy for the T&D data
    1. Do we back it up?
    2. If so, when, why and how often?
  4. Create a security policy for the T&D data
    1. Protect the assets as if they were still in production
      1. This assumes you are not TJX or TSA
    2. Enforce disposition/destruction
  5. Define a data repurpose policy
    1. Who else could use a copy of this data?
      1. Should we use this copy as a backup copy?
      2. Should we replicate this copy as a DR copy?
    2. Are there other applications that could use this?
      1. Data warehouse
      2. Business intelligence
      3. Those guys in marketing
      4. Business partners

Steve Duplessie founded Enterprise Strategy Group in 1999 and has become one of the most recognised voices in the IT world. He is a regularly featured speaker at shows such as Storage Networking World, where he takes on what's good, bad -- and more importantly -- what's next. For more of Steve's insights, read his blogs.
