Businesses periodically assess their own efficiency by measuring, and then re-measuring, what they do. Intel will have done this in a major way over the past few weeks to arrive at its decision to cut 10,500 positions. Sun has been through this exercise and so, recently, has HP.

It's comparatively easy to assess the efficiency of a manufacturing or assembly operation. You need 200 B widgets to make component X, and B widget deliveries take 1 hour. You use 50 B widgets an hour, so when you get down to 50 B widgets in stock, enough to cover the hour a delivery takes, you re-order. It's the same with preparing accounts. You need the accounts receivable, the accounts payable, and so on.
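The widget arithmetic above can be written down in a few lines. This is only a sketch of the usual re-order point rule (usage rate multiplied by delivery lead time); the function names and the optional safety_stock parameter are my own illustrative assumptions, not something taken from a real inventory system.

```python
def reorder_point(usage_per_hour: float, lead_time_hours: float,
                  safety_stock: float = 0.0) -> float:
    """Stock level at which a new order should be placed.

    Stock consumed while waiting for a delivery is usage * lead time.
    safety_stock is an assumed optional buffer (zero in the example above).
    """
    return usage_per_hour * lead_time_hours + safety_stock


def should_reorder(stock_on_hand: float, usage_per_hour: float,
                   lead_time_hours: float) -> bool:
    """True once stock has fallen to the re-order point or below."""
    return stock_on_hand <= reorder_point(usage_per_hour, lead_time_hours)


# The figures from the text: 50 B widgets used per hour, 1 hour delivery time.
threshold = reorder_point(usage_per_hour=50, lead_time_hours=1)
print(threshold)                    # 50
print(should_reorder(200, 50, 1))   # False: plenty of B widgets in stock
print(should_reorder(50, 50, 1))    # True: time to re-order
```

The point is that structured, operational information like this reduces to a formula whose inputs you can measure.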

A lot of the information held in businesses is not this kind of operational or structured detail, though. It sits one level, or several levels, beyond that. It is what people think about the base data. It is the result of data mining operations. It is a presentation to a customer. It is a set of development project e-mails. There is structure to it, but not the kind of structure that is used by an assembly-type operation like logistics, manufacturing or accounts.

You can probably predict fairly accurately the growth of structured information needs. But unstructured information is different. We are probably at the stage of the game where a business cannot assess whether it really needs all the unstructured information it holds. So much of it is disconnected from reality. It's plans, projects, suggestions, views of how things are, assertions about situations. How much of this stuff do you need? What is the gold? What is the dross?

It seems to me that the growth in unstructured information is uncontrollable and that one of the most valuable storage technologies currently being developed is de-duplication. After all, if we can cut its volume by a ratio of 25 to 1 then that is worth doing. Pay attention to Diligent, Avamar and other companies in this field. I get the feeling that their products are going to become really important.
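To make the idea concrete, here is a minimal sketch of one common approach to de-duplication: split the data into fixed-size chunks, fingerprint each chunk with a hash, and store each distinct chunk only once. This is an illustration under my own assumptions (fixed 4096-byte chunks, SHA-256 fingerprints, hypothetical function names); it is not a description of how Diligent's or Avamar's products work.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep one copy of each distinct chunk.

    Returns (store, recipe): store maps a chunk's SHA-256 digest to its bytes,
    and recipe is the ordered list of digests needed to rebuild the data.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # only the first copy is stored
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


def dedup_ratio(data: bytes, store) -> float:
    """Original size divided by the size of the unique chunks kept."""
    stored = sum(len(chunk) for chunk in store.values())
    return len(data) / stored if stored else 1.0


# The same 4096-byte attachment saved 25 times, e.g. one presentation
# copied into 25 mailboxes.
attachment = bytes(range(256)) * 16
data = attachment * 25
store, recipe = deduplicate(data)
print(len(store))                     # 1 unique chunk kept
print(dedup_ratio(data, store))       # 25.0
print(restore(store, recipe) == data) # True
```

Real data is rarely this tidy, and the ratio ignores the space taken by the recipe itself, so the figure you get depends entirely on how much of what you hold is a repeat of something else.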

