A European rule on retaining electronic communications data could also boost the use of on-line archives for business intelligence (BI), said Hitachi Data Systems, as it announced a hook-up with software developer FileTek on telco data storage.
The EU Data Retention Directive (2006/24/EC) was pushed through last March, and EU member states must implement it in national legislation by September this year. It requires service providers to securely store call records and web click information for between six months and two years, and to make those records available to police and security services.
The problem for service providers, said HDS solution director Lynn Collier, is that while they already collect call data for billing, trending and other uses, it will probably not be in a system that complies with the new rules, so it will need to be extracted for archiving and then managed subsequently.
In addition, the DRD requires more than just who you called, when and for how long - it also specifies the type of communication (eg voice or data), the type of communication device used, and if it was a mobile device, its location. However, it does not cover message content.
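As a rough illustration of what such a record might hold, the sketch below models the fields listed above as a Python data class. The class name, field names and sample values are hypothetical and are not taken from the directive's text or from any HDS or FileTek product. The only rules it encodes are the ones described here: a mobile device needs a location, and there is no field for message content.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple


@dataclass(frozen=True)
class RetainedCallRecord:
    """Hypothetical metadata-only record; message content is deliberately absent."""

    caller: str                  # who placed the call
    callee: str                  # who was called
    started_at: datetime         # when the communication began
    duration: timedelta          # how long it lasted
    communication_type: str      # e.g. "voice" or "data"
    device_type: str             # e.g. "mobile" or "fixed"
    location: Optional[Tuple[float, float]] = None  # (lat, lon), mobile only

    def __post_init__(self) -> None:
        # Location is only required when the device is mobile.
        if self.device_type == "mobile" and self.location is None:
            raise ValueError("a mobile record needs a location")


# Illustrative values only.
record = RetainedCallRecord(
    caller="+441632960001",
    callee="+441632960002",
    started_at=datetime(2007, 6, 1, 9, 30),
    duration=timedelta(minutes=4),
    communication_type="voice",
    device_type="mobile",
    location=(51.5074, -0.1278),
)
```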
The answer, claims HDS, is to combine its Content Archive Platform (CAP), which derives from its acquisition of Archivas earlier this year, with FileTek's StorHouse software. The latter provides the intelligence to extract the data needed from the service provider's existing systems, while the CAP adds digital fingerprinting to ensure data integrity, and then stores the information on disk in a read-only form.
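HDS does not say how the CAP's digital fingerprinting works, but the general idea can be sketched with a cryptographic hash: compute a digest when a record is archived, then recompute it later to confirm the record has not changed. The example below is a minimal sketch under assumptions of its own. SHA-256, the JSON encoding and the function names `fingerprint` and `verify` are illustrative choices, not a description of the product.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Return a SHA-256 hex digest of a canonical JSON encoding of the record."""
    # Sorting keys makes the encoding independent of dictionary ordering.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(record: dict, stored_fingerprint: str) -> bool:
    """Recompute the fingerprint and compare it with the one stored at archive time."""
    return fingerprint(record) == stored_fingerprint


# Hypothetical call record, fingerprinted when it enters the archive.
original = {"caller": "+441632960001", "callee": "+441632960002", "seconds": 240}
stored = fingerprint(original)

# Any later change to the record yields a different fingerprint.
tampered = dict(original, seconds=10)

print(verify(original, stored))
print(verify(tampered, stored))
```

The first check passes and the second fails, which is the property an archive needs if its records are to stand up as evidence.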
Collier added that there has been relatively little industry resistance to the Data Retention Directive, despite the potential for misuse of such an archive, the protests of groups such as Privacy International and, of course, the question of whether such an archive will genuinely add anything to the fight against serious crime.
"I think that given the backdrop - the Madrid and London bombings - we haven't heard that much push-back from our customers, whereas I think that if content were involved it would be very different," she said.
She estimated that, without message content, a call record would probably be around 500 bytes. At 10 million calls a day, that's just under 2TB a year. According to HDS, CAP can support up to 20PB in an 80-node archive system.
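The arithmetic behind those figures is easy to check. The sketch below uses only the numbers quoted here, and assumes decimal units (1TB = 10^12 bytes), a 365-day year and an even spread of capacity across the nodes. The yearly total comes out at about 1.8TB, consistent with the figure of about 2TB.

```python
BYTES_PER_RECORD = 500        # Collier's estimate, without message content
CALLS_PER_DAY = 10_000_000
DAYS_PER_YEAR = 365
TB = 10**12                   # decimal terabyte (assumption)
PB = 10**15                   # decimal petabyte (assumption)

bytes_per_day = BYTES_PER_RECORD * CALLS_PER_DAY
tb_per_year = bytes_per_day * DAYS_PER_YEAR / TB

# Data held at any one time under a two-year retention window.
tb_retained_two_years = 2 * tb_per_year

# Quoted ceiling of 20PB across 80 nodes, assumed to be spread evenly.
tb_per_node = 20 * PB / 80 / TB

print(f"Per day:   {bytes_per_day / 10**9:.1f} GB")
print(f"Per year:  {tb_per_year:.3f} TB")
print(f"Two years: {tb_retained_two_years:.2f} TB")
print(f"Per node:  {tb_per_node:.0f} TB")
```

On these assumptions, call records alone would occupy only a small fraction of the quoted capacity, which fits Collier's argument that the same archive can take on other workloads.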
Given the costs involved in building and maintaining a DRD archive, Collier argued that the important question for service providers is how to turn necessity into business advantage.
"The real beauty is when you move from the tactical aspect of DRD to using the same asset for other things," she said.
"The task for the service provider is to automate data retention and do it cost effectively, but they can also use CAP-StorHouse as a multi-purpose archive, for example to archive customer data for BI analysis, or email for regulatory compliance. It can all go in the same archive, so it is not 'yet another island of data.'"
She claimed that, unlike content-addressable storage (CAS) systems such as EMC's Centera, data stored on a CAP keeps its original format with no 'wrappers' or proprietary access layers. As a result it is accessible to any analysis tool, from a high-end BI system to Microsoft Excel.
"We call it active archiving - it is the next generation on from content addressable storage," she said.