Too many people still think that archiving is the same as taking backups or implementing hierarchical storage management (HSM), warned Atempo as it announced the latest version of its digital archive software.

It claimed that the new software, called Atempo Digital Archive (ADA) version 2, makes archiving more efficient and effective by including data de-duplication, full content indexing, and a survey tool that can analyse an organisation's storage to see how much could and should be archived off.

"Backup is about restores, whether it's a whole system or just a file, but there's people out there who think they have an archiving strategy because they have backups," said Karim Toubba, the Paris-based company's marketing VP.

"Archiving is very different, it is much more specific, it is retrieval not restoration, and it's search. Backup is days or months, archiving is longer - it's all to do with the long-term preservation of information."

Effective archiving can also greatly reduce an organisation's backup load, he said, because once older data has been archived, it no longer features in the regular backup cycle.

One ADA user agreed: "Since implementing it, we have freed up almost 30 percent of disk space on our server by migrating inactive data to tape," said Alain Lacave, IT manager at SNCF Bordeaux, part of the organisation responsible for operating almost all of the French railway system.

"Because we no longer need to back up this data, not only have we shortened our backup windows, but we are also saving on media costs."

Atempo's Toubba said that, like most archiving systems, ADA 2.0 is based on HSM techniques, with an agent on the server moving information from one tier of storage to another, based on policies such as when it was last accessed.
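
To make the policy idea concrete, here is a minimal Python sketch of an agent selecting files by last-access time. It is an illustration only, not Atempo's implementation: the function names `find_archive_candidates` and `summarise` and the `max_idle_days` threshold are hypothetical, and a real agent would go on to move the selected files to another storage tier.

```python
import os
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_archive_candidates(root, max_idle_days, now=None):
    """Return (path, size) pairs for files not accessed within max_idle_days.

    Hypothetical policy: a file qualifies when its last-access time
    (st_atime) is older than the cutoff. Some filesystems are mounted
    without access-time updates, so this is an assumption, not a given.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    candidates = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            info = path.stat()
            if info.st_atime < cutoff:
                candidates.append((path, info.st_size))
    return candidates


def summarise(candidates):
    """Return (file_count, total_bytes) for a list of candidates."""
    return len(candidates), sum(size for _, size in candidates)
```

Run in report-only mode like this, the same scan also gives a rough estimate of how much of a volume could be archived before anything is moved.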

However, he claimed that the addition of de-duplication and indexing makes it different, as does the option for user-initiated archiving. The latter means you could archive all the files relating to a specific case or project as soon as it is over, say, rather than waiting for them to time out.
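
As a rough illustration of de-duplication, the Python sketch below stores each distinct file content once, keyed by its SHA-256 digest. The article does not say how ADA de-duplicates, so this whole-file approach and the names `deduplicate` and `space_saved` are assumptions made for the example.

```python
import hashlib


def deduplicate(files):
    """Store each distinct content once.

    files: mapping of file name -> bytes.
    Returns (store, index), where store maps digest -> content and
    index maps file name -> digest.
    """
    store = {}
    index = {}
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)  # keep only the first copy
        index[name] = digest
    return store, index


def space_saved(files, store):
    """Return the fraction of bytes saved by storing unique content only."""
    original = sum(len(data) for data in files.values())
    stored = sum(len(data) for data in store.values())
    return 1 - stored / original if original else 0.0
```

The saving depends entirely on how much of the archived data is duplicated: two identical 100-byte files plus one unique 50-byte file occupy 150 bytes instead of 250.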

"What's important is reducing your cost base. De-duplication is important because a reduction of up to 80 percent can't be ignored," he said.

He added that the next big challenges for archiving developers are likely to be file formats and rich content.

Formats are an issue because if you are storing data for the long term, you need to know it will still be accessible years in the future - perhaps when you no longer have the version of the application that created it.

"We are changing content types at a furious rate, it's richer data too," he said. "So we are looking at conversion techniques, for example so we archive in an open format. We are saying is there a uniform format we can use? XML is a strong possibility - though not for rich content - because it is extensible."

He added, "Rich content is much harder to search, there is some interesting technology coming up there but it's nascent."

Toubba noted that ADA 2.0 will be generally available later this month for Windows, Mac, Linux and Unix.