People don't understand the difference between archiving and backup, causing major problems in data management, EMC has warned.
Jon Murray, regional program manager at EMC, discussed the issue at a recent conference. "You’ll be surprised at how many people seem to think that these two are interchangeable," he told delegates.
While backup and archiving hail from the same family tree, the relationship is not a close one, he said. Although both can be used to retrieve data that is no longer available in the production environment, the differences are substantial.
Murray explained that a backup is used to make a secondary copy of information, acting as a redundant set of data for recovery operations in case the original is deleted or damaged. It enhances availability and plays a huge role in business continuity.
"Backup is typically a short-term process of a few months, and the data is usually overwritten on a periodic basis to keep it current. It is generally not used for regulatory compliance," he said. On the other hand, archiving involves the primary copy of information that is no longer needed on a day-to-day basis.
"The data is moved off the high availability, high performance operational environment, to something that is low costing," Murray said. There, the data waits for months or years until it is retrieved for analysis, value generation, or compliance with government or court-ordered directives.
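As a rough illustration of the archiving idea described above, the following Python sketch picks out items that have not been touched for a given number of days and could therefore be candidates for cheaper storage. The function name `select_for_archive`, the 180-day threshold, and the sample file names are all hypothetical; the article does not say how an organisation should decide what is "no longer needed on a day-to-day basis".

```python
from datetime import datetime, timedelta
from typing import Dict, List


def select_for_archive(
    last_used: Dict[str, datetime],
    now: datetime,
    max_idle_days: int = 180,  # hypothetical threshold, not from the article
) -> List[str]:
    """Return the names of items idle for longer than max_idle_days.

    These are only candidates for archiving; a real policy would also
    consider compliance and business rules.
    """
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(name for name, used in last_used.items() if used < cutoff)


# Illustrative data only.
now = datetime(2024, 1, 1)
sample = {
    "q1_report.xlsx": datetime(2022, 3, 31),
    "orders_live.db": datetime(2023, 12, 30),
    "old_campaign.zip": datetime(2021, 6, 15),
}
candidates = select_for_archive(sample, now)
print(candidates)
```

Anything the function returns would be moved, not copied, to the low-cost tier, which is what separates this from a backup.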
Backup/recovery and archiving/retrieval are distinct but essential components of a well-prepared organisation’s information lifecycle strategy. Both provide tangible benefits, especially when integrated to provide better performance, higher availability, better service levels, sharper best practices, and enhanced cost savings.
The better an organisation can identify what to archive, the less redundant data it has to back up repeatedly. For example, if a company determines that 50 percent of its data is no longer being used operationally, it can archive that data. In effect, the company would halve the volume it backs up and shorten its backup time by roughly 50 percent. A company that can identify which data is most important to back up also gains control over information growth.
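The 50 percent example can be worked through as a small Python sketch. It assumes, as a simplification, that backup time scales linearly with the volume of data backed up; the function name `backup_after_archiving` and the figures of 100 TB and 2 hours per TB are hypothetical and serve only to illustrate the arithmetic.

```python
from typing import Tuple


def backup_after_archiving(
    total_tb: float, archived_fraction: float, hours_per_tb: float
) -> Tuple[float, float]:
    """Return (remaining data in TB, estimated backup time in hours).

    Assumes backup time is proportional to data volume, which is a
    simplification of how real backup windows behave.
    """
    if not 0.0 <= archived_fraction <= 1.0:
        raise ValueError("archived_fraction must be between 0 and 1")
    remaining_tb = total_tb * (1.0 - archived_fraction)
    return remaining_tb, remaining_tb * hours_per_tb


# Hypothetical figures: 100 TB of data, 2 hours of backup time per TB.
before_tb, before_hours = backup_after_archiving(100.0, 0.0, 2.0)
after_tb, after_hours = backup_after_archiving(100.0, 0.5, 2.0)
print(before_tb, before_hours)
print(after_tb, after_hours)
```

Under these assumptions, archiving half the data halves both the volume backed up and the estimated backup window.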