Bernard Shen, a technology consultant at aerospace company BAE Systems North America Inc., went from shipping 1,200 data archive tapes to an off-site storage facility every 90 days to sending just 200 over the same time period. He did it by storing only incremental changes to his company's data.
Shen says the business case for incremental data backup was a no-brainer, but selling the idea to his IT team wasn't easy. "There is a tremendous amount of skepticism around it," he says.
Rockville, Md.-based BAE saves incremental changes across its 25TB storage-area network (SAN). Those data slices can be combined with previous full-server backups to create what's known as a synthetic backup, from which a systems administrator can restore a file or application if it becomes corrupted or data is lost.
Although Shen has replaced full backups on his file servers with synthetic backups, he still performs full backups on his database servers because the technology he uses doesn't support block-level changes.
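To illustrate how incremental slices combine with an earlier full backup, here is a minimal sketch in Python. It is not how any vendor's product is implemented. The file names, the `synthesize_full` function and the convention that `None` marks a deleted file are all assumptions made for the example.

```python
from typing import Dict, List, Optional

Snapshot = Dict[str, bytes]                # path -> file contents
Incremental = Dict[str, Optional[bytes]]   # None marks a deleted file (assumed convention)


def synthesize_full(base: Snapshot, incrementals: List[Incremental]) -> Snapshot:
    """Merge a full backup with later incrementals, oldest first."""
    synthetic = dict(base)  # copy so the original full backup is untouched
    for delta in incrementals:
        for path, content in delta.items():
            if content is None:
                synthetic.pop(path, None)   # file was deleted since the last backup
            else:
                synthetic[path] = content   # file was added or changed
    return synthetic


# Hypothetical data: one full backup followed by two daily incrementals.
full = {"a.txt": b"v1", "b.txt": b"v1", "c.txt": b"v1"}
monday = {"a.txt": b"v2"}
tuesday = {"b.txt": None, "d.txt": b"v1"}

synthetic = synthesize_full(full, [monday, tuesday])
print(sorted(synthetic))  # ['a.txt', 'c.txt', 'd.txt']
```

The merge runs entirely against stored backup data, which is why a synthetic full can be built without touching the production server.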
Adoption of synthetic backup is increasing rapidly because it saves systems administrators from having to shut down application servers to perform full backups and it reduces the amount of data backed up to disk and archive tapes. The changed data that's saved amounts to less than 10% of all data, according to analysts.
Relatively inexpensive disk-to-disk or disk-to-disk-to-tape architectures are rapidly gaining acceptance by IT organizations facing shrinking backup windows, according to Mike Kahn, an analyst at The Clipper Group in Wellesley, Mass.
The pain points users are addressing with synthetic backups are their recovery point objective (RPO), their recovery time objective (RTO) and the ever-growing backup window, analysts say. RTO speaks to how long it takes an organization to be up and running after a disaster or data loss, and RPO refers to how old the recovered data will be.
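The two measures can be made concrete with a small Python sketch. The timestamps and objectives below are invented for illustration, and the function names are assumptions rather than terms from any backup product.

```python
from datetime import datetime, timedelta


def recovery_point_age(last_backup: datetime, failure: datetime) -> timedelta:
    """How old the recovered data is: the gap the RPO puts a limit on."""
    return failure - last_backup


def recovery_time(failure: datetime, restored: datetime) -> timedelta:
    """How long the outage lasted: the gap the RTO puts a limit on."""
    return restored - failure


# Hypothetical incident timeline.
last_backup = datetime(2005, 1, 10, 2, 0)
failure = datetime(2005, 1, 10, 14, 30)
restored = datetime(2005, 1, 10, 18, 30)

data_age = recovery_point_age(last_backup, failure)
downtime = recovery_time(failure, restored)

# Hypothetical objectives: lose at most 24 hours of data, be back within 8 hours.
rpo = timedelta(hours=24)
rto = timedelta(hours=8)

print(data_age, data_age <= rpo)   # 12:30:00 True
print(downtime, downtime <= rto)   # 4:00:00 True
```

More frequent incrementals shorten the first gap, and keeping backups on disk rather than tape shortens the second.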
No server shutdown
Ghyslain Boisvert, executive director of the high-performance computing laboratory at the University of Montreal, uses Time Navigator Server from Atempo Inc. in Palo Alto, Calif. With it, he creates a full backup of research data off-line from incremental backups made to tape libraries from Storage Technology Corp. and Dell Inc.
Boisvert says it still takes him seven hours to perform a synthetic full backup from incrementals for about 660GB of daily data -- not much shorter than the 10 to 12 hours standard full backups take. But the synthetic backup has no effect on his production servers because they no longer need to be shut down as they did during backup to a tape library.
By keeping incremental backups on intermediate disk arrays, also known as virtual tape libraries (VTL) or secondary storage, IT shops can determine the amount of time it will take them to restore a file or block of data without having to go through the lengthy process of finding data on an archived tape. All data written to disk is online and quickly accessible through indexes.
Shen, whose company has 25,000 employees in 30 states, currently performs one incremental backup a day across 33 application servers using an appliance from StorServer Inc. in Colorado Springs. By midyear, he hopes to have 100 out of 170 servers running on it. "The beauty of this technology is that you're doing disk-to-disk backup, and then in the morning, the backup goes from disk to tape," Shen says. "The disk-to-disk backup can run 15 [server] backups at a time."
Bill North, an analyst at IDC in Framingham, Mass., says synthetic backup technology has been around for several years, but its adoption rate is difficult to track because the functionality is usually presented as a tool in many mainstream backup products.
Based on anecdotal evidence, North says, user angst over losing data by not performing full backups has kept most companies from implementing it. "My suspicion is that serious data protectors will still do full backups, but you can certainly reduce the frequency of doing that, making the backup window look a lot more cheerful," he says.
Synthetic backup is offered in products from Veritas Software Corp., IBM's Tivoli Software division and CommVault Systems Inc., as well as from several start-up vendors, such as Sepaton Inc. in Marlboro, Mass.
"While it depends on the product, I think that most out there today are pretty reliable. It's just one more layer of abstraction, and the advantages for the user are significant," North says.
Full backups broken
Kahn says synthetic backups are bound to take hold because the very concept behind full backups is broken. The time that it takes to perform full backups is steadily growing out of control.
Michael Passe, a senior storage architect at Beth Israel Hospital in Boston, says his weekly backups now eat all but four hours of every weekend. "My production-control folks do a lot of backup management," he says.
Passe's storage architecture consists of a 50TB SAN made up entirely of EMC Corp. storage arrays. He recently moved to Veritas NetBackup 5.0, which has a synthetic backup tool, and Passe says he's interested in its potential to reduce his backup window.
Like most users who purchase a backup product with a synthetic feature, the University of Montreal's Boisvert bought Time Navigator Server for full backups, but he liked the idea of synthetic backup so much that he decided to try it. Now, instead of shutting down production servers for 12 hours every week, Boisvert performs a standard full backup only once a year. He says he performs incremental daily backups and a synthetic full backup every three months "to make sure restores are easier to do."
"Usually files that have just been migrated into our tape archive are still on disk as well. That prevents us from going back to tape all the time for restores," Boisvert says.
Are synthetics reliable?
Not everyone is convinced that synthetic backups are reliable. Ray Sears, a senior storage architect at Affiliated Computer Services Inc., a $4.1 billion business process and IT outsourcer in Dallas, likes the concept behind synthetic backups but believes it's not fully baked yet.
Sears uses NetBackup and purchased a disk-based Pathlight VX VTL last year from Advanced Digital Information Corp. in Redmond, Wash., to reduce the backup window for about 600 Sun, Hewlett-Packard and IBM AIX servers on his 20TB SAN. The disk-to-disk VTL technology shrank the backup window from 24 hours to less than six.
The next logical step would be for Sears to begin using Veritas' synthetic backup element. He says that when he deployed Pathlight, synthetic backup "wasn't as bulletproof as I would have liked it to have been," and it took him longer to perform a full data restore than it did to perform a full backup.
"It's manageability issues more than anything else," Sears says. "I think it's still a really immature technology for a lot of larger companies, who've got HIPAA requirements, FDA requirements and the new Sarbanes-Oxley requirements, to say that we're going to go away from full backups."
Bells and whistles
According to analysts, only about 3 percent to 5 percent of data changes in any given file system in a given week. Therefore, if you're using synthetic backup to protect a 200GB file system, you're actually backing up only 6GB to 10GB of data, not all 200GB.
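The arithmetic behind that estimate can be checked with a short Python sketch. The function name is an assumption made for the example. The 3 percent and 5 percent change rates and the 200GB file system are the figures cited above.

```python
def weekly_changed_gb(total_gb: float, change_percent: float) -> float:
    """Amount of data (in GB) that changes in a week at the given rate."""
    return total_gb * change_percent / 100


file_system_gb = 200
low = weekly_changed_gb(file_system_gb, 3)
high = weekly_changed_gb(file_system_gb, 5)

print(f"{low:.0f}GB to {high:.0f}GB of {file_system_gb}GB")  # 6GB to 10GB of 200GB
```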
Shen says that while the day-to-day technical management of an incremental environment is more complicated -- "the software is a little more sophisticated, and it has more bells and whistles" -- it's well worth the added education required for his staff.
Shen says his StorServer installation cost $10,000 and was installed in two days, and the first 15 servers were backing up to it on the third day.
The big return on investment came from having to spend only about $150,000 on back-end storage, as opposed to the $450,000 Shen would have spent had he sized his environment for full backups.
"That's because your system does not have to be sized to capture the larger volumes of data," he says. "In my case, we can really benefit from the incremental backup because we have millions of files and only a small percentage of them change every day."