Nobody likes to tie up resources with a long-running backup. So how do you shrink your backup window? There are many approaches, but in SAN environments, sometimes the best solution is to back up directly from the shared storage rather than from the servers themselves. These so-called "off-host" backups have their own challenges, however, so you need to weigh your options carefully.
Worst-case backup scenarios usually involve a server that maintains a huge amount of data - or a server that squirrels away data in hundreds of thousands of tiny little files. In either case, backing up over the network using traditional backup agents can result in incredibly long backup windows - often the better part of a day for a single server.
The usual culprits include poor network performance, poor source disk performance, poor backup device performance, and/or the overhead of cataloging the files being backed up.
With direct-attached storage, you have few options to shrink your backup window. If you have so much data to move that a single gigabit Ethernet link chokes - or, more often, if the performance of the source server simply isn't up to snuff - good luck.
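To get a feel for when a single gigabit link becomes the limit, a rough back-of-the-envelope calculation helps. The sketch below is only an estimate: the function name `transfer_hours`, the 80 percent link efficiency, and the 2TB example size are assumptions chosen for illustration, not figures from any particular environment.

```python
def transfer_hours(data_gb: float, link_gbps: float = 1.0, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to move data_gb gigabytes over one network link.

    efficiency is an assumed fraction of the raw line rate that the backup
    actually achieves after protocol and agent overhead.
    """
    usable_bytes_per_second = link_gbps * 1e9 / 8 * efficiency
    return data_gb * 1e9 / usable_bytes_per_second / 3600


# Assumed example: 2TB of data over a single gigabit Ethernet link.
print(round(transfer_hours(2000), 1))  # 5.6
```

Even under these fairly generous assumptions, the network alone eats more than five hours, before the source server's disks or the catalog overhead enter the picture.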
In SAN-attached environments, you can opt for off-host backups, where you present the source server's storage volumes on the SAN directly to the backup host. The backup host can then draw the data from the SAN without loading the source server at all. Better, if you have enough spare SAN performance headroom, you can freely back up throughout the day without causing problems, as the source server chugs along happily unaware that a backup is taking place.
Off-host backups vary depending on whether virtualization is in place. In non-virtualized environments, the backup software generally contacts the source server and asks it to create a snapshot of the disks you want to back up.
The source server then uses software provided by the SAN manufacturer to request that a snapshot be created on the SAN. The backup server mounts the snapshot and performs the backup from that. Without these steps, the backup wouldn't be consistent, because it would be performed from a volume that was actively changing rather than a point-in-time copy.
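Why the snapshot matters is easier to see in a toy model. The sketch below is not how any SAN or backup product works internally; it simply treats a volume as a Python dictionary of two blocks that the application always updates together, and compares copying the live volume against copying a point-in-time copy. All names (`copy_blocks`, `make_writer`, and so on) are hypothetical.

```python
def copy_blocks(source, block_ids, after_each=None):
    """Copy blocks one at a time, optionally letting the application run in between."""
    copied = {}
    for block_id in block_ids:
        copied[block_id] = source[block_id]
        if after_each is not None:
            after_each()  # simulates the application writing mid-backup
    return copied


def make_volume():
    # Two blocks that must always agree, such as a record and its index entry.
    return {0: "v1", 1: "v1"}


def make_writer(volume):
    def write():
        volume[0] = "v2"
        volume[1] = "v2"
    return write


# Backing up the live volume: the application writes while blocks are being copied.
live = make_volume()
live_backup = copy_blocks(live, [0, 1], after_each=make_writer(live))

# Backing up a snapshot: the copy is frozen first, so later writes never reach it.
active = make_volume()
snapshot = dict(active)
snapshot_backup = copy_blocks(snapshot, [0, 1], after_each=make_writer(active))

print(live_backup)      # {0: 'v1', 1: 'v2'}
print(snapshot_backup)  # {0: 'v1', 1: 'v1'}
```

The live backup ends up with one old block and one new block, a state the application never actually wrote. The snapshot backup is stale by the time it finishes, but it is internally consistent, which is what a restore needs.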
In virtualized environments, specifically those using VMware's ESX hypervisor, the process works differently. Instead of asking the source server to initiate a SAN snapshot, the backup software uses VMware Consolidated Backup (VCB) to create a snapshot of the virtual machine at the hypervisor level (the SAN doesn't need to do anything itself to make this happen). The backup server can then back up a consistent image of the virtual machine directly off the SAN without taxing the virtual machine, the virtual host, or the network.
These types of backups are generally very fast - well in excess of the performance you'd get using a traditional backup agent. They also give your backup hardware, whether tape or a VTL, a much better chance of living up to its full performance potential (modern tape drives usually suffer from backup sources that aren't fast enough to feed them at full speed).
Be forewarned, though, that off-host backups can get pretty complicated. In the non-virtualized scenario, your backup software needs to support the SAN you're using, so it can command the SAN to take a snapshot. You also need to be very careful about how you configure the backup host, because it could potentially have low-level visibility into a large number of different SAN volumes that are actively being used by other servers - and a misconfigured backup host can cause significant data corruption on those volumes.
Any time you take a snapshot, on your SAN or in your virtualization hypervisor, you also run the risk of causing problems within the source server's operating system. In Microsoft Windows Server environments, most snapshot methodologies rely on Microsoft's Volume Shadow Copy Service (VSS) to quiesce the disks before a snapshot is created. This can sometimes be a source of irritating, hard-to-solve problems.
Also note that off-host backups won't solve every backup performance problem. One example that stands out in my mind is that of a receipt archiving system at a financial institution.
The system handled printing and archiving for literally every receipt the institution printed, each stored as a black-and-white TIFF image file, usually 1 or 2 kilobytes in size. And there were lots of them. Millions of them. Backing up that server would easily take longer than 24 hours - all for less than 500GB of data. Network performance wasn't the problem. Neither was the storage that the data lived on - a relatively high-performance SAN with plenty of performance headroom and interconnect bandwidth. The problem was simply the massive quantity of files that had to be cataloged by the backup server.
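A crude model shows how per-file overhead can swamp raw transfer time. Everything numeric below is an assumption picked to make the point, not a measurement from that system: the file count, the 1 millisecond of catalog work per file, and the 100MB/s transfer rate are all hypothetical, as is the function name `backup_hours`.

```python
def backup_hours(file_count, total_gb, per_file_ms, throughput_mb_s):
    """Split an estimated backup time into catalog hours and transfer hours."""
    catalog_seconds = file_count * per_file_ms / 1000
    transfer_seconds = total_gb * 1000 / throughput_mb_s
    return catalog_seconds / 3600, transfer_seconds / 3600


# Assumed figures: 100 million tiny files totaling 500GB, 1 ms of catalog
# work per file, and a sustained 100MB/s transfer rate.
catalog, transfer = backup_hours(100_000_000, 500, 1.0, 100)
print(round(catalog, 1), round(transfer, 1))  # 27.8 1.4
```

With these made-up numbers, moving the data takes well under two hours while bookkeeping takes more than a day. Faster disks or a fatter pipe only shrink the smaller term.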
In this case, an off-host backup wouldn't have helped, because the bottleneck had nothing to do with the host server. In fact, for such a low-traffic system, it wouldn't have mattered if the backups spilled into the production day. The problem was that the backups didn't fit within a whole day. The solution was simple: Periodically archive the images into large compressed files and back those up rather than the images themselves. Easy!
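As a minimal sketch of that kind of consolidation, the snippet below packs a directory of small image files into one gzip-compressed tar archive using Python's standard `tarfile` module. It is not the tooling that was actually used; the function name `bundle_images`, the `*.tif` pattern, and the paths in the usage note are assumptions for illustration.

```python
import tarfile
from pathlib import Path


def bundle_images(image_dir, archive_path, pattern="*.tif"):
    """Pack every file matching pattern into one compressed tar; return the file count."""
    count = 0
    with tarfile.open(archive_path, "w:gz") as archive:
        for image in sorted(Path(image_dir).glob(pattern)):
            # Store only the file name so the archive does not embed absolute paths.
            archive.add(str(image), arcname=image.name)
            count += 1
    return count
```

A scheduled job might call something like `bundle_images("/data/receipts/2009-06", "/data/archives/2009-06.tar.gz")` (hypothetical paths) and point the backup software at the archives directory, so the backup server catalogs one large file per period instead of millions of tiny ones.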
In most cases, however, off-host backups help a lot. They allow you to leverage both your SAN and your expensive backup hardware and dramatically shrink backup windows. No, they won't fix every performance problem, and they come with their own complexity. But if you take your time and test off-host backups carefully, you'll have a modern solution to a very big problem.