Doug Owens, Manager of Technical Services for CBL Data Recovery Technologies Inc., has been in the IT industry for more than two decades, including 17 years in the data recovery business. A specialist in recovering data from all types of tape media, he is one of the few leading experts in the field. During his career, Owens has performed data recovery services for the majority of Fortune 500 companies, as well as for federal, state and local government in the US, both civil and military.
What types of problems occur?
According to Owens, "The majority of problems stem from something a user has done or not done! We can break these down into four main areas: damaged media, data loss, accidental overwriting of tapes, and physical damage by fire, flood or earthquake."
Broken or snapped media
Broken or snapped media is normally caused by either a malfunctioning drive or poorly maintained media. "This is often due to leaving a tape in extreme heat or in a place that has a high fluctuation in temperature. Warehouses or (being) next to an aircon unit is a common cause of this problem. It may be fine during the day, but at night extreme temperatures will cause the tape media to gradually weaken. Then, once the tape is exercised within a drive unit, it may break," Owens explained.
Humidity is also a major cause of tape breakage. "Each tape is essentially a very thin film of plastic with a layer of metal on top. Humidity causes rust, and the rust makes the tape brittle, causing a break," said Owens. Owens has spoken to many customers over the years and estimates that only about 50 percent of users follow correct tape storage procedures. He recommends placing a simple thermometer and humidity gauge at a storage location and sampling them over a 24-hour period to ascertain whether the location matches the manufacturer's recommended specification.
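As a rough illustration of the 24-hour check Owens describes, the readings from a thermometer and humidity gauge could be compared against the storage limits on the tape's datasheet. The sketch below is only one way to do this. The names `StorageSpec` and `out_of_spec_samples` are hypothetical, and the limits in the usage example are placeholders, not figures from any manufacturer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageSpec:
    """Storage limits copied from the tape manufacturer's datasheet."""
    min_temp_c: float
    max_temp_c: float
    min_rh_pct: float
    max_rh_pct: float


def out_of_spec_samples(readings, spec):
    """Return (hour, problem) pairs for every sample outside the spec.

    readings: iterable of (hour, temperature in C, relative humidity in %).
    """
    problems = []
    for hour, temp_c, rh_pct in readings:
        if not spec.min_temp_c <= temp_c <= spec.max_temp_c:
            problems.append((hour, f"temperature {temp_c} C outside limits"))
        if not spec.min_rh_pct <= rh_pct <= spec.max_rh_pct:
            problems.append((hour, f"humidity {rh_pct}% outside limits"))
    return problems


if __name__ == "__main__":
    # Placeholder limits for illustration only; use the real datasheet values.
    spec = StorageSpec(min_temp_c=16.0, max_temp_c=25.0,
                       min_rh_pct=20.0, max_rh_pct=50.0)
    # Samples taken across a day, including the night-time hours.
    readings = [(0, 12.0, 55.0), (6, 14.5, 52.0), (12, 22.0, 40.0), (18, 20.0, 42.0)]
    for hour, problem in out_of_spec_samples(readings, spec):
        print(f"hour {hour}: {problem}")
```

Sampling through the night matters here, since a location that looks fine during the day may drift outside the limits after hours.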
Poorly maintained drive units are the other major factor in tape breakage. "A speck of dirt on a drive head, if left unclean, will eventually harden and attract more dirt and, once there is sufficient mass, this contaminant is likely to damage or break tapes," he said.
Modern drives will often indicate when a cleaning cycle is required. Owens estimates from his experience that only 30-40 percent of users follow an exact cleaning schedule, while a similar proportion get close, occasionally missing a scheduled drive maintenance. Surprisingly, 20-30 percent don't clean the drive at all.
Data loss
Data loss is often the most insidious form of problem, especially as it is sometimes not spotted until well into a backup cycle. If a tape breaks, there will always be some data loss, but often not a great deal. However, poorly stored tape can suffer far greater data loss without any noticeable break.
Owens gave an example: "We had one customer who stored his tapes in a crate near a fluorescent light. Over a period of time, the magnetic field generated by the fluorescent lights weakened the tapes until they were badly degraded. Any type of exposure to a magnetic field for long enough will weaken the signal strength on the tape."
Data loss can also be caused by a weak signal sent from a drive during a write operation. In one customer example, a 700 MB backup that passed the validity check directly after writing had faded away by the next time the data was read. A weak signal can also cause bleed-through, which can occur on tapes that have been overused, especially as modern tape units seldom use a separate erase head for conditioning tapes. Testing a sample tape a few weeks after a write operation is a good way to spot these types of problems.
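One way to carry out such a delayed test, offered here only as a sketch, is to record a checksum of the backup data when it is written, then restore a sample from tape a few weeks later and compare. The helper names `sha256_of_file` and `restored_copy_matches` are hypothetical. The example also assumes the sample has already been restored from tape to an ordinary file using whatever backup software is in use.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large backups need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restored_copy_matches(restored_path, recorded_digest):
    """Compare a file restored from tape with the digest recorded at write time."""
    return sha256_of_file(restored_path) == recorded_digest
```

The digest would be computed from the source data at backup time and stored somewhere other than the tape itself. A mismatch, or a restore that fails outright, weeks later would point to the kind of fading described above, which a validity check run immediately after writing would not catch.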
Accidentally overwriting data
Although this used to be a very common problem, especially in the Unix community, where operating system instructions were commonly used to perform backups, modern software has tended to render it obsolete. However, Owens urges caution when re-installing software on modern robotic tape libraries. "We have seen instances when software controlling a robotic library has been re-installed and then decided to re-initialise all the tapes. It is often wise to unload an automated library before re-installing control software," he says.
In the second part of this interview we look at tape problems resulting from disasters such as floods, earthquakes and fires.