“Backing up” doesn’t mean retreating
by Geoff Hart
Previously published as: Hart, G. 2001. "Backing up" doesn't mean retreating. http://www.techwr-l.com/techwhirl/magazine/technical/backups.html
Recently, several friends and colleagues have lost important files as a result of viruses, power failures, computer crashes, and miscellaneous other disasters that accompany working with computers. Each person could have minimized the consequences if they had developed and rigorously followed a simple backup strategy for their data. The fact that this happened to experienced computer users in each case leads me to believe that data loss is symptomatic of a broader problem: as technical communicators, our tight focus on documenting how to use a product sometimes makes us forget to document the consequences of using the product.
Recognizing how few user manuals remind us to back up our important data, I’ve begun including such reminders in the manuals for the products that I document. However, this advice focuses more on the need to back up data than on how to actually do so. To fill that gap, I offer the recommendations in this article to help you develop your own documentation concerning recommended backup strategies. I’ll be adding my own, shorter version of this advice to future manuals.
Please note that although I’m talking about backing up data in this article, I’m doing so to illustrate a broader principle: that we must consider factors beyond simply which menu to open and the permitted values for a particular field in a dialog box when we document a product. All products exist within a specific context, and understanding that context (e.g., through contextual inquiry) is crucial to providing documentation that meets all of our audience’s needs. Backing up your data is just one example, and one that I hope you’ll also find useful in protecting your own work. More to the point, I hope you’ll also find it useful in helping your readers to protect their work.
“Contextual inquiry” is the art and science of inquiring about the context in which people use a product, and using the understanding you gain to recognize the implications of that context. For computer users, that context is one in which users must protect data files from computers that are unstable beasts at the best of times: creations that can die suddenly with no apparent cause and that are unacceptably vulnerable to software problems such as operating system malfunctions and viruses. Given this context, data loss is not just possible; it’s nearly inevitable. This suggests that when we write any computer or software documentation, we should teach our audience how to back up their data; as a bare minimum, we must at least remind them that backups are necessary.
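What should we say? That any successful backup strategy must meet four important needs: recovering the current version of your work, recovering older versions of your work, recovering from viruses and other malware, and protecting the backups themselves against theft or damage.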
In each case, the goal is to let someone recover as much of their work as possible up to the point at which they lost data. Although specialized software exists to meet more demanding needs for backing up data (e.g., on corporate networks), I’ve found that in most cases, users can meet their backup needs manually with little difficulty.
A backup strategy typically involves creating backups that safeguard all files on your hard disk (“full” backups) as well as backups that safeguard only the files that have changed since the last full backup (“incremental” backups). A full backup is much easier to work with, since it covers the entire contents of a hard disk and you know that all the files you’ve copied are the most recent versions. The tradeoff in gaining this protection is that full backups can take considerable time to create, and the process can be particularly tedious if the amount of data requires multiple diskettes, Zip disks, CDs, or tapes. (For simplicity, I’ll refer to these things as “backup media” in the remainder of this article.) Incremental backups are faster and take less space, since users copy only the few files they were working on that day. However, incremental backups complicate the task of reconstructing an entire failed or damaged hard disk, since users must restore their data from multiple backup media to ensure that they have the most recent copies of all files; moreover, they still require a full backup so they can recover files that haven’t changed recently. Compression software such as WinZip on the PC, StuffIt on the Mac or PC, or the proprietary tools built into commercially available backup software such as Retrospect can reduce storage requirements enough that an entire data set can fit on a single backup medium. If that’s the case, backups suddenly become a lot less painful to perform.
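For readers who want to script the incremental step rather than copy files by hand, the idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `files_changed_since` and `incremental_backup` are invented for this example, "changed" is judged purely by each file's modification time, and the backup directory is assumed to sit outside the directory being backed up.

```python
import shutil
from pathlib import Path


def files_changed_since(source_dir, since_timestamp):
    """Return files under source_dir modified after since_timestamp.

    since_timestamp is in seconds since the epoch, for example the time
    of the last full backup (an assumption of this sketch).
    """
    changed = []
    for path in Path(source_dir).rglob("*"):
        if path.is_file() and path.stat().st_mtime > since_timestamp:
            changed.append(path)
    return sorted(changed)


def incremental_backup(source_dir, backup_dir, since_timestamp):
    """Copy only the changed files into backup_dir, keeping relative paths."""
    source_dir = Path(source_dir)
    copied = []
    for path in files_changed_since(source_dir, since_timestamp):
        target = Path(backup_dir) / path.relative_to(source_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copy2 also preserves the modification time
        copied.append(target)
    return copied
```

A full backup is the same operation with `since_timestamp` set to 0, which is one way to see why restoring from incrementals still depends on having that full copy.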
Each of the four essential elements of a strategy for incremental and full backups offers various options and complications. How can you develop this strategy?
The simplest context is one in which your computer or software crashes and you lose data. The crashes can result from software bugs, hardware problems, or even unpleasant facts of life like power failures. When a crash occurs, you will lose data if your software hasn’t finished saving the currently open files and closing those files properly. Since most people only do full or incremental backups before shutting off their computer for the day, recovering the current version of your work means that you should make interim backups over the course of the day.
Most software lets you make automatic backups of your work; Word, for example, lets you save a recent copy of your work file so that should the computer crash, you can recover most of your data as soon as you reboot. To set this option, open the Tools menu, select “Options”, and select the “Save” tab. [On the Mac, this is the "Preferences" menu choice, and it may be found either under the Tools menu or the Word menu, depending on the version you're using.] In the field labeled “Save autorecovery information every...”, enter a frequency for these automatic backups. Autorecovery files give you an excellent chance of recovering a damaged file, at the cost of using more disk space. If you’re working on a network, or have a second disk drive available on your system, you can set this option so that the software saves the main copy of your file on one disk and the autorecovery copy on another drive, so that if the main disk drive dies, you can still work from the second drive. (Please note that although I’ll use Word as an example, most software offers comparable features.)
If you have only a single hard drive, you can use some removable medium (floppy disk, rewritable CD drive, magnetic tape, etc.) for this purpose. When I write, I always keep a floppy disk in the disk drive, and periodically copy the current version of the file to that disk. [A look back from 2005: I now use USB-based flash drives instead. They're faster, hold more data, and are more portable.—GH] If you work in a region in which power failures frequently crash your computer, you can obtain additional safety by using two diskettes (for example), so that if your computer crashes and damages one diskette while copying the file, you still have a recent copy to work from. Using a removable backup medium such as a diskette offers one important advantage over simply letting your software create autorecovery files on your hard disk: if a catastrophic failure of your main hard drive occurs, it may be impossible to recover any data from that drive. If the data is safe on a diskette, you won’t have that problem.
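The two-diskette habit described above amounts to alternating between two destinations, so that a crash during one copy can never damage the only recent copy. A minimal Python sketch follows; the function name `interim_copy` and the idea of passing in a running copy count are assumptions made for illustration, and the destinations could be any two removable media mounted as directories.

```python
import shutil
from pathlib import Path


def interim_copy(work_file, media_dirs, copy_count):
    """Copy work_file to one of media_dirs, alternating on each call.

    copy_count is the number of interim copies made so far; the caller
    increments it after every copy (a convention assumed for this sketch).
    """
    target_dir = Path(media_dirs[copy_count % len(media_dirs)])
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(work_file).name
    shutil.copy2(work_file, target)
    return target
```

With two destinations, the copy made just before the most recent one always survives on the other medium.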
The problem with recovering the current version of your work is that sometimes that current version isn’t the one you really need. Files can be corrupted by viruses and bugs in the software that created them, extensive changes made to a file may turn out to be wrong, material deleted from an earlier version of the file may turn out to be valuable after all, and so on. Worst of all, these problems can occur gradually, taking days or even weeks to become evident. In each case, by the time you discover the problem, you find yourself needing to recover an older version of the file.
Some software automatically creates a backup of the state of a file before you begin your current work session. For example, Word lets you select this option by opening the Tools menu, selecting “Options”, and selecting the “Save” tab. If you select the option “Automatically save backup copy”, Word will make a backup copy of each file you open by adding .bak to the name, and will then save any changes you make in the current session in the current version of the file (the one with a .doc extension). When you back up your hard disk, you have an opportunity to retain a copy of both the current version of the file and the previous version (the one with the .bak extension).
However, this approach only keeps the two most recent versions of a file available. Because some problems such as file corruption or viruses do their dirty work gradually, it’s useful to keep a copy of your entire set of data files that dates back at least 2 weeks. For example, I currently make incremental copies of all files I’ve modified on a given day using floppy disks, then copy the entire data directory on my hard disk onto a Zip disk at the end of the week. I currently use four Zip disks to back up my data, with a different disk being used each week, so that by the end of a 4-week cycle, I have copies of most files that are up to 4 weeks old. This is particularly helpful if I inadvertently delete a file, since it gives me a month in which to recognize the problem and recover the file from an older backup disk. If you’re fortunate enough to own a rewritable CD drive, you can probably keep versions of your data dating back months or even years on the same CD. Even an inexpensive CD burner (older models available for as little as $200) lets you create a “write-once” CD; spending a buck on the medium and up to an hour on the backup is an inexpensive investment in your sanity. [A look back from 2005: Because the amount of data I was managing grew, I replaced the Zip disk with the CD burner built into my new computer. Most modern computers now have CD burners, and if not, you can get them for less than $100. I now use five rewritable CD-RW disks and rotate between them week by week; every month, I burn a copy on a non-rewritable CD-R.—GH]
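The weekly rotation is simple arithmetic: count the weeks and take the remainder after dividing by the number of disks. The Python sketch below shows one way to do that; the function name `disk_for_week` and the choice to start weeks on Monday are assumptions of the example, not part of the routine described above.

```python
from datetime import date


def disk_for_week(day, disk_count=4):
    """Return the 0-based index of the backup disk for the week containing day.

    Weeks are counted continuously from a fixed Monday, so the rotation
    does not reset at the start of a new year.
    """
    week_number = (day.toordinal() - 1) // 7  # ordinal 1 was a Monday
    return week_number % disk_count


if __name__ == "__main__":
    print("Use disk number", disk_for_week(date.today()) + 1, "this week")
```

Counting weeks continuously, rather than using the week number within the year, keeps the cycle at exactly four weeks even across a year boundary.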
One thing to be aware of is that some software stores its data in surprising places. For example, Word saves its main files and backup files in whichever directory you specify when you select the Save function; however, the autorecovery files appear in the directory you specify in the “Save” tab of the Options dialog box, which may be somewhere else entirely. Other software, such as Outlook, stores your data (in this case, e-mail messages) in database files that can appear in various locations, such as on a shared network directory or on your C drive. Don’t forget to copy these files as part of your backup process!
Viruses have become intimately familiar to most people who use Microsoft’s Word and Outlook software, both of which are unusual among software in being vulnerable to macro viruses and script viruses. But traditional viruses, worms, Trojan horses, and their ilk (collectively called “malware”) are still out there. Much malware is relatively innocuous, but some can easily corrupt or entirely destroy your data files. To succeed, any backup strategy must let you recover from such problems. [A look back from 2005: Now we have a new threat to worry about: "spyware". This is software that installs itself, much like a virus or Trojan, and spies on what you're doing with the computer. Visit the PC Magazine site to look for information on the latest, most effective antispyware software.—GH]
Like computer crashes and hard drive failures, which can cause sudden and dramatic data loss, malware can have dramatic and immediate effects. It’s easy to recover the data from your most recent backup in these cases, but malware can also corrupt your data slowly, over a period of time, and by the time you notice the problem, it’s too late to do anything about it. If you conscientiously keep your antivirus software up to date, the odds are excellent that you’ll spot the problem and fix it before it goes too far. But if the malware has already destroyed a file, the only defence is a good backup that goes back several weeks—far enough that you still have copies of the files from before the malware began its work.
If you update your antivirus software infrequently, your backups should stretch back at least as far as the last time you updated the software; in this way, a new virus that your outdated version of the antivirus software doesn’t detect won’t stop you from recovering old data. When a problem arises, simply update your antivirus software, remove the virus from your hard disk, copy your backup to your hard disk, then run the antivirus software again to clean up the backup. (And don’t forget to make a new backup, free of the infection, and discard the old one. It makes little sense to go to all this trouble, only to reinfect yourself by copying a single infected file from an old backup that will proceed to infect your entire computer.) A CD-ROM or CD-R backup is particularly useful because viruses can’t erase the data on a CD-ROM. Moreover, CD-ROMs and CD-Rs are immune to other common problems; for example, magnetic media stored near a computer can sometimes be damaged by the electromagnetic field caused by the power surge that occurs when an older, poorly shielded computer monitor powers up, or by the magnetic wands used by airport security staff.
Since viruses attack your computer settings or various other hidden files that your computer uses to determine how your software should function, you also need to locate where these files are stored so you can include them in your backup strategy. In Windows, you should include the Registry file in your backup (check the online help for details); on the Mac, open the System folder and copy the “Preferences” folder. [In OS X, this is now stored outside the System folder in the directory that bears your user name.—GH] Some software stubbornly insists on storing its settings at other locations on the hard disk. To find these files, change one or more of the settings in the software, then search the hard disk to find out what files just changed. (You can usually search by the file’s “modification date”. Check the options of your search function for details.) One or more of these files are likely to be settings files that you should include in your backup.
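If the search function on your system can't filter by modification date, the "change a setting, then see which files just changed" trick can be approximated with a short Python script. This is a sketch only: the function name `recently_modified` and its parameters are invented for illustration, and scanning an entire hard disk this way can be slow.

```python
import time
from pathlib import Path


def recently_modified(root, within_seconds, now=None):
    """List files under root modified in the last within_seconds, newest first."""
    now = time.time() if now is None else now
    cutoff = now - within_seconds
    hits = []
    for path in Path(root).rglob("*"):
        try:
            if path.is_file():
                modified = path.stat().st_mtime
                if modified >= cutoff:
                    hits.append((modified, path))
        except OSError:
            continue  # skip entries that cannot be read
    hits.sort(key=lambda item: item[0], reverse=True)
    return [path for _, path in hits]
```

Changing a setting and then calling `recently_modified` on the relevant directory with a window of a minute or two should produce a short list of candidate settings files to inspect.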
The best backups in the world do you no good whatsoever if a thief steals them along with your computer, or if a house fire destroys the backups. This being the case, always store backups of important data away from the computer you used to create that data. For example, I rotate my Zip disks between home and work so that if I’m robbed at home, I still have copies at work, where an alarm system, security guard, and fire-suppression equipment all greatly reduce the risk of them being lost; should the worst happen and my work copies be damaged, I can immediately make another copy at home and move it somewhere else that’s safe, such as a bank safety deposit box.
For truly critical data, it sometimes pays to obtain additional peace of mind by making multiple backup copies, and storing them in different places. Sometimes all you need to do is bring one copy to work and leave another copy at your neighbor’s house. Where the data is confidential or top-secret, it pays to investigate a means of encrypting the data to protect it from prying eyes; even if you trust your friends, you can’t trust the thief who takes the data from your friend’s house. [A look back from 2005: Since this article was written, a variety of online services have sprung up that offer backups on their servers, somewhere far away across the Internet. In fact, the service provider that gives you access to the Internet probably offers a certain amount of Web space for a Web page. You can also use this space to store files.—GH]
One last word of advice: Don’t forget to test the backup! Even ordinarily reliable computer hardware sometimes fails, and if your disk drive, Zip drive, or other backup device is damaged and thus unable to record data correctly, it may still tell you that the backup was successful. To ensure that the data has been recorded correctly, open a few files directly from the backup medium. If they’re intact, it’s unlikely that you’ve encountered a problem; if not, you can immediately take other steps to protect your data (e.g., e-mailing it to a friend) while you wait for the backup device to be repaired or replaced.
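Opening a few files by hand is the simplest test. A more thorough variant, swapped in here purely as an illustration, is to compare a checksum of every original file against its copy on the backup medium. In the Python sketch below, the function names `file_digest` and `verify_backup` are assumptions of the example; note also that an operating system may serve a freshly written file from its cache, so the check is most meaningful after the medium has been ejected and reinserted.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return relative paths that are missing from, or differ in, backup_dir."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for path in sorted(source_dir.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(source_dir)
        copy = backup_dir / relative
        if not copy.is_file() or file_digest(copy) != file_digest(path):
            problems.append(relative)
    return problems
```

An empty list means every file in the source directory has an identical copy in the backup; anything else is a prompt to protect the data some other way before trusting that medium again.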
The example I’ve provided in this article shows how you can develop a backup strategy that covers your needs, but does so by explaining the context in which you’re using your computer and, thus, the consequences of that context for safeguarding your data. With a little work, you can condense this article into a much simpler set of recommendations on how users of your software should protect their own data.
More importantly, this article illustrates the process of understanding a product user’s context well enough that you can imagine yourself within that context and document the most important consequences of using the product. It also shows how easily something as basic to protecting our work as creating a backup can be neglected. You won’t see the discussion in this article in most computer manuals, yet what could be more important than telling us how to protect our work? Ask yourself what other aspects of your documentation are similarly lacking, and see whether you can’t provide simple advice or a detailed procedure that helps readers to protect themselves from the consequences of their actions or inaction.
Furthermore, take a long, hard look at the things you’re documenting so you can determine whether you can propose solutions that developers can implement to help protect the product’s users. For example, hard drives now routinely exceed 10 gigabytes of storage, more room than anyone but a video editor can easily fill. That being the case, why not suggest that the developers add a backup directory for the data produced using their products, and store the 20 most recent versions of each file in that directory? With a little thought, it’s easy to set a maximum storage limit that doesn’t occupy too much of the hard disk, to establish a process for letting users delete the oldest files when that limit is approached, and to let users recover old versions if something unfortunate happens to the current version.
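To make the suggestion concrete when talking with developers, here is a rough Python sketch of a versioned backup directory. Everything in it is hypothetical: the function name `save_version`, the `.v000001`-style naming scheme, and the choice to cap the number of versions rather than the total disk space are assumptions made only to show that the idea is small enough to implement.

```python
import shutil
from pathlib import Path


def save_version(work_file, versions_dir, max_versions=20):
    """Store a numbered copy of work_file and prune the oldest copies.

    Copies are named like "report.doc.v000001"; zero-padded numbers keep
    alphabetical order and age order the same.
    """
    work_file = Path(work_file)
    versions_dir = Path(versions_dir)
    versions_dir.mkdir(parents=True, exist_ok=True)
    prefix = work_file.name + ".v"
    existing = sorted(p for p in versions_dir.iterdir() if p.name.startswith(prefix))
    next_number = int(existing[-1].name[len(prefix):]) + 1 if existing else 1
    new_copy = versions_dir / f"{prefix}{next_number:06d}"
    shutil.copy2(work_file, new_copy)
    existing.append(new_copy)
    for stale in existing[:-max_versions]:
        stale.unlink()  # delete the oldest copies beyond the limit
    return new_copy
```

A product could call something like this each time the user saves; a limit on total bytes, as opposed to a version count, would follow the same pattern by summing file sizes before pruning.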
Advocate on behalf of your audience, and where that advocacy fails, identify how you can take measures to protect the audience anyway.
©2004–2018 Geoffrey Hart. All rights reserved