Disaster recovery

Because IT is the lifeblood of today's e-business economy, it is essential to be able to recover IT operations quickly after a disaster. An IT disaster can be something as small as a failed disk drive within a server or something as big as a data center burning to the ground. Other common examples of disasters include power failures, destruction due to viruses, theft, sabotage, accidental commands issued by administrators that result in files being deleted or modified, earthquakes, insurrections, and so on. Being prepared for eventualities such as these is the responsibility of IT management and is critical to the success of modern businesses.

The key to successful recovery from these situations is a comprehensive, tested disaster recovery plan. This is a business plan that involves not just redundant hardware but personnel, procedures, responsibilities, and issues of legal liability.

Implementation

The first step in such a plan involves risk assessment to determine which components are most vulnerable and how the company can function in response to loss of data or critical IT services. It is also important to determine what is acceptable to management in terms of a recovery window: many businesses today would suffer significant loss if it should take more than a day or two to recover from failed systems.

Disaster recovery plans can be implemented in different ways depending on the size and needs of the company involved. Three common approaches to implementing disaster recovery plans are traditional tape backup, e-vaulting, and mirroring.

The traditional approach of small to mid-sized businesses to disaster recovery is to create full backups to tape daily and archive these tapes off-site in secure locations. Most computer systems in businesses use this method, primarily due to its low cost and its well-known procedures.
However, tape backups must be verified, and test restores must be performed periodically to ensure that this system is in fact working. Furthermore, businesses must realize that it typically takes up to 48 hours to restore a failed disk or system from tape backup, a recovery window that may be unacceptable from an e-business standpoint.

E-vaulting (or electronic vaulting or data vaulting) takes a different approach. Instead of archiving data to tape, it is sent over high-speed leased lines or Internet connections to a remote data center for safe storage. E-vaulting makes it possible to do more frequent backups and rapid restores, but the weak link is the wide area network (WAN) connection, for if that goes down, a restore might be difficult or impossible.

To make e-vaulting more effective, companies sometimes arrange for backup servers to be running at the remote site and back up data directly to those servers. Then if a failure occurs with the company's main servers, control can be switched over to the backup servers and business can continue without interruption. The problem is that having duplicate systems in place is an expensive proposition and complex to manage, and companies often try to save money by placing older hardware at the remote site. When disaster strikes, this hardware may not be able to perform as hoped and business will go down.

Another version of e-vaulting involves using a mobile data center (a network in a moving van) at the remote site. When disaster strikes the primary site, the mobile data center can be brought on location and managed by your staff. Regardless of which form of e-vaulting is used, the main disadvantage in the eyes of most companies is the cost, which is typically many times that of using traditional tape backup solutions.

Extending the idea of e-vaulting is a method called mirroring, in which identical hardware is placed at a remote site and data is kept synchronized between servers in the primary and remote site.
This is a costly solution, both in the redundant hardware required and in the throughput needed for WAN links, but the restore window is typically under an hour when disaster strikes, so it is a good option from the perspective of e-businesses. Financial institutions such as banks and credit unions often employ this solution to ensure maximum availability and reliability for customer access to accounts.

Marketplace

While smaller companies tend to manage their own tape backup and e-vaulting solutions in-house, large enterprises generally outsource their disaster recovery needs, particularly if mainframe systems and AS/400s are involved. Three companies handle the lion's share of disaster recovery business at the enterprise level: Comdisco Continuity Services, SunGard Recovery Services, and IBM's Business Continuity Recovery Services (BCRS).

E-vaulting companies abound, and their range of services varies greatly. One example of such a company is Imation, which offers its LiveVault services for small and medium-sized companies. Businesses that host their business services with Web hosting companies often make use of disaster recovery services provided by those hosting companies. An example is Exodus Communications, which provides mirroring services for customers at remote data centers.

See Also: backup, tape format

Backup

The process of making reliable copies of important data so that the data can be recovered in the event of a disaster.

Overview

Performing regular backups is perhaps the system or network administrator's least glamorous but most important task. Data loss on a corporate network can occur for various reasons, including

• Disk failures caused by hardware failure, power outages, or improper use
• Network problems leading to lost packets that are not acknowledged because of router congestion or other situations
• Virus infection, resulting in corrupted files
• Sabotage by hackers or disgruntled employees, resulting in erased data
• Theft of hardware from the premises

In each of these scenarios, having reliable backups of your company data is essential to recover from the disaster and continue normal business functioning.

At the enterprise level, backups can be performed using a variety of technologies, each of which has its own advantages. These technologies are a blend of backup device hardware and the way these devices are implemented. The next section of this article looks at a few common scenarios.

First, backup solutions can be characterized by the devices used to store the backed-up data. These devices can include

• Tape drives: A tape drive is a device that stores data on magnetic tape. Many kinds of tape drives and tape formats are supported by different vendors, and these are discussed more fully in the articles "tape drive" and "tape format" elsewhere in this book. Generally speaking, however, tape drives have capacities in the tens of gigabytes range and are suitable for backing up data from individual servers or small groups of servers.
• Tape libraries: A tape library consists of a set of tape drives, a large collection of tapes, and a robotic mechanism for loading and unloading tapes into drives. Tape libraries are common in large companies and can typically store several terabytes (one terabyte equals 1000 gigabytes) of data from groups of servers. For more information, see the article "tape library" elsewhere in this book.
• Optical drives: Another medium for backing up data is optical drives, which range from simple CD-R/W drives to DVD-W drives and libraries containing many such drives. Optical drives are not as common as tape drives and libraries.
• Storage appliances: These are generally rack-mountable black-box solutions in which the underlying operation is not important. Storage appliances are generally used for live data storage but can also be used for small-scale backup purposes.
• Storage Area Networks (SANs): While SANs are primarily used for live storage of data, they can also be used for archiving backup data. See the article "storage area network (SAN)" elsewhere in this book for more information.

Besides these different backup devices, there are also various ways of implementing them for backing up data from network servers:

• Server-based backups: In this scenario each server that holds valuable data has a tape drive directly attached to it, usually through a Small Computer System Interface (SCSI) connection. The disadvantage of this scenario is that it scales poorly for large companies: administrators would need to run around each morning to collect tapes from drives scattered all over the network.
• Network backups: This is the most common scenario in most large companies. In a typical network backup scenario, each server in a group on a local area network (LAN) is connected, through a second network interface card (NIC), to a separate LAN dedicated to backup purposes. This dedicated backup LAN is concentrated using a Fast or Gigabit Ethernet switch, which is also connected to a dedicated server called the backup server. The backup server runs special software that initiates and manages the job of backing up data on the production servers. The backup server itself is then connected by SCSI or Fibre Channel to a tape library (see illustration).
• LAN-free backup: This is a simplified form of network backup in which there's no second backup LAN. Instead, Fibre Channel cards are added to the servers that need to be backed up, and these are connected using fiber-optic cabling to a Fibre Channel router, which then forwards the information directly over Fibre Channel links to tape libraries (see illustration). LAN-free backup is an emerging approach that is gaining in popularity due to its simplicity and high performance.
• Serverless backups: This is a further refinement of LAN-free backups that takes the actual task of processing the backup from the servers and moves it to a Fibre Channel switch or router used to connect the servers to the tape libraries. This can provide significant relief to the servers, since generating backups is a processor-intensive and memory-intensive job that limits other functions they can perform while the backup is occurring. Serverless backup solutions are just emerging in the marketplace.
• Storage over IP: This technology backs up data from network servers directly to backup devices such as tape libraries and SANs using only an Ethernet network. No backup server is required to convert the data from Ethernet frames for transmission over SCSI or Fibre Channel connections to the backup device. Storage over IP is an emerging technology that promises to have a large impact on the backup market, and it is discussed further in the article "storage over IP" elsewhere in this book.
• Internet backups: Backups can also be outsourced over the Internet to a Storage Service Provider (SSP) that is responsible for managing actual backup hardware and securely storing your data. For more information on this, see the article "televaulting" elsewhere in this book.

Finally, a third component of a backup system is the backup software itself. Some of the more popular backup software products used in the enterprise include

• ArcServeIT from Computer Associates
• Backup Exec and NetBackup from VERITAS Software Corporation
• Legato NetWorker from Legato Systems
• Storage Manager from Tivoli Systems
• Backup Express from Syncsort
• Hiback and Hibars from Hicomp Software Systems

Implementation

Instituting a regular backup plan is one of the main components of a company's disaster recovery policy (see the article "disaster recovery" elsewhere in this book for more information), and the importance of doing so cannot be stressed enough.
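The capacity figures quoted earlier for tape drives (tens of gigabytes) and tape libraries (several terabytes) can be turned into a rough media estimate when a backup plan is being drawn up. The following Python sketch is illustrative only: the function names, the 20 GB and 40 GB tape capacities, and the default of seven full backups per week with two copies of each tape are assumptions chosen for the example, not figures from any vendor or from this article.

```python
import math

# The article's convention: one terabyte equals 1000 gigabytes.
GB_PER_TB = 1000


def tapes_needed(data_gb: float, tape_capacity_gb: float) -> int:
    """Return how many tapes one full backup of data_gb gigabytes requires."""
    if data_gb < 0 or tape_capacity_gb <= 0:
        raise ValueError("data size must be >= 0 and tape capacity must be > 0")
    # A partly filled tape still counts as a whole tape, so round up.
    return math.ceil(data_gb / tape_capacity_gb)


def weekly_tape_count(data_gb: float, tape_capacity_gb: float,
                      fulls_per_week: int = 7, copies: int = 2) -> int:
    """Estimate tapes consumed per week (hypothetical defaults).

    fulls_per_week=7 models a daily full backup; copies=2 models keeping
    one set on-site and a duplicate set off-site.
    """
    return tapes_needed(data_gb, tape_capacity_gb) * fulls_per_week * copies


if __name__ == "__main__":
    # A single server holding 60 GB, backed up to assumed 20 GB tapes.
    print(tapes_needed(60, 20))
    # A server group holding 2 TB, backed up to assumed 40 GB tapes.
    print(tapes_needed(2 * GB_PER_TB, 40))
    # Weekly media use for the single server under the assumed defaults.
    print(weekly_tape_count(60, 20))
```

An estimate like this ignores compression, tape reuse, and rotation schedules, so it is best read as an upper bound that helps decide whether stand-alone drives are sufficient or a tape library is needed.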
To guard against these unexpected losses of data (or rather, to prepare for them, since they are, to a certain extent, inevitable), establish a disaster recovery policy that includes a reliable backup plan. In today's business world, where data is the lifeblood of the enterprise, a comprehensive plan is essential. The following steps are recommended when creating such a plan:

• Decide what kind of backup storage devices to use. Options range from small digital audio tape (DAT) drive units capable of backing up several gigabytes of data to large automated tape libraries capable of handling terabytes of centralized data storage. Other backup options include optical storage libraries and removable disks such as Iomega's Zip drive disks or Imation SuperDisk disks.
• Decide whether to back up servers with dedicated, locally connected storage devices or over the network to centralized backup libraries. Network backup systems suffer from a single point of failure (the network itself) but are simpler to administer than a multitude of individual backup units.
• Decide whether individual users' workstations should also be backed up. A more cost-effective option is to educate users to always save their work on a network share located on a server that is regularly backed up.
• Decide how to secure the storage of backup tapes and other media. Will duplicate copies be stored both on-site (for easy access if a restore is needed) and off-site (in case the company's building burns down)? Make sure the storage facilities are climate-controlled and secure.
• Decide what kind of backup strategy to employ. A backup strategy is a combination of a backup schedule and various backup types, including normal, copy, incremental, differential, and daily copy backup types. Also consider whether you will verify all tapes immediately after each backup is performed. For further information, see the articles "backup strategy" and "backup type" elsewhere in this chapter.
• Assign each aspect of the backup procedure to a responsible party. One option some companies now use is to back up data over the Internet to a third-party backup service provider that stores and maintains the backed-up data. This method raises issues of trust and makes the Internet connection a point of failure.
• Test backups periodically to ensure that they are actually readable. Nothing is worse than thinking you have a backup when in fact it is unreadable.

Notes

To enable administrators to perform regular backups, Microsoft Corporation includes backup utilities with all