Why You Should Keep a Local Alt Disk Copy

Edit: Some links no longer work.

Originally posted April 28, 2015 on AIXchange

After upgrading an AIX system, a customer found that they needed to back out of the change. They ended up restoring rootvg from a mksysb. 

Although that’s one way to do it, I don’t recommend it. Of course you should have an mksysb around in case of a disaster, but you should also have a local alt disk copy available. This is true for any type of upgrade, but it’s especially critical for both VIO server and regular AIX upgrades. 

In addition, a disk copy can come in handy if someone accidentally messes up rootvg during regular operations. You can switch your bootlist and reboot to a clean copy of your rootvg rather than try to restore from a backup. 
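Here is a minimal sketch of that fallback, written as a small Python helper that only prints the commands rather than running them. The disk name hdisk1 and the function name are my own placeholders; `bootlist -m normal` and `shutdown -Fr` are the standard AIX commands, but check them against your own system before relying on this.

```python
import shlex


def fallback_commands(clean_disk):
    """Return the AIX command lines that boot the system from clean_disk."""
    steps = [
        # Point the normal-mode bootlist at the disk holding the clean copy.
        ["bootlist", "-m", "normal", clean_disk],
        # Display the normal-mode bootlist to confirm the change.
        ["bootlist", "-m", "normal", "-o"],
        # Fast shutdown followed by a restart.
        ["shutdown", "-Fr"],
    ]
    return [shlex.join(step) for step in steps]


if __name__ == "__main__":
    # hdisk1 is a hypothetical disk name; use lspv to find the real one.
    for command in fallback_commands("hdisk1"):
        print(command)
```

Printing the commands first gives you a chance to review them before anything touches the bootlist.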

Here are several articles that explain this in detail.

IBM developerWorks: 

“With IBM Power virtualization, the VIOS plays an important role and all running VIOS client LPARs are fully dependent on the Virtual I/O Servers. In such an environment, updating VIOS to a next fix pack level can be challenging, without taking the system down for an extended period of time and incurring an outage. This can be mitigated by creating a copy of the current root volume group (rootvg) on an alternate disk and simultaneously applying fix pack updates first on the cloned rootvg on a new disk.” 

For example, to update VIOS 1.3.0.0 to 1.3.0.0-FP8, clone a 1.3.0.0 system and then install updates to bring the cloned rootvg to 1.3.0.0-FP8. This updates the system while it is still running. Rebooting from the new rootvg disk brings the running system to 1.3.0.0-FP8. If a problem with the new VIOS level is discovered, changing the bootlist back to the 1.3.0.0 disk and rebooting the server returns the system to 1.3.0.0. Another scenario is to clone the rootvg, apply individual fixes, reboot and test those fixes, and reboot back to the original rootvg if there is a problem. 

This article explains the step-by-step procedure for applying the next fix pack level on VIOS by creating a copy of the current rootvg on an alternate disk and simultaneously applying fix pack updates. 
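To make the clone-and-update idea concrete, here is a rough sketch that builds the alt_disk_copy command line for either a plain clone or a clone that also applies updates. The target disk hdisk1 and the update directory /mnt/fixpack are hypothetical, and the function name is mine. The `-d`, `-b update_all` and `-l` flags follow the alt_disk_copy man page as I understand it, so verify them at your own AIX or VIOS level before running anything.

```python
import shlex


def clone_command(target_disk, update_source=None):
    """Build the argv for cloning rootvg to target_disk.

    If update_source is given, the clone is also updated from that
    location (assumed here to be a directory or device holding the fixes).
    """
    argv = ["alt_disk_copy", "-d", target_disk]
    if update_source is not None:
        # -b update_all applies all available updates; -l names their location.
        argv += ["-b", "update_all", "-l", update_source]
    return argv


if __name__ == "__main__":
    # Plain clone of the running rootvg to a hypothetical spare disk.
    print(shlex.join(clone_command("hdisk1")))
    # Clone plus updates from a hypothetical fix pack directory.
    print(shlex.join(clone_command("hdisk1", "/mnt/fixpack")))
```

Either way, the running rootvg is left alone, which is what makes the back-out a simple bootlist change and reboot.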

IBM Knowledge Center:

“The alt_disk_copy command allows users to copy the current rootvg to an alternate disk and to update the operating system to the next maintenance or technology level, without taking the machine down for an extended period of time and mitigating outage risk. This can be done by creating a copy of the current rootvg on an alternate disk and simultaneously applying software updates. If needed, the bootlist command can be run after the new disk has been booted, and the bootlist can be changed to boot back to the older maintenance or technology level of the operating system. 

Cloning the running rootvg, allows the user to create a backup copy of the root volume group. This copy can be used as a back up in case the rootvg failed, or it can be modified by installing additional updates. One scenario might be to clone a 5300-00 system, and then install updates to bring the cloned rootvg to 5300-01. This would update the system while it was still running. Rebooting from the new rootvg would bring the level of the running system to 5300-01. If there was a problem with this level, changing the bootlist back to the 5300-00 disk and rebooting would bring the system back to 5300-00. Other scenarios would include cloning the rootvg and applying individual fixes, rebooting the system and testing those fixes, and rebooting back to the original rootvg if there was a problem. 

At the end of the install, a volume group, altinst_rootvg, is left on the target disks in the varied off state as a place holder. If varied on, it indicates that it owns no logical volumes; however, the volume group does contain logical volumes, but they have been removed from the ODM because their names now conflict with the names of the logical volumes on the running system. Do not vary on the altinst_rootvg volume group; instead, leave the definition there as a placeholder. 

After rebooting from the new alternate disk, the former rootvg volume group shows up in a lspv listing as old_rootvg, and it includes all disks in the original rootvg. This former rootvg volume group is set to not vary-on at reboot, and it should only be removed with the alt_rootvg_op -X old_rootvg or alt_disk_install -X old_rootvg commands. 

If a return to the original rootvg is necessary, the bootlist command is used to change the bootlist to reboot from the original rootvg.” 
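As a quick illustration of the naming described above, the following sketch groups disks by volume group from lspv-style output and builds the removal command for a leftover copy. The sample listing, PVIDs and function names are made up for the example; only the `alt_rootvg_op -X` form comes from the documentation quoted above.

```python
# Hypothetical lspv output after rebooting from the alternate disk.
SAMPLE_LSPV = """\
hdisk0          00f6050a2cd79ef8                    old_rootvg
hdisk1          00f6050a2cd7a112                    rootvg          active
hdisk2          none                                None
"""

# Volume group names that alternate disk operations leave behind.
REMOVABLE = ("old_rootvg", "altinst_rootvg")


def disks_by_volume_group(lspv_output):
    """Map each volume group name to the list of disks lspv reports for it."""
    groups = {}
    for line in lspv_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        disk, vg = fields[0], fields[2]
        groups.setdefault(vg, []).append(disk)
    return groups


def cleanup_command(volume_group):
    """Build the argv that removes a leftover alternate rootvg definition."""
    if volume_group not in REMOVABLE:
        raise ValueError("refusing to remove " + volume_group)
    return ["alt_rootvg_op", "-X", volume_group]


if __name__ == "__main__":
    print(disks_by_volume_group(SAMPLE_LSPV))
    print(" ".join(cleanup_command("old_rootvg")))
```

The guard in cleanup_command is deliberate: the only names it accepts are the two placeholders that the cloning process creates.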

IBM developerWorks (again): 

“In 2009, I wrote about using alt_disk_copy… to clone your rootvg disks for ease of back-out when doing AIX upgrades or applications upgrades that resided on the rootvg disks. In that article, I did not cover hardware migrations as this was out of scope. In this article, I discuss how this can be achieved. The man page on alt_disk_copy states (by using the ‘O’ option), “Performs a device reset on the target altinst_rootvg. This causes the alternate disk install to not retain any user-defined device configurations. This flag is useful if the target disk or disks become the rootvg of a different system.” 

In a nutshell, this means that any devices that have had their attributes changed, typically by the system administrator, are reset to the default value(s).” 
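For the hardware migration case, a clone command might look something like the sketch below. The `-O` flag is the one quoted above. The `-B` flag (leave the current bootlist untouched) is my addition based on the alt_disk_copy man page, and hdisk1 and the function name are hypothetical, so treat this as an assumption to verify rather than the article's exact procedure.

```python
def migration_clone_command(target_disk, keep_bootlist=True):
    """Build the argv for a clone intended to boot on different hardware."""
    # -O performs a device reset so user-defined device settings are not kept.
    argv = ["alt_disk_copy", "-O"]
    if keep_bootlist:
        # -B (assumed from the man page) skips the bootlist change, so the
        # source system keeps booting from its current rootvg.
        argv.append("-B")
    return argv + ["-d", target_disk]


if __name__ == "__main__":
    print(" ".join(migration_clone_command("hdisk1")))
```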

AIX Health Check:

“It is very easy to clone your rootvg to another disk, for example for testing purposes. For example: If you wish to install a piece of software, without modifying the current rootvg, you can clone a rootvg disk to a new disk; start your system from that disk and do the installation there. If it succeeds, you can keep using this new rootvg disk; If it doesn’t, you can revert back to the old rootvg disk, like nothing ever happened.” 

And finally, here’s IBM’s “Introduction to Alt_Cloning on AIX 6.1 and 7.1”:

“This guide is intended for those who are new to alternate disk cloning, (or alt_clone for short) and would like to understand the alt_clone process.”

If you would like to learn more about alternate disk cloning, visit the IBM publib website and search on “alt_disk.” 

Do you keep spare LUNs around for your alt_disk copies? If not, why not?