Edit: Still good stuff.
Originally posted May 19, 2015 on AIXchange
Recently a customer rebooted some systems that hadn’t been restarted in more than a year. All of the LPARs and the VIO servers on those systems had to be powered off so maintenance could be performed. The customer was able to use Live Partition Mobility to relocate the important LPARs beforehand. That left just the dev and test environments to shut down.
Of course, plenty of systems have gone much longer without a reboot, but restarting systems after a year-plus of continuous uptime can be tricky. And in this instance, problems emerged. Someone had done DLPAR operations without then saving the changes to the profile. To make matters worse, the DLPAR operations involved the VIO servers and virtual Fibre Channel adapters. When the VIO servers came back up from their saved profiles, the dynamically added adapters were no longer there, and the client LPARs wouldn’t boot.
Luckily, the customer had hmcscanner output so they could see which adapters were missing based on the information in the client LPAR profiles. However, what should have been a quick restart ended up being a lengthy exercise because the profile information wasn’t in sync with what was actually running.
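The comparison the customer had to do by hand can be sketched in a few lines of Python. This is only an illustration, not what hmcscanner does: the function name and the slot numbers are hypothetical, and it assumes you have already collected the virtual adapter slot numbers from the running configuration (for example with `lshwres -r virtualio --rsubtype fc --level lpar` on the HMC) and from the saved profile (for example with `lssyscfg -r prof`).

```python
def find_profile_drift(running_slots, profile_slots):
    """Compare adapter slots in the running config against the saved profile.

    Both arguments are iterables of virtual slot numbers. How they are
    collected is up to you; this sketch only does the comparison.
    """
    running = set(running_slots)
    saved = set(profile_slots)
    return {
        # Added with DLPAR but never saved: these vanish on the next activation.
        "missing_from_profile": sorted(running - saved),
        # Defined in the profile but not present in the running partition.
        "missing_from_running": sorted(saved - running),
    }


if __name__ == "__main__":
    # Hypothetical slot numbers for one VIO server.
    running = [10, 11, 12, 13]
    profile = [10, 11]
    print(find_profile_drift(running, profile))
```

Anything that shows up under `missing_from_profile` is a change that exists only in the running partition and will not survive a restart from the profile.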
How is your systems documentation? When you make a change, do you make sure that the profile has also been updated or saved?
Along with mksysb, backupios and viosbr, be sure to back up your profile data on the HMC. You never know when someone might have made a change to the running systems and then neglected to back up the profile.
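As a rough sketch of what that might look like, the Python below builds two HMC command lines: `mksyscfg -o save` to write a partition's running configuration into a profile, and `bkprofdata` to back up the profile data for a managed system. The helper names, managed system name, partition name, profile name and backup file name are all made up for illustration, and the script only prints the commands rather than running them. Check the flags against the command reference for your HMC level before relying on them.

```python
import shlex


def save_running_config_cmd(managed_system, lpar, profile):
    """Build the HMC command that saves an LPAR's current configuration
    into the named profile (--force overwrites an existing profile)."""
    return ["mksyscfg", "-r", "prof", "-m", managed_system,
            "-o", "save", "-p", lpar, "-n", profile, "--force"]


def backup_profile_data_cmd(managed_system, backup_file):
    """Build the HMC command that backs up profile data for a managed
    system (--force overwrites an existing backup file)."""
    return ["bkprofdata", "-m", managed_system, "-f", backup_file, "--force"]


if __name__ == "__main__":
    # Hypothetical names; substitute your own.
    system = "Server-8205-E6C-SN1234567"
    commands = [
        save_running_config_cmd(system, "vios1", "default"),
        backup_profile_data_cmd(system, "profiles_backup"),
    ]
    for cmd in commands:
        # Print each command so it can be reviewed, then run on the HMC,
        # for example over ssh.
        print(shlex.join(cmd))
```

Saving the running configuration right after a DLPAR change, then backing up the profile data, keeps the profile and the backup in step with what is actually running.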