LPM and Firmware Compatibility

Edit: Check your firmware!

Originally posted April 26, 2016 on AIXchange

Here’s something of interest to those who use Live Partition Mobility (LPM): IBM has created a matrix that shows which firmware levels are compatible for LPM operations between systems:

“Ensure that the firmware levels on the source and destination servers are compatible before upgrading.

In [Table 1], you can see that the first column represent the firmware level you are migrating from, and the values in the top row represent the firmware level you are migrating to. The table lists each combination of firmware levels that support migration.”
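To make the "from" column and "to" row idea concrete, here is a minimal sketch of the same lookup in Python. The level names and the entries are placeholders I made up for illustration, not values from IBM's table, and `can_migrate` is a hypothetical helper. You would fill in the real combinations from the matrix itself.

```python
# Placeholder matrix: each key is a "from" level (first column of the table);
# each value is the set of "to" levels (top row) marked as supported.
# These entries are invented for illustration; copy the real ones from IBM's table.
SUPPORTED = {
    "level_A": {"level_A", "level_B"},
    "level_B": {"level_A", "level_B", "level_C"},
    "level_C": {"level_B", "level_C"},
}


def can_migrate(from_level, to_level, matrix=SUPPORTED):
    """Return True if the matrix lists from_level -> to_level as supported."""
    # An unknown "from" level is treated as unsupported rather than an error.
    return to_level in matrix.get(from_level, set())


if __name__ == "__main__":
    print(can_migrate("level_A", "level_B"))
    print(can_migrate("level_A", "level_C"))
```

Running a check like this before a firmware update, with the planned new level as one side of the pair, is one way to catch a combination that would break later LPM operations.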

Below that first chart is a list that “shows the number of concurrent migrations that are supported per system. The corresponding minimum levels of firmware, Hardware Management Console (HMC), and [VIO servers] that are required are also shown.”

Then there’s a list of restrictions, followed by a table that shows the firmware levels and POWER models that support partition mobility:

“Restrictions:
• Firmware levels 7.2 and 7.3 are restricted to eight concurrent migrations.
• Certain applications such as clustered applications, high availability solutions, and similar applications have heartbeat timers, also referred to as Dead Man Switch (DMS) for node, network, and storage subsystems. If you are migrating these types of applications, you must not use the concurrent migration option as it increases the likelihood of a timeout. This is especially true on 1 GB network connections.
• You must not perform more than four concurrent migrations on a 1 GB network connection. With VIOS Version 2.2.2.0 or later, and a network connection that supports 10 GB or higher, you can run a maximum of eight concurrent migrations.
• From VIOS Version 2.2.2.0, or later, you must have more than one pair of VIOS partitions to support more than eight concurrent mobility operations.
• Systems that are managed by the Integrated Virtualization Manager (IVM) support up to 8 concurrent migrations.
• The Suspend/Resume feature for logical partitions is supported on POWER8 processor-based servers when the firmware is at level 8.4.0, or later. To support the migration of up to 16 active or suspended mobile partitions from the source server to a single or multiple destination servers, the source server must have at least two VIOS partitions that are configured as mover service partitions. Each mover service partition must support up to 8 concurrent partition migration operations. If all 16 partitions are to be migrated to the same destination server, then the destination server must have at least two mover service partitions configured, and each mover service partition must support up to 8 concurrent partition migration operations.
• When the configuration of the mover service partition on the source or destination server does not support 8 concurrent migrations, any migration operation that is started by using either the graphical user interface or the command line will fail when no concurrent mover service partition migration resource is available. You must then use the migrlpar command from the command line with the -p parameter to specify a comma-separated list of logical partition names, or the --id parameter to specify a comma-separated list of logical partition IDs.
• You can migrate a group of logical partitions by using the migrlpar command from the command line. To perform the migration operations, you must use -p parameter to specify a comma-separated list of logical partition names, or the --id parameter to specify a comma-separated list of logical partition IDs.
• You can run up to four concurrent Suspend/Resume operations.
• You cannot perform Live Partition Mobility that is both bidirectional and concurrent. For example, [when] you are moving a mobile partition from the source server to the destination server, you cannot migrate another mobile partition from the destination server to the source server.”
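As a rough illustration of the group-migration bullets, the sketch below splits a list of partitions into batches and builds one migrlpar argument list per batch. It only assembles and prints the commands; nothing is run against an HMC. The `-p` and `--id` parameters come from the restrictions quoted above. The `-o m`, `-m` and `-t` options are my assumption based on typical migrlpar usage, so check them against the man page for your HMC level. The system and partition names, and the helpers `batches` and `migrlpar_args`, are hypothetical. The batch size of four mirrors the 1 GB network limit in the list.

```python
def batches(items, size):
    """Split items into consecutive groups of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def migrlpar_args(source, target, names=None, ids=None):
    """Build the argument list for one migrlpar call (nothing is executed)."""
    if bool(names) == bool(ids):
        raise ValueError("give either partition names or partition IDs, not both")
    # Assumption: "-o m" (migrate), "-m" (source system), "-t" (target system).
    args = ["migrlpar", "-o", "m", "-m", source, "-t", target]
    if names:
        # -p takes a comma-separated list of partition names.
        args += ["-p", ",".join(names)]
    else:
        # --id takes a comma-separated list of partition IDs.
        args += ["--id", ",".join(str(i) for i in ids)]
    return args


if __name__ == "__main__":
    lpars = ["lpar01", "lpar02", "lpar03", "lpar04", "lpar05"]
    # Four per batch, matching the limit quoted for a 1 GB network connection.
    for group in batches(lpars, 4):
        print(" ".join(migrlpar_args("src-sys", "dst-sys", names=group)))
```

Each printed line is one concurrent group; you would wait for a group to finish before starting the next so the total in flight stays within the limit for your network and VIOS configuration.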

Note that if you do not check your firmware versions, a firmware update can cause future planned LPM operations to fail. That’s all the more reason to add this link to your planning checklist.

Speaking of LPM, Chris Gibson takes note of a new HMC system-wide setting that allows LPM when one of the source storage VIO servers is inactive.

“A new HMC & Firmware 840 feature allows LPM in a dual VIOS configuration when one VIOS is failed. Previously, LPM was not allowed if one of the source VIOS (in a dual VIOS configuration) was in a failed state. Both [VIO servers] had to be operational to perform LPM. The new support allows the HMC to cache adapter configuration from both [VIO servers]. Whenever changes are made to the configuration, the cached information will be updated on the HMC. If one VIOS is failed, instead of querying the failed VIOS, the HMC cache is used instead to create the new configuration on the target VIOS. This support was needed to cover the situation where there’s failed hardware which is causing an outage on the VIOS and requires a disruptive repair action. This new feature is enabled using a server wide HMC setting to enable the automatic caching of VIOS configuration details.”