Edit: Hopefully nobody runs into this error these days.
Originally posted November 6, 2012 on AIXchange
I was recently asked why the cldump command wasn’t running on a PowerHA 7.1 cluster.
After running /usr/es/sbin/cluster/utilities/cldump, my client received this output:
cldump: Waiting for the Cluster SMUX peer (clstrmgrES)
to stabilize.............
Failed retrieving cluster information.
There are a number of possible causes:
clinfoES or snmpd subsystems are not active.
snmp is unresponsive.
snmp is not configured correctly.
Cluster services are not active on any nodes.
Refer to the HACMP Administration Guide for more information.
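The first cause in that list can be ruled out quickly by asking the System Resource Controller whether the subsystems are running. What follows is a minimal sketch, not something from the original troubleshooting session: check_subsystems is a hypothetical helper name, and it assumes the usual lssrc -s output of a header line followed by one line whose last column is the status.

```shell
#!/bin/sh
# Sketch only: report SRC subsystems that lssrc does not show as "active".
# check_subsystems is a hypothetical helper, not a PowerHA utility.
# Returns non-zero if any named subsystem is not active.
check_subsystems() {
    rc=0
    for subsys in "$@"; do
        # Assumes lssrc -s prints a header line, then a line ending in the status.
        if lssrc -s "$subsys" 2>/dev/null |
            awk 'NR > 1 && $NF == "active" { ok = 1 } END { exit !ok }'; then
            echo "$subsys: active"
        else
            echo "$subsys: NOT active"
            rc=1
        fi
    done
    return $rc
}

# Example: check_subsystems snmpd clinfoES
```

If both subsystems report active and cldump still fails, the SNMP configuration itself is the more likely culprit.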
I checked and learned that IBM has been scaling back the default SNMP configuration over the years for security reasons. However, this issue is relatively easy to address:
1) edit /etc/snmpv3.conf (all nodes) and remove the comment hash from this line:
#COMMUNITY public public noAuthNoPriv 0.0.0.0 0.0.0.0 -
2) add this line (this is the top-level cluster view of the SNMP MIB):
VACM_VIEW defaultView 1.3.6.1.4.1.2.3.1.2.1.5 - included -
3) restart the relevant daemons (this can be done without stopping cluster services):
stopsrc -s clinfoES
stopsrc -s snmpd
stopsrc -s aixmibd
stopsrc -s hostmibd
stopsrc -s snmpmibd
sleep 10
startsrc -s snmpd
startsrc -s aixmibd
startsrc -s hostmibd
startsrc -s snmpmibd
sleep 60
startsrc -s clinfoES
After these changes, cldump was working.
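With several nodes to touch, the two file edits in steps 1 and 2 could be scripted. The following is a rough sketch under stated assumptions rather than a tested procedure: enable_cluster_snmp_view is a hypothetical helper name, the backup suffix is arbitrary, and the COMMUNITY line is assumed to start with "#COMMUNITY" followed by "public".

```shell
#!/bin/sh
# Sketch only: apply the snmpv3.conf edits from steps 1 and 2 to the file given as $1.
# enable_cluster_snmp_view is a hypothetical helper name, not a PowerHA utility.
enable_cluster_snmp_view() {
    conf_file="$1"
    scratch="$conf_file.new.$$"
    view_line='VACM_VIEW defaultView 1.3.6.1.4.1.2.3.1.2.1.5 - included -'

    # Keep the first backup so a rerun never overwrites the original copy.
    [ -f "$conf_file.bak" ] || cp "$conf_file" "$conf_file.bak" || return 1

    # Step 1: drop the leading '#' from the public COMMUNITY line.
    sed 's/^#\(COMMUNITY[[:space:]][[:space:]]*public[[:space:]]\)/\1/' \
        "$conf_file" > "$scratch" || return 1
    cat "$scratch" > "$conf_file" || return 1
    rm -f "$scratch"

    # Step 2: append the cluster view line unless it is already there.
    grep -qF "$view_line" "$conf_file" || printf '%s\n' "$view_line" >> "$conf_file"
}

# Example: enable_cluster_snmp_view /etc/snmpv3.conf
```

The duplicate check is an exact string match, so a VACM_VIEW line that already exists with different spacing would not be detected. The daemons still need the restart from step 3 before the change takes effect.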
We also found warning messages when we started cluster services or tried to synchronize the cluster:
WARNING: Volume group datavg is an enhanced concurrent mode volume group used as a serial resource, but the LVM level on node nodea1 does not support fast disk takeover
WARNING: Volume group datavg is an enhanced concurrent mode volume group used as a serial resource, but the LVM level on node nodea2 does not support fast disk takeover
WARNING: Volume group datavg is an enhanced concurrent mode volume group used as a serial resource, but the LVM level on node nodea1 does not support fast disk takeover
WARNING: Volume group datavg is an enhanced concurrent mode volume group used as a serial resource, but the LVM level on node nodea2 does not support fast disk takeover
I called support and was told that this was addressed by APAR IV26874. We were also provided with an iFix that, once loaded, took care of the problem. So if you see the warning, contact IBM and get the iFix (if it isn’t yet available via a service pack).
Incidentally, neither of these issues was a show-stopper in my client’s environment. I continue to be very impressed by PowerHA 7.1.