Running cldump on a Cluster

Edit: Hopefully nobody runs into this error these days.

Originally posted November 6, 2012 on AIXchange

I was recently asked why the cldump command wasn’t running on a PowerHA 7.1 cluster.

After running /usr/es/sbin/cluster/utilities/cldump, my client received this output:

            cldump: Waiting for the Cluster SMUX peer (clstrmgrES)
            to stabilize.............
            Failed retrieving cluster information.

            There are a number of possible causes:
            clinfoES or snmpd subsystems are not active.
            snmp is unresponsive.
            snmp is not configured correctly.
            Cluster services are not active on any nodes.

            Refer to the HACMP Administration Guide for more information.

I checked and learned that IBM has been scaling back the default SNMP configuration over the years for security reasons. However, this issue is relatively easy to address:

            1) Edit /etc/snmpv3.conf on all nodes and remove the leading comment hash (#) from this line:

            #COMMUNITY public    public     noAuthNoPriv 0.0.0.0    0.0.0.0         -

            2) Add this line (this is the top-level cluster view of the SNMP MIB):

            VACM_VIEW        defaultView     1.3.6.1.4.1.2.3.1.2.1.5 - included -

            3) Restart the relevant daemons (this can be done without stopping cluster services):

            stopsrc -s clinfoES
            stopsrc -s snmpd
            stopsrc -s aixmibd
            stopsrc -s hostmibd
            stopsrc -s snmpmibd
            sleep 10
            startsrc -s snmpd
            startsrc -s aixmibd
            startsrc -s hostmibd
            startsrc -s snmpmibd
            sleep 60
            startsrc -s clinfoES
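
If you have several nodes to fix, steps 1 and 2 can be scripted. What follows is only a minimal sketch, not an IBM-supplied tool: enable_cluster_snmp is a hypothetical helper name, it assumes the COMMUNITY line in your file looks like the commented-out one shown above, and it writes the VACM_VIEW line with plain ASCII hyphens. It keeps a copy of the original file as <file>.bak.

```shell
#!/bin/sh
# Sketch only: apply steps 1 and 2 to an snmpv3.conf-style file.
# enable_cluster_snmp is a hypothetical helper name, not a PowerHA utility.
enable_cluster_snmp() {
    conf="$1"
    view_line='VACM_VIEW        defaultView     1.3.6.1.4.1.2.3.1.2.1.5 - included -'

    # Keep a backup of the file before changing anything.
    cp "$conf" "$conf.bak" || return 1

    # Step 1: strip the leading '#' from the commented-out public COMMUNITY line.
    # A temporary file is used because not every sed supports in-place editing.
    sed 's/^#[[:space:]]*\(COMMUNITY[[:space:]][[:space:]]*public[[:space:]]\)/\1/' \
        "$conf.bak" > "$conf.new" || return 1

    # Step 2: append the cluster MIB view unless a line for that OID already exists.
    if ! grep -q '^VACM_VIEW[[:space:]].*1\.3\.6\.1\.4\.1\.2\.3\.1\.2\.1\.5' "$conf.new"; then
        printf '%s\n' "$view_line" >> "$conf.new"
    fi

    # Write back with cat so the original file keeps its owner and permissions.
    cat "$conf.new" > "$conf" && rm -f "$conf.new"
}
```

Run it as root on each node, for example: enable_cluster_snmp /etc/snmpv3.conf. Then restart the daemons as in step 3. Running it a second time should leave the file unchanged.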

After these changes, cldump was working. 
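
Since the error text points at inactive subsystems, it can also help to confirm that everything came back up before retrying cldump. Here is a rough sketch of such a check. Both function names are hypothetical, and the parsing assumes that lssrc -s prints a header line followed by a row whose last column is the status; adjust it if your output looks different.

```shell
#!/bin/sh
# subsystem_active (hypothetical helper) reads `lssrc -s <name>` output on stdin
# and succeeds only if a row after the header ends with the word "active".
subsystem_active() {
    awk 'NR > 1 && $NF == "active" { ok = 1 } END { exit (ok ? 0 : 1) }'
}

# report_subsystems (hypothetical helper) prints one status line per subsystem
# named on the command line and returns non-zero if any of them is not active.
report_subsystems() {
    rc=0
    for s in "$@"; do
        if lssrc -s "$s" 2>/dev/null | subsystem_active; then
            echo "$s: active"
        else
            echo "$s: NOT active"
            rc=1
        fi
    done
    return $rc
}
```

For the daemons restarted above, the call would be: report_subsystems snmpd aixmibd hostmibd snmpmibd clinfoES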

We also found warning messages when we started cluster services or tried to synchronize the cluster:

            WARNING: Volume group datavg is an enhanced concurrent mode volume group used as a serial resource, but the LVM level on node nodea1 does not support fast disk takeover

            WARNING: Volume group datavg is an enhanced concurrent mode volume group used as a serial resource, but the LVM level on node nodea2 does not support fast disk takeover

            WARNING: Volume group datavg is an enhanced concurrent mode volume group used as a serial resource, but the LVM level on node nodea1 does not support fast disk takeover

            WARNING: Volume group datavg is an enhanced concurrent mode volume group used as a serial resource, but the LVM level on node nodea2 does not support fast disk takeover

I called support and was told that this was addressed by APAR IV26874. We were also provided with an iFix, which took care of the problem once it was installed. So if you see this warning, contact IBM and get the iFix (if the fix isn’t yet available in a service pack).

Incidentally, neither of these issues was a show-stopper in my client’s environment. I continue to be very impressed by PowerHA 7.1.