Digging Into AIX 7.3 TL Updates

Originally published by TechChannel December 5, 2023

Rob McNelly also highlights information on Power10 performance, AIX administration, VIOS to NIM mapping, and shares his fondness for some old interfaces

In November, IBM TechXchange conducted a deep dive on AIX 7.3 TL updates. Featuring AIX product manager Jayen Shah and Carl Burnett, DE for IBM Power, the webinar covered the latest AIX enhancements:

“In this webcast, you will learn:
* How AIX OS feature enhancements provide the capacity, performance, and leading security needed to accelerate business outcomes.
* How to harness Power automation to stay current with the latest technology while keeping data secure and maintaining optimal performance.”

Register for the replay and download the slides.

Having tuned into this webinar, I’ll share some information that stood out to me. First, a reminder of something that should already be on your radar: AIX 7.1 has reached end of support (slide 2). If you’ve not yet upgraded to AIX 7.2 or 7.3, you should make the move as soon as possible. There is nothing more frustrating than needing IBM’s assistance and being told that you need to either upgrade or purchase extended support (assuming it’s available).

Updates to the AIX Collection at Ansible Galaxy are highlighted (slides 6-7), along with updates to the AIX Toolbox (slide 8). VIOS support is covered on slide 12. Version 3.1.1 is end of support; 3.1.2, 3.1.3 and 3.1.4 remain in support; and VIOS 4.1 is currently available.

On that note, I’m already seeing people test the upgrades to 4.1. As always you can get the latest lifecycle information and release notes for 4.1.0.10. Keep in mind that moving to 4.1 is essentially the same as upgrading your base operating system. In other words, it’s more involved than simply installing a VIOS fixpack. As a reminder, VIOS 4.1 runs a stripped-down version of AIX 7.3 under the covers (slide 13), whereas VIOS 3.1 ran AIX 7.2 behind the scenes.
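As a rough illustration, the support statement above can be turned into a small lookup. This is only a sketch: the function name vios_support_status is made up for this example, the statuses reflect the webinar summary as of this article's date, and the level string passed in at the end is a sample value rather than output from a real system.

```shell
#!/bin/sh
# Classify a VIOS level string using the support summary given above
# (3.1.1 end of support; 3.1.2, 3.1.3 and 3.1.4 in support; 4.1 available).
# The function name is hypothetical and the statuses are a point-in-time snapshot.
vios_support_status() {
    case "$1" in
        3.1.1|3.1.1.*)          echo "end of support" ;;
        3.1.[234]|3.1.[234].*)  echo "in support" ;;
        4.1|4.1.*)              echo "current release" ;;
        *)                      echo "unknown" ;;
    esac
}

# On a VIOS, as padmin, the level string comes from the ioslevel command:
#   vios_support_status "$(ioslevel)"
vios_support_status "3.1.4.21"   # sample level string for illustration
```

Always confirm against IBM's current lifecycle page before planning an upgrade, since these dates move.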

There’s more, including PowerVC updates (slide 16) and PowerSC coverage (starting with slide 18). Slide 26 gets into training options, including a link to this course list.

How to Tackle Device Package Installation

Last week, IBM Support posted an update regarding device package installation warnings. Unexciting as that may sound, this is useful, bookmark-worthy information:

Problem
The cfgmgr command (or cfgdev on VIOS) displays the following warning and fails to discover a new device:

cfgmgr: 0514-621 WARNING: The following device packages are required for
    device support but are not currently installed.

This message is followed by one or more fileset names such as:
devices.fcp.disk
devices.fcp.changer
devices.fcp.tape
devices.sas.changer

Cause
The cfgmgr command displays this message when it discovers a device for which a driver cannot be identified. However, the filesets listed in the message do not exist. They are generic names based on the type of device. For example, devices.fcp.tape means that a fibre-channel attached tape drive was found, and devices.sas.changer means that a SAS-attached media changer (tape robot) was found.

Read on for IBM’s recommendations on resolving these issues.
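If you capture cfgmgr output in a log, a short filter can pull out the generic package names and spell out what each one implies. The following is a minimal sketch, not IBM's procedure: both function names are invented for this example, and only the fcp and sas mappings come from the support document quoted above.

```shell
#!/bin/sh
# Print the generic device package names found in cfgmgr output (stdin),
# one per line, without duplicates.
missing_device_packages() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^devices\.[A-Za-z0-9_.]+$/) print $i }' | sort -u
}

# Describe a generic name such as devices.fcp.tape.
# Only fcp and sas are mapped, per the examples in the quoted document.
describe_device_package() {
    name=$1
    class=${name##*.}       # last component: disk, tape, changer, ...
    bus=${name#devices.}
    bus=${bus%%.*}          # middle component: fcp, sas, ...
    case "$bus" in
        fcp) bus_text="Fibre Channel" ;;
        sas) bus_text="SAS" ;;
        *)   bus_text="$bus" ;;
    esac
    echo "$name: $bus_text-attached $class"
}

# Example using the warning text shown above.
missing_device_packages <<'EOF' | while read -r pkg; do describe_device_package "$pkg"; done
cfgmgr: 0514-621 WARNING: The following device packages are required for
    device support but are not currently installed.
devices.fcp.disk
devices.fcp.changer
devices.fcp.tape
devices.sas.changer
EOF
```

The point of the description step is the one the support document makes: the names identify a device type that needs a vendor driver, not a fileset you can install.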

Network Connectivity Simplified

A decade or so ago, we had to jump through many virtual hoops in our efforts to display information for network devices (specifically, Cisco switches) connected to AIX LPARs. Fortunately, this process has become much simpler, as IBM’s Chris Gibson explains in his recent TechChannel article:

“AIX administrators can now display information for Cisco network devices (switches) that are directly connected to their AIX logical partition (LPAR). This is made possible by exploiting the Cisco Discovery Protocol (CDP). AIX provides the cdpd daemon which can receive incoming data packets or messages by using CDP and discovering the physically connected Cisco devices.”

The output and information available to us these days is pretty impressive:

cdpctl show port en1
Waiting for CDP advertise (default 60 seconds)……
Device ID: route1-n1.local(AGE19190TKY)
Address: 10.10.10.2
Port ID: Ethernet3/43
Capabilities
    : Router Level 3
    : Level 2 Switch
Cisco switch OS Version: Cisco Nexus Operating System (NX-OS) Software, Version 9.3(9)
Platform: N9K-C9508
Native VLAN ID: 32
Trusted Bitmap: N/A
AVVID untrusted ports: N/A
Duplex: Full
MTU: 576
System Name: route1-n1
System Object ID: N/A
Management Addresses: 10.1.2.10
CDP record received on dev en1 also stored at /tmp/cdp_record_en1

Again, that’s from the article. Be sure to read the whole thing.
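Because the output above is a series of "Field: value" lines, it is easy to pick out just the switch and port when documenting cabling. Here is a small sketch that assumes that layout holds; cdp_field is a name invented for this example, and the sample input is trimmed from the output shown above.

```shell
#!/bin/sh
# Print the value of one "Field: value" line from cdpctl output on stdin.
# $1 = exact field name, for example "Port ID".
cdp_field() {
    awk -v key="$1" '
        index($0, key ":") == 1 {
            val = substr($0, length(key) + 2)   # text after "Field:"
            sub(/^[ \t]+/, "", val)             # drop leading blanks
            print val
            exit
        }
    '
}

# Example with lines taken from the output above. On a live LPAR the input
# would come from the command itself:  cdpctl show port en1 | cdp_field "Port ID"
sample='Device ID: route1-n1.local(AGE19190TKY)
Address: 10.10.10.2
Port ID: Ethernet3/43
Native VLAN ID: 32
Management Addresses: 10.1.2.10'

echo "switch: $(printf '%s\n' "$sample" | cdp_field 'System Name')"
echo "port:   $(printf '%s\n' "$sample" | cdp_field 'Port ID')"
echo "vlan:   $(printf '%s\n' "$sample" | cdp_field 'Native VLAN ID')"
```

Matching on the field name at the start of the line keeps "Address" from picking up the "Management Addresses" line.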

Optimizing Power10 Performance

Chris also highlighted this IBM doc on Power10 performance topics:

Abstract: “This document aims to offer guidance and topics for optimizing performance for IBM Power10 processor-based servers. It should be noted that this document does not cover all the best practices for PowerVM, AIX, or IBM i, and should be used in conjunction with other relevant documentation.”

VIOS to NIM Mapping

I always try to make sure my NIM server is at the right level to support my VIO servers. Here’s an IBM Support document that can help you do just that. Though you’ll see conflicting dates of publication, I believe this information has been recently updated.

On User Interfaces and Facing Reality

Lately I’ve been thinking about interfaces that I deal with, both on the job and outside of it. One of my favorites is—still—the smitty menus. I love that they look almost exactly the same as they always have, and that most of the system commands I know and love continue to function as they have for years.

Of course, this is an exception. Consider the evolution of the HMC interface and GUI. Every now and again I stumble across an old, unpatched HMC version, and it makes me think of things that used to be second nature. (And never mind the old interfaces; I also miss the old and familiar, and the even older, sounds.)

While we may get eased into changes by being given options to stick with the classic GUI or try out an enhanced interface, eventually, inevitably, the old ways of doing things disappear. Running the new interface is our only choice.

This certainly isn’t exclusive to IBM solutions. Recently I updated the system software for my smartwatch. I immediately noticed that the vendor had made some pretty drastic changes. Menu options and shortcuts were moved, buttons that used to do one thing no longer did anything. Years of muscle memory I’d accumulated were useless, as I had to adapt to new ways of doing things.

But it isn’t just me. Years ago, a coworker fell in love with his Palm Pilot. Even as the world moved on to Blackberrys, Androids and iPhones, he’d live by what I call the “eBay and pray” method of system support. If one device gave out, he’d simply purchase another old used one and move his data there. Eventually though it becomes an exercise in futility. Think of the applications you use—the authenticators, the collaboration tools, things that literally will not run on older hardware and operating systems. The world moves on. Either we move with it or get left behind.

Sure, there are legitimate needs to move forward. Getting the latest bug and security fixes that ship with these updates is essential. Still, it’s easy to wonder if tech companies oftentimes make changes simply for the sake of making change, leaving us users to do the workarounds and roll with the punches. Even as someone who’s spent his career in tech, who knows from day-to-day experience how cool all this stuff is and how much more incredible it’s become over time, if I have the option to stick with an old interface rather than move to a new one, I will choose the tried and true whenever I can, and for as long as I can.

A Warning About Running ezh and eezh Scripts in Updated HMC Environments

Originally published by TechChannel October 31, 2023

Rob McNelly cites recent AIX enhancements and passes along a plethora of tips

In 2020 I wrote about a couple of scripts called ezh and eezh:

“EZH is a script for the [HMC] to provide an alternate, easier to use, command line interface for many common commands, and the goal of the project is to make the HMC command line interface easier to use for day to day administration tasks.

“…Enhanced EZH script is a fork of EZH by Brian Smith for the [HMC] to provide an alternate, simpler command structure which makes the HMC command line interface easier to use for day to day administration tasks. The vision for this project is to enhance and bring EZH commands forward and inline with technologies provided in the newer HMC code base.”

For those of you who use these scripts, you should know that they’ll break the next time you update your HMC. Matthew Opoka, who maintains eezh, gave me this heads up. He explains that, in addition to ezh/eezh, the latest patch will break any HMC script that utilizes sed, since support for the sed command is basically being removed. (These PTFs are intended to address CVE-2023-38280. See the PTF MF71298 and MF71299 readme files for details.)

It’s possible this feature could be reinstated in the HMC code through the IBM Request for Enhancement (RFE) process. If you use sed with the HMC, I suggest you register on IBM’s RFE portal and share your opinions. For the uninitiated, the RFE community is well worth your time and participation, as it gives all of us the opportunity to help shape the future of Power Systems development.
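Before applying the PTFs, it may be worth taking stock of which of your own HMC-related scripts call sed. The sketch below is one way to do that from a workstation or NIM server where the scripts are kept. The function name and the directory in the usage comment are hypothetical, and the pattern is deliberately simple, so it will also flag comments that merely mention sed.

```shell
#!/bin/sh
# List files under a directory that appear to invoke sed.
# $1 = directory holding your HMC-related scripts.
scripts_using_sed() {
    # Match "sed" as a standalone word followed by whitespace or end of line.
    find "$1" -type f -exec grep -lE '(^|[^[:alnum:]_])sed([[:space:]]|$)' {} + 2>/dev/null | sort
}

# Usage (the path is only an example):
#   scripts_using_sed "$HOME/hmc-scripts"
```

Anything the function reports is a candidate for review or rewriting before the HMC is updated.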

VIOS to NIM Mapping, Recent AIX Enhancements, IBM PowerDraw

These items come courtesy of IBMer Chris Gibson’s mailing list.

1. This recently updated IBM Support doc covers VIOS to NIM mapping. Check the link to see the master levels needed if you’re using NIM to back up, install, or update a VIOS partition.

2. Here’s an extensive list of recent AIX enhancements, including AIX 7.3 Standard Edition TL 2 and AIX 7 Enterprise Edition 1.10.

3. And here’s a description of a new configuration tool called PowerDraw, which provides an interactive, graphical representation of Power Systems environments:

“The product uses the included capture module to collect all the information about the server, PowerVM (VIOS), and partitions from the HMC into a single text capture file… Any modern browser can display and interact with these diagrams, allowing [clients] to easily explore the relationships with mouse over highlighting of connections, and popup boxes that display all the detailed information. It can be useful as the starting point for problem determination, or for easy evaluation of virtualization best practices. The SVG files can be integrated into browser-based dashboards or used as stand-alone local files viewed by a browser. No web server is required.”

Incidentally, Chris’s mailing list is open to all. Email him if you’d like to be subscribed. 

More AIX Problem-Solving Tips

Here are some other things I’ve recently come across.

* This IBM Support doc describes how to use MTU 9000 (jumbo frame) in an environment with AIX LPAR and VIOS.

* Here’s an oldie but a goodie on preparing a Power server for a concurrent SAS or SSD disk replacement in AIX or VIOS:

“It is important to first determine disk array configuration including protection level and then follow appropriate procedure. To perform these procedures AIX root access is required. On VIOS run “oem_setup_env” to switch from user padmin to root.

“If you already established your level of protection, you can go directly to the Procedure.

  • Procedure 1: Replacing an AIX System Disk, hdisk JBOD (Just a Bunch Of Disks)
  • Procedure 2: Replacing a pdisk that is part of a RAID0 unprotected array with a single pdisk
  • Procedure 3: Replacing a pdisk in a RAID5 or RAID10 protected array
  • Procedure 4: Replacing a pdisk in a protected RAID array that was previously a Hot Spare
  • Procedure 5: Replacing a pdisk in an unprotected RAID0 array with multiple pdisks
  • Procedure 6: Replacing a pdisk in a RAID6 array
  • Procedure 7: Replacing a hot spare disk that is either failed or has a PFA (Predictive Failure Analysis) indication” 

* Here’s another useful nugget. Use this HMC command to determine the MAC address of the virtual ethernet adapters on each LPAR:

lshwres -r virtualio --rsubtype eth --level lpar -m <managed system>

* And lastly, a troubleshooter from IBM Support. An AIX system rebooted, but getty didn’t start, preventing a console login:

“The most common reason for getty not starting is failure of a preceding line in /etc/inittab. Since that line does not complete, the line which starts getty is not read….

To troubleshoot:

Check the console setting

# lscons vty0

Check the state of vty0

# lsdev -C|grep vty0

vty0 is available

Check process table for getty

# ps -ef|grep getty

There is no output, which means that the getty process isn’t running.

Check to see what process is prohibiting getty from starting or respawning.

# who -p

“Usually, the culprit is the last process in the who -p output. In this example, it is l2, which is called from the /etc/inittab file. Note: l2 is causing the system to not completely read /etc/inittab.”

See how they addressed the issue.
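The reasoning in that last troubleshooter rests on how init walks /etc/inittab: entries are read in order, and an entry whose action makes init wait must finish before later lines, including the one that starts getty, are processed. As a rough sketch, the filter below lists the waiting entries that come before the getty line. The function name is invented for this example, and treating only the wait and bootwait actions as blocking is a simplifying assumption.

```shell
#!/bin/sh
# Print the inittab entries that are processed before the getty line and
# that init waits on. Reads Identifier:RunLevel:Action:Command text on stdin.
entries_blocking_getty() {
    awk -F: '
        /^[:#]/ || NF < 4 { next }               # skip comments and malformed lines
        {
            cmd = $0
            sub(/^[^:]*:[^:]*:[^:]*:/, "", cmd)  # keep everything after the third colon
        }
        cmd ~ /getty/ { exit }                   # stop at the line that starts getty
        $3 == "wait" || $3 == "bootwait" { print $1 ": " cmd }
    '
}

# On a live system, as root:
#   entries_blocking_getty < /etc/inittab
```

If getty is missing from the process table, the last entry this prints is a reasonable first suspect, which lines up with IBM's advice to look at the last process in the who -p output.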

IBM Support Insights: Infrastructure Management Simplified

Originally published by TechChannel September 7, 2023

IBM Support Insights provides a wealth of critical information to make admins’ jobs easier

Like many of you, my organization manages servers, switches, storage devices, etc., in multiple data centers located throughout and even outside of the country. With collections of devices this large, naturally, there are many challenges. How do you manage hardware inventories? How do you know when a new device is added to your environment? How do you track serial numbers? How do you know when your vendor’s maintenance contracts are about to expire? Do you have an overall risk assessment of your environment at your fingertips?

Recently I received access to a systems management tool called IBM Support Insights, which IBM describes as a cloud-based service that allows organizations to proactively improve uptime and address security vulnerabilities. It’s available at no charge with select IBM warranty support and maintenance contracts.

A subscription version, IBM Support Insights Pro, becomes generally available September 12, 2023. Pro, which uses watsonx Assistant for interactive application help and navigation, is designed in particular for organizations in industries with a heightened sensitivity to downtime and security breaches. In addition to standard capabilities, the Pro version offers access to views for analyzing security vulnerability and hardware lifecycle risks, recommended OS and firmware levels, enhanced case analysis (to be released shortly after the announcement), and up to 12 months of case history.

Snapshots of Inventories, Hardware Issues and More

By logging into Support Insights, I was able to answer many of the questions I posed above. In one sample demo account that was loaded with real-world data, I discovered that nine new assets were added in the past month. I also learned the date they were added, the machine type, the product family, the serial number, and the vendor. By clicking on individual assets, I could access contract and billing information. I could see where each asset is installed, who it was sold to, and any cases that were opened against that asset—all from a single portal running in my web browser.

I could also check the firmware level, view the expiration dates for hardware support, and be informed of pending OS updates. I could view all cases that had been opened with IBM support: the total created, the number closed and the number of days from creation to close for each individual case. I could sort IBM cases by asset name to see how many calls were being logged against specific machines.

This additional information makes it easier to plan actions around inventories. Organizations can quickly learn when and what assets are approaching end of service, which LPARs need to be updated and which machines they’re spending the most time servicing. They’ll eliminate gaps in service that could lead to extended downtime, and they won’t miss patches on any of their machines.

Getting back to my testing, I found a risks and recommendations page that provided an overall risk score for my environment. I could access visual detail—i.e., charts—to examine coverage issues, OS/firmware end of support events, security vulnerabilities (CVEs) and hardware end of support events. I was able to view all my devices that lacked a support contract, along with assets with a looming expiration of coverage (within 30 to 90 days).

Have you ever tried to open a ticket with support, only to learn that your support contract has expired? Or have you been told that support can’t assist you until you update your LPAR, because the issue you’re dealing with was fixed in a service pack issued months earlier that you haven’t installed? With this information close at hand, organizations can avoid these dreaded but all too common experiences.

Under the OS/Firmware dial I could see which machines needed firmware or OS updates. The Hardware dial displayed alerts for all the hardware that was at or approaching end of life. The Asset Analysis tab—a feature available only to Support Insights Pro subscribers—contains views for analyzing security vulnerability and hardware lifecycle risks and enables users to make reasoned decisions about patches, upgrades and replacement planning by providing details about correlating risks and issues.

All of this information is extremely valuable. You’re afforded a snapshot of your hardware issues. You can determine if your staff is spending a lot of time with the vendor trying to resolve these issues. Support Insights offers visibility to your entire data center’s inventory while providing information from various sources, connectors, and vendors to help you navigate potential issues and recommended actions.

From an IBM perspective, the tool utilizes information derived from the IBM Call Home service to get data related to IBM Power and Storage systems, things like firmware levels. The tool can also source data from IBM contracts, allowing simple access to information on machine types and models, contract coverage information, along with warranty information. This can help populate information around lifecycle status of hardware, along with any coverage or warranty issues.

The IBM Technical Support Appliance (TSA) can be used to automatically discover assets and import data into Support Insights. I briefly describe TSA here.

Support Insights provides lifecycle information for the AIX, IBM i and Red Hat operating systems, along with widely used IBM solutions like VIOS and the HMC. In one location you can view the versions you’re running, along with recommended updates. There is also support for non-IBM servers (Dell, HPE, Oracle, VMware), storage platforms (EMC, HPE, NetApp) and network and security devices (including Cisco, Juniper, F5, Brocade, QLogic, Check Point, Palo Alto and Fortinet). On that note, Support Insights supports the Cisco CSPC Connector, which provides information about OS levels, security vulnerabilities, hardware, software end-of-support and more. Further, the Support Insights ServiceNow CMDB Connector can pull data from ServiceNow to help populate inventory and provide additional data for analysis. As of this writing, the ServiceNow Connector is in beta.

Some Views From My Testing

Since a picture is worth a thousand words (or so I’ve been told), I’ll share some screenshots.

Here’s a sample of inventory management:

Figure 1. Inventory coverage from a sample customer

Here’s a view of inventory coverage from a sample customer. These views are customizable. I can see current maintenance and support agreements and determine if some have expired or are about to. With this information, I can quickly and easily identify my most at-risk assets and avoid issues that could trigger unplanned downtime.

Figure 2. Machine type/models and coverage status

By drilling down into the asset names, I could see machine type/models and the coverage status. Running this against real data can help you proactively manage renewals, which saves money, time and resources. I don’t have to wonder whether all my devices are covered. I know.

Figure 3. Renewal management

Have I opened cases with vendors? How many? How long has it taken to get these issues resolved? This information can be used to inform business units about vendor relationships and help the organization learn to identify any machines that have abnormal service patterns.

Figure 4. Case metrics dashboard

What risks exist in my environment? Are they related to security or back-level OSes and hardware that is end of support? With prioritized at-risk assets identified, I can plan for patches and move to supported hardware when necessary.

Figure 5. Risk and recommendations dashboard view

Figure 6. Risks and recommendations asset list

These views are from Support Insights Pro. They enable users to understand security vulnerability and hardware lifecycle risks at a deeper level and make better, more informed decisions by correlating risk factors across security, hardware and software end of support, and support coverage.

Figure 7. Risks and recommendations, continued

Providing Visibility

During my training, I came across this summary of the tool:

“IT inventory management requires visibility first and foremost. Without visibility across the IT estate, it is difficult and time-consuming to identify the most critical issues and assess the business impact. IBM Support Insights provides automated multi-vendor infrastructure asset discovery, customized inventory views and exports, and support coverage and change analysis.

“By understanding the overall risk of their IT estate (servers, storage and networking), clients can benefit from taking on-time and preventive actions to improve their overall availability and reduce the risk of security breaches and service disruption.”

An interactive demo is available here. You’ll need an IBM ID to log in. Or contact ibmsi@us.ibm.com for details.

The Subtle but Significant Differences Posed by Power10 Hardware and the New HMC

Originally published by TechChannel August 24, 2023

Rob McNelly urges old dogs to learn new tricks and shares additional thoughts about the recently retired Nigel Griffiths

In March, Jaqui Lynch wrote about connecting HMCs to Power10 servers.

If you’re in the process of moving to Power10/updating your HMCs, this is a must-read. If you’d like me to summarize it, I can do so in a pithy phrase: The only constant is change.

The modern HMC has different connectivity methods compared to the original versions that we old dogs relied upon for all of these years. If it’s been some time since you’ve refreshed your Power or HMC hardware, you need to understand the changes that have been implemented. The Power-based HMC running V10 has yet another new interface, and there are new ways to connect the HMC to the network. While most of us have moved from the classic HMC interface to the enhanced version over the past few years (unless you stayed on old, unsupported hardware and code levels, which creates its own set of problems), this latest change is something we must all deal with eventually as we refresh our environments. If your current hardware is, actually, relatively current, that may be a bit down the road yet. Still, it never hurts to think about it in advance.

Jaqui’s article includes a number of useful supporting links. One is an IBM-produced set of videos on installing Power10 servers and connecting them to the HMC. Give them a watch. These short videos are packed with important information.

Similar to the HMC’s transformation, Power10 is a slightly different animal compared to POWER8/9. To provide some idea of the content, here’s a list of individual video titles:

  • “A look at the top three issues (HMC Access ID, VMI, and ACFs) seen on POWER10 eBMC systems”
  • “SSH with eBMC”
  • “eBMC Basic User Functions”
  • “Firmware update via eBMC ASMI menu”
  • “Configuring a new eBMC POWER10 system without an HMC”
  • “Lost Admin Password Recovery”
  • “Configuring a new POWER10 eBMC system with a DHCP address provided by the HMC”

Once you’re up and running in your new environment, you’ll find that many of the tasks that you’ve grown accustomed to remain the same. The major changes are mostly confined to the process of setting up your system for the first time. In addition, you’ll find some differences in patching and connecting to the BMC.

Speaking of the HMC

This is an oldie but a goody. You have a server that’s been assigned an IP address from the HMC. How do you determine what that address is? Function 30 is what you need:

“The default TCP/IP addresses for POWER server FSP ethernet ports HMC1 and HMC2 differ depending on the platform and server firmware level. The following table documents the default IP addresses of both ports for both single and redundant FSPs (if installed).

“Note: IBM POWER8, POWER7, and POWER6 servers with firmware Ex340 or later implement zero configuration (zeroconf) networking for the FSP. The default settings remain the same however it implements zero configuration networking to prevent duplicate IP addresses, should multiple servers be plugged into the same network with no DHCP server available. When the FSP is plugged into a network with no active DHCP server, it will locate a unique IP address in the 169.254 range.

“To determine the current IP address of the FSP, use control panel function 30. For further information see IBM support document N1015416 Panel Function 30.”

One other HMC-related item: The next session of the Power Systems Virtual User Group is devoted to HMC and VIOS, with Jaqui Lynch handling the presentation:

“VIO servers are the most critical part of your system setup. If they are not happy, then no client LPAR will be happy. This session provides tips on setting up and maintaining VIO servers including upgrades and patching. Backup and recovery will also be covered. The HMC is a critical component of your environment and has become far more complicated. Time permitting, we will cover the differences between the old Intel HMC and the new POWER 7063 HMCs. We will also cover maintenance for the HMC and BMC and some of the new techniques and options made available with version 10. Tools such as FLRT, FLRTVC, and the HMCScanner will also be discussed.”

While the live event has already taken place, a replay will be uploaded in the near future.

Still At It

Not surprisingly, Nigel is still on social media, extolling the capabilities of IBM Power Systems/AIX.

I refer to him as Nigel, because the first name alone is sufficient. Of course, I mean Nigel Griffiths, who retired from IBM in July. I mentioned it in passing in my previous column, but I planned on saying a bit more once I had time to gather my thoughts.

Nigel will always be a unique presence in our world. Seriously, who else is there in the Power/AIX space who’s instantly recognized on a first-name basis? It sure isn’t Rob.

When I think of Nigel, I always come back to the sheer volume of work he’s produced over the years. His YouTube page features 200+ videos, some going back more than a decade. This is to say nothing of his countless in-person presentations at conferences, or the webinars and lectures he presented, the first looks at hardware, the infographics, the simple to understand demos of new technologies and enhancements.

Then there’s the NMON and NJMON performance monitoring tools. It is a sign of confidence in your skills when you name your application Nigel’s Monitor. It’s a sign of your talent and ingenuity when the AIX world adopts it, and IBM itself recognizes its value and adds it to AIX:

“The original nmon version was for the IBM AIX operating system (Release 4.3 and above) and was a freely downloadable binary format only tool from the IBM AIX wiki. Later a version was written for the Linux operating system running on IA-32, x86, x86_64, IBM RS/6000 and POWER processors, mainframe and ARM (including Raspberry Pi). nmon for Linux was released by IBM as open source in July 2009. The code is available from the Sourceforge open-source repository.

“The nmon for AIX code was later bundled in as part of the AIX operating systems. From AIX 5.3 TL09 and AIX 6.1 TL02 onward it was included in the default installation of AIX and fully supported by IBM. The nmon command and the topas command are the same binary but behave differently depending on the command name used.”

Honestly, it’s hard to think of anything I’ve learned over the years that didn’t somehow involve Nigel, either as a presenter or as a technical expert answering questions in the background.

Voids occur when people move on, so I’m very happy to see that Nigel has merely stepped back, rather than walk away completely. Hopefully we’ll continue to hear more, wherever his future adventures take him.

When the Latest and Greatest Isn’t That Great

Originally published by TechChannel August 15, 2023

Rob McNelly on the challenges posed by some technological innovations, notes from IBM Support, and a shout-out to the newly retired Nigel Griffiths

I recently went on a wild ride while with family. Thanks to a last-minute cancellation, my sister and niece needed to get from Rochester, New York, to Newark, New Jersey, to catch their connecting flight home to Phoenix.

That flight from Newark was scheduled to leave around 8 p.m. We were notified of the cancellation around noon, which at least gave us some time. While the airline did offer them a flight from Rochester to Chicago, there was a catch: two more connecting flights would follow, from Chicago to Los Angeles, and then from Los Angeles to Phoenix.

As a seasoned traveler, I knew that itinerary was problematic. A delay at any point, and my sister and niece could be left to wander through airports for a day or longer. It made the most sense to keep that original nonstop flight and find another route to Newark. Since I already had a rental car with a full tank of gas, I decided to channel my inner Elwood Blues and hit the road.

The drive took roughly 11 hours round trip. That’s about as far as driving from Phoenix to Denver, or Phoenix to Salt Lake City, or Phoenix to San Francisco. As I’ve mentioned, I’m good with driving. Last fall I drove across the country and back. Just a few weeks ago I chose to skip the hassles of flying and drive some six hours to a customer site in California. Really, considering the time I’d spend parking at the airport, going through security and waiting at the gate, the time commitment was about the same flying or driving.

But back to the Rochester to Newark run. Adding to the excitement, a thunderstorm rolled through; it literally rained the whole way. Living in the Arizona desert as I do, this isn’t something I’m used to. My rental car was another unfamiliar space. It was a 2023 model with only about 5,000 miles on it. This vehicle had all the modern bells and whistles and tech.

Odd as this may sound coming from me, a technology professional, I prefer older vehicles. They’re reliable, and they’re cheap to register and insure. I drive my cars into the ground before I replace them.

Certainly, the technology in new vehicles is impressive, but it seems each innovation leads to new things that can go wrong. In this case, it was the cruise control. In many new vehicles, the cruise control system utilizes sensors that determine the proximity of your vehicle to the one ahead of you on the highway. If the car in front of you slows down, your cruise control slows you down to maintain safe separation. If that vehicle ahead of you speeds up, so does yours, automatically. This is all fine and good in normal weather conditions, but again, it was raining. And once the rain got really heavy, that triggered an error message on my dash. The sensor stopped working during the downpour, so the cruise control stopped working, and there was no way to override it from what I could tell. So, for a time I actually had to use the gas pedal. That felt like the Dark Ages, or maybe the Flintstones. Anyway, not being able to set it and forget it with the cruise made things a bit more arduous.

I’ve found similar challenges with four-wheel drive. In my old 4×4 Suburban, I just press a button and I’m good to go. Newer vehicles have computerized traction control systems that are designed to compensate for the driver’s abilities; they do things that you might not expect. This and other automated settings can get you in trouble if you don’t realize they’re active.

Not that there isn’t precedent for that sort of thing. A Windows laptop will update and reboot itself every few weeks. Automatic updates are great, provided everything is working as expected. But what if some application stops working because of issues with a patch? What if you were not expecting a reboot and you lose work that was in progress, or, in the worst-case scenario, what if a bug with the patch causes the OS itself to stop booting? As frequently and fervently as I advocate for everyone to regularly patch their systems, I wouldn’t want the production systems that I work on to automatically download patches from IBM and reboot without my input.

In any event, the rain eventually let up and the error with the sensors went away, which allowed me to reengage the cruise control. I spent the rest of the trip wistfully thinking of my old car and its simple settings.

News and Tips from IBM Support

1. An important update regarding Call Home and electronic fix distribution: “Due to technical issues, the May 30/31 change will not become active until Sept 19/20, but leave in your firewall configuration if already configured from our previous guidance.”

2. This information on AIX ulimit and maxuproc may come in handy: “A design change took effect in AIX 7.2 TL5 and AIX 7.3 so that ulimit shows the value set for maxuproc. Since processes per user, or nproc, is not set in /etc/security/limits, the value shown in previous releases of AIX was always unlimited…. The design change ensures that ulimit shows what the actual limits are for this value.”

3. Get an rPerf performance rating: “Run this script of your AIX server to calculate an estimated rPerf rating for that particular logical partition (LPAR), also called virtual machine (VM). The estimate is based on the number CPUs and Machine-Type-Model (MTM) calculated from the official documented rPerf ratings.”

4. I ran into this recently after upgrading VIOS: “Starting in VIOS version 2.2.4.10 and above, the following message of the day (motd) message is displayed upon login after the VIOS is updated to 2.2.4.10 or higher…”

In my case I had to run the rulescfgset command twice for it to work.

Nigel Griffiths Retires from IBM

Did you see that Nigel Griffiths recently retired from IBM? Nigel is truly a unique presence who has contributed so much to the AIX world. I, for one, have cited his deep dives into the technology almost constantly over the years. But on that note, if IBM is looking for someone to carry on with the behind-the-scenes videos of their brand new servers, I’m certainly available. Seriously though, congrats and best wishes to Nigel.

Google’s Bard Spins Some Tales About IBM Power Systems and AIX

Originally published by TechChannel June 22, 2023

Rob McNelly experiments with the AI tool to see what it knows about IBM Power Systems and AIX

I assume none of you live under a rock, so you’ve most likely heard about people’s experiences toying around with ChatGPT and other AI solutions. I decided to get in on the fun, so recently I put Google’s bot—it’s called Bard—through its paces.

I wanted to see what Bard knows about IBM Power Systems and AIX. Let’s just say the answers I received were underwhelming.

POWER9 Chips

My first query was a simple one: Can you still get POWER9 chips? The response:

“Yes, you can still get POWER9 chips. IBM is still manufacturing and selling POWER9 chips, and there are a number of other companies that make POWER9-based servers. Some of the companies that sell POWER9 chips include: IBM, Raptor Computing Systems, Suzhou Powercore Technology, Tyan, SGI, Lenovo.”

This isn’t entirely accurate. I know about systems from Raptor, and I’ve occasionally come across articles that reference Suzhou working with modified Power chips in China. Tyan did produce a POWER8 system, but I’m not aware of anything from SGI, which went bankrupt and was bought out in 2009.

I decided I should be more specific. I asked which Lenovo systems use POWER9 chips. Here’s Bard:

“Lenovo offers a variety of systems that use POWER9 chips, including:
ThinkSystem SR650
ThinkSystem SR950
ThinkSystem DX360 M5
ThinkSystem DX750 M5
ThinkSystem SD650
ThinkSystem SD750
ThinkSystem TH240
ThinkSystem TH440
ThinkSystem TH540”

Me again. You likely know, as I do, that ThinkSystems are rack servers that run Intel processors. But plenty of folks, some of whom may even oversee your IT department, wouldn’t know the difference. Taking this information at face value could lead to some very uninformed decisions.

AIX vs. Linux

So, I took a different approach and made my queries more general. I asked about the advantages of AIX over Linux. The response:

“AIX has several advantages over Linux, including:

  • Stability: AIX is known for its stability and reliability. It has been around for over 30 years and is used by some of the world’s largest companies.
  • Performance: AIX is a high-performance operating system that can be used for a variety of demanding workloads.
  • Security: AIX is a secure operating system that is regularly updated with security patches.
  • Support: AIX is supported by IBM, which has a large network of support professionals.
  • Compatibility: AIX is compatible with a wide range of hardware and software.

“However, AIX is also more expensive than Linux. It is also not as widely available as Linux. Ultimately, the best operating system for you will depend on your specific needs and requirements. If you need a stable, reliable and secure operating system for demanding workloads, then AIX is a good option. If you are looking for a more affordable and widely available operating system, then Linux is a good option.”

A lot of that sounds reasonable. But the part about AIX being compatible with a wide range of hardware and software? Not so much.

OSes on IBM Power

I persisted. I asked what operating systems run on IBM Power. Bard answered:

“The IBM Power servers can run a variety of operating systems, including:

  • AIX: AIX is a Unix-based operating system developed by IBM. It is known for its stability, reliability and performance.
  • Linux: Linux is a Unix-like operating system that is open source and free to use. It is a popular choice for servers and other high-performance computing applications.
  • IBM i: IBM i is a proprietary operating system that is designed for business applications. It is known for its stability and security.
  • z/OS: z/OS is a mainframe operating system that is developed and marketed by IBM. It is known for its scalability and reliability.

“Other operating systems: There are a number of other operating systems that can be run on IBM Power servers, including Solaris, HP-UX, and FreeBSD.”

Again, this was fine until the mentions of z/OS, Solaris, etc.

Key Takeaways

By this point, I’d seen enough. For anyone who’s worked with Power Systems and AIX, the errors I encountered are easy to recognize, but of course there are plenty of instances where I actually need answers. I guess the lesson is don’t accept anything at face value. Sure, I probably could have better phrased my questions. And I expect that AI tools will, in the near future, rapidly progress in their efficiency. But whether we’re querying a traditional search engine or an AI system, we still need to fact check our results.

While many, many AI think pieces have been posted just in the past few months, I like the perspective offered by the author John Scalzi. He recently conducted his own experiment with Bard.

“I asked Bard, Google’s AI, the name my upcoming novel, and also (as a test, because I haven’t publicly announced the title), the title of the novel after that. Bard correctly noted the name of the next novel (Starter Villain) and correctly noted I hadn’t publicly named the novel after that. But then, unprompted, it confabulated the following:

“Here are some details about Starter Villain:

  • The book is set in a world where villains are a legitimate career path.
  • The main character, Jason, is a young man who dreams of becoming a villain.
  • Jason enrolls in a villainy academy, where he learns the skills he needs to be a successful villain.
  • Jason’s classmates include a group of misfits and outcasts, who he quickly befriends.
  • Jason and his friends must face off against a powerful villain, who is determined to destroy them.”

“None of the above is even remotely true. I mean, it sounds like a fun YA book! But that’s not the book I wrote.

“Why does Bard think this is what my book’s about? Got me. It’s not like the details of Starter Villain aren’t out there at this point; reviews are starting to come in and we have a “back cover” book synopsis on every online retailer out there.”

Scalzi concludes: “Today, I think there will be a whole generation of people, particularly my age and older, so used to the idea that Google and other search engines pull up ‘correct’ information—an idea promoted by Google and other search engine owners, to be sure—that they won’t even question whether the information they’re being offered up has any relation to the truth.

“AI will make the internet even less truthful than it is today. It is already doing it.”

Lest you think that Scalzi was being hyperbolic at the end, a law firm was recently found to have conducted research using ChatGPT that referenced non-existent example legal cases.

Maintaining a healthy skepticism was a good idea at the dawn of the public internet. If anything, it’s even more important now.

Nigel Griffiths: An AIX Content Machine

Here’s the latest from Nigel’s AIXpert blog. You’ll also find a link to his ever-expanding trove of YouTube videos.

If You’re Not Using Inventory Scout, You’re Missing Out

Originally published by TechChannel May 17, 2023

Rob McNelly with the latest on invscout, along with tips, resources and an opportunity to give feedback to IBM

I’ve mentioned Inventory Scout (invscout) quite a bit over the years. That includes producing this video tutorial and citing this script.

So why am I bringing it up yet again? It’s to let you know that IBM has once again changed the location where you can download the catalog.mic file. Because IBM no longer uses FTP, I had to run this command to download the file:
            wget --no-check-certificate

Here’s how to upload invscout microcode survey (mup) files.

Survey files from Inventory Scout may be concatenated together and uploaded in one HTTP POST operation. Extra blank lines between survey files are ignored, however if a newline character is missing from the end of one of the survey files, the resulting concatenation may be unparsable. So it is a good idea to insert an extra blank line between files. The name of the upload file must either be mdsData, or end with .mup .
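The concatenation rule above is easy to get wrong by hand, so here is a minimal Python sketch of it. The function name build_upload_file and the file names in the usage note are my own placeholders, not anything IBM provides; the only rules it encodes are the ones stated above (each survey ends with a newline, a blank line separates surveys, and the upload name is mdsData or ends with .mup).

```python
from pathlib import Path


def build_upload_file(survey_paths, out_path):
    """Concatenate survey files, leaving one blank line between each pair."""
    out_path = Path(out_path)
    # Naming rule from the text: the file must be mdsData or end with .mup.
    if out_path.name != "mdsData" and not out_path.name.endswith(".mup"):
        raise ValueError("upload file must be named mdsData or end with .mup")

    chunks = []
    for path in survey_paths:
        data = Path(path).read_bytes()
        # Guarantee a trailing newline so the next survey starts on its own line.
        if not data.endswith(b"\n"):
            data += b"\n"
        chunks.append(data)

    # Joining with a newline inserts the extra blank line between surveys.
    out_path.write_bytes(b"\n".join(chunks))
    return out_path
```

For example, build_upload_file(["host1.mup", "host2.mup"], "local.mup") would produce a single file suitable for one POST, assuming those two survey files exist in the current directory.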

The prepared upload file may be sent to the MDS upload server by invoking a utility that performs a standard HTTP POST operation. A number of utilities are available on the internet that can perform the POST operation. For instance the following cURL command (available in the AIX Toolbox) will upload a file called “local.mup” to MDS for analysis:
            curl -F “mdsData=@local.mup;type=multipart/form

If the POST goes correctly, the resulting output stream will contain an HTML document containing an analysis of the uploaded microcode survey file.

That may seem like a bother, but invscout is a very quick and effective way to determine if your server and I/O firmware are up to date. If I use my script and give it a list of hosts to check, I can get a consolidated report that provides tons of information: hostname, IP address, server model, serial number, and installed microcode levels (along with the latest available levels). Rest assured, invscout is worth the effort.

Network Adapter Issue Explained

IBM’s Darshan Patel recently posted this explanation of a PCIe3 connection issue.

Question: Why does the link of ent4 and ent7 in following setup take a long time to come up or not come up at all?

Discussion with Cisco revealed that the port connected to the Mellanox adapter needs special tuning. Mellanox adapters and switch use a low frequency communication method for auto-negotiation during the link up process. Some switches have compatibility issues and do not support the low frequency communication in their hardware. In order to overcome the switch port speed getting locked to the negotiation signal, Cisco Nexus 9000 switches have a dfe-tuning-delay command that enables them to start locking to the signal only after a predefined delay time to avoid trying to lock on the low frequency signal. There are signal paths placed on the switch PCB that connect switch port to the chip inside the switch. Not all the paths are the same length. Depending upon the signal path, only certain ports on the switch require this tuning.

Be sure to read the whole thing. And thanks to IBM’s Chris Gibson for bringing this to my attention.

Power Systems Prereqs

IBM has a web page called Power Systems Prerequisites.

By selecting your machine type, feature codes, and operating system, you can receive some valuable information. For example, I entered 9009-22G, Feature Code 5729, and ALL operating systems, and got back 18 prerequisites. Here are a few other returns I got, all with 5729 as the feature code:

            AIX Version 7.1 with the 7100-05 Technology Level and Service Pack 7100-05-06-2028
            (#0000) For any I/O configurations

            AIX Version 7.2 with the 7200-03 Technology Level and Service Pack 7200-03-06-2038
            (#0000) For any I/O configurations planned availability February 19, 2021

            AIX Version 7.2 with the 7200-04 Technology Level and Service Pack 7200-04-02-2028
            (#0000) For any I/O configurations

            AIX Version 7.2 with the 7200-05 Technology Level
            (#0000) For any I/O configurations

            AIX Version 7.3 with the 7300-00 Technology Level
            (#0000) For any I/O configurations

            IBM i 7.1: RS710-10 OS & TR PTF Group (SF99707 Level 11)

            Fix Level: C7192710 + HIPER PTF Group (#0000) Requires latest PTFs and HW feature EB3U
            IBM i 7.1 Activation for S922

Power10 Systems Redbooks and More

These publications have been out a while, but if you haven’t had an opportunity to get up to speed on Power10 systems, here are some places to start:

            * IBM Power E1050: Technical Overview and Introduction
            * IBM Power S1014, S1022s, S1022, and S1024: Technical Overview and Introduction
            * IBM Power E1080: Technical Overview and Introduction

While I’m at it, here’s an introduction to 7063-CR2 HMC configuration from IBM Support.

Power Research Program Seeking Feedback

IBM has some surveys they would like you to fill out: one for the Power Research Program, one for Power Server Energy Modes, and one regarding OS subscriptions.

Teaching the Next Generation

If you’re not familiar with the IBM Power Skills Academy (previously known as the IBM Academic Initiative), read about the ongoing efforts to provide education and training on Power Systems, AIX and IBM i at colleges and universities worldwide.

The IBM Power Skills Academy (PSA) has equipped educators at eligible colleges and universities throughout the world with the materials, technology and resources they need to teach their students IBM Power skills.

These resources include free access to IBM Power courses and our Power Academic Cloud.

PSA enables colleges and universities to enhance their computer science, information systems, engineering and business programs, resulting in students who can compete in the job market of any industry. All PSA materials and resources are available to student and instructor members at no charge!

The Power Academic Cloud provides remote access to Power systems running AIX, IBM i and/or Linux, for teaching and non-commercial research activities. Based on your unique requirements, the PSA team will build an environment on which you can teach the latest Power technology and business application strategies. The Power Academic Cloud is available, for no charge, to faculty and IT staff who are registered members of PSA.

Ansible Webinar, HMC Tips and Changes to IBM’s Call Home

Originally published by TechChannel April 27, 2023

Rob McNelly offers another anecdote about the value of testing changes

One advantage of being an IBM Champion is the communication we get from IBM. I’m subscribed to various newsletters that provide useful information. For instance, I was recently notified about a webinar about Ansible. This is from the email:

“Join us to learn how Ansible offers significant benefits by providing fast and repeatable installations and configurations for AIX and the Oracle Database, including the RAC feature. In this webinar, you will learn what the available AIX and Oracle Ansible automation tools are, and how they can be used to achieve better and more reliable installations.”

Here’s the replay.

That same newsletter includes links to the recently updated AIX strategy paper and roadmap, the community badge program, AIX training courses, and AIX community information. Finally, there is a primer on why “AIX running on IBM Power10 is a winning combination for your business.”

Sure, you may have already seen some of this information, but I appreciate these reminders.

On the subject of IBM resources, check out what’s new at IBM Support. Recent updates include details on opening a support case (in text and video formats), among many other topics.

CLI Options: An Update

A reader recently reached out about something I wrote for my AIXchange blog almost a decade ago. The post is about using the HMC command line interface (which I still love).

Apparently I was unclear with my examples, because the reader copied the first one, thinking it would add an additional adapter in his environment without affecting the adapters he already had. Instead, it removed his existing virtual adapters. Fortunately, he was able to save his running config and recapture the virtual adapters that had been removed.

Remember: When you see something online, no matter the source, make sure you try it out in a test/dev environment first before putting it into production.

I should add that I’ve since found a simpler way to add virtual adapters and VLANs. Check out this TechChannel column from 2021.

Connecting eBMC Power Servers to the HMC

If you’re looking to connect your Power10 eBMC system to an HMC, this document may help:

“eBMC Power servers introduce a new connection type between HMC and managed server. Each eBMC server requires two connections; one to the BMC itself and one to the Virtual Management Interface (VMI). This requires two IP addresses per BMC Ethernet port. As with FSP-based systems there are two eBMC Ethernet ports to allow for redundant HMCs. Each HMC is cabled (private DHCP networks) or connected via static IP address to one eBMC port. Redundant HMCs require a total of 4 IP addresses for each server. The default setting for the eBMC IP address is DHCP client, however the default setting for the VMI IP address is currently set to static. The VMI IP settings should be configured before powering on, or the server will go into a no connection state when it reaches standby. The eBMC IP address is active when the server has completed AC power apply initialization (power off state). The VMI IP address is not active until the server is powered on (standby or operating).”
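If you’re requesting addresses from your network team, the arithmetic in that document can be captured in a few lines. This is only a planning sketch in Python; the function name and the redundant_hmcs flag are my own, and it assumes nothing beyond the counts quoted above (two addresses per eBMC Ethernet port, one port per HMC).

```python
IPS_PER_EBMC_PORT = 2  # one for the eBMC itself, one for the VMI


def ebmc_ip_count(servers, redundant_hmcs=True):
    """Number of IP addresses needed to connect eBMC servers to their HMCs."""
    # Each HMC connects to its own eBMC port, so redundancy doubles the ports.
    ports_per_server = 2 if redundant_hmcs else 1
    return servers * ports_per_server * IPS_PER_EBMC_PORT
```

One server with redundant HMCs comes out to the four addresses the document mentions; ten such servers would need forty.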

HMC Enhanced UI Configuration

Another HMC item, this one on the Enhanced UI. It’s from 2020, but the information is relevant:

“With Enhanced UI, we simplified the virtual network management experience and designed a simple and fresh model that helps you configure networking to the VMs with few simple steps avoiding the chances of user errors (and all that from a single console). With the newer model, configuring network adapter for a VM is as simple as creating a network on the system and adding VM to that network (that’s it!)”

Call Home Users Need to Update Their Connections

This notification has made the rounds, but in case you haven’t seen it, significant changes are on the horizon with IBM’s Call Home electronic fix distribution solution:

“Public internet IP addresses are changing for the IBM servers that support Call Home and electronic download of fixes for customer system’s software, hardware, and operating system. This change pertains to all operating systems and applications connecting to IBM for electronic Call Home and fix download. Customer action might be required to ensure uninterrupted Call Home and fix delivery services.

“New network connections between your machine and IBM servers are required to keep your ability to perform Call Home and download fixes. If you have a firewall in your network, you might need to make changes to allow the new connections.”

According to the table in the IBM Support link, most of these redirects will occur in 2024. But depending on your change control and how siloed your organization may be, it’s advisable to take action at your earliest opportunity.

Hanging Out (Virtually) with IBM Champions, Upgrading an Older HMC and More

Originally published by TechChannel March 23, 2023

Rob McNelly explains how he recently updated an older HMC with the help of IBM Support and gives a reminder about technical debt

In January IBM announced its latest class of IBM Champions.

As is noted in that blog entry, “This year’s 839 IBM Champions come from 60 countries; 68% of them have been IBM Champions before and are returning to the program, including our 30 Lifetime IBM Champions.”

That I am one of those 30 Lifetime Champions is still humbling for me, but of course it’s an honor to be associated with all the Champions. And there is an ongoing association, via the IBM Champion Slack channels. I get on whenever I can. Being able to interact in near real time, to ask questions of and get advice from these very smart people located around the world, is pretty cool.

IBM Support to the Rescue

I was recently asked about upgrading an older HMC. Writing for this website, Jaqui Lynch has a lot of good general information about upgrading and maintaining your systems.

But once I found something relevant to the specific question, courtesy of IBM Support, I thought I had a handle on it. That’s because when I read this:

“This document provides instructions to upgrade vHMC for PowerVM and HMC 7063 machine type from Version 9 Release 1 with mandatory fix MH01858, or V9R2M950 to V10R1M1010.”

I felt confident that I could apply MH01858 and then go directly to V10R1M1010 without messing with any V9 versions.

Simply opening a ticket with IBM Support saved quite a bit of trouble, as they were able to confirm that my client could indeed make the leap. Unfortunately, a hardware issue unrelated to the upgrade threw us for a bit of a loop.

We proceeded through the steps. But once we ran:

            chhmc -c altdiskboot -s enable --mode upgrade

and then:

            hmcshutdown -t now -r

nothing happened.

Support had us go through the process again. When that didn’t work, they asked us to send a debug file.

At this point, they determined that somehow the RAID1 array kicked out one of the drives and sent us a procedure to resolve the issue. This involved getting the root password from Support and running various commands, including:

            /opt/hmc/bin/mvcli info -o vd
            /opt/hmc/bin/mvcli delete -o vd
            /opt/hmc/bin/mvcli rebuild

The point is, if you run into this type of issue, you will most likely need the help of IBM Support, so I’ll leave you to open your own ticket and get the procedure.

Once the array was rebuilt and Support confirmed there were no errors, we were able to continue with the firmware upgrade, the upgrade to V10R2M1030 and the managed system firmware.

Around that time, I pressed a button on my iPhone to upgrade the software. That process went just a bit more smoothly.

A Script to Collect HMC Profile Data

Also HMC-related: Kristian’s Blog: HMC Profile Diff

“More often than not, I find myself having to compare two Hardware Management Console (HMC) logical partition (LPAR) profile configurations. Sometimes this is to ensure that a profile in a disaster recovery site matches that of its production counterpart. Other times, it’s to make sure all members of a cluster have identical resource configurations.

“In a perfect world, I’d be managing the LPAR configurations using something like Terraform, but this currently isn’t an option. I can login to the HMC and manually verify the profiles (which is what I’ve been doing), but this is tedious and prone to error when you need to compare many profile pairs.”
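To illustrate the general idea (this is not Kristian’s script), here is a rough Python sketch of comparing two profiles. It assumes each profile has already been exported as one line of comma-separated name=value pairs, with any pair that contains commas wrapped in double quotes. The attribute names in the usage note are illustrative, and the ignore list is my own guess at fields that are expected to differ between sites.

```python
import csv


def parse_profile(line):
    """Turn one comma-separated name=value line into a dict."""
    # csv handles pairs that are quoted because their value contains commas.
    fields = next(csv.reader([line]))
    profile = {}
    for field in fields:
        name, _, value = field.partition("=")
        profile[name] = value
    return profile


def diff_profiles(a, b, ignore=("name", "lpar_name", "lpar_id")):
    """Return {attribute: (value_in_a, value_in_b)} for attributes that differ."""
    differences = {}
    for key in sorted(set(a) | set(b)):
        if key in ignore:
            continue
        if a.get(key) != b.get(key):
            differences[key] = (a.get(key), b.get(key))
    return differences
```

Given a production line such as name=default,lpar_name=prod1,desired_mem=8192 and a DR line such as name=default,lpar_name=dr1,desired_mem=16384, diff_profiles would report only desired_mem, since lpar_name is in the ignore list.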

A Reminder About Technical Debt

One of the interesting things about consumer software is that you have few choices when it comes to upgrades. While you can postpone updates on Windows machines, or your phone, the fixes will load eventually. Where am I going with this? The same place I always do. Please consider upgrading your machines on a regular schedule to avoid technical debt.

SXSW is Going On, But There’s an Even Cooler Scene in Austin

Many of us who work on AIX have been fortunate enough to have spent time at the IBM facility in Austin, Texas. But if you haven’t been on site, see it for yourself, and learn how the E1080 models were developed and tested.

Networking, Hanging NIM Operations and AIX Quirks

Originally published by TechChannel February 16, 2023

Rob McNelly covers how to troubleshoot common NIM issues and AIX tips and tricks

I enjoyed this perspective on the Cisco Discovery Protocol (CDP) from IBM Champion Andrey Klyachkin:

“You know this eternal problem, don’t you? Datacenter guys laid the cable from an AIX server to some port on some network switch. Of course, everything was written on some sheet of paper many years ago. But years went after years, the paper was blown away by the wind and nobody knows anymore where the cable goes to.

“Fortunately Cisco sends CDP packets regularly and you can find the information about the connected port and switch there. I even know datacenters where it is still allowed and some where security guys asked to shut down CDP…

“If CDP is switched on, it sends a packet every 60 seconds with the information about the switch and port. You can get the packet and decode it to get the information. The only problem is to decode the information.

“One of the new features of IBM AIX 7.2 TL5 is CDP daemon. AIX understands and can decode CDP packets. In order to use it, you must have the packet bos.net.tcp.cdp installed…”

Read the entire post, which includes four screenshots, on Andrey’s LinkedIn feed. There’s also a caveat about running cdpctl show port enX to get the information you’re interested in. All in all, this seems much easier than the method I wrote about back in 2014.

While that was posted back in December, the conversation continues. Read this response via Twitter, along with this article on implementing the Link Layer Discovery Protocol (LLDP) on VIOS.

Finally, here’s a breakdown of the differences between LLDP and CDP:

“LLDP is a layer two discovery protocol, similar to Cisco’s CDP. The big difference between the two is that LLDP is a standard while CDP is a Cisco proprietary protocol.

“Cisco devices support the IEEE 802.1ab version of LLDP. This allows non-Cisco devices to advertise information about themselves to our network devices.”

Diagnosing Hanging NIM Operations

As a long-time NIM advocate, Chris Gibson’s tweet on troubleshooting hanging NIM mksysb caught my eye. You’ll find the details in this IBM Support document:

“Problem: In this document we go through a few of the most common causes for a hanging mksysb operation. We will dive into the internals during NIM backups and learn how to troubleshoot and fix those problems.

“Symptom: The mksysb displays 100% complete on your NIM, but prompt is never returned. The mksysb is hanging at a certain percent of the backup.

“Cause
1 The cause for a hang is usually a problem with the network. At the beginning of a NIM mksysb, the NIMSH daemon working on the client LPAR will open two TCP sessions, one on client port 3901 to > NIM 1023-513 and one on client port 3902 to > NIM 1023-513 where the second session is referred to as Auxiliary session and will be used to relay the mksysb command success/failure return code when the backup complete. If this session is dropped or interrupted, the NIM master will keep waiting for that return code even after the process is fully complete and successful.

2 During mksysb backup we use the ‘backbyname’ command to back up the data we need, if the command is unable to access/read a specific file or directory, the process may hang. Normally, this would be caused by a hung NFS mount point or one where the root used has no read permissions for. Additionally, this may be caused by a corrupt file system.”
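Since the first cause comes down to networking, a quick preliminary check is whether the client ports named in that document accept a TCP connection at all. The Python sketch below does only that. The function names are mine, and I’m assuming you run it from the NIM master with the client hostname as the argument. It says nothing about a session being dropped partway through a backup, and a closed port outside of an active operation may be perfectly normal.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_nimsh_ports(client, ports=(3901, 3902)):
    """Map each port to whether it accepted a connection from this host."""
    return {port: tcp_port_open(client, port) for port in ports}
```

A result of False for a port that you expect to be listening points toward a firewall or service problem worth raising before you dig into the backup itself.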

AIX USB Device Quirks

Here’s yet another tip that came my way via Chris on Twitter. While one could make a strong case that AIX pros are quirky, this information from IBM’s AIX 7.3 document archive covers quirks:

“Starting from AIX 7.3 Technology Level 1, the AIX operating system provides quirks to support various third-party USB mass storage devices.

“The following quirks are supported starting from AIX 7.3 for USB mass storage devices:

delay_doorbell: If this quirk is set, the ringing of the adapter doorbell is delayed by 1 ms.

cbw_csw_order: If this quirk is set, the data packets are sent only in the order of command block wrapper (CBW), data, and command status wrapper (CSW). Most of the third-party devices, for example, Seagate and Western Digital, need this quirk to operate correctly with the AIX USB stack.”

Also explained: how to add quirks for USB devices, and how to create quirk entries for specific USB devices, be they from third parties or any specific vendor.

Problem-Solving via the Internet, the Importance of Patching and More

Originally published by TechChannel January 20, 2023

Rob McNelly shares clients’ stories and details how he solved their technical challenges

Recently, I helped a client mount a filesystem, exported from an AIX system, on a Windows system. The client wasn’t sure it would work—and there were challenges—but fortunately I found some guidance by googling.

Following these instructions, I was able to export a test filesystem from an AIX machine in my lab. Then I NFS mounted it from a test Windows machine.

Since I wasn’t running a Windows Server OS version, I ran the command listed in the document under Desktop OS in a PowerShell window:

Enable-WindowsOptionalFeature -FeatureName ServicesForNFS-ClientOnly, ClientForNFS-Infrastructure -Online -NoRestart

I couldn’t run the mount command in the PowerShell window, but I was able to run it in a normal command prompt window:

mount -o anon 10.1.1.1:/robtestfs z:

Initially, running the mount command triggered Network Error = 53. To fix that, I added the Windows machine as a temporary entry on the AIX machine in /etc/hosts. This allowed AIX to resolve the Windows machine hostname and mount the filesystem.

Another Reason to Keep Your System Firmware Current

A client wanted to update their VIO servers. After running shutdown -restart, the VIO LPAR would hang with LED code CA000040. This time the internet led me to a familiar source: IBM Support, which gave me these instructions:

“Problem: LPAR boot on a POWER9 system may hang with code CA000040. In iqyylog SRC B200F003 might be reported.
Symptom: LPAR does not boot and no access to SMS is possible.
Cause: This issue is caused by a new timer variable implemented at POWER9 used for creating delay timers. All POWER9 systems would be exposed, if those servers are not restarted for 814 days.
Environment: Any POWER9 system. The solution is provided in FW930.30, FW950.00, and FW940.30.
Diagnosing The Problem: To verify how long a system has been running, check the SRC history via ASMI with celogin:

“System Information -> Progress Indicator History

“Check if the system is up >= 814 days based on the time stamp on the STANDBY entry.
“A nondisruptive resource dump would include this information also.

Resolving The Problem: Workaround is to set the LPAR processor compatibility mode to Power8 until firmware fix is installed. This needs to be done for any LPAR facing the problem.

Powering the system off/on will reset the timer.”

Me again: In this case, the client changed the VIO profile so that it ran in POWER8 processor mode. This allowed the LPAR to boot. Then they later scheduled an outage to update the system firmware, so this shouldn’t be an issue going forward. If you have POWER9 systems that have been up for a long time sans firmware updates, keep this possibility in mind.
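
The check in the support note boils down to date arithmetic. As a minimal sketch in Python, assuming the STANDBY timestamp has been copied by hand from the ASMI progress indicator history (the timestamp format and the function names here are illustrative assumptions, not IBM tooling):

```python
from datetime import datetime

# Threshold quoted in the IBM Support note
UPTIME_LIMIT_DAYS = 814

def days_since_standby(standby_stamp, now=None, fmt="%Y-%m-%d %H:%M:%S"):
    """Return the whole days elapsed since the STANDBY entry's timestamp."""
    started = datetime.strptime(standby_stamp, fmt)
    if now is None:
        now = datetime.now()
    return (now - started).days

def is_exposed(standby_stamp, now=None):
    """True if the system has been up for at least UPTIME_LIMIT_DAYS days."""
    return days_since_standby(standby_stamp, now) >= UPTIME_LIMIT_DAYS

if __name__ == "__main__":
    # Hypothetical timestamp; substitute the value shown in ASMI
    stamp = "2020-01-01 00:00:00"
    print(days_since_standby(stamp), "days up; exposed:", is_exposed(stamp))
```

Adjust the fmt argument to match however the timestamp appears on your system.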

Seriously, this experience makes the case for regular patching cycles. Even if the client had only patched once a year, this issue could have been avoided. Don’t forget: You don’t just patch the OS. Also consider system and device firmware, the HMC, etc. There’s nothing worse than opening a ticket with IBM and being told that a months-old fix pack or a firmware update would have solved your problem.

Using NIM to Install VIOS

While the HMC offers its own VIO server installation options, for many of us, NIM is still the way to go. IBM Support provides additional information about NIM for VIOS installation:

“This document guides you through the steps of preparing NIM for new VIOS installation. It provides instructions how to create the needed NIM resources to perform VIOS installation. In case you run into any problems during this operation open new case with IBM Support team for further investigation.

“This topic is covered in the following parts:
Downloading the VIOS iso image
Pulling the mksysb_image file from the VIOS iso image and checking its integrity
Defining the VIOS as client of NIM
Defining mksysb and spot resources and allocating them to the VIOS
Booting the VIOS in SMS over network”

This doc also features several helpful screenshots.

PowerVM and AIX Networking Options

This IBM Support document lists some quick tips for configuring PowerVM and AIX:

“The purpose of this document is to list common issues and their solutions concerning Ethernet adapters with IBM PowerVM and AIX. This document covers dedicated (stand-alone) Ethernet adapters, EtherChannel and SEA configurations.

“This document discusses the following topics:
Link Down and Link Up errors on Ethernet devices
General EtherChannel Failure
Link Aggregation Control Protocol (LACP) EtherChannel Tips
General Shared Ethernet adapter (SEA) failures
Configure output options for unused Ethernet devices”

Power10 to Utilize Oracle Databases?

An interesting development:

“IBM has quietly announced it is planning a 24-core Power10 processor, seemingly to make one of its servers capable of running Oracle’s database in a cost-effective fashion.

“A hardware announcement dated December 13 revealed the chip in the following ‘statement of general direction’ about Big Blue’s Power S1014 technology-based server:

“IBM intends to announce a high-density 24-core processor for the IBM Power S1014 system (MTM 9105-41B) to address application environments utilizing an Oracle Database with the Standard Edition 2 (SE2) licensing model. It intends to combine a robust compute throughput with the superior reliability and availability features of the IBM Power platform while complying with Oracle Database SE2 licensing guidelines.”

A Closer Look at the Latest AIX Release

If you didn’t catch it ahead of the holidays, Chris Gibson wrote something for this very website. Be sure to read his thoughts on some AIX 7.3 TL1 enhancements that caught his eye.

More Thoughts on the Cloud Journey

So last month, I drove cross-country. More recently, my wife and I were traveling again, this time by air.

It was fascinating to look at the map as our plane covered the miles in a fraction of the time the car had taken. Going from Amarillo to Albuquerque took many hours by car, but in flight that portion of the trip felt like only a matter of minutes. Sure, that’s to be expected when you’re moving at 500 MPH, but it’s still impressive.

Don’t forget boats: Years ago, I took an automobile ferry across Lake Michigan. That was unique, and of course it was also a time-saver compared to the alternative of driving around that Great Lake. I’ve yet to make long journeys by train. Of course, passenger rail travel is more common in Europe and elsewhere than here in the U.S.

So which method of travel is best? Of course, that depends. As much as I hike and camp, I can’t see me piling my belongings into an RV and seeing the country for months at a time, but plenty of people do just that. But how you choose to get there comes down to your preferences and circumstances. What kind of cargo are you hauling? How much time do you have? How much do you want to see new scenery and experience new places? How much do you hate lines at airport security checkpoints?

Last month I compared traveling options to businesses exploring their options with cloud. I still like the analogy. Should you keep your systems on-premises? Should you move only certain workloads? Should you contract with a managed services provider? It depends. These are questions only you and your organization can answer. You may find others’ experiences with cloud instructive, but ultimately, no one knows your business like you. It’s up to you to determine the best way to your destination.

After Helping Clients Navigate Challenging Times, the AIX World Moves Forward

By Neil Tardy

Originally published by TechChannel May 24, 2021

IBM’s Carl Burnett, Nigel Griffiths, Petra Bührer, Maria Ward and Ann Detjen on how AIX has helped clients navigate the pandemic, and what’s next in the AIX roadmap

From the beginning, IBM AIX was touted for its reliability. That this characteristic has endured was never more evident than over the past year, as IBM clients relied upon the AIX OS and Power Systems hardware to manage their operations through a challenging time.

IBM’s Carl Burnett, Distinguished Engineer, Power Systems software development, believes that the AIX/Power platform proved itself in 2020.

“The pandemic stressed customer systems more than ever before. AIX and Power held up to those impacts and stresses, and IBM was there to help,” he says. “We had customers whose transactions grew significantly—some were pushed to their limits—but we helped them grow their system capacity.”

Within IBM’s Lab Services division, this assistance included the establishment of a no-charge services program to assist Power Systems clients that were adversely impacted by the pandemic. Lab Services also enhanced its Power to Cloud rewards program so it could quickly fulfill requests from clients worldwide.

“Many AIX customers, particularly in healthcare, banking and retail, saw their application performance demands spike dramatically as customers shifted to online transactions,” says IBM’s Ann Detjen, worldwide Power Systems practice leader, IBM Lab Services for Systems. “One client said the acceleration of online shopping was like ‘having Black Friday every day.’ ”

Lab Services itself transitioned to online delivery of its services last year. “Like everyone in the industry, the switch from on-site to online services delivery posed some initial challenges as far as providing VPN access, for example,” Detjen adds. “But today Lab Services still delivers most of our engagements online. While we expect significant recovery in on-site services, in many cases, our clients are expressing a preference for a new, hybrid model of services delivery.”

Nigel Griffiths, advanced technology support, IBM Power Systems U.K., acknowledged that over the last year, clients learned just how important it is to maintain their systems and keep current on not just hardware and software, but system and adapter firmware as well as commonly deployed tools like the Virtual Input/Output Server (VIOS). “It’s very hard telling customers that the fixes that would have addressed their issue have been in every updated version over the past few years,” he says. Luckily, IBM was able to assist many clients as they embraced dynamic LPAR and other key capabilities to grow their system resources and keep their systems up to date.

Elaborating upon these points, Petra Bührer, global sales leader, IBM Power Systems Germany, notes that numerous clients took advantage of IBM’s Capacity on Demand (COD) and Power Enterprise Pools 2.0 offerings. COD allows clients to activate processor cores and memory units when additional resources are needed on their system. Enterprise Pools provides enhanced multi-system resource sharing and by-the-minute consumption of on-premises compute resources to clients that deploy and manage a private cloud infrastructure.

“AIX on Power clients were able to drive outstanding system utilizations—greater than 90% in some cases. They were also able to manage their environments remotely,” Bührer says. “The reliability and resiliency of their systems gave our clients peace of mind.”

Burnett adds: “During the pandemic our clients have gained an even greater appreciation for AIX and Power. I believe we’ve strengthened our customer loyalty.”

Looking Ahead: The AIX Roadmap 

While helping AIX clients keep their businesses running during the pandemic, IBM also continued to deliver new and enhanced functionality. AIX 7.2 Technology Level (TL) 5 came out in November 2020. Key features include multi-queue N_Port ID Virtualization (NPIV) and support for logical volume encryption.

What Burnett describes as NPIV scaling is the result of IBM transforming the control channel that existed between system LPARs and VIOS into a multi-lane architecture. He explains that this change, coupled with path reduction work and improvements in the serialization model, enables a much higher degree of I/O concurrency throughout the NPIV stack. Logical volume encryption provides an easier-to-administer, AIX-based model for encrypting data at rest by leveraging the in-core crypto hardware acceleration available in POWER8/9. Multiple key management and storage options are included. The IBM Power Systems Community blog has an introduction to this topic. See IBM’s AIX documentation for an in-depth look.

Though these developments are significant for AIX clients, we now know that much more lies ahead. 2021 began with recognition of AIX’s 35th anniversary and the acknowledgment that IBM’s AIX roadmap now extends into the 2030s. Then in February, IBM released a statement of direction noting its intent to deliver a new version of the OS, AIX 7.3, by the end of this year.

POWER10 and AIX 7.3 Capabilities 

With still months to go before an official announcement, IBM is unsurprisingly tight-lipped about specifics. However, key pieces of the puzzle are already in place.

“While we’re not yet at a point where we can start discussing it publicly, I would say that 7.3 is the AIX version where we look forward to exploiting and optimizing the features of the next generation of the Power platform based on the POWER10 chip that was introduced last year at the Hot Chips conference,” Burnett says.

IBM has also said it intends to maximize interoperability for clients as they transition to the new OS and new processors. POWER10 is expected to support each of the AIX 7 releases: 7.3, 7.2 and 7.1, while AIX 7.3 is expected to support POWER8, POWER9 and POWER10.

Of course, POWER10 and AIX 7.3 are the most enticing combination. “There are interesting new instructions in POWER10 we are looking to exploit,” Burnett says. “We’re also doing work in the area of continuous computing, including a major new capability around our Live Update technology on the platform. There are interesting capabilities around security and hybrid cloud. We’re exploring deep integration opportunities to get the most from POWER10 with AIX 7.3.”

New and enhanced capabilities in the area of machine learning and inferencing are also expected with the coming release. IBM’s Maria Ward, AIX Product Manager, Power Systems, says that AIX developers and product management teams are working with their IBM counterparts in AI development to determine how to further exploit AI workloads. 

“AIX runs some of the world’s most mission-critical, data-intensive workloads for businesses around the world. Businesses today are looking for ways to infuse AI directly into their workstream and leverage their vast amounts of data to unlock valuable insights using AI,” notes Ward. “By colocating transactional workloads running on AIX with existing AI models, businesses can gain valuable insights from the data they already have residing on their Power Systems without having to move the data off of the system.”

Finally, IBM plans to offer an open beta to allow registered clients to test AIX 7.3 on their existing Power Systems hardware. Ward says that the initial 7.3 images will be available about 8-10 weeks prior to the GA. Details about registration are forthcoming. For more information or to request access, send an email to mrward@us.ibm.com.  

‘Among the Most Reliable and Highly Available OSes in the Industry’ 

At its inception in 1986, AIX was a UNIX workstation-based OS—a desk-side system, in essence—that ran on a single uni-processor computer. Today, AIX runs on enterprise grade hardware that powers critical workloads for major businesses and organizations around the world.

That’s hardly news, but that remarkable transformation is still worth noting. And IBM and AIX clients can look forward to more.

“AIX is among the most reliable and highly available operating systems in the industry,” says Burnett. “Our clients trust AIX and know that it will keep their workloads up and running, and available.”

Mobile Applications for IBM i Enable Business Continuity

By Neil Tardy

Originally published by TechChannel February 3, 2020

Mobile development expert Alan Seiden and IBMer Brad Bentley explain how the OS is well-suited for the task.

Cellphones and mobile devices, long a part of our daily lives, have also become essential in the world of business. That’s certainly the case for Alan Seiden.

As head of Seiden Group, a consulting firm specializing in application development and modernization on the IBM i platform, Seiden manages his business from his phone. When traveling to client sites and user events, he checks and updates his schedule using Trello task tracking software. And with Slack, the popular collaborative messaging platform, Seiden and his employees all keep in constant contact.

“The expense tracker in Slack allows you to create receipts by taking photos from your phone. The features our phones have—starting with the camera—allow you to capture data on the spot,” he says.

“We’re all so busy; our to-do lists are too big already,” Seiden adds. “That’s what makes the mobile experience so important. We can handle business even when we’re not at our desks.”

No longer the exclusive domain of retailers, the mobile experience—or mobility, if you prefer—now extends far beyond everyday consumerism. And increasingly, IBM i clients recognize that you don’t need to have something to sell online to benefit from mobile technologies.

Designing the User Experience

Consider a company executive responsible for approving purchase orders. Not that long ago, this person needed a desktop computer or perhaps an emulator to review and sign off on these documents. If she was traveling, the paperwork had to wait. But now, with a phone and access to the company’s mobile application—which provides automated notifications for each new purchase order—there’s no paper. Approvals can be given with the swipe of a thumb, or even registered through a voice application. Everything can be handled from the shop floor or from the road.

Or imagine a supply company that develops its own mobile application. The app connects to the company’s IBM i system, which tracks the status of all orders. With information about deliveries—as well as pending orders and back orders—at his fingertips, the shop manager knows when he needs to schedule contractors to unload incoming cargo.

“So much of the time, we think of mobility as external, as this interface that only our customers interact with,” says Brad Bentley, senior management consultant with IBM Systems Lab Services. “But now we’re seeing people putting their business intelligence, their analytics, into mobility applications for internal use. There’s real impact on utilizations for these producers.”

Seiden believes that understanding the user base is the first step in developing web apps. Identifying your primary users—be they customers or employees—and how they’ll access your application—be it from mobile devices, desktops or some mix of both—is essential to designing a suitable user experience.

Naturally, a company website will be accessed from all devices: phones, tablets and desktops. “Responsive” is the term that describes optimizing the online experience for all users. “It’s very important that all web apps should, at minimum, be responsive,” says Seiden. Responsive apps change their appearance according to the form factor, so they look good and function well on all common device sizes.

The other common option is to develop a native application. These applications are optimized specifically for mobile devices and downloadable from online app stores. Some native apps are widely used (e.g., Google Maps), but some companies develop and maintain their own native apps. “These apps take advantage of all of the features of the phone,” Seiden says. “That takes specialized knowledge, but a native application can potentially provide an enhanced experience.”

Something to Build on or Learn From

While the process of developing and running mobile applications is certainly involved, IBM i businesses can take heart in a couple of factors. One is that you’ve kind of been here before. Over the past several years, many enterprises have launched and completed web enablement projects. Web enablement can be viewed as a precursor to going mobile—and, depending on your experience, it can be something to build on or learn from.

“Mobility is an extension of web enablement,” Bentley says. “And the beauty of it is now, if you re-evaluate and re-architect things the right way, you can post both of these solutions through a single instance and have them perform very well from your server infrastructure.”

The other factor is that IBM i is very well-suited for this task. The platform is designed to connect data and business logic by supporting the latest languages, frameworks and patterns. Through the universal language of web services, and specifically, the implementation of RESTful service layers, mobile applications can interact with IBM i securely while providing scalability. Open-source tools like Bootstrap, which is designed to enhance responsiveness in mobile apps, offer even greater flexibility.

This seems like an appropriate point to note that the “i” does indeed stand for “integration.” “As a mobile developer, I don’t really care what’s on the back end or what I’m talking to. I care about how I talk to it,” says Bentley. “With IBM i hosting the data and serving as the master repository for everything, you can do everything you need to do. IBM i is an amazing platform for mobile applications.” 

How IBM i Clients Helped Shape Db2 Mirror for i

By Neil Tardy

Originally published by TechChannel January 2, 2020

Kris Whitney, product architect for Db2 Mirror for i, shares how the continuous availability database solution was created in response to client feedback.

For years, IBM i clients have been asking the company to develop new products or provide enhanced capabilities. There’s even a defined process for these asks with the Request for Enhancement (RFE) program. Partnering with clients to deliver advanced solutions that address complex business problems has been a hallmark of the platform since its inception.

In 2015, Kris Whitney and other members of the IBM i high availability (HA) team were approached by some long-time clients about their need for a continuous availability solution. However, this was no ordinary request. 

It started with a few individual queries, but eventually grew into a chorus of voices. “These were some very important clients, and they all had similar requirements,” says Whitney. “They had calculated their business loss, even for planned maintenance, and concluded that they couldn’t be down at any point.” The message was that simple: These clients literally didn’t have a moment to spare for downtime.

“They wanted a commitment,” he adds. “We’d heard that conversation for a long time about 24-7 and getting to active-active, but now they needed continuous availability to remain on the platform.”

IBM’s commitment resulted in IBM Db2 Mirror for i, a licensed program product (LPP) unveiled in April 2019. The announcement was the culmination of four years of internal development and extensive collaboration with IBM i clients spanning the finance, insurance, pharmaceutical and retail industries worldwide.

This solution’s reception has been impressive. For various reasons—starting with their satisfaction in the overall quality of the products themselves—IBM i clients are generally reluctant to move to new releases or investigate new offerings. But interest in Db2 Mirror has been noticeable since GA in June.

“Yes, our adoption rate is traditionally slow, but we have people who are more motivated than I’ve seen in the past,” says Whitney. “They’re getting into their test environments and trying to get there faster.”

‘Can I Have it in 6 Months?’

In contrast, just getting there was the priority during development. As the product architect for Db2 Mirror, Whitney said that everyone involved understood the challenges of this undertaking.

“We knew full well going in that this was going to be one of those projects that don’t come around very often. As far as the size and the amount of impacts throughout the entire OS, it touched a lot of places and a lot of developers had to be involved,” he says.

While the project followed established IBM design guidelines and principles, participating clients had more opportunities to provide feedback and the ability to do so earlier in the process. That meant more demos and more reviews. Even during the initial stages, a group of stakeholders reviewed the requirements and weighed in on high-level design ideas. An early alpha program, set up more than a year prior to release, allowed participants to access a cloud-like environment to get hands-on with the solution in progress.

“Naturally, there was joking around: ‘Can I have it in six months?’ ” Whitney recalls. “But our clients understood the enormity of this and how it would change the way they wanted to run their IBM i business.”

At this point, clients were running their businesses with PowerHA SystemMirror for i. As a solution that provides storage-based clustering capabilities for HA environments, PowerHA high availability “gets you almost all the way” to continuous availability, as Whitney puts it. Of course, PowerHA high availability continues to fulfill the needs of a substantial number of IBM i clients.

While Db2 Mirror has some broad similarities to PowerHA (both are LPPs that are tightly integrated with the OS, and they share some clustering code), they’re distinct solutions. To accomplish what it does, Db2 Mirror takes advantage of advancements in connectivity technology. Whitney and Mark Anderson, formerly the chief Db2 for i database architect (he retired in 2019), came up with a design for replicating databases using low-latency, high-speed RDMA over Converged Ethernet (RoCE, pronounced “rocky”) adapters.

“That was really the turning point, when RoCE technology became a little bit more mainstream,” says Whitney. “These adapters became more common.”

Using RoCE to connect two IBM i instances—which are most likely but not necessarily two physical servers—the two databases are presented as a single database that’s accessible from both systems simultaneously. This is known as an active-active configuration.

From the beginning this capability was specifically sought by some clients.

“In parts of Asia, there are banking requirements. Whether these are formal laws, legislation that will be enacted, or merely suggestions from governments, it’s a little fuzzy. But these clients were very interested in achieving an active-active solution,” says Whitney. “We heard it from multiple companies in that region of the world, so we knew it was critical.”

Another key component of Db2 Mirror is its UI, which is browser-based and runs directly on IBM i. It’s stylistically similar to the well-regarded GUIs present in IBM Storwize solutions.

Whitney and GUI architect Tim Rowe arrived at the design choice. “Early on, I challenged Tim to do something different from what we’ve done in the past. It’s very straightforward. After some basic training, people get it,” he says. “It’s probably one of our best GUI products to date.”

A Surprising Response 

For the most part, Whitney says the development process for Db2 Mirror was typical. Everyone put in long nights and worked their share of weekends, but the solution came together roughly on schedule and largely as planned.

Since the release, however, some unexpected discoveries have been made. Fortunately, they all fall into the category of pleasant surprises. For starters, Db2 Mirror has proven to be a fit for environments of all sizes. It makes sense when you think of an industry like finance. Small community banks are subject to the same regulations as the big ones, so their availability requirements are just as stringent.

“It’s applicable to a broader set of clients than we ever expected,” says Whitney. “We have—I’ll call it one of the smallest IBM i clients in the world. I mean it’s the smallest machine you can buy with a single core. They’re very interested.”

Clients have also found an additional use for Db2 Mirror: to move data. Say you’re looking to replicate a set of libraries that your users frequently query. By replicating to a second system and maintaining the synchronous connection, that set of users (and the system overhead they generate) can be directed to the new node, while the original can be dedicated to production workloads.

“That’s completely valid,” says Whitney, “even if it wasn’t what we originally intended to solve.”

To better serve the entire client base, IBM is working to provide the capability to utilize internal storage with Db2 Mirror. In October, IBM issued a statement of direction pointing to support for direct attached internal storage in the near future. The company had previously made known its intent to support non-volatile memory express (NVMe) storage adapters.

As you can see, the work of bringing continuous availability to the IBM i platform is ongoing.

“I don’t see revolutionary changes to the product, but there are little things we can do to make it easier for applications to take advantage of active-active as well as enhancements we can make to the types of objects we replicate,” Whitney says. “We’re continuing a strong roadmap for a few years—and a lot of that, again, is based on what we’re hearing from people using the product.” 

Open-Source Technology is Fulfilling Business Needs of IBM i Clients

By Neil Tardy

Originally published by TechChannel September 3, 2019

Jesse Gorzinski, business architect for open source on IBM i, explains the rise of the open-source revolution.

If you ask Jesse Gorzinski how he knows that open source on IBM i is a big deal, he simply pulls up his planner. 

“Even two years ago it was an occasional thing; I’d go talk about open source at two, maybe three conferences a year,” he says. “By May of this year, I was booked up for all of 2019 in terms of travel and engagement.”

As the business architect for open source on IBM i, Gorzinski has searched company-wide for helping hands because it’s no longer possible for him or members of his core team to respond to every inquiry or accept every request to speak at a user group meeting or IBM client briefing.

“We have folks who aren’t historically IBM i people that we’ve brought in to cover some of these topics because so many IBM i clients are approaching us and wanting to learn about this stuff.

“It’s fun,” he adds, “but it’s also overwhelming.”

That’s quite an admission for Gorzinski, who obviously loves to talk about open source and everything it enables IBM i clients to do.

“Open source has always been something I was very excited and passionate about,” says Gorzinski, who initially worked in finance on the then iSeries platform before joining IBM in 2006. “It’s really thrilling right now, bringing this technology to IBM i.”

Of course, open-source options have provided added value for IBM i clients for years. The Apache web server has been around about as long as there’s been a public internet. The Zend PHP server debuted back in 2006, while support for the Node.js JavaScript runtime and Python programming language arrived in 2014 and 2015, respectively.

“I started getting directly involved with our new open-source mission in 2014,” says Gorzinski. “That was when we were starting to look at Node.js and even then, we recognized that this was something we needed to be ahead of the game on.”

While all of these innovations were significant, the interest and excitement around open source on IBM i still feels new. Over the past 12-18 months, it seems everything has come together to make open source a vehicle for doing innovative, but also solidly business-critical, tasks.

Lining Up the Pieces

Within IBM i development, 2018 was an eventful year. With a Jenkins-based continuous integration, continuous delivery and continuous deployment (CI/CD) system in place, the number of open-source packages available for IBM i soared. Gorzinski estimates nearly 300 such packages are currently available. Availability of the newest long-term support (LTS) version of Node.js was one of numerous changes and enhancements in that area.

“When we switched to the new toolset, a lot of solutions just started working,” says Gorzinski. “Things that were previously unfathomable quickly became standard, or easily doable.”

Perhaps most significant was the transition to RPM, which represented a radical departure from the initial foray into IBM-delivered open source, OPS. Created in 2014, this licensed program (5733-OPS, Open Source for IBM i) was a well-intentioned initiative to provide IBM i users with a familiar operating environment. While that had its benefits, Gorzinski admits there were also problems for both the user base and IBM’s development team.

“We made it familiar to those who knew classic IBM i administration,” he says, “but a lot of people had difficulty finding OPS on our downloads site. And because we were doing PTF installs, sometimes these downloads would pull in dependencies upon dependencies of other PTFs. You could spend an afternoon downloading gigabytes of PTFs if your system wasn’t up to date.”

With a lack of automation on the development side, providing timely updates was its own challenge for IBM.

“It was a lot of work for us as well, and that didn’t serve anybody,” says Gorzinski. “RPM allows us to do automated testing and automated deployment. Now we can meet client needs much more efficiently.”

With its roots in Red Hat Linux (at its inception, “RPM” stood for “Red Hat Package Manager,” though it’s now a recursive acronym for “RPM Package Manager”), RPM has essentially replaced OPS, which is deprecated and set for end of life in December. While IBM relies on RPM to deliver open-source software, it, of course, offers benefits to IBM i clients as well.

At its core, RPM is designed to simplify the process of downloading and deploying open-source software. RPM packages are typically installed with the YUM package manager. YUM, which can be traced to another Linux variant known as Yellow Dog Linux, offers an extra administrative layer to handle things like dependency management. If, for example, you want to install the popular open-source web server NGINX, YUM will issue a prompt, letting you know that NGINX requires SSL libraries to function. Then you can download those libraries.

“There’s a reason we invested in YUM; it provides the simplification,” Gorzinski says. “It automatically figures out all those dependencies. Fetch them all and it will install them in the right order in just minutes or even seconds. YUM allows administrators to do point and click or single command installs. Say you want to install Active MQ: just click ‘install.’ If you want to check for updates, if you want to update packages, it’s a single command. That’s the value of YUM.”
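To make the "right order" idea concrete, here is a minimal Python sketch of dependency ordering. It is not how YUM is implemented: the package names and the dependency map are hypothetical, and the standard-library graphlib module stands in for a real resolver.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: package -> packages it needs first.
# These names are illustrative only, not real repository metadata.
DEPENDENCIES = {
    "nginx": {"openssl-libs", "pcre"},
    "openssl-libs": {"zlib"},
    "pcre": set(),
    "zlib": set(),
}


def install_order(target, dependencies):
    """Return target plus everything it needs, prerequisites first."""
    needed = {}
    stack = [target]
    # Walk the map to collect only the packages reachable from the target.
    while stack:
        pkg = stack.pop()
        if pkg in needed:
            continue
        needed[pkg] = set(dependencies.get(pkg, ()))
        stack.extend(needed[pkg])
    # static_order() yields each package after all of its prerequisites.
    return list(TopologicalSorter(needed).static_order())


if __name__ == "__main__":
    print(install_order("nginx", DEPENDENCIES))
```

The target always comes last in the returned list, because every other package in it is a direct or indirect prerequisite.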

The Open-Source on IBM i Revolution Begins

Gorzinski believes that with RPM and YUM as the centerpieces, IBM’s technology is now aligned with the open-source world. “Folks are coming to us after looking at their two-year roadmap or whatever planning cycle they use and saying: ‘In two years, we want to get here.’ And I’m able to say: ‘We already have that technology for you.’ We’re seeing more engagements where we have what people are looking for, even before they realize they need it,” he says. 

At the same time, Gorzinski can scan download numbers and other internal data and find himself surprised by the rate of adoption and the breadth of things IBM i clients are already accomplishing with open source.

“In some cases, we were delivering technology thinking it’d power production workloads a couple of years from now, only to find customers deploying it just a few months later,” he says. “I was expecting a slow but steady rise in open-source software being deployed on the platform, but this is an explosion.”

There’s a reason Gorzinski calls it the open-source revolution. The changes clients are seeing are groundbreaking—and even chaotic. With that in mind, he has advice for IBM i clients, whether you’re just getting started or are already open source-savvy.

Understand everything you can do with RPMs. For instance, classic IBM i admins may not know about RPM repositories. This software, downloadable from IBM, provides even greater flexibility. As a self-contained directory, an RPM repository can be cloned or backed up, or hosted internally. By creating snapshots, distinct repositories can serve development, QA and production environments. In addition, a repository can be used to sync 20 or more LPARs to the same versions of open-source software.
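Because a repository is described here as a self-contained directory, taking a snapshot can be as simple as copying that directory under a new label. The sketch below assumes exactly that and nothing more; the function name, paths and labels are hypothetical, and a real setup would also need each environment's package manager configuration pointed at the right copy.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot_repo(source, snapshots_root, label):
    """Copy the repository directory to snapshots_root/label.

    shutil.copytree raises FileExistsError if the label already exists,
    so an existing snapshot is never silently overwritten.
    """
    destination = Path(snapshots_root) / label
    shutil.copytree(source, destination)
    return destination


if __name__ == "__main__":
    # Demo against a throwaway directory with a placeholder package file.
    with tempfile.TemporaryDirectory() as work:
        repo = Path(work) / "repo"
        repo.mkdir()
        (repo / "example-1.0.rpm").write_bytes(b"placeholder")
        for env in ("dev", "qa", "prod"):
            print(snapshot_repo(repo, Path(work) / "snapshots", env))
```

Development, QA and production can then each be served from their own labeled copy, and promoting a tested set of packages becomes a matter of taking a new snapshot.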

Understand that open source is enterprise-ready—and IBM is ready to provide support. Even as open source goes mainstream, a misconception persists about what Gorzinski calls the enterprise-readiness of open-source software. However, open-source solutions are secure, functional and reliable. On top of that, support is available from IBM’s Technology Support Services (TSS) division.

“A lot of people think open source is a use-at-your-own-risk-type of thing, but you can pay for top-notch support from IBM,” says Gorzinski. “The TSS offering is there to help you through your whole software development lifecycle.”

Recognize that open source meets practical business needs. Start with something that remains an ongoing need for many enterprises: application modernization. Open source and IBM i can get you moving forward. Node.js is uniquely suited to provide web-based interfaces while handling the latest JavaScript frameworks, all while delivering fast response times and reducing memory usage.

Or think of the industries that have long relied on the IBM i: manufacturing, retail, healthcare, logistics. It turns out that open source has much to offer in these areas. Gorzinski points to Internet of Things (IoT) software, which basically describes an array of internet-connected physical devices and appliances such as thermostats and heart monitors.

“IBM i now, thanks to open source, has IoT capabilities,” he says. “In the industries where IBM i has a strong footprint, the use of IoT will continue to grow, and IBM i will be the main storage point for all of that data.”

Embracing Open Source

The open-source revolution is indeed upon us, and the revolution is ongoing. Early this year, IBM demoed the IBM Q System One, which is designed to be the first quantum computing system for commercial use. With IBM i capable of integrating with the IBM Q Experience to run computations or emulate the technology, quantum’s potential uses in areas ranging from chemistry to finance are already capturing the imaginations of the user base.

“One day, I tweeted about doing quantum on IBM i, and that tweet generated more direct messages than anything I’ve ever tweeted,” Gorzinski says. “People are coming out of the woodwork and saying ‘Yes! We want to hear more.’”

Cloud, IBM Support, AIX History and More

Originally published by TechChannel December 6, 2022

Rob McNelly on cloud technology, expanded IBM Support options, IBM requiring encryption for fix downloads and AIX history

Recently my wife and I traveled cross-country via car. We spent time with family in Virginia and then drove north through Maryland, Pennsylvania and New York state. After celebrating the Thanksgiving holiday, we circled back through Ohio, Indiana, Illinois, Missouri, Oklahoma, Texas and New Mexico before returning home to Arizona.

Over two weeks, we covered thousands of miles, encountering varying weather conditions—fog, rain, snow—and stages of road construction. I guess spending that much time on the interstates made me a bit loopy, but I maintain that “Uranus Fudge Factory” is a funny name, and their billboards made me legitimately curious about the quality of their fudge.

I used Google Maps to navigate the journey. Initially we avoided the tolls, which took us to some interesting back roads and out of the way places. But as we got closer to home, we opted for the most direct route, never mind the cost. Toward the end I know I was just ready to be done with the drive.

Still, I’m glad we didn’t fly. For the most part I enjoyed the journey and the changing scenery. Driving naturally gave me time away from my phone, my email and work in general. It gave me a chance to think—though for better or worse, that eventually led me back to work considerations.

One thing I kept thinking of was companies’ adoption of cloud technology. It’s like driving or flying, at least in the sense that there’s more than one way to get there. It could be a quick trip, like a plane flight, or a longer road. 

Some companies want everything to remain on-premises. An onsite data center and IT staff is what they know; it’s what’s worked. In a sense, their journey may never begin. Other companies are taking action by enabling rapid provisioning or live partition mobility. Theirs might only be a short journey, but they’re moving forward.

Then there are true hybrid cloud environments, where software and/or infrastructure may be deployed as a service. Other companies no longer manage their own servers. Everything has gone to the cloud. Or they may still own the servers but pay service providers to manage them day to day.

In short, we all have different needs and priorities. Every company has unique goals. So of course every cloud journey is different.

A Change From IBM Support 

Just as there are pros and cons to driving versus flying, there are pros and cons to the many available options for supporting your business. One size may not fit all, but it is good to have options.

On that note, IBM Support recently announced that it is providing customers with greater flexibility to add and invite team members to work on support cases:

“Coming soon, users on your accounts will have new flexibility over who they can invite as team members on support cases. New team members do not need to be associated with the account or have an IBM ID.

As an administrator, you will no longer be required to take the time to approve new team members. While this change gives users more freedom to add team members, you will still receive email notifications when someone on your accounts adds a team member. You can still use the User Administration page to remove users you don’t agree to from your cases, just as before.

Here’s how it works
When a user creates a case, they simply select Add A Team Member from the Add Team Members menu, then enter the new team member’s name and email address, and then click Search.

If the person they enter does not already have an IBM ID, the person opening the case clicks OK to confirm that they’d like to invite the new team member to the case.

The people added to cases with this method will have full access to the case they are added to. They will be able to add comments and edit the case. They will not have access to any other cases.”

Support Ending for Unencrypted Fix Downloads

IBM recently announced that it will no longer support unencrypted fix downloads. This change is set to go into effect on February 15, 2023.

From IBM Support:

“Many leaders of the internet industry—such as World Wide Web Consortium (W3C), Internet Engineering Task Force (IETF) and Internet Architecture Board (IAB)—state that universal use of encryption is the way forward for the internet traffic.

Therefore web platforms should be designed to actively prefer secure communication so data is protected in transit and at rest.

Aligned with this industry direction, IBM IT Security Standards have been enforcing the use of encrypted communications.

Therefore IBM Electronic Fix Distribution (EFD)/IBM Electronic Customer Care (ECC)/IBM Fix Central systems will stop supporting unencrypted fix downloads on February 15, 2023, to improve user privacy and security and enforce compliance with IBM IT Security Standards. Shortly after that date, unencrypted fix download flows will NOT be allowed anymore.

Recommended Action
Ensure as soon as possible that the connections made to IBM fix download servers are secured.

The IBM fix downloads servers currently support HTTPS, SFTP, FTPS and DDPS secure download protocols. Ensure you have secure protocols in place and update any procedures, including existing jobs, scripts or tools to use the secure fix download protocols.

You can use secure protocols now, but they will be the only options when they are enforced on the deadline specified above.

If you are still using an unencrypted fix download protocol (such as HTTP, plain FTP or DDP) then make sure you switch to an encrypted one (such as HTTPS, SFTP, FTPS or DDPS).”
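If you have download URLs scattered through jobs and scripts, a quick check like the following Python sketch can flag the ones that still use an unencrypted scheme. The scheme lists come from the notice above; the sample URLs are made up (example.com), and DDP/DDPS are left out because I am not certain how they appear in scripts.

```python
from urllib.parse import urlsplit

# Protocol names taken from the IBM notice quoted above.
SECURE_SCHEMES = {"https", "sftp", "ftps"}
INSECURE_SCHEMES = {"http", "ftp"}


def classify(url):
    """Return 'secure', 'insecure' or 'unknown' based on the URL scheme."""
    scheme = urlsplit(url).scheme.lower()
    if scheme in SECURE_SCHEMES:
        return "secure"
    if scheme in INSECURE_SCHEMES:
        return "insecure"
    return "unknown"


def find_insecure(urls):
    """Return the URLs that would stop working once encryption is enforced."""
    return [url for url in urls if classify(url) == "insecure"]


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    sample = [
        "http://fixes.example.com/aix/7300-01",
        "https://fixes.example.com/aix/7300-01",
        "ftp://fixes.example.com/vios/3.1.4",
        "sftp://fixes.example.com/vios/3.1.4",
    ]
    for url in find_insecure(sample):
        print("needs updating:", url)
```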

A Brief History of AIX 

Nigel Griffiths recently updated this document that features release/end of support dates and notes for each AIX version from 3.1 through 7.3. The doc also includes a brief AIX timeline as well as tips and other “often forgotten” details.

Rob McNelly on ‘Lights-Out Data Center’ Issues, the Latest IBM Announcements and More

Originally published by TechChannel October 19, 2022

Have you heard of “lights-out data centers?” Rob McNelly explains what they are along with their pitfalls, and explores the latest IBM announcements here.

Though I’ve heard about lights-out data centers for years, I truly don’t envision a future where humans will never set foot on the raised floor. We’ll always need hands and eyes in the room to perform tasks on our systems.

Case in point: Recently I serviced a customer that had three of their four fibre network ports inactive on their network switch. For example, ent0 showed that we were disconnected:

                 entstat -d ent0
                  Link Status: Down
                  Media Speed Selected: Autonegotiation
                  Media Speed Running: Unknown

Meanwhile, ent1 was fine. We were connected:

                  entstat -d ent1
                  Link Status: Up
                  Media Speed Selected: Autonegotiation
                  Media Speed Running: 1000 Mbps Full Duplex

The OS was not seeing the expected connection on the network ports. This was verified by the network team, who could also see from the switch side that the ports they expected to have connections were in fact not connected.
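If you need to run this kind of check across many adapters, the relevant field is easy to pull out with a few lines of Python. This is only a sketch that assumes the output looks like the trimmed excerpts above; real entstat output is much longer, and field names can differ between adapter types.

```python
# Trimmed excerpts from the entstat output shown above.
ENT0_OUTPUT = """\
Link Status: Down
Media Speed Selected: Autonegotiation
Media Speed Running: Unknown
"""

ENT1_OUTPUT = """\
Link Status: Up
Media Speed Selected: Autonegotiation
Media Speed Running: 1000 Mbps Full Duplex
"""


def link_status(entstat_output):
    """Return the text after 'Link Status:', or None if the field is absent."""
    for line in entstat_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Link Status":
            return value.strip()
    return None


if __name__ == "__main__":
    for name, output in (("ent0", ENT0_OUTPUT), ("ent1", ENT1_OUTPUT)):
        print(name, link_status(output))
```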

During this call, we learned that this was an ongoing issue. The client initially tried replacing small form factor pluggables (SFPs) on the switch. They physically verified that the expected ports from the server were plugged into the correct ports on the switch.

We had the luxury of being able to swap cables, and lo and behold, the problem followed the cables. What was the working port prior to the swap ceased to function, and vice versa. Was it a bad cable? Nope. We tried a different cable and had the same issue.

At that point it was lights on, figuratively, in our heads, because we realized that the TX and RX polarity was reversed on the cable. So we asked the onsite team to correct the cable and plug it back into the switch. As expected, the port fired right up. All three of the non-working ports had this issue, so we did two more reversals of the TX and RX.

Working remotely, we could adjust the switch and logical configurations on the server all we wanted, but it wouldn’t have accomplished anything. To fix this problem, we needed people on site.

On that note, be sure to show your appreciation for the CEs and any data center personnel you work with. If you yourself are a “hands and eyes” person, then I thank you, too. Remember, without the professionals who work directly on these systems and associated equipment, none of us are doing much of anything.

First Impressions of IBM October 11 Announcements 

As IBM Champion Alan Fulton notes, there is indeed much to unpack with IBM’s October 11 announcements, starting with updates to PowerVM, vHMC, PowerVC, AIX and IBM i.

There’s plenty that caught my eye as well:

  • Support for AIX install and boot from iSCSI attached storage. Consult the IBM System Storage Interoperation Center (SSIC) for additional information on supported configurations.
  • Increased NFS file size limit beyond 32 TB. See the AIX 7.3 TL1 Release Notes for the new supported limits.
  • AIX tar command support for pax archive format. Previously, AIX tar supported only the ustar archive format. A pax-format archive can now be created using the “--format=pax” option of the AIX tar command.
  • Improvements in AIX dump performance through hardware-accelerated compression on IBM POWER9 and Power10 systems
  • JFS2 filesystem now allows dynamic switching between inline and outline logging
  • The chpv command now provides an option to force offline a poorly performing PV in a mirrored pair
  • Ability to perform VIOS updates using vHMC

That’s just an abbreviated list. Read the announcement letter for yourself.
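On the tar item in that list: one practical difference between the two archive formats is path length. The sketch below uses Python's standard tarfile module, not the AIX tar command, so it says nothing about AIX's own behavior. It only shows that a long single-component file name, which the ustar header cannot hold, fits in a pax archive. The file name and payload are made up.

```python
import io
import tarfile

# A single path component longer than the 100-byte ustar name field.
LONG_NAME = "a" * 150 + ".txt"


def build_archive(name, payload, fmt):
    """Create an in-memory tar archive holding one file, in the given format."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def stored_names(archive_bytes):
    """List the member names recorded in an in-memory archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        return tar.getnames()


if __name__ == "__main__":
    pax_bytes = build_archive(LONG_NAME, b"hello", tarfile.PAX_FORMAT)
    print("pax stored the full name:", stored_names(pax_bytes) == [LONG_NAME])
    try:
        build_archive(LONG_NAME, b"hello", tarfile.USTAR_FORMAT)
    except ValueError as err:
        print("ustar refused:", err)
```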

Two More Tales From the Field

Another customer had a VIO server that was spitting out unexpected vfchost errors in the error log, so they opened a ticket. IBM Support pointed them to this information:

Problem: QLogic or Cavium IBM fibre adapters in Power Systems register as targets in the SAN fabric instead of initiators.
Symptom: Any one of a number of symptoms might be present, including:

1. NPIV client LPARs do not discover devices during scans by system firmware (SMS or ioinfo)
2. AIX hosts can fail to boot
3. The AIX error log might be filled with extraneous errors when SAN monitoring software runs, or even when cfgmgr runs, as the adapter attempts to log in to itself. The errors decode as name server query failures. Detailed SENSE DATA indicates the failure was against the physical adapter’s own N_Port ID.

The doc also notes that, on a VIO server, these steps must be performed from the oem_setup_env prompt. Then reboot the VIO server. In our case, we followed the directions and the errors went away. So keep this in mind should you run across something similar.

And one final story: Yet another customer who uses SSH to access a server wanted to determine why the sessions would end when left open for some time. There are a few ways to deal with this problem, but start here, and scroll down for this response:

ssh -o TCPKeepAlive=yes -o ServerAliveCountMax=20 -o ServerAliveInterval=15 my-user-name@my-server-domain-name-here

My customer tried that solution from their command line and it worked, and eventually they made the change in their SSH client configuration file (/etc/ssh/ssh_config) so they no longer needed to enter those options on the command line.
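For a sense of what those numbers mean: ServerAliveInterval and ServerAliveCountMax are ssh client options, and together they set roughly how long the client waits on an unresponsive server before giving up (interval multiplied by count). The Python sketch below just pulls the -o options out of the command above and does that arithmetic; the helper names are my own.

```python
import shlex

# The command from the response above, placeholders included.
COMMAND = (
    "ssh -o TCPKeepAlive=yes -o ServerAliveCountMax=20 "
    "-o ServerAliveInterval=15 my-user-name@my-server-domain-name-here"
)


def ssh_options(command):
    """Collect the key=value pairs passed with -o into a dict."""
    tokens = shlex.split(command)
    options = {}
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "-o" and "=" in value:
            key, _, val = value.partition("=")
            options[key] = val
    return options


def seconds_until_disconnect(options):
    """Approximate time before the client drops an unresponsive server."""
    interval = int(options["ServerAliveInterval"])
    count = int(options["ServerAliveCountMax"])
    return interval * count


if __name__ == "__main__":
    opts = ssh_options(COMMAND)
    print(opts)
    print(seconds_until_disconnect(opts), "seconds")
```

With the values shown, the client sends a keepalive after every 15 seconds without data from the server and tolerates 20 unanswered messages, so a dead connection is dropped after about five minutes while a healthy idle one stays open.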

IBM Support Forums Have Moved 

As of October 11, the IBM Support forums are now part of the IBM Community website:

“To improve your support experience and provide you with the best possible access to people who know and understand your products, the Support Forums join the IBM Community on October 11, 2022.

Simply visit the IBM Community website to search for and continue discussing your products there. The IBM Support site will provide a link to the IBM Community for some time after the move, but we recommend all users update bookmarks pointing to the Support site’s Forums as soon as possible. To make this transition as easy as possible, the Forums will remain on the Support site until November 11, but you will only be able to read questions and responses there, not post new ones.”

While I’m on the topic of IBM Support, be sure to check out the Complete Guide To Must Gather LPM Data Collection on PowerVC, VIO, AIX, Linux and IBM i.

The Latest From Nigel and Chris 

If you know AIX, you know Nigel Griffiths. And if you know Nigel, you know nmon is his baby. He recently sent out this information.

If you have thousands of nmon files, you can drown in the high volumes of data. You need to extract the key facts to allow planning your server consolidation, migrating to newer servers or Power Live Partition Mobility. These nsum shell scripts do the hard work of building a CSV file to import into a spreadsheet for further work.

Also via Twitter, Chris Gibson points to this document on migrating workloads to Power9 and Power10 systems. And on his personal blog, he explains how to find the hardware uptime for Power Systems frames.

“We needed to find the hardware uptime for a particular POWER9 frame to determine how close we were to hitting this known POWER9 firmware bug.

We found we could calculate this by looking at the “Progress Indicator History” view in ASMI and looking at the date associated with RUNTIME/STANDBY and working out how many days had passed since the frame was powered up.”

In that post he links to a C program that calculates the uptime, so check it out.
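The arithmetic itself is simple once you have the power-on date. Here is a minimal Python sketch; it is not Chris's C program, and the dates are invented for illustration. You would substitute the date you read from ASMI.

```python
from datetime import date


def days_powered_on(power_on, today):
    """Return the number of whole days between power-on and today."""
    if today < power_on:
        raise ValueError("power-on date is in the future")
    return (today - power_on).days


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    powered_on = date(2022, 1, 1)
    checked_on = date(2022, 10, 19)
    print(days_powered_on(powered_on, checked_on), "days")
```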

Open Firmware Macros Tips, and a Closer Look at Power10 Servers

Originally published by TechChannel September 2, 2022

Rob McNelly highlights tips on open firmware macros, recaps Nigel Griffiths’ closer look at Power10 Servers, explores AIX security bulletins and more

There are several ways to boot LPARs on IBM Power Systems servers. For example, you can boot to your OS of choice—AIX, IBM i, or Linux—or to SMS or open firmware. While most of us are familiar with all of these options, if you’re not, check out this IBM Support document. I’ll highlight the section on booting to the open firmware prompt:

Booting the LPAR to the Open Firmware (OK) prompt
1. Make sure the LPAR is not activated. If it is hung, go to the HMC GUI, and under Systems Management -> Servers -> server name, check the box next to the LPAR. Then, from the arrow on the right side of the LPAR name, activate the menu and select “Operations -> shutdown”.
2. Wait until the LPAR is in a “Not Activated” state, and the Reference Code shows all zeros.
3. Mouse click the arrows to the right of the LPAR name again to display the menu. Click “Operations -> Activate -> Profile”
4. From the Activate Logical Partition window, click the “Advanced” button.
5. From the Activate Logical Partition – Advanced button window, select “Open Firmware OK Prompt” from the “Boot Mode” drop down list.

Chris Engel has information about the firmware chain of trust for PowerVM, while Colleen Stoufer explains the restricted OF prompt.

Access privileges at the Open Firmware prompt must be restricted to preserve the secure boot status of the PFW code loaded on the partition. The new Restricted OF Prompt will limit input and execution to a defined set of macros. The option to access the Restricted OF Prompt will be displayed on the splash screen. You will see that the “8 = Open Firmware Prompt” option has been replaced with the “9 = Restricted Open Firmware Prompt” option.

Once you’ve booted to the restricted OF prompt, you can do many things, as you’ll discover by consulting the Restricted OF Prompt User Guide. (Hat tip: Chris Gibson on Twitter.) This is one handy doc, because it allows you to learn more about your system before you’ve even loaded an actual OS:

“When a partition is booted in Firmware Secure Boot mode, normal access to the Open Firmware prompt is disabled. A new Restricted OF prompt is provided that allows access to a set of macros that will allow customers to continue to perform many of the functions that they rely upon without jeopardizing the security of the firmware. This prompt will not allow execution of any commands that are not part of this documented set of macros. Existing tools will require updates to function in this new environment.”

The set of macros defined in this section will execute at the Restricted OF prompt. The inputs and expected output are described. In most cases, the macros will output a string which indicates the success or failure of the execution of the macro.

** The set of macros is subject to change. New macros will be added as required, and defunct macros will be removed. This document will be updated when changes are made.

In the following sections that describe the macros, the following will apply:
• All input at the Restricted OF prompt is case insensitive.
• Square brackets are used to enclose any optional items.
• All parameters for the macros MUST be on the same input line as the macro name.
• The default language for the macros is English (no translations).

Try running macro_help. This macro displays the list of the currently supported macros and the required parameters.

0 > macro_help
>> BOOT_FROM_SEQ [ADDPARMS]
 where:
 [ADDPARMS] = additional parameters such as debug flags
 (if not provided, boot from devices in boot-device list without debug)
>> DISPLAY_BOOTSEQ
>> SET_DEFAULT_BOOTSEQ
>> BOOT_FROM_DEVICE <DEVTYPE> <ADDRESS> [BOOTPARMS]
 where:
 DEVTYPE = { #disk | #cd/dvd | #san | #network | #tape }
 ADDRESS = location-code
 [BOOTPARMS] = specific to the device type
 (see Restricted OF Prompt User Guide for examples)
>> DISPLAY_BOOT_DEVICES <DEVTYPE>
 where:
 DEVTYPE = { #disk | #cd/dvd | #san | #network | #tape | #all }
>> DISPLAY_MAC_ADDRESS <ADDRESS>
 where:
 ADDRESS = location-code
>> DISPLAY_NETWORK_PATHNAME <ADDRESS>
 where:
 ADDRESS = location-code
>> PING <ADDRESS> [PINGPARMS]
 where:
 ADDRESS = location-code
 [PINGPARMS] = required and optional parameters
 (see Restricted OF Prompt User Guide for examples)
>> DISPLAY_PCI_PROPS [ADDRESS]
 where:
 [ADDRESS] = location-code
 (if not provided, all adapter PCI properties will be displayed)
>> DISPLAY_ADAPTER_WWPN [ADDRESS]
 where:
 [ADDRESS] = location-code
 (if not provided, all adapter WWPNs will be displayed)
>> LUN_ATTACHED? <ADDRESS>
 where:
 ADDRESS = location-code

Usage examples are also included, so be sure to check it out. Download the PDF.

More Power10 Deep Dives

During a recent Power Systems Virtual User Group webcast, Nigel Griffiths covered the Power10 scale-out and midrange systems. Listen to the replay and download the presentation.

Nigel has another webcast coming up this week. He takes a closer look at the Power10 S1024 scale-out system. In addition to displaying tons of photos, Nigel will discuss the new service processor and eBMC. He’ll also touch on the mandatory HMC v10, which is used on Power10 hardware, including the new S1022/S1024 and E1050 and the previously released E1080.

Best Practices Doc Updated

The latest version of the Power Implementation Quality Standard document is now available. I’ve mentioned Fredrik Lundholm’s work before, most recently here. Most slides have a headline in red noting the most recent update. This being Version 2.6, the most recent updates are labeled “Upd 2.6.” And be sure to check out the notes at the bottom of many of the slides; you’ll find good information there as well.

Recent AIX Security Bulletins

I assume you are keeping up with AIX security bulletins. Three recently caught my eye:

  • AIX is vulnerable to arbitrary command execution (CVE-2022-1292 and CVE-2022-2068) or an attacker may obtain sensitive information (CVE-2022-2097) due to OpenSSL
  • IBM PowerVM VIOS could allow a remote attacker to tamper with system configuration or cause a denial of service (CVE-2022-35643)
  • AIX is affected by multiple vulnerabilities in Python

Be sure to keep your systems patched, and sign up for these security bulletins if you are not already receiving them.

Power10 and the Impact of Upgrading

Originally published by TechChannel August 2, 2022

The availability of the full Power10 server portfolio is another reminder of the noticeable benefits of moving forward with technology

Recently, after settling into a new home, I upgraded my cable modem and router. I was fine with the internet service I had, but for the same cost, I was able to go from ~100Mbps to ~1000Mbps download speeds. Obviously those numbers may vary depending on your connection (wired versus wireless), your equipment, and the interference from other wireless devices in your area, but rest assured, upgrading is worth it. Browsing is faster. Moving large files around is faster. The overall experience is better, and the difference is very noticeable.

The same thing happened earlier this year when I upgraded my phone. I had no complaints about the old phone, but the improved performance was strikingly apparent once I made the change.

That’s the basic story of technology and its impact. The devices we use day to day serve us very well, and may do so for years on end, but once you finally make the move, you’re immediately reminded how quickly technology advances. As I’ve noted previously, it really sneaks up on you: 

“So maybe you need a new laptop. Or maybe it’s time to look at your infrastructure and consider upgrading your hardware and software. Again, I was fine chugging along as I was, but now with the snappier performance, I realize what I was missing out on. I also can’t count the number of times I’ve seen IBM Power Systems users react similarly to the performance of new hardware. Yes, everyone has budgets, but what is it worth to your organization when response times are better, jobs complete faster and more work is being done with fewer cores?”

Upgrading to Power10

With that, let’s talk Power10. The entire server portfolio, both enterprise and scale-out, is now available. As you would expect, the performance has significantly improved yet again. Do you think you’d notice the difference if your servers were upgraded?

Here’s the IBM press release. In addition, I received this IBM email that provides basic details:

Today we are announcing the rest of the Power10 server family: the scale-out Power S1014, Power S1022, Power S1024, and the midrange Power E1050. These new systems, built around the Power10 processor, have twice the cores and memory bandwidth of the previous generation to bring high-end advantages to the entire Power10 product line.

IBM Power S1014

The IBM Power S1014 is a 1-socket, 4U Power10-based server for IBM AIX, IBM i, and Linux workloads, and has 57% more performance per core and 20% more memory bandwidth compared to the Power S914. Reduce physical data center footprints and lower your cooling and electrical costs by doing more with less. The Power S1014 is ideal for IBM i, Oracle Database SE, AI inferencing, and more. Learn more about the Power S1014 and tour a virtual demo.
 
IBM Power S1022

The IBM Power S1022 is a 2-socket, 2U server for IBM AIX, IBM i, and Linux workloads, and has 37% more performance per core and 2.4X more memory bandwidth compared to the Power S922. It is available in either a single chip model, the Power S1022s, or dual chip model, the Power S1022. The Power S1022 is ideal for distributed computing, DevOps, dev/test environments, and more. Learn more about the Power S1022 and tour a virtual demo.

IBM Power S1024

The IBM Power S1024 is a 2-socket, 4U server for IBM AIX, IBM i, and Linux workloads and has 33% more performance per core and 2.4X more memory bandwidth compared to the Power S924. With double the number of cores compared to Power9-based servers, you can lower your cooling and electrical costs by consolidating more workloads onto fewer servers. Users can also further optimize and reduce costs by taking advantage of flexible consumption models and only paying for what they use. Learn more about the Power S1024 and tour a virtual demo.

IBM Power E1050

The IBM Power E1050 is a 4-socket rack server optimized for data-intensive applications and hybrid cloud deployments. Enhanced security with transparent memory encryption and production-ready AI at the point of data enable faster insights for clients. Scaling is consistent across private and public cloud environments with flexible consumption options for users. Learn more about the Power E1050 and tour a virtual demo here.

If you want to go further in-depth, here are some things I’ve come across, starting with this blog post by Ken King, General Manager, IBM Power:

“We introduced the IBM Power10 high-end server last September and we are continuing to broaden the portfolio with a major launch of four new systems today: the scale-out Power S1014, Power S1022 and Power S1024, along with a midrange server, the Power E1050. These new systems, built around the Power10 processor, have twice the cores and memory bandwidth of the previous generation to bring high-end advantages to the entire Power10 product line.”

For other perspectives on the new servers, start with this very website, which has an announcement feature and Lifetime Champion Jaqui Lynch’s analysis.

Nigel Griffiths’s summary includes links to IBM Redbooks, rperf numbers and announcement letters. He also has videos.

For viewpoints outside of IBM, there’s Charles King and The Register.

Finally, bookmark this page on IBM Support. You can select the different models and easily determine which operating systems are supported, along with the minimum and recommended levels you should be running. Power10 is just the latest addition; you’ll find information on previous Power processors. I check this info all the time.

The new servers can provide quite a performance boost, particularly if you’re running on pre-POWER9 hardware. When you do choose to migrate, be prepared to be pleasantly surprised.

IBM Virtual Event Highlights the Latest on Power10 and News on AIX Toolbox Open-Source Support

Originally published by TechChannel July 1, 2022

Rob McNelly looks at an upcoming IBM virtual event, an IBM Support update for AIX Toolbox open-source software and more

IBM is hosting a Power10 virtual event on Thursday, July 14, at 11 a.m. EDT. You can register here and you’ll find the event description below:

“Businesses are facing continued uncertainties. Resource availability. Volatility in demand and in costs. Now, more than ever, this requires flexible and reliable technology to deliver.

Join us for a virtual event to learn more about the latest from IBM Power. Hear from clients and IBM experts about how Power helps create digital advantage with hybrid cloud infrastructure to modernize, automate and secure your business with class-leading reliability.”

Just given the number of IBM Power Systems executives who are participating (click the link to see the list), it’s safe to assume that this is a big deal. Having taken part in a few NDA sessions regarding the upcoming announcement, I can’t get into specifics, but I think you’ll be very interested to learn about what’s ahead.

An Update on IBM Support for AIX Toolbox Open-Source Software

IBM recently announced that it is providing remote assistance with selected community-supported open-source products available through the AIX Toolbox repository. This is significant since support for AIX Toolbox open-source software wasn’t previously available through IBM AIX support cases.

The service is formally known as IBM Support for Community Open-Software for AIX Toolbox. According to the announcement letter, IBM provides Level 1, Level 2, and Level 3 support for these AIX Toolbox packages: curl, db, dnf, dnf-plugins-core, expat, gettext, glib2, gnupg2, json-c, krb5, ncurses, openldap, python3, readline, sed, sqlite, texinfo, xz, zchunk, zlib, and zstd.

In addition, IBM will:

1. Provide remote assistance to the client for all covered products, through telephone from IBM’s support center, or electronically, in response to requests pertaining to the following:

  • Basic, short duration installation, usage, and configuration questions for open-source packages running on AIX downloaded from AIX Toolbox only and code-related questions
  • Diagnostic information review to assist in isolation of a problem cause; for example, assistance interpreting traces and dumps for installation and code-related problems
  • For known defects, provide available corrective service information and information about obtaining a corrective fix from the AIX Toolbox

2. Assist the client in determining the cause of the problem and provide a corrective information-fix if it is available from the open-source community, AIX Toolbox or IBM Support

3. Report the defect to the open-source community and inform the client of known actions taken and the availability of the corrective information-fix if a new defect (referenced or without known correction) is identified

4. Provide the client with guidance on how to obtain patches from the AIX Toolbox, or maintenance updates or refreshes (collectively known as fixes) directly from the AIX Toolbox. IBM may provide workarounds as temporary solutions to the client or to the open-source community so the open-source community can create permanent patches. There is no guarantee that the workarounds will be accepted by the open-source community as part of the main code branch.

5. IBM will provide the security vulnerability fix, and when the security vulnerability fix is available in the community, will also provide support for all associated dependencies from the supported packages from the SPL downloaded from the AIX Toolbox repository.

The announcement letter goes into considerable detail, so take the time to read the whole thing.

What Constitutes a Senior AIX Admin?

Newsflash: Experienced IT people are hard to find. OK, you already knew that people leave their jobs. They take new jobs, they retire, what have you. Replacing workers is the focus of this IBM Community thread from March.

Here’s a sampling:

“I’ve worked with admins which blew me away with the depth of their knowledge and their curiosity to know more! I always love it when they can teach me something too. I hang onto their names and network with them!

I’ve also had the opposite issue where system administrators were really just application administrators or other non-technical roles, where the system fell under their responsibility.

There are all types, but both will be listed in the same job title on their resume. […]

I think that a Senior is the one who has real experience and has not been doing just one thing, he is the one who is learning something all the time, he is the one who is trying to make things happen by doing something new.

But I think the most important characteristic of a Senior is that they are teachers, guides and mentors to others.

Of course, their experience allows them to avoid asking for help for little things, because of that they provide better results and in shorter times than others.”

It’s an interesting discussion, and a reminder that interesting conversations are happening on the Community pages.

The Importance of Keeping Current

This document explains why clients should keep their application, OS, and firmware up to date: 

IBM Remote Technical Support will recommend and encourage IBM clients to upgrade for these reasons:

  • Adhere to best practices for keeping software levels in-sync
  • Address known issues related to the reported problem
  • Address undetermined or possible future issues using out-of-sync levels
  • Take advantage of new enhancements or features 

IBM, like any vendor, is not and will not be capable of testing and verifying all combinations of applications, operating systems, and firmware levels. When considering all versions of all components, there are an infinite number of combinations to fully test all versions and combinations.

Remembering Gareth Coates

I want to take a moment to remember Gareth Coates, who passed away on June 15. I referenced his IBM blog and cited information he’d provided over the years. I learned quite a few tricks and tips from him.

Read and share memories and condolences here.

Life is short, and none of us get out alive. Let those you love know that you love them.

Hiking Update 

For those keeping score, I successfully completed my North to South Rim Grand Canyon hike back in May, a few days after the North Rim opened for the season. It was hot, it was long, but it was beautiful and I cannot wait to do it again. Two days prior I fractured my toe, but luckily, ibuprofen was sufficient to allow me to get through it.

Just a reminder: Hiking in the Canyon during the summer is no joke. Sadly, a young woman recently lost her life three miles below the rim. 

Tips on the HMC, VIOS, LPAR and More

Originally published by TechChannel April 20, 2022

IBM Champion Rob McNelly on HMC, VIOS, LPAR, AIX 7.2 and 7.3 tips, and more

It’s always nice to hear from readers. Even better is when I can confirm that my information is helping techies in the real world. My colleague Eric Hopkins recently put my invscout script to work for one of his clients. Here’s his story:

I have a customer who has smartly blocked outbound traffic to the internet from their NIM server, so I used details from your article to get a quick report of firmware status which would have taken me hours to gather and review manually.

Here’s how I did it in short order.
1. Downloaded ftp://ftp.software.ibm.com/software/server/firmware/catalog.mic to my workstation and used scp to get it to NIM
2. Used scp to distribute to a pile of VIOS. NOTE: I have SSH key exchange set up for ease of use, so I have $WCOLL defined as a host list file which contains #comments so I needed to exclude them.

            –START
            for i in `cat $WCOLL | grep -v \#`
            do
              scp /export/microcode/catalog.mic $i:/var/adm/invscout/
            done
            –END

3. Used dsh to run invscout
            # dsh /usr/sbin/invscout
4. Gathered the data from each host at /var/adm/invscout/<hostname>.mup using a fancy dsh command to exclude dshbak headers but leave invscout delimiters, but first ran a dsh command to list the files to make sure I wasn’t going to grab any old garbage
            # dsh "ls -l /var/adm/invscout/*.mup"
            # dsh "cat /var/adm/invscout/*.mup" | dshbak | egrep -v 'HOST:|\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-' > VIOSmup

5. Used scp to get /export/microcode/VIOSmup to my workstation
6. Visited https://esupport.ibm.com/customercare/mds/mds and clicked on “Manual Upload” link.
7. Browsed my workstation for VIOSmup, selected radio button “Off line HTML” and clicked Big Blue’s “Upload file” button.

Moments later I was presented with the results which I printed to a PDF document for safe keeping.
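One detail in step 2 is worth a closer look: grep -v \# drops every line that contains a # anywhere, so a host entry with a trailing comment would be skipped too. Here is a minimal sketch of a more forgiving filter. The function name hosts_from_list is my own invention, and I am assuming the host list holds one host per line, optionally followed by a comment.

```shell
# hosts_from_list FILE
# Print one host name per line from FILE, ignoring full-line comments,
# trailing "# ..." comments, and blank lines.
hosts_from_list() {
    sed 's/#.*//' "$1" | awk 'NF { print $1 }'
}

# Example use (paths copied from the steps above):
#   for host in $(hosts_from_list "$WCOLL"); do
#       scp /export/microcode/catalog.mic "$host":/var/adm/invscout/
#   done
```

The sed step strips everything from the first # onward, and awk keeps the first word of any line that still has content, so indented entries and entries with trailing comments both survive.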

Keep in mind, the IBM Support URL has changed yet again. Here’s the new one. If your script contains the old URL you will get an error.

This page provides additional detail. 

AIX 7.1: The Build Environment for AIX Toolbox Packages

As much as I like to believe we are all on current supported versions of AIX, I know that this is not actually the case. In case you missed it:

We (AIX Toolbox team) are moving to AIX 7.1 as the build environment for AIX toolbox packages. Because of this, new versions of packages built on AIX 7.1 may not work on AIX 6.1 or earlier AIX releases.

Today our build environment for most of the packages is AIX 6.1. AIX 6.1 is already out of support, and we cannot continue this version forever. As a start we will just move the packages those we are upgrading to new version as per regular updates. The updates due to security vulnerabilities will continue to be built on AIX 6.1 (if vulnerable package was built on AIX 6.1) but in future we will move to AIX 7.1 for those packages as well.

This should not impact anyone who is using AIX 7.1 or later releases of AIX. The packages those are already built on AIX 6.1 will continue to be available on AIX toolbox repository. The AIX release name is part of package name where package was built. For example, python3-3.7.12-1.aix6.1.ppc.rpm is built on AIX 6.1 and can be installed on AIX 6.1 or later AIX releases whereas python3.9-3.9.6-1.aix7.1.ppc.rpm is built on AIX 7.1 and can be installed on AIX 7.1 or later releases.

This will also benefit in taking advantages of new features available in AIX 7.1 releases.
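The naming convention described in that announcement lends itself to a quick check before you copy a package to an older LPAR. What follows is a minimal sketch, not anything the AIX Toolbox team ships. The function names rpm_build_level and installable_on are my own, and the sketch assumes the file name carries an aixN.N tag in the same position as the two examples quoted above.

```shell
# rpm_build_level FILE
# Print the AIX level embedded in an AIX Toolbox RPM file name,
# e.g. "6.1" for python3-3.7.12-1.aix6.1.ppc.rpm.
# Prints nothing if the name has no aixN.N tag.
rpm_build_level() {
    printf '%s\n' "$1" |
        sed -n 's/.*\.aix\([0-9][0-9]*\.[0-9][0-9]*\)\..*\.rpm$/\1/p'
}

# installable_on FILE HOST_LEVEL
# Return 0 if HOST_LEVEL (for example the output of oslevel, such as
# 7.3.0.0) is at or above the level the package was built on,
# 1 if it is below, and 2 if the file name carries no aixN.N tag.
installable_on() {
    build=$(rpm_build_level "$1")
    [ -n "$build" ] || return 2
    awk -v b="$build" -v h="$2" 'BEGIN {
        split(b, B, "."); split(h, H, ".")
        ok = (H[1] + 0 > B[1] + 0) || (H[1] + 0 == B[1] + 0 && H[2] + 0 >= B[2] + 0)
        exit (ok ? 0 : 1)
    }'
}
```

On an AIX host you could call it as installable_on python3.9-3.9.6-1.aix7.1.ppc.rpm "$(oslevel)" and branch on the return code. Only the first two fields of the host level are compared, which matches how the announcement talks about releases.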

HMC Database Connectivity Fix and AIX 7.2 and 7.3 Installation Information

  • Fellow IBM Power Champion Jaqui Lynch offers another helpful tip: “Unable to connect to Database on HMC virtual networks after updating from 3.1.0. to 3.1.3.” If you get this error when you click on virtual networks on the HMC after the update, the fix is here.
  • Along those lines, if you subscribe to IBM email notifications, you might have noticed more tips and tricks being shared there as well. Among those to recently hit my inbox: disabling unused ports on FC adapters; resolving LED 2700 when booting an AIX LPAR or PowerVM; and getting around an LPAR not booting due to error AA00E158 with HMC. While it’s nice to bookmark the actual links, information sometimes disappears from the internet. But for me, given how my memory works, if I run into one of these issues, having read about it previously usually rings a bell: I’ll at least recall seeing something. So when that happens, I can typically find what I’m looking for, even if it’s not the doc I saw initially.
  • Finally, I came across some installation tips for AIX 7.2 and 7.3.

From Techie Twitter

Be sure to read the last paragraph in that GitHub link to fully understand how the whole thing works. It almost seems too good to be true.

A New Home for the RFE Community

Originally published by TechChannel March 18, 2022

IBM Champion Rob McNelly on the IBM Power Systems Ideas portal, a FLRT change and his latest outdoor adventure

I’ve written about the Request for Enhancement (RFE) process in the past, but recently, Brian Veale notified us of the new IBM Power Systems Ideas portal. This is a site where you can present your ideas about improving products and services and interact with IBM developers.

You’ll need an IBM ID to submit suggestions, comment and vote. But even without signing into the portal, you can view ideas by topic (AIX, HMC, VIOS, etc.) and filter through ideas that others have pitched based on their status (under review, future consideration, planned for future release, etc.) and the response to them (trending, popular). There are literally hundreds of ideas. Chances are you’ll like some more than others. By looking through existing ideas, you can avoid making the mistake of submitting an idea which has already been proposed. 

For what it’s worth, the most popular ideas at the time of this writing were:

  • Dynamically editable min/max fields when an LPAR gets shut down
  • Maintenance capabilities for vNIC failover
  • HMC performance dashboard enhancements 

Some ideas that are not under consideration include more recent versions of WPARs, Ethernet drivers for running AIX 5.3 on POWER8 and redirecting commands on the HMC restricted shell. As is noted on the portal splash page: “Some ideas can be implemented at IBM, while others may not fit within the development plans for the product. In either case, the team will let you know as soon as possible. In some cases, we may be able to find alternatives for ideas which cannot be implemented in a reasonable time.”

Surely you have thoughts on what would make your favorite OS even better. Undoubtedly you have ideas about enhancements that could make your work days easier. Most admins I know have a ton of hard-earned experience, and they have no problem letting others know what they think. Of course not every idea magically becomes a reality, but I appreciate IBM’s willingness to allow their client base this degree of input. So if you’re not familiar with RFEs, check out the portal and learn. And if you’re already active in the community, the easiest way to contribute is to spread the word. Let others know about the new portal. Everyone who uses IBM solutions should participate in the RFE community.

An Important Update for FLRT Users

IBM Support recently noted that the Fix Level Recommendation Tool (FLRT) is now cloud based. This means that user scripts that call FLRT must be updated to this new URL.

The change is explained in detail here. Note: “The functioning of this site will not be changed as a result of the move, but there will be a new URL. Initially there will be a redirect to the new site, which will be removed within a few months. There is no firm date set for the removal of the redirect.” 

While many of you are likely already aware of this, please help spread the word. If you’re unfamiliar with FLRT, read my primer.

A Grand Canyon Update

For those who are interested, I’ll make it official: I survived my first Grand Canyon rim to rim hike.

I was prepared, which made things somewhat easy. I discovered a nearby mountain has a “short” trail—a mile up, a mile down—that’s generally steeper and rockier than the Canyon trails. So I spent the past several Saturdays there, hiking up and back 5-8 times each day. That equates to 10-16 miles, with significant elevation changes.

This is a pretty good video of what to expect when attempting the rim to rim hike, which starts and ends at different points of the Canyon’s South Rim. For the record, I needed 10.5 hours to complete the hike, while the guy in the video got through it in just over 7.5 hours. Still, I’m pretty happy with how things went, and I’m looking forward to my next challenge. In May I’ll hike from the North to the South Rim. That’s a longer hike, and of course it will be much hotter by then, so wish me luck.

IBM Notifications, AIX Migration Tips and More

Originally published by TechChannel February 14, 2022

IBM Champion Rob McNelly on AIX migration tips and tricks, new Power10 performance information, IBM’s updated firmware release schedule and more

I’m regularly reminded of the benefits of IBM’s My Notifications email service. Yes, we all have too much traffic in our inboxes, but My Notifications is a flexible offering that provides worthwhile news and information. For instance, I subscribe to get alerts on recently released fixes. I also receive useful tips on topics like SAS disk array management:

How do I create, list and manage SAS Disk Array?
– General Help:
# sissasraidmgr -h
– Viewing the Disk Array Configuration:
# sissasraidmgr -Ll controller_name -j1
– Preparing Disks for Use in SAS Disk Arrays:
# sissasraidmgr -P -z disk_list
(For example, # sissasraidmgr -P -z hdisk1 hdisk2 pdisk3 pdisk4)
– Changing pdisks to hdisks:
# sissasraidmgr -U -z pdisk_list
– Creating a SAS Disk Array:
# sissasraidmgr -C -r raid_level -s stripe_size -z pdisk_list
– Deleting a SAS Disk Array:
# sissasraidmgr -D -l controller_name -d array_name
– Adding Disks to an Existing Disk Array:
# sissasraidmgr -A -l array_name -z pdisk_list
– Creating Hot Spare Disks:
# sissasraidmgr -H -z pdisk_list
– Deleting Hot Spare Disks:
# sissasraidmgr -I -z pdisk_list
– Displaying Rechargeable Battery Information:
# sissasraidmgr -M -o0 -l adapter_name
– Forcing a Rechargeable Battery Error:
# sissasraidmgr -M -o1 -l adapter_name
– Recovering from Disk Failures:
# sissasraidmgr -R -z pdisk_list
– Viewing the SAS device resource locations:
# sissasraidmgr -Z -o0 -j3 -l adapter_name
– Viewing the SAS device resource information:
# sissasraidmgr -Z -o1 -j3 -l adapter_name
– Viewing the SAS path information for the attached device:
# sissasraidmgr -T -o1 -j3 -l device_name
– Viewing the SAS path information graphically for the attached device:
# sissasraidmgr -T -o0 -j3 -l device_name

On a somewhat-related note, if you’re seeking AIX-specific news and information, Built On Power, a provider of solutions and services for Power Systems environments, is aggregating news and links on AIX and other topics. 

DNF and AIX 7.3 Migrations 

Chris Gibson has been tweeting about DNF, the package manager for AIX Toolbox packages, and how it is affected by AIX migrations.

There’s an IBM Power Community blog devoted to DNF as well as an IBM Support page that lists potential errors when migrating to AIX 7.3:

“The rpm.rte 7.15.1.2004 installed on AIX 7.3 delivers 64-bit rpm. The DNF on AIX 7.1 and 7.2 used 32-bit rpm. You need to run the dnf_aixtoolbox.sh script again, to update to 64-bit DNF.”

Getting NTP Running on AIX 

A client was having trouble getting network time protocol (NTP) working on AIX. They would run lssrc -s xntpd and see it was inoperative. They would try to start it, and it would immediately die.

After running through the common issues like checking the config file, they called me. We read through more documentation before following one of the suggestions: run ps -ef | grep ntp. Upon doing so, we learned that an xntpd process was already running outside of SRC control. After killing the process, we ran startsrc and the subsystem came right up.

For all of our technical know-how, sometimes diagnosing a problem requires little more than putting a fresh set of eyes on it. 
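If you want to script that check, here is a minimal sketch. The function name stray_pids is my own, and I am assuming the input is two-column "PID COMMAND" output such as ps -e -o pid,comm produces. The kill and startsrc steps stay manual on purpose, since you should look at what you are about to kill first.

```shell
# stray_pids NAME
# Read "PID COMMAND" lines on standard input and print the PID of every
# process whose command name (with any leading path removed) is NAME.
stray_pids() {
    awk -v name="$1" '{
        n = split($2, parts, "/")
        if (parts[n] == name) print $1
    }'
}

# Example use on the AIX host (run as root):
#   lssrc -s xntpd                        # SRC reports "inoperative"
#   ps -e -o pid,comm | stray_pids xntpd  # any PID printed is a leftover
#   kill <pid>                            # stop the leftover process
#   startsrc -s xntpd                     # start it again under SRC
```

Matching on the exact command name avoids the classic problem of grep finding its own process in the ps listing.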

Virtual Serial Numbers Provide Flexibility for IBM i Clients

This seems very interesting for IBM i clients:

“IBM now offers you the ability to acquire a virtual serial number and assign it to a logical partition, also known as a virtual machine (VM). IBM i software can then be ordered on or transferred to the virtual serial number instead of being tied to a physical IBM Power serial number. Having the IBM i entitlement, keys, and support entitlement on the virtual serial number provides the flexibility to move a VM to a different Power machine.”

Whenever I’d bring up using Live Partition Mobility (LPM) with my IBM i friends, they would always remind me that many of their applications and licenses are tied to a physical serial number. This announcement will go a long way toward making LPM a reality for IBM i clients.

AIX 7.3 Certified for Oracle Database 19c; New IBM Redpaper; Power10 Quick Start Guide and System Firmware Releases

In December, Oracle announced the availability of the Oracle Database Release 19c with AIX 7.3. The press release points to this link, where users can confirm the supported combinations of IBM AIX 7.3 and POWER Systems (64-bit).

On Twitter, IBM Redbooks project leader Dino Quintero noted the release of a new Redbook that describes IBM Geographic Logical Volume Manager (GLVM) for data mirroring in Cloud deployments:

“This publication is intended to help with the requirements to configure and implement GLVM for cloud configurations. This book addresses topics for IT architects, IT specialists, sellers and anyone looking to implement and manage high availability and disaster recovery in the Cloud.”

Per Chris Gibson, IBM has published additional Power10 performance information. You’ll find a number of tips and tricks in this 15-slide presentation.

IBM’s firmware release schedule was updated on Jan. 31:

“In order to plan correctly for system firmware maintenance, there is a need to know when new Releases are coming out along with when a Release is planned to go end of service pack support. Therefore, the below table was created to provide this detail. Things can come up that alters this schedule somewhat, but this is the current plan. As things change, the table is updated to reflect status.”

Grand Plans 

For those of you interested in—or at least willing to tolerate—updates on my outdoor adventures, I have news. Next month I’ll be attempting my first one-day rim-to-rim hike of the Grand Canyon. I’ll actually start and finish from the South Rim, which is technically considered a rim-to-rim hike. Then in May I’ll spend a day hiking from the North to the South rim. Wish me luck.

Readers Respond: Updating VIOS on POWER7, and a Simplified invscout Script

Originally published by TechChannel January 21, 2022

IBM Champion Rob McNelly reflects on what he’s learned from readers—from mixed results updating VIOS on POWER7 to a cleaned-up invscout script

A customer was planning to patch their VIO servers on a POWER7 machine. You may recall my mention of another customer that had success running recent code on POWER7 despite the lack of documentation.

“I mention POWER7 specifically because a client with that system was wondering about their options. If you go to FLRT and plug in 8202-E4C for the machine type model, you’ll see that 2.2.6.5 is the latest recommended version of VIOS code. For an 8202-E4D model (POWER7+), the recommendation is 3.1.2.21.

What would you do in this situation? Is there enough of a difference between POWER7 and POWER7+ that you would hesitate to deviate from the recommendations? In this case, even when told by support to stand pat, the client was brave and chose to install 3.1.2.21 on their POWER7 box, because that version of VIOS is still supported. It worked. Of course every environment is unique, so your mileage may vary. And ultimately, getting to POWER9 should be your goal if it is at all possible. But I can report that at least one client is now successfully running 3.1.2.21 on POWER7.”

In this case, the customer’s 8202-E4C was running fine on VIOS 3.1.1.25. But with FLRT showing that 3.1.2.21 was supported on an 8202-E4D, they decided to roll the dice and move to 3.1.3.14. It didn’t go well. Their NPIV mappings stopped working, and attempts to log in to the SAN switch triggered errors from the HMC lsnportlogin command. Using the chnportlogin command didn’t work either. Here’s what they saw:

wwpn_status=2,logged_in=unknown,wwpn_status_reason=invalid error number

Lacking time in their change window for additional testing, they ended up restoring from a mksysb and returning to 3.1.1.25. Everything worked again. Now the plan is to stay there until the system is retired. Naturally this setup isn’t supported by IBM, but the customer feels it’s their best option. Incidentally, the customer is running another 8202-E4C box at VIOS level 3.1.2.21 with no issues.
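If you are checking many client adapters after an update like this, it helps to pull single fields out of status lines such as the one shown above. This is a minimal sketch under a couple of assumptions: the function name nport_field is my own, the field names are simply the ones from the customer's output, and the values are assumed not to contain commas.

```shell
# nport_field KEY
# Read comma-separated key=value lines on standard input and print the
# value of every field whose name is exactly KEY.
nport_field() {
    awk -F, -v key="$1" '{
        for (i = 1; i <= NF; i++) {
            eq = index($i, "=")
            if (eq > 0 && substr($i, 1, eq - 1) == key)
                print substr($i, eq + 1)
        }
    }'
}

# Example use, with the HMC command and its arguments left to you:
#   <your lsnportlogin command> | nport_field logged_in
```

Because the key is compared exactly, asking for wwpn_status does not also return wwpn_status_reason.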

So there’s hope, but also a strong dose of reality. Yes, some of these older machines can be updated beyond the stated supported levels, but you’ll need to do some testing (and, obviously, capture a mksysb, run viosbr, etc., even before that. As I recently said on Twitter, nobody cares about backups, but they do care about restores). Realistically though, it appears we’ve about reached the limits of extending VIOS on POWER7 machines.

Still, I’d love to hear more about this topic, particularly if you’ve been able to get even later VIOS versions to work on POWER7 hardware.

Simplified invscout Script 

I greatly appreciate reader feedback, because typically I learn something. Here’s another recent reader interaction about a script that wasn’t working.

You may have seen my TechChannel video about invscout. I had linked to a script that’s supposed to automate the download of the catalog.mic file, copy it to multiple LPARs, consolidate the output and send it to IBM. However, a reader told me the script did not work as written. Fortunately, it was a quick fix. I changed a couple URLs and did other small simplifications that the reader requested.

Obviously you’ll need a machine that can access the LPARs you select to run invscout, and of course you’ll need internet access to automate the catalog.mic file download from IBM and send the file for processing. Depending on restrictions in your environment, it may take more manual steps to get this to work, but the general idea is that you start with a device—a NIM server or something similar—that has ssh keys set up to log into LPARs.

Be sure to edit the /usr/local/etc/servers file (or change the location to something you would prefer) and verify the TEMP directory you plan to use for this. You’ll also need scp, curl and wget installed.

The original script is capable of self-cleanup, so you may want to automate that by adding those sections in as well. The original also emails the output. I added a couple of comments to highlight where the IBM URLs had changed. You may prefer running the original script with my URL changes, but I also wanted to provide another, shorter version. Given the frequency with which useful tech stuff disappears from the internet, it never hurts to double up on these things.

#!/bin/ksh
# script: generate_survey.ksh
# purpose: To generate a microcode survey html file
# where is my list of servers located?
SERVERS=/usr/local/etc/servers
# what temporary folder will I use?
TEMP=/tmp/mup
# what is the invscout folder
INV=/var/adm/invscout
# what is the catalog.mic file location for invscout?
MIC=${INV}/microcode/catalog.mic
# user check
USER=`whoami`
if [ "$USER" != "root" ];
then
    echo "Only root can run this script."
    exit 1;
fi
 
# create a temporary directory
mkdir $TEMP 2>/dev/null
cd $TEMP
 
# this url had changed from the original script
wget ftp://ftp.software.ibm.com/software/server/firmware/catalog.mic
 
# move the catalog.mic file to this server's invscout directory
mv $TEMP/catalog.mic $MIC
 
# distribute this file to all other hosts
for server in `cat $SERVERS` ; do
   echo "${server}"
   scp -p $MIC $server:$MIC
done
 
# run invscout on all these hosts
# this will create a hostname.mup file
for server in `cat $SERVERS` ; do
   echo "${server}"
   ssh $server invscout
done
 
# collect the hostname.mup files
for server in `cat $SERVERS` ; do
   echo "${server}"
   scp -p $server:$INV/*.mup $TEMP
done
 
# concatenate all hostname.mup files to one file
cat ${TEMP}/*mup > ${TEMP}/muppet.$$
 
# Sometimes, the IBM website will respond with an
# "Expectation Failed" error message. Loop the curl command until
# we get valid output.
 
stop="false"
 
while [ $stop = "false" ] ; do
    # The following url had changed from the original script.
    # Keep the curl command on a single line; it is all one command.
    curl -H Expect: -F mdsData=@${TEMP}/muppet.$$ -F sendfile="Upload file" http://www14.software.ibm.com/support/customercare/mds/mds > ${TEMP}/survey.html

    # Test if we see Expectation Failed in the output
    unset mytest
    mytest=`grep "Expectation Failed" ${TEMP}/survey.html`

    if [ -z "${mytest}" ] ; then
        # no error string in the response, so stop looping
        stop="true"
    else
        # give the IBM site a moment before trying again
        sleep 10
    fi
done

At this point I moved the survey.html file onto a machine with a browser, and it gave me a very useful report. Chris Gibson has some great examples of the reports and information you can expect to see. Scroll down for specifics about adapter firmware updates. The red type really pops, and once you’ve updated your firmware and rerun the report, seeing the red disappear can be very satisfying as well.
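One caveat about the upload loop in the script: it retries forever if the site keeps answering with "Expectation Failed". Below is a minimal sketch of a bounded variant. The function name upload_with_retry and its argument order are my own, not part of the original script, and it treats an empty response as a failure as well, which is an assumption on my part.

```shell
# upload_with_retry MAX_TRIES DELAY OUTFILE COMMAND [ARGS...]
# Run COMMAND, saving its standard output in OUTFILE. Retry up to
# MAX_TRIES times, sleeping DELAY seconds between attempts, while the
# response is empty or contains "Expectation Failed".
# Returns 0 on a clean response, 1 if every attempt failed.
upload_with_retry() {
    max_tries=$1
    delay=$2
    outfile=$3
    shift 3
    try=1
    while [ "$try" -le "$max_tries" ]; do
        "$@" > "$outfile" 2>/dev/null
        # grep -c prints 0 when the error string is absent
        if [ -s "$outfile" ] && [ "$(grep -c "Expectation Failed" "$outfile")" -eq 0 ]; then
            return 0
        fi
        try=$((try + 1))
        if [ "$try" -le "$max_tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

In the script above, the call would look something like upload_with_retry 6 10 ${TEMP}/survey.html curl -H Expect: -F mdsData=@${TEMP}/muppet.$$ -F sendfile="Upload file" followed by the upload URL, with a check of the return code before you go looking for survey.html.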

Apache Log4j Vulnerability, AIX Tips and Tools, a Look at Time Drift, and More

Originally published by TechChannel January 4, 2022

IBM Champion Rob McNelly on the Log4j security vulnerability, helpful AIX tips and tools, time drifts on POWER8 and POWER9 servers, and more

System security is no laughing matter. Well, occasionally it’s a laughing matter, but I’ll get to that in a moment. On Dec. 15 IBM issued a security bulletin regarding a vulnerability in Apache Log4j that affects the Power HMC. Check the IBM PSIRT blog for the latest updates: 

“… Product teams are releasing remediations for Log4j 2.x CVE-2021-44228 as fast as possible, moving to the latest version that’s available when they are developing a fix. Where possible, the dependency on Log4j is removed entirely.

IBM is aware of additional, recently disclosed vulnerabilities in Apache Log4j, tracked under CVE-2021-45105 and CVE-2021-45046. Work continues to mitigate or remediate these vulnerabilities in products and services that already have released a remediation based on Log4j 2.15.

With so much active industry research on Log4j, mitigation and remediation recommendations will evolve. We are actively assessing the latest Log4j developments and will share updates accordingly.”

The blog also lists products that are confirmed not impacted, as well as products that have been remediated. Patch your systems as soon as possible.

Now for a little Log4j levity. In a reprise of the great jif/gif debates, admins are again arguing about pronunciation. Is it “log four jay”? “Log forge”? I’ve seen and heard both. Twitter has also weighed in on this very important matter:

            Problem: Apache Log4j
            Solution: A patchy Log4j
            *crowd booing*

Encrypting AIX Logical Volumes

On Twitter, IBM’s Soumya Menon cites documentation about encrypting logical volumes in AIX:

“Starting with IBM AIX 7.2 with Technology Level 5, the Logical Volume Manager (LVM) supports the data encryption at the logical volume (LV) level. Using this feature, you can encrypt the data at rest to protect data exposure because of lost or stolen hard disk drives or because of inappropriately decommissioned computers. The term data at rest refers to an inactive data that is stored physically in any digital form.

Each LV is encrypted with a unique key. The logical volume data is encrypted before the data is written to the physical volume. This data is decrypted when it is read from the physical volume. By default, data encryption is not enabled in logical volumes. You must enable the data encryption option at the volume group level before you enable the data encryption option at the logical volume level.

The hdcryptmgr command manages the encryption keys, data encryption, and data decryption of the logical volume.”

An Option for Decoding and Summarizing AIX I/O Error Messages 

Here’s an interesting tool for your bag of tricks:

            NAME
            summ
            PURPOSE
            Diagnostic tool for decoding and summarizing AIX I/O error messages
            SYNTAX
            summ [Flags] [Filename]
            FLAGS
            -e Include FC driver error numbers for each error.
            -p Paginate the output.
            -r Reverse the order of output.
            -s Include sequence numbers in each line’s header.
            -c I/O retry cmd_history failure time and reason
            NOTE: summ –help displays the flag options.

DESCRIPTION
The summ command is an AIX only diagnostic tool used to decode Fibre Channel and SCSI disk AIX error report entries. It is an invaluable tool that can aid in diagnosing storage array or SAN fabric-related problems providing the source of the error.

The script generates single-line error messages enhancing the readability of the AIX error report. The tool is used by IBM Support worldwide, and is considered safe to run in a production environment.

Timeout Issues When Querying VIOS Resources From the HMC 

IBM Support has an explanation:

Question: When you query the virtual resource configuration of a managed system from the HMC, the query may take a very long time to complete, or it may fail with a timeout error message.

Cause: Any time a query is performed from the HMC, a call is made to every VIOS on the managed system to get details on the configuration. On the VIOS, the vio_daemon handles this request by sending a query to the CMDB and responding to the HMC.

Several different issues can lead to a timeout, or at least a long delay, for this query. The most common error message seen is:

-> The system is currently too busy to complete the specified request. Please retry the operation at a later time. If the operation continues to fail, check the error log to see if the filesystem is full.

Answer: The error above suggests that the VIOS is suffering from a performance issue. The VIOS has to manage all the resources shared with client LPARs (including disk I/O and network communication), and it also has to deal with resource management requests from every connected HMC (and in some cases NovaLink, PowerVC or other management products).

To reduce the risk of experiencing timeout issues, use the part command to monitor VIOS resources (CPU/memory). 
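
As a rough illustration of that advice, the VIOS Performance Advisor is started with part and a monitoring interval in minutes. The wrapper below is hypothetical: it only checks the interval and prints the command to run as padmin on the VIOS. The 10 to 60 minute range for -i is the documented limit as far as I know, so treat it as an assumption and confirm it against the part man page.

```sh
# Validate a monitoring interval and print the matching part command.
# Argument: MINUTES (assumed valid range: 10 to 60).
part_command() {
    minutes=$1
    if [ "$minutes" -ge 10 ] && [ "$minutes" -le 60 ]; then
        echo "part -i $minutes"
    else
        echo "interval must be between 10 and 60 minutes" >&2
        return 1
    fi
}

part_command 30
```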

Time Drift on POWER8 and POWER9 Servers 

Also from IBM Support:

“Some clients noticed that the Power server time drifts seconds per day when compared to other systems, wall clock, or an NTP reference. They might observe it over a period of days, weeks, or months. Clients might ask questions such as:

Why is the server behaving differently?
Is there something wrong with my server?
Why must I use NTP when I never had to before?
Why has IBM not told me they changed the TOD accuracy of the server?

It does not indicate hardware needing replacement. The immediate suggestion is that clients use Simple Network Time Protocol (SNTP)/NTP as the power system Time of Day (TOD) can be expected to drift seconds per day when NTP synchronization is not used.

The only sure method to eliminate TOD drift is to deploy NTP and it is the IBM-recommended method to synchronize partition and system time and date for several generations of Power servers. Configuring and deploying NTP is outside the scope of the document but is described by OS documentation in IBM Knowledge Center. The reference section has pointers to some of the documentation.”
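
To put "seconds per day" in perspective, here is a small worked calculation. The drift rate of 2 seconds per day and the 300-second tolerance are made-up inputs for illustration, not measurements from any Power server.

```sh
# accumulated_drift SECONDS_PER_DAY DAYS: total drift in seconds.
accumulated_drift() {
    echo $(( $1 * $2 ))
}

# days_until_tolerance TOLERANCE_SECONDS SECONDS_PER_DAY:
# whole days until the drift reaches the tolerance (integer ceiling).
days_until_tolerance() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

echo "After 90 days at 2 s/day: $(accumulated_drift 2 90) seconds"
echo "Days until 300 seconds of drift at 2 s/day: $(days_until_tolerance 300 2)"
```

With these inputs an unsynchronized clock would be three minutes off after a quarter, which is why a small daily drift still matters for anything that compares timestamps across systems.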

For more on time drift, see my old post from 2009.

E1080 at a Glance, AIX 7.3 Released, a Handy Tool for Job Hunters, NIM Install Troubleshooting 

  • Nigel Griffiths posted a pair of images that summarize key features of the IBM Power Systems server model E1080. Download them from IBM Support.
  • We had the open beta; now we have the real thing. AIX 7.3 has been released. Keep in mind the requirements: an IBM POWER8, POWER9, POWER10, or later, technology-based server. Also note that POWER8 Nutanix (CS821 and CS822) does not support AIX 7.3.
  • According to his Twitter bio, Ron Gould is a systems and network administrator. He’s come up with an interesting tool to help other techies update and customize their resumes. Download the relevant files here.
  • If you’ve ever had trouble installing NIM, bookmark this page from IBM Support. 

Note: “This document is intended as a reference guide for troubleshooting common NIM LED hangs. It is not intended as a fail-safe resolution guide, however [these] steps represent the most likely causes and resolutions to various NIM hangs.”

Dynamic LPAR on HANA, Support Docs Updated and User Group Session Replays

Edit: I am looking forward to the next half marathon at Lake Powell

Originally published by TechChannel November 29, 2021

Dynamic LPAR is now available for SAP HANA databases. Get details on this, along with updated Power10 and AIX 7.3 documentation and more from IBM Champion Rob McNelly.

Dynamic LPAR is now available for SAP HANA databases.

The new capabilities allow HANA admins to adjust memory on active LPARs without shutting down partitions. This tweet has more information, including an important reminder about the potential need to run Dynamic Platform Optimizer (DPO) should you experience performance degradation.

This is just another reason to consider running SAP HANA on Power Systems servers if you aren’t already.

“With IBM POWER9 processors and IBM PowerVM, Power Systems can host up to 16 production SAP HANA databases on a single server. You can granularly allocate memory and cores across SAP HANA instances to meet precise capacity needs. Support for shared processor pools lets you dynamically distribute compute capacity across SAP environments, reducing total cost of ownership (TCO). On-demand workload scaling allows you to quickly and easily add more cores and memory to SAP HANA workloads—without configuration recertification by SAP.

“This flexible solution allows you to run both SAP and non-SAP applications—including transactional, analytical, memory-intensive, and I/O-intensive workloads—on a single platform. You can also run legacy SAP applications alongside SAP S/4HANA workloads and migrate at your own pace. Flexible resource allocation lets you support short-term, long-term, and shifting demand.”

Service and Support Best Practices Docs Updated 

Chris Gibson shared this link, which lists numerous documents that recommend service and support strategies for IBM systems and software. Recently updated AIX docs include Power10 Performance Best Practices and POWER9 Performance Best Practices. Scroll down to the hardware and firmware section and you’ll find the POWER8, POWER9 and Power10 system firmware release planned schedule, which was also just updated.

Note: These brief checklists and docs should not be confused with the much more comprehensive Power Implementation Quality Standard document that was just updated to version 2.5. The newly updated pages are clearly marked as you make your way through the presentation.

AIX 7.3: A Technical Review 

Chris also let me know about the Singapore AIX/IBM i/Linux on Power Meetup Group. The next monthly meeting is Nov. 26. He and Anthony Steel will discuss new features in AIX 7.3 (as well as 7.2). The presentation will last about an hour. A Q&A session will follow.

If you’re coming by this information after the fact, check out the meeting archives.

Power10 Presentations and Videos 

Speaking of user groups, and the latest announcement information, some interesting replays are available. The UK-based Power VUG Technical Webinar Series hosted these recent presentations:

  • Session 110: PowerVM features in Power10 systems and HMC V10, CMC, Enterprise Pools 2.0 Enhancements
  • Session 109: Green is easy, with IBM Power—how Power10 contributes to your Sustainability journey, while saving cost at the same time

And check out these Power Systems Virtual User Group webcast replays:

Cool Twitter-Adjacent Stuff

On Twitter, Kiran Tripathi cites options for capturing and exporting VM instances:

“You can capture and export an AIX or IBM i VM instance by using the Power Systems Virtual Server user interface or CLI. A VM is captured as a volume backed image. The image is stored in new volumes on the storage providers. An image can be exported to an IBM Cloud Object Storage (Cloud Object Storage) bucket. When an image is exported, the volumes of the image are copied and packaged in an Open Virtualization Appliance (OVA) file. The OVA file is compressed by using gzip before it gets uploaded to the IBM Cloud Object Storage bucket.

“When you capture and export a VM, you can choose the image catalog, COS, or both as destinations. The image catalog resides on the IBM Power storage area network (SAN). IBM’s COS is encrypted and dispersed across multiple geographic locations, and accessed over HTTP by using a REST API. This service uses the distributed storage technologies that are provided by the IBM COS System (formerly Cleversafe). You can always export your image in your image catalog to COS at a later point. You can also deploy the captured image to create a clone of the VM by using a different network configuration.”

In a post on LinkedIn, Chris Peterson explains how AIX APIs can be used to explore user password histories:

“This one is a bit specialized and not something everyone should meddle with. Of course, your code has to run as root in order to query this “database” in the first place, so all’s fair in love and business.


“I was extremely pleased by the announcement that AIX’s next release will default to a much better password hashing algorithm than crypt() that allows for longer passwords! Huzzah!


“Yet another—mostly documented—API that sets AIX apart from some of the “competition.” If you know what to look for and have an idea where to look, you’re almost guaranteed to find ways to make AIX one of the most secure and auditable platforms anywhere.”

Tell Yourself You Can, and You Will 

When I heard that a friend was going to run a half marathon, my first thought was: “There’s no way I could do that.”

But the thing is, I’m an experienced runner. I love running. Sure I’d never run that distance before, but was it really that difficult? I decided to find out. One day I extended one of my regular runs to see how far I could get. A half marathon is 13.1 miles; I ended up covering 14 miles with relative ease. So yes, I realized, I can run a half marathon—and I did just that at the recent event at Lake Powell in northern Arizona.

If you tell yourself you can’t, you’ll be right. But if you tell yourself you can, you’ll also be right. It may take time, effort and practice. You may fail once or multiple times when trying something new, but that’s OK. You’ll most certainly fail if you don’t try. It’s far better to fail at something than succeed at nothing.

A New AIX Video Series and the Latest on AIX Open Beta and Power10

Edit: did you install the open beta or did you go straight to the GA release?

Originally published by TechChannel October 14, 2021

IBM Champion Rob McNelly on his new TechChannel AIX video series, and the latest news on Power10 and AIX open beta

A quick professional note: I’ve launched a video series for TechChannel. I’ve recorded three so far, including my latest on the Technical Support Appliance (TSA). Take a moment to check them out—the videos all clock in at around four minutes—and then send me suggestions for topics you’d like to see covered going forward. Being new to video creation, it’s a challenge, but I like presenting this information in a new way. Note that each video is transcribed, so read the text if you prefer that to me talking.

The Latest on the AIX Open Beta and Other Power10 Topics

Have you been reading the IBM Power Community blog? I’ve been enjoying the content, including this entry on PowerVM features in Power10 Servers. PowerVM components such as VIOS, the HMC and server firmware are updated in the new release. The piece also covers discontinued features such as active memory sharing and workload management groups.

Here’s another good entry: “HMC V10 R1 M1010 Features and CMC & Enterprise Pools 2.0 Enhancements”

“As IBM introduced the E1080 system, the first in a generation of servers based on the Power10 processor, comes a newer version of HMC enabling management of Power10 Systems. HMC V10 comes with a lot of new features & enhancements which will enable a seamless systems management experience and addressing many Request for Enhancements (RFEs).”

Also be sure to read this update on the AIX open beta:

“On behalf of the whole AIX team, I am excited to announce that the AIX 7.3 Open Beta is now LIVE! The free AIX 7.3 Open Beta provides an early view of the new AIX 7.3 release that will be available later this year. The Open Beta is open to anyone that would like to participate, with multiple options for getting access to the early release.

· Take advantage of $1500 in PowerVS credits to install off-prem via IBM Power Virtual Server.
· Install the early code on your own Power8 or later server.
· Software Vendors that need access to a server can get special access thru the IBM Technology Zone.

This is a unique opportunity to test out AIX 7.3 ahead of its release! Participants will be able to take advantage of some of the new features and enhancements, including:

· Increased file and filesystem sizes.
· Python and Bash frameworks running directly with AIX.
· Support for dnf command for installation of open source packages from the AIX Toolbox.
· Reduced time to dynamically add processors/memory to a running LPAR.
· Reduced IPL times for multi-terabyte memory LPARs.
· pigz and zlibNX commands now transparently use NX GZIP acceleration on Power9 and Power10.
· Enhanced support for logical volume (LV) encryption to include rootvg and dump device.
· tcp protocol stack supports CUBIC.
· Additional IP Security (IPsec) enhancements.
· Create an OVA file from a mksysb using create_ova command.
· Create ISO image from mksysb_iso command.
· Integration with new IBM Open XL C/C++ and Fortran for AIX 17.1 compilers.”

I have downloaded the code and started testing in the lab. Enjoyable as this is for me, these experiences always make me think about how much has changed with the OS, even since AIX 7. Of course compared to AIX 6, which I loaded on VIOS back in 2007, it’s a whole new world now.

As you get hands-on with AIX 7.3, you should spend time in the user forums. I set up my subscription to receive emails with comments, and beta participants are already providing great feedback and information. Incidentally, the beta runs till the end of January, so you can still join the party. Register here.

Mapping VIOS and Virtual FC Adapters

Chris Gibson just posted a short but timely piece on mapping VIOS and virtual FC adapters in Power10 environments. 

The upshot is that you can now get output with the client adapter name, VIOS partition, vfc host adapter name and location, physical adapter on VIOS, etc. This makes it that much easier to map the adapter on your VIO client out to the VIO server and the actual physical adapter and port.

vpgadmin Explained

A reader was looking at /etc/passwd on the VIO server and noticed vpgadmin. What does vpgadmin do? IBM Support has the answer:

“What is the purpose of User ID “vpgadmin” on the Virtual I/O Server? Can it be removed? Can the user settings be modified?

Cause: The user ID is being flagged during a security scan?
Answer: The vpgadmin is a new user introduced in version 2.2.6.31 and above, as well as in version 3.1. Note: If the Virtual I/O Server is running at ioslevel lower than 2.2.6.31, and it is later updated to that version or higher, the vpgadmin ID will be automatically created during the update process.
vpgadmin is used internally and required for Virtual I/O Server database administration (SolidDB and Postgre DB).

This user should not be removed or disabled since it is reserved for internal VIOS administration. Doing so will lead to unpredictable results and will leave the Virtual I/O Server in an unsupported configuration. Changing the user’s group ownership is not supported.”
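
If a security scan flags the ID, one quick sanity check is whether the VIOS level is one where vpgadmin is expected to exist. The sketch below compares dotted version strings numerically; version_ge is a hypothetical helper, and the hard-coded level stands in for the output of the ioslevel command on your VIOS.

```sh
# version_ge A B: succeed if dotted version A is greater than or equal to B.
# Fields are compared numerically, so 2.2.10.0 sorts after 2.2.6.31.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, ".")
        m = split(b, y, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

level="3.1.2.21"   # placeholder; on a real VIOS use the output of ioslevel
if version_ge "$level" "2.2.6.31"; then
    echo "vpgadmin is expected at ioslevel $level"
else
    echo "vpgadmin will be created when this VIOS is updated to 2.2.6.31 or later"
fi
```

A plain string comparison would get levels such as 2.2.10.0 wrong, which is why the fields are split and compared as numbers.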

There’s lots more, so read the whole thing on the IBM Support site.

Cloud Backup Management Options

If you’re considering a cloud migration, you may be interested in the contents of the new IBM Redpaper, “Cloud Backup Management with PowerHA SystemMirror.” The document shows you how to take a backup of your data by either cloud backup or remote storage. It also explains how to recover your data if there is a disaster. At 11 pages, it’s a quick and valuable read.

You’ve Heard the News, Now Dig Into the Details: Get up to Speed on the Power10 Announcement

Edit: A nice roundup of links

Originally published by TechChannel September 14, 2021

IBM Champion Rob McNelly on Power10 E1080 server highlights, and resources you can use to learn more

On Sept. 8, IBM announced the new Power10 server, the E1080 (9080-HEX). While much has already been written about the new server, I’m here to share more details. If you take the time to check out even a few of the many links in this article, you’ll soon be up to speed. I’ve also included links to in-depth training, which will be available very soon.

E1080 Facts and Figures 

Let’s start with IBM’s updated server facts and features and this new Redbook. You should also watch Nigel Griffiths’s “fast facts” and “10 highlights” videos. Both come in at under 15 minutes.

As you can imagine, Griffiths is a busy guy these days. He’s also doing a live presentation, “Power10 from the Hands-on Experience,” on Sept. 15. He’ll essentially take a new system apart and show you what’s inside. In other words, you’ll get the view that is usually restricted to the CEs who set up and service systems.

For more in-depth information, check out the Sept. 30 Power Systems Virtual User Group presentation, “IBM Power10 Announce,” with Bill Starke and Joe Armstrong. Register here.

Of course you’ll want to explore what IBM is saying about Power10. Start with this easy to remember landing page, this new server data sheet, and this whitepaper. Also check out the Power10 AR experience and interactive demo. Finally, there’s this interesting blog post and this explainer about the significance of a new SAP benchmark: 

“Today, SAP published a new SD 2-tier result for IBM’s soon to be announced Power E1080. First the highlights:

174,000 SD Users
955,050 SAPS
120 cores

Wait, almost 1M SAPS with only 120 cores? HPE achieved 670,830 SAPS (122,300 users) with 224 cores on their Superdome Flex 280 with the Intel Xeon Platinum 8380H Processor in January 2021.

This new result is almost 3 times the SAPS/core of HPE’s biggest and baddest system. (Funny note: autocorrect tried to change “baddest” to “saddest.”) This new result is also about 33% faster, on a per core basis, than the previous Power 980 result published at the end of 2018. That is certainly not remarkable since Intel’s per core performance on this benchmark also increased about 69.5%, since 2017 … sorry, missed the decimal, 0.695%. (Comparing two Dell 2-socket results, Intel 8180 & Intel 8380).”
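
The per-core comparison in that excerpt is easy to reproduce from the published figures. This is just the arithmetic, using the SAPS and core counts quoted above; saps_per_core is a helper name made up for the example.

```sh
# saps_per_core SAPS CORES: print SAPS per core to two decimal places.
saps_per_core() {
    awk -v saps="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", saps / cores }'
}

e1080=$(saps_per_core 955050 120)   # Power E1080 result
hpe=$(saps_per_core 670830 224)     # HPE Superdome Flex 280 result

echo "E1080: $e1080 SAPS/core"
echo "Superdome Flex 280: $hpe SAPS/core"
awk -v a="$e1080" -v b="$hpe" 'BEGIN { printf "Ratio: %.2f\n", a / b }'
```

The ratio works out to roughly 2.66, which the quoted post rounds to "almost 3 times."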

Administrative Changes

AIX admins will notice something new on Power10: It’s much easier to keep track of the status of your software maintenance agreements (SWMA). You’ll find notifications about the expiration of SWMAs both in the error log and on the HMC:

“With the introduction of the Power10 server, IBM expands upon the use of the update access key (UAK) with the addition of an SWMA UAK. This functionality provides proactive notification of AIX SWMA expirations to ensure continued and uninterrupted software support. AIX SWMA UAKs do not limit the operability of or capability to update AIX.”

You should also be aware of changes with the UAKs:

“What do I need to know about AIX Update Access Keys and Power10?

IBM Power10, or later, processor-based servers include technology that helps to manage and keep your SWMAs current so that you can apply AIX updates and receive support from IBM. The server checks for an active SWMA when updating the AIX operating system. The server utilizes an AIX UAK that includes the expiration date for the associated SWMA. Informational messages are generated when the release date for the AIX operating system has passed the expiration date of the AIX UAK and during normal operation. Additionally, the server periodically checks and informs the administrator about AIX UAKs that are about to expire, AIX UAKs that have expired, or AIX UAKs that are missing. It is recommended that you replace the AIX UAK within 30 days prior to expiration.”
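
The 30-day recommendation is simple to script once you know the expiration date. The sketch below is an assumption-heavy illustration: it takes both dates as YYYY-MM-DD arguments rather than reading them from the system, the sample dates are invented, and jdn and days_until are hypothetical helper names. The date math uses the standard Julian Day Number formula so it does not depend on any particular date command.

```sh
# jdn YYYY-MM-DD: print the Julian Day Number for a Gregorian calendar date.
jdn() {
    y=${1%%-*}
    rest=${1#*-}
    m=${rest%%-*}
    d=${rest#*-}
    m=${m#0}; d=${d#0}   # drop a leading zero so 08 and 09 are not read as octal
    a=$(( (14 - m) / 12 ))
    yy=$(( y + 4800 - a ))
    mm=$(( m + 12 * a - 3 ))
    echo $(( d + (153 * mm + 2) / 5 + 365 * yy + yy / 4 - yy / 100 + yy / 400 - 32045 ))
}

# days_until TODAY EXPIRY: whole days from TODAY to EXPIRY.
days_until() {
    echo $(( $(jdn "$2") - $(jdn "$1") ))
}

left=$(days_until 2021-09-08 2021-10-01)   # invented dates for illustration
if [ "$left" -le 30 ]; then
    echo "UAK expires in $left days: time to replace it"
fi
```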

Of course you’ll need to run new HMC code to go with your new Power10 server. Now you can get the new virtual appliance or update the code on your HMC hardware, assuming it’s compatible with the new software version.

Also in This Announcement 

Along with enhancements to PowerVM and PowerVC, PowerSC 2.0 is now available:

“IBM PowerSC 2.0 strengthens and simplifies system security management by combining the capabilities of IBM PowerSC Standard Edition 1.3 and IBM PowerSC MFA 1.3 and adding a major new feature:

Endpoint detection and response (EDR) capability, which includes the following:
Host-based intrusion detection system (HIDS). HIDS analyzes the traffic to and from a specific computer for signs of possible intrusion incidents, violations, or imminent threats. HIDS can also monitor key system files and any attempt to overwrite these files, a function that works in conjunction with file integrity monitoring (FIM). HIDS can do log-based intrusion detection, which overlaps with log inspection functionality, and includes time-based alerting and active response.

Log inspection and analysis. PowerSC 2.0 can identify important security events buried in the OS and application logs and ignore relatively unimportant events.

Event context and filtering. Events are sorted, prioritized, filtered, categorized, and put in context to help identify anomalous activity.

Incident response. This methodology, used to respond to and manage a cyberattack, aims to reduce the damage and help recover as quickly as possible.”

Finally, in case you were wondering, Power10 E1080 servers have a PVU of 120.

There’s a lot here, and yet I can’t say I’ve covered everything. IBM never sleeps, after all. So feel free to reach out if there’s something I’ve missed.

CLI Options for Configuring and Modifying LPARs

Edit: I find it so much faster to make these changes from the command line vs the GUI

Originally published by TechChannel August 23, 2021

Rob McNelly highlights resources you can use to learn how to create an LPAR without using the HMC GUI, and use the CLI for dynamic LPAR operations

Gareth Coates, whose work I’ve cited previously, recently explained how to create an LPAR without using the HMC GUI:

“Here is a suite of files to create an LPAR on an IBM [Power Systems server]…. There are shell scripts and configuration files.

You need to make changes to suit your particular environment. I’d say that the chance of them working on your systems without such modifications is nil.

A tar file is available for download in the “Attachment” section at the bottom of this page.

My original motivation for developing the suite was when I was working on a particular project. To access the HMCs from my desktop, network packets had to go through several firewalls using a complicated WAN. There were many LPARs on numerous Enterprise Class systems and many users of the HMCs’ GUIs. The HMCs were busy and paging was happening. The result was a slow network connection; both low bandwidth and more importantly, high latency! It could easily take over 5 minutes to fill in the HMC pages and create the LPAR, even when all of the properties were already defined and I just had to complete the forms. Then, I had to use DLPAR to add virtual adapters to the VIO [servers] and edit the VIOS profiles.

Using “malt,” I could achieve all of the steps in under 30 seconds. Sometimes, I was tasked with creating several (maybe 10) LPARs with similar configurations but with different LPARid numbers, you can imagine the time which the scripting saved.

I found that improvements to the HMC and new powerful products like PowerVC and the “LPAR Provisioning Toolkit” took over from malt, so I did not post my utility. Recently, several customers have asked for access to it, so here it is.

I understand that it has some limitations, but it is what it is. I am not doing any more development on it. If people extend it, for example to use SSP storage, and let me have the code, I’ll consider posting it here …”

Be sure to read the entire support doc for complete instructions.

Getting the Most From Dynamic LPAR

Along those lines, my coworker Eric Hopkins recently helped a client better understand how to use the command line interface (CLI) for dynamic LPAR (DLPAR) operations:

“The full capabilities to dynamically add or delete devices or edit the contents of a profile using CLI of the HMC still exists and has become the most direct method for implementing those changes. While confusing at first glance, the CLI offers a logical repeatable method of DLPAR actions.

The following examples will define the needed information for each task and utilizes a defined variable within the command structure. A real-world example will also be shown for clarity and syntax.

Two commands are needed to completely add an operational device. One command to add it to the LPAR that points to the managing VIOS, and one command to add it to the VIOS that points to the LPAR.

The syntax of the action being taken must be understood with reference to where the intention of the command is directed to add or remove a device. If a virtual fibre device is being added to an IBM i or AIX LPAR the $VIRTUAL_SLOT is the Adapter ID of the virtual device on that LPAR, while $REMOTE_SLOT refers to the managing VIOS for that device. Similarly, if this virtual fibre device is being added to the VIOS, $VIRTUAL_SLOT refers to the Adapter ID of the virtual device on that VIOS while $REMOTE_SLOT refers to the Adapter ID of the virtual device on the corresponding LPAR.

Best practice is that this number matches. However, if it doesn’t match, it’s important to know the distinction between the two variables.

Once the command has been accurately issued, the device will appear in the running operating system. At this time it is imperative to save the running config to the profile of the LPAR or VIOS. Without this step, when the LPAR is halted and started from its profile, the device will no longer exist.

Variable                  Information needed
$MACHINE_NAME             Server-9009-22A-SN8675309
$VIOS_NAME                VIOS1
$LPAR_NAME                nim01
$VIRTUAL_SLOT             315
$REMOTE_SLOT              315
$DEVICE_TYPE              fc (fibre), eth (ethernet), scsi (SCSI)
$ADD_OR_REMOVE            a (add), r (remove)
$SERVER_OR_CLIENT         server (adding on VIOS), client (adding on LPAR)

Adding a virtual fibre channel device to an LPAR:

chhwres -r virtualio -m $MACHINE_NAME -o $ADD_OR_REMOVE -p $LPAR_NAME --rsubtype $DEVICE_TYPE -s $VIRTUAL_SLOT -a "adapter_type=$SERVER_OR_CLIENT,remote_lpar_name=$VIOS_NAME,remote_slot_num=$REMOTE_SLOT"

chhwres -r virtualio -m Server-9009-22A-SN8675309 -o a -p nim01 --rsubtype fc -s 315 -a "adapter_type=client,remote_lpar_name=VIOS1,remote_slot_num=315"

Adding a virtual fibre channel device to a VIOS:

chhwres -r virtualio -m $MACHINE_NAME -o $ADD_OR_REMOVE -p $VIOS_NAME --rsubtype $DEVICE_TYPE -s $VIRTUAL_SLOT -a "adapter_type=$SERVER_OR_CLIENT,remote_lpar_name=$LPAR_NAME,remote_slot_num=$REMOTE_SLOT"

chhwres -r virtualio -m Server-9009-22A-SN8675309 -o a -p VIOS1 --rsubtype fc -s 315 -a "adapter_type=server,remote_lpar_name=nim01,remote_slot_num=315"

Removing a virtual adapter in slot 315 from an LPAR or VIOS:

chhwres -r virtualio -m $MACHINE_NAME -o $ADD_OR_REMOVE -p $LPAR_NAME -s $VIRTUAL_SLOT
 
chhwres -r virtualio -m Server-9009-22A-SN8675309 -o r -p nim01 -s 315

* Don’t forget to save running config!

1. On the HMC GUI check the box by the LPAR name.
2. From the ACTIONS pulldown menu choose Profiles, then Save Current Configuration.
3. A window will pop up to allow overwrite of the existing profile or the option to save to a new profile.”
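
Putting the pieces together, the two-command pattern described above (a client adapter on the LPAR that points at the VIOS, and a server adapter on the VIOS that points back at the LPAR) can be generated from one set of inputs. This is a sketch that only prints the commands for review. The function name and the profile name "default" are placeholders, and the final mksyscfg line is a command-line way of saving the running configuration to a profile that I am adding as an alternative to the GUI steps; it is not part of the quoted document.

```sh
# Print (do not run) the HMC commands for one virtual FC client/server pair.
# Arguments: MACHINE VIOS LPAR LPAR_SLOT VIOS_SLOT PROFILE (all placeholders).
vfc_pair_commands() {
    machine=$1; vios=$2; lpar=$3; lpar_slot=$4; vios_slot=$5; profile=$6
    # Client adapter on the LPAR, pointing at the managing VIOS.
    printf 'chhwres -r virtualio -m %s -o a -p %s --rsubtype fc -s %s -a "adapter_type=client,remote_lpar_name=%s,remote_slot_num=%s"\n' \
        "$machine" "$lpar" "$lpar_slot" "$vios" "$vios_slot"
    # Server adapter on the VIOS, pointing back at the LPAR.
    printf 'chhwres -r virtualio -m %s -o a -p %s --rsubtype fc -s %s -a "adapter_type=server,remote_lpar_name=%s,remote_slot_num=%s"\n' \
        "$machine" "$vios" "$vios_slot" "$lpar" "$lpar_slot"
    # Save the running configuration of the LPAR to its profile.
    printf 'mksyscfg -r prof -m %s -o save -p %s -n %s --force\n' \
        "$machine" "$lpar" "$profile"
}

vfc_pair_commands Server-9009-22A-SN8675309 VIOS1 nim01 315 315 default
```

Generating both commands from the same arguments keeps the slot numbers on each side consistent, which is where hand-typed pairs tend to go wrong. The VIOS profile needs the same save step; repeat the mksyscfg line with the VIOS name and its profile.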

Everyone should be aware of these options. Even in smaller, static environments where the GUI is typically sufficient, these alternatives can be handy when installing systems or applying changes.

Waiting on a Replacement Part, the Latest With OpenSSH, a Bold Client Choice and a VIOS Webinar

Edit: I still wonder about keeping a spare water heater on hand

Originally published by TechChannel July 30, 2021

Rob McNelly on VIOS admin tips, OpenSSH for AIX, effective support and more

Recently my water heater died. Fortunately it’s relatively new and still under warranty, so I didn’t have to dip into my pockets to get it repaired. But I did have to wait, because the manufacturer needed to ship a part, which took a few days.  

Supply chains are still being stretched, so it wasn’t surprising to learn that the required part wasn’t available in my area. Since I live in Phoenix, having to take a few showers in lukewarm water isn’t the worst thing this time of year, but still, what’s the point of a manufacturer’s warranty if my issue can’t be promptly resolved?

With all the camping I do, I’ve collected quite a bit of equipment over the years. A lot of this gear, like cots and sleeping bags and air mattresses, serves to keep me warm and comfortable. Given that I also have a 12-volt fridge with sufficient batteries to run it, not to mention a vehicle with a generator and an AC unit, one could argue that I’m more of a “glamper.” But having options, based on where you’re going and what you’re doing, is important. My point though is that during this recent stretch, I considered deploying one of my favorite camping comforts at home: a portable tank-less water heater.

While I don’t have this exact system, you get the idea. It’s great when you want to clean up at the end of a hot, sweaty, dirty day in the back country, but it doesn’t have much use in the house. It’s not an option to warm water for the dishwasher or washing machine. And as I said, with Arizona summers being what they are, I can get by without hot showers for now.

I must admit, even though it was just a few days, I missed the luxury of instant hot water. I’m sure my grandparents wouldn’t skip a beat without the convenience of an on-demand, whole house water heater. But me, the rugged outdoorsman? I was reminded how soft I am living in this modern world.

Anyway, to put a tech spin on this, I liken it to having a bare-bones disaster recovery solution. My little water heater is like a small backup box that will get me by in a pinch. By comparison, my regular home water heater is a powerful production machine that gives me everything I need to handle my water workload.

Would it make sense to purchase a second full size backup water heater, just in case I needed it? No. After all I’m not using my hot water to make money or service my customers, and it’s not as if I could lose millions for every minute without hot water. But the calculation is the same. What’s the cost of downtime? How valuable is continuous availability?

The part eventually arrived. In hindsight, it may have been faster had the manufacturer simply replaced my broken water heater. Of course from their perspective, shipping a single part is easier and cheaper. So I guess I come away from this episode with even greater appreciation for IBM support. Because when I need computer parts for a server, IBM generally ships and installs them promptly. Another advantage is that more often than not, my server will proactively call home. On occasion it’s actually IBM that lets me know when a part is failing and must be replaced. I sure wish my water heater could call home.

OpenSSH for AIX: What’s New 

Chris Gibson has a nice write-up about the new features available with the latest version of OpenSSH for AIX:

The latest version of OpenSSH for AIX 7.2 now supports POWER9 HW GZIP! The requirements for this new capability are as follows:

  • Processor mode: Power9 (instead of Power9_Base)
  • AIX Level: 7.2 TL 4 and above
  • zlibNX: 7.2.4.0 NX accelerated zlib compression library
  • OpenSSH version: 8.1.102.2102 and above
  • OpenSSL version: 1.0.2.2100 and above
  • ssh commands (like ssh, scp etc) should be run with -C option (compression enabled) 

Read the whole thing to get the details.

Assessing Your Options With FLRT Lite

Off the top of your head, do you know if a POWER7 server will run VIOS 3.1.2.21?

Start by consulting FLRT Lite and searching on VIOS. As of this writing, the current VIOS version recommendation is 3.1.2.21.

Then check the VIOS lifecycle information. Version 2.2 has reached end of support, while the projected end of support for 3.1.2.x is November 2023.

I mention POWER7 specifically because a client with that system was wondering about their options. If you go to FLRT and plug in 8202-E4C for the machine type model, you’ll see that 2.2.6.5 is the latest recommended version of VIOS code. For an 8202-E4D model (POWER7+), the recommendation is 3.1.2.21.

What would you do in this situation? Is there enough of a difference between POWER7 and POWER7+ that you would hesitate to deviate from the recommendations? In this case, even when told by support to stand pat, the client was brave and chose to install 3.1.2.21 on their POWER7 box, because that version of VIOS is still supported. It worked. Of course every environment is unique, so your mileage may vary. And ultimately, getting to POWER9 should be your goal if it is at all possible. But I can report that at least one client is now successfully running 3.1.2.21 on POWER7.

VIOS Admin Tips 

I want to again plug the Power Virtual User Group (VUG) technical webinar series. Last month I cited Nigel Griffiths’s first look at the POWER9 HMC, but there’s another excellent presentation from back in May that I neglected to mention. It’s Jaqui Lynch’s deep dive on VIOS:

Session 106: VIOS—Top Tips for Successful Administration
Speaker: Jaqui Lynch, Information Architect, AIX & Linux Performance Consultant, IBM Power Champion

VIO servers are the most critical part of your system setup. If they are not happy, then no client LPAR is happy. This session provides tips on setting up and maintaining VIO servers including upgrades and patching. Backup and recovery are also covered.

Be sure to download the slides and listen to the replay. As always, she has some great information. Her recent TechChannel article on this topic is also worth your time.

AIX Patch Management With Ansible, and the Broader Benefits of Automation

Edit: I still love going to the Grand Canyon

Originally published by TechChannel July 15, 2021

Rob McNelly on automation benefits, AIX patch management with Red Hat Ansible and more

I made one of the classic blunders during a recent trip to the Grand Canyon. This had nothing to do with a land war, or a Sicilian, but as an Arizona resident and avid hiker I should know better: always, always carry your water bottle, especially during the summer.

I venture to the Grand Canyon fairly often. This year alone I’ve been to the South Rim twice, in April and again in June. I’ve also traveled to the North Rim. I’ve hiked and/or biked while camping at Mather, Desert View and the North Rim campgrounds. Most recently, I was driving along the North Rim, making occasional stops at various scenic views and overlooks. Even though it wasn’t anything strenuous, as always, I packed my vehicle with salty snacks and plenty of water. Both are essential when engaging in physical activity in the desert.

However, at one stop, I left my water bottles—along with two 7-gallon containers of water and a cooler full of snacks—behind in my vehicle. I figured I wasn’t going far and that I’d quickly return. I figured wrong. The journey from the car to the overlook was longer than I realized, and the overlook itself branched out into multiple trails. Temps were only in the mid 90s, but with the elevation, low humidity, and lack of shade, the onset of thirst arrives pretty quickly. You can go from feeling fine to having a problem in the blink of an eye.

Of course my provisions did me no good sitting in the car. If I wanted to head out on the trails, I realized I had to first return to my vehicle. Once I’d hydrated and snacked, I loaded my day pack with more food and water. Now I was ready to go back to the overlook and check out the trails for a bit.

Systems Management Planning

In a similar vein, I believe it’s easy to underestimate what we’re getting ourselves into when managing our systems. At first glance we may think we can make a quick fix, but how often do seemingly minor issues end up being something more complicated? Or we may write a few simple scripts and then see that remedy balloon into a hodgepodge of methods and tools that differ from site to site, and sometimes even from admin to admin. Only then do we realize that what works across a handful of systems won’t cut it in a large environment of many systems.

Is there a way out of these dilemmas? Talor Holloway attempts to answer this question in his recent blog post, “AIX Patch Management with Ansible.”

He states: “Leading enterprises today use the Red Hat Ansible automation platform to provision, configure, manage, secure and orchestrate hybrid IT environments. A common misconception is that Ansible is just used to manage the Linux operating system. This is a false belief. Ansible supports Linux, Windows, AIX, IBM i and IBM z/OS environments. This blog will help AIX system administrators get started with Ansible on AIX, and introduce a patching use case … As enterprises move to a modern, enterprise-wide automation strategy with the Ansible automation platform, extending automation to AIX is a great method to simplify and develop consistency in the way AIX systems are supported, all while using the same automation tools that can be used across the enterprise.”

The Benefits of Automation

Automation enables us to do more with less effort, while greatly reducing the potential for human error. It also lets us standardize, bringing uniformity to tens or hundreds of LPARs in a single environment.

Of course we still must do the work. Planning and testing, which have always been essential to systems management, are also critical with automated solutions. Make a mistake while implementing these tools and you could blow up several of your machines at once. Ultimately though, it’s worth the effort. The more we can automate, the easier our jobs will become. Automation saves time, time that we can spend more productively. In the big picture, we can redirect our newfound time into keeping our skills sharp and passing our knowledge on to those who will eventually take our place on the raised floor.

VIOS Post-Migration Performance Tip, and Details on the POWER9-Based HMC CR2

On Twitter, Chris Gibson linked to an IBM support document that explains why the emfc_kpr process consumes so much CPU in AIX 7.2 following an upgrade to VIOS 3.1.

The emfc_kpr process is a kernel process that was added for handling the 16Gb and higher speed fibre channel adapters.

Instead of processing the threads via the protocol driver (as it works in previous releases), emfc_kpr process is now processing them.

If there are virtual FC adapters configured in the environment (with VFC), then emfc_kpr works in conjunction with the npivk process.

The emfc_kpr process (and npivk process in NPIV / VFC environment) processes will consume CPU as I/O is moved/passed through.

The doc goes on to explain that emfc_kpr is designed to enhance overall adapter driver performance. So despite its CPU usage, overall system performance should improve, though environments with high I/O throughput may need additional CPU resources.

In a recent Power VUG Technical Webinar Series presentation, Nigel Griffiths takes a “first look” at the new POWER9-based HMC CR2. 

The Advanced Technology team were very impressed with the construction and design of the new POWER9-based HMC (7063-CR2). It operates very much like the older POWER8-based model (7063-CR1), but is faster. It is very much worth the upgrade if you still have the older & slower x86-based HMC. Make sure you are ready for your Power10 servers expected later in 2021, by having the new HMC up and running.

Watch the video replay and download the slides from the July 14 presentation here.

Power Systems Best Practices Doc Updated, dnf, New IBM Customer Support Portals and More.

Edit: A nice grab bag of topics

Originally published by TechChannel June 15, 2021

The Power Implementation Quality Standard for Commercial Workloads document has been updated. Get the details from Rob McNelly here.

The Power Implementation Quality Standard for Commercial Workloads document has been updated. Download Version 2.4 (June 2021) here.

This presentation details the expected best practices implementation guidelines for Power Systems hardware, including virtualized systems. As you go through it, notice that, starting with page 9, there’s typically an indicator (in red and in parentheses) of the most recent update for each slide. The pages that reference the current 2.4 release, either (Mod 2.4) or (Upd 2.4), have new information, but be sure to review the whole thing. This is one of my go-to reference guides, and I always find useful reminders each time I go through it. Thanks again to IBM senior architect Fredrik Lundholm for his tireless work on this project.

Dandified yum Now Part of the AIX Toolbox

If you work with open source, you’re undoubtedly familiar with yum. (It’s saved many a developer from “rpm hell.”) But it’s time to get on board with dandified yum (dnf), which is now included with the AIX Toolbox.

The need for dnf is easily explained: It is the current, Python 3-based package manager. yum is built on Python 2, which the open-source community no longer supports.

One way to set up and run dnf on AIX is with the dnf_aixtoolbox.sh script. (It’s similar to yum.sh, the script for setting up yum.) dnf works on AIX 7.1 TL3 and higher versions. OpenSSL 1.0.2.2001 or higher must also be present on the system. Here’s the script’s usage information:

            # ./dnf_aixtoolbox.sh -?
            Usage: dnf_aixtoolbox.sh <-d> <-y> <-n> -?
                   -d    Install and setup dnf if yum is not installed.
                   -y    Update yum3 to yum4(softlink to dnf).
                         Use this option if want to redirect all yum commands to dnf.
                         Existing yum-3.4.3 will be updated to yum-4.x.x(dnf)
                   -n    Install dnf where both yum and dnf can coexist if yum is installed already.
                         This is not a recommended option.

            -d option:
                 This option can be used if there is no yum is installed on the machine.
                 This will setup dnf as the default package manger.

            -y option:
                 This is the most recommended option if yum is already installed.
                 This option can be used in two situations.

                 If yum is already installed and wants to update yum3 to yum4(dnf).
                 All yum commands are dnf commands as yum will be just a soft link to dnf.

                 If yum3(3.4.3) isn’t installed but still like to use yum command which are actually dnf
                 commands.

At the end of the document you’ll find examples that show the expected output from running each of the various options.
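
To make the choice between the three options a little more concrete, here is a small Python sketch of the decision as I read it from the usage text above. The function name and its parameters are hypothetical and are not part of the script. It only returns the flag you might pass and does not run anything.

```python
def choose_dnf_option(yum_installed, keep_yum3=False, want_yum_command=False):
    """Pick a dnf_aixtoolbox.sh flag based on the usage text quoted above.

    All three parameters are hypothetical inputs for this sketch.
    """
    if yum_installed:
        # -n lets yum3 and dnf coexist, but the script's own help text
        # says this is not a recommended option.
        return "-n" if keep_yum3 else "-y"
    # No yum on the system: -y still provides a yum command (a soft link
    # to dnf), while -d sets up dnf on its own.
    return "-y" if want_yum_command else "-d"


if __name__ == "__main__":
    print(choose_dnf_option(yum_installed=True))
    print(choose_dnf_option(yum_installed=False))
```

In other words, -y covers the most ground: it is the recommended path when yum is already installed, and it also works when yum is absent but you still want to type yum commands.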

A New Way to Manage Software and Hardware Service Cases 

IBM is implementing a new web portal for placing hardware service calls. The portal, which will be activated on Saturday, June 19 at 11:59 p.m., replaces the Service Request portal. The newest portal will operate similarly to the recently launched software service portal. As of June 19, all tickets will have a Salesforce Ticket Number (TSxxxxxxx). That means no more PMRs/PMHs for hardware tickets.

To learn more, check out this short video about how to open and manage hardware cases. IBM Support also has FAQs, details about the enhanced work order and Call Home status capabilities and an introduction to Call Home Connect Cloud.

The new portals are designed to allow customers to manage their hardware and software cases from a single interface. There will be no change to cases opened via voice at 800-IBM-SERV, Call Home, B2B or the IBM Service Center for Cisco Products Portal.

‘Mini-NIM’ Requirements and Other Tidbits

  • On Twitter, Chris Gibson noted that IBM has just updated its minimum NIM master levels for VIOS clients. For those of us who manage VIO servers, the version requirements for NIM servers are critical information. Related: “minimum NIM” is kind of a tongue-twister.
  • Also on Twitter, Nigel Griffiths shared this 7-minute video devoted to 35 years of AIX. 
  • Speaking of NIM, you can learn much more from Jaqui Lynch’s presentation to the Power Systems VUG. Follow the slides and watch the video.
  • A new certification, IBM AIX v7 Administrator Specialty, is now available. Visit IBM’s certification program site to register for the exam, or to download a sample exam. 
  • I’m among a group of IBM Champions who will discuss the present and future of the AIX platform. HelpSystems, a provider of security and automation software, is hosting a discussion, which is set for Wednesday, June 23. It’s free to register.

Historical Perspective

I recently came across this blog, which chronicles random stories about IBM history.

For example: “… do you know that IBM also once embraced biomimicry? That’s right. In 1974 the B in IBM also stood for BEEHIVE!”

Here’s another: “In the mid 1950s, a CIA spook comes to IBM with a request: ‘Hey Blue suited dudes, make us a machine that can store millions of images that are searchable by keywords. In other words a data lake of images—just like Google Images, except Google is still 40 years in the future.’”

While there have been four entries since March, prior to that, the blog hadn’t been updated since 2017. So don’t count on frequent updates. But I have been enjoying this collection of stories about IBM from decades gone by.

Guru: The Essential Workers Of IT, And Lessons Learned From Trying Times

Edit: I still advocate for giving blood and do so roughly every eight weeks

Originally published by IT Jungle June 7, 2021

Over the past year we’ve all heard about essential workers. Essential workers are present in our industry as well. Those who manage and maintain our power grids and technology infrastructures have always played a vital role. It’s easy to overlook these professionals, but they shouldn’t be taken for granted. In 2020, their mostly unnoticed efforts were particularly important.

Don’t get me wrong; I’m not necessarily putting IT workers on the same level as healthcare workers or first responders. My point is simply that during the pandemic, we were reminded that some aspects of our jobs can only be accomplished by people who are present in power plants or on the raised floor in our computer rooms. In many industries the operations teams continued to perform their work on site.

We were reminded that we can do much of our jobs remotely, and that’s great. However, the shutdown made apparent the need for greater automation. Instead of having someone on site to do a task, we can let the computer do it. I find taking repetitive, manual tasks off my daily to-do list to be quite satisfying. Take, for instance, setting up processes to automate the creation of a virtual machine. Who wouldn’t prefer to carve out LPARs automatically at the push of a button? How much joy do you feel when you create a script and add it to the scheduler and take repetitive manual tasks off of your plate?

However, some work will always require human hands and eyes. Consider a new server build. (Admittedly this may not be the best example, since many IBM i shops run on a single physical frame, and often for years at a time. But you’ve likely been through a server upgrade at some point in your career, so stick with me.) While much of this process can be performed remotely, let’s start at the beginning. Once your hardware has been received, how do you remotely rack and stack the system, and connect the power plugs into the PDU? How do you remotely plug in a USB disk or a DVD that contains an OS install image? People are still needed for these tasks. You need someone on the raised floor to install and configure the box, at least until it’s powered up and on the network. You need people to troubleshoot network and fiber cable issues. You need CEs to come onsite and replace failed parts. As much as we talk about lights-out data centers, we’re not there yet. Our machines still require human intervention.

A Home/Office Hybrid

No one wants to relive 2020, but we have learned lessons from it just the same over the past year. For instance, remote work is likely here to stay. In our industry and others, people came to value working from home, and they want the flexibility to continue to do their jobs away from the office on at least a part-time basis. Forrester predicts that most companies in the United States and Europe will employ a hybrid work model post-pandemic. This is echoed by a Citrix survey indicating that “90 percent of respondents have no interest in returning to office work full time once the COVID-19 pandemic is over. More than half prefer a hybrid working model where they can work from home most or all of the time, while 18 percent want a hybrid model where they work from the office more.” Bloomberg shares anecdotes where employees are quitting instead of giving up working from home.

Author Scott Berkun suggests that CEOs roll with the change in attitude:

Remote work is seen as a threat to many CEOs simply because of their fear of change and resistance to progress. That fear leads to an irrational rejection of remote work, instead of a thoughtful examination of where it has succeeded and what can be learned.

Incidentally, no one had to make the case for working at home to me. I’ve done it for years: For me, quoting myself here, working at home was an easy choice. “I love being able to get going first thing in the morning while I’m fresh and alert. On weekdays at least, I’ll typically awaken thinking of work anyway. So rather than sitting through a commute, worrying about losing time, I can get right to it…. my dedicated office workspace has long been in place. I have fast Internet, a full-size multi-monitor setup, and my cherished old school tools: an actual landline and a vintage Model M keyboard.”

Naturally though, my own work routine also underwent a transformation. Even though I was used to working from home, I’d never worked exclusively from home. Prior to 2020 I traveled around the country, working at customer sites and attending conferences and classes (while racking up frequent flier miles and hotel and rental car points along the way). But I spent much of this past year helping business clients scale their infrastructures to allow for higher system utilization. Many of us had to enable our employees to work remotely. Sometimes that involved beefing up bandwidth to allow more employees to connect remotely via VPN and use collaboration tools.

Another thing I’ve learned is that collaboration has its limits. As I write this, I have Slack open, along with multiple email inboxes. Throughout the day I can expect to join meetings using Webex, GoToMeeting, Zoom, or Teams. I’ll use shared documents, in addition to those I send and receive in email. And I’ll converse with coworkers via Teams, Google Hangouts and Jabber, among other messaging platforms.

Of course, the advantage to messaging – and even text and email, to an extent – is it’s meant to be non-intrusive. You’re allowing the other person to take their time and respond when they’re available. That’s considerate, and consideration is important. But sometimes you just have to cut through all the clutter and talk to people in real time. Talking one on one clears up misunderstandings. Sometimes it’s the simplest way.

Give Yourself A Pat On The Back

Over the last year, I was reminded that time is short. Fortunately, my family’s health wasn’t directly affected by Covid, but I still felt its impact. In the past year one of my friends died from brain cancer while another lost a child to leukemia. The pandemic made that all the more difficult. Hospital visits were restricted, both in terms of hours and the number of people allowed in the room. That’s assuming visitors were allowed at all; I was turned away more than once. With the limits on gatherings, funeral services were also different.

Certainly, I wasn’t the only one to be reminded of what is truly important. Another example: blood donation. I’d donated sporadically for years, but seeing the gratitude of these families for those who made it possible for their loved ones to get transfusions, plasma and other blood products inspired me to give blood as often as I can, roughly every eight weeks. Please consider giving blood if you are able.

Things are starting to open up. Where I live fewer businesses are requiring masks. Seemingly it’s just a matter of time before I get back on the road for client visits and in-person conferences. While I enjoyed learning about tiger sanctuaries and chess players, I’m more than ready for a return to near normalcy.

However you managed to get through the past year, I believe you, the IT professional, are worthy of appreciation. By simply doing your job, by applying your unique skills, you made vital contributions. Your sacrifices went largely unseen. You work weekends. You work holidays. You’re on call at all hours. When systems require patching or operating systems require upgrading, you may get a change window of 1 to 3 a.m. on a Sunday. You have to make that work, and you do.

I’m reminded of the old Dunkin’ Donuts TV ads. A guy would head out the door at the crack of dawn and say to himself, “It’s time to make the donuts.” The world needs the people who make the donuts. The world also needs techies, and during a tough time, you delivered.

Rob McNelly is a senior Power Systems solutions architect doing pre-sales and post-sales support for Meridian IT, headquartered in Deerfield, Illinois. McNelly was a technical editor for IBM Systems Magazine, and a former administrator within IBM’s Integrated Technology Delivery and Server Operations division. Prior to working for IBM, McNelly was an OS/400 and IBM i operator for many years for multiple companies. McNelly was named an IBM Champion for Power Systems in 2011, an IBM Champion Lifetime Achievement recipient in 2019, and can be reached at rob.mcnelly@gmail.com.

Attention to Patching and Other Vital System Maintenance Will Help You Avoid the Heat

Edit: I should still be better prepared for my rides in the desert

Originally published by TechChannel May 18, 2021

Rob McNelly on why adequate system maintenance can help you prepare for (and avoid) system problems

As I found myself walking through the desert, pushing a bicycle with a flat tire, I wondered how I got here.

That’s not a metaphor; that’s life in Arizona. With the cactus and tumbleweeds and other assorted spiky, poky things, it’s a dangerous place for tires and tubes. It’s not a great place for people, either, once the temperatures hit triple digits, as was the case that day.

I probably could have called someone to pick me up, but that would just add insult to injury. Besides, I didn’t have far to walk, and I had enough water on hand. So I trudged home in the heat, wondering how I could have avoided this fate.

For starters, I could have filled my tire tube with slime. That’s a real thing. It seals any holes in tire tubes. I wouldn’t have needed to do anything else.

Or I could have brought along a spare bike tube, pump, and patch kit. Then I could make repairs on the spot. Or I could have simply replaced the tire itself. It was old and the tread was thinning.

Ultimately I realized this was on me. This outcome was entirely foreseeable, and I’d neglected to adequately prepare for it.

Does that sound familiar? Isn’t this often the case with your system maintenance? Once you determine the cause of the problem, it’s glaringly obvious that something was neglected along the way. Say you open a PMR and IBM informs you that your issue is a known defect. Had you patched your system when that SP or TL was first released, months or even years prior, the bug would not have affected you. Or maybe you need a physical frame taken down so a CE can replace a part. Many components are “hot swappable” these days, but not everything. Wouldn’t it have been nice if you’d prepared for this eventuality by simply keeping a spare frame with free resources available in your environment? That way you could LPM the workload to it, and the necessary work wouldn’t affect anyone at all.

But rather than plan, you hoped for the best. Or perhaps you concluded that if it ain’t broke, don’t fix it. And the outcome was entirely foreseeable, as well as entirely avoidable.

Make the Case for Maintenance 

Back in the 1980s Castrol Motor Oil reminded TV viewers that “if you make things hard on your engine, your engine will make things hard on you.” (While a surprising number of old Castrol ads are archived on YouTube, I couldn’t find that particular one. But this is similar.) Like your car, your system is a valuable and complex piece of machinery that requires care. Fall too far behind on patching and basic maintenance, and the simple becomes much more complicated than it should be. Updates become large upgrade projects. Ignoring maintenance will ultimately leave you running old, unsupported hardware and out of date OSes, with no easy path forward. Technical debt will be paid, one way or another. 

What should be done? Start by letting those at the C-level know that patching is important, and potentially a huge cost savings over inaction. Maybe use the car analogy in a gentle reminder. If you can see the need to change your oil or fill your tank (or charge your battery in the case of electric vehicles), you should be able to see the need for system maintenance. You know you can’t ignore that oil change reminder sticker in your windshield. Sure you can put it off, but not forever. The same applies to your machines.

We may need to convince others of the importance of maintenance, but we admins should know from bitter experience why it matters. It’s a sinking feeling when you realize how easy it would have been to ensure your backups would function before the need arose to actually restore your machine. What about that mksysb you’re taking? Have you audited it to make sure the images are not only being created, but are actually usable? Sure, by writing a script you theoretically can set it and forget it in cron, but neglect to check on the results of that backup script, and you’ll end up with Schrodinger’s backup:

“The condition of any backup is unknown until a restore is attempted.”
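
As a minimal sketch of checking on the results of a backup job, the Python below flags a backup file that is missing, suspiciously small or stale. The function name, thresholds and file names are all hypothetical, and passing these checks says nothing about whether the image can actually be restored. Only a test restore answers that question.

```python
import os
import tempfile
import time


def backup_problems(path, max_age_hours=24, min_bytes=1):
    """Return a list of reasons the backup file at `path` looks suspect."""
    if not os.path.isfile(path):
        return ["missing: " + path]
    problems = []
    info = os.stat(path)
    if info.st_size < min_bytes:
        problems.append("too small: %d bytes" % info.st_size)
    age_hours = (time.time() - info.st_mtime) / 3600.0
    if age_hours > max_age_hours:
        problems.append("stale: last written %.1f hours ago" % age_hours)
    return problems


if __name__ == "__main__":
    # Demonstrate with a throwaway file standing in for a real image.
    with tempfile.TemporaryDirectory() as workdir:
        image = os.path.join(workdir, "example.mksysb")
        with open(image, "wb") as handle:
            handle.write(b"placeholder")
        print(backup_problems(image))              # fresh, non-empty file
        print(backup_problems(image + ".absent"))  # file that was never created
```

A check like this could run right after the backup script in cron and send mail when the list is not empty. It catches the silent failures, while a periodic restore to a scratch LPAR remains the real test.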

Certainly your VIO mappings and VLANs and other configurations are saved somewhere, right? And your VIO servers are backed up as well, so they can be easily recreated if needed? What about your HMC? Is it easily recoverable? Are the configurations backed up? Did you run and keep hmcscanner reports so you know how everything was set up in your environment? Do you have the information so that shared Ethernet adapters and etherchannel devices can be recreated if needed? Have you actually done this? What about mapping your NPIV and vSCSI disks? And do you know which disk drivers you need to load?

There’s even more to consider, but you get the point. It all boils down to being prepared for the unexpected. Any and all of these problems can leave you on a slow, humbling walk through the desert, asking yourself why you weren’t better prepared and hoping you have enough water to see you through.

And that is a metaphor.

Key AIX Security Considerations

Edit: This was published as an ebook by TechChannel

Rob McNelly explains what questions you and your team should be asking to make sure data is locked down

I will also archive it here in case the original link stops working in the future.

You’ve seen the headlines about malware attacks and cyberhacks. Whether it’s a competitor looking to steal your secrets or criminals looking to extort money, system administrators have myriad reasons to be wary. After all, the only absolutely secure system is one that is powered off.

Luckily, if you’re running AIX on IBM Power Systems hardware, you are officially “secure” and don’t need to take further action (in case you couldn’t tell, that’s sarcasm).

Don’t get me wrong; AIX is great. It’s my favorite OS. But it still requires monitoring and patching, and that’s for starters. If you don’t believe me, check IBM’s APAR security information or CVE vulnerability data.

It may be true that Windows and Linux systems, which number in the millions, are higher profile and thus more commonly targeted. However, that’s no reason for AIX admins to be complacent. If anything, systems running AIX make more tempting targets for bad actors. Look at it this way: The data held on AIX systems is incredibly valuable.

These machines typically run mission-critical workloads and essential databases and applications for some of the world’s largest enterprises. What are the ramifications of someone gaining access to or corrupting this data? What happens if records are deleted or destroyed? Yes, most AIX systems are behind a firewall, and most large corporate environments have disaster recovery sites and detailed recovery plans. Again though, that’s not enough. More must be done to reduce the chances of a damaging attack.

Put Yourself on Notice: IBM Support

I like to keep up to date on the latest known vulnerabilities by subscribing to IBM’s notifications. You can bookmark the links I cited earlier, or just do what I do and register for IBM updates. I receive weekly emails from IBM. Go to the IBM Support site to subscribe and manage your subscriptions and delivery preferences.

While I prefer weekly updates, you can opt for daily email. You can also limit update topics to ensure the information you receive is relevant.

Once you check these boxes, ask yourself some questions about your own environment. For instance, if an attacker gains access to your internal network, how quickly or easily could you identify the vulnerability? Are unnecessary services running on your machine? It’s harder to attack a system that’s listening on only a limited number of ports.

I mentioned firewalls: They’re a nice line of defense, but attackers can still beat them and gain access via the network. They could gain a foothold by compromising VPN credentials or some Windows or Linux machine on the network, and then move laterally within your organization by behaving as an authorized network user. Your network team should be watchful for unusual behavior such as logins at odd hours or atypical actions.

Asking the Tough Security Questions

To see if you are covered, ask yourself these 18 questions:

1. Do the user IDs on your system have strong passwords?

2. Have you changed your default password algorithm?

3. Have you disabled or deleted accounts that are no longer needed?

4. Are you authenticating via LDAP or some other central service, or are you trying to manually manage user IDs across your machines?

5. Once users log in, are they allowed to escalate their privileges via sudo or some other mechanism?

6. Are those permissions regularly audited?

7. Are the sudo logs themselves audited?

8. Are you tracking attempts, successful or not, to log into your system? Put a machine on port 22 on the public-facing internet and see how quickly it gets inundated. If you’re seeing that sort of activity behind your firewall, something may not be right.

9. If you’re tracking logins, are log files being monitored and reviewed?

10. Do you have a security information and event management (SIEM) server that actively checks logs across your environment?

11. Are log files being rotated, or are they allowed to grow indefinitely? Considering the huge amount of disk that we can allocate to filesystems these days, log file size may seem insignificant, but rotating these logs is still a good idea.

12. Do you keep logs locally or send the files to a central system? This information can help diagnose an intrusion, particularly if an attacker gains access to a machine and alters the files stored there. Of course, if an attacker accesses the logging machines and deletes those files, that’s another matter.

13. Are your systems regularly patched? Besides the OS, are you up to date on patching system firmware, device firmware and any VIO servers that are in use?

14. For those who continue to rely on legacy applications and older AIX versions, are you taking extra precautions? Those using unsupported hardware and software don’t have the options of opening a problem ticket with IBM support or applying security patches. If you’re dealing with these limitations, you must be extra vigilant in assessing and monitoring risks to your environment.

15. What are your procedures, who gets notified, and what actions are taken when an intrusion attempt is detected or recognized after the fact?

16. Who determines when systems should be removed from the network, and who decides how to analyze the system after an event occurs?

17. At what point do you declare a disaster and move operations to another location?

18. Do you have a disaster recovery plan?
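
Question 8 lends itself to a quick check. The sketch below is one hypothetical way to count failed SSH password attempts per source address. It assumes OpenSSH-style "Failed password for ... from <address> port <n>" messages in your syslog output; the function name and the sample lines are made up for illustration, and your log location and message format may differ.

```python
import re
from collections import Counter

# Assumed message shape (OpenSSH):
# "Failed password for [invalid user] NAME from ADDRESS port N ssh2"
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(\S+) from (\S+) port \d+"
)


def count_failed_logins(lines):
    """Return a Counter mapping source address -> number of failed attempts."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Made-up sample lines for illustration only (documentation address ranges).
sample = [
    "Apr  3 08:34:35 lpar1 sshd[1234]: Failed password for invalid user admin from 203.0.113.7 port 50022 ssh2",
    "Apr  3 08:34:39 lpar1 sshd[1234]: Failed password for root from 203.0.113.7 port 50023 ssh2",
    "Apr  3 08:35:01 lpar1 sshd[1250]: Accepted publickey for rob from 192.0.2.10 port 40100 ssh2",
]

# Print the noisiest sources first.
for address, attempts in count_failed_logins(sample).most_common():
    print(address, attempts)
```

A count like this is only a starting point; a SIEM or central log host (questions 10 and 12) is the better long-term home for this kind of check.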


Help From the Outside: Lab Services and Documentation

IBM Lab Services for Power Systems or your IBM Business Partner can help you assess your organization’s security and compliance practices and procedures.

Another option is to engage a penetration testing company. Penetration testers simulate attacks to determine how your system would hold up against the real thing, and how well your staff responds to notifications of anomalies in real time. Knowing that there was no detection of an attack is valuable information as well.

While this overview offers a few things to keep in mind as far as managing the security of your systems, it is by no means intended to be an exhaustive list. Rather, it is meant to help jump-start conversations in your organization about how important your data is and what you can do to keep it safe. I encourage you to read this detailed look at AIX security strategies authored by lifetime IBM Power Champion Jaqui Lynch.

AIX/VIOS Versioning Recommendations, a POWER9-Based HMC and the Evolution of IBM Certifications

Edit: Are you current?

Originally published by TechChannel April 22, 2021

Rob McNelly on why AIX has stood the test of time, the new POWER9-based HMC and how IBM certifications have evolved

I believe AIX has stood the test of time. This opinion is based in part on the fact that legacy AIX documentation is still relevant. For example, I recently downloaded two Redbooks, “IBM Certification Study Guide eServer p5 and pSeries Administration and Support for AIX 5L Version 5.3” and “IBM eServer Certification Study Guide – pSeries AIX System Support.” Seeing “eServer” in the titles gives you an idea of how far back I went, but for the record, that admin and support doc was published in 2006, and last updated in 2010. The certification study guide came out at the end of 2001, and was updated in 2004.

Honestly, a lot of this information is still useful. Of course not everything translates to today’s highly virtualized, flash storage-based environments, but so much does. This speaks to the design and the thought that’s been put into the OS from the start. The commands and concepts have always been well thought out. That the system can be tuned to fit unique workloads is very impressive. And the value of IBM Support, which quickly diagnoses and helps resolve problems, is something I still take for granted.

Did you need to relearn everything when you went from AIX 6.1 to 7.1, or 5.3 to 6.1, or 4.3.3 to 5.1? No. Do the smitty menus and fastpaths look the same? Do your old scripts typically continue to run when you upgrade to a new version of the OS? Do you have LPARs that have actually migrated between multiple versions of the OS? And (as much as I hate to even ask), are those LPARs that are running unsupported versions still going strong? That would be yes, yes, yes and yes.

As we consider our next moves—whether it’s patching and updating the OS or planning for cloud migration, the move to POWER10-based servers or an upgrade to AIX 7.3—we can be secure in knowing that the framework will remain the same.

Checking Your Current AIX/VIOS Versions

Nigel Griffiths has a great analogy about the AIX/VIOS versions you should be running: 

“You always get your cars brakes, shocks, tires & lights checked + updated every year, as you value your family + friends. The same goes for operating systems. Failing to update to current supported versions = you lose your job! Here is my take on the AIX/VIOS we all should be using.”

Use FLRT LITE to check for the current AIX and VIOS recommendations.

A New POWER9-Based HMC 

In case you missed it, IBM announced a newer HMC model based on the POWER9 processor: 

The Power HMC (7063-CR2) is a dedicated rack-mounted workstation that helps you to configure and manage system resources on Power servers using POWER7, POWER8, or POWER9 technology-based processors. The HMC connects to one or more managed systems to perform the following primary functions:

  • Provide a console for system administrators and service providers to manage server hardware
  • Deliver basic virtualization management through support for configuring logical partitions (LPARs) and dynamic resource allocation, including processor and memory settings
  • Detect, report, and store changes in hardware conditions
  • Act as a service focal point for service providers to determine an appropriate service strategy
  • Display operating system session terminals for each partition
  • Provide the call home focal point for managed servers
  • Display ASMI menus for managed servers

The announcement letter lists these standard hardware attributes:

  • 1U base configuration
  • POWER9 130W 6c CPU
  • 64 GB (4 x 16 GB) or 128 GB (4 x 32 GB) of DDR4 system memory
  • 2 x 1.8 TB SAS SFF 2.5-inch hard disk drive (HDD) RAID 1
  • Rail bracket option for round-hole rack mounts
  • Two USB 3.0 hub ports in the front of the server (option to remove)
  • Two USB 3.0 hub ports in the rear of the server
  • Redundant 900W power supplies
  • 4 x 1 Gb Ethernet ports
  • 2 x 10 Gb Ethernet Ports on optional PCI adapter
  • 1 x 1 Gb baseboard management controller (BMC) Ethernet port

To see it in action, check out this Nigel Griffiths video.

Power Systems VUG Covers Recent Announcements, AIX Anniversary 

As noted, AIX turns 35 this year. The latest Power Systems Virtual User Group presentation acknowledges the anniversary and covers recent AIX announcements. View the PDF and listen to the replay.

Among other changes, expect to see new AIX certifications. An AIX Foundations exam is coming soon, and other AIX certifications (including advanced certifications) will follow.

Here’s a post about the evolution of IBM certifications. Look for more granular levels of certification. IBM is calling them foundational skills, intermediate skills, and advanced skills. There are differences and distinctions between them:

  • Foundational activities are geared toward learners who are new to the subject matter and seeking to learn basic concepts and build foundational knowledge in support of efforts to gain a working knowledge of the topic. These credentials are typically issued to individuals with little or no prior knowledge and experience with the subject matter represented by the credential.
  • Intermediate level activities are geared toward learners who have acquired some degree of competence in the covered topic resulting from prior training, education and/or work experience. Intermediate level activities are for learners who seek to build upon foundational knowledge, refine and better hone their skills, and advance their understanding of the topic.
  • Advanced activities are tailored toward learners who have already achieved a higher degree of technical competence in the subject matter resulting from expanded training and supplemental work experience. Advanced level activities are for learners who wish to build upon intermediate knowledge and field experience toward the achievement of mastery in a specific technical area.

Other notes from this post:

  • Thought Leader credentials are earned by subject matter experts who are frequently consulted for their mastery level knowledge and skills. They are able to apply those skills in the most challenging situations and are highly effective at coaching and mentoring others in doing the same. A Thought Leader is an innovator and respected authority on the related subject matter. Thought Leader skill demonstration is typically validated by SME, board review, evidence submissions, and verified experiential activities.
  • IBM Certificate: Validation of knowledge, skills and abilities achieved through the aggregation of multiple learning experiences structured as a prescriptive learning pathway supporting skill development within a more comprehensive area of study.
  • IBM Professional Certification provides validation of IBM technology-oriented knowledge, skills and abilities for essential job roles through administration of proctored examination and/or performance-based testing. The certification assessment is independent of any specific educational event or related learning opportunities. Certification is also intended to measure or enhance continued competence through re-certification or renewal requirements. Re-certification may require mandatory continuing education hours, proctored reassessment, non-proctored reassessment or a combination thereof.

Disclosure: I did some volunteer work on the foundational test. Part of our discussion involved looking at past study guides.

Guru: A Simple Script That Capitalizes on IBM i’s Open Source Capabilities

Edit: Still one of my favorite trails, I returned in December and made it to Plateau Point.

Originally published by IT Jungle April 26, 2021

In my last piece, we got Yum and Bash running on IBM i. Now let’s build on that foundation to find an open campsite.

I’m serious. Recently I used an IBM i LPAR to schedule a hiking and camping trip to the Grand Canyon. Where to begin? My bio, perhaps. Here it’s mentioned that “Rob enjoys camping, hiking, biking, and backpacking through the mountains of Arizona. . . . His favorite trip was hiking to the blue waters of Havasupai, and he is planning on hiking the Grand Canyon in the near future.”

So last summer I went camping and ended up riding my bike around the South Rim. I told a friend about my plans to do more exploring there this year, including a rim to rim hike. Basically, he called me out. He wanted to do the hike, but he was also sick of hearing me go on about it.

Attempting a rim to rim hike in a single day isn’t recommended. It’s best to spend a night or two resting at the bottom at Phantom Ranch, one of the developed campsites, or to find your own spot in the back country (assuming you can obtain the proper permits and reservations). Of course the appeal of day hikes is that no permits are needed. However, you must be properly conditioned for this challenge:

“Over 250 people are rescued from the canyon each year. The difference between a great adventure in Grand Canyon and a trip to the hospital (or worse) is up to YOU. DO NOT attempt to hike from the rim to the river and back in one day, especially during the months of May to September.”

Though it was early April and not that hot yet, I took this advice. Rather than go rim to rim, I thought we’d try something of a practice hike down to Indian Garden. That covers around 10 miles round trip with the accompanying elevation changes. At that point we could gauge our comfort level, and if we felt good we could continue the hike to Plateau Point, which would add about three miles before we’d turn around. For some context, depending on the route chosen, if you do a rim to rim hike starting at the North Rim you’ll descend 6,000 feet, and then come up 4,500 feet to the top of the South Rim while covering around 24 miles.

Although plenty of free camping is available on National Forest land just south of the park entrance, I wanted to sleep closer to the trailhead, so I tried to get a spot at the developed Mather Campground near the South Rim of the Grand Canyon. The recreation.gov website had no availability at Mather for the nights I wanted, even though I’d periodically check and refresh. Eventually I got to wondering about other options, and sure enough, a site called campsitephotos.com has a campsite assist offering. This seemed to be what I was looking for.

I clicked on the arrow and entered my destination:

There were options to narrow the search, but I wasn’t picky.

I chose a Matrix scan:

Then I entered some potential dates:

I provided my cell number to receive text alerts:

Then it hit me up with the pricing options:

That was affordable enough, but what if I wanted to make changes? What if I wanted to check more often than every 15 minutes? How hard was it to roll my own website scraper? I searched further and found some interesting code on GitHub. (More readable versions are typed below each of the following screen shots, which came from: https://github.com/banool/recreation-gov-campsite-checker)

Campsite Availability Scraping

This has been updated to work with the new recreation.gov site and API!!! This script scrapes the https://recreation.gov website for campsite availabilities.

Note: Please don’t abuse this script. Most folks out there don’t know how to run scrapers against websites, so you’re at an unfair advantage by using this.

Example Usage
$ python camping.py --start-date 2018-07-20 --end-date 2018-07-23 --parks 232448 232450 232447 232770
TUOLUMNE MEADOWS: 0 site(s) available out of 148 site(s)
LOWER PINES: 11 site(s) available out of 73 site(s)
UPPER PINES: 0 site(s) available out of 235 site(s)
BASIN MONTANA CAMPGROUND: 0 site(s) available out of 30 site(s)

Installation

I wrote this in Python 3.7 but I’ve tested it as working with 3.5 and 3.6 also.

python3 -m venv myvenv
source myvenv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
# You're good to go!

For all I know this code may be in use or even modified by Campsite Assist. In any event it seemed like a great tool to run on an IBM i LPAR. And sure, many people may see Linux as a more natural environment for something like this, but there’s value in getting these tools to run on IBM i, particularly when it’s this simple.

I logged in via PuTTY and checked to see if Python 3 was installed:

	bash-4.4$ python3
	bash: python3: command not found

I used the find command to check for yum:

	bash-4.4$ find / -name yum -print
	/QOpenSys/pkgs/bin/yum

Then I used Yum to install Python3. I omitted many of the messages and skipped to the end:

	bash-4.4$ /QOpenSys/pkgs/bin/yum install python3
	Installed:
	  python3.ppc64 0:3.6.12-1
	Dependency Installed:
	  libreadline8.ppc64 0:8.0-1

	Complete!

I also wanted to install git so I could copy the code from GitHub.

	bash-4.4$ /QOpenSys/pkgs/bin/yum install git
	Installed:
	  git.ppc64 0:2.26.2-3
	Dependency Installed:
	  libpcre2-8-0.ppc64 0:10.35-1

I wasn’t sure where git was installed, so I ran find and then used git to clone the repository:

	bash-4.4$ find / -name git -print
	/QOpenSys/pkgs/bin/git
	bash-4.4$ /QOpenSys/pkgs/bin/git clone https://github.com/banool/recreation-gov-campsite-checker.git 
	Cloning into 'recreation-gov-campsite-checker'...
	remote: Enumerating objects: 16, done.
	remote: Counting objects: 100% (16/16), done.
	remote: Compressing objects: 100% (12/12), done.
	remote: Total 161 (delta 7), reused 11 (delta 4), pack-reused 145
	Receiving objects: 100% (161/161), 42.65 KiB | 2.24 MiB/s, done.
	Resolving deltas: 100% (77/77), done.
	bash-4.4$ ls -ld rec*
	drwxr-sr-x    4 rob 0             12288 Apr  3 07:20 recreation-gov-campsite-checker
	bash-4.4$ cd recreation-gov-campsite-checker/
	bash-4.4$ ls
	README.md  camping.py  fake_twitter_credentials.json  notifier.py  other  requirements.txt  setup.py

Now it was just a matter of running through the installation instructions listed at the beginning of this section:

	bash-4.4$ /QOpenSys/pkgs/bin/python3 -m venv myvenv
	bash-4.4$ source myvenv/bin/activate
	(myvenv) bash-4.4$
	(myvenv) bash-4.4$ pip install --upgrade pip
	Collecting pip
	  Downloading https://files.pythonhosted.org/packages/fe/ef/60d7ba03b5c442309ef42e7d69959f73aacccd0d86008362a681c4698e83/pip-21.0.1-py3-none-any.whl (1.5MB)
	    100% |################################| 1.5MB 3.1MB/s
	Installing collected packages: pip
	  Found existing installation: pip 18.1
	    Uninstalling pip-18.1:
	      Successfully uninstalled pip-18.1
	Successfully installed pip-21.0.1
	(myvenv) bash-4.4$ pip install -r requirements.txt

There was quite a bit of output. Here is how it ended:

Successfully installed Click-7.0 appdirs-1.4.3 attrs-18.2.0 black-18.9b0 certifi-2018.11.29 chardet-3.0.4 fake-useragent-0.1.11 future-0.17.1 idna-2.8 isort-4.3.4 oauthlib-3.0.1 python-dateutil-2.8.1 python-twitter-3.5 requests-2.21.0 requests-oauthlib-1.2.0 six-1.15.0 soupsieve-1.8 toml-0.10.0 urllib3-1.24.2

I got some errors when I ran the code, which uses emojis to alert you on the status of the search. Rather than try and make that work, I changed the emojis to English text by running:

	vi camping.py 

Then I edited lines 29 and 30:

	  +29 SUCCESS_EMOJI = "yay"
	   +30  FAILURE_EMOJI = "boo"
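
The errors themselves aren't shown here, so the cause is a guess, but a common culprit is a terminal or locale encoding that can't represent emoji. Under that assumption, a hypothetical alternative to hard-coding text is to fall back only when the output encoding can't handle the character. The helper name pick_marker and the two example characters are illustrative; they are not taken from camping.py.

```python
import sys


def pick_marker(emoji, fallback, encoding=None):
    """Return emoji if it can be encoded for output, else the plain-text fallback."""
    # Assumption: the failure comes from the output encoding (e.g. ASCII or Latin-1).
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    try:
        emoji.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        # Either the character can't be represented or the encoding name is unknown.
        return fallback
    return emoji


# Example characters only; substitute whatever markers the script defines.
SUCCESS_EMOJI = pick_marker("\U0001F3D5", "yay")
FAILURE_EMOJI = pick_marker("\u274C", "boo")
```

With a UTF-8 terminal the emoji are kept; with a more limited encoding the script quietly prints "yay" and "boo" instead.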

At this point it ran as expected. I tried a few different date ranges to see what was being returned:

(myvenv) bash-4.4$ python3 camping.py --start-date 2021-04-06 --end-date 2021-04-08 --nights 2 --parks 232490
There are no campsites available :(
boo MATHER CAMPGROUND (232490): 0 site(s) available out of 357 site(s)
(myvenv) bash-4.4$ python3 camping.py --start-date 2021-04-06 --end-date 2021-05-08 --nights 2 --parks 232490
There are campsites available from 2021-04-06 to 2021-05-08!!!
yay MATHER CAMPGROUND (232490): 23 site(s) available out of 357 site(s)
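
The --nights option asks for a stay of a given length somewhere inside the date range. As a rough illustration of that idea (not the script's actual implementation), the sketch below finds the dates that begin a run of consecutive available nights for one site. The function name and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta


def stay_start_dates(available, nights):
    """Return sorted start dates from which `nights` consecutive nights are available."""
    # Keep only the YYYY-MM-DD part of timestamps such as "2021-04-28T00:00:00Z".
    days = {datetime.strptime(stamp[:10], "%Y-%m-%d").date() for stamp in available}
    starts = []
    for day in sorted(days):
        # A valid stay needs every night from `day` onward to be in the set.
        if all(day + timedelta(days=offset) in days for offset in range(nights)):
            starts.append(day.isoformat())
    return starts


# Made-up availability for a single site, in the timestamp style the tool reports.
site_dates = ["2021-04-27T00:00:00Z", "2021-04-28T00:00:00Z", "2021-04-30T00:00:00Z"]
print(stay_start_dates(site_dates, 2))
```

This is why widening the date range in the second run above turns up sites: a longer window gives more chances for two free nights in a row.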

To view the raw data being returned, turn on debug mode with --debug. You’ll then get specific campsite IDs and available dates:

	(myvenv) bash-4.4$ python3 camping.py --start-date 2021-04-01 --end-date 2021-04-09 --parks 232490 --debug 2> test.out
	(myvenv) bash-4.4$ cat test.out 
	2021-04-03 15:38:36,066 - 19668 - DEBUG - Querying for 232490 with these params:
	 {'start_date': '2021-04-01T00:00:00.000Z'}
	2021-04-03 15:38:39,031 - 19668 - DEBUG - Information for park 232490: {
	  "4079": [],
	  "4080": [],
	  "4081": [],
	  "4082": [],
	  "4083": [],
	  "4084": [],
	  "4085": [
	    "2021-04-28T00:00:00Z"
	  ],
	  "4086": [],
	  "4087": [],
	  "4088": [],
	  "4089": [],
	  "4090": [
	    "2021-04-27T00:00:00Z"
	  ],
	  "4091": [],
	  "4092": [
	    "2021-04-28T00:00:00Z"

That’s just a taste. The actual file has tons of additional information. At this point, I created a quick and dirty script called test.bash:

	python3 camping.py --start-date 2021-04-01 --end-date 2021-04-09 --parks 232490 --debug 2> test.out
	cat test.out | grep '2021-04-0[5678]T'

The first line ran the code and sent the debug output to a file; the second line read through the file and searched for a date string. I broadened my search to April 5-8 rather than April 6-7 so I could see results even when my target dates were unavailable. Wild cards and regular expressions are beyond the scope of this article, and I realize there are probably more elegant ways to accomplish this, but I’d be happy to hear how you would have done it.
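
For what it's worth, here is one possible alternative to grep, sketched in Python. It assumes you already have the availability data as a dictionary of campsite ID to timestamp list, the shape shown in the debug output earlier; how you load it from the log is left out, and the function name is hypothetical.

```python
def sites_open_on(availability, wanted_dates):
    """Map each wanted date (YYYY-MM-DD) to the campsite IDs available that day."""
    result = {day: [] for day in wanted_dates}
    for site_id, timestamps in availability.items():
        for stamp in timestamps:
            day = stamp[:10]  # "2021-04-28T00:00:00Z" -> "2021-04-28"
            if day in result:
                result[day].append(site_id)
    return result


# Sample values copied from the debug output shown earlier.
availability = {
    "4085": ["2021-04-28T00:00:00Z"],
    "4086": [],
    "4090": ["2021-04-27T00:00:00Z"],
    "4092": ["2021-04-28T00:00:00Z"],
}
print(sites_open_on(availability, ["2021-04-27", "2021-04-28"]))
```

Unlike the grep approach, this keeps the campsite ID attached to each date, so you can tell which site opened up rather than just that something did.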

I made the script executable by running:

	(myvenv) bash-4.4$ chmod u+x test.bash

Then I ran it with:

	(myvenv) bash-4.4$ ./test.bash
	There are no campsites available :(
	boo MATHER CAMPGROUND (232490): 0 site(s) available out of 357 site(s)
	    "2021-04-05T00:00:00Z",
	    "2021-04-07T00:00:00Z",
	    "2021-04-07T00:00:00Z"
	    "2021-04-08T00:00:00Z"

At this point I created a simple loop that would run on the command line.

	(myvenv) bash-4.4$ while true
	> do date
	> ./test.bash
	> sleep 300
	> done
	Sat Apr  3 08:34:35 CST 2021
	There are no campsites available :(
	boo MATHER CAMPGROUND (232490): 0 site(s) available out of 357 site(s)
	    "2021-04-05T00:00:00Z",
	    "2021-04-07T00:00:00Z",
	    "2021-04-07T00:00:00Z"
	    "2021-04-08T00:00:00Z"

Rather than hammer the site, I gave it a 5-minute delay (300 seconds). Yes, even 5 minutes may be excessive, but I figured it wouldn’t need to run for long. In any event, that interval worked for me. After about 25 minutes, an open campsite that fit my criteria became available.
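
The same polite-polling idea can be sketched in Python. This is a minimal illustration, not part of the GitHub project: poll, its parameters and the stand-in check are all hypothetical names, and a real check would call the scraper instead.

```python
import time


def poll(check, interval_seconds=300, max_attempts=12, sleep=time.sleep):
    """Call check() until it returns something truthy or the attempts run out.

    Returns (attempts_used, result); result is None if nothing was found.
    """
    for attempt in range(1, max_attempts + 1):
        result = check()
        if result:
            return attempt, result
        if attempt < max_attempts:
            sleep(interval_seconds)  # wait between requests rather than hammer the site
    return max_attempts, None


# Stand-in for a real availability check: nothing for five calls, then a date.
responses = iter([None, None, None, None, None, ["2021-04-06"]])
# The no-op sleep keeps this demonstration instant; omit it to really wait.
print(poll(lambda: next(responses), sleep=lambda seconds: None))
```

Capping the number of attempts is a deliberate choice here: unlike `while true`, the loop stops on its own if you forget about it.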

During this test run you can see the changes in availability:

	Sat Apr  3 08:34:35 CST 2021
	There are no campsites available :(
	boo MATHER CAMPGROUND (232490): 0 site(s) available out of 357 site(s)
	    "2021-04-05T00:00:00Z",
	    "2021-04-07T00:00:00Z",
	    "2021-04-07T00:00:00Z"
	    "2021-04-08T00:00:00Z"
	Sat Apr  3 08:39:38 CST 2021
	There are no campsites available :(
	boo MATHER CAMPGROUND (232490): 0 site(s) available out of 357 site(s)
	    "2021-04-05T00:00:00Z",
	    "2021-04-07T00:00:00Z",
	    "2021-04-08T00:00:00Z"
	Sat Apr  3 08:44:39 CST 2021
	There are no campsites available :(
	boo MATHER CAMPGROUND (232490): 0 site(s) available out of 357 site(s)
	    "2021-04-07T00:00:00Z",
	    "2021-04-08T00:00:00Z"
	Sat Apr  3 08:49:42 CST 2021
	There are no campsites available :(
	boo MATHER CAMPGROUND (232490): 0 site(s) available out of 357 site(s)
	    "2021-04-08T00:00:00Z"
	Sat Apr  3 08:54:45 CST 2021
	There are no campsites available :(
	boo MATHER CAMPGROUND (232490): 0 site(s) available out of 357 site(s)
	    "2021-04-08T00:00:00Z"

To get this going, you need to know your campground ID. Search for your campground on recreation.gov; the ID is in your title bar.
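
The numeric ID also appears in the campground page's URL. Assuming the URL looks like https://www.recreation.gov/camping/campgrounds/232490 (the shape is an assumption based on the site at the time of writing), a small hypothetical helper can pull the ID out for you:

```python
import re


def campground_id(url):
    """Return the numeric campground ID from a recreation.gov URL, or None."""
    # Assumed URL shape: .../camping/campgrounds/<digits>
    match = re.search(r"/camping/campgrounds/(\d+)", url)
    return match.group(1) if match else None


print(campground_id("https://www.recreation.gov/camping/campgrounds/232490"))
```

The value returned is what you pass to the --parks option.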

This will work for any campsite. Emojis aside, this code worked without any modifications. Everything ran as described on GitHub, and it was running on a Power Systems server. Incidentally, I also tried it on a Linux machine to make sure there were no differences between the platforms. Indeed, everything was the same, right down to the emoji issue.

As it happened, my buddy couldn’t make that trip, so my son joined me for some camping. We hiked down to Indian Garden and back along the Bright Angel trail. We got a late start, so we made sure to pack it in before dark rather than continue on to Plateau Point. Physically I felt up to it, but I also realized I’ll need to train more for the real deal; coming back uphill is no joke. My legs were sore and stiff for days after.

Knowing that I had a reservation at a campsite that was relatively close to where we started/ended the hike made the experience even more enjoyable. Sure, this isn’t exactly a traditional use for IBM i, but my tale confirms what you should already know. Using modern tools and techniques, this OS is capable of solving all kinds of interesting problems.

AIX at 35 and the Layered Significance of AIX 7.3

Edit: Have you started your upgrades yet?

Originally published by TechChannel March 10, 2021

Rob McNelly on AIX and Power hardware improvements, and where the AIX roadmap might lead beyond 2023

AIX turns 35 this year. While it’s fun to look back (here’s the IBM announcement letter from 1986), there’s also ample reason to look ahead, especially since IBM recently stated that it intends to release AIX 7.3 before the end of this year.

Previous AIX releases have included an open beta program, and it’s my understanding that this will be the case with 7.3 as well. Expect to see additional open-source packages bundled and supported with the base OS out of the box. The new release will run on POWER8 processors and later, so it may be time to consider migrating from any POWER7 or older hardware that’s still running in your environment.

I love AIX, to the point that I almost feel protective of it. I’ve said for years—most recently in 2018—that AIX isn’t going anywhere. Still, some pushback persists, so I feel like I need to continue to remind everyone that this OS runs critical workloads throughout the business world. For instance, I’ll get asked, “When is the next release coming?” even though people should understand that new capabilities are being brought to AIX on an ongoing basis through the application of service packs (SPs) and the release of new technology levels (TLs).

It’s a matter of aesthetics, but had IBM opted to call this release 8.1, that stamp of newness would have provided doubters with additional reassurance. Of course there’s a flip side, too. The leaps from AIX 5.3 to 6.1 and AIX 6.1 to 7.1 made some clients and ISVs skittish. From this perspective, the steps from 7.1 to 7.2 to 7.3 seem less urgent.

In any event, there is intrinsic value in knowing that a new release is on the way. Consider the tone of this piece from The Register: “While IBM has promised years more support for AIX, news of an update will be welcome as a sign of ongoing commitment and because the last major drop of the OS was 2015’s version 7.2.”

However you’re looking at this, keep in mind that IBM’s plans for AIX go far beyond 2021. The AIX lifecycle currently extends to 2023. For some though, a two-year window isn’t much reassurance. “What will happen in 2024?” I hear. To that, I encourage people to check out this AIX Executive Strategy paper. This document includes a roadmap that takes AIX beyond 2030. You’ll need an IBM ID to download it, but it’s worth your while. You’ll find nuggets throughout that illustrate the value of our favorite OS. For instance:

“AIX is deployed across a variety of industries such as finance, manufacturing, retail, telecommunications, healthcare, travel and government, along with many others. … As IT infrastructure expands into new workloads, the ability of Power Systems and PowerVM virtualization allows AIX, IBM i and Linux to run side by side for efficient consolidation and optimization of data exchange and processing between these different environments. Power Systems is unique in its capabilities to host this wide range of solutions efficiently”

And, as has been the case for years, IBM provides “binary compatibility guarantees to ensure that clients can run their workloads on new AIX releases such as 7.3, on the latest Power platform without having to worry about recompiling or rewriting applications.”

So, to sum up: The capabilities of AIX are continually being improved. Power hardware is capable of running multiple operating systems on the same frame, and, as always, AIX is running mission critical applications in multiple industries around the world. Does that sound like a dying OS and ecosystem to you?

There’s other recent IBM news of interest to AIX users. Brandon Pederson notes that IBM is encouraging clients to upgrade to POWER9 now by allowing them to lock in a price for POWER10:

“After announcing the next generation POWER processor, we often hear from clients, “oh, I’ll just wait to upgrade my infrastructure until then!” But why wait? There are immediate performance, availability and security benefits to be had by upgrading to POWER9 now. So, to help put clients on the path to POWER10 but also take advantage of POWER9 right away, we are announcing a special offering for our scale-out servers. Available starting today, the IBM Power Systems Flexible Trade Up Offer for the S922 and S924 will help clients upgrade now to POWER9 and then to POWER10 when available at a predictable price.”

Anyway, now is a good time to reflect on all the reasons you love AIX. To that end, Nigel Griffiths has a link to some of the AIX “Best Bits,” where he highlights key parts of the OS and ecosystem that are often taken for granted. And Prenessa Lowery tells us what to expect as IBM officially marks the 35-year anniversary:

“You will have the opportunity to hear from IBM executives, SMEs, AIX influencers, AIX customers and more as they share their AIX story! Also, you will get a chance to gain more insights on how AIX will prepare clients for the future.”

So there’s a whole lot going on with AIX, now and well into the future. As we take the time to celebrate where we came from, we should also be excited for what’s ahead. Hopefully the knowledge that AIX 7.3 and POWER10 are on the horizon is a compelling reason to believe that your favorite operating system is alive and well and worth your continued attention.

Guru: Getting Yum And Bash Up And Running

Edit: It might be time to start looking at dnf

Originally published by IT Jungle March 15, 2021

In the March 1 edition of The Four Hundred, I noted that an emphasis on things like system/application modernization and open source solutions gives IBM i newcomers a degree of comfort with the platform. I also made the point that no matter how long you or I have been at this, there are always people, young or not so young, who are new to the platform and come to this site seeking introductory information about various tasks and capabilities.

With this in mind, I want to delve further into open source for those who are new to it. As popular as open source is on this platform, not everyone in the IBM i world is there yet. Due to a lack of time – or maybe even a lack of interest – there remain administrators who don’t use it and environments that don’t deploy it.

It’s a fairly simple path from Unix to Linux to Solaris to AIX to open source software running on IBM i. Of course syntax and technique differ, but many of the concepts easily move from there to here. Certainly compared to, say, managing Windows environments vs. IBM i, there’s much more common ground with open source.

AIX has had a Linux application toolbox for years. It consists of open source code compiled to run on AIX that can be installed as rpm packages. Of course it wasn’t always this way (as Alex Woodie explains), but these days, the open source model on IBM i is quite similar. The software installation dependency problem (a.k.a. RPM hell) has been addressed with Yum, which makes installing packages a breeze. Run a shell script and download a minimal number of RPM packages to get started. Then either allow your machine to connect to the internet for additional downloads as needed, or set up a repository on a machine that RPM can access in the local network.

I’m also intrigued by the new capabilities that bash has received on IBM i. In the aforementioned article on Access Client Solutions, I left off with an SSH session connected to IBM i. Now, for those wishing to explore the world of open source, let’s get Yum and Bash working on the system.

I followed these directions. Different Yum install options are available. I chose the bootstrap.sql method.

I downloaded bootstrap.sql to my machine and selected the Run SQL Scripts option under the database section of the ACS menu.

It installed as expected, as seen below.

Be sure to read the whole document. Toward the end you’ll find some important notes to help you adjust your PATH. There’s also a link to a Yum cheat sheet.

Once this was set up and I was logged in via SSH to the command line, I located the Yum installation by running:

find / -name yum -print

That provided the full path to the Yum command:

/QOpenSys/pkgs/bin/yum
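
Walking the whole filesystem with find works, but it can be slow. As a lighter-weight check, and purely as a sketch, the hypothetical helper below searches the directories on your PATH plus the open source package directory reported above. It relies only on Python's standard shutil.which; the function name and default directory are illustrative.

```python
import os
import shutil


def locate(command, extra_dirs=("/QOpenSys/pkgs/bin",)):
    """Search PATH plus extra_dirs for command; return its full path or None."""
    # Drop empty PATH entries so the current directory isn't searched by accident.
    path_dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    search_path = os.pathsep.join(path_dirs + list(extra_dirs))
    return shutil.which(command, path=search_path)


# Prints the full path if yum is found, or None if it is not.
print(locate("yum"))
```

If this finds yum only because of the extra directory, that is a hint to add the directory to your PATH, as the installation notes suggest.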

I then ran /QOpenSys/pkgs/bin/yum update:

bash-4.4$ /QOpenSys/pkgs/bin/yum update
Setting up Update Process
Resolving Dependencies
--> Running transaction check
---> Package bash.ppc64 0:4.4-2 will be updated
---> Package bash.ppc64 0:4.4-5 will be an update
---> Package ca-certificates.noarch 0:2_git20170807.10b2785-1 will be updated
---> Package ca-certificates.noarch 0:2_git20170807.10b2785-2 will be an update
---> Package libcurl4.ppc64 0:7.70.0-1 will be updated
---> Package libcurl4.ppc64 0:7.70.0-3 will be an update
---> Package libsqlite3-0.ppc64 0:3.19.3-2 will be updated
---> Package libsqlite3-0.ppc64 0:3.32.3-1 will be an update
---> Package libssh2-1.ppc64 0:1.9.0-2 will be updated
---> Package libssh2-1.ppc64 0:1.9.0-3 will be an update
---> Package libutil2.ppc64 0:0.8.1-1 will be updated
---> Package libutil2.ppc64 0:0.9.1-1 will be an update
---> Package python2.ppc64 0:2.7.18-3 will be updated
---> Package python2.ppc64 0:2.7.18-5 will be an update
--> Processing Dependency: update-alternatives for package: python2-2.7.18-5.ppc64
--> Processing Dependency: update-alternatives for package: python2-2.7.18-5.ppc64
---> Package python2-rpm.ppc64 0:4.13.1-2 will be updated
---> Package python2-rpm.ppc64 0:4.13.1-7 will be an update
---> Package rpm.ppc64 0:4.13.1-2 will be updated
---> Package rpm.ppc64 0:4.13.1-7 will be an update
--> Processing Dependency: curl for package: rpm-4.13.1-7.ppc64
---> Package yum.noarch 0:3.4.3-18 will be updated
---> Package yum.noarch 0:3.4.3-19 will be an update
--> Running transaction check
---> Package curl.ppc64 0:7.70.0-3 will be installed
---> Package update-alternatives.ppc64 0:1.19.7-1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package                 Arch       Version                       Repository
                                                                           Size
================================================================================
Updating:
 bash                    ppc64      4.4-5                         ibm     2.1 M
 ca-certificates         noarch     2_git20170807.10b2785-2       ibm      12 k
 libcurl4                ppc64      7.70.0-3                      ibm     403 k
 libsqlite3-0            ppc64      3.32.3-1                      ibm     1.7 M
 libssh2-1               ppc64      1.9.0-3                       ibm     327 k
 libutil2                ppc64      0.9.1-1                       ibm      17 k
 python2                 ppc64      2.7.18-5                      ibm      26 M
 python2-rpm             ppc64      4.13.1-7                      ibm     276 k
 rpm                     ppc64      4.13.1-7                      ibm     2.2 M
 yum                     noarch     3.4.3-19                      ibm     1.2 M
Installing for dependencies:
 curl                    ppc64      7.70.0-3                      ibm     116 k
 update-alternatives     ppc64      1.19.7-1                      ibm      84 k

Transaction Summary
================================================================================
Install       2 Packages
Upgrade      10 Packages

Total download size: 34 M
Is this ok [y/N]: y
Downloading Packages:
(1/12): bash-4.4-5.ibmi7.2.ppc64.rpm                        | 2.1 MB  00:01
(2/12): ca-certificates-2_git20170807.10b2785-2.ibmi7.2.noa |  12 kB  00:00
(3/12): curl-7.70.0-3.ibmi7.2.ppc64.rpm                     | 116 kB  00:00
(4/12): libcurl4-7.70.0-3.ibmi7.2.ppc64.rpm                 | 403 kB  00:00
(5/12): libsqlite3-0-3.32.3-1.ibmi7.2.ppc64.rpm             | 1.7 MB  00:00
(6/12): libssh2-1-1.9.0-3.ibmi7.2.ppc64.rpm                 | 327 kB  00:00
(7/12): libutil2-0.9.1-1.ibmi7.2.ppc64.rpm                  |  17 kB  00:00
(8/12): python2-2.7.18-5.ibmi7.2.ppc64.rpm                  |  26 MB  00:13
(9/12): python2-rpm-4.13.1-7.ibmi7.2.ppc64.rpm              | 276 kB  00:00
(10/12): rpm-4.13.1-7.ibmi7.2.ppc64.rpm                     | 2.2 MB  00:01
(11/12): update-alternatives-1.19.7-1.ibmi7.2.ppc64.rpm     |  84 kB  00:00
(12/12): yum-3.4.3-19.ibmi7.2.noarch.rpm                    | 1.2 MB  00:00
--------------------------------------------------------------------------------
Total                                           1.9 MB/s |  34 MB     00:18
Running Transaction Check
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Updating   : bash-4.4-5.ppc64                                            1/22
  Updating   : libutil2-0.9.1-1.ppc64                                      2/22
  Updating   : libssh2-1-1.9.0-3.ppc64                                     3/22
  Updating   : libcurl4-7.70.0-3.ppc64                                     4/22
  Installing : curl-7.70.0-3.ppc64                                         5/22
  Updating   : rpm-4.13.1-7.ppc64                                          6/22
  Updating   : libsqlite3-0-3.32.3-1.ppc64                                 7/22
  Installing : update-alternatives-1.19.7-1.ppc64                          8/22
  Updating   : python2-2.7.18-5.ppc64                                      9/22
update-alternatives: using /QOpenSys/pkgs/bin/python2.7 to provide /QOpenSys/pkgs/bin/python (python) in auto mode
  Updating   : python2-rpm-4.13.1-7.ppc64                                 10/22
  Updating   : yum-3.4.3-19.noarch                                        11/22
  Updating   : ca-certificates-2_git20170807.10b2785-2.noarch             12/22
  Cleanup    : yum-3.4.3-18.noarch                                        13/22
  Cleanup    : python2-rpm-4.13.1-2.ppc64                                 14/22
  Cleanup    : rpm-4.13.1-2.ppc64                                         15/22
  Cleanup    : python2-2.7.18-3.ppc64                                     16/22
  Cleanup    : libcurl4-7.70.0-1.ppc64                                    17/22
  Cleanup    : ca-certificates-2_git20170807.10b2785-1.noarch             18/22
  Cleanup    : bash-4.4-2.ppc64                                           19/22
  Cleanup    : libssh2-1-1.9.0-2.ppc64                                    20/22
  Cleanup    : libsqlite3-0-3.19.3-2.ppc64                                21/22
  Cleanup    : libutil2-0.8.1-1.ppc64                                     22/22

Dependency Installed:
  curl.ppc64 0:7.70.0-3           update-alternatives.ppc64 0:1.19.7-1

Updated:
  bash.ppc64 0:4.4-5          ca-certificates.noarch 0:2_git20170807.10b2785-2
  libcurl4.ppc64 0:7.70.0-3   libsqlite3-0.ppc64 0:3.32.3-1
  libssh2-1.ppc64 0:1.9.0-3   libutil2.ppc64 0:0.9.1-1
  python2.ppc64 0:2.7.18-5    python2-rpm.ppc64 0:4.13.1-7
  rpm.ppc64 0:4.13.1-7        yum.noarch 0:3.4.3-19

Complete!
bash-4.4$

Again, referring to this article, I was able to run some of the commands the author suggests:

bash-4.4$ getjobid
Process identifier 233 is 003275/QSECOFR/QP0ZSPWP
bash-4.4$ cl
cl: usage: cl [-beEhiIkKnOpqsSv] COMMAND [ARG ...]
bash-4.4$ liblist
QSYS        SYS
QSYS2       SYS
QUSRSYS     SYS
QSHELL      PRD
QGPL        USR
QTEMP       USR

While working on this piece, there was a bash update. So I brought my system to the latest and greatest by running another Yum update command:

bash-4.4$ yum update
ibm                                                         | 3.6 kB  00:00
ibm/primary_db                                              | 372 kB  00:00
Setting up Update Process
Resolving Dependencies
--> Running transaction check
---> Package bash.ppc64 0:4.4-5 will be updated
---> Package bash.ppc64 0:4.4-6 will be an update
---> Package libncurses6.ppc64 0:6.0-6 will be updated
---> Package libncurses6.ppc64 0:6.0-7 will be an update
---> Package libopenssl1_1.ppc64 0:1.1.1g-1 will be updated
---> Package libopenssl1_1.ppc64 0:1.1.1i-1 will be an update
---> Package ncurses-terminfo.ppc64 0:6.0-6 will be updated
---> Package ncurses-terminfo.ppc64 0:6.0-7 will be an update
---> Package perl.ppc64 0:5.24.1-1 will be updated
---> Package perl.ppc64 0:5.24.1-2 will be an update
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package                   Arch           Version            Repository    Size
================================================================================
Updating:
 bash                      ppc64          4.4-6              ibm          2.1 M
 libncurses6               ppc64          6.0-7              ibm          1.5 M
 libopenssl1_1             ppc64          1.1.1i-1           ibm          2.0 M
 ncurses-terminfo          ppc64          6.0-7              ibm          583 k
 perl                      ppc64          5.24.1-2           ibm           28 M

Transaction Summary
================================================================================
Upgrade       5 Packages

Total download size: 35 M
Is this ok [y/N]: y
Downloading Packages:
(1/5): bash-4.4-6.ibmi7.2.ppc64.rpm                         | 2.1 MB  00:01
(2/5): libncurses6-6.0-7.ibmi7.2.ppc64.rpm                  | 1.5 MB  00:00
(3/5): libopenssl1_1-1.1.1i-1.ibmi7.2.ppc64.rpm             | 2.0 MB  00:01
(4/5): ncurses-terminfo-6.0-7.ibmi7.2.ppc64.rpm             | 583 kB  00:00
(5/5): perl-5.24.1-2.ibmi7.2.ppc64.rpm                      |  28 MB  00:16
--------------------------------------------------------------------------------
Total                                           1.8 MB/s |  35 MB     00:19
Running Transaction Check
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Updating   : ncurses-terminfo-6.0-7.ppc64                                1/10
  Updating   : libncurses6-6.0-7.ppc64                                     2/10
  Updating   : bash-4.4-6.ppc64                                            3/10
  Updating   : libopenssl1_1-1.1.1i-1.ppc64                                4/10
  Updating   : perl-5.24.1-2.ppc64                                         5/10
  Cleanup    : bash-4.4-5.ppc64                                            6/10
  Cleanup    : libncurses6-6.0-6.ppc64                                     7/10
  Cleanup    : ncurses-terminfo-6.0-6.ppc64                                8/10
  Cleanup    : libopenssl1_1-1.1.1g-1.ppc64                                9/10
  Cleanup    : perl-5.24.1-1.ppc64                                        10/10
Updated:
  bash.ppc64 0:4.4-6                     libncurses6.ppc64 0:6.0-7
  libopenssl1_1.ppc64 0:1.1.1i-1         ncurses-terminfo.ppc64 0:6.0-7
  perl.ppc64 0:5.24.1-2
Complete!

I mentioned that bash update to show you why it’s important to regularly check for yum updates. The latest versions provide new features and enhanced security. Putting off system updates can be costly, but yum makes it pretty simple to keep current.
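If you want to check for updates from a script rather than by hand, Yum’s check-update subcommand is documented to exit with 100 when updates are waiting, 0 when there are none, and 1 on error. The sketch below only translates that status into a message; the function name is mine, and I haven’t confirmed every Yum build on IBM i behaves identically, so treat it as a starting point.

```shell
# Translate the documented exit status of "yum check-update" into a message.
describe_check_update() {
    case $1 in
        0)   echo "up to date" ;;
        100) echo "updates available" ;;
        *)   echo "error (status $1)" ;;
    esac
}

# Typical use on the system itself (not run here):
# /QOpenSys/pkgs/bin/yum -q check-update >/dev/null 2>&1
# describe_check_update $?
```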

At this point, you should have OpenSSH, Yum, and the latest version of bash installed (as well as ACS). Now you’re set up to install other open source packages.

To find out what’s available, run “yum list.” At the time of this writing there were 398 selections. Note: This doesn’t mean there are nearly 400 programs. Some of these packages are actually components that are needed to get various programs to run. The good news is that yum sorts out the dependencies – and you do have many packages from which to choose.
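To get a count like that yourself, you can pipe the listing through a small filter. This is a sketch under stated assumptions: the function name is hypothetical, and it only recognizes the ppc64 and noarch architectures that appear in the output earlier in this article. It counts each name.arch once, even if it shows up under both the installed and available headings.

```shell
# Count unique package entries (name.arch) in "yum list" output read on stdin.
# Header lines and wrapped continuation lines are ignored because their first
# field does not end in a known architecture suffix.
count_listed_packages() {
    awk '$1 ~ /\.(ppc64|noarch)$/ && !seen[$1]++ { n++ } END { print n + 0 }'
}

# On the system itself (not run here):
# yum list | count_listed_packages
```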

If you’re new to IBM i but are familiar with Linux, running things in bash while logged in via PuTTY is great. Rather than having to log in to a 5250 console and navigate green screens, you’re dealing with familiar commands and syntax. You can write scripts as you would with any other Unix-like system. And for experienced IBM i users, this interface also allows you to easily leverage what you already know.

For example, if I preface a command with “cl” in Bash, I can do some interesting things:

cl wrkactjob

As you can imagine the output from wrkactjob is long, so try it on your system. Some smaller output comes when I run:

cl wrkoutq

bash-4.4$ cl wrkoutq
CPD000D: Command *LIBL/WRKOUTQ not safe for a multithreaded job.
 5770SS1  V7R4M0  190621            Work With All Output Queues          2/12/21  10:13:03 CST            Page    1
 Queue        Library       Files    Writer       Status
 QDKT         QGPL              0                  RLS
 QPFROUTQ     QGPL              0                  RLS
 QPRINT       QGPL             14                  RLS
 QPRINTS      QGPL              0                  RLS
 QPRINT2      QGPL              0                  RLS
 QSPRCLOUTQ   QRCL              0                  RLS
 QSRVMON      QSERVICE          0                  RLS
 QS9SRVAGT    QSRVAGT           0                  RLS
 QEZDEBUG     QUSRSYS           0                  RLS
 QEZJOBLOG    QUSRSYS           1                  RLS
 QTPPPOUTQ    QUSRSYS           0                  RLS
         * * * * *   E N D   O F   L I S T I N G   * * * * *

You may need to get the hang of formatting commands, for example:

bash-4.4$ cl wrkoutq outq\(qgpl/qprint\)

CPD000D: Command *LIBL/WRKOUTQ not safe for a multithreaded job.
 5770SS1 V7R4M0 190621         Work With Output Queue        QPRINT      in  QGPL        2/12/21 10:14:04 CST            Page    1
 File       User       User Data  Status Pages Copies Form Type  Pty File Number   Job        Number Date     Time
 QPJOBLOG   QSYS       QLPSVR      RDY       1     1  *STD        5          1     QLPSVR     000135 02/26/20 02:45:05
 REQUESTROL QSECOFR    SQL         RDY      47     1  *STD        5          1     DSP01      000146 02/26/20 03:25:50
 QPRINT     QSYS                   RDY       1     1  *STD        5          1     QSLPSVR    000402 02/27/20 03:29:03
 QPRINT     QSYS                   RDY       1     1  *STD        5          1     QSLPSVR    000734 03/03/20 06:08:45
 QPRINT     QSYS                   RDY       1     1  *STD        5          5     QPRTJOB    000551 06/04/20 15:44:07
 QPRINT     QSYS                   RDY       1     1  *STD        5          6     QPRTJOB    000551 06/04/20 15:44:18
 QPRINT     QSYS                   RDY       1     1  *STD        5          7     QPRTJOB    000551 06/04/20 15:44:28
 QPRINT     QSYS                   RDY       1     1  *STD        5          8     QPRTJOB    000551 06/04/20 15:44:39
 QPCSMPRT   QSECOFR                RDY       2     1  *STD        5          1     DSP01      002556 07/02/20 09:13:15
 QPCSMPRT   QSECOFR                RDY       2     1  *STD        5          2     DSP01      002556 07/02/20 09:13:16
 QPCSMPRT   QSECOFR                RDY       1     1  *STD        5          3     DSP01      002556 07/02/20 09:13:16
 QPRINT     QSYS                   RDY       1     1  *STD        5          1     QSLPSVR    001271 07/07/20 02:31:27
 QSYSPRT    QSECOFR    CHGLICINF   RDY       1     1  *STD        5          1     DSP01      002826 01/21/21 12:39:35
 
         * * * * *   E N D   O F   L I S T I N G   * * * * *
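The backslashes in that command are there for bash, not for IBM i: unquoted parentheses are shell syntax, so they have to be escaped or quoted before cl ever sees them. This little sketch (show_args is a hypothetical helper, and it doesn’t call cl at all) shows that escaping and single-quoting hand over exactly the same argument.

```shell
# Print each argument on its own line so the effect of shell quoting is visible.
show_args() {
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}

show_args wrkoutq outq\(qgpl/qprint\)    # backslash-escaped parentheses
show_args wrkoutq 'outq(qgpl/qprint)'    # single-quoted parameter, same result
```

Both calls print wrkoutq followed by outq(qgpl/qprint), so the single-quoted form is an easier-to-read alternative when a CL parameter has parentheses.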

Here’s another nice command:

cl dspmsg msgq\(qsysopr\)

Incidentally, running this example alerted me to an issue I needed to dig into:

CPI096E  99  INFO         Disk unit connection is missing.

Now if I want to install Python, PHP, or Ruby, I can just issue the “yum install” command.

Finally, note the various support options for open source users on IBM i. Bitbucket allows you to tap into the broader open source community, and IBM i open source team members do monitor the site. If you can’t find what you need there, you can contact IBMers Camilla Sharpe and Jesse Gorzinski, and we can help you reach them if needed.

Rob McNelly is a senior Power Systems solutions architect doing pre-sales and post-sales support for Meridian IT, headquartered in Deerfield, Illinois. McNelly was a technical editor for IBM Systems Magazine, and a former administrator within IBM’s Integrated Technology Delivery and Server Operations division. Prior to working for IBM, McNelly was an OS/400 and IBM i operator for many years for multiple companies. McNelly was named an IBM Champion for Power Systems in 2011, an IBM Champion Lifetime Achievement recipient in 2019, and can be reached at rob.mcnelly@gmail.com.

Standing Pat Is No Longer an Option With the HMC

Edit: Have you upgraded yet?

Originally published by TechChannel March 19, 2021

New POWER9 servers running new firmware won’t work with your old HMC. Rob McNelly explains why, and provides some upgrade options.

Lifetime IBM Champion Jaqui Lynch recently wrote about why you need to upgrade to the latest HMC levels and how you can accomplish this task. Her article is full of good information, and you’ll find some great, bookmark-worthy links at the end.

But here’s the thing: We (and by we, I mean consultants, along with IBM itself) have been pounding this drum for a while now. Details about the HMC upgrade and end of life for early, x86 “appliance only” HMC levels have been public literally for years. I first noted the big changes in 2017, and I’ve kept writing about them.

Given all the information that’s been out there for so long, no one should be caught off-guard by IBM’s HMC plans. And yet, many seem surprised. Why? I believe many clients ignored the messaging because they erroneously assumed they could continue to upgrade software levels for their existing HMC x86 hardware indefinitely, as needed. Provided the performance is good, they don’t think about a replacement. And in environments where HMC use is limited to getting systems up and running, even slow performance might not spur taking action.

But it’s time. Getting current with the HMC can be put off no longer. Standing pat now comes with actual, real-world consequences. For instance, I know of a client that ordered a new POWER9 server. It arrived in the data center running FW950 firmware. This caused a bit of a headache as they were unable to connect their new server to their older HMC. What should have been a routine hardware refresh ended up being much more challenging, to the point where budgets and project plans were impacted.

Upgrade Options

So, one more time, here’s the official word from IBM support:

“POWER9 servers with FW950 require minimum HMC level V9R2M950. Starting November 23, 2020, IBM manufacturing began shipping all POWER9 servers with FW950. It is customary for each server firmware release to require a minimum HMC level. In the case of FW950 the minimum HMC level is V9R2M950. Managing a server with FW950 from an HMC lower than the required level will display “Version Mismatch” for the system state indicating the HMC needs to be upgraded to V9R2M950.

On April 24, 2018, it was announced that update support for x86-based HMC hardware appliances would end in 2018. V9R2M950 has dropped support of the x86-based physical HMC appliances (7042-CR9, 7042-CR8, 7042-CR7) whose last supported release is therefore V9R1.

What this means is that if the customer does not have a 7063-CR1 or Virtual HMC V9 appliance, they cannot HMC install/manage any POWER9 server running FW950.” (Emphasis mine)

It’s that simple: New POWER9 servers running new firmware won’t work with your old HMC. Moving forward is imperative. So how does one do that? Let’s run through the process of getting a new HMC. First, here’s another set of options for HMC upgrades, via either the network, a DVD or a USB flash drive. This document explains configuration and lists the default passwords you should change.
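Before planning the upgrade path, it can help to check where an existing HMC stands against the minimum level quoted above. Here is a minimal sketch that compares two level strings of the form V9R2M950. The function names are hypothetical, and you would still need to read the current level off your own HMC; this only does the comparison.

```shell
# Split a level such as V9R2M950 into "9 2 950" (prints nothing if malformed).
hmc_level_fields() {
    printf '%s\n' "$1" |
        sed -n 's/^V\([0-9][0-9]*\)R\([0-9][0-9]*\)M\([0-9][0-9]*\)$/\1 \2 \3/p'
}

# Succeed when level $1 is at or above minimum level $2.
hmc_level_at_least() {
    # Unquoted on purpose: word splitting yields six numeric fields.
    set -- $(hmc_level_fields "$1") $(hmc_level_fields "$2")
    [ $# -eq 6 ] || return 2          # one of the levels did not parse
    [ "$1" -gt "$4" ] && return 0     # version
    [ "$1" -lt "$4" ] && return 1
    [ "$2" -gt "$5" ] && return 0     # release
    [ "$2" -lt "$5" ] && return 1
    [ "$3" -ge "$6" ]                 # modification
}

if hmc_level_at_least V9R1M940 V9R2M950; then
    echo "HMC level is sufficient for FW950"
else
    echo "HMC must be upgraded before managing a FW950 server"
fi
```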

Speaking of change, note that there’s a new tool for managing that shiny new 7063-CR1 HMC. It’s called the baseboard management controller (BMC). This document tells you how to configure the BMC IP address. Accessing the HMC via the BMC gives you additional options compared to the x86 appliances of yesterday. For starters, you can get a console and power on the HMC remotely, which can be extremely useful in a lights out environment. You’ll know that the HMC is booting properly just by watching the output scroll by on the console. That’s reassuring when you’re performing work remotely. Should you require additional access control, this document explains how to restrict BMC access. And here’s how you can configure the BMC from petitboot or from the enhanced GUI. And here’s a list of other links on BMC management. Finally, this doc covers troubleshooting.

Change is inevitable. We’ve progressed from the classic GUI to the enhanced GUI. Now we’re advancing to new HMC hardware. I mean, who would have thought that we’d be spinning up virtual HMCs in VMware? Yet clients are increasingly making that choice. On that note, call me old school, but I still prefer to see at least one physical HMC in an environment. If you’re experiencing an unplanned outage or downtime and need HMC access in order to troubleshoot or power on frames and LPARs, you’ll have a pretty tough time getting to the vHMC if VMware is also having an issue, or if the Power Systems server hosting your vHMC is having issues. Sure, that’s an unlikely scenario, but going virtual is a choice that should only be made once you’ve seriously considered all options and weighed the risks versus the benefits.

The Importance of Keeping Current 

If you’ve already upgraded, good for you. But if you’re just now realizing that you need to take action, I’d encourage you first to get on board with the latest HMC hardware, and then to think even bigger.

I don’t view this as an isolated issue. Updates are critical. We must regularly update firmware, update and upgrade the VIO server, and upgrade the OS. When we fall behind on any of them, we run the risk of being in an unsupported state. Sometimes you’re so out of date you risk running very old firmware and VIO code alongside brand new HMC code, and some of these code combinations may be either unsupported or untested. Why take those risks? Why put your back against the wall? As shown with the client story I shared at the beginning of this article, when you only make changes when forced to, you often make things more difficult for yourself. It’s so much better to stick to a regular maintenance schedule to keep your systems current and avoid unnecessary technical debt.

Guru: For IBM i Newcomers, An Access Client Solutions Primer

Edit: This article is an example of how pictures are worth thousands of words

Originally published by IT Jungle March 1, 2021

I enjoy reading about IBM i Fresh Faces. Sure, it’s refreshing to see that it’s not just graybeards like me who are working on the platform, but what really matters is that young people are learning about, getting hands-on with, and coming to love IBM i. I appreciate that many of these stories revolve around system/application modernization and open source solutions. This gives newcomers to IBM i a degree of comfort by reducing their learning curve.

Learning any new technology is challenging. Specifically, how do you go from learning about the concepts to actually getting on a machine and discovering what it can do? Of course, this is compounded in the world of IBM Power Systems hardware, where system access is often tough to come by for students and hobbyists. But for now, I’d like to provide some guidance for those newcomers who have access but aren’t sure where to start.

Recently I took note of a Webex replay on IBM i Access Client Solutions (ACS). I assumed that interested readers would all have a basic understanding of ACS, but that wasn’t the case. Shortly after that article was posted, I was asked how to actually get started using the tool. So let’s back up a bit. If you’re new to IBM i, or ACS specifically, register at IBM.com. You’ll need an ID to be able to access certain educational and technical documents, like this download page for ACS.

Once you’re logged in, you’ll see this page:

Check “I agree,” then click “I confirm.” This brings you to the page where you can actually download the code.

I selected the IBM i Access Client Solutions file (the top of this list) and downloaded it. I unzipped the file and it installed on the Windows machine I was using.

I tried to run it, but I didn’t have Java installed. There are many ways to get Java; I followed this tutorial.

I installed the code and set up my Environment Variables as instructed.

I verified that Java was installed by running:

java -version

I got back:

openjdk version "11" 2018-09-25
OpenJDK Runtime Environment 18.9 (build 11+28)
OpenJDK 64-Bit Server VM 18.9 (build 11+28, mixed mode)
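If you ever need to script that check, the first line of the output is the part that matters. Here’s a sketch that pulls out the major version; the function name is hypothetical, and it assumes the version string is wrapped in plain double quotes. It also handles the older 1.8-style numbering by reporting 8.

```shell
# Read "java -version" output on stdin and print the major version number.
# Handles both "11" / "11.0.10" style and legacy "1.8.0_281" style strings.
java_major_version() {
    sed -n 's/^.* version "\([^"]*\)".*$/\1/p' | head -n 1 |
        awk -F. '{ if ($1 == "1") print $2; else print $1 }'
}

# java writes its version banner to stderr, so redirect it (not run here):
# java -version 2>&1 | java_major_version
```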

After that it was a simple matter of clicking on the ACS icon and starting the app.

Again, I’m assuming you have the IP address and userid/password information of a system you can log on to. Like this:

If you cannot read the screen shot above, it says:

“Welcome to IBM i Access Client Solutions
IBM i Access Client Solutions provides a platform independent interface which consolidates the most common tasks for using and managing your IBM i system. Additional information about each task is available by either moving the cursor over the task or by using the tab and arrow keys to navigate between groups and tasks. To select a task, click on the task or use the tab and arrow keys to navigate to a task and then press the enter key.”

To get started, add a system configuration for each IBM i system you want to use or manage. Do this by selecting System Configurations from the Management tasks.

As the instructions state, toward the bottom of the list there’s the “Management” section. The first item within that section is “System Configurations.”

Again, if you cannot read that text:

System Configurations provides an interface to create and manage system configurations for your IBM i systems. Use this task to create the system configuration for each IBM i system you plan to use or manage. System Configurations supports:

  • creating new system configurations
  • changing preferences (such as SSL or password prompting) for existing system configurations
  • adding a console configuration to an existing system configuration or locating a console for a new system

Use this task to create system configurations prior to using some other tasks.

I clicked New, entered my system name, and clicked on the connection tab. I instructed it to always prompt for a username and password. I also specified an IP address rather than have it look one up.

I returned to the general tab and verified the connection. This is what I saw:

Because I hadn’t installed ssh on the IBM i system, the ssh service failed to connect.

Back to ACS I went. In the top section, labeled “General,” I was able to click on 5250 emulator.

This brought up a green screen.

I entered my user ID and password.

This brought up another green screen where I could log into the system.

From here I went to Define or Change the System (number 7 in the screen shot above), then to Work with Licensed Programs (option 2, below).

I chose option 11 Install licensed programs.

Now I could search for the software I wanted. Again, more assumptions: at this point you need either to know how to install software, or have someone do it for you. You also need the appropriate install media. The ability to determine which software is on which DVD would also help. Googling “openssh IBM i” led me to this link.

I checked my system and found that I needed to install 5733SC1 (below).

To determine which DVD I needed, I searched for “ibm i media labels and contents 7.4,” which returned a doc labeled GI1199350.pdf. The “Media labels and their contents” document points you to the DVD with the software you need to install. Multiple tables in the doc help you determine which of the five DVDs you’ll need to load the software from. The one I wanted was B_groupx_04.

More assumptions: I’m assuming you know how to load a DVD on your system. Whether you need to physically load media into a DVD drive or you’re managing your system virtually, it’s important to know how you’ll install the code.

In my case I was using VIOS, so I needed to make sure the DVD was available in my virtual media repository. Then I needed to map it so that my IBM i client LPAR could use it. With the screen shots below, I’m assuming you’re familiar with HMC and the VIO interfaces. Obviously your mileage may vary.

I selected my VIO servers.

Once I was able to see the VIO servers, I clicked on the media repository view.

I chose VIOS2, the server with the media repository.

After I selected it, then I selected “Manage Virtual Storage.”

This brought up the list of media. I chose the DVD I needed.

After selecting it, I modified the partition assignment, linking it to the LPAR I needed.

Then I logged into my 5250 terminal. I returned to the “Install Licensed Programs” option.

I added it to the list of products.

I selected opt01, Programs and language objects.

Once it was installed, I was able to start sshd.

At this point, ssh was running.

My ACS setup was complete. Now I could choose either to continue using my 5250 emulator or try to open an ssh terminal. This option is located in the General section, second from the end of the list.

My system already had PuTTY loaded, so I was able to start my session with a simple click of “SSH Terminal.”

SSH Terminal launches an already installed SSH client (terminal emulator). With an SSH terminal, it’s easy to run IBM i commands, invoke things in the Portable Application Solutions Environment (PASE), and access open source tools and technologies (e.g. Python, Node.js, Git, etc.).

If you receive a connection error within the launched SSH client (e.g., “connection refused”) or if a window pops up but immediately vanishes, you may need to start the SSH daemon by running this CL command:

	STRTCPSVR *SSHD        (This requires the 5733-SC1 product)

I appreciate that ACS provides context-sensitive help. In this case it let me know which product I needed to install and how to start sshd once it was installed.

With this access, you can login and get started. In future articles I’ll share more how-to information for newcomers to IBM i.

Client Story, VIOS Tips, HMC News and Why You Should Move on From AIX 7.1

Edit: I really like articles that cover more than one topic

Originally published by TechChannel February 17, 2021

Rob McNelly explains why older AIX versions run slower, and why it’s time to move on from AIX 7.1

Upon checking the error log, a client noticed errors that were pointing to a fiber port on their system. While the port physically existed on the machine, a fiber cable wasn’t connecting that port to a switch. The client acknowledged the port could be needed at some point, but for now they didn’t want it to logically exist. And naturally, they wanted the errors to stop.

The solution is provided in this IBM Support document:

“Usually when there are unused ports on FC Adapters, it is possible to disable those ports… and stop cfgdev/cfgmgr from configuring the devices… [This] will stop all the error log messages.

The steps and information provided… are intended to disable ports that are not connected and are intended to be not connected… [If] the errors are logged against ports that should be connected or should be in the “Available” status, please troubleshoot those adapters accordingly.”

The doc covers four methods for disabling a port, depending on how the card is configured. If it’s allocated directly to an LPAR, run the procedure that’s described for the root user. Alternatively, you can run the procedure using smitty, or you can script it. Finally, if you’re in a VIO server, you can run the following as padmin:

            Remove the fcs# device and all child devices.
            > rmdev -dev fcs# -recursive -ucfg
            Set the fscsi# device to not autoconfigure.
            > chdev -dev fscsi# -attr autoconfig=defined
            fscsi# changed

            At a future date when you need to use the port, you can enable it:
            > chdev -dev fscsi# -attr autoconfig=available
            fscsi# changed
            Then run cfgdev to configure the devices.
            > cfgdev
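If you have several unused ports, you could generate the padmin commands from the procedure above and review them before pasting anything into the VIOS. This is only a sketch: the function name is mine, the port numbers are examples, and it assumes fscsiN is the child of fcsN, which you should confirm on your own system first. It prints the commands and runs nothing.

```shell
# Print (do not run) the padmin commands from the quoted procedure for one port.
# Assumes the fscsi device number matches the fcs adapter number.
fc_port_disable_cmds() {
    n=$1
    printf 'rmdev -dev fcs%s -recursive -ucfg\n' "$n"
    printf 'chdev -dev fscsi%s -attr autoconfig=defined\n' "$n"
}

# Example: ports 2 and 3 are hypothetical unused ports.
for port in 2 3; do
    fc_port_disable_cmds "$port"
done
```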

The Benefits of Baseboard Management Controllers

Baseboard management controllers (BMCs) are becoming more common in Power Systems environments. They can be used to manage OpenPOWER systems like the LC921, and they also work with newer POWER-based HMC appliances:

“IBM Power Systems servers use a… BMC and the Intelligent Platform Management Interface (IPMI) for system service management, monitoring, maintenance, and control. The BMC also provides access to the system event logs (SEL). The BMC is a specialized service processor that monitors the physical state of the system by using sensors. A system administrator or service representative can communicate with the BMC through an independent connection. The BMC uses IPMI and is contained on the system backplane. IPMI provides one communication method to the BMC, by using a command line interface. IPMItool can be used either from a remote Linux system, or from the host operating system console window. IPMItool remote connections to the BMC can be done by using either the serial connection to the BMC, or through a configured Ethernet port. The BMC provides a web interface, which provides a graphical user interface (GUI) that can be accessed from a management console or workstation that has network connectivity to the BMC. This connection requires an Ethernet port to be configured for use by the BMC.”

If you have a POWER-based HMC appliance in a lights-out data center, or if you’re not currently traveling to data centers, powering on/off with a BMC is a handy option.

Replacing a Physical Disk in a VIOS Environment 

Have you had a physical disk fail in configurations where VIOS is booting from internal disk? Here’s a good procedure to follow if your disk replacement skills are rusty:

“Question
How to free up a failing disk that’s part of a PowerVM Virtual I/O Server, mirrored rootvg in preparation to replace the disk. This applies to VIOS 3.1.

Cause
Mirrored VIOS rootvg disk is failing.
Note: to determine if the disk may need to be replaced, contact your local Hardware Support Representative.”

Scroll to the bottom of that page, and you’ll find a link that explains how to mirror rootvg again once you’ve replaced the disk.

These concepts may seem basic, but many administrators come from non-AIX backgrounds (including IBM i) and are unfamiliar with these sorts of tasks. For anyone who hasn’t set up VIOS and doesn’t know how to manage it, these types of documents can be very helpful.

Managing VIOS Backups 

Here’s something many of us have been looking forward to for a while: With HMC version 9.2.950, you can manage the I/O configuration of VIOS and back up VIOS images directly from the HMC. This could eliminate the need to have a NIM server to manage VIOS backup and restore operations, although, just as some people prefer belts and suspenders, you may be more comfortable doing both until your confidence grows with this new option.
 
Here are the steps involved:
 
1. In the navigation area, click the HMC Management icon, and then select Templates and OS Images.
 
2. From the Templates and OS Images window, select the VIOS Images tab, and then click Manage Virtual I/O Server Backups.
 
3. In the Manage Virtual I/O Server Backups window, select the Virtual I/O Server Configuration Backup tab. A table is displayed that lists all the VIOS configuration backup files that have been taken by the HMC. Additionally, you can view the time at which each configuration file was last edited.
   a) To take a backup of the input/output configuration of a VIOS, click Backup I/O configuration. In the Backup I/O configuration window, select the managed system and the VIOS for which the backup is created, and then specify a name for the backup file. The name you specify must consist of 1-40 characters, including the file extension .tar.gz. You can use the characters A-Z and a-z, the numbers 0-9, and the dot (.), dash (-) and underscore (_) characters.
   b) To rename an existing backup file that is stored in the HMC, select a configuration file from the table and click Action > Rename.
   c) To restore the VIOS input/output configuration, select a backup file which contains the I/O configuration of the VIOS that you want to restore, and click Action > Restore.
 
4. In the Manage Virtual I/O Server Backups window, click the Virtual I/O Server Backup tab. A table is displayed that lists all the VIOS image backups that have been taken in the HMC. Additionally, you can view the name and size of the VIOS image, the time when the VIOS image file was last edited, and the managed system and the VIOS from which the image was captured.
   a) To take a backup of the VIOS image, click Create Backup. In the Create Backup window, select the managed system and the VIOS for which the backup is created, and then specify a name for the backup file. The name you specify must consist of 1-40 characters, including the file extension .tar. You can use the characters A-Z and a-z, the numbers 0-9, and the dot (.), dash (-) and underscore (_) characters.
   b) To rename an existing VIOS image backup file that is stored in the HMC, select a backup file from the table and click Action > Rename.
   c) To remove a VIOS image backup file from the HMC, select the backup file that you want to remove from the table, and click Action > Remove.
 
5. Click OK.
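
The naming rules in steps 3a and 4a are easy to trip over, so it can help to check a name before typing it into the HMC. Here is a minimal sketch of that check as a shell function. The function name is hypothetical, and it only encodes the rules quoted above (1-40 characters including the extension, and the allowed character set); the HMC may apply additional checks that I don’t know about.

```shell
# valid_backup_name NAME EXT
# Returns 0 if NAME follows the rules quoted above, 1 otherwise.
# EXT is .tar.gz for an I/O configuration backup, .tar for a VIOS image backup.
valid_backup_name() {
    name=$1 ext=$2
    # Length: 1-40 characters, extension included
    [ "${#name}" -ge 1 ] && [ "${#name}" -le 40 ] || return 1
    case "$name" in
        *[!A-Za-z0-9._-]*) return 1 ;;   # character outside the allowed set
        *"$ext") return 0 ;;             # allowed characters, expected extension
        *) return 1 ;;                   # wrong or missing extension
    esac
}
```

For example, valid_backup_name vios1_io.tar.gz .tar.gz succeeds, while a name containing a space or ending in the wrong extension fails.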
 

x86 HMC Reaches End of Service

If you haven’t heard, it’s time to upgrade your x86 HMC:
 
“HMC V9 R1 is the last release to support the 7042 machine type. HMC V9R2 will support the 7063 machine type and Virtual HMC Appliances (x86/ppc64le) only.
 
Note: iFixes and Service packs for V9 R1 will be supported on 7042 machine types until EoS of V9 R1.”
 

A Quick Case for Upgrading to AIX 7.2 

In his AIXpert blog, Nigel Griffiths explains why AIX 7.1 users should upgrade to AIX 7.2:
 
“No sensible technical person would be running anything older than AIX 7.1 at this point in time, due to:

  • Lack of support or support comes at a high price via Service Extensions
  • Lack of security updates
  • Missing advanced functions and advanced features that are only found in later AIX releases, particularly AIX 7.2
  • Not making full use of POWER8 and POWER9 servers. For example, AIX 6.1 can only do SMT=1,2 and 4 (no SMT=8!).
  • Missing years of development into removing serialization on locks and latches, adding parallel execution, shortening path lengths (less CPU cycles used to get work done), better performance tuning options, better out-of-box performance settings, and field hardening like trace and diagnostics.” 

Putting these all together: older AIX versions run slower.

Why AIX Admins Should Know vi and ksh

Edit: these are still good skills to have

Originally published by TechChannel February 1, 2021

We may bash one another with our opinions, but every AIX admin needs to know vi and ksh

People are pretty passionate about the tech that they use. That’s hardly a news flash. It can be argued that none of us is truly objective, but I’ve been through enough heated discussions over time to understand that strong feelings exist about computing.

So we may as well cop to our biases. I’ll start. I freely admit to being an AIX bigot, and I know that colors how I see the world.

What about you? Do you prefer Windows or Linux or MacOS on your desktop? But is preference even the most important thing? If disparate desktop environments are being used, the simple and straightforward can become more complicated. I would say that MacOS and Windows run into these issues less than Linux, but your mileage may vary. Sure, workarounds are doable, but that eats up time that could be spent collaborating.

So would it be better that everyone use the same popular OS to maintain a frictionless environment? If I can get Zoom or Webex or Microsoft Teams running on a Windows machine, it’s a safe bet you can get it to work on your Windows machine. That means we can quickly get on a shared session and work together.

This sort of thing is hardly new. I worked at IBM in the early 1990s, when the company was exhorting us to eat our own dog food. That meant using Lotus Notes for our mail environment, along with Lotus Office Suite for word processing. As a result, the process of exchanging incompatible files with customers was typically challenging. I ended up requesting permission to run Microsoft Office products. It was the only way to resolve our issues, and once we were on the same software, our problems went away.

Of course the counterargument is: shouldn’t the quality of the product matter most? What about viruses and malware? That code is often being written to infect the masses that run Windows. The argument for running Linux or MacOS is they give you a better chance of avoiding those issues.

This extends to the enterprise. We hear about ransomware encrypting and locking Windows servers, primarily. I’ve not heard of similar things happening with AIX or IBM i running on POWER.

Of course we can drill way down here. Even among AIX users, strong preferences exist over which tools and commands to deploy and run in any given situation. For instance, what’s your favorite editor? Are you in the vi camp? Emacs? Do you prefer something with a GUI, or something like Pico? Just typing this out, it seems like such a minor consideration. And yet, the opinions are many, and the feelings are real.

What about your favorite shell?

A Unix shell is a command-line interpreter or shell that provides a command line user interface for Unix-like operating systems. The shell is both an interactive command language and a scripting language, and is used by the operating system to control the execution of the system using shell scripts.

Most experienced Linux users are pretty comfortable with the bash shell. But do you know you can install that shell on AIX? Once it’s installed, you can use the chsh command to change your shell to bash. But be careful:

AIX default shell is the Korn shell. There are quite some other shells one can use with AIX. However IMHO anything but ksh or ksh93 in AIX is about 99.xx% compatible only. The missing .xx% can cause you some headaches when you rather would concentrate on some other os related problem. Therefore, as an AIX administrator I’d always recommend to stick to ksh.

On that note, is there anything more frustrating than troubleshooting a problem and then discovering it was self-inflicted? Changing your root shell can lead to problems:

Question: Can I change root or another system account user’s default shell from ksh to ksh93, or bash?

Answer: The Korn shell (/usr/bin/ksh) is set up as the default shell. The default or standard shell refers to the shells linked to and started with the /usr/bin/sh command.

The AIX Operating System is tested with SHELL=ksh in all system account script processes, unless otherwise defined in the script. Using chsh or otherwise modifying system user accounts in /etc/passwd to change a system account shell to non-ksh could result in script failures at run time. Users are advised to test any default shell changes carefully in their environment.

Note: The bash shell is an open source product, and is not supported by AIX Support cases.

IMPORTANT: Do not replace /usr/bin/sh or /usr/bin/ksh with other binaries!

  • The system is in 32-bit mode in phase 1.
  • The kernel cannot load a 64-bit binary until phase 2 of the boot.
  • There are shell scripts that run during phase 1 of a system boot.

# file /usr/bin/ksh
-> executable (RISC System/6000) or object module
# file /usr/bin/ksh93
-> 64-bit XCOFF executable or object module not stripped
# file /usr/bin/bash
-> 64-bit XCOFF executable or object module

To reiterate: Do not change root’s default shell!
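
If you inherit a system and want to know whether someone has already changed a system account’s shell, a quick review of the passwd file will tell you. The sketch below is my own, with a made-up function name. It reads passwd-format lines and prints any account at or below a UID cutoff whose shell field is set to something other than ksh. The cutoff for what counts as a “system account” is an assumption you pass in, and some stock service accounts may legitimately run a different program, so treat the output as a list to review rather than a list of errors.

```shell
# non_ksh_accounts MAXUID
# Reads passwd-format lines on stdin and prints "user shell" for accounts
# with UID <= MAXUID whose shell is set to something other than ksh.
# An empty shell field is skipped because it falls back to the system default.
# Assumption: MAXUID is a site-specific cutoff for "system accounts".
non_ksh_accounts() {
    awk -F: -v max="$1" '
        $3 <= max && $7 != "" && $7 != "/usr/bin/ksh" && $7 != "/bin/ksh" {
            print $1, $7
        }'
}
```

A typical run would be non_ksh_accounts 199 < /etc/passwd, where 199 is just an example cutoff. If root shows up in the output, you know what to fix first.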

I get it, you really like bash. You really don’t like ksh. Tough. You’re a professional. Learn how to navigate with ksh. Get so proficient at it that it becomes second nature. You need to master it. If you’re new to AIX, there are resources. This old book remains an excellent Korn primer. Here’s another older book. Same deal.

As I’ve said often, get on a system and get hands-on. Trial and error—while always trying to accomplish something, day in and day out—is the best way to learn in my experience. The time to learn is not when the system is down; that’s when you need to know. When a system is down and maintenance must be performed, an expert is needed: you. This is why you must be well-versed with the system’s built-in tools. If you cannot already run vi and edit a file without thinking about it, practice more. If you cannot log into ksh and navigate your shell history and edit commands on the fly, practice more.

Arguments can be made, sides can be taken, but my position is that knowledge of vi and ksh represents the bare minimum for being able to maintain an AIX system.

Sometimes Folks Only Learn The Hard Way About Keeping Software Current

Edit: My third post for IT Jungle

Originally posted February 1, 2021

School’s been out for me for a very long time, but I still enjoy learning. I gain a sense of satisfaction whenever I learn something new. Specific to technology, exposure to new concepts helps me understand how things work together. I cannot count the number of times where I watched over someone’s shoulder, or watched someone on a shared screen, to learn about a new tool or technique, or a different way to set up my desktop or environment.

Watching and listening to people is my preferred way to learn, but other forms of education – reading IBM Redbooks and other documents, or articles, or watching webinar replays – are also worthwhile. Pick a topic in the Power Systems ecosystem: the pros and cons of a virtual HMC, how many physical adapters you can fit into a given machine, how to transfer files or back up machines, CL or RPG programming techniques, updates to the open source environment on IBM i, or overall general best practices. I love to soak it all up.

Digging into new topics – even if it’s only an inch deep – is especially important for IT pros in smaller workplaces. At smaller companies, fewer people wear more hats, and they’re typically asked to do more.

Even if I don’t use this information immediately, even if I never use it, I still value the experience of learning. I like to know what’s possible. To me it’s worth the time to get exposed to the concepts, and the less familiar I am with something, the more motivated I am to read about it. You never know when some tidbit of information that you’ve absorbed will come in handy. Having even the slightest introduction to a topic makes it much easier to conduct a search or ask a question later on.

Scott Berkun offered a similar perspective on Twitter recently: “If you’re experienced in your job, a great way to grow is to study something else. Go read a book about or go to a conference relating to something you know little about. You’ll ask big questions. You’ll learn new models and thoughts. You’ll return to your work with fresh eyes.”

In that vein, many of you may be unfamiliar with the inner workings of the PowerVM Virtual I/O Server, a.k.a. VIOS. It’s worth learning more about this topic though, because action needs to be taken in virtualized environments.

VIOS allows you to virtualize Power Systems servers. While I see a great deal of it in AIX and Linux on Power environments, it also supports a growing number of IBM i workloads. If you’re a VIOS user, you should know that, as of October 2020, VIOS 2.x is no longer supported by IBM. And you should understand why it’s important to move to VIOS 3.1.x as soon as possible.

The withdrawal of support for VIOS 2.x doesn’t mean that you’re completely out of options for getting support. As is the case with most withdrawn offerings, IBM will still provide extended support. But paying the premium isn’t the only issue, or even the primary issue.

Also, this isn’t a boilerplate plea to move to the latest and greatest. You should move because VIOS 3.x is fundamentally different from VIOS 2.x. The updated version is designed to function more efficiently with newer Power Systems hardware (Power8 and later). VIOS 3.x runs AIX 7.2 under the covers, whereas VIOS 2.x runs AIX 6.1. If you’re not well-versed in the AIX operating system numbering scheme, AIX 6.1 arrived in the 2007 timeframe; 6.1 TL9 went end of support in April 2017. So under the covers, those versions of VIOS were getting long in the tooth. The new VIOS code, as noted, is better able to exploit the improvements in the Power hardware (think of being able to dispatch to more threads, etc.). In addition, IBM removed VIOS legacy code (does IBM Systems Director ring a bell?), resulting in overall performance improvements with VIOS 3.x.

As a refresher, storage that’s been allocated to a VIO server can be connected to a VIO client by using either NPIV or vSCSI. Each option has advantages and disadvantages, but these days I see NPIV more often than not.

The following description of a separation of duties is mostly applicable to larger shops, where different teams have different roles, versus a smaller shop where one guy may do it all. With NPIV, the SAN guys are zoning LUNs directly to the client LPAR, whereas with vSCSI they’re zoning LUNs to the VIO server itself. It’s an extra step for the Power Systems administrator to then map the allocated LUN to the client IBM i LPAR. SAN guys generally prefer the added visibility into which LPARs are using the actual LUNs. Rather than see a LUN disappear when it gets mapped to a black box (the VIO server), I find that they tend to prefer NPIV. In a smaller shop, since it is the same guy doing the zoning and the mapping, it may be less of an issue.

With the code changes that have occurred, it’s important to recognize that the process of upgrading to VIOS 3.x differs from what you’re accustomed to. You’re not simply putting on a fixpack or service pack; you’re doing an under-the-covers OS upgrade from AIX 6.1 to AIX 7.2. Upgrading is still fairly straightforward, but it does require some planning and preparation, so approach it carefully.

The good news is that most environments set up VIO servers in a dual VIOS configuration, which provides multiple paths to storage and the network and allows for maintenance activities to occur without affecting running client LPARs. The idea is that you can upgrade VIO server 2, test it, and then fail everything over to that second server and then conduct maintenance on VIO server 1. Of course, you should subject your VIO failover process to regular testing. As with any high availability solution, or even system backups, if you do not test, you cannot be sure that things will work correctly when you need them to.

IBM has tools and methods to help you perform the upgrade, but the basic idea is you’re backing up your configuration, doing a new VIOS install, and then restoring the configuration information to your fresh VIOS copy. A friend has been performing quite a few of these upgrades lately. Although IBM has a tool you can use to back up your data, he prefers to gather the configuration data himself.

In any event, he recounted a unique experience. He’d been told that, in this particular environment, all of the IBM i clients were connected via NPIV to their storage. After gathering the necessary data and doing the fresh install of the VIOS code, he went ahead and restored the network and the NPIV configurations. He then failed over the clients to his newly built VIO server and began work on the other one. Most of the IBM i clients were just fine, though a few started spitting out LED codes, indicating that they had lost access to their disks. It turns out that some vSCSI disk connections remained in the environment after all. As noted, he’d taken the word of others regarding how the disk was connected rather than verify the connections himself.
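
The takeaway is to check the mappings yourself before you rebuild anything. On the VIOS, lsmap -all lists the vSCSI mappings and lsmap -all -npiv lists the NPIV ones. However you collect that information, boiling it down to a simple per-client list makes surprises obvious. The sketch below assumes a hypothetical two-column inventory, one mapping per line with the client name and the word vscsi or npiv, that you would build from your own review of those listings. The function name and the file format are my inventions for illustration.

```shell
# clients_with_vscsi
# Input (stdin): one mapping per line in the form "<client> <type>",
# where <type> is vscsi or npiv. This format is hypothetical.
# Output: the unique client names that still have at least one vSCSI mapping.
clients_with_vscsi() {
    awk '$2 == "vscsi" { print $1 }' | sort -u
}
```

If that function prints anything in an environment that is supposed to be NPIV only, stop and restore those vSCSI mappings as part of the rebuild plan.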

What would you expect to happen if you pull all the paths to the disks out from under a running system? I’d certainly expect them to crash. However, once the problem was understood (this took about an hour), both VIO servers had their vSCSI mappings put back in place, and the LPARs just continued from where they left off. They didn’t crash, they didn’t reboot; they just continued running as if nothing had happened. The customer was able to log in and verify the system looked and behaved as they expected it to.

Maybe that speaks to the resilience of IBM i, or maybe my friend got lucky. The maintenance was performed during a quiet time of the day, so the systems weren’t being taxed in that moment. Regardless, I was impressed.

But don’t take the wrong lesson from this story. Yanking disk paths away from running production systems willy-nilly is never a good thing. However, I would like to try and reproduce this behavior in the lab. When discussing this result with others, I’ve heard skepticism. There must be another reason why that environment didn’t crash. One thought is that there must have been a connection to the load source disk via NPIV.

I intend to get to the bottom of it. I look at it as another learning opportunity. Formal school may be out for most of us, but we’re still learning.

Rob McNelly is a senior Power Systems solutions architect doing pre-sales and post-sales support for Meridian IT, headquartered in Deerfield, Illinois. McNelly was a technical editor for IBM Systems Magazine, and a former administrator within IBM’s Integrated Technology Delivery and Server Operations division. Prior to working for IBM, McNelly was an OS/400 and IBM i operator for many years for multiple companies. McNelly was named an IBM Champion for Power Systems in 2011, an IBM Champion Lifetime Achievement recipient in 2019, and can be reached at rob.mcnelly@gmail.com.

The Ultimate Set Of Tools

Edit: My second post for IT Jungle

Originally posted December 14, 2020

Those of a certain age will certainly remember the moment in Fast Times At Ridgemont High when Jeff Spicoli got himself into a bind after he wrecked his friend’s car. Luckily, Spicoli’s dad was a TV repairman, and he had an ultimate set of tools. Spicoli knew that with those tools, he could fix it. That level of confidence is intoxicating, although in this case it was possibly misplaced.

I have a friend who lives around the corner from me who actually does have the ultimate set of tools, and knows how to use them. For example, he recently rebuilt his Jeep from the frame up. The other day my son, a high school senior, reported to me that there was a sound coming from one of the front wheels of his car. I took it for a drive and the metal-on-metal grinding sounded expensive. My friend offered to take a look, quickly diagnosed the problem, and went with us to the auto parts store and back to his garage, where he replaced the worn-out brake pads that were squealing. When you combine tools with experience, people can accomplish a great deal.

You have to take care of the tools that you have. Part of what made my friend so effective was that he kept his tools clean, he kept his garage organized, he put his tools away when he was done using them, he knew where his tools were, he practiced with them over time, and he knew how to use them. Although our tools may be digital in nature, the same principles apply.

There’s a relevant story from Stephen Covey’s The 7 Habits of Highly Effective People:

The lumberjack was trying to cut down a tree with a saw and was swearing and cursing as he labored in vain.

“What’s the problem?” the man asked.

“My saw’s blunt and won’t cut the tree properly,” the lumberjack responded.

“Why don’t you just sharpen it?”

“Because then I would have to stop sawing,” said the lumberjack.

“But if you sharpened your saw, you could cut more efficiently and effectively than before.”

“But I don’t have time to stop!” the lumberjack retorted, getting more frustrated.

The man shook his head and kept on walking, leaving the lumberjack to his pointless frustration.

As an IT consultant, ongoing education and training is the equivalent of sharpening my saw and keeping my tools organized. Over the years I’ve always made time to attend conferences. These days I take part in as many free online education sessions as I can. I try to listen to them when they’re live, but if I have a conflict when the session takes place, I download the slides and watch the session replays when I get time. Invariably, I’ll learn something new.

In our world, you may think you know it all – or at least you believe you know everything you need to know to do your job. But of course technology doesn’t stop: The systems and solutions we work with are always being updated. Then there’s the more mundane reality that sometimes you’ll pick up a tip for doing a particular task, and it might be months or years before you get a chance to put what you learned into practice. And guess what? You’ve forgotten. I certainly forget things I’ve learned on occasion. The point is relearning is also a part of learning. Reminders never hurt.

There are a number of excellent educational resources out there. But if you’re looking for a single resource that provides current technical information for Power Systems users, check into the Power VUG Technical Webinar Series:

Power VUG (Power Virtual User Group) is a monthly technical webinar series for IBM Power Systems. The webinars are informal and have a focus on how-to, how-it-works, best practice, and hints and tips. Many include live demonstrations. They are relevant to AIX/IBM i/Linux on Power Systems. The series is aimed at a technical audience – operators, systems administrators, and technical specialists – those using/planning to use IBM’s Power Servers. It is open to Clients/Business Partners/IBMers.

They’ve done more than 100 sessions since 2011, covering topics like VIOS, IBM i suspend/resume and Live Partition Mobility, performance, monitoring, configuration best practices, IBM i mobile access, and Ansible on IBM i, to name just a few. Power VUG webinars aren’t simply people reading off of slide decks. Many sessions include live demos, and most have Q&A sessions. There’s nothing like actually watching the tools in action to get a good feel for how they work.

As an example, I’ll highlight a session presented on November 11 by Tim Rowe, Business Architect of Application Development for IBM i. I encourage you to download Rowe’s slide deck and watch the replay.

Slide 2 is a roadmap of the information Rowe covers: IBM i Access Client Solutions (ACS) v 1.1.8.6, IBM Navigator for i Performance Data Investigator, the Digital Certificate Manager GUI, IBM i services powered by SQL, Administration Runtime Expert, Nagios, and open source plugins. There’s a lot here.

The session begins by defining access management. Who is accessing the system? There are tools and access for everyday users, while those who manage the system have a different, more powerful set of tools. A DBA will require yet another set of tools to make sure the database is running as expected.

In today’s ecosystem we need to be sure the tools run anywhere. Mac, Linux, Windows, and various mobile devices all need to work and be supported, and with the new software they are.

IBM i Access for Windows went end of life in April 2019; it has been replaced by ACS. There are two ways to get the package: either download it from here, or if you’re running IBM i 7.4 SF99662 or IBM i 7.3 SF99722, get it directly from your machine by pointing to your IFS at /QIBM/proddata/Access/ACS/Base.

Once you apply the PTFs, visit this location regularly to ensure you’re keeping current with ACS. Rowe also shows how, with version 1.1.8.6 of the software, you can modify ACS properties to ensure that your searches for updates are targeted to your local system. It’s important to keep ACS current. Customer input is important to the ongoing development of this product.

There’s now the option to select and upload multiple files to your IFS, along with the ability to upload an entire directory with its contents. In the past only a single file could be uploaded at a time. IBM has improved both the performance and the actual interface, and has provided a better capability to filter files in different ways. Other additions include integrated file unzip support and the capability to view ILE source members or EBCDIC files as UTF-8 text files. Finally, there’s the capability to run SQL scripts. Watch the replay and you’ll see examples of how this is helpful.

IBM i developers are making greater use of SQL. Scott Forstie, senior software engineer at IBM, has a GitHub page which contains 67 working examples. And COMMON has a video series on SQL. Also check out SQL Tutor.

A neat part of the demo involves content assist. You’re prompted for possible values that will make sense in the context of the command that you’re creating. Rather than needing to look up commands or syntax, you can click your mouse to see which options are available. There’s a context-sensitive interface that can help with CL and SQL; it’s also useful for configuring commands and options for the commands. The demo shows various ways you can build these commands, along with using the formatter tool. Click on the “insert from examples” button; if you don’t write SQL, you can use built-in examples and learn from them. The examples help answer questions about the system, like which files are owned by users, IFS growth, etc. This section in itself is a great reason to watch the replay.

IBM has also changed the ACS interface. The preferences and property settings are now in the same place, which makes it easier to keep track of the different options you might want to set instead of searching for options in multiple places.

Under tools, there’s an open source package manager. From here you can manage the rpm packages on the IBM i system. In the past, if your machine didn’t have direct access to the repository, you had to download all the packages before you could use any. Now, if your IBM i machine is isolated from the internet, you can use your workstation in conjunction with an SSH tunnel to automatically download from an rpm repository on the internet and get that copied to your IBM i system.

Moving away from ACS, there is a discussion around IBM Navigator for i. There’s a performance data investigator (PDI) that you can use with collection services, disk watcher, job watcher, database, etc. You can view the data in new graphical interfaces, and this new functionality makes it an important feature. Tons of system information is available for analysis. If you want to know how a specific chart was generated, you can click on the show SQL button to see what was run and what data was gathered. Don’t forget: Navigator is a free solution that comes with the system. Again, watch the demo to see it in action.

A new digital certificate manager, available for IBM i 7.3 and 7.4, makes it much easier to manage your certificates. You can manage local certificates and look at all of the certificates that are on your local certificate store. You can get information, search, look for expired certificates, etc. There’s also an SQL service. While the older tool you used for your digital certificates remains available for now, expect it to eventually go away.

Administration Runtime Expert (ARE) is a tool for comparing central systems to endpoints. You can, for example, move PTFs around your environment using this powerful and free tool. It was stated during the session that it is deserving of its own session to cover all it can do.

Toward the end, you’ll learn more about monitoring your environment using Nagios. There are lists of supported elements like CPU utilization, disk status, job information from a subsystem, message checks, etc. It’s also possible to use SQL plugins to define your own rules; your only limitation as to what you monitor is your imagination.
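
If you’ve never looked inside a Nagios check, the core idea is simple: compare a value against warning and critical thresholds, print a one-line status, and return an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The sketch below is a generic illustration of that convention, not the IBM i plugin from the session. The function name is hypothetical, and where the value comes from (an SQL query, a command, a file) is up to you.

```shell
# check_threshold VALUE WARN CRIT
# Prints a status line and returns the matching Nagios plugin exit code:
# 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
check_threshold() {
    # All three arguments must be whole numbers
    for v in "$1" "$2" "$3"; do
        case "$v" in
            ''|*[!0-9]*) echo "UNKNOWN - expected three whole numbers"; return 3 ;;
        esac
    done
    if [ "$1" -ge "$3" ]; then
        echo "CRITICAL - value $1"; return 2
    elif [ "$1" -ge "$2" ]; then
        echo "WARNING - value $1"; return 1
    fi
    echo "OK - value $1"; return 0
}
```

With thresholds of 80 and 90, a value of 85 reports WARNING and a value of 95 reports CRITICAL. The monitoring server acts on the exit code, while the printed line is what a person sees in the console.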

Why did I go into such detail on one webinar? Because I’ve written enough articles over the years to know many readers don’t invest the time to click on the links and listen to the replay. I wanted to try to whet your appetite. If you’re interested in learning more about the capabilities of IBM i and Power Systems hardware, Power VUG webinars are a great way to sharpen your saw. Please click that link and take some time to watch and learn. Then you too will be able to add to your ultimate set of tools.

Rob McNelly is a senior Power Systems solutions architect doing pre-sales and post-sales support for Meridian IT, headquartered in Deerfield, Illinois. McNelly was a technical editor for IBM Systems Magazine, and a former administrator within IBM’s Integrated Technology Delivery and Server Operations division. Prior to working for IBM, McNelly was an OS/400 and IBM i operator for many years for multiple companies. McNelly was named an IBM Champion for Power Systems in 2011, an IBM Champion Lifetime Achievement recipient in 2019, and can be reached at rob.mcnelly@gmail.com.

You Can’t Just Keep Doing What You Have Always Done

Edit: My first post for IT Jungle

Originally posted November 30, 2020

There are philosophical differences that exist between people regarding the best ways to configure, manage, and maintain infrastructure, and this holds as true for infrastructure built around IBM i as it does for other kinds of platforms. The easiest and fastest thing to do – and what makes the most sense according to plenty of people – is to do what they have always done. Just keep replicating the past out into the future to infinity.

Others point out that this mindset stifles progress and ignores all of the innovations that have come to the IBM i platform through the years. How can you even consider not making the full use of your computer’s capabilities? Why are you still running your system the way you did in 2010, or in 2000 or even 1990?

A recent tweet by Larry Bolhuis sums up this way of thinking nicely, if a little hyperbolically:

“When you update your @IBMPowerSystems are you truly updating or just replacing one generation of CPU w/ the next and adding a dusting more disk and memory? If your business partner didn’t discuss PowerVM, SAN Storage, and VTLs before you bought, get a new partner.”

To expand on that, why would you ignore the power and flexibility to be gained by connecting to a SAN? Among many other things, you give up SAN-based replication, snapshots and the ability to perform Live Partition Mobility when you refuse to move forward and think differently. Why purchase internal disk or an SSD that’s only available to one system in your environment when you could share in a pool of disk that’s used by other machines?

IBM resellers and business partners should strive to educate customers, and increase their own awareness. They should point out the advantages of, for instance, connecting to a SAN. Of course to assume that all IBM i installations even have SANs would be a mistake. Sure, they are present in most large environments, but they may still be the exception in smaller shops. One size does not fit all, and real-world circumstances must be recognized and acknowledged. Still, sellers should be sharing this information with you. Small shops may still benefit from connecting to a smaller IBM SAN, for instance. What’s important is that these discussions occur and customers learn of new options.

It’s a fine line, of course. Some choose to stay put. They’re satisfied, in many cases thrilled, with what they have. The system just runs. There have been no outages, and what more can you ask for from an investment in hardware? Everything is simple. Troubleshooting is straightforward; diagnoses are easy. The focus is the machine and its components, and possibly the network connection. Personally, whether it’s in my own environment or through my consulting with clients, I always get a charge out of working with new technology and getting the most functionality out of it that I can. But I can see the elegance in having a reliable standalone system, and in that way of thinking.

When it’s time for a hardware refresh, the request in this scenario is simple, something along the lines of: “Please give me the current equivalent of what I’m currently running.” Maybe a little more disk is needed, or a faster, higher-capacity tape drive. With the speed of today’s processors even a factory de-configuration of some cores is a consideration; if the workload in question is small enough, maximum horsepower may not be needed. So the order is placed and the upgrade is performed. You are content, knowing that this new configuration will last for years, or at least until this hardware reaches end of life and you’re compelled to do another upgrade.

I was recently part of an interesting back and forth with Bolhuis and others about these sorts of scenarios and choices. It was noted that customers are not always educated about what they can do with their Power hardware. They may not fully appreciate the benefits of virtualization, or running things differently than they always have:

“You know your @IBMPowerSystems can run more than just #IBMi or just #AIX! Use your next upgrade to enable deployment of both, or Linux, or all three. Don’t let yourself be boxed in by old single system thinking. Be sure your BP knows the options for virtualization!”

Especially for those of us who think about this stuff for a living, it’s natural to assume that everyone else is just as engaged. But not everyone attends virtual briefings or consumes every detail of IBM’s announcements. They have better things to worry about. (That is why you have The Four Hundred doing this for you, in fact.)

Replacing existing hardware with similar, more current hardware may make perfect sense, but remember it isn’t the only way. Business partners should be partners. They need to really listen to you, strive to understand your needs, answer questions and keep you informed. They should help you solve your business problems. And people like me aren’t serving our clients and customers if we don’t suggest alternatives. With some slight modifications, a traditional standalone system can host multiple LPARs, including Linux on Power workloads, and production and development LPARs, for example.

A position we do not want to be in is to get a request from management or from system users that we cannot fulfill. They may come to us and say that they understand that in theory they can add an LPAR to an existing Power System. Or they may tell you that they are under the impression that they can boot the system from a SAN. They may approach you and tell you that they understand that they can do X, Y, or Z.

The answer may very well be: “Yes you can, in theory, but since we did not discuss this, you’re missing an HMC, and you’re missing these other components that will enable that functionality. So until we remedy the situation, the answer is going to be no.”

None of this is a revelation; these capabilities have existed for years. For you the decision may not lie in knowing that you can do it, but in the incremental costs involved, or the skills available, or the thinking that if it isn’t broke don’t fix it, or as stated before the elegance of the system running as it is.

Some may view IBM i as a legacy environment and seek an excuse to discard it for something new, shiny and sexy. Education needs to occur at the management and executive levels as well. Power should be a strategic platform to your business, and wherever and however possible, it should be exploited to its full potential.

I think we can agree that some customers, and maybe even some sellers in the IBM i world, aren’t up to speed, or simply aren’t all that interested in learning about everything that the platform can do. That’s unfortunate, but that’s part of my takeaway from the discussion with Bolhuis.

Blame can go in all directions on this. Should IBM i operators and admins make more of an effort to seek out the information? Should IBM make information easier to find? Should employers be providing their employees with the training and classes and conferences they could benefit from? Should sellers be more active in educating themselves, and having conversations and whiteboard sessions and lunch and learns with customers?

It could simply be a product of the reality we all live in. Everyone is doing more, everyone is busy. Day to day it is easy to let learning take a back seat to current system issues. And that doesn’t even begin to get into the unprecedented chaos that has been 2020. With our day to day jobs, with our lives, it’s tough to balance everything. It all takes time, and there are only so many hours in a day.

Ultimately though, a lack of education or knowledge or experience can be remedied. The discussions are good – they’re imperative, really. Even internal discussions can be beneficial. Maybe ask yourself: How would life change if you went from a single production LPAR to having access to multiple instances of IBM i? Yes, there are costs to consider: licensing, additional cores, additional memory, additional disk, etc. Some of you will look at the costs and dismiss this proposition out of hand. But sometimes these decisions are based more on valuing the way things have always been done and fearing the unknown.

Getting a test machine is a good start when it comes to trying new things. No one wants to mess with production, so a test box or partition is the best way to try something new or get better at what you do. Implementing a dedicated environment where you can change code and test backups, upgrades, installs, and so forth and never have to worry about affecting the business is really a no-brainer decision.

So let’s initiate these conversations. What would make your life easier? Where would you benefit from taking a slightly different approach? If you could change anything, what would you do differently? Needing to make no changes at all is also an acceptable answer, provided you have done your due diligence and spent a little bit of time considering all of your available options.

Some Ideas for Expanding the Power Systems/AIX Universe

Edit: The first post for TechChannel in 2021.

Originally posted January 8, 2021

Moving forward, Rob McNelly will be writing regularly on AIX, IBM Power Systems hardware and related topics for TechChannel

“Hey, I heard you missed us, we’re back” – Van Halen, 1984

It’s a new year, and many changes are on the horizon. My AIXchange blog will remain on the website, and all of my previous work, going back to the early 2000s, is archived on robmcnelly.com—but moving forward, I’ll be writing regularly on AIX, IBM Power Systems hardware and related topics for TechChannel, producing new articles and passing on what I know. I’m also on Twitter (@robmcnelly).

In 2019 I received the IBM Power Champion Lifetime Achievement award. You may not realize it, but as part of that elite group, we lifetime achievers have massive pull. IBM doesn’t do anything without consulting us. So I recently drew up a list of things IBM should explore to grow the POWER community and skills in general.

Disclaimer: I have no pull and no one asked me to do this. Of course being an IBM Champion is a great honor, but I was just joking. What has actually happened is, based on many discussions with lots of smart people inside and outside of IBM, I’ve formed some ideas. Certainly some of these items may be more practical or feasible than others, and there may be perfectly valid reasons why certain things will never happen. But if the items on this list become reality, I believe our community would benefit. Feel free to circulate this information, and share your thoughts with me on this or any other topics you’d like to see me explore.

1) IBM should explicitly state that AIX can be used for non-commercial personal use. The reality is that we already have access to AIX in our everyday jobs, so why not be clear about what can be done with it? With or without official word, I imagine IBM lawyers have better things to do than go after people who play with AIX in their basements. Still, it’d be nice to see that written down—and if it already is, it should be more prominently stated.

I can take this further: Why not find cost-effective ways for hobbyists or students to access older POWER7 or POWER8 hardware as it comes off lease or reaches end of life? And what about a low-cost SAN option to connect everything? Throw in a virtual HMC image while we’re at it. Sure, I understand that this is no small amount of hardware to keep in one’s basement, but in an industry where so many are at or nearing retirement age, it seems short-sighted to not help people who want to spend their off-hours trying to get better at what they do. Let’s make it easier. Let’s dream a bit.

2) Provide free or very low-cost access to cloud instances. Like other tech giants, IBM is a cloud provider. Letting potential customers try out your interface and log into your systems seems like a good way to get further established in a highly competitive marketplace. Users could report bugs or usability issues to IBM, which would be better than having actual paying customers find and report them. Allowing greater access means greater mainstream familiarity with the product, which would come in handy when IBM pitches to clients that are considering cloud. If admins already know and like using the interface, that could spur more customers to migrate to Power Systems.

And it wouldn’t take much CPU, memory and disk for IBM to make these instances available. Just limit individuals to a small percentage of a CPU, and many users could effectively share the same hardware. Imagine how much the ecosystem could grow if people could simply log into the systems and learn. Think how easy it is to download and use various Linux distributions. Shouldn’t it be at least that easy to run AIX on Power hardware? 

IBM does have its Cloud Free offering, but why not allow access to hobbyists and students without making them provide billing and payment information? The idea of clicking the wrong button fills me with dread.

Again, there may be valid reasons for this, but it’s my dream, and I’m dreaming big.

3) Make it easier to access the educational resources available through the IBM Academic Initiative. I get it: I can find a local school, partner with faculty, and gain access to the materials that way. Schools are also provided with system access for their students. The whole program is a fantastic idea. But the model of making me find a school first seems backwards. Why not provide me with the educational resources and systems in advance? That way, when I approach faculty at a local community college or high school, I’ll have a solid understanding of the materials, and they’ll have a good idea of the course content that I’m offering them.

4) Develop a low-cost, entry-level desktop system. To truly be proficient with Power Systems hardware, users would need, at the very least, access to and control of a virtual HMC and one system that could accommodate the creation of LPARs. Of course POWER9 desktop offerings are already available, but they’re fairly pricey and run Linux on Power (not AIX) out of the box. (Although I have seen AIX working via QEMU.)

But imagine if desktop hardware costs were more in line with what you’d spend on an Intel or AMD PC. Why is this important? Think about what made you a tech guru. Did you only study articles or Redbooks? I doubt it. Having hands-on access to systems allows AIX pros to practice critical tasks associated with operating system firmware, the HMC, VIOS, Live Partition Mobility, and much more. With fewer employers providing training or test boxes, people need opportunities to hone their skills and build confidence. You only need to run rm -rf * as root once to understand why you should never, ever do this. There’s simply no substitute for the learning experience that non-production environments provide.

Don’t get me wrong: I’m glad there’s a manufacturer selling fully open POWER9 hobbyist hardware, and I understand why that’s more expensive to build than other chipsets. We’re talking about massive differences in manufacturing scale. Still, I wish there were something more affordable. Either create the hardware within IBM or partner with another manufacturer, but do it. Just imagine what could be accomplished with something like this.

A final thing: Nigel Griffiths has a couple of new presentations. One is a POWER10 preview from the Power Virtual User Group (VUG) technical webinar series. Watch the replay and download the slides. Also watch the replay from his presentation on what’s new in AIX.

AIX Security and IBM i TR Announcement Highlights

Edit: An end of an era.

Originally posted December 2020 in the final issue of IBM Systems Magazine

Technical Editor Rob McNelly breaks down the latest IBM i and AIX announcements

In October, IBM made a series of announcements covering an array of products and offerings, including IBM Power Systems™ hardware enhancements, new AIX® features and function and the latest IBM i technology refreshes (TRs). 

AIX Security and Availability Updates 

Along with security, high availability is emphasized with the AIX announcements. On that note, support for logical volume (LV) encryption is a huge development.

Part of the AIX 7.2 Technology Level (TL) enhancements, LV encryption support provides for efficient encryption/decryption of data within an LV. While you won’t be able to encrypt rootvg at this time, you can encrypt other system LVs. As noted in the announcement, where available AIX will use on-chip cryptographic acceleration, allowing for data-at-rest encryption. (Learn more.)

PowerHA® SystemMirror 7.2.5 features a Geographic Logical Volume Manager (GLVM) configuration wizard that’s designed to simplify disaster recovery and enable clients to configure and orchestrate multiple parallel GLVM instances from a source to target. Assuming you have the bandwidth, multiple network streams should improve replication speed, and the addition of compression should make replication even faster and more efficient. If you lose one path between nodes, you can continue mirroring your data via another path through the improved network monitoring interface. And new statistics provide greater insight into replication status. With many cloud providers lacking storage-based replication options, GLVM can help facilitate cloud migrations. (Learn more)

The new create_ova command creates an open virtual appliance (OVA) package. An OVA package is an archive file that can be deployed as a VM and imported into any PowerVC environment containing a supported storage device or any cloud service that supports the Open Virtualization format (OVF) packaging standard.

IBM’s Chris Gibson discusses this in detail in his blog (“Creating Bootable AIX OVA Images”).

OVA could be used to migrate LPARs to another data center or to the cloud, assuming you can take the downtime associated with creating and sending the file over the network, and then using that file to deploy the server image. In tandem, GLVM enhancements and the addition of create_ova help simplify cloud migrations.

With Version 9.25.950 of the IBM Virtual HMC (vHMC), clients can use the HMC to back up and restore their Virtual Input/Output Server (VIOS), and also store VIOS backups on the HMC itself. For sites with limited VIOS skills, using a network installation management (NIM) server to restore VIO images in a disaster situation is a lot to ask. In small environments (say, one HMC and one POWER® server), recovery could be even more problematic with no other machine to host a NIM server. The HMC being a viable backup/restore option should simplify the process. We’ll see about scalability. This may not be great for backing up huge POWER server fleets’ VIO servers, but there’s a place for it. (Learn more)

IBM i TRs

The TRs issued were IBM i 7.4 TR3 and IBM i 7.3 TR9. With this announcement IBM delivered 15 new or enhanced open-source packages, including pigz, chsh, MariaDB and PostgreSQL for database flexibility. These additional technologies are intended to give developers greater freedom of choice when building applications on IBM i. (Learn more)

Also available are the new IBM i Playbooks for Ansible®, which automate tasks like provisioning cloud environments, deploying applications, applying security patches and much more. Automation is built in across IBM’s high availability/disaster recovery portfolio. Additional object types and improved application evaluation capabilities have been brought to Db2® Mirror for i, while BRMS delivers significant ease of use based on IBM i Services. (Learn more.)

With security, base authentication in IBM Integrated Web Services (IWS) no longer requires an HTTP server, and IWS also now enables the use of third-party security services. PowerSC MFA can now be run on IBM i alongside AIX and Linux®, providing a single dashboard for security management of any environment. Multifactor authentication is also built into the latest release of PowerVC, increasing security of private cloud and virtualized environments.

More Information

A complete summary of the Oct. 6 IBM announcement.

Why You Should be Running VIOS 3.x

Edit: Hopefully you have upgraded your VIO servers by now.

Originally posted November 2020 by IBM Systems Magazine

Technical Editor Rob McNelly explains the advantages of upgrading your VIO server.

The PowerVM® Virtual I/O Server (VIOS) provides the capability to virtualize your POWER® servers. It’s the software layer that runs between client VMs and the physical hardware.

Imagine a server running 25 VMs. Prior to the advent of virtualization, these would by necessity be multiple physical servers, each with its own set of network and SAN adapters. Of course, virtualizing physical hardware and sharing that among multiple VMs has obvious benefits. For starters, you eliminate the need for all of those extra boxes while having even greater power and capacity. Beyond that, there’s little need to dedicate a physical adapter to every workload because adapters can be shared most of the time.

VIOS debuted with IBM POWER5 servers running AIX® and Linux® workloads. With the availability of POWER6, IBM i workloads were also supported. Through the years, many administrators have come to rely on VIOS, but not everyone is using the latest versions. VIOS 3.1.0 debuted in November 2018, and the latest update, VIOS 3.1.1, arrived a year ago.

Now Is the Time to Upgrade

If you have yet to move to VIOS 3.x, you should do so as soon as possible. With Release 2.2.6 reaching end of life as of October, VIOS 2.x versions are no longer supported without an extended support contract. And continuing to use a release that’s out of support can put your organization at risk.

Maintaining access to IBM support isn’t the only reason you should be running VIOS 3.x on your servers. It’s important to understand that the latest versions of VIOS are fundamentally different from their 2.x predecessors. One important change is that VIOS 3.x is based on AIX 7.2, whereas VIOS 2.x was based on AIX 6.1. Just as there are advantages to running AIX 7.2 over AIX 6.1, there are advantages to running VIOS 3.x over VIOS 2.x. Most significant is that newer POWER hardware can be better exploited with AIX 7.2—and by extension, the virtualization code that comes with VIOS 3.x. The code base is cleaner, because IBM removed older unused packages.

These changes to the underpinnings of VIOS have necessitated a transformation of the upgrade process. While I wouldn’t say upgrading to VIOS 3.x is more technically difficult than what you’re accustomed to with 2.x, it must be approached carefully. This is something new that requires planning and preparation.

Upgrade Tools 

IBM has developed a viosupgrade tool, and I recommend practicing with it before upgrading your production machines. If you have spare computing capacity, it may make sense to use live partition mobility to evacuate your frames and perform the work on “empty” POWER frames so that running workloads aren’t affected. The upgrade process should be documented, and you should go in with the expectation that physical to virtual mappings, performance settings and more will need to be verified once you’re done.
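
One way to approach that verification step is to save a snapshot of your mappings before the upgrade and compare it with a snapshot taken afterward. The sketch below is a minimal illustration and not part of viosupgrade. It assumes you have already reduced your mapping output (for example, from lsmap -all) to a simplified two-column text form of "virtual adapter, backing device" per line. That two-column format and the device names in the sample are hypothetical.

```python
def parse_mappings(text):
    """Parse 'virtual_adapter backing_device' pairs, one per line.

    The two-column layout is an assumed, simplified snapshot format,
    not the raw output of any VIOS command.
    """
    mappings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # skip blanks and comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue                      # ignore lines without a backing device
        adapter, device = parts
        mappings[adapter] = device.strip()
    return mappings


def diff_mappings(before, after):
    """Report adapters that disappeared, appeared, or changed backing device."""
    common = set(before) & set(after)
    return {
        "missing": sorted(set(before) - set(after)),
        "added": sorted(set(after) - set(before)),
        "changed": sorted(k for k in common if before[k] != after[k]),
    }


# Hypothetical snapshots taken before and after the upgrade.
before = parse_mappings("vhost0 hdisk4\nvhost1 hdisk5\nvhost2 hdisk6\n")
after = parse_mappings("vhost0 hdisk4\nvhost1 hdisk9\nvhost3 hdisk6\n")
print(diff_mappings(before, after))
```

An empty result in all three lists is what you hope to see. Anything else is a prompt to investigate before moving workloads back onto the frame.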

While a complete explanation of the upgrade process is beyond the scope of this article, detailed information is available online (see “Upgrading Resources,” below). 

And fortunately, you’re not on your own. Your business partner or IBM Systems Lab Services can help you scope out different options. In fact, this may be a good catalyst to examine your entire environment. Is VIOS 3.x supported on the hardware you’re running? Is your HMC in need of an update? How about your system firmware, or your AIX versions? If these different components haven’t been maintained, this project can and perhaps should mushroom into something well beyond a VIOS upgrade. You may even consider upgrading your hardware, as it may be easier to configure new hardware and install VIO 3.x from scratch while migrating your AIX workloads and decommissioning older hardware. 

Upgrading Resources

Moving AIX Workloads to the Cloud

Edit: As we do more of these migrations we will all get better at it.

Originally posted June 2020 by IBM Systems Magazine

Q: I’m interested in moving my on-premises IBM AIX workloads to the public cloud. What are my options?

Most major public cloud providers have IBM Power® hardware offerings that can run AIX®, IBM i and Linux® workloads. With the growth of hybrid cloud environments, you likely recognize the value of these solutions, but you may be unsure how to proceed.

Let’s start with migration options and tactics. Depending on network bandwidth, your process may be as simple as creating a NIM server in your cloud environment, then deploying by creating and moving mksysb images to the cloud. Assuming your rootvg data is static, you can build the VM and then work on migrating the remaining datavg data.

It may also make sense to use IBM PowerVC, IBM’s advanced virtualization and cloud management offering, to export and import OVA images between your current data center and new cloud provider. Here’s a closer look at more options for migration.

Build a Better VM

Cloud providers typically have a basic OS template that can be used to build and deploy VMs. But moving existing OS images and data to the cloud can be complex and require extensive preparation.

Identify a provider that understands your unique environment and the tools and options you’ll need. Even though non-IBM providers will run your cloud on IBM Power Systems™ servers, their technical personnel may be more familiar with x86 platforms. Finding a partner that’s knowledgeable about Power Systems infrastructure is especially critical if you don’t have the staff or the available cycles to handle both existing operations and migration. Expect to run some proofs of concept and test migrations to allow stakeholders to get comfortable with the migration process and the operational changes that will occur once you start running in the cloud.

The document “Migration Strategies for IBM Power Systems Virtual Servers” details several migration options. These include using IBM PowerHA® SystemMirror® Enterprise Edition with GLVM to sync data in real time ahead of the actual cutover (ibm.co/2YUMJPy). It also examines application-specific replication tools such as IBM Db2® HADR and Oracle® GoldenGate. Most databases have specific migration requirements, but you may be able to ship logs to your new server or export and import data. Iron out details and conduct thorough testing.

Familiar tools such as rsync, savevg and restvg can also help with this process. It may be easy enough to migrate most of the data, then run rsync to sync the last bit before the final cutover. Built-in AIX tools savevg and restvg are used to back up and restore non-rootvg volume groups. These commands can simplify the creation of your new volume groups and filesystems on your new VM.
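
The bulk-copy-then-final-sync approach can be sketched as a small helper that builds the two rsync invocations. This is only an illustration of the pattern: the host name and paths are hypothetical, the choice of options (-a, -H, -z, --delete, --dry-run) is my assumption about what a typical cutover might use, and the helper only prints the commands rather than running them.

```python
import shlex


def rsync_command(source_dir, target_host, target_dir,
                  final_pass=False, dry_run=False):
    """Build an rsync argument list for a bulk copy or the final cutover pass."""
    cmd = ["rsync", "-a", "-H", "-z"]   # archive mode, keep hard links, compress
    if final_pass:
        # On the last pass, remove files on the target that no longer
        # exist on the source so both sides match at cutover.
        cmd.append("--delete")
    if dry_run:
        cmd.append("--dry-run")         # show what would change, copy nothing
    # Trailing slashes make rsync copy the directory contents, not the directory.
    cmd.append(source_dir.rstrip("/") + "/")
    cmd.append(f"{target_host}:{target_dir.rstrip('/')}/")
    return cmd


# Hypothetical host and paths, printed for review rather than executed.
bulk = rsync_command("/data/app", "newvm", "/data/app")
final = rsync_command("/data/app", "newvm", "/data/app", final_pass=True)
print(shlex.join(bulk))
print(shlex.join(final))
```

Running the final pass with dry_run=True first is a cheap way to see how much has changed since the bulk copy, which helps when estimating the cutover window.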

Chris Gibson, AIX and Power Systems consultant with IBM Lab Services, also suggests checking out the “IBM Cloud Mass Data Migration FAQ” (ibm.co/2ZCeOeH). This document lists common questions and concise answers for the IBM Cloud Mass Data Migration solution, a physical data-transfer service (with up to 120 TB of usable capacity) that accelerates migration into the IBM cloud. The solution is an option if over-the-network data transfer options are cost-prohibitive, slow or unavailable.

A smooth journey to the cloud starts with careful planning. Evaluate multiple providers and their solutions, and take the time to understand the various migration options. 

Graphing AIX Performance Data

Edit: This is still a relevant topic

Originally posted October 2019 by IBM Systems Magazine

Understand your unique environment to best collect and graph data.

Q: What’s the best way to collect and graph AIX performance data?

You probably won’t care for the short answer, but honestly, it depends.

The best way to collect and graph AIX performance data will come down to the unique characteristics of your environment. And naturally, the available time and collection of skills of those on staff, along with the resources being budgeted to your operations, are also critical considerations.

Now for the longer answer: Because one size obviously doesn’t fit all, arriving at a solution takes time and forethought. This is true whether you’re assembling a brand new environment or you’re realizing that what’s worked in the past no longer serves your purposes.

In my experience as a consultant, I’ll typically start by asking—and getting answers to—numerous questions:

  • If you plan to host the infrastructure in-house, do you have the cycles in your schedule to stand up and learn a new tool? For that matter, do you have an existing VM—or spare capacity to create a new VM in your environment—that can be used for the effort?
  • Are you looking to send the data off-site and let someone else create and present the reports? Is this Software as a Service-type solution impractical or even impossible in your enterprise given your corporate security posture, firewall requirements or other constraints?
  • Will you need software support to help with setup or troubleshooting if something goes wrong with the data collection? Do you require outside expertise in interpreting the graphs and reports?
  • Who are the consumers of the information being created: management, technical staff or both?
  • What kinds of decisions will be made based on the data? Will data be used to help with server consolidation, or is the idea to rebalance the workloads by identifying frames that are “running hot”?
  • Are you looking to retain past data on system performance? (It’s not a bad idea to have this information on hand to address user questions or concerns.) 
  • Do you need trend data to help determine when new servers or additional capacity should be added to the environment?
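
To illustrate the trend question above, here is a minimal sketch of how retained data could be turned into a rough capacity forecast. It fits a straight line to evenly spaced utilization samples and estimates how many more periods remain before a threshold is crossed. The monthly peak values and the 80% threshold are invented for the example, and a straight-line fit is a simplifying assumption; real workloads are often seasonal or bursty.

```python
def linear_fit(ys):
    """Least-squares slope and intercept for samples taken at x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until(ys, threshold):
    """Periods after the last sample until the fitted line reaches threshold.

    Returns None when the trend is flat or falling.
    """
    slope, intercept = linear_fit(ys)
    if slope <= 0:
        return None
    x_hit = (threshold - intercept) / slope
    return max(0.0, x_hit - (len(ys) - 1))


# Hypothetical monthly peak CPU utilization (%) for one frame.
monthly_peaks = [50, 52, 54, 56, 58, 60]
print(periods_until(monthly_peaks, threshold=80))
```

With numbers like these in hand, the conversation with management shifts from "the box feels busy" to a rough date by which additional capacity should be ordered.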

Your Performance Data Toolbox

Of course, commercial performance tools can do much of this work for you. Options like Performance Navigator from Midrange Performance Group or Galileo Performance Explorer Suite from The ATS Group require little intervention once you get them up and running. These and other products can include vendor support, which may be a priority/requirement for your management.

IBM offers PM for Power; there’s a basic version available at no charge as well as the full-featured product. For many environments, the limited functionality of the free version is sufficient—it just depends on your needs. And newer versions of the HMC include built-in performance graphs. That at least provides a quick, easy way to investigate any potential issues.

Other freely available options include open-source tools such as Ganglia and RRDtool. Or you can always manually feed your nmon files into the Microsoft Excel-based nmon analyzer tool.

Nigel Griffiths has posted articles and videos that detail newer techniques using njmon, InfluxDB and Grafana. These are highly customizable solutions that allow you to change your graphical views on the fly. They’re capable of handling huge amounts of data from large numbers of LPARs and can be implemented with minimal technical expertise.

Re-Evaluating Your Needs

The process doesn’t end with the choice you make. Over time, you’ll need to re-evaluate your solution. Do the tools still do the job they’re needed to do? Is greater automation needed? Are new and better options available? Have your requirements changed?

There’s no one way to store and maintain performance data. But if you ask the right questions—and keep asking questions—the effort and resources you invest in a solution will be worth it.

10 Things to Love About AIX

Edit: I miss watching the Refreshments live.

Originally posted June 2019 by IBM Systems Magazine

Illustration by Mark Allen Miller

If you’ve had access to a television at any point since 1997, you’re probably familiar with the instrumental theme to the long-running animated series, “King of the Hill.” The music is performed by The Refreshments, a band from Tempe, Arizona.

I watched the band attract larger followings and perform in well-known venues. Then came the record deal and radio airplay.

Of course, things change. In the case of The Refreshments, the band broke up. In the case of yours truly, live music and late nights eventually lost their appeal. Recently though, I was able to attend an unofficial reunion show consisting of three of the original members, plus the lead singer from one of the groups that would morph into The Refreshments.

The musicians and the audience were older, wiser and certainly grayer, but we were all transported. It’s not that I’d forgotten about those tunes and those days—I still break out my old CDs from time to time—but that night I realized I’d taken them for granted.

AIX: What’s Not to Love?

In a similar vein, I think we, as AIX* professionals, can take our favorite OS for granted. While the OS can do so much, the job in front of us is our focus. We have systems to maintain, so many of us keep to a narrow set of tasks and operations. We’re aware of this whole wider world of AIX features and function, but we may not take time to really think about it.

So think about it now: What do you love about AIX? For me, it’s the system management interface tool (SMIT). I also love the ease of importing and exporting volume groups and the simplicity of managing disks and filesystems.

I love the thought that went into the naming of commands and the way the whole system works together. Gathering performance data and tuning system performance are straightforward processes. Mirroring, unmirroring and migrating disks is a breeze.

Naturally, I love all of the new stuff, but I also love that you can count on things to stay the same: Years-old Korn shell scripts continue to work on the newest versions of the OS, and upgrades allow you to preserve settings and configurations, saving you from having to rebuild LPARs from scratch.

Many of my favorite things about AIX fly under the radar: functionality that many admins and developers might get to use only infrequently. Of course, some notable new developments are also helpful. So I put it all together in this quick list of things I really appreciate, and yes, love, about AIX.

1-AIX will run on POWER nodes in Nutanix clusters

Unveiled last year, Nutanix is a converged system that utilizes software-defined storage and networking, eliminating the need to manage an external SAN. If your organization already uses or is considering using Nutanix clusters for your x86 environment, you can use the same hypervisor and virtualization stack for your AIX and POWER* environment. There’s no need to learn the HMC or the VIO server, simplifying systems management for existing Nutanix clients as well as those who are new to running AIX. Once you learn how to perform an operation on one type of cluster, you’ll be able to do it on the other.

2-Live Kernel updates

AIX Live Update allows you to update your kernel without downtime. As it becomes possible to patch more parts of the OS without rebooting, this will allow for on-the-fly updates and less disruptive change windows. When coupled with nondisruptive firmware updates for POWER hardware, you can maintain system security without affecting services and end users. Of course, this must be deployed carefully. Progress still needs to be made in this arena, but you can expect that more AIX and firmware updates will be performed this way.

3-Upgrade on the fly

The alt_disk_install and alt_disk_migration methods greatly simplify the entire upgrade process. If anything goes wrong once you’re up and running on the newer versions of code, you can back out by changing your boot device and rebooting to your original disk. It’s that easy because you’re leaving the original disk alone while confining changes to your cloned root disk. Why waste time and run risks related to bad backups? No one wants to restore an OS after an upgrade gone bad.

4-The AIX toolbox for Linux

This collection of open-source and GNU software benefits from more frequent updates. The environment of choice for many Linux* application developers, these tools are packaged in RPM format and can be downloaded and run without needing to compile. This allows you to run familiar open-source tools and programs on your AIX servers. You can also use YUM to automate the download process and set up prerequisites.

5-AIX, IBM i and Linux can run on the same POWER frame

We all understand this, but take a moment to really think about the flexibility this provides. From one base of reliable, powerful hardware, you can choose the OS that makes the most sense for your application needs. Although I like to run AIX where I can, it’s reassuring to know that I can still stick with POWER to run Linux- or IBM i-specific workloads.

6-The hypervisor and virtualization technologies

Of course, these aren’t actually components of the OS—their heritage comes from the IBM mainframe—but it’s all part of the platform. The hardware/virtualization combo is unquestionably one of the best things about working on AIX. Virtualization is baked into the hardware; it’s not a bolted-on afterthought that consumes CPU and memory like you find on other hardware platforms. And because the company that built your hardware also built your virtualization layer and OS, you can expect IBM to troubleshoot the whole stack should you encounter issues. There’s no need to waste time chasing multiple vendors, trying to get someone to actually own your problem.

7-It’s a very forgiving platform

AIX and IBM Power Systems* hardware provide the flexibility to add physical memory and CPU as needed. This allows you to plan for the future and pay as you go. You can upgrade your machine with no outages. You can mix virtualized and dedicated adapters, as well as make virtual machine configuration changes on the fly. You can set up enterprise pools and share resources across physical frames. Adding and removing memory, CPU and adapters is seamless and simple. The same OS will run on the smallest to the biggest systems. Migration is a breeze: Bring in your new server and use Live Partition Mobility to move your workloads while they are running with no downtime. And it’s easy to adjust your resources if needed.

8-AIX has the capability to become a NIM server

Using a network installation manager (NIM) server allows you to back up, restore and upgrade your system over the network from a central location. You can also use it to boot your machines into maintenance mode, and it works under the covers for other operations when managing AIX servers.

9-Boot from USB

I still encounter a fair number of clients who are unaware of this capability. The VIO server and AIX OS can be booted and installed from flash drives. In addition to being much faster than more traditional boot options, booting from USB completely eliminates the need for physical media like DVDs and DVD drives. This is especially useful when setting up an environment where a NIM server isn’t already installed.

10-The system responds to problems before they become outages

The OS has an error-logging facility that diagnoses issues and helps predict problems that could arise on the system. When coupled with hardware location codes, this makes it simple to determine which disk or piece of hardware to replace. When you set up call home, a problem ticket is generated with IBM, and in many instances, a part will be shipped or a client engineer will get dispatched before you even realize there’s an issue. The system helps keep itself highly available.

Make Your Own List

That’s my list. I’m willing to bet you could make your own, and I suggest you do so. I expect you’ll end up with an even greater appreciation for everything AIX has to offer. Comment on this post online with your list of aspects you love about AIX.

Data Backup Options Balance Risk and Cost

Edit: A backup without a test restore is a wish.

Originally posted February 2019 by IBM Systems Magazine

In some environments, disaster recovery (DR) testing and system rebuilding are ongoing. The most dedicated organizations conduct failover tests and run Live Partition Mobility (LPM) operations to evacuate frames so maintenance can be safely performed. Then LPM is used to put LPARs back onto the frames when the maintenance is complete.

Other environments are much more static. LPARs are built, quarterly or semi-annual patches are applied and that’s it. Of course, far too many environments do no maintenance at all. While regular testing is ideal, this level of activity isn’t practical or even necessary for everyone. Before you ever invest in system availability, understand that you’ll always face risk.

We’ve all been in meetings where recovery time and recovery point objectives are set. How far back should your backups go? That depends: How much data can you afford to lose? You must determine the amount of risk that’s acceptable to your enterprise.

Of course, these decisions are often based on business priorities rather than technical considerations. For instance, in some enterprises where real-time or near real-time data replication is seen as cost-prohibitive, backup tapes are shipped to a DR location and then restored to a secondary machine. This provides an extra layer of protection, but it’s also an example of balancing cost versus risk. In this case, risk is the possibility that data may be lost.

Greatest Investment, Lowest Risk

Not too long ago, maintaining two data centers was seen as an option strictly for huge organizations with large IT budgets, but this practice is relatively mainstream now. Obviously, the benefit of protecting data with a secondary data center is that a disaster or any sort of outage is unlikely to take out both facilities simultaneously. However, these solutions still carry risks. For instance, data corruption is still a possibility. If data is maliciously encrypted or destroyed, it may still be copied to your secondary location. For this reason, offline backups should still be a component of your solution.

Also, keep in mind the importance of testing. If you have a high availability (HA) cluster, fail it over regularly and run production for a period of time on your secondary node. (This assumes your failover node is sized to handle the entire workload rather than only its most critical components.) Same goes with a DR site: fail over to it and run production from the secondary location. Verify that everything works as it should. Don’t wait for an unplanned outage or an actual disaster to learn that a critical piece of infrastructure or data didn’t get replicated. Testing may also reveal technical issues with DNS or network connectivity into the secondary data center, or procedural issues that should be ironed out when personnel are fresh and expecting to troubleshoot issues.

Medium Investment, Reduced Risk

If your OS configuration is fairly static, monthly or weekly OS backups may be sufficient. Again though, you must understand the risks. Obviously, in the event of a restore, you’ll need to reintroduce any changes that occurred since the last OS backup. Beyond that, something could happen to your backup image.

If OS and data backups are written to tape, make sure the tapes are clearly labeled and safely stored offsite, and that you have a recovery plan and a method to quickly access them if needed. Remember: A severe outage might lead to a complete loss of access to your machines and data center. Also remember tapes don’t last forever. Have a plan in place to replace them over time.

On the other end of the spectrum are organizations that rely on their storage subsystem to take snapshots. This may seem like sufficient protection, but storage subsystems do fail. Or what if some sort of catastrophe makes the snapshots unreadable? OS images and snapshots can’t be recovered if they no longer exist in a readable form.

Again, testing is critical. You’ll never know your backups are good unless you try to restore them. And when was the last time you audited your backups? Changes happen. Are you sure the backups you set up are still running properly? Even restoring individual files periodically can help you confirm that the backups can be read and the data still exists.
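
One small piece of that audit can be scripted. The following is a minimal Python sketch, assuming your backup images land as ordinary files in a single directory. The directory path, the seven-day limit and the function name are hypothetical, chosen only for illustration.

```python
import os
import time


def stale_backups(directory, max_age_days, now=None):
    """Return the sorted names of regular files in directory whose last
    modification time is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for entry in os.scandir(directory):
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            stale.append(entry.name)
    return sorted(stale)


if __name__ == "__main__":
    # Hypothetical location; point this at wherever your images are written.
    backup_dir = "/export/mksysb"
    if os.path.isdir(backup_dir):
        for name in stale_backups(backup_dir, max_age_days=7):
            print("older than a week:", name)
```

A recent timestamp only shows that a job wrote something. It says nothing about whether the image can be read, so a check like this complements a test restore and does not replace one.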

Backups shouldn’t be limited to enterprise systems, either. VIO servers and the HMC should also be backed up and maintained. Make sure boot media and any other necessary tools are readily available should you need to rebuild machines after a disaster.

Risking It All

As I noted at the beginning, with some enterprises, the choice is to do nothing. Backups may not occur at all or they’re rarely tested. Legacy systems may be left to run without being maintained in any way.

Again, risks and costs are being weighed, but in these cases, the risk may be misunderstood, or seen as negligible, while any cost is viewed as onerous. I won’t offer a lengthy defense of IT spending because if you’re reading this, it’s highly likely that you fully understand the need to protect data and the systems that store it. Plus, that’s probably in your job description.

Whatever choices are made, whatever is invested and whatever risk is allowed, it’s critical that your backup and recovery process is thoroughly documented and that everyone in the organization understands the ramifications of these decisions. If you have concerns about, say, recovering your LPARs, make them known immediately, before an event occurs.

Certainly, additional backup solutions and options are available—I didn’t discuss virtual tape, for example—but hopefully some of these points will help spark an honest assessment of your current situation.

Getting Linux on POWER up and Running is Simple

Edit: It is still simple.

Originally posted January 2019 by IBM Systems Magazine

Many organizations operating with AIX* and IBM i environments also rely on Linux* to run their businesses. But even if you know Linux, you may not realize how easy it is to run the OS on IBM Power Systems* servers.

In larger enterprises, maybe there’s a dedicated AIX team and another team of Linux administrators, with everyone doing their own thing. The AIX group isn’t focused on Linux, and the Linux folks don’t know what Power Systems servers are capable of.

Some believe that running Linux on enterprise hardware is too costly or complex. It’s not. Others don’t realize that running Linux on POWER* is even an option.

Depending on your workload characteristics, Linux performance can be significantly better when run on POWER. It’s also worth noting that IBM has worked to make this option even more appealing. Since the POWER8* processor was introduced, IBM has been transitioning the processor to fully support the little endian format. This makes it easier for application providers to recompile and run Linux on POWER without making changes to their source code. As a result, more distributions, packages and applications are being migrated to Power Systems servers all the time.

Given the affordability of open source, you owe it to your enterprise to consider Linux on POWER and to get hands-on with various Linux distributions. That way, you’ll be able to provide meaningful input when your company discusses the pros and cons of available choices.

An Array of Choices

Numerous Linux distributions are available. Some of the more widely used distributions include Ubuntu, SUSE and Red Hat (the company IBM plans to acquire). But any number of other distributions, such as CentOS, Debian and Fedora, don’t require licensing or support fees.

It’s critical to choose a distribution that’s been compiled for and works on Power Systems hardware—but beyond that, the choices are wide open. As all of these distributions are made up of open-source code, they stand out in different ways. Maybe you’ll find it easiest to work with a particular desktop manager, default filesystem type or package manager. Each distribution is unique, of course, but once you’re proficient with one, working with others is fairly easy.

A Familiar Process

If you’ve not experimented with Linux on POWER, here are some things to consider.

First, installing Linux is similar to installing AIX, so if you’re familiar with that process, you shouldn’t have any issues. Choose a distribution and download the appropriate .iso image.

For instance, a web search on “Ubuntu download Power” returns options to download various Ubuntu versions: There’s Ubuntu 18.04 LTS for IBM Power*, the first release to support POWER9*, and Ubuntu 16.04 LTS and Ubuntu 14.04 (both are for POWER8*). SUSE is similar, although you must register for a 60-day free trial. On Red Hat’s website, you can request an evaluation.

In all instances, be sure you’re getting the install images for ppc64le, and if you’re going to run on POWER9, be sure that the latest processor is supported.

I’m assuming your testing will occur on a traditional Power Systems server with some spare capacity. I’m further assuming you’re running a VIO server, and that you’ve obtained the necessary permissions from your IT management. Testing is even easier if you have either a Linux-only variant of Power Systems hardware—for instance the L922—or hardware acquired from an OpenPOWER community vendor, such as Raptor Computing Systems’ Talos II.

On a traditional system, the installation process begins by copying the .iso image that you downloaded to the virtual media repository in your VIO server. This allows you to boot from a virtual DVD over vSCSI. Then you can either have your SAN administrator provide a LUN that you can use for testing, or you can map a spare disk in your frame to an LPAR, or you can carve up a logical volume in your VIO server to use as a backing device for your Linux installation. Again, all this is familiar for anyone who’s installed AIX.

To make this LPAR available to your network, obtain the appropriate IP address information. While you won’t need much processing power or memory, sizing your test LPAR appropriately will obviously lead to better results.

Once your LPAR is defined, simply boot from your virtual DVD. In many cases, the defaults listed in the various installer menus will be sufficient, but it’s worth taking the time to familiarize yourself with the Linux environment by going through the various menus and trying different settings and options. Practice setting up repositories and user IDs, make changes to filesystems, load software and configure the system.

After going through this process a few times with one distribution, try another. Incidentally, this is why having access to a “crash and burn” test system is critical; you can do what you want without impacting others.

To install Linux as a client hosted by IBM i, view the “IBM Support” document listed in the “Linux on POWER References,” below.

Simple, Seamless

If you allow users on your system, in most cases, they won’t even realize that Linux is running on POWER as opposed to the x86 hardware they may be accustomed to—but they could notice the improved performance. You have the hardware, and getting Linux up and running on it is a simple process.

IBM Power Systems L922 benchmark information: ibm.co/2Teo8zC

Little endian primer: ibm.co/2DFkudC

IBM Redpaper (REDP5496): “IBM Power System L922: Technical Overview and Introduction”: ibm.co/2FA6Km6

OpenPOWER Foundation: bit.ly/2BdsHDD

IBM Support: Create a client partition (i, AIX or Linux) hosted by an IBM i server partition using HMC Classic: ibm.co/2FqpMvt

Linux Distributions

Ubuntu: bit.ly/2PYU10p

SUSE: bit.ly/1rmA4dg

Red Hat: red.ht/2QBniv3

CentOS: bit.ly/2qKtINp

RaptorCS: bit.ly/2wpNH5z

Debian: bit.ly/2Psp7OI

Fedora: bit.ly/2RWsb2p

NIM Server Simplifies Installing and Upgrading AIX

Edit: I still love NIM

Originally posted December 2018 by IBM Systems Magazine

Installing and upgrading your AIX* OS can be done in numerous ways.

For starters, you can install it from base media that’s downloadable from IBM. It’s simply a matter of populating your virtual media library with the appropriate images and using virtual DVDs and virtual SCSI adapters. The downside to using base media is that it requires customization once the OS is installed. This wasn’t a problem back when we were running one OS on each physical machine, but when you’re looking at loading many images onto a single physical frame, scalability becomes an issue.

Logically moving the install media devices and adding custom scripts, user IDs, cron jobs and site-specific information to your LPARs can also be time-consuming if done manually. Over time, many sites have just taken to upgrading and migrating existing systems rather than reinstalling them.

Alternatively, using the current versions of the VIO server and AIX, you can install your OS from images that you copy to USB flash drives. Flash drives are fast, and installing VIOS from USB solves the chicken-and-egg dilemma of how to install an OS on the first machine in the data center. However, many of the same pitfalls apply with flash drives.

While many organizations install AIX using IBM PowerVC* virtualization and cloud manager to capture and deploy OS images, this method isn’t for everyone. Some encounter difficulties when coordinating among siloed teams to get PowerVC virtualization and cloud manager operational. (For one example, consider SAN teams that would have to allow access to their switches. This sort of bureaucracy can be overwhelming if you’re tasked with getting a green-field environment up and running quickly.)

The Advantages of NIM

So yes, when it comes to installing AIX, you have options. But for me, there’s only one real choice: the Network Installation Manager (NIM). NIM runs on an AIX LPAR, making it simple to deploy, even for those who haven’t previously used it. I also appreciate the control NIM gives me. I can choose where in the environment to run NIM, and I can have multiple NIM servers in multiple locations.

In addition, NIM is great for upgrades and migrations. If you need to migrate an older system to new hardware, you can take a backup image (in AIX it’s known as a mksysb) and bypass hardware limitations that might exist with older AIX versions. Yes, you can do the same thing from base media, but again, to make the solution scale, it’s best to use the network. Assuming your NIM server has the appropriate resources, it’s no trouble kicking off multiple installs or updates simultaneously.

I also use NIM for alt_disk_migrations and alt_disk_upgrades. By cloning your OS’s root volume group (rootvg) to a spare disk, you can perform operations on that volume group copy. Rather than take time during a change window, you can perform the upgrade beforehand, without affecting your running workloads. When you want to take backups, take them directly to your NIM server. If you already have a mksysb image, you can copy it to your NIM server. Then, after creating a SPOT, you can use that backup image to either restore or clone your LPAR.

Finally, as it’s sometimes difficult to get ports opened in enterprises, there’s also an option to use HTTP with your NIM server.

As I previously stated, NIM is easy to use, but it has one important requirement: Your NIM master must be at the highest level of AIX in your environment. You can’t use an older AIX version to install or restore a newer AIX version.
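
That requirement lends itself to a quick sanity check. Below is a minimal Python sketch, assuming you have gathered the output of oslevel -s from the master and each client, and assuming the comparison is made on the full level string. The host names, the sample levels and the function names are hypothetical.

```python
def parse_oslevel(level):
    """Turn an `oslevel -s` style string such as '7200-03-02-1846'
    into a tuple of integers that compares in release order."""
    return tuple(int(part) for part in level.strip().split("-"))


def clients_newer_than_master(master_level, client_levels):
    """Return the sorted names of clients whose level is higher than the master's."""
    master = parse_oslevel(master_level)
    return sorted(
        name
        for name, level in client_levels.items()
        if parse_oslevel(level) > master
    )


# Hypothetical host names and levels, for illustration only.
clients = {
    "lpar_app1": "7100-05-03-1846",
    "lpar_db1": "7200-03-01-1838",
    "lpar_test": "7200-03-02-1846",
}
print(clients_newer_than_master("7200-03-01-1838", clients))
```

Any client the function returns is a reason to update the NIM master first.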

NIM Is Tried and True

As you evaluate different methodologies for performing installations, patching and ongoing maintenance in your environment, it’s easy to overlook tools we’ve been using confidently for years. Although newer options like PowerVC and BigFix* software (which uses NIM under the covers) will continue to gain wider adoption going forward, in my opinion, nothing beats the tried and true NIM server.

A Closer Look at Performance and Security Updates to AIX 7.2

Edit: I am installing AIX next right now.

Originally posted October 2018 by IBM Systems Magazine

I still hear from people who are convinced that the AIX* OS is going away. I’ve done my best to refute these arguments by pointing out that IBM’s support of Linux* is not a threat to AIX. I’ve even asked some well-known experts to explain why our favorite OS isn’t going anywhere.

But if you want to really understand why the OS has a bright future, start by taking a look back. IBM has been putting out roadmaps for years. In charts available prior to the Version 7.2 release, “AIX Next” served as a placeholder name for the upcoming variant. If you go back to 5L’s debut in 2001, you can see a consistent cadence of OS releases (and retirements) every three to five years. And every year or so, new service pack support is announced.

As with previous releases, I’ve seen current charts labeled “AIX Next.” AIX 7.2 became generally available in 2015, so we can expect something new around 2020. Will it be called AIX 7.3? AIX 8.1? AIXi? AIX X? Only time will tell, but rest assured, “next” is coming.

Performance and Security

In the meantime, let’s discuss the AIX OS in the present. I recently attended an IBM briefing about the latest AIX 7.2 technology level (TL). TL3 is expected to be available in the latter part of this year. Here are some highlights:

  • AIX will support running up to 1536 threads (192 cores running at SMT8) and up to 32 TB of RAM in a single LPAR. When I stop to think about that, I’m amazed. For as long as I’ve been at this, a terabyte of anything still seems like a large number.
  • In conjunction with IBM PowerVM* virtualization and IBM POWER9*, you’ll be able to include AIX in a processor-based chain of trust to secure the booting process. Secure Boot for firmware images helps prevent unauthorized access to customer data—either through unauthorized firmware running on a host processor or from security vulnerabilities in authorized service processor firmware or its hardware service interfaces.
  • Trusted Remote Attestation of firmware images enables a remote system to determine the level of trust in the integrity of the platform. The OS will also support trusted install and update, and an option will allow only privileged users to run kernel tracing.
  • Enhanced support for alt_disk_mksysb installs allows customized boot images to be copied during alternate disk maintenance.
  • Additional open-source tools and solutions from the AIX Toolbox for Linux applications.
  • JFS2 file space reclaim for enhanced efficiency with thin-provisioned storage solutions.
  • Look for new Multipath I/O (MPIO) enhancements to support disk storage attached through the AIX iSCSI software initiator. MPIO storage resiliency will also be enhanced with changes to IBM-recommended MPIO drivers. Related: The recommended multipath driver to use on AIX and the VIO server when attached to SVC and IBM Storwize* storage devices running microcode levels Version 7.6.1 and later will be changing from SDDPCM to the default AIXPCM (ibm.co/2nzM1Do). SDDPCM won’t be supported on POWER9.
  • AIX 7.2 TL3 will run in SMT8 mode as a system default. This change stems from POWER9 showing impressive results with SMT8 (ibm.co/2nzM1Do).
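
The thread count in the first bullet is simply cores multiplied by the SMT mode. As a minimal Python sketch of that arithmetic (the function name is hypothetical):

```python
def logical_cpus(cores, smt_mode):
    """Hardware threads presented to the OS: one per core per SMT thread."""
    if smt_mode not in (1, 2, 4, 8):
        raise ValueError("SMT mode must be 1, 2, 4 or 8")
    return cores * smt_mode


# The 192-core maximum from the list above, at each SMT mode.
for mode in (1, 2, 4, 8):
    print(f"192 cores at SMT{mode}: {logical_cpus(192, mode)} threads")
```

At SMT8 this gives the 1536 threads quoted above. The same LPAR running at SMT4 would present half as many.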

Lastly, IBM recently released a statement of direction (SOD) that should excite administrators who need to connect their AIX systems to Windows* environments. The SOD reads in part: “IBM intends to enable the SMB2 (server message block) version for AIX to enable data exchange between AIX and Windows OSes.”

The Quiet Transformation

With each new release and each new TL, AIX users get more features and greater functionality. The OS, although superficially similar to AIX 4.3.3 or 5.3, has undergone many improvements to get us to AIX 7.2, yet IBM has done so in such a way to be minimally disruptive to the platform’s long-time administrators and users.

An important example of IBM’s care to minimize disruption comes with the release of the new POWER9 processor-based systems. AIX allows for Live Partition Mobility to help with planned migrations. LPARs running AIX levels supporting POWER9 servers can be migrated live from POWER7* or POWER8* systems to POWER9 systems and be run there as is, without workload interruption. Some clients seem to be unaware of this critical option. You’ll definitely want to keep this in mind as you plan and prepare to migrate to POWER9 hardware.

The AIX OS has always been rock-solid. While all of the changes over the years are incremental in nature, taken together, they’ve transformed AIX.

POWER9 Brings Changes to the HMC

Edit: I want my HMC.

Originally posted June 2018 by IBM Systems Magazine

The Hardware Management Console (HMC) is evolving, and you’ll need to adapt. IBM is moving away from the traditional x86-based hardware appliances and will only be selling POWER* processor-based 7063-CR1 HMC appliances going forward. One reason for the change is that some clients have concerns about running Lenovo hardware in their data centers, and as the supply of these appliances dwindles, we’ll no longer be able to order them. Others have always questioned why we were managing IBM Power Systems* hardware with x86 servers in the first place.

According to roadmaps I’ve seen, the 9.1.910 and 9.1.920 releases will be available sometime this year for both platforms. However, with the 9.2.930 release (expected in 2019), the code will be compiled for POWER only—not x86. As end of marketing and hardware support takes place, it will be time to migrate your data center away from x86 HMCs entirely.

Upgrades on the Horizon

Currently, you can choose from four different options to run your HMC. Of course, there’s the traditional x86-based HMC appliance that you’ve been using since the IBM POWER4 days. Alternatively, you can run a virtualized HMC (vHMC) image under the VMware, KVM or Xen hypervisor on your own x86 hardware. Or you can run an HMC appliance that’s based on the POWER processor. Finally, you can run a vHMC image in an LPAR on POWER hardware. As with any other workload, the HMC code will be capable of utilizing the strengths of the POWER hardware, including additional threads, greater memory bandwidth and superior performance.

As you update the firmware on your POWER hardware, you’ll need to upgrade your HMC code. This is another area of change, as HMC code nomenclature will be different going forward. For instance, today’s HMC V8 R870 M1 denotes the version, release, maintenance level and any fixes. The version correlates to the POWER family, the release is the corresponding firmware, the maintenance is the service pack, and the fix is not used at this time. Starting with Version 9 of the HMC code, we’ll still have version, release, maintenance and fix, but now with—for example, V9.1.910—the version will be the POWER family and the release will only increment on major revisions, meaning you’ll see only infrequent updates. The maintenance will be the firmware release and the fix will be any PTFs. So rather than get new HMC releases, we’ll see new HMC maintenance levels.
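
To make the new scheme concrete, here is a minimal Python sketch that splits a level string into the four fields just described. It assumes the dotted form shown in the V9.1.910 example, and it treats the fix as an optional fourth field that defaults to zero. That default is an assumption made for illustration, as are the class and function names.

```python
from typing import NamedTuple


class HmcLevel(NamedTuple):
    version: int      # POWER family, e.g. 9 for POWER9
    release: int      # increments only on major revisions
    maintenance: int  # tracks the firmware release, e.g. 910
    fix: int = 0      # PTF level; assumed to be 0 when not given


def parse_hmc_level(text):
    """Split a string such as 'V9.1.910' into its named fields."""
    fields = text.strip().lstrip("Vv").split(".")
    if len(fields) not in (3, 4):
        raise ValueError(
            f"expected V<version>.<release>.<maintenance>[.<fix>]: {text!r}"
        )
    return HmcLevel(*(int(field) for field in fields))


print(parse_hmc_level("V9.1.910"))
# Tuples compare field by field, so levels sort in release order.
print(parse_hmc_level("V9.1.910") < parse_hmc_level("V9.1.920"))
```

Because the maintenance field follows the firmware release, comparing two parsed levels is a quick way to see which HMC is further along.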

To support these changes, IBM plans to regularly seek input from HMC clients. You’ll see your first survey 30 days after initial login; then you’ll receive new surveys every 180 days thereafter. The HMC team will carefully review the feedback and use this information to improve the tool.

Additional IBM PowerVM* virtualization simplification enhancements are also planned in support of the transition from the classic menus to the enhanced HMC GUI. For example, IBM intends to provide templates to simplify system deployments as well as integrate performance and capacity metrics. Efforts will also be made to simplify the process of partition provisioning. In training I recently attended, it was acknowledged that the early beta releases of the enhanced GUI were less than stellar, but don’t let your first impressions tarnish your view. The performance and usability of the GUI have greatly improved.

Moving to POWER9

As you move to POWER9* and upgrade your HMC, consider:

  • The V9 HMC code will no longer allow you to manage any POWER6* hardware that might still be running in your data center. You’ll need the V9 R1.910 code to support the S914, S922 and S924 systems. And as new hardware models become available, you’ll need to update your HMC code to manage them.
  • Further down the line, plans call for the HMC to be able to manage OpenPOWER hardware so you can manage all your IBM and non-IBM systems together. Note that not all functions will be supported initially; unsupported functions will either be inaccessible or trigger error messages.

Adjusting to the New HMC

It’s time to get on board with the new HMC. While it can be frustrating to try and do things with the GUI that you could do in your sleep with the classic menus, myriad outlets are available for assistance.

The HMC is an integral part of how we manage our IBM Power Systems hardware. Keeping on top of these changes is an important part of maintaining the overall health of the systems we support.

An In-Depth Look at POWER9

Edit: Still love new hardware.

Originally posted March 2018 by IBM Systems Magazine

The POWER9 era is upon us. As you undoubtedly know, IBM announced six new POWER9 servers in February to go along with the initial POWER9 server that was unveiled in December.

Following up on the introduction of the S914, L922, S922, S924, H922 and H924 boxes, IBM released rperf and CPW numbers on Feb. 27. Note that as part of the benchmarking process, IBM has published numbers that reflect the addition of all known security and bug mitigations that will be installed on the new systems, which GA on March 20. This of course is a response to the Meltdown and Spectre bugs.

If you’re looking for more detailed information about these announcements, there are a couple of presentations that I highly recommend. In this video, IBMer Nigel Griffiths examines the S924 that he received via the early ship program. Hear his impressions of the server, and watch as he pulls out fans, moves the machine in and out of the rack and shows you the server internals, cable management arm and more. It’s a fun 14 minutes.

For a deeper dive, check out this IBM Power Systems Virtual User group replay and accompanying slides.

This 2-hour presentation by IBM’s Joe Armstrong is well worth your time. Here are some summary notes to give you an idea of what’s covered:

-The model AC922, announced in December, consists of POWER9 chips built from SMT4 “split” cores, while the six new servers run on the POWER9 SMT8 “big” cores. This is illustrated in slides 4-6.

-POWER9 chips have 8 billion transistors, compared with 4.2 billion in POWER8 and 1.2 billion in POWER7. POWER9 is 14nm, versus 22nm for POWER8 and 45nm for POWER7.

-In contrast to the buffered memory and custom chips used in POWER8 systems, POWER9 scale-out systems use a commodity form-factor direct-attached solution for the DDR4 memory subsystem. This allows for better pricing and lower latency. Expect to see buffered memory in scale-up enterprise-class servers down the line.

-Keep in mind that IBM will only support systems that use official IBM memory DIMMs. 16G (feature code EM62), 32G (EM63), 64G (EM64) and 128G (EM65) DIMMs are available for order.

The frequency that the memory will run at depends on the number of DIMMs that are populated per socket. One machine can have 16 DIMMs per socket, or 32 DIMMs total, for a maximum of 4 TB of memory.

-POWER9 servers have PCIe Gen4 adapters running at 192 GB/s peak bandwidth, doubling the rate on POWER8 servers, which use PCIe Gen3 adapters. Note that your Gen3 adapters will work with the new Gen4 ports.

-There are four processor modes (see slide 14): disable all modes, enable static power saver, enable dynamic performance and enable maximum performance, settings that correspond to minimum, nominal, turbo and ultra operational frequencies. After logging into ASMI, you should, according to IBM, be able to change modes as needed without a reboot. Running in nominal mode means the system doesn’t automatically make changes on the fly, as the other modes do. Under max or dynamic performance, fan noise may be louder than you’re accustomed to with earlier systems.

-With the model S924 (and variants), you’ll have options of 12 cores running at 3.4-3.9 GHz (feature code EP1G), 10 cores running at 3.5-3.9 GHz (EP1F), and eight cores running at 3.8-4.0 GHz (EP1E). These will be in the P20 IBM i software group. Note that these frequency numbers indicate the turbo and ultra speeds.

-With the model S922 (and variants), you’ll have options of 10 cores at 2.9-3.8 GHz (EP19), eight cores at 3.4-3.9 GHz (EP18), and four cores at 2.8 to 3.8 GHz (EP16). These will be in the P10 IBM i software group.

-With the model S914, you’ll have options of eight cores running at 2.8-3.8 GHz (EP12), six cores running at 2.3-3.8 GHz (EP11), and four cores running at 2.3-3.8 GHz (EP10). The 8- and 6-core versions will be in the P10 IBM i software group; the 4-core option will be in the P05 IBM i software group. This is also the only system that defaults to the dynamic performance mode.

-With the model L922, you’ll have options of 12 cores running at 2.7-3.8 GHz (ELPX), 10 cores running at 2.9-3.8 GHz (EPPW), and eight cores running at 3.4-3.9 GHz (ELPV).

-There are up to four 400G NVMe drives, and you can assign each one to its own LPAR. In other words, you could assign one individual NVMe drive to one individual LPAR for up to a total of four drives in four LPARs. While you won’t be able to hot plug them like you can with an SAS drive, this setup nonetheless is great to use for internal boot drives for VIO servers. You could also logically carve up these drives, virtualize them in your VIO server and serve them to your vSCSI clients. In other words, you could assign internal disk to your LPARs as well (provided these aren’t heavy, write-intensive workloads).

-AIX and VIOS OS images can be downloaded as a single install image from IBM Entitled Support. This simplifies installation from USB drives.

-The chart on slide 23 presents a good overview of the machines, including details like the number of sockets, the amount of memory, the number of CAPI 2.0 slots, and more. Slides 24-51 get into the specifics of each machine. Slides 52-57 cover different I/O adapter options, and slides 58-70 go into supported operating systems, including roadmaps that extend years into the future.

HMC options are covered in slides 73-76. The CR7, CR8 and CR9 are no longer being sold, so look for the 7063-CR1 HMC, which is based on POWER processors. Alternatively, you could choose to run one of your HMCs as a virtual HMC.

Make sure you’re running V9R1.910 HMC code to manage your POWER9 servers. Keep in mind if you update to this version, you will no longer be able to manage POWER6 servers in your environment.

-Don’t lose sleep over migrating to POWER9. Slides 77-82 get into details about migration and other topics. If you don’t have PowerVM installed, learn how to get temporary PowerVM Enterprise Edition codes for your older hardware so you can migrate workloads to POWER9 using Live Partition Mobility. On that note, since all new POWER9 servers will have PowerVM Enterprise Edition by default, that means all POWER9 servers can run Live Partition Mobility by default.

-The POWER9 power supplies will run 1400W 200-240 VAC. POWER8 servers run 900W power supplies.
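As a quick sanity check on the memory figures in the notes above, this Python sketch multiplies DIMM size by slot count. The DIMM sizes and feature codes come from the list; the two-socket layout is inferred from "16 DIMMs per socket, or 32 DIMMs total," and treating 16G as 16 GB and 1 TB as 1,024 GB is my assumption.

```python
# DIMM sizes in GB, keyed by the feature codes listed above.
DIMM_SIZES_GB = {"EM62": 16, "EM63": 32, "EM64": 64, "EM65": 128}

DIMMS_PER_SOCKET = 16
SOCKETS = 2  # inferred from "16 DIMMs per socket, or 32 DIMMs total"


def max_memory_gb(feature_code: str) -> int:
    """Capacity in GB when every slot holds the given DIMM."""
    return DIMM_SIZES_GB[feature_code] * DIMMS_PER_SOCKET * SOCKETS


if __name__ == "__main__":
    for code in DIMM_SIZES_GB:
        gb = max_memory_gb(code)
        print(f"{code}: {gb} GB ({gb / 1024:g} TB)")
```

Only the 128G DIMMs reach the 4 TB maximum; a machine filled with the 16G DIMMs tops out at 512 GB.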

As you can see, POWER9 gives us a lot to be excited about. As I’ve said before, I can’t wait to get my hands on these new machines.

Seriously, AIX is Not Going Away

Edit: Seriously.

Originally posted February 2018 by IBM Systems Magazine

Lately I’ve received a number of inquiries about the future of AIX. In a sense, I understand the fears. Sometimes we hear about companies migrating from AIX. Often you’ll see Linux featured in mainstream tech media, and seldom will you find much about AIX. But AIX is not going away.

Don’t take my word for it. Here are other views on the future of AIX. Hopefully the responses from these prominent folks will prove persuasive.

(Note: These comments contain minor edits.)

Joe Armstrong, Power Systems VUG (this is from a January email to user group members):

IBM is continuing to invest in the AIX operating system, and I have seen AIX roadmaps well into the future. Again, AIX is not going away. However, Linux is growing in the industry, and Power Systems are a superior platform to run Linux workloads. SAP HANA and the growing number of Machine Learning and AI workloads are a prime example of where IBM Power Systems shine….

In reality, the VUG sessions have always covered more than AIX. Power Systems VUG is a much more meaningful name, and I intend to continue providing the same informative webinars that you have enjoyed over the past ten years, and I will continue to cover AIX specific topics.

With the introduction of POWER9, new features, and updates to existing features, there is a lot to cover in the Power space. Rather than replacing AIX related webinars, I plan to offer some extra webinars, so you may see more than one per month….

Nigel Griffiths, IBM:

(An AIX user) asked me, “Do you think that AIX is dead for the future? There are fewer and fewer proposals from recruiters.”

I answered with:

That is a truly bizarre conclusion to come to on that evidence. AIX is a multibillion-dollar business for IBM. Why would IBM stop that revenue? That would be bonkers!

AIX is running in the vast bulk of major companies in the world, and AIX is running their most vital workloads. IBM and AIX are here for the long term. There is no better UNIX on the planet and it is the OS that my bank account is held in! Sure the AIX guys don’t make the same volume of noise as the Linux fanatics, as they are quietly running the core systems.

Don’t get me wrong, Linux is great fun but when I run into critical to the business problems, I want to be on AIX… the AIX support team are second to none—and I know many of them personally (being in the same company for so many years). So claiming AIX is dead is IMHO rather silly.

Earl Jew, AIX performance expert, IBM:

AIX won’t be fading for a long while, if ever. Do we still have a niche for IBM z mainframes? Do we still have a happy following of IBM i customers? Yes and yes. Likewise AIX will endure too. All that is happening now is… other IT niches are emerging, growing and evolving… For instance, Linux on Power has exploding opportunities with SAP HANA, AI, cloud and machine learning.

The “rise of the rest” doesn’t mean AIX will fade and die. AIX will endure because the nature of AIX workloads is durable. Commercial enterprises will always need fast secure reliable processing of traditional structured data on systems of record. AIX is best for open systems that make money, mean money or are money. Such systems are thus long-lived; they tick along on AIX through decades as they spark, grow and evolve. AIX will also persist because AIX evolves while remaining essentially AIX through the decades. AIX evolves to better exploit POWER technologies and AIX evolves to better serve what runs on AIX—yet AIX is still AIX.

We learn AIX and keep it. We use AIX and love it. We port to AIX to stay. This is why AIX will not fade and die.

Shawn Bodily, an IBM Champion and former IBMer now with Clear Technologies:

When someone asks me why AIX, I often ask some of the following questions: Is security important to you? Do you like having one OS across all server sizes? Is RAS important to you?

That last question I ask with some fear of it being rhetorical and even a bit cliché. AIX is a mature enterprise class OS that, when combined with PowerVM and POWER processor, has among the highest uptime and least amount of security vulnerabilities. AIX offers some unique features that are not only unavailable in Linux, but not expected in Linux anytime soon. One example of that is Live Kernel Update. This key differentiating feature has come to fruition in just the last couple years. This is a testament to the continued development and long term viability of AIX….

Here’s more from Nigel:

We also get another erroneous conclusion: People think that since IBM is not announcing AIX V8, then AIX is dead. IBM can put out massive new functions in the TL levels of AIX 7.2 without an AIX V8. AIX V8 would be disruptive, as it implies a slow overwrite install upgrade. People confuse version numbers with commitment to the product. I really don’t get it….

I think technical people see themselves as AIX experts as they use AIX commands every day and they know AIX features very well. They are betting their careers on developing AIX skills. AIX is here on the screen in front of them “talking” to them. They don’t think of themselves as POWER8 experts as the server is miles away in a dark room.

Again, IBM is still heavily invested in AIX, which runs critical workloads in large businesses the world over. As Nigel says, we do not need an AIX V8 to realize additional improvements in the operating system; enhancements arrive all the time.


One final thought: If you’re truly concerned about the future of AIX, take action. Spread the word. Attend conferences and user group meetings. Share webinar replays with colleagues.

I’ve written about other things we could do, like interact on Slack or IRC. There’s also the AIX subreddit and the AIX forum.

It’s one thing to love the operating system, as we all do. But we should be proactive about it. If we don’t help spread the word, people can easily convince themselves, despite ample evidence to the contrary, that AIX is going away. What are you doing to help dispel this perception?

POWER9 Hardware and More to Look Forward to in 2018

Edit: Some links no longer work.

Originally posted January 2018 by IBM Systems Magazine

For most of us, the holidays are a time to unwind. Of course I say “most of us” because in IT, someone is always needed to keep tabs on those machines on the raised floor. Even though laptops, phones and VPNs make it possible to arrange for coverage remotely, there’s still nothing like being able to completely unplug from work for a week or two (preferably on a beach somewhere). So if you were tasked with being on call during the holidays, I salute you. And if you were fortunate enough to have time off at the end of the year, I hope you enjoyed it, and I assume you’re now ready to get back to it.

Being away from our jobs is certainly relaxing, but with everything that goes on during the holidays, it’s easy to lose track of news about AIX and IBM. With this in mind, let’s take a quick look back at some things you may have missed over the final weeks of 2017.

No doubt, you’ve heard that POWER9 is here. While I expect 2018 to be an eventful year with POWER9 server announcements (as well as the expected updates to AIX, IBM i and the VIO server itself), the first POWER9-based server, the AC922, was unveiled in December. This announcement naturally drew a lot of coverage from mainstream tech outlets, including ZDNet, CRN and TechCrunch.

Given this IBM statement of direction, look for much more going forward:

IBM intends to offer clients with IBM Power E870, E870C, E880, and E880C systems the following capabilities that are designed to provide a smoother migration path to the POWER9 technology-based systems when they become available.

IBM plans to offer system upgrades from Power E870, E870C, E880, and E880C systems to the next-generation POWER9 systems that will maintain the serial number of the existing IBM POWER8 systems.

IBM intends to deliver the capability for the next-generation high-end system with POWER9 processors to participate in the same Power Enterprise Pool with Power E870, E880, or E870C/E880C systems.

Although this doesn’t speak to timing―and this article is certainly not meant to be an announcement of any kind―it’s entirely reasonable to assume that new POWER9 servers will be coming in the relatively near future. I, for one, cannot wait to start installing them with clients.

Other Changes

Big changes are also in store for the HMC. I wrote about the four HMC options, including a virtual HMC that will run on x86 and a virtual HMC that will run on Power servers. There’s also an HMC appliance that will be based on POWER hardware as x86 hardware gets phased out over time. Because some data centers won’t allow Lenovo hardware (which the current HMC appliances are based on) onto their floors, and since POWER hardware is a better choice anyway, it makes even more sense to switch from x86-based HMC appliances.

In addition, there’s a new version of PowerVC. Read about version 1.4.0 here and here. Some highlights include integrated software defined storage, support for machines running KVM on POWER, the capability to import/export deployable images, the capability to capture a live virtual machine and UI updates.

PowerAI is a platform that bundles the most popular machine learning frameworks with all of the dependencies and optimizations needed to get up and running quickly. To get started with PowerAI, download the code or request a trial.

https://www.ibm.com/developerworks/community/blogs/Power_Systems_Solutions/entry/5_Things_to_Know_About_IBM_PowerAI?lang=en

https://www.ibm.com/us-en/marketplace/deep-learning-platform

Another interesting development is “cloud ready” AIX images. I’ll let Chris Gibson fill you in:

In addition to installation images for AIX, “cloud” image formats are also made available that can be readily deployed with PowerVC. These images contain a default AIX base media install configuration that includes Cloud Init and its dependencies. The images can be obtained from the IBM Entitled System Support website or IBM Passport Advantage.

This covers plenty of other ground, including the capability to install AIX from USB:

AIX 7.2 Technology Level 2 and AIX 7.1 Technology Level 5 support installation via a USB flash memory stick on POWER8 and later systems. A USB flash memory stick containing an AIX installation image can be created by first downloading the AIX installation image from the IBM Entitled System Support website. A single volume installation image of these AIX levels is available on the Entitled System Support website for writing to a USB flash device. Once downloaded, the AIX installation image can be written to a USB flash memory stick. It is recommended that a recently manufactured USB flash memory stick be used.

Here are a couple recent updates about IBM’s PowerHA product suite: the new PowerHA SystemMirror GUI and the latest version of PowerHA SystemMirror for Linux.

Finally, the Power Systems best practices document was recently updated.

Two Worthwhile Webinars

The monthly AIX Virtual User Group meetings are a great resource. Even if you can’t tune into the live webinars, the replays are typically available a few days after the session. Subject matter experts from IBM and elsewhere cover a wide range of products and technologies. Recent presentation topics include PowerAI, the AC922 and the new HMC interface. Although I’ve linked to the presentation materials, I recommend you also listen to the session replays, which can be found on the homepage. The AIX Virtual User Group replay archives go back to 2008, with presentations going back to 2007. Much if not most of this information remains relevant today.

Another webinar series originates from the U.K. The format is similar to that of the AIX Virtual User Group, and again, quality, in-depth information is provided on a variety of topics. The archives are here. The U.K. group has recently covered PowerAI, the Cloud Management Console and the enhanced HMC GUI (this last session includes a demonstration).

The Case for Keeping Up

It may sound trite, but in our line of work, there is always more to learn. Hopefully you’ll find value in the information I’ve cited. Even if some of this material isn’t currently relevant to your job, you never know when you may find yourself in a meeting or in a discussion with a colleague and be asked for your two cents about cloud or machine learning or any number of subjects. Obviously our day-to-day responsibilities are substantial, but I believe that informing ourselves about what’s new and what’s changing is worth the time and effort.

The Value of Performance Data

Edit: Are you tracking your system performance?

Originally posted December 2017 by IBM Systems Magazine

At the most recent IBM Technical University event in New Orleans, I was talking with Randy Watson of Midrange Performance Group (MPG). He mentioned that many customers don’t keep any performance data whatsoever.

Randy’s words surprised me. Everyone should carefully track system performance. To say it’s worth the cost and effort is an understatement. This data provides a variety of important benefits.

Three Scenarios

Consider this scenario: Your phone rings in the middle of the night. You’re told that users are having issues with one of your systems. Once you clear your head, you bring up the performance graphs that display your LPAR’s historical data. You can immediately see where and when things changed. Now you’re on track to determine what’s causing the issues you’re seeing.

Performance graphs provide an easy way to visualize your environment running normally compared to how it looks when there’s a problem. Of course in real life things are seldom cut and dried. For instance, subtle changes may require you to go back over a longer period of time to find something.

While graphs can prod things in the right direction, they do have their limits. Sometimes graphing your data can hide or at least distort the truth, a point AIX performance expert Earl Jew makes in his articles and lectures. Much of what you see when interpreting graphs will depend on the specific items you’re looking at and the length of your intervals.

Still, performance graphs are typically helpful in these situations. If someone tells you that performance is degraded, you need data. How can you begin to understand the impact of a change to your environment if you have no idea what normal looks like?
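As a rough illustration of why a baseline matters, the Python sketch below scores a new reading against historical samples by counting how many standard deviations it sits from the historical mean. The function name and the sample numbers are made up for this example, and a single summary statistic is subject to the same caveat as graphs: averaging can hide what is really going on.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def deviation_from_baseline(history: Sequence[float], current: float) -> float:
    """Return how many sample standard deviations `current` is from the mean of `history`."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any difference at all is off the scale.
        if current == baseline:
            return 0.0
        return math.copysign(math.inf, current - baseline)
    return (current - baseline) / spread


if __name__ == "__main__":
    # Made-up CPU utilization percentages for the same hour on earlier days.
    usual = [40, 42, 38, 41, 39]
    score = deviation_from_baseline(usual, 48)
    print(f"tonight is {score:.1f} standard deviations from normal")
```

Without the historical list there is nothing to compare tonight's number against, which is the whole point of keeping the data.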

Historical data is also useful in areas beyond performance. Here’s another scenario: Management tells you that it’s time to migrate to POWER8 servers, and they want to know what models and hardware components you recommend for the refresh. Naturally, you’re not going to guess. You’ll check the aggregated historical performance data that encompasses all of the LPARs in your environment. You’ll project what workloads can be expected to do over the expected lifespan of the new hardware and estimate the performance gains the new hardware will bring. You’ll give management every reason to take your thoroughly researched recommendations seriously.

And now for one more scenario that highlights one more benefit of consulting performance data: Your enterprise is looking to add new workload to the physical hardware and is considering consolidating some other workloads from other data centers. Where is the best place for this new workload to land? Can existing servers handle additional memory or CPU, or should new adapters be brought in? Is ordering all new server hardware the right answer?

Performance Monitoring Tools

Various products―some fee-based and others that come at no cost―can be deployed to monitor performance and alert you to potential problems. Tools can certainly save you the effort of looking up rperf numbers, creating spreadsheets and guessing.

Cluster management

* Nutanix running on Power is an offering for customers that use hyperconverged servers, which debuted earlier this year. Nutanix allows you to run capacity planning reports directly from Prism. The reports, which include graphs and charts, inform you of your capacity usage and projected growth requirements, and are designed to help you manage your cluster resources.

* Ganglia is a cluster monitoring tool that’s designed for AIX high-performance computing (HPC) environments.

https://www.ibm.com/developerworks/community/wikis/home?lang=en_us#!/wiki/Power+Systems/page/Ganglia

Scripts and software

* nmonchart is a Korn shell script for AIX or Linux. It converts nmon collected files to HTML and displays more than 50 AIX and Linux performance graphs and configuration details.

http://nmon.sourceforge.net/pmwiki.php?n=Site.Nmonchart

* lpar2rrd is free software:

http://www.lpar2rrd.com/

The tool offers you end-to-end views of your server environment and can save you significant money in operation monitoring and by predicting utilization bottlenecks in your virtualized environment. You can also generate policy-based alerts, capacity reports and forecasting data. The tool supports IBM Power Systems* and VMware* virtualization platforms. It is agentless (it receives everything from management stations like vCenter or the HMC). The collected data set can be extended with data provided by the OS agents or NMON files.

Vendor Tools

Note that I don’t endorse any of the products listed here, but these commercial solutions are certainly worthy of your consideration.

Galileo Performance Explorer

Help/Systems Robot Monitor

Midrange Performance Group Performance Navigator 

In addition, IBM has its performance management product, as well as PowerVP. You could even just activate topas or nmon recording on each of your LPARs.
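If you go the nmon route, the arithmetic behind a day-long recording is simple enough to sketch in Python. The function below only builds the command line; it doesn’t run anything. The -f, -s and -c flags are the usual nmon options for file output, snapshot interval and snapshot count. The 5-minute default and the function name are my own choices for illustration.

```python
import shlex
from typing import List

SECONDS_PER_DAY = 24 * 60 * 60


def nmon_daily_args(interval_seconds: int = 300) -> List[str]:
    """Build an nmon command line that records one full day to a file."""
    if interval_seconds <= 0 or SECONDS_PER_DAY % interval_seconds != 0:
        raise ValueError("interval must be a positive divisor of 86400 seconds")
    snapshots = SECONDS_PER_DAY // interval_seconds
    # -f writes to a file instead of the screen, -s is the number of seconds
    # between snapshots and -c is the number of snapshots to take.
    return ["nmon", "-f", "-s", str(interval_seconds), "-c", str(snapshots)]


if __name__ == "__main__":
    print(shlex.join(nmon_daily_args()))
```

Scheduling a command like this once a day on each LPAR is one simple way to start building a history; check the nmon help on your own level of AIX before relying on any flag.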

Setting a Course of Action

Once you choose a product or tool, you then need to decide what you want to accomplish. Is your focus going to be performance monitoring or capacity planning? Are you most interested in graphs and dashboards? Do you want to see trends?

During my discussion with Randy Watson, he mentioned that ultimately, most customers will use performance data either to conduct some kind of server sizing or to implement workload consolidation (scenarios 2 and 3 from earlier). You’ll need to collect data for a reasonable amount of time in order to make any useful projections, so the sizing process in particular can take a while if you haven’t previously collected data.

According to Randy, the size of the data that MPG’s product generates on local disk varies depending on the number of LUNs in the environment, but with 5-minute intervals, 1-5 MB per day can be expected. He said that MPG tries to manage its customers’ historical consolidated files by only keeping 90 days of disk data. In addition, they delete about 20 percent of daily file size by removing redundant data (e.g., configuration data that doesn’t change). A year’s worth of data should be kept by default. For most customers, that amounts to less than 1 GB of data for that consolidated file.
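For a back-of-the-envelope feel for those numbers, here is a small Python sketch. It takes the 1-5 MB per day range and the roughly 20 percent reduction from Randy’s description; the function name and the simple multiplication are my own simplification, not MPG’s actual method, and the sketch does not model the 90-day limit on disk data.

```python
def yearly_footprint_mb(daily_mb: float,
                        days_kept: int = 365,
                        redundant_fraction: float = 0.20) -> float:
    """Estimate the retained size in MB after stripping redundant data."""
    if not 0.0 <= redundant_fraction < 1.0:
        raise ValueError("redundant_fraction must be in [0, 1)")
    return daily_mb * (1.0 - redundant_fraction) * days_kept


if __name__ == "__main__":
    for daily in (1, 5):
        print(f"{daily} MB/day -> about {yearly_footprint_mb(daily):.0f} MB per year")
```

The low end works out to roughly 292 MB per year and the high end to roughly 1,460 MB. Because the sketch ignores the 90-day pruning of disk data, it will tend to overestimate, which fits with most customers landing under 1 GB.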

I’m sure other vendors take similar approaches to keep a handle on the amount of data being collected. Then again, with the large disk sizes that are available now, spending a reasonable amount of capacity on historical performance data shouldn’t break anyone’s budget.

Maintaining uptime is important, as is planning for the future. Not only do you need to keep your servers running, you must proactively ready them for what lies ahead. Are you doing your part?

IBM Debuts Hyperconverged Servers

Edit: Have you tried this yet?

Originally posted August 2017 by IBM Systems Magazine

In May, IBM announced it was partnering with Nutanix to “bring new workloads to hyperconverged deployments.” In July IBM unveiled two new hyperconverged systems. So what does IBM’s move into the hyperconverged infrastructure market mean? For that matter, what is a hyperconverged infrastructure?

https://www.youtube.com/watch?v=5R8l81K8UB8

Per Wikipedia, a hyperconverged infrastructure describes systems that virtualize everything. It includes a hypervisor, software-defined storage and software-defined networking. It will typically run on commodity hardware.

This would be different from the IBM Power Systems servers that I’ve used over the years. In those environments, the machines connect to a storage area network (SAN) via fibre channel adapters. Although PowerVM gives me a great hypervisor and access to an internal network switch, a hyperconverged cluster of servers has direct-attached disks, and the servers communicate over a 10G Ethernet network, sans a SAN. Seriously, no SAN is involved.

So why is IBM interested in Nutanix? Their claim is that they are able to make your underlying infrastructure invisible. They have also been growing by leaps and bounds over the past few years.

It’s very possible that you are already running—or at least thinking about running—an x86-based Nutanix cluster. Historically, Nutanix clusters would run on x86 hardware from Nutanix, Dell, HP or Lenovo. You would set up your cluster and choose your hypervisor: ESXi, Hyper-V or Nutanix’s free hypervisor, AHV, which is based on CentOS KVM.

As noted, IBM has two new servers, the CS821 and CS822, which run the Nutanix software. They’re available in a few different hardware configurations.

The CS821 is model 8005-12N. It’s a 1U server that has 2×10 core 2.09 GHz POWER8 CPUs with up to 160 threads, 256G memory and 7.68 TB of flash.

The CS822 is model 8005-22N. It’s a 2U server that has 2×11 core 2.89 GHz POWER8 CPUs with up to 176 threads, 512G memory and 15.36 TB of flash.

Now, under the IBM-Nutanix union, you have a choice when it comes to the processor: POWER or x86. The CS821 and CS822 servers run AHV, and the virtual machines running on top of the hypervisor are running Linux on Power. AIX and IBM i aren’t supported as virtual machines at this time.

Nutanix handles all cluster management through its Prism product. The management interface is accessible via browser, command line, shell, etc. You mix and match your clusters based on the hypervisor you pick, and run them all through the same instance of Prism (although you would have to drill down to manage each cluster individually). With the CS821 and CS822 machines, this means that your new POWER based cluster will appear in Prism as just another cluster that happens to be using a different processor. You won’t be able to mix and match POWER and x86 nodes in the same cluster, but you can still manage a POWER cluster in much the same way as you’d manage an environment of existing x86 clusters.

What exactly do you gain by running Nutanix software? For starters, it’s an established product that’s scalable, reliable and distributed. The storage layer is handled by the Acropolis Distributed Storage Fabric (ADSF), which determines where to store your data on disk. Since a minimum cluster consists of three nodes, out of the box you will have resilience as the data gets copied―locally, and also to at least one other node, depending on the resiliency factor you choose and how many nodes are in the cluster.

ADSF is designed for virtualization. It handles tiering across your spinning hard disks, SSDs, etc., and, as your VMs relocate to different hosts in the cluster, it will take care of getting the hot data to the right node. In addition, ADSF handles snapshots, clones, deduplication and compression.

You can set up replication factors for your storage depending on how many nodes you have in your cluster. For example, choosing RF2 keeps two copies of your data and will allow for one node in your cluster to fail. RF3 keeps three copies and will allow for two nodes to fail.

When it’s time to grow your cluster because you need more CPU, memory or disk, just add another node. It’s seamlessly discovered and integrated.

For an in-depth look at the technical specifications of the product, I recommend the Nutanix Bible.

Part 1 gives a brief history of infrastructure and discusses the problems that Nutanix is trying to solve. Part 2 primarily covers Prism, the basics of the GUI and navigation, upgrading your cluster and accessing I/O metrics. There are screen shots. In addition, there’s a capacity planning feature that includes details about projections of when it might make sense to add nodes based on the current and predicted workloads.

Part 3 is the book of Acropolis, the storage, compute and virtualization platform. Acropolis is “a back-end service that allows for workload and resource management, provisioning, and operations…This gives workloads the ability to seamlessly move between hypervisors, cloud providers, and platforms.” Included is a visual comparison of the Acropolis and Prism layers. Another image shows a typical node. That’s followed by a visual of how a cluster looks with the nodes linked together.

Different Nutanix components are defined, including:

  • Cassandra, the metadata store
  • Zookeeper, the cluster configuration manager
  • Stargate, the I/O manager
  • Curator, MapReduce cluster management and cleanup
  • Prism, the UI and API
  • Genesis, the cluster component and service manager
  • Chronos, the Job and Task Scheduler
  • Cerebro, Replication / DR manager
  • Pithos, vDisk configuration manager
  • Acropolis Services, handles task scheduling, execution, etc.
  • Dynamic Scheduler, makes VM placement decisions

Finally, you can see how Nutanix handles the different levels of potential failure, including disk and node failures.

There’s much more, and the document continues to be updated. If you read through the Nutanix Bible, I think you will have a very good understanding of the platform and how it differs from other cluster solutions you’ve used.

As you continue to plan for updates to your data center, you should really give IBM Hyperconverged Systems powered by Nutanix a closer look.

Project Monocle Will Simplify Patching

Edit: Some links no longer work.

Originally posted July 2017 by IBM Systems Magazine

How do you go about determining what fixes you need for your system to remain up to date?

Do you use FLRT or FLRT LITE?

Do you just log into IBM Fix Central and start looking at what is available?

What if you had a dashboard that you could log into that showed you your system names across your whole environment, along with the current level of firmware and which AIX and VIOS OS version is running? What if it also showed you the recommended versions to upgrade to? What if you could also see the machine type and serial along with the IP address of your LPAR or frame? Would you be interested in getting a tool like this running in your environment?

What if it had a dashboard that gives an overview of what needs to be updated and what systems are up to date? And what if that tool allowed you to create and share plans with other stakeholders in your organization to help with change management planning? Moreover, what if it allowed you to filter on OS, or firmware, or VIOS? What if you could choose the types of machines, or the current levels you wanted to drill down on?

I was recently given access to a demo version of Project Monocle, a tool that provides all of the functionality I described above, and I have to say I am very impressed with it. I look forward to getting it up and running so that I can do further testing in my environment. Right now, the tool is available at no charge as a technology preview, so I would suggest reaching out to the team ASAP so you can try it out for yourself. In order to get tools like this created and IBM resources assigned to work on them, users need to let IBM know what improvements they want to see. This is an example of a tool that can help simplify the lives of Power Systems administrators.

There were some interesting blog posts written about the creation of the project, including these:

http://www.jaredcrane.com/ibm-project/

http://www.stefanieowens.com/project-monocle/

“This is the best story of design at IBM in the last three years… The team came to them saying they need one-click firmware updates for Power Systems… By doing field research, they found out that this was a human problem, not a system problem. Not only did research inform what they built, but what they built was beautiful itself as well.” - Phil Gilbert, General Manager of Design at IBM

The blog posting continues with: “The Monocle team was tasked to explore the field of updates and upgrades that are mission critical to keeping all of IBM Power Systems server products secure. Think: these servers are those that run major data centers around the world that are the backbones of credit card companies, major retailers, and even governments in some instances. The sponsoring product team originally came to us asking for a ‘one-click update’ for all Power Systems. Through user research, we discovered that a one-click update was actually not the right way to go.

“We found that the current process of updating servers is rigorous and time intensive. It causes headaches for enterprises to not only find the appropriate fixes their servers need, but then it’s even worse to actually schedule downtime on the servers to fix the issues and report those repairs for security compliance purposes. It’s next to impossible to automate this process; in fact, automating it could even make matters worse! Even after that process is completed, an enterprise still has to report on its security patching in order to maintain compliance with industry regulations. As one of our Sponsor Users identified during an interview, planning and managing security patches and updates is not only a pain to perform, but also to report on.

“Not only does the security patching process take a lot of effort, but it is vitally important to the safety and security of the enterprise data; one mistake here could take an entire organization down.”

Project Monocle is also described here, along with some screen shots and more details:

https://www.ibm.com/developerworks/community/wikis/home?lang=en_us#!/wiki/Power Systems/page/Monocle Patch Management

“We recognize that problem in IBM and set out to make your life easier. You may have heard of Project Monocle, but if not, it is a zero installation web application technology preview, providing a consolidated view of your inventory with the ability to drill down and view patch compliance. It actually works off existing IBM technologies such as the Technical Support Appliance (TSA), Fix Level Recommendation Tool (FLRT), and Fix Central. If you do not currently have TSA, contact the development team at bmonocle@us.ibm.com to get started. If you do have TSA and are interested in using Monocle send an email also to bmonocle@us.ibm.com and ask to get connected to Monocle.

“Finding the patch you need can be a daunting task. Project Monocle uses Fix Central and FLRT to provide you with recommended levels for each system type: AIX, IBM i, Firmware, VIOS, and HMC. Compare recommended and latest versions to see which is right for your environment. You can even see all of the APARs that are part of each update/upgrade to see how they’ll affect your systems. Need to build a report for your internal review teams? No problem, that data is all there at your fingertips…”

There is more information about the technology preview that is available here:

“Project Monocle is a zero installation web application, providing a consolidated view of your inventory with the ability to drill down and view patch compliance.

“This Technology Preview provides the opportunity to use Monocle at no charge. All customers who have, or are eligible to get, the Technical Support Appliance (TSA), can gain access.

If you do not currently have TSA, please contact the development team at bmonocle@us.ibm.com to get started.

If you are not familiar with TSA, you can get more information and watch informative videos here.

“Benefits: IBM Technical Support Appliance (TSA) helps you:

-Streamline IT inventory management by intelligently discovering inventory and support-coverage information for IBM and non-IBM equipment

-Improve technical support management with analytics-based reports and collaborative services

-Mitigate costly IT outages via operating system and firmware recommendations for selected platforms

“How it works:

-Configure TSA to discover basic support-related information such as hardware inventory, code levels, virtual machines, and OS information from designated devices.

-Inventory information is shared with IBM TSS using security-rich transmission protocols.

-IBM uses advanced analytics and worldwide support knowledge to help identify code currency and support contract vulnerabilities.

-Continuously collaborate with your IBM TSS focal point.”

Imagine your workflow in this new environment. TSA is gathering information about your machines from your HMC, and it is sending that information to IBM. You are then able to log in to Project Monocle and see almost real time information about your systems and how current your environment is as far as patching is concerned.

Imagine you are now able to set up upgrade plans, share those plans with others, let them approve or deny them, and have an audit trail of the plan and subsequent decisions that is available for review by interested parties.

Patching is a critical component of maintaining your systems, and this is a tool that can simplify the data gathering and decision making. You will be able to tell in an instant which systems need to be patched and what level they should be running, all from one screen.
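
To make the idea concrete, here is a minimal Python sketch of the kind of comparison such a dashboard performs. It is not how Project Monocle is implemented. The system names, the levels and the `needs_update` helper are all made up for illustration, and it assumes AIX levels written in the `oslevel -s` style so that they can be compared field by field.

```python
def level_key(level: str) -> tuple:
    """Turn a level such as '7200-01-02-1717' into a tuple of ints for comparison."""
    return tuple(int(part) for part in level.split("-"))

def needs_update(inventory: dict, recommended: str) -> list:
    """Return the systems whose current level is older than the recommended one."""
    target = level_key(recommended)
    return sorted(name for name, level in inventory.items()
                  if level_key(level) < target)

# Hypothetical inventory: system name -> current AIX level
inventory = {
    "lpar1": "7200-00-03-1642",
    "lpar2": "7200-01-01-1642",
    "lpar3": "7200-01-02-1717",
}
for name in needs_update(inventory, "7200-01-02-1717"):
    print(name, "is behind the recommended level")
```

A real tool would also have to track firmware, VIOS and HMC levels, each with its own naming scheme, which is where the inventory data gathered by TSA comes in.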

Additional Resources:

TSA solution overview http://www.ibm.com/services/us/en/it-services/technical-support-services/technical-support-appliance/

Download the TSA image file and setup guide: https://www.ibm.com/support/fixcentral/swg/selectFixes?parent=ibm~Other%2Bsoftware&product=ibm/Other+software/Technical+Support+Appliance&release=All&platform=All&function=all

The Great Debate: AIX Versus Linux

Edit: I talk about this all the time.

Originally posted May 2017 by IBM Systems Magazine

In computing circles that I’m involved with, the debate rages on: AIX versus Linux. Administrators wonder, “Why would anyone want to keep running an OS that’s supported by one single vendor? Why wouldn’t you want to move everything to the shiniest and newest operating system and get away from ‘legacy’ enterprise computing?”

The odds are high that most AIX administrators have used both AIX and Linux and are well versed in both. Those that have a background in both OSs are better able to have informed discussions about the pros and cons of the environments compared to someone that has never used AIX, but that certainly doesn’t stop them from having an opinion.

Similar debates have raged on for years within the mainframe community. I’ve lost track of the number of times someone has declared the mainframe dead when a new technology comes along. But when you look at the volume of critical transactions that still happen on the mainframe, it’s hard to believe that it’s going away any time soon.

AIX Advantages

The AIX debate is a little bit trickier. In many cases, it’s easy to port away from AIX and run on Linux or Windows. I may be a dying breed, but I still think that AIX is the premier UNIX flavor that is available today, chiefly because the hardware and OS have been coupled together to provide enterprise level reliability, availability and serviceability.

This isn’t an OS that’s running in a legacy or maintenance mode. The latest version of AIX, 7.2 TL0, was released in December of 2015, and AIX 7.2 TL1 was released in November of 2016. One of the highlights with TL1 is the ability to install service packs and technology levels without rebooting. The platform prioritizes high levels of uptime for critical workloads, and is well-suited for environments where downtime costs real money and reliability is a must.

For example, instead of bolting on a software-based hypervisor, POWER systems natively have hypervisors built into the hardware. By using VIOS and AIX together with the POWER hardware, you have an integrated stack that comes from one company. If something goes wrong, it’s much easier to get help from that single vendor. I’m not opposed to running Linux workloads; I just think that AIX is a more mature and robust OS. If given the opportunity to run Linux, I would consider POWER as a candidate to run my Linux workloads.

Breaking Down the Differences

It was interesting to replay the presentation that Andrew Wojnarek made to the Philadelphia Linux User Group on April 11. It’s nice to see that I’m not the only one who thinks that there are real advantages to the AIX environment.

Wojnarek supports a large fleet of machines, roughly half AIX and half Linux, and he says he has a pretty good feel for what it is like to administer both environments. He goes through the basics of AIX and why you would run it. Some of his arguments in favor of AIX include things like standardization: you can run the same OS on small servers and huge enterprise servers. Compare that to the subtle differences you will find between Red Hat, SUSE, Debian, Ubuntu, etc.

He reminds us that when we are working with AIX we are in a ‘walled garden.’ He points out that there’s a standard way of doing things with standard tools and commands. He talks about the built-in Logical Volume Manager, and the ability that we have with JFS2 to both increase and decrease the size of filesystems while they are online, which can be problematic with other filesystems on Linux depending on the type of filesystem that you are running. He talks about mksysb, the built-in tool to make backups that can be used to restore your server, either to the same hardware you took the backup from or to other hardware in your environment.

Device handling is a breeze on AIX. In Linux you have to echo values and edit files, whereas in AIX you just chdev a device. To discover something new, you run cfgmgr. To list attributes you run lsattr. Things are just easier and more consistent.
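
As a rough illustration of that consistency, the Python sketch below builds the three commands for a disk. The device name `hdisk4` and the `queue_depth` value are placeholders and the helper names are hypothetical. The commands are only printed unless you opt in to running them on an AIX host, where changing attributes normally needs root and may require the device to be idle.

```python
import subprocess

def lsattr_cmd(device: str) -> list:
    """List the effective attributes of a device: lsattr -E -l <device>."""
    return ["lsattr", "-E", "-l", device]

def chdev_cmd(device: str, **attrs) -> list:
    """Change one or more attributes: chdev -l <device> -a name=value ..."""
    cmd = ["chdev", "-l", device]
    for name, value in attrs.items():
        cmd += ["-a", f"{name}={value}"]
    return cmd

def cfgmgr_cmd() -> list:
    """Scan for and configure newly attached devices."""
    return ["cfgmgr"]

def run(cmd: list, execute: bool = False) -> None:
    """Print the command; run it only when execute=True (AIX hosts only)."""
    print(" ".join(cmd))
    if execute:
        subprocess.run(cmd, check=True)

run(cfgmgr_cmd())
run(lsattr_cmd("hdisk4"))
run(chdev_cmd("hdisk4", queue_depth=32))
```

The same three verbs apply whether the device is a disk, an adapter or a network interface, which is the point being made above.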

Wojnarek’s presentation isn’t an AIX love fest, however. He does discuss what he dislikes about the OS, and there’s a good discussion toward the end with the user group members. I recommend you watch the replay.

Some other advantages of AIX that weren’t in the presentation include the ability to use alt_disk_copy and alt_disk_upgrade to have online copies of your rootvg and to actually upgrade your running OS, which you can activate the next time you reboot. If you run into problems, you just reboot from the original set of disks.

Moreover, AIX has the advantage of having IBM PowerHA high availability software integrated into the OS at the kernel level and mainframe heritage virtualization baked into the hardware, not as an add-on hypervisor. AIX on enterprise hardware has built-in error reporting and diagnostics, and when call home is enabled, we might find an IBM CE dispatched to fix a problem before we even knew anything about it.

Consider Your Needs

Instead of all the arguing about which OS is better, sometimes it is worth stepping back and thinking about who is using it and why. Why do they want uptime and reliability? Why is it worth paying for hardware and software, compared to getting commodity hardware and a virtualization solution?

I’ve heard some great analogies over the years, including this one: Both a kayak and a container ship are seafaring vessels. One is better suited for taking large amounts of cargo across long distances. The smaller solution might get the job done, but you want to find and use the method that is suitable to the job at hand. Nobody would balk at spending more money on a container ship if that was the best solution. The same should hold true in the computer room.

Of course, there are some disadvantages with AIX. Perhaps you want to run the same flavor of Linux on your desktop and server—you can’t do that with AIX. Or maybe you want to learn AIX, but you don’t have access to education or hardware. The IBM Academic Initiative helps to fill the education void, but access to hardware is a legitimate barrier to those that want to learn more about the platform.

It can seem harder for someone to learn ksh if all they ever knew was a Windows or MacOS GUI and bash on Linux. There’s a learning curve with AIX, but that’s true of any OS—it takes time to become proficient.

I know the world loves Linux, but there are still many of us out here who love AIX. Linux users would be well-served to objectively listen to the key points in this never-ending debate to see if the advantages that AIX users take for granted might benefit their environments.

How to Download Fixes

Edit: Still a good post.

Originally posted April 2017 by IBM Systems Magazine

I still find customers that are unsure of how to download fixes, so I want to cover the steps that I use when I download fixes for AIX 7.2 as an example use case.

When I download fixes from IBM, I go to IBM Fix Central.

As it states on the website, “Fix Central provides fixes and updates for your system’s software, hardware, and operating system.”

You can either find or select a product from that initial landing page. In my example, I am going to find the product. I search for AIX, and I select version 7.2 fixpacks as you can see in Figure 1 below.

Figure 1

After clicking on continue, I decide I want to get the level 7200-00-03-1642, so I select that Service Pack, as in Figure 2 below.

Figure 2
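
The level string itself encodes useful information. In the format reported by `oslevel -s`, it is made up of the base release, the technology level, the service pack and a build date given as a two-digit year and week. A small Python sketch that splits it apart (the class and field names are mine, not IBM's):

```python
from typing import NamedTuple

class AixLevel(NamedTuple):
    release: str       # e.g. '7200' for AIX 7.2
    tech_level: int
    service_pack: int
    build_year: int    # two-digit year
    build_week: int

def parse_level(level: str) -> AixLevel:
    """Split a level such as '7200-00-03-1642' into its four fields."""
    release, tl, sp, yyww = level.split("-")
    return AixLevel(release, int(tl), int(sp), int(yyww[:2]), int(yyww[2:]))

lvl = parse_level("7200-00-03-1642")
print(f"AIX {lvl.release[0]}.{lvl.release[1]} TL{lvl.tech_level} SP{lvl.service_pack}, "
      f"built year {lvl.build_year} week {lvl.build_week}")
```

Read this way, the level selected here is AIX 7.2, technology level 0, service pack 3, built in week 42 of 2016.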

I then select continue in order to proceed.

IBM has restricted operating system fixes to machines that have current maintenance agreements with IBM, so I need to enter the machine type and serial number of the machine that I am going to install the fixes on. (See Figure 3 below; click to view larger.)

Figure 3

After putting in the correct information and selecting continue on that page, the website comes back and has me agree to terms and conditions before I am able to download the files. (See Figure 4 below.)

Figure 4

After clicking on the ‘I agree’ option, I can go ahead and download the files using Download Director. If I choose this option, they will be downloaded directly to my workstation. However, there may be a need for me to download the fixes directly to a machine in my computer room. This option will assume that the machine in question has internet access.

It can be a tedious process to download many gigabytes of files to my laptop, then turn around and move those same files to a machine in the computer room, especially if the option exists to perform this operation in one step. This is especially so in an environment where I may have relatively fast download speeds, but my upload speeds into my computer room are constricted. Some admins may find that this is the case when they are working from home, or when their office WAN connection is not very fast. Instead of spending all of that time moving files around, many times I prefer to create disk space on a server, and download the fixes directly to that server. This is very useful if the computer room has a fast internet connection. (See Figure 5 below.)

Figure 5

On the far right side of the screen I am able to find a section where I change my download options. My options will consist of using Download Director, using bulk FTPS, using HTTPS in my browser, or ordering the fixes on physical media and having IBM send them to me so that I can load them into a DVD drive (or other optical device) and use them that way. (See Figure 6 below.)

Figure 6

Ordering the fixes on media from IBM can be a good choice: Having the media on hand can make it pretty easy to find a particular level of AIX or AIX fixes over time. This of course assumes you have a good system of tracking your physical media. It can also be a good choice if you have limitations on your internet speeds or connectivity, and can be useful in a disaster recovery scenario or other recovery situation that might involve bootable media. The downside to this option is that now IBM charges customers for using this option, along with the delay that it may take to get IBM to ship your fixes, so many customers choose to download them instead.

Download Director and HTTPS in my browser will both save the files to my local workstation, but in this scenario that is not the way I want to obtain the fixes. In the past, we were able to select bulk FTP as a download option, but now there is a relatively new change where we have to use bulk FTPS instead.

After selecting bulk FTPS, I get my order number, the number of files, the total size I am going to be downloading, along with the name of my FTPS server and the user ID and password I should use for my download. There are also FTPS hints. There is a statement that informs us that on AIX clients we should use ‘ftp -s’ to start the FTPS session, and then enter passive mode immediately. Their example has us run the commands

ftps> passive
ftps> binary
ftps> mget *

This should be familiar to you if you used bulk ftp as a download option in the past. (See Figure 7 below; click to view larger.)

Figure 7
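
If you would rather script the transfer than type the commands interactively, the same steps can be sketched with Python's standard `ftplib` module. This is only a sketch under assumptions: it assumes the server accepts explicit FTPS (which is what `FTP_TLS` speaks), that the order's files sit in the login directory, and that Python is available on the machine doing the download. The function names and the destination path are placeholders, and the host, user ID and password are whatever the Fix Central order page gives you.

```python
import os
from ftplib import FTP_TLS

def fetch_all(ftps, dest_dir: str) -> list:
    """Download every file in the current remote directory (like 'mget *')."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for name in ftps.nlst():
        local_path = os.path.join(dest_dir, os.path.basename(name))
        with open(local_path, "wb") as fh:
            ftps.retrbinary(f"RETR {name}", fh.write)  # binary transfer
        saved.append(local_path)
    return saved

def download_order(host: str, user: str, password: str, dest_dir: str) -> list:
    """Log in over FTPS, switch to passive mode and fetch the whole order."""
    ftps = FTP_TLS(host)          # TLS is negotiated on the control connection at login
    ftps.login(user, password)
    ftps.prot_p()                 # encrypt the data connection as well
    ftps.set_pasv(True)           # passive mode, as the hints suggest
    try:
        return fetch_all(ftps, dest_dir)
    finally:
        ftps.quit()

# Example call, using the values shown on the order page:
# download_order("<ftps server>", "<user id>", "<password>", "/export/fixes")
```

Keeping the file loop separate from the connection setup makes it easy to try the logic against a stand-in object before pointing it at the real server.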

So why was there a change by IBM to FTPS? This FAQ can help provide some answers. In short, the change allows for encrypted communication and secure bulk FTP download.

I am sure many of my readers are well aware of these options to download fixes; however, I still find customers that were not aware of the change from FTP to FTPS. It is not a huge modification, but it is just enough of a change that we need to remember to do things just a little bit differently when we plan to obtain our next set of fixes for our next maintenance window.

Tools for Documentation

Edit: Some links no longer work.

Originally posted August 2015 by IBM Systems Magazine

Back in 2012, I wrote a blog post titled “The Case for Documentation.” Then just recently, a reader made a comment:

“I see in 3 years there has not been a single comment on this article. I’ve been so deep into Power and AIX, like you mentioned, walking around with knowing all there is to know in and around the environment I look after. Finally there has been an official request submitted to document PowerHA clustered environments and other smaller ones. I am so much in the thick of it that I start off and end up with a too technical visio drawing or veer off track in explaining an area. Have you got a guideline or template to give me an idea how I can start and get to finish a fairly short and sweet “walk through” document that is just informative enough to satisfy those at management level or even my specialized level?”

Let me address this with some of my favorite tools to document Power Systems. Some of these tools have “prettier” output than others, but I think they’re all valuable when it comes to documenting your running systems.

PowerHA Tools

The original question was PowerHA specific, so let’s start with the PowerHA snapshot tool. The cluster snapshot tool lets you save and restore cluster configurations by saving to a file a record of all the data that defines a particular cluster configuration. Then, you can recreate a particular cluster configuration, provided the cluster is configured with the requisite hardware and software to support the configuration. This snapshot tool can also make remote problem determination easier because the snapshots are simple ASCII files that can be sent via e-mail.

You can also use the PowerHA-specific qha and qcaa scripts. These are real-time tools that you can use with your running systems more than as a deliverable, but they’re still valuable. Alex Abderrazag has provided a nice script to help you understand cluster manager internal states.

HMC Scanner

When it comes to documenting the way my servers have been configured, I like to use HMC Scanner. HMC Scanner gives you a nice summary spreadsheet with almost anything you want to know about your environment, including serial numbers, how much memory and CPU are free on your frame, how each LPAR is configured, information on VLANs and WWNs, and much more. I did a video on running HMC Scanner and IBM’s Nigel Griffiths has also posted a video on HMC Scanner for Power Systems. HMC Scanner works for AIX, IBM i, Power Linux and VIOS LPARs/VMs.

System Planning Tool

I also like to use the IBM System Planning Tool (SPT), which I blogged about in “Configuring Your Machine Before it Arrives” and which you can find on the IBM support tools website. The SPT provides nice pictures of the machines showing which slots are populated and assigned to which LPARs.

If you’re comfortable with the command line, you can manipulate sysplans with the following commands, which may be easier than going into the GUI to do the same functions:

lssysplan
rmsysplan
mksysplan
cpsysplan

viosbr

For VIO server-specific documentation, I like to use viosbr. After you’ve taken a backup, run:

viosbr -view -file

This provides a lot of information to document the setup of your VIO server. It will show your controllers, physical volumes, optical devices, tape devices, Ethernet interfaces, IP addresses, hostnames, storage pools, optical repository information, ether channel adapters, shared Ethernet adapters, and more.

snap -e

AIX-specific commands would include snap -e, which lets you gather a great deal of system information and run custom scripts to include other information with your snap. This tool is often run in conjunction with support to collect the information they need to help resolve issues with your machine.

prtconf

Another worthwhile command is prtconf. This command gives you information like model number, serial number, processor mode, firmware levels, clock speed, network information, volume group information, installed hardware, and more.
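
Because the top of the prtconf output is a series of "Label: value" lines, it is easy to turn into structured data for a documentation script. A minimal Python sketch follows. The sample text is illustrative only (the serial number and firmware level are made up), and on a real system you would feed in the actual command output instead.

```python
def parse_prtconf(text: str) -> dict:
    """Collect the top-level 'Label: value' lines from prtconf output into a dict."""
    info = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        # Skip headings with no value and indented detail lines
        if sep and value.strip() and not line.startswith((" ", "\t")):
            info[label.strip()] = value.strip()
    return info

# Illustrative sample only; serial number and firmware level are made up
sample = """System Model: IBM,8284-22A
Machine Serial Number: 0000000
Processor Type: PowerPC_POWER8
Number Of Processors: 4
Firmware Version: IBM,SV860_103
"""
facts = parse_prtconf(sample)
print(facts["System Model"], facts["Machine Serial Number"])
```

On an AIX host the text could come from `subprocess.run(["prtconf"], capture_output=True, text=True).stdout`, and the resulting dictionary can be written out as CSV or JSON alongside your other documentation.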

IBM i Options

For IBM i, the midrange wiki has good information about different methods you can use to gather data, including how to print a rack config from a non-LPAR system:

  1. Sign on to IBM i with an appropriate userid
  2. On a command line, perform command STRSST
  3. Select option 1, Start a service tool
  4. Select option 7, Hardware service manager
  5. F6 to Print Configuration
  6. Take the defaults on Print Format Options (use 132 columns)

HMC

In the new HMC GUI, you can select your managed server then Manage PowerVM and you have options to see your virtual networks, virtual storage, virtualized I/O, and more. This information can also be helpful in documenting your environment.

Self-Documenting Tools

I find there’s value in having systems that can “self-document” via scripts and tools compared to administrators creating spreadsheets that might or might not get regular updates as soon as changes occur. Some might find self-documenting tools don’t provide the correct information, which leaves us with the question of whether it’s better to have no documentation or wrong documentation when you’re working on a system.

Self-documenting tools are a starting point. Whatever documentation you have on hand, take the time to double-check what the actual running system looks like compared to what you think it looks like. By not assuming anything about your running systems, you can avoid creating additional problems and outages because reality didn’t match what the documentation said.
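
That double-check can itself be scripted. As a minimal Python sketch, assume you have the documented values and the values collected from the running system as two dictionaries. The keys and values below are hypothetical and `find_drift` is just an illustrative name.

```python
def find_drift(documented: dict, actual: dict) -> dict:
    """Return {key: (documented value, actual value)} for every mismatch."""
    drift = {}
    for key in sorted(set(documented) | set(actual)):
        doc_val, real_val = documented.get(key), actual.get(key)
        if doc_val != real_val:
            drift[key] = (doc_val, real_val)
    return drift

# Hypothetical values: what the spreadsheet says vs. what the system reports
documented = {"memory_gb": 64, "cpus": 4, "oslevel": "7100-03-04-1441"}
actual = {"memory_gb": 96, "cpus": 4, "oslevel": "7100-03-05-1524"}
for key, (doc_val, real_val) in find_drift(documented, actual).items():
    print(f"{key}: documented {doc_val}, actual {real_val}")
```

Anything the function reports is a place where either the documentation or the system needs attention before you start work.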

Many Different Documentation Tools

From the frame, to the OS, to the VIOS, to the HMC, there are many different pieces of your infrastructure to keep an eye on and many different tools you can use to document your environment. I’m sure readers use many other tools and I’d be interested in hearing about those. Please weigh in with a comment.

To VIOS or Not to VIOS Revisited

Edit: Still worth considering.

Originally posted September 2014 by IBM Systems Magazine

In 2010, I wrote an article that covered the pros and cons of the virtual I/O server (VIOS). It’s still a topic that I run into today, especially as more IBM i customers consider attaching to SANs. In the article, I mentioned some of the concerns customers have, including their VIO server being a single point of failure, and the new skills that are required to administer the VIO server.

I want to reinforce the idea that you can build in redundancy when you design your VIO servers to reduce single points of failure. Some customers like to have dual VIO servers on each physical frame, but you can take it further than that. You can have one set of VIO servers to handle your storage I/O, and another pair to handle your network I/O. Some customers go one step further and segregate their production LPARs onto production VIO servers, and put their test/dev LPARs onto another set of VIO servers.

More Flexibility

You have a great deal of flexibility in how you configure and set up your Power Systems servers depending on the needs of your business.

IBM has made great strides in the usability of VIOS, especially for those uncomfortable with the command line. If you truly don’t want to log in as padmin and do your work from the shell, the Hardware Management Console (HMC) GUI gets better with each new release.

When you click on the Virtual Resources section of the HMC, you have access to Virtual Storage Management, Virtual Network Management and Reserved Storage Device Pool Management. Although these options have been around for a while, some don’t realize they exist or that ongoing improvements are being made to the interface and the choices that are available.

These options continue to become more powerful. For example, when I go into Virtual Network management, I can create a VSwitch, Modify a VSwitch, Sync a VSwitch and Set a VSwitch mode. I can view my existing VLANs and my shared Ethernet adapters.

Similarly, I can manage my storage through the Virtual Storage Management GUI. Modifying which hdisks are assigned to which LPAR and modifying virtual optical disk assignments to partitions can all be handled via the GUI.

I still prefer to use the VIO command line, and I still encourage you to learn how to do it as I think you have more power and control over the system using that method, but it’s becoming less mandatory to work as padmin than it used to be.

Easier Installation

Another powerful new tool is the capability to use the HMC GUI to actually install VIOS. Instead of fooling around with physical media or setting up your NIM server to allow you to load your VIOS, you can now manage a VIOS Image Repository on your HMC, where you store the VIO optical images on the hard drive of your HMC. I was pleasantly surprised when I was shipped a 7042-CR8 HMC with the HMC V8.8.1 code on it: the VIO install media was preinstalled on the HMC hard disk.

Loading that first VIO partition onto a new system was a snap. Once I got everything properly configured on the network and defined my VIO partition via the HMC, I was able to easily load multiple VIOS LPARs by clicking on the Install VIOS radio button and filling in a few network parameters in the GUI.

This is quite a change for people who are new to Power Systems servers, or those who don’t have NIM servers or don’t know how to use NIM servers. IBM i shops may never have a NIM server in their environments so that option isn’t even available for them.

When customers purchase some of the smaller Power Systems servers and opt to get a split backplane, it can be a challenge to get their second VIO server loaded as they can’t connect their DVD to their second disk controller. Allowing for installation from the HMC greatly simplifies the deployment of VIOS, especially in new environments. Preloading the necessary code only makes it that much easier.

More Alternatives

Another development that has arisen since I first wrote that article is the widespread adoption of NPIV, which gives admins an alternative to vSCSI. The advantage is that instead of being concerned with mapping LUNs from VIOS to client partitions, you can offload some of that heavy lifting to your SAN team. Now the SAN team is able to map LUNs directly to the client LPARs that will be using them. Some SAN teams don’t care for the extra burden. At one site that made the change, they had nearly a hundred LPARs on a frame, and they had been handling the vSCSI mappings at the VIOS level. This allowed the SAN team to map a great many LUNs to relatively few WWNs. Once they migrated to NPIV, this burden shifted, and the SAN team was less than thrilled about it.

Comfort and Choice

The debate will continue, but the resistance seems to have lessened somewhat around the deployment of VIOS. As more shops get comfortable with the technology and more people spread the word, there is less fear around using this method to share adapters across many LPARs.

IBM continues to allow for choice in how you build your machines. I still know of customers that don’t virtualize anything and instead have dedicated CPUs and adapters for each LPAR. This type of setup is becoming rarer as companies realize all of the benefits of virtualizing their environments using VIOS.

IBM Delivers With POWER8

Edit: And now I wait for POWER10.

Originally posted April 2014 by IBM Systems Magazine

POWER8 technology created some buzz when it was first discussed at the Hot Chips conference and slides that describe the chips could be found online before today. But now we have more information about the actual systems that will be shipping when they become generally available in June.

When you look at the Power Processor Technology Roadmap since 2004, you can see that we regularly get new, more powerful chips. We are almost spoiled. When IBM says it is going to deliver, it does just that, with both new hardware and new OS releases.

In 2004 we had POWER5 followed by POWER5+. In 2007 we had POWER6, which led to POWER6+. In 2010 we had POWER7 and the most current, POWER7+. In 2014 we have POWER8, and there are already charts that show POWER9 is being planned for the future. IBM has consistently delivered on its roadmaps.

I recently attended an education session for IBMers and business partners that covered information around POWER8 and the new IBM hardware announcements that are being made today. I am going to hit some of the highlights, but additional information will be included in future posts.

The POWER8 Chip

The POWER8 chip is another step up from what has come before. We have gone from four threads to eight threads per core. With simultaneous multithreading (SMT) enabled you can have up to eight threads running on a core, which means you can get more work done per CPU cycle.

The charts that I saw showed a linear increase in the number of transactions that could be completed when you compared SMT1 to SMT2 to SMT4 to SMT8. As you made each transition you could see the number of transactions increase. Obviously some workloads won’t benefit from SMT, but those will be the exception rather than the rule.

I also saw charts that compared I/O bandwidth and memory bandwidth on the new systems compared to older models, and the numbers were impressive. It was a significant increase that I will be discussing further in future articles.

While POWER7 technology had up to eight cores per socket, the POWER8 chip has up to 12 cores per socket. New memory controllers and memory cache on the system improve memory latency and performance.

The way the cores communicate with one another across the SMP interconnect has also improved so it takes fewer “hops” to go from one core to another in the system. The chip also boasts a direct PCIe Gen3 I/O interface for incredible bandwidth.

There is 512 KB of L2 cache per core, 96 MB of shared L3 cache and up to 128 MB of off-chip L4 cache.

Understanding the Models

How comfortable are you with the model numbers of the Power servers? If someone says 720, 770 or 795, do you have a pretty good idea what server they are talking about? With today’s announcement, how many of you were expecting to see 820 and 870 server models? This is not going to be the case. The servers are now named with four- or five-character combinations of letters and numbers. For this first announcement, the servers all start with the letter “S,” which signifies that they are scale-out servers. As time goes on I would expect to see models that start with the letter “E” for enterprise systems. The second character indicates that it is running POWER8. The third indicates the number of sockets in the server, and the fourth indicates how much rack space it takes up, for now either 4U or 2U.

For example, the S822 is a scale-out server, running POWER8 technology, with two sockets, fitting in 2U of rack space. The S824 is a scale-out POWER8 two-socket 4U server. If you see an L in the fifth position, like the S822L, then that is a Linux-only system, much like today’s 7R1, 7R2 or 7R4 servers.
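The naming scheme described above is regular enough to decode mechanically. The following is a minimal sketch, assuming only the rules stated in this article; the function and class names (decode_model, PowerModel) are hypothetical, and only the “S” prefix is handled because it is the only family announced so far.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerModel:
    family: str        # "scale-out" for the S prefix
    processor: str     # e.g. "POWER8"
    sockets: int       # third character of the model name
    rack_units: int    # fourth character (2 for 2U, 4 for 4U)
    linux_only: bool   # True when the name carries a trailing L


# Only the "S" prefix appears in this announcement; other letters are not assumed.
FAMILIES = {"S": "scale-out"}


def decode_model(name: str) -> PowerModel:
    """Decode a model name such as S822 or S822L using the article's rules."""
    name = name.strip().upper()
    if len(name) not in (4, 5) or name[0] not in FAMILIES or not name[1:4].isdigit():
        raise ValueError(f"unrecognized model name: {name!r}")
    if len(name) == 5 and name[4] != "L":
        raise ValueError(f"unrecognized suffix in model name: {name!r}")
    return PowerModel(
        family=FAMILIES[name[0]],
        processor=f"POWER{name[1]}",
        sockets=int(name[2]),
        rack_units=int(name[3]),
        linux_only=(len(name) == 5),
    )
```

Calling decode_model("S824") yields a two-socket, 4U, scale-out POWER8 entry, while an older name such as "770" is rejected because it does not follow the new pattern.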

We need to pay attention to the lettering. The L designates a system that will only run Linux. The 2U non-L models can run AIX and Linux. The 4U non-L models can run AIX, IBM i and Linux. At the time of this announcement, you cannot have an I/O drawer with PCIe slots on any of these machines, although a statement of direction indicates that this capability will be available in the future.

Here are the specs for the new servers:

  • The two socket 2U servers (S822) can have different configurations depending on whether you populate both sockets. If you have one socket populated, you can have six or 10 cores, with up to 512 GB of memory. There are six PCIe Gen3 low-profile hotplug adapters in this configuration. If you have both sockets populated, you can have 12 or 20 cores, with up to 1 TB of memory. Nine PCIe Gen3 low-profile hotplug adapters are included in this configuration. You can run PowerVM with AIX or Linux, but not IBM i on this server.
  • The S822L can have 20 or 24 cores, with up to 1 TB of memory and nine PCIe Gen3 low-profile hotplug adapters. You can run PowerVM or PowerKVM and you can only run Linux on this machine.
  • The S814 is a one-socket 4U system that can come in a 4U or tower form factor. It has six or eight cores and 512 GB of memory. You can have seven PCIe Gen3 full-high hotplug adapters, and you can run PowerVM with AIX, IBM i or Linux.
  • The S824 is a two-socket 4U server. If you populate one socket you can have the same specs as the S814, but if you populate both sockets you can get 12, 16 or 24 cores with up to 1 TB of memory. You will have 11 PCIe Gen3 full-high hotplug adapters and can run PowerVM with AIX, IBM i or Linux.

Performance

The rPerf and CPW numbers that I saw showed improvements, and I will write more about this in the future as well. IBM asked us not to share the numbers until they are audited and vetted, but I will be surprised if the improvements, especially when compared with competitors’ machines, are not as dramatic as we saw during the training sessions. It was also amazing to see how these new systems perform when comparing an S824 with a POWER5+ 595 or a POWER4 690.

Another part of the story is how this improved performance translates into needing fewer cores to do the work that you need your server to do. That means you will need to spend less to buy hardware, and you will receive better performance per dollar spent.

We will be able to perform Live Partition Mobility operations between POWER6, POWER7 and POWER8 machines, assuming we’re using the correct processor mode. We can run the LPARs in POWER6 mode, POWER7 mode or POWER8 mode. This will also make it possible to run OS versions that are not POWER8 aware assuming you are using VIOS for your I/O.

Miscellaneous Information

You can run AIX in POWER8 mode with full I/O support once you get to:

  • AIX 6 TL7 SP10
  • AIX 6 TL8 SP5
  • AIX 6 TL9 SP3
  • AIX 7 TL1 SP10
  • AIX 7 TL2 SP5
  • AIX 7 TL3 SP3
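The list above amounts to a small lookup table of minimum service packs. Here is a hedged sketch of a level check built only from those six entries; the name meets_power8_minimum is hypothetical, and technology levels that are not in the list return None rather than a guess.

```python
# Minimum service pack per (AIX version, technology level), copied from the list above.
MIN_SP = {
    (6, 7): 10,
    (6, 8): 5,
    (6, 9): 3,
    (7, 1): 10,
    (7, 2): 5,
    (7, 3): 3,
}


def meets_power8_minimum(version, tl, sp):
    """Return True or False for a listed level, or None if the TL is not listed."""
    minimum = MIN_SP.get((version, tl))
    if minimum is None:
        return None  # the article gives no information for this level
    return sp >= minimum
```

For example, AIX 7 TL1 SP10 passes the check, while AIX 6 TL9 SP2 does not.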

POWER8 support for IBM i will be available in IBM i 7.1 TR8 and IBM i 7.2, as well. We will need to be running VIOS 2.2.3.3 for POWER8 support.

There is also a new HMC model 7042-CR8 that will be available later in the year.

I should be getting my hands on some of these models shortly and will be able to share more information once I do.

These are some of the highlights that I found interesting. What are you looking forward to the most with these new systems?

Top 10 Reasons AIX Will Endure

Edit: Still good stuff.

Illustration by Paul Price

Originally posted June 2013 by IBM Systems Magazine

The AIX operating system continues to be a leader in the UNIX marketplace. AIX celebrated 25 years in 2011, and users have every reason to expect that the operating system will continue to evolve and move forward for the next 25.

Businesses of all flavors in all industries have varied experiences with the operating system. Some have been running it for many years, or even from its inception. Others are new to the environment as IBM continues to migrate clients from other UNIX or Windows platforms.

In most cases, people making the switch want an enterprise-class operating system running on enterprise-class hardware. They don’t want to answer their problems by rebooting the system. Businesses in all industries have critical workloads, and unexpected downtime is not an option—they need robust hardware that can let them know if problems are on the horizon.

They also want their hardware to call home to IBM if it has an issue. They want to call IBM support and get answers to all of their hardware and operating system questions. It’s not infrequent to hear stories of clients that didn’t even know they had a problem, but IBM support called to let them know they’d be stopping by to replace a failing power supply and no downtime would be required.

Clients deciding what criteria they’ll use when selecting servers and operating systems shouldn’t base their decision strictly on price, where the acquisition price point wins no matter what the total cost of ownership might be. They should also think about their end game and what they’re trying to accomplish. You want a high-performing processor at the heart of your hardware platform. You want what IBM calls RAS—reliability, availability and serviceability. You should also look for the satisfaction of the platform’s end users along with those who maintain the servers.

Top 10

For these 10 reasons, AIX should still be going strong for many years to come:

1 It’s easy to use. AIX clients can use command-line tools such as smitty, which is menu-driven and can help find the tasks you’re seeking without memorizing the commands and flags on the command line. The tool keeps a history of the commands run and sends the output from those commands to a log file. It can also display the actual command it ran “under the covers.” You can go into smitty, select your options, hit F6 and it will display which command will run. This also lets you automate tasks with a script. If you prefer a GUI, you can run tools such as IBM Systems Director, which can help manage an entire fleet of servers and the virtual machines running on them.

2 It’s easy to learn more about the operating system and the hardware. A great deal of information is available in the IBM Redbooks publications, freely available documents that cover hardware and software products in great detail. Additionally, many people are writing blogs, publishing articles, recording videos and sharing knowledge with one another. In a short amount of time, you can get up to speed with the various ways to use the system. Even if you’re a longtime user, you can learn more by reading the ample and ever-increasing documentation.

3 It’s easy to get support when you need it. You can call IBM and ask how-to questions, or if you run into issues, you can easily speak to experts that can help. They take snapshots, or “snaps,” of your system to help analyze it, and they have secure shared-screen sessions available to help with troubleshooting, if necessary. Since IBM develops the processor, assembles the machine and creates the operating system, it owns the stack; therefore, the company deeply understands the system your business is running.

4 Because IBM owns the entire stack, it creates the hardware and the firmware. And since it employs the developers, it can get field questions answered by the people who actually wrote the code. You can feel confident knowing that the experts who built the hardware also built the virtualization hypervisor that runs on top of it, enabling virtualization with little overhead.

5 The ecosystem is full of friendly people willing to help you learn. Many users are willing to share their expertise, and if you want to learn more, the sources are available. Training classes and conferences offer opportunities to learn directly from experts. User groups and virtual user groups let users network and learn from one another.

6 AIX just runs. Although it’s obviously recommended that you continue to update the firmware on your server and install fixes and patches to your operating system, if you were to neglect it and let it sit in a corner of your machine room, it would happily hum along with little intervention. Over the years, you’ll find many examples of clients who followed the adage “if it isn’t broken, don’t fix it,” and they just let their systems run. Ask other AIX shops how often their production LPARs go down due to the operating system or the Power Systems hardware. The answer is likely close to never.

7 As good as the platform is, it keeps getting better. IBM consistently delivers more functionality via faster hardware and more functionality from the operating system.

8 IBM provides innovations not found elsewhere in the enterprise UNIX space. Live partition mobility, or moving a running workload with no outages from one frame to another, doesn’t happen on other platforms. Active Memory Expansion allows for compressed memory, which drives higher utilization of the memory you purchased. Active Memory Sharing allows workloads to shift memory consumption between LPARs as demands for that memory shift over time. Workload partitions (WPARs) let you run multiple workloads on a single LPAR. Simultaneous multithreading allows for more work to be performed per processor core. All of these innovations keep IBM leading other vendors by a wide margin.

9 IBM has a clear roadmap. The company has predictable cycles in releasing new processors, hardware and versions. It has consistently delivered on its technology, where others have stumbled along the way.

10 IBM makes a huge investment in R&D and chip technology. This investment shows in the products that IBM sells. The company also trickles down innovations from other product lines, for example, using mainframe technologies in its midrange servers. IBM has been learning lessons in the computing field for more than 100 years, and that knowledge gets implemented in the hardware it sells. As IBM continues to innovate and invest in the product line, clients will continue to benefit by running an enterprise-class operating system for many years to come.

Getting a Handle on Entitled Capacity and Virtual Processors

Edit: Some links no longer work.

Originally posted July 12, 2012 by IBM Systems Magazine

Entitled capacity and virtual processors aren’t new to Power Systems. They frequently come into play when you’re working with shared processor pools, and multiple virtual machines (VMs) are using that shared processor pool.

However, many people struggle with these concepts, particularly individuals who are new to the Power platform due to a migration from some other flavor of UNIX.

Physical and Virtual CPUs

First, keep in mind you can never use more physical CPUs than virtual CPUs as defined in your LPAR. Even if you allocate one virtual processor to an LPAR and set it to be uncapped, you can’t run more than one physical processor because there would be no other virtual processors available.

This way, you can limit the LPARs in your shared processor pools. Even if your LPAR is uncapped and there are 16 processors available in the shared processor pool, you still won’t be able to use more than one physical CPU because you only allocated one virtual CPU.

A virtual processor can represent from 0.1 to 1 of a physical processor. If you have one virtual processor, the range it can physically consume will never be more than one. If you have three virtual processors, you can use from 0.3 to 3, but never more than three.

It makes sense, as you’re basically giving your VM the illusion that it’s dealing with a physical processor. If it boots up, and sees three virtual processors, even if it’s running on 0.3 physical processors, it won’t see more than three processors. If it’s running uncapped and wanted to use four physical processors, where would they run if there are only three virtual processors?
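The 0.1-to-1 rule scales directly with the number of virtual processors. A minimal sketch of that arithmetic follows; the function name physical_range is hypothetical, and the 0.1 floor is the figure quoted in this article.

```python
MIN_PER_VP = 0.1  # smallest slice of a physical processor one virtual processor can represent
MAX_PER_VP = 1.0  # a virtual processor never represents more than one physical processor


def physical_range(virtual_processors: int) -> tuple:
    """Return the (minimum, maximum) physical processors the LPAR can consume."""
    if virtual_processors < 1:
        raise ValueError("an LPAR needs at least one virtual processor")
    low = round(virtual_processors * MIN_PER_VP, 1)  # rounded to hide float noise
    high = virtual_processors * MAX_PER_VP
    return low, high
```

With three virtual processors the function returns a floor of 0.3 and a ceiling of 3.0, matching the example in the text.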

Complicating the Issue

Simultaneous multithreading (SMT) can confuse the issue more. With POWER7 you can have four SMT threads, so the one virtual processor you set up will appear as four logical processors in your VM. If you were to turn off SMT, you would only see one logical processor.

When you assign physical processor resources to your VM, you’re setting up your entitled capacity. No matter what the other VMs on your frame are doing, your VM is entitled to use that much physical processor. It might donate spare cycles it’s not using, but if the VM needs those cycles, it’s guaranteed to get them.

If your VM is uncapped, it can utilize excess cycles in the shared processor pool. By doing this, you might find your entitled consumption can exceed 100 percent. You might find your VM consistently runs at 300 percent of entitled capacity. Capping the VM will limit how much physical processor it can use, and you’ll never run at more than 100 percent of the entitled capacity. This is another way to limit a VM’s processor utilization—you can cap it as well as limit how many virtual processors it has.
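To tie the pieces together, here is a small sketch of how capping, virtual processors and SMT interact, assuming only the rules described above. The Lpar class and its method names are hypothetical, and the model ignores contention from other VMs in the pool.

```python
from dataclasses import dataclass


@dataclass
class Lpar:
    entitlement: float        # entitled capacity, in physical processors
    virtual_processors: int
    capped: bool
    smt_threads: int = 4      # POWER7 SMT4, as in the text; 1 means SMT is off

    def logical_processors(self) -> int:
        """Logical processors the operating system sees."""
        return self.virtual_processors * self.smt_threads

    def max_physical(self, pool_processors: int) -> float:
        """Upper bound on physical processor consumption."""
        if self.capped:
            return self.entitlement
        # Uncapped: limited by virtual processors and by the size of the pool.
        return float(min(self.virtual_processors, pool_processors))

    def percent_of_entitlement(self, consumed: float) -> float:
        """Consumption expressed as a percentage of entitled capacity."""
        return 100.0 * consumed / self.entitlement
```

An uncapped VM with one virtual processor in a 16-processor pool still tops out at 1.0 physical processor, and a VM entitled to 0.5 that consumes 1.5 is running at 300 percent of its entitlement.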

No Easy Way Out

It would be easy to say, “Great, I’ll just leave all of my VMs uncapped and configure them to use all of the physical CPUs available by setting all virtual processors on all VMs to the same number of physical processors that I have in my shared processor pool, and let them fight it out.” Although this might be a way to get started, this method can have drawbacks.

Remember to pay attention to your workload. Consider the additional context switching you’ll see if your VM is uncapped, but its entitlement is too low. Keep in mind the number of virtual processors you’ve defined. If you’ve defined eight but only use two, you might end up with additional overhead on your system. Although the system does have processor folding—in which it won’t schedule work onto the unused virtual processors in order to have better memory cache hits—it’s best to avoid defining excess virtual processors in the first place.

Also remember that your virtual processors can impact your job stream. It might make sense to have four virtual CPUs and 1.6 processing units; this would give you 0.4 processing units on each virtual CPU. With a highly threaded workload, this might be optimal. However, if you have two virtual CPUs and the same 1.6 processing units, each virtual CPU gets 0.8 physical CPUs to utilize. Depending on the workload, this scenario might make more sense.
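The trade-off in that paragraph is simple division, sketched below. The function name units_per_virtual_cpu is hypothetical, and the 0.1 to 1.0 bounds per virtual CPU are the ones quoted earlier in this article.

```python
def units_per_virtual_cpu(processing_units: float, virtual_cpus: int) -> float:
    """Processing units backing each virtual CPU, rounded to two decimals."""
    if virtual_cpus < 1:
        raise ValueError("at least one virtual CPU is required")
    per_vp = round(processing_units / virtual_cpus, 2)
    # Each virtual CPU must represent between 0.1 and 1.0 of a physical processor.
    if not 0.1 <= per_vp <= 1.0:
        raise ValueError(f"{per_vp} units per virtual CPU is outside 0.1 to 1.0")
    return per_vp
```

Spreading 1.6 processing units over four virtual CPUs gives 0.4 each; over two virtual CPUs it gives 0.8 each.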

Monitor your workload and try to match up your real-life workload with the settings on the VM. If you entitled the VM to 0.5 physical CPU, but it’s consistently using two physical CPUs, try bumping up the entitlement so you know it won’t be starved for resources later. Right now, your pool might have enough capacity to allow the VM to use those two CPUs without issues, but if workload characteristics change, you could discover that jobs that used to run fine are having issues because they’re starved for resources.

Another way to manage resources in an uncapped pool is by using the weights you assign to VMs through the hardware management console (HMC). This might not be as granular as you think. There won’t be much of a difference between one VM getting a 200 share, and another getting a 180 share. So use some meaningful numbers: make the higher-priority VMs 250 and the lower-priority machines 50, for example. You want the numbers to actually mean something when two VMs are competing for resources, and one is far more important.
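A rough way to see why close weights barely matter is to compute each VM’s share of the spare cycles. This sketch assumes, as a simplification, that competing uncapped VMs split excess capacity in proportion to their weights; the function name excess_shares and the VM names are hypothetical.

```python
def excess_shares(weights: dict) -> dict:
    """Percentage of contested spare capacity per VM, assuming a proportional split."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one VM needs a positive weight")
    return {name: round(100.0 * weight / total, 1) for name, weight in weights.items()}
```

Under that assumption, weights of 200 and 180 work out to roughly 52.6 and 47.4 percent, while 250 and 50 work out to roughly 83.3 and 16.7 percent.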

Educate Yourself

For more on entitled capacity and virtual processors, I recommend the additional articles and IBM Redbooks papers listed in the Resources section of this article. Another way to learn more is by testing LPARs on a sandbox server and seeing what happens to your systems when you make dynamic changes to your physical and virtual processors on a running system. Reading about the topic is one thing, but to make sure you understand it, make changes and see if you get the results you expected.

An LPAR Review

https://robmcnelly.com/an-lpar-review/


Configuring Processor Resources for System p5 Shared-Processor Pool Micro-Partitions
http://www.ibmsystemsmag.com/aix/administrator/systemsmanagement/Configuring-Processor-Resources-for-System-p5-Shar/

IBM Redbooks
http://www.redbooks.ibm.com/redbooks/pdfs/sg247590.pdf

AIXpert Blog: AIX Virtual Processor Folding is Misunderstood
https://www.ibm.com/developerworks/mydeveloperworks/blogs/aixpert/entry/aix_virtual_processor_folding_in_misunderstood110?lang=en

POWER7 Virtualization Best Practice Guide
http://www.ibm.com/developerworks/wikis/download/attachments/53871915/P7_virtualization_bestpractice.doc?version=1

Reliable Restores

Edit: Some links no longer work. Originally published on IBM Systems Magazine

mksysb backups make AIX recovery easy

February 2011 | by Anthony English

If you ever need to restore your AIX system, you’ll need a reliable OS backup. You can create this via the mksysb command, which, as the name implies, makes a system backup. That’s not to say it gets your entire AIX system, but it does create a backup of the OS itself, the root volume group. For any other volume groups you’ll need to rely on other backup utilities.

You might need to build a system from a mksysb for disaster recovery or if your OS has become corrupted, but those aren’t the only times a backup comes in handy. A mksysb is a simple and effective way of migrating to new hardware. It can also be used to clone an existing AIX system. For example, you could create a Standard Operating Environment (SOE) LPAR, take a mksysb backup of it and use that to build new LPARs.

Tracing Your Roots

A mksysb is much more than a backup of the files in the rootvg file systems. It includes a boot image, optional software that has been installed into rootvg and system informational files. The mksysb contains the layout of the rootvg logical volumes and the file systems. This is important, as those file systems get created as part of the mksysb restoration process. That saves a lot of work and time. Restoring a mksysb even gives you the option of recovering your devices, so you don’t have to reconfigure network settings, disk attributes and so on. You’d normally use this only when you’re restoring onto the same system you backed up.

In the days of stand-alone systems, the mksysb command would write to a dedicated device such as a tape drive. Today it’s more common for the mksysb to be written to a file on disk and stored on a different LPAR. That way it can be made ready for use without needing to load physical media such as tapes or DVDs.

At Your Command

The mksysb command can be run from the command line or using the SMIT fastpath smitty backsys. You have to specify the output device or file. The mksysb file that’s created is typically between 2 and 4 GB, but it could be much larger, depending on the size of your rootvg. The target file system needs to have enough space for this file, and be large-file-enabled. The ulimit should be set to unlimited for the user who runs the backup.

Updating Your Image

When the mksysb is run, it includes details about volume groups, logical volumes, file systems, paging space and physical volumes. These details are stored in a file called /image.data, which can be created at any time with the mkszfile command, or at the time of the mksysb using the -i flag. This flag provides an up-to-date snapshot of the file system sizes and mount points. Figure 1 shows the output of a mksysb command that has been written to a file on the /backup file system. As the -i option was used, you can see that a new /image.data file was created.

Be Exclusive

There may be files or directories in your rootvg that you don’t want to include in the mksysb. You can use another utility to back them up, or you may not want to back them up at all. To exclude certain files from your backup, create a file called /etc/exclude.rootvg and enter the patterns of file names or directories that you don’t want to include. Figure 2 shows an example of /etc/exclude.rootvg. When using this exclude file, you need to call the mksysb command with the -e flag. The mksysb command documentation provides more details on the file format.
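Before relying on an exclude file, it can help to preview which paths a set of patterns would catch. The sketch below approximates grep-style matching with Python’s re module, which behaves the same for simple patterns like these but is not a full grep emulation. The two patterns, the sample paths and the function name excluded are all illustrative assumptions, as is the idea that paths are matched in the ./relative form.

```python
import re


def excluded(paths, patterns):
    """Return the paths that match at least one exclude pattern."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [path for path in paths if any(c.search(path) for c in compiled)]


# Illustrative patterns: one anchored to the top-level ./tmp directory,
# one that matches a directory named scratch anywhere in the tree.
patterns = ["^./tmp/", "/scratch/"]
paths = [
    "./tmp/a.log",
    "./home/user/tmp/keep.txt",
    "./data/scratch/x",
    "./etc/hosts",
]
skipped = excluded(paths, patterns)
```

The anchored pattern leaves ./home/user/tmp/keep.txt in the backup, which is the kind of distinction worth checking before running the real command with the -e flag.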

mksysb to DVD

It’s common to create the mksysb backup from a NIM server (see links), or with the mkdvd command, which will create the backup in a format suitable for DVDs. Many administrators are familiar with using NIM to create the mksysb, so I’ll focus on creating a mksysb backup in DVD-compatible format. This doesn’t require burning the backup onto a physical DVD. If you’re using the virtual I/O server (VIOS) Virtual Media Library, you can use mkdvd to create the mksysb file in ISO format, copy it to the VIOS and then load it onto a virtual optical device when you want to use it. This is a simple and quick way of cloning or recovering your AIX OS, and it lets you keep bootable OS backups handy without the need for physical media. For more information about the virtual media library, read “Media Release.”

When creating the mksysb in DVD format, the mkdvd command can use an existing mksysb file or create a new mksysb. If the mksysb has been created beforehand, mkdvd can point to it using the -m flag. If you create a new mksysb backup first, the mkdvd creates a new /image.data file. It also accepts the -e flag to exclude unwanted files or directories.

New File Systems

To create a new mksysb and save it in ISO file format, use mkdvd -eS. This creates some temporary file systems in rootvg, so make sure you have spare disk space. You can specify an alternate volume group for the file systems with the -V flag. The -S flag will ensure the final ISO files don’t get deleted, so you can copy them to a remote host such as the VIOS. You’ll need to clean them up on the source host after you’ve done the copying.

The final mksysb file in ISO format is put into /mkcd/cd_images and is called “cd_image_” followed by the process ID as its suffix. If multiple volumes are required, the final images have suffixes to indicate their volume number. You can see a sample output of the mkdvd command with its default file systems in Figure 3.

You can leave the mkdvd to create the file systems it needs in the default locations, or point to other directories using the flags outlined in Figure 4.

A backup is only useful if it can be relied on for successful restores. When it comes to restoring the mksysb, you may need more than the original backup to proceed. The restore process requires 1) a device to boot from, 2) the mksysb backup itself and possibly 3) the AIX product media for installing devices. Ordinarily, if you’re restoring to the same system you backed up from, the mksysb will serve as the boot device, provided the backup itself is bootable. If you’re restoring to a different system, for example for a disaster recovery test, or if the backup was done via mkdvd with the -B flag (non-bootable), you’ll need to boot off the AIX installation media. The media can be a file-backed device presented via a virtual optical device on the VIOS.

You can customise your mksysb backup to specify options such as the disks you want to restore to, and whether to recover device information such as network settings. You can make these selections via the System Management Services (SMS) menus when you boot the target LPAR in maintenance mode. You can also do the restoration in unattended mode by editing the /bosinst.data file before you run the backup. If the system you’re restoring to has access to a diskette drive, you can create your own /bosinst.data and point to that at the time of the restore.

How do you know if your mksysb contains all of the device drivers you need for restoration? If you’re restoring to a different server, you’ll generally want to boot from the AIX product media to get any missing device drivers installed. If you are restoring to the same hardware configuration you backed up from, but have booted from product media that is a later version than your mksysb, you’ll be asked to load the product media so that the system you restore to will have its AIX software updated. You can override these additional installations by editing the bosinst.data file and setting INSTALL_DEVICES_AND_UPDATES to no. The default is yes.

Ready-Made Recovery

It’s important to do regular tests of the mksysb restoration, not only to ensure the backup was successful, but also as a means of having documented procedures for rebuilding your system in the event of a disaster. It pays to be confident with your mksysb restoration procedures. Hardware failures aren’t the only reason you may need to use them. Simple mistakes can damage an OS or render it unusable, and the ability to restore AIX quickly and reliably is a key element in your system-recovery strategy.

Tape Storage: An Oldie but a Goodie

Edit: Good information.

Originally posted August 2011 by IBM Systems Magazine

As much as we like to tell ourselves that older technology is becoming obsolete, I still see fax machines, dot-matrix printers, dumb terminals, and tape drives out in the wild. Some will argue that we should be doing disk-to-disk backups and eliminate tape entirely, but when it comes to cost per GB and the ease of storage and transport, tape isn’t going away any time soon.

We recently had a customer that is new to AIX ask how we could back up all of its volume groups onto a single tape so it wouldn’t need an automatic tape changer, thus eliminating the need to handle more tapes than necessary.

The organization had multiple volume groups, including its rootvg, and it wasn’t using more space on disk than would fit on one tape. So the question was, “How could the customer get all of those volume groups and all of the data onto a single tape?” Basically, it needed to append all of the other volume groups onto the tape after the mksysb was done.

Search for a Solution

Looking online, I found someone else had the same question in a forum. “Is it possible to use AIX’s mksysb and savevg to create a bootable tape with the rootvg and then append all the other VGs?”

To create the backup, one astute responder suggested a script similar to this one:

tctl -f /dev/rmt0 rewind
/usr/bin/mksysb -p -v /dev/rmt0.1
/usr/bin/savevg -p -v -f /dev/rmt0.1 vg01
/usr/bin/savevg -p -v -f /dev/rmt0.1 vg02
/usr/bin/savevg -p -v -f /dev/rmt0.1 vg03
tctl -f /dev/rmt0 rewind

The script’s author stated:

  • mksysb backs up rootvg and creates a bootable tape.
  • Using “rmt0.1” prevents auto-rewind after operations.

He went on to explain the restore procedures. For rootvg, boot from tape and follow the on-screen prompts (a normal mksysb restore). For the other volume groups:

tctl -f /dev/rmt0.1 rewind
tctl -f /dev/rmt0.1 fsf 4
restvg -f /dev/rmt0.1 hdisk[n]

“fsf 4” will place the tape at the first saved VG following the mksysb backup. Use “fsf 5” for the 2nd, “fsf 6” for the 3rd, and so on.
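
The offset arithmetic above can be captured in a small helper. This is only a sketch: the function names are made up for illustration, the tape device and disk name are placeholders, and it prints the commands rather than running them so the positions can be checked before a tape is touched.

```shell
# A mksysb tape starts with four images (boot, bosinstall, empty TOC, rootvg),
# so the Nth appended savevg image is reached with "fsf 3 + N".
fsf_count() {
    echo $(( $1 + 3 ))
}

# Print the commands that would restore the Nth saved VG onto a target disk.
# usage: print_restvg_steps <n> <hdisk> [tape_device]
print_restvg_steps() {
    n=$1
    disk=$2
    tape=${3:-/dev/rmt0.1}   # no-rewind device; placeholder default
    echo "tctl -f $tape rewind"
    echo "tctl -f $tape fsf $(fsf_count "$n")"
    echo "restvg -f $tape $disk"
}
```

For example, print_restvg_steps 2 hdisk4 prints the rewind, an fsf 5 and the restvg command for the second saved volume group.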

If restvg complains about missing disks, you can add the “-n” flag to forgo the “exact map” default parameter. If you need to recover single files from the rootvg backup, the writer suggested the following, where -s4 tells restore to seek to the fourth image on the tape (the rootvg backup):

tctl -f /dev/rmt0 rewind
restore -x -d -v -s4 -f /dev/rmt0.1 ./path/file

In addition, I recommend adding a tctl offline or rewoffl at the end of the script to eject the tape. Otherwise, you will have a bootable tape sitting in your system. And, depending on your bootlist, if the machine restarts you could boot off of the tape, or if someone forgets to swap the tapes, you will overwrite it.
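
Putting the pieces together, a parameterized version of the backup script with the eject step added might look like the sketch below. The tape device, the volume group names and the DRY_RUN switch are assumptions for illustration. By default it only prints the commands; set DRY_RUN=0 on an AIX host to execute them.

```shell
#!/bin/sh
# Sketch: back up rootvg plus other volume groups to one tape, then eject.
# TAPE and the volume group names are assumptions; adjust for your system.
TAPE=${TAPE:-/dev/rmt0}
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = really run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@" || { echo "FAILED: $*" >&2; exit 1; }
    fi
}

# usage: backup_to_single_tape vg01 vg02 ...
backup_to_single_tape() {
    run tctl -f "$TAPE" rewind
    # The .1 device does not rewind on close, so each image follows the last.
    run /usr/bin/mksysb -p -v "${TAPE}.1"
    for vg in "$@"; do
        run /usr/bin/savevg -p -v -f "${TAPE}.1" "$vg"
    done
    # Rewind and eject so a bootable tape is not left sitting in the drive.
    run tctl -f "$TAPE" rewoffl
}

if [ $# -gt 0 ]; then
    backup_to_single_tape "$@"
fi
```

Stopping at the first failed command is a deliberate choice here: if mksysb fails, appending the savevg images would leave a tape that looks complete but cannot boot.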

Exclusivity

If your data set nearly fits on a single tape, you can use /etc/exclude.rootvg and /etc/exclude.vg01 to exclude files and directories that you don’t need to back up; mksysb and savevg honor these files only when run with the -e flag. If there’s some scratch data on the system that doesn’t need to be backed up and restored, just exclude it.

This mksysb documentation tells us that the tape format includes a boot image, a bosinstall image and an empty table of contents, followed by the system backup (root volume group) image. The root volume group image is in backup-file format, starting with the data files and then any optional map files. In order to exclude files, it explains:

Use -e to exclude files listed in the /etc/exclude.rootvg file from being backed up. The rules for exclusion follow the pattern-matching rules of the grep command.

To exclude certain files from the backup, create the /etc/exclude.rootvg file, with an ASCII editor, and enter the patterns of file names to exclude in your system backup image. The patterns in this file are input to the pattern matching conventions of the grep command to determine which ones will be excluded from the backup. If you want to exclude files listed in the /etc/exclude.rootvg file, select the Exclude Files field and press the Tab key once to change the default value to yes.

  • For example, to exclude all of the contents of the directory called scratch, edit the exclude file to read:
/scratch/
  • To exclude the contents of the directory called /tmp and avoid excluding any other directories that have /tmp in the path name, edit the exclude file to read:
^./tmp/

All files are backed up relative to . (current working directory). To exclude any file or directory for which it is important to have the search match the string at the beginning of the line, use the ^ (caret character) as the first character in the search string, followed by . (dot character), followed by the filename or directory to be excluded. If the filename or directory being excluded is a substring of another filename or directory, use the ^. (caret character followed by dot character) to indicate that the search should begin at the beginning of the line and use the $ (dollar sign character) to indicate that the search should end at the end of the line.
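
Because the patterns follow grep rules, you can preview their effect with grep itself before running a backup. The sketch below only approximates the filtering described above: the sample paths and the preview_excludes name are made up, and it uses a temporary pattern file rather than the real /etc/exclude.rootvg.

```shell
# Sketch: preview which backup paths a set of exclude patterns would drop.
# Backup paths are relative to ".", so sample paths start with "./".
preview_excludes() {
    # $1 = file of grep patterns, stdin = candidate paths, stdout = paths kept
    grep -v -f "$1"
}

patterns=$(mktemp)
printf '%s\n' '/scratch/' '^./tmp/' > "$patterns"

# Only the paths that match neither pattern are printed.
printf '%s\n' \
    './tmp/session.lock' \
    './var/tmp/keep.me' \
    './data/scratch/work.dat' \
    './home/user/notes.txt' | preview_excludes "$patterns"

rm -f "$patterns"
```

With these two patterns, ./tmp/session.lock and ./data/scratch/work.dat are filtered out, while ./var/tmp/keep.me survives because the caret anchors the match to the start of the line.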

A Viable Option

Obviously, backing up an entire system onto a single tape only works in smaller shops with smaller amounts of data, but quite a few of those shops are still around today. While we like to think everyone is working in large enterprise data centers with around-the-clock operations staff, and dedicated storage, network and server teams, plenty of smaller customers have small staffs and small data sets where this idea might come in handy.

Storage Migration Tips

Edit: Still good stuff. Some links no longer work.

Move data without downtime using AIX

Originally posted April 2011 by IBM Systems Magazine

Organizations change storage vendors all the time, for many different reasons. Maybe a new storage product has come out with new features and functionality that will benefit the organization. Maybe the functionality isn’t new, but is unknown to your organization and someone decided it’s needed. Maybe a new storage vendor will include desired functionality in the base price. Maybe it’s a “political” decision. Maybe the equipment is just at the end of its life.

Whatever the reason, when it’s time to move from one storage subsystem to another, what are some options that you have to migrate your data using AIX? With ever-growing amounts of storage presented to our servers, and databases with sizes from several hundred GB to a few TB becoming more common, hopefully you’re not even considering something like a backup and restore from tape–along with all of the downtime that goes with it. Instead, you should focus on how to migrate data without downtime.

Evaluate the Environment

The first question I would ask is, how is your environment currently set up? Are you currently using virtual I/O (VIO) servers to present your logical unit numbers (LUNs) to the client LPARs in your environment using virtual SCSI or N_Port ID Virtualization (NPIV)? Are you presenting your LUNs to your LPARs using dedicated storage adapters? Take the time to go through different scenarios and look at the pros and cons of each. Call IBM support and get their opinion. Talk to your storage vendor. The more information you have, the better your decision will be. If possible, do test runs with test machines to ensure your procedures and planning will work as expected.

Possible Migration Solutions

If you’re using dedicated adapters in your LPARs to access your storage area network (SAN), it could be as simple as:

  • Loading the necessary storage drivers
  • Zoning the new LUNs from the new storage vendor to the existing host bus adapters (HBAs)
  • Running cfgmgr so that AIX sees the new disks
  • Adding your new disks to your existing volume groups with the extendvg command
  • Running the mirrorvg command for your rootvg disks, and the migratepv command to move the data in your other volume groups from the old LUNs to the new LUNs
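
As a rough sketch, the LVM side of those steps could be laid out as follows. The disk and volume group names are hypothetical, and the follow-up steps (bosboot, bootlist, unmirrorvg, reducevg) are the usual cleanup rather than something the list above spells out. The functions print the commands by default; set DRY_RUN=0 on the AIX LPAR to execute them.

```shell
# Sketch of the dedicated-adapter migration. All names are placeholders.
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = really run them
run() { if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi; }

# Move a data volume group from an old LUN to a new one.
# usage: migrate_datavg <vg> <old_hdisk> <new_hdisk>
migrate_datavg() {
    run cfgmgr                   # discover the newly zoned LUNs
    run extendvg "$1" "$3"       # add the new disk to the volume group
    run migratepv "$2" "$3"      # move the data off the old disk
    run reducevg "$1" "$2"       # remove the emptied old disk from the VG
}

# Move rootvg by mirroring onto the new disk, then dropping the old copy.
# usage: migrate_rootvg <old_hdisk> <new_hdisk>
migrate_rootvg() {
    run extendvg rootvg "$2"
    run mirrorvg rootvg "$2"     # waits for the mirror copies to synchronize
    run bosboot -ad "/dev/$2"    # write a boot image to the new disk
    run bootlist -m normal "$2"  # boot from the new disk from now on
    run unmirrorvg rootvg "$1"
    run reducevg rootvg "$1"
}
```

Reviewing the printed commands first, for example with migrate_datavg datavg hdisk2 hdisk6, is a cheap way to catch a swapped source and target disk.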

The trick here is making sure that the necessary multipath drivers will coexist on the same LPAR. In some cases, you may not be able to find out whether your desired combination is even supported. It may be that no one has tried to mix your particular storage vendors’ code before. This is a good time to test things in your test environment.

A cleaner solution may be to use a new VIO server for your new disks. If you have the available hardware on your machine–which would consist of enough memory, CPU and an extra HBA to bring up the new VIO server–then it could be the ideal scenario. A new VIO server, with the new storage drivers, and the new LUNs being presented to your existing client LPARs using vSCSI may be your best bet. The advantage of this method is the storage drivers are being handled at the VIO server level instead of the client level, like they would be with NPIV. The disadvantage would be handling all of the disk mappings in the VIO server. I prefer to run NPIV and map disks directly to the clients’ virtual Fibre adapters, but again you could have the issue of mixing storage drivers so you would really need to test things before trying it on production LPARs.

If a new VIO server isn’t feasible for whatever reason, and you’re currently running with dual VIO servers and vSCSI, you should be able to remove the paths on your client LPARs that are coming from your secondary VIO server, then unmap the disks that are coming from your second VIO server. You can then remove the existing disks from your second VIO server, remove any multipath code and then repurpose it to see the new disks with the new code.

Clean Up

After the data has been migrated, you can go back and clean up the old disks and then zone the new disks to the secondary VIO server as well. Remember to set reserve_policy to no_reserve on the new disks in the VIO servers, and to set the hcheck_interval attribute on your clients for those disks.
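
A minimal sketch of those attribute changes follows. The disk names and the 60-second interval are assumptions, so check your storage vendor’s recommendations. The functions only print the chdev commands: the first form is for the padmin shell on each VIO server, the second for the client LPAR.

```shell
# Sketch: print the attribute changes for newly presented disks.

# VIO server side (padmin shell): let both VIO servers open the same LUN.
# usage: vios_no_reserve_cmds <hdisk>...
vios_no_reserve_cmds() {
    for d in "$@"; do
        echo "chdev -dev $d -attr reserve_policy=no_reserve"
    done
}

# Client side: enable path health checking. -P defers the change to the
# next restart, which is needed when the disk is already in use.
# usage: client_hcheck_cmds <seconds> <hdisk>...
client_hcheck_cmds() {
    interval=$1
    shift
    for d in "$@"; do
        echo "chdev -l $d -a hcheck_interval=$interval -P"
    done
}
```

On the client, lsattr -El hdiskN -a hcheck_interval shows the value currently set for a disk.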

Chris Gibson has a great article that covers migration scenarios in more detail, which you can read on the developerWorks website.

While your data is migrating, you might want to watch what is happening with your disks. In some cases, such as with the mirrorvg command, you might not be able to get disk information and run logical volume manager (LVM) commands as your volume group is locked. While you can still run topas to watch your disk activity and see that data is being read from your source disk and written to your target disk, you might want to get more detailed information. In this case, look at the -L flag in the AIX logical volume manager, which Anthony English covers, also on developerWorks.

“On LVM list commands, the -L flag lets you view the information without waiting to obtain a lock on the volume group. So, if you come across the message, which tells you the volume group is locked, and you really can’t wait, you could use:

lsvg -L -l datavg

The first -L doesn’t wait for a lock on the volume group. The second one is to list logical volumes. To list a single logical volume, such as lv00, use:

lslv -L lv00

And to list physical volumes (PVs), which are almost always virtual:

lspv -L hdisk3

Backing Up Cloud

Edit: Other people’s computers.

Considerations on security, data ownership

Originally posted February 2011 by IBM Systems Magazine

I miss the good old days when I had maintenance windows that were long enough that I could bring my machine down to single user mode and back up the whole system. These backups contained all of the data that mattered to the company at the time. Twenty years ago, I could only back up my machine with reel-to-reel tape drives. I’d bring my machine down to single-user mode to perform the backup, and each tape backup would take 12 minutes. I remember this because we would set the time on a portable kitchen timer when we started each tape. When the timer went off, we’d head to the computer room to swap out the tape, and go to the console to “press G to continue the backup.” All of the important data lived on that one machine. We didn’t worry about distributed computing environments, as we weren’t running any at the time. Sure we had a few PCs scattered here and there, but they weren’t critical. The entire company and all of its data lived on that central machine, and users who sat in front of green-screen dumb terminals accessed it. There wasn’t any data that users stored locally; it was all stored on the machine in the computer room.

When I hear about cloud computing, this is still the kind of environment I picture: where people are logged into a central machine that exists in a computer room in the sky. I use several Web-based applications like salesforce.com, webex.com or Google Mail, where I know nothing about the servers nor where the applications run, and I don’t necessarily care about the hardware or operating systems the applications use. I log in, use the service and log out. I often find myself logging into the IBM virtual loaner program website, where I can utilize slices of IBM hardware for short periods of time for demonstrations or proof of concepts or education.

I’ve worked with companies that have cloud offerings, where I can very easily log in, spin up some resources on their servers and then spin them back down when I am finished with them. As long as my response time is acceptable, do I really care about the physical hardware these virtual instances run on?

I’ve also had customers who were unable to get resources to test hardware in their environment. Using the cloud, they were able to log on to a cloud provider, spin up some server resources, do the testing that they needed and spin the resources back down–all without waiting for their internal IT departments to acquire and configure hardware for them. This would also benefit users who have test hardware several generations behind what they’re using in production. Instead of using old hardware, they can use more modern machines in virtual environments as needed.

Consider This

There are benefits to cloud computing, but there may be a few things to contemplate when considering a leap from your own computing assets to those that you don’t control. I realize that these days we’re usually accessing cloud-based applications over the Internet instead of from a green screen directly attached to computers in the machine room, but concerns like privacy, security and availability need to be considered along with all the benefits that are touted with the cloud.

Backup and recovery is another consideration when deploying services to the cloud. How do we back up our data that lives in the cloud? Surely cloud providers offer snapshots and local backups, and maybe that’s good enough for what you’re doing. If you wanted to copy your data to machines that are under your control, would you use the network and some kind of continuous data protection in order to move data from the cloud to machines you own so that you have another copy of it? Or would that method of data protection defeat the purpose of having someone else handling infrastructure management?

What happens if somewhere down the road you decide you want to get out of the cloud? Are there going to be issues with getting your data or OS images back under your control? Can you easily clone the systems back onto your own hardware or will you be looking at server reloads?

I have watched customers struggle with liberating their data from outsourcing companies and contracts. The companies that manage the machines have custom tools and scripts that they don’t want to hand over. They may have information around how the machines were configured that they don’t want to share. What’s your plan to get out of the cloud or move to another cloud provider if you find the one you are using isn’t for you? What do you do if the service you’re using goes down, or the company goes out of business or they change the interface so much that you no longer like the way you use the tool? Will upgrades and outages happen on your timetable or on theirs? When you get used to accessing servers and applications from anywhere there’s a network connection and then you find the provider has an outage, you want to be sure providers offer information and status updates on when they expect to recover the systems.

I enjoyed reading a blog post from John Scalzi, who was trying an experiment where he would exclusively use Google Docs and a Google laptop computer to write a novel. Technical glitches began causing delays, and he eventually returned to working from his desktop, saying, “Until ‘the cloud’—and the services that run on them—can get out of your way and just do things like resident programs and applications can, it and they are going to continue to be second-place solutions for seriously getting work done.”

Return to Centralization

While there are definite advantages to the cloud-computing approach in some situations, I can’t help but think that the whole idea has a “Back to the Future” feel to it, where we take distributed computing resources and try to centralize them again, or worse yet, rebrand existing offerings as cloud offerings so we can say we’re on the cloud bandwagon. Certainly there are going to be applications and situations that will benefit from moving applications out of data centers. We just need to be sure to do our homework and educate ourselves before making the leap.

HMC Users: Important Fix Available

Edit: Some links no longer work.

Originally posted August 30, 2011 by IBM Systems Magazine

This information has been circulating for a while, and Anthony English covers the topic here and here. But I want to make sure HMC users are aware of this important update and the need to make sure you have the fix loaded if you’re at V7R7.3.0.

A problem is known to exist when using dual HMCs in one of two environments: either one HMC is at a different level than the other, or both HMCs are at the base HMC V7R7.3.0 level without fixes.

The problem is possible exposure to corruption that could cause you to lose partition profiles.

A fix is available and should be installed immediately on any HMC that might possibly be impacted by this problem.

If you’re using an HMC and an SDMC, be sure to get the fix for the SDMC as well.

From the IBM technical bulletin:

“This PTF was released July 18, 2011, to correct an issue that may result in partition configuration and partition activation profiles becoming unusable. This is more likely to occur on HMCs that are managing multiple systems. A symptom of this problem is the system may display Recovery and some or all profiles for partitions will disappear. If you are already running HMC V7R7.3.x, IBM strongly recommends installing PTF MH01263 to avoid this issue. If you are planning to upgrade your HMC to the V7R7.3.x code level, IBM strongly recommends that you install PTF MH01263 during the same maintenance window to avoid this issue.”

The efix can be found here. This package includes these fixes:

  • Fixed a problem where managed systems lose profiles and profiles get corrupted resulting in Recovery state which prevent the ability to do DLPAR/LPM.
  • Fixed a security vulnerability with the HMC help content.

The statement below is the one IBM released in July, before the fix became available. The fix, PTF MH01263, is now out, so be sure to install it.

Again, from IBM:

“Abstract: HMC / SDMC Save Corruption Exposure
Systems Affected: All 7042s
Communicable to Clients: Yes

“Description:
IBM has learned that HMCs running V7R7.3.0 or SDMC running V6R7.3.0 could potentially be exposed to save area corruption (where partition profile data is stored).

“Symptoms include loss of profiles and/or recovery state due to a checksum failure against the profiles in the save area. In addition, shared processor pools names can be affected (processor pool number and configuration are not lost), system profiles lost, virtual ethernet MAC address base may change causing next partition activation to fail or to have different virtual Ethernet MAC addresses, loss of a default profile for all or some of the partitions.

“Partitions will continue to run, but reactivation via profile will fail if the profile is missing or corrupted. All mobility operations and some DLPAR operations will fail if a partition has missing or corrupted profiles.

“Environments using HMCs or SDMCs to control multiple managed systems have the greatest exposure. Triggers for exposure include any of the following operations performed in parallel to any managed system: Live Partition Mobility (LPM), Dynamic LPAR (DLPAR), profile changes, partition activation, rebuild of the managed system, rebooting with multiple servers attached, disconnecting or reconnecting a server, hibernate or resume, or establishing a new RMC connection.

“Recommended Service Actions:
Prevention/Workaround:
There is no real work-around other than limiting the configurations to a single HMC managing a single managed system.

“Customers who have not yet upgraded or installed HMC 7.7.3 should delay the upgrade/install if at all possible until a fix is available.

“Customers who have not yet installed and deployed SDMC 6.7.3.0 should avoid discovering production servers until a fix is available.

“Customers that have 7.7.3 or SDMC 6.7.3.0 deployed should:

  • Immediately do a profile backup operation for all managed servers:

    bkprofdata -m <managed system name> -f <filename>

  • Minimize the risk of encountering the problem by using only a single HMC or SDMC to manage a single server via the following options:
  1. Power off dual HMC/SDMC or remove the connection from any dual HMC/SDMC.
  2. Use one HMC per server (remove/add connections as needed if necessary).
  3. A single HMC/SDMC managing multiple servers might be done relatively safely if the operations listed under triggers above are NOT done to two different servers concurrently.

“Recovery:
NOTE: Recovery will be easiest with a valid backup of the profile data. So it is extremely important to backup profile data prior to an HMC upgrade or after any configuration changes to the save area. If a profile data backup exists this problem can be rectified by restoring using:

    rstprofdata -m <managedsysname> -l 3 -f <backupfilename>

“In addition to user backups, profile backups can be extracted from the previous save upgrade data (DVD or disk); a backup console data (if available); or pedbg.

“If a good backup does not exist, call your HMC/SDMC support to determine if recovery is possible.

“Fix:
A fix to prevent this from occurring is due out by the end of July (Editor’s note: We realize this is now available but wanted to include the verbiage for completeness), but the PTF will not fix an already corrupted save area. A follow-up notification will be sent as soon as it is available.”

Please heed the warnings and load this fix as soon as possible if you’re running V7R7.3.0. And don’t run any HMCs at V7R7.3.0 while running others at a lower level.
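
If you manage more than a handful of servers, the profile backup recommended in the bulletin is easier to do consistently when the commands are generated for you. The sketch below is hypothetical in its details: the function name and the backup file naming are made up, and the managed system names are read from standard input (on an HMC they could come from lssyscfg -r sys -F name). It only prints one bkprofdata command per system.

```shell
# Sketch: emit one bkprofdata command per managed system.
# usage: <list of system names> | profile_backup_cmds <datestamp>
profile_backup_cmds() {
    stamp=$1
    while IFS= read -r sys; do
        [ -n "$sys" ] || continue     # skip blank lines
        # Quote the name, since managed system names may contain spaces.
        echo "bkprofdata -m \"$sys\" -f \"profiles_${stamp}\""
    done
}
```

The printed commands can then be reviewed and run on the HMC command line, so a fresh backup exists before an upgrade or any configuration change.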