Edit: Some links no longer work.
Originally posted July 30, 2013 on AIXchange
Check out this document from IBM’s Dirk Michel, “AIX on Power – Performance FAQ.” It’s only 87 pages, and it’s packed with great information. I encourage you to read it and become familiar with its contents.
Chapter 2 asks and answers the question, “what is performance?”
“For interactive users, the response time is the time from when the user hits the button to seeing the result displayed. The response time is often seen as a critical aspect of performance because of its potential visibility to end users or customers. The throughput of a computer system is a measure of the amount of work performed by the system over a period of time. Examples of throughput are megabytes per second read from a disk, database transactions per minute, and megabytes transmitted per second through a network adapter. Throughput and response time are related: in many cases higher throughput comes at the cost of slower response time, just as better response time comes at the cost of lower throughput.”
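That trade-off is easy to see with a toy model. Here’s a quick sketch of my own (not from the FAQ; the overhead and per-request costs are made-up numbers): batching requests amortizes a fixed per-batch cost, so throughput climbs, but every request waits for its whole batch, so response time climbs too.

    # Toy model (illustrative only): a fixed overhead per batch plus a small
    # cost per request. Larger batches raise throughput but also raise the
    # time any single request waits for its result.
    OVERHEAD = 0.010   # assumed fixed cost per batch, in seconds
    PER_REQ  = 0.002   # assumed work per request, in seconds

    def metrics(batch_size, total_requests=1000):
        batches = total_requests / batch_size
        total_time = batches * (OVERHEAD + batch_size * PER_REQ)
        throughput = total_requests / total_time            # requests per second
        response_time = OVERHEAD + batch_size * PER_REQ     # wait for the whole batch
        return throughput, response_time

    for b in (1, 10, 100):
        tput, resp = metrics(b)
        print(f"batch={b:4d}  throughput={tput:7.1f} req/s  response={resp * 1000:6.1f} ms")

With these toy numbers, throughput climbs from roughly 83 to 476 requests per second as the batch grows from 1 to 100, while response time stretches from 12 ms to 210 ms.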
Chapter 4 covers workload estimation and sizing:
“Some questions to consider before beginning the sizing exercise:
1. What are the primary metrics, e.g., throughput, latency, that will be used to validate that the system is meeting performance requirements?
2. Does the workload run at a fairly steady state, or is it bursty, thereby causing spikes in load on certain system components? Are there specific criteria, e.g., maximum response time, that must be met during the peak loads?
3. What are the average and maximum loads that need to be supported on the various system components, e.g., CPU, memory, network, storage?”
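Question 3 is easier to answer with real numbers in hand. Here’s a minimal sketch of how I might reduce raw utilization samples to the average, 95th-percentile, and peak figures a sizing exercise needs (the sample data and the percentile choice are my own assumptions, not the FAQ’s):

    # Summarize measured CPU consumption (in physical cores) into the
    # average and peak figures used for sizing. Sample data is invented.
    import statistics

    cpu_busy_cores = [2.1, 2.4, 3.0, 2.8, 4.4, 4.6, 2.2, 2.5, 3.1, 4.5]

    avg_load  = statistics.mean(cpu_busy_cores)
    p95_load  = statistics.quantiles(cpu_busy_cores, n=20)[18]   # 95th percentile
    peak_load = max(cpu_busy_cores)

    print(f"average: {avg_load:.2f}  p95: {p95_load:.2f}  peak: {peak_load:.2f} cores")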
Chapter 5 covers performance concepts along with CPU performance, multiprocessor systems, multithreading, processor virtualization, memory performance, caches, cache coherency, virtual memory, memory affinity, processor affinity and more.
Chapter 6 is an examination of performance analysis and tuning:
“This chapter covers the performance analysis and tuning process from a high-level point of view. Its purpose is to provide guidelines and best practices on how to address performance problems using a top-down approach. Application performance should be recorded using log files, batch run times or other objective measurements. General system performance should be recorded, and should include as many components of the environment as possible. Before collecting any data or making tuning or configuration changes, define what exactly is slow. A clear definition of what aspect is slow usually helps shorten the time it takes to resolve a performance problem, since the performance analyst gets a better understanding of what data to collect and what to look for in that data.”
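That advice to define what exactly is slow lends itself to a quick script. As a sketch (entirely my own; the log path and format are hypothetical), pulling response times out of an application log and summarizing them turns “it feels slow” into a measurable baseline:

    # Turn "it feels slow" into numbers by summarizing response times
    # recorded in an application log. Path and log format are hypothetical.
    import re
    import statistics

    response_ms = []
    with open("app.log") as log:                                 # hypothetical log file
        for line in log:
            match = re.search(r"response_time=(\d+)ms", line)    # assumed log format
            if match:
                response_ms.append(int(match.group(1)))

    if len(response_ms) >= 2:
        print(f"requests: {len(response_ms)}")
        print(f"average:  {statistics.mean(response_ms):.1f} ms")
        print(f"p99:      {statistics.quantiles(response_ms, n=100)[98]} ms")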
Section 6.3.4 presents a performance analysis flow chart.
Chapter 7 gives a performance analysis how-to:
“This chapter is intended to provide information and guidelines on how to address common performance problems seen in the field, as well as tuning recommendations for certain areas. Please note that this chapter is not intended to explain the usage of commands or to explain how to interpret their output.”
Chapter 8 includes frequently asked questions. Here’s one I like:
“I heard that… should I change…?
“No, never apply any tuning changes based on information from unofficial channels. Changing performance tunables should be done based on performance analysis or sizing anticipation.”
Chapter 9 features things you should know about POWER7. Section 9.10 covers virtualization best practices, for example:
9.10.1 Sizing virtual processors
- The number of virtual processors of an individual LPAR should not exceed the number of physical cores in the system
- Shared processor pool: the number of virtual processors of an individual LPAR should not exceed the number of physical cores in the shared processor pool
9.10.2 Entitlement considerations
The best practice for LPAR entitlement is to set the LPAR’s entitled capacity to its average physical CPU usage and let peaks be addressed by additional uncapped cycles. For example, an LPAR running a workload with an average physical consumption of 3.5 cores and a peak utilization of 4.5 cores should have five virtual processors to handle the peak CPU usage and an entitlement of 3.5.
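The guidance in 9.10.1 and 9.10.2 is simple enough to capture in a few lines. Here’s a sketch of my own using the numbers from the example above (the 16-core pool size is an assumption):

    # Apply the 9.10 guidance: entitlement near average consumption, enough
    # virtual processors to cover the peak, never more VPs than the pool has cores.
    import math

    avg_consumed  = 3.5    # average physical cores consumed (example above)
    peak_consumed = 4.5    # peak physical cores consumed (example above)
    pool_cores    = 16     # cores in the shared processor pool (assumed)

    entitlement        = avg_consumed
    virtual_processors = min(math.ceil(peak_consumed), pool_cores)

    print(f"entitlement: {entitlement}  virtual processors: {virtual_processors}")

Which lands on the same answer as the text: five virtual processors and an entitlement of 3.5.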
Chapter 11 covers the AIX Dynamic System Optimizer, and Chapter 12 explains how to report a performance problem using perfpmr.
Obviously there’s far more than I’ve listed here. Read it for yourself and share your thoughts in the comments.
Also take a look at today’s PowerLinux announcement: http://www-03.ibm.com/press/us/en/pressrelease/41582.wss
“The PowerLinux 7R4 is the high-end addition to IBM’s line-up of Power Systems PowerLinux servers running industry-standard Linux from Red Hat and SUSE. Joining the PowerLinux 7R1 and 7R2 models, the PowerLinux 7R4 delivers a new level of performance with up to 4 sockets and 32 cores. Powerful POWER7+ DCM processors that offer:
• 3.5 GHz and 4.0 GHz performance with 16 or 32 fully activated cores
• Up to 1024 GB of memory
• Rich I/O options in the system unit:
  • Six PCIe 8X Gen2 slots in the system unit
  • Two GX++ slots for I/O drawers
  • Six hard disk drive (HDD)/solid-state drive (SSD) SAS small form factor (SFF) bays and integrated SAS I/O controllers
  • Integrated Multifunction Card with four Ethernet, two USB, and one serial port”