Maximizing IOPS

Edit: Some links no longer work.

Originally posted March 24, 2015 on AIXchange

Recently I listened to a discussion of the differences in input/output operations per second (IOPS) in various workload scenarios. People talked about heavy reads. They talked about heavy writes. They debated whether it was better to use RAID5, RAID6 or RAID10. Things got a little heated.

I came away thinking that I should cover this topic and share some resources with you. For instance, this article provides basic information about physical disks, but also makes some interesting points:

“Published IOPS calculations aren’t the end-all be-all of storage characteristics. Vendors often measure IOPS under only the best conditions, so it’s up to you to verify the information and make sure the solution meets the needs of your environment.

IOPS calculations vary wildly based on the kind of workload being handled. In general, there are three performance categories related to IOPS: random performance, sequential performance, and a combination of the two, which is measured when you assess random and sequential performance at the same time.

Every disk in your storage system has a maximum theoretical IOPS value that is based on a formula. Disk performance — and IOPS — is based on three key factors:

    Rotational speed
    Average latency
    Average seek time

Perhaps the most important IOPS calculation component to understand lies in the realm of the write penalty associated with a number of RAID configurations. With the exception of RAID 0, which is simply an array of disks strung together to create a larger storage pool, RAID configurations rely on the fact that write operations actually result in multiple writes to the array. This characteristic is why different RAID configurations are suitable for different tasks.

For example, for each random write request, RAID 5 requires many disk operations, which has a significant impact on raw IOPS calculations. For general purposes, accept that RAID 5 writes require 4 IOPS per write operation. RAID 6’s higher protection double fault tolerance is even worse in this regard, resulting in an “IO penalty” of 6 operations; in other words, plan on 6 IOPS for each random write operation. For read operations under RAID 5 and RAID 6, an IOPS is an IOPS; there is no negative performance or IOPS impact with read operations. Also, be aware that RAID 1 imposes a 2 to 1 IO penalty.”
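The quoted article's formula for a single spinning disk boils down to service time: seek time plus rotational latency per random I/O. Here's a minimal sketch of that arithmetic in Python; the 15K RPM and 3.4 ms seek figures are illustrative assumptions, not measurements from any particular drive.

```python
# Hypothetical back-of-the-envelope numbers; real drive specs vary by model.
def theoretical_disk_iops(rpm, avg_seek_ms):
    """Estimate the theoretical IOPS of a single spinning disk.

    Average rotational latency is half a revolution; total service time
    per random I/O is roughly seek time plus rotational latency.
    """
    avg_rotational_latency_ms = (60_000 / rpm) / 2   # ms for half a revolution
    service_time_ms = avg_seek_ms + avg_rotational_latency_ms
    return 1_000 / service_time_ms                   # I/Os per second

# A 15K RPM drive with ~3.4 ms average seek lands near the commonly
# quoted ~180-210 IOPS range for that class of disk.
print(round(theoretical_disk_iops(15_000, 3.4)))     # ~185
```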

Again, that article is focused on physical disks. But I’m also seeing more and more solid state devices (SSDs) being deployed. These charts compare spinning disks to SSDs, and they’re eye-opening. While a 15K SAS drive might see 210 IOPS, an individual consumer-grade SSD might see 5,000 or even 20,000 IOPS. Disk subsystems like the IBM FlashSystem 840 show 1.1 million IOPS on 100 percent random 4K reads, while a mixed read/write workload might deliver 775,000 IOPS.
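To see how the RAID write penalties from the quoted article interact with per-drive numbers like that 210 IOPS figure, here is a small sketch of the usual effective-IOPS arithmetic. The eight-drive array and 70/30 read/write mix are assumptions for illustration only.

```python
# Common write-penalty values per the quoted article (RAID 5 = 4, RAID 6 = 6, RAID 1/10 = 2).
RAID_WRITE_PENALTY = {"RAID0": 1, "RAID1": 2, "RAID10": 2, "RAID5": 4, "RAID6": 6}

def effective_array_iops(drives, iops_per_drive, read_pct, raid_level):
    """Estimate the front-end IOPS an array can sustain once the RAID
    write penalty is applied to the write portion of the workload."""
    raw = drives * iops_per_drive
    write_pct = 1.0 - read_pct
    penalty = RAID_WRITE_PENALTY[raid_level]
    # Reads cost one back-end I/O each; writes cost `penalty` back-end I/Os.
    return raw / (read_pct + write_pct * penalty)

# Eight 15K SAS drives at ~210 IOPS each, 70 percent reads:
for level in ("RAID10", "RAID5", "RAID6"):
    print(level, round(effective_array_iops(8, 210, 0.70, level)))
# RAID10 ~1292, RAID5 ~884, RAID6 ~672
```

The same raw 1,680 back-end IOPS shrinks noticeably as the write penalty grows, which is exactly the trade-off people argue about when choosing between RAID 5, RAID 6, and RAID 10.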

Here’s an interesting tool that lets you configure environments for SSD and physical disk and compare their performance. By adjusting the other variables, you can model hard drive capacity, workload read/write percentages, and the number of drives being used.

What methods do you use when configuring your disk subsystem? Is SSD being deployed in your environment? What RAID levels are you targeting?