Why I Decided Not To Put Flash In The Array

My story starts about three years ago, when I was the Information Systems director for a large non-profit in Atlanta, GA. One of our initiatives was to become 100% virtualized within six months, and there were obviously many tasks to accomplish before reaching that milestone. The first task was to upgrade the storage platform, as our workloads had already outgrown what the existing system could deliver. As with any project, we looked at all the major players in the market, ran trials, talked to other customers, and did our due diligence. As a non-profit we had to be mindful of costs, but we also wanted to be good stewards in everything we did.

The storage system we were looking to upgrade consisted of a couple of 24 TB chassis filled with 7.2K RPM drives. We had plenty of capacity for our needs, but latency was in the 50 ms range while delivering only about 3,000 IOPS. Obviously not the best platform to run a virtualized environment on! We looked at the early All Flash Arrays that were just coming out, and we also looked at the Hybrid Arrays, all of them promising increased IOPS and lower latency. The problem was that none of them were inexpensive. So the dilemma of being good stewards while needing single-digit latency with more than 50K IOPS was a challenge, to say the least.

About the same time, I met a gentleman who told me some magical stories that sounded almost too good to be true! His name is Satyam Vaghani, the PernixData CTO and creator of VVOLS, VAAI and VMFS. Soon after meeting Satyam, I was given the privilege of getting my hands on an alpha build of PernixData FVP. I ran and tested the product through the alpha and beta stages, after which I immediately purchased it and became PernixData’s first paying customer. I had never purchased a product in beta before, but I felt this product was out of the ordinary. The value and the promise were proven even in beta: I didn’t have to buy new storage just for performance reasons, which saved the organization over $100,000. This wasn’t a localized problem; it was an architecture problem that no array or set of multiple storage systems could solve. If I were in that position today, I’m sure the savings over 3 years would be close to $500,000, due to the scale-out nature of the FVP solution. As the environment grew and became 100% virtualized, I would no longer have had to think about storage performance, or about the storage fabric connections, in the same way. Talk about a good feeling: not only being a good steward but also astonishing the CFO with what was achieved.

To me, this validated the waste and inefficiencies that occur when flash is used at the storage layer. Disk is cheap when used for capacity, so it has never made sense to me to cripple flash performance by putting it behind a network in a monolithic box that can have its own constraints and bottlenecks.

Fast forward to today, when flash is much more prominent in the industry, and the story is even stronger. How can anyone not be conscientious about spending over $100K on a single array that can only achieve 90,000 IOPS with single-digit millisecond latency, when a single enterprise flash drive costing $500 does over 50K IOPS with microsecond latency? The question that must be asked is: can you defend your decision to the CFO or CIO and feel good about it?
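
To put rough numbers on that comparison, here is a minimal sketch of the cost-per-IOPS arithmetic using only the figures quoted above. The function and variable names are my own, and the calculation deliberately ignores everything else that goes into a real purchase (servers, software licensing, redundancy, capacity, support), so treat it as a back-of-the-envelope illustration rather than a TCO model.

```python
def cost_per_iops(price_usd: float, iops: float) -> float:
    """Return dollars spent per IOPS delivered (hardware price only)."""
    return price_usd / iops


# Figures quoted in the post; both are approximate ("over" in each case).
array_cost = cost_per_iops(price_usd=100_000, iops=90_000)
drive_cost = cost_per_iops(price_usd=500, iops=50_000)

# How many times more the array costs per IOPS than the single drive.
ratio = array_cost / drive_cost

print(f"Array: ${array_cost:.2f} per IOPS")
print(f"Drive: ${drive_cost:.2f} per IOPS")
print(f"The array costs roughly {ratio:.0f}x more per IOPS")
```

On these numbers the array works out to a little over a dollar per IOPS and the drive to about a cent, which is the gap the CFO question is really about.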

Don’t get me wrong; I’m not saying FVP replaces storage capacity. If you need storage capacity, then go and purchase a new storage array. However, this doesn’t mean that you have to buy an AFA for capacity reasons. There are many cost-effective options out there that make more economic sense, no matter what dedupe or compression rates are promised!

My personal advice to everyone is to be a conscientious objector when deciding to put flash in the array. It didn’t make sense for me 3 years ago and still doesn’t make sense today. 

Clustered VAAI?

In reading the recently updated VMware vSphere Storage API Array Integration (VAAI) White Paper, I noticed a statement that caught my eye.

"VMware does not support VAAI primitives on VMFS with multiple LUNs/extents if they all are on different arrays.."

I understand the difficulty in doing this, but it makes me wonder if the coming VMware vVOLs will be the technology that gives the vAdmin the capability of crossing array boundaries on a single LUN that supports VAAI primitives.

If anybody has thoughts or insight into this, please tweet or comment.




VAAI Primer

Before I unleash the posts on my VAAI Array project, I thought it best to make sure readers are somewhat familiar with VAAI.

VAAI (vStorage APIs for Array Integration) is a feature first introduced in ESX/ESXi 4.1 and later expanded in ESXi 5.0. It is an API that was developed to enhance performance on the vSphere infrastructure by offloading several tasks to compliant Storage Arrays. *More on the compatible Storage Arrays later*

VAAI Benefits:

  • Atomic Test & Set (ATS): atomically modifies sectors on a disk without having to use SCSI reservations. This means the LUN is not locked against access from other hosts while the update happens. This should increase performance many fold, depending on how many hosts access the same LUN.

  • Clone Blocks/Full Copy/XCOPY: copies data directly on the supported Array without having to resort to the ESX software data mover, which moves data to and from the hosts and the Array. If VAAI is enabled, copying/cloning of data will move at the speed of the hardware Array.

  • Zero Blocks/Write Same: zeroes out a large number of blocks on the Array for provisioning. This allows vSphere to speed up provisioning and get on with other tasks.

  • Thin Disk Space Reclaim: this API uses SCSI UNMAP instead of SCSI Write and is based on VMFS 5.0. It tells the Array which blocks were freed when something was deleted, so that a thin-provisioned Array can reclaim the space.
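
To make the ATS idea a bit more concrete, here is a toy sketch of a compare-and-write operation on a simulated LUN. Everything in it (the `SimulatedLun` class, the sector contents, the host names) is hypothetical and exists only for illustration; it is not how ESXi or any array implements the primitive. The point is simply that a host claims one sector by saying "write this only if the sector still holds what I last read", so other sectors on the same LUN stay available instead of the whole LUN being reserved.

```python
import threading


class SimulatedLun:
    """Toy stand-in for an array LUN; not a real SCSI target."""

    def __init__(self, sector_count: int) -> None:
        self._sectors = [b"free"] * sector_count
        # Models the array performing the compare and the write as one step.
        self._atomic = threading.Lock()

    def read(self, lba: int) -> bytes:
        return self._sectors[lba]

    def compare_and_write(self, lba: int, expected: bytes, new: bytes) -> bool:
        """Write `new` to sector `lba` only if it still holds `expected`."""
        with self._atomic:
            if self._sectors[lba] != expected:
                return False  # another host changed the sector first
            self._sectors[lba] = new
            return True


lun = SimulatedLun(sector_count=8)

# Two hosts race for the same on-disk lock record in sector 3.
host_a_won = lun.compare_and_write(3, expected=b"free", new=b"host-a")
host_b_won = lun.compare_and_write(3, expected=b"free", new=b"host-b")

# Host B can still work on a different sector of the same LUN.
host_b_other = lun.compare_and_write(4, expected=b"free", new=b"host-b")

print(host_a_won, host_b_won, host_b_other)
```

In this sketch only the contested sector is affected when host B loses the race; nothing else on the LUN is held up, which is the contrast with a LUN-wide SCSI reservation.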

Keep in mind there are several caveats when using VAAI. I want to list two that I think most readers will need to be aware of.


  1. If the source and destination VMFS volumes have different block sizes, ESXi falls back to the default software data mover. Suppose you used an 8MB block size on ESXi 4.1, then upgraded to ESXi 5 and, as a result, upgraded your file system to VMFS 5.0. You might think that the block size would either change or no longer matter, but the upgrade does not change the block size to the ESXi 5.0 default of 1MB. So if you add a VAAI-enabled Array to the mix, hardware-assisted offload won't work between those datastores until you recreate them with the same block size.
  2. If the source VMDK type is eagerzeroedthick and the destination VMDK type is thin, then VAAI offload won't work. 
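
The two caveats above can be written down as a simple pre-flight check. This is only a sketch of the rules as described in this post: the `CopyRequest` type and `offload_blockers` function are names I made up, nothing here queries a real host or datastore, and there are other conditions that can disable offload which it does not cover.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CopyRequest:
    """Hypothetical description of a clone or migration between datastores."""
    source_block_size_mb: int
    dest_block_size_mb: int
    source_disk_type: str  # e.g. "thin", "zeroedthick", "eagerzeroedthick"
    dest_disk_type: str


def offload_blockers(req: CopyRequest) -> List[str]:
    """Return the reasons (from the two caveats) why offload would not be used."""
    reasons = []
    if req.source_block_size_mb != req.dest_block_size_mb:
        reasons.append("block size mismatch")
    if req.source_disk_type == "eagerzeroedthick" and req.dest_disk_type == "thin":
        reasons.append("eagerzeroedthick source to thin destination")
    return reasons


# An upgraded VMFS volume that kept its 8MB blocks, copying to a 1MB volume.
upgraded = CopyRequest(8, 1, "thin", "thin")
print(offload_blockers(upgraded))

# Matching block sizes and disk types: neither caveat applies.
matched = CopyRequest(1, 1, "thin", "thin")
print(offload_blockers(matched))
```

An empty list only means that neither of these two caveats applies; it is not a guarantee that the array will actually perform the offload.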

This post was meant to be a short summary of VAAI. Feel free to comment on anything I missed or items that you think will benefit the readers.