Why I Decided Not To Put Flash In The Array

My story starts about three years ago, when I was the Information Systems Director for a large non-profit in Atlanta, GA. One of our initiatives at the time was to become 100% virtualized within six months, and there were obviously many tasks to accomplish before reaching that milestone. The first task was to upgrade the storage platform, since our current workloads had already outgrown its performance. As with any project, we looked at all the major players in the market, ran trials, talked to other customers, and did our due diligence. Being a non-profit, it was important for us not only to be mindful of costs but also to be good stewards in everything we did.

The storage system we were looking to upgrade consisted of a couple of 24 TB chassis of 7.2K RPM disks. We had plenty of capacity for our needs, but latency was in the 50 ms range and we were only getting about 3,000 IOPS. Obviously not the best platform to run a virtualized environment on! We looked at the early all-flash arrays that were just coming out, and we also looked at the hybrid arrays, all of them promising more IOPS and lower latency. The problem was that none of them were an inexpensive proposition. So the dilemma of being good stewards while needing single-digit-millisecond latency and more than 50K IOPS was a challenge, to say the least.

About the same time I met a gentleman who told me some magical stories that sounded almost too good to be true! That man was Satyam Vaghani, the PernixData CTO and creator of VVols, VAAI, and VMFS. Soon after meeting Satyam, I was given the privilege of getting my hands on an alpha build of PernixData FVP. I ran and tested the product through the alpha and beta stages, after which I immediately purchased it and became PernixData's first paying customer. I had never purchased a product in beta before, but I felt this product was out of the ordinary. The value and the promise were proven even in beta: I didn't have to buy new storage just for performance reasons, which saved the organization over $100,000. This wasn't a localized problem; it was an architecture problem that no array or combination of storage systems could solve. If I were in that position today, I'm sure the savings over three years would be close to $500,000, due to the scale-out nature of the FVP solution. As the environment grew and became 100% virtualized, I would no longer have had to think about storage performance, or about the storage fabric connections, in the same way. Talk about a good feeling of not only being a good steward but also astonishing the CFO with what was achieved.

This, to me, validated the waste and inefficiency that occur when flash is placed at the storage layer. Disk is cheap when used for capacity, so it has never made sense to me to cripple flash performance by putting it behind a network in a monolithic box that has its own constraints and bottlenecks.

Fast forward to today, where flash is much more prominent in the industry. The story is even stronger now: how can anyone not be conscientious about spending over $100K on a single array that can only achieve 90,000 IOPS at single-digit-millisecond latency? When someone can buy a single enterprise flash drive for $500 that does over 50K IOPS with microsecond latency, the question that must be asked is: can you defend your decision to the CFO or CIO and feel good about it?
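To put that stewardship question into rough numbers, here is a back-of-the-envelope cost-per-IOPS comparison using the illustrative figures above. It deliberately ignores capacity, redundancy, controllers, and data services; it only looks at raw performance per dollar.

```python
# Rough cost-per-IOPS comparison using the illustrative figures from this post.
array_cost = 100_000   # single array, USD (example figure)
array_iops = 90_000    # rated IOPS at single-digit-millisecond latency

ssd_cost = 500         # one enterprise flash drive, USD (example figure)
ssd_iops = 50_000      # rated IOPS at microsecond-class latency

print(f"Array: ${array_cost / array_iops:.2f} per IOPS")  # ~$1.11 per IOPS
print(f"SSD:   ${ssd_cost / ssd_iops:.3f} per IOPS")      # ~$0.010 per IOPS
```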

Don't get me wrong; I'm not saying FVP replaces storage capacity. If you need capacity, then go and purchase a new storage array. However, that doesn't mean you have to buy an AFA for capacity reasons. There are many cost-effective options out there that make more economic sense, no matter what dedupe or compression ratios are promised!

My personal advice to everyone is to be a conscientious objector when deciding to put flash in the array. It didn’t make sense for me 3 years ago and still doesn’t make sense today. 

PernixData FVP Hit Rate Explained

I assume most of you know that PernixData FVP provides a clustered solution to accelerate read and write I/O. In light of this, I have received several questions about what the “Hit Rate” in our UI signifies. Since we commit every write to server-side flash, writes by definition always hit, which is one reason I refrain from calling our software a write caching solution!

The hit rate graph in PernixData FVP, as seen below, references only the read hit rate. In other words, every time we can serve a block of data from the server-side flash device, it is deemed a hit. If a read request cannot be satisfied from the local flash device, it has to be retrieved from the storage array, and it is not registered in the hit rate graph. We do, however, copy that block into flash, so the next time it is requested it will be counted as a hit.

Keep in mind that a low hit rate doesn't necessarily mean you are not getting a performance increase. For example, if a workload is in “Write Back” mode and has a low hit rate, that could simply mean the workload has a heavy write I/O profile. So even with a low hit rate, all writes are still being accelerated, because every write is served from the local flash device.
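To make that concrete, here is a minimal sketch (in Python) of the bookkeeping described above, assuming the behavior as I've explained it: only reads can miss, every write in Write Back mode lands on local flash, and a missed read is copied into flash so the next access becomes a hit. The structure and names are my own illustration, not FVP internals.

```python
# Minimal illustration of the hit-rate bookkeeping described above.
# Only reads count toward the hit rate; writes always land on local flash.

flash = set()          # blocks currently present on the server-side flash device
reads = hits = 0

def read(block):
    global reads, hits
    reads += 1
    if block in flash:
        hits += 1              # served from local flash -> counted as a hit
    else:
        flash.add(block)       # fetched from the array, then copied into flash
                               # so the next read of this block is a hit

def write(block):
    flash.add(block)           # in Write Back mode every write is committed to
                               # local flash, so writes are always accelerated

for b in [1, 2, 1, 3, 1]:
    read(b)
write(4); write(5)

print(f"Read hit rate: {hits / reads:.0%}")   # 40% here, yet every write was
                                              # still served at flash latency
```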

Where are you measuring your storage latency?

I often hear from vendors and from virtualization and storage admins about where they see storage latency in a particular virtualized environment. The interesting part is the wide disparity between what is communicated and what is actually realized.

If storage latency is an important part of how you measure performance in your environment, then where you measure that latency really matters. The latency the VM experiences is the end result of all the latency in the path, yet everyone has a different tool or place where they measure it. If you look at latency at the storage array, you are only really seeing the latency at the controller and array level. That doesn't always include the latency experienced on the network or in the virtualization stack.

What you really need is visibility into the entire I/O path, so you can see the effective latency of the VM. The realized latency at the VM level is the end result and is what the user or admin actually experiences. It can be dangerous to focus your attention on just one part of the latency in the stack and then base decisions about application latency on that one number.
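A simple way to think about it: the latency the VM observes is roughly the sum of every layer the I/O crosses, while the array counter only reports its own slice. The per-layer numbers in this sketch are made up purely to illustrate the gap.

```python
# Illustrative only: made-up per-layer latencies (in ms) along one I/O path.
io_path_ms = {
    "guest / vSCSI":        0.2,
    "ESXi kernel queueing": 0.5,
    "fabric / network":     0.8,
    "array controller":     2.0,   # roughly all the array itself reports
}

vm_observed = sum(io_path_ms.values())
array_reported = io_path_ms["array controller"]

print(f"Array-reported latency: {array_reported:.1f} ms")
print(f"VM-observed latency:    {vm_observed:.1f} ms")   # 3.5 ms in this example
```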

To solve this problem, PernixData provides visibility into what the VM is observing, and since FVP is a read/write acceleration tier, it can also break latency down by read and write acknowledgements.

As an example, using the zoom function in the new FVP 1.5 release, I can see the latency breakdown for a particular Write Back enabled SQL VM.

[Graph: FVP latency breakdown showing Datastore, Local Flash, and VM Observed latency]

As you can see in this graph, the “Datastore” on the array had a latency spike of 7.45 milliseconds, while the “Local Flash” on the host is at 0.25 ms (250 microseconds). The “VM Observed” latency is what the VM actually sees, and thus you have a realized latency of 0.30 ms (300 microseconds)! The small difference between Local Flash latency and VM Observed latency can be due to system operations such as flash device population, as well as whether write redundancy is enabled.

To see this from a read/write perspective, you can also go to the "Custom Breakdown" menu and choose "Read" and "Write" to see the "VM Observed" latency broken down into reads and writes. 

[Graph: Custom Breakdown showing VM Observed latency split into reads and writes]

As you can see, the latency for this application came from writes, not reads, and since this VM is in Write Back mode, writes are acknowledged back to the application at a realized 0.44 ms (440 microseconds)!

This is obviously not the only way to determine the actual latency of your application, but what is unique is that PernixData is not building yet another latency silo. There are plenty of storage products on the market that give a great view into their own perfect world of latency, but that view is isolated and not the full picture of what is observed where it matters in your virtualized datacenter.

 

ESXi Runs in Memory - Boot Options

I hope the title of this post doesn't surprise you! This is a sometimes forgotten aspect of ESXi when choosing your boot options. I have increasingly been talking with VMware admins who are deciding to mirror their local drives for ESXi. This seems to be a common design on blade architectures as well, where the two open drive bays are used for mirrored ESXi boot images.

The question to ask is: why do this? If ESXi runs entirely in memory, what benefit do you get from mirroring two drives? Yes, you have another copy of the image in case of corruption, but wouldn't it be easier and less wasteful to just store a copy of the image on removable media, or to use Image Builder for that resiliency?

Most server architectures now include internal SD card slots or USB flash ports to install ESXi on, and there is of course VMware Auto Deploy. Using one of these methods for ESXi boot will not only save resources but also open up more opportunities to adopt new technology.

There are many examples of converged storage architectures that require you to use every available drive bay to maximize capacity. There is also the use of server-side flash technologies, like PernixData FVP. Having multiple options for local flash provides more possibilities when you want to create tiers of flash for your differing workloads.

The point of this post is to illustrate that you don't have to mirror ESXi for fault tolerance. There are many other ways to protect your image, so why waste resources on something that could hinder the growth of your virtualized datacenter?

For added reading pleasure, here is a link to some entertaining conversations about installing ESXi on local disk or local USB.

http://community.spiceworks.com/topic/240520-whats-the-advantage-of-installing-esxi-on-a-usb-key-over-local-disks

Features of an Enterprise SSD

When looking for a flash device to use with PernixData FVP or for other enterprise use cases, performance and reliability are important aspects to factor in. Just because a drive is spec'd with high IOPS and low latency numbers doesn't mean it will keep up at that rate over time under enterprise workloads.

I would guess that most of you would prefer consistent, reliable flash over higher IOPS or lower latency. This is one reason why I like the Intel S3700 SSD. This drive delivers repeatable results and withstands heavy workloads over time. I'm not saying this drive or others like it are slow; these drives are still very fast, but they favor consistency and reliability by design.

  

A little over a year ago, Intel introduced a technology that enhanced the reliability of MLC flash, calling it HET, or High Endurance Technology. It is essentially a combination of firmware, controller, and high-cycling NAND enhancements for endurance and performance, with optimizations in error avoidance techniques and write amplification reduction algorithms. The result is a generation of enterprise SSDs that are inexpensive and deliver good performance with predictable behavior. Keep in mind, though, that not all Intel drives have HET; this is part of what separates the consumer drives from the enterprise class.

This is one reason why Intel can claim “10 full drive writes per day over the 5-year life of the drive.” You will also notice that other manufacturers and vendors OEM Intel's 25nm MLC HET NAND and incorporate it into their products. HET sets Intel apart from the rest, but that doesn't mean there are no others to choose from. It's when you factor in price, reliability, performance, and customer satisfaction that many are currently led to the S3700.
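To see what a claim like that works out to, here is the simple arithmetic for a hypothetical 200 GB drive (capacity chosen only as an example) written 10 full times per day for 5 years.

```python
# Endurance implied by "10 full drive writes per day over a 5-year life",
# using a 200 GB capacity purely as an example.
capacity_gb = 200
drive_writes_per_day = 10
years = 5

total_written_gb = capacity_gb * drive_writes_per_day * 365 * years
print(f"Total data written: {total_written_gb / 1_000_000:.2f} PB")  # ~3.65 PB
```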

The other important aspect to consider when looking for an enterprise SSD is read/write performance consistency. Some drives are architected just for read performance consistency, so if your workloads are balanced between reads and writes, or are write heavy, you want a drive that provides consistency for both.

As an example, the Intel S3500 gives better read performance consistency, while the Intel S3700 gives consistency for both reads and writes. (Keep in mind that the Intel S3500 doesn't use HET.)

[Chart: Intel S3500 performance consistency]

[Chart: Intel S3700 performance consistency]
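If you want to evaluate consistency yourself rather than rely on a spec sheet, one simple approach is to record per-I/O latencies during a sustained mixed test and compare the median against the worst case for reads and writes separately. Here is a minimal sketch with synthetic numbers, not real drive measurements.

```python
# Sketch: compare read vs. write latency consistency from a sustained test run.
# The latency samples below are synthetic placeholders, not real drive data.
import statistics

read_latencies_us  = [90, 95, 100, 105, 110, 120]     # microseconds, per I/O
write_latencies_us = [80, 85, 90, 95, 100, 3000]      # one long write outlier

for name, samples in [("read", read_latencies_us), ("write", write_latencies_us)]:
    median = statistics.median(samples)
    worst = max(samples)
    print(f"{name}: median {median} us, worst {worst} us "
          f"({worst / median:.1f}x the median)")
```

A drive that looks fine at the median but has a long write tail will feel inconsistent under a balanced or write-heavy workload, which is exactly the behavior the consistency charts above are meant to expose.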

I recommend taking a look at Frank Denneman's current blog series, which goes into some other aspects of flash performance with FVP.

 


The First Flash Hypervisor

It's now official: the world has its first Flash Hypervisor. PernixData has created a transformative technology that will have a resounding effect on the future datacenter.

PernixData FVP 1.0 ships today, the first release of what will become omnipresent in the virtualized world. The growth of virtualization has created a need to accelerate today's applications and allow businesses to continue taking advantage of virtualization. There is only one complete solution on the market that addresses this need and takes your datacenter to the next level!

It's the Flash Hypervisor layer in the virtualization stack that will become ubiquitous, because of its ability to scale, accelerate, and manage the world's modern workloads. Check out the details in our latest datasheet.

So, join the revolution and download the 60-day trial today!

PernixData - Solving the I/O Bottleneck

As some of you already know, PernixData came out of stealth yesterday. I have been eagerly waiting for this moment to share how I think Poojan Kumar (CEO), Satyam Vaghani (CTO), and their great team plan to take the lead in a new market opportunity.

I have had the privilege and opportunity of testing the Flash Virtualization Platform from PernixData, and I can honestly tell you that it works, and it works well. The technology is truly revolutionary, and I plan to post several times over the coming weeks and months about this new innovation.

The best way to describe FVP, the Flash Virtualization Platform, is to look at it from a data tier perspective rather than as just a caching solution. It's easy to simply call it a caching solution, because there isn't anything else like it. It's the breadth of this new platform that masters I/O workloads and commoditizes the use of flash in compute.

 

The I/O bottleneck between storage and compute has hampered the industry for some time. This started to change when VMware released its VAAI APIs, but adoption was slow and expensive. Purchasing additional arrays was not the answer from a financial or technological perspective. There really needed to be a new technology to bring everything together. This is where PernixData comes into play, solving the scale and performance problems that have plagued many in the industry. This is done without vendor lock-in or architectural changes to the datacenter, saving organizations thousands of dollars.

To join the beta program, send an email request to beta@pernixdata.com

Congratulations to PernixData for creating a solution for the SMB and enterprise markets that solves a known virtualization/cloud problem.