New Book Released

I’m proud to announce that I am a co-author of a new book that has just been released. It’s available now for download, with a print edition to follow. Frank Denneman, the main organizer and author for this project, announced the official release on his blog. This is not only my first published book, but also my first as a main co-author. It was exciting and challenging at the same time. I can now, at some level, appreciate those who have tackled such a feat, as it wasn’t the easiest thing I have ever done! :)

A 300-page book about the many architectural decisions involved in designing for storage performance is not for the faint of heart. Our focus was to articulate a deeper technical understanding of the software layers that intersect and interact with the differing hardware designs.

I would like to thank Frank for giving me the opportunity to work with him on this project. It was truly a delight and a rewarding experience!

You can directly download the book here. Enjoy!

PernixData FVP Hit Rate Explained

I assume most of you know that PernixData FVP provides a clustered solution to accelerate read and write I/O. In light of this, I have received several questions about what the “Hit Rate” signifies in our UI. Since we commit every write to server-side flash, writes would obviously have a 100% hit rate. This is one reason why I refrain from calling our software a write caching solution!

However, the hit rate graph in PernixData FVP, as seen below, only references the read hit rate. In other words, every time a read can be served from a block of data on the server-side flash device, it’s deemed a hit. If a read request cannot be served from the local flash device, the block has to be retrieved from the storage array, and that read is not registered as a hit in the graph. We do, however, copy that block into flash, so the next time it is requested it is seen as a hit.
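To make the counting rule concrete, here is a minimal Python sketch of how a read-only hit rate could be tallied. This is not FVP code; the class and method names are hypothetical, and treating a written block as resident on flash for later reads is my own simplifying assumption.

```python
class ReadHitRateModel:
    """Toy model of a read hit rate counter (hypothetical, not FVP's implementation)."""

    def __init__(self):
        self.flash_blocks = set()  # blocks currently held on server-side flash
        self.read_hits = 0
        self.read_misses = 0

    def read(self, block):
        """Serve a read and return where it was served from."""
        if block in self.flash_blocks:
            self.read_hits += 1
            return "flash"
        # Miss: fetch from the storage array, then copy the block into flash
        # so the next read of the same block counts as a hit.
        self.read_misses += 1
        self.flash_blocks.add(block)
        return "array"

    def write(self, block):
        """Commit a write to flash. Writes never touch the hit/miss counters."""
        # Assumption: a written block is then available on flash for reads.
        self.flash_blocks.add(block)

    def hit_rate(self):
        """Fraction of reads served from flash (0.0 if there were no reads)."""
        total = self.read_hits + self.read_misses
        return self.read_hits / total if total else 0.0
```

In this model, reading the same block twice gives one miss followed by one hit, so the hit rate is 50%, no matter how many writes happen in between.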

Keep in mind that a low hit rate doesn’t necessarily mean that you are not getting a performance increase. For example, if you have a workload in “Write Back” mode with a low hit rate, this could mean that the workload has a heavy write I/O profile. So even though you may have a low hit rate, all writes are still being accelerated because they are all acknowledged from the local flash device.

Where are you measuring your storage latency?

I often hear from vendors and from virtualization and storage admins about where they see storage latency in a particular virtualized environment. The interesting part is the wide disparity between what is communicated and what is realized.

If storage latency is an important part of how you measure performance in your environment, then where you measure latency really matters. If you think about it, the VM latency is the end result of the realized storage latency. The problem is that everyone has a different tool or place where they measure latency. If you look at the latency at the storage array, you are only really seeing the latency at the controller and array level. This doesn’t always include the latency experienced on the network or in the virtualized stack.

What you really need is visibility into the entire I/O path to see the effective latency of the VM. It’s the realized latency at the VM level that is the end result and what the user or admin experiences. It can be dangerous to focus your attention on only one part of the latency in the stack and then base decisions on an assumption about the latency the application sees.

To solve this problem, PernixData has provided visibility into what the VM is observing, and since FVP is a read/write acceleration tier, it can also show a breakdown of latency with regard to read/write acknowledgements.

As an example, using the new zoom function in FVP 1.5, I can see the latency breakdown for a particular SQL VM with Write Back enabled.



As you can see in this graph, the “Datastore” on the array had a latency spike of 7.45 ms, while the “Local Flash” on the host is at 0.25 ms (250 microseconds). The “VM Observed” latency is what the actual VM is seeing, and thus you have a realized latency of 0.30 ms (300 microseconds)! The small difference between Local Flash latency and VM Observed latency can be due to system operations such as flash device population, as well as whether write redundancy is enabled.
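For anyone who wants to check the arithmetic, here is a small Python sketch that converts the three values from the graph to microseconds and compares them. The numbers are the ones quoted above; the function and variable names are only illustrative.

```python
def ms_to_us(ms):
    """Convert milliseconds to whole microseconds."""
    return round(ms * 1000)

# Values read off the FVP graph discussed above.
datastore_us = ms_to_us(7.45)     # latency at the array during the spike
local_flash_us = ms_to_us(0.25)   # latency of the flash device on the host
vm_observed_us = ms_to_us(0.30)   # latency the VM actually experiences

# Gap between what the flash device delivers and what the VM sees
# (flash population, write redundancy, and similar system operations).
overhead_us = vm_observed_us - local_flash_us

# How much lower the VM observed latency is than the datastore latency.
ratio = datastore_us / vm_observed_us

print(f"Datastore:   {datastore_us} us")
print(f"Local flash: {local_flash_us} us")
print(f"VM observed: {vm_observed_us} us (overhead {overhead_us} us)")
print(f"Datastore latency is about {ratio:.1f}x the VM observed latency")
```

The point of the comparison is that the array-side number alone would tell a very different story from what the VM experiences.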

To see this from a read/write perspective, you can also go to the "Custom Breakdown" menu and choose "Read" and "Write" to see the "VM Observed" latency broken down into reads and writes. 


As you can see, the latency for this application came from writes, not reads, and since this VM is in Write Back mode, we are seeing a realized latency of 0.44 ms (440 microseconds) for the committed acknowledgment back to the application!

This is obviously not the only way to determine the actual latency for your application, but what is unique is that PernixData is not creating yet another latency silo. In other words, there are plenty of storage products on the market that give a great view into their own perfect world of latency, but that view is isolated and is not the full picture of what is observed where it matters in your virtualized datacenter.


ESXi Runs in Memory - Boot Options

I hope the title of this post doesn’t surprise you! This is sometimes a forgotten aspect of ESXi’s design when choosing your boot options. I have increasingly been talking with VMware admins who are deciding to mirror their local drives for ESXi. This seems to be a common design on blade architectures as well, where the two open drive bays are used for mirrored ESXi boot images.

The question to ask is, why do this? If ESXi runs entirely in memory, what benefit do you get from mirroring two drives? Yes, you do have another copy of the image in case of corruption, but wouldn’t it be easier and less wasteful to just store a copy of the image on removable media or use Image Builder for resiliency?

Most server architectures now include internal SD card or USB flash slots on which to install ESXi, and there is of course VMware’s Auto Deploy! Using one of these methods for ESXi boot will not only save resources but will also open up more opportunities for using new technology.

There are many examples of converged storage architectures that would require you to use all available drive bays to maximize capacity usage. Then there is also the use of server-side flash technologies, like PernixData FVP. Having multiple options for local flash will provide more possibilities when you want to create tiers of flash for your differing workloads.

The point of this post is to illustrate that you don’t have to mirror ESXi for fault tolerance. There are many other ways to protect your image, so why waste resources on something that could hinder the growth of your virtualized datacenter?

For added reading pleasure, here is a link to some entertaining conversations about installing ESXi on local disk or local USB.

Server-Side Flash Presentation

At VMworld 2013 in San Francisco, I recorded a session at the vBrownBag Tech Talks. There were some technical difficulties during the process and so I thought I would re-record the same talk so that it would be easier to hear and see the presentation. 

This presentation is intended to illustrate why the storage fabric cannot be overlooked when designing for storage performance and why server-side flash with PernixData completely solves I/O bottlenecks within the virtualized datacenter.

I welcome your questions or feedback. 



Features of an Enterprise SSD

When looking for a flash device to use for PernixData FVP or other enterprise use cases, performance and reliability are important aspects to factor in. Just because a drive is spec’d with high IOPS and low latency numbers doesn’t mean that it will keep up that rate over time with enterprise workloads.

I would guess that most of you would prefer a consistently performing, reliable flash device over higher IOPS or lower latency. This is one reason why I like the Intel S3700 SSD. This drive does a good job of delivering repeatable results and withstands heavy workloads over time. I’m not saying this drive or others like it are slow. These drives are still very fast, but they favor consistency and reliability by design.


A little over a year ago, Intel introduced a technology that enhanced the reliability of MLC flash. Intel called it HET (High Endurance Technology). This is basically an enhancement in firmware, controller and high-cycling NAND for endurance and performance. The optimization was in error avoidance techniques and write amplification reduction algorithms. The result is new enterprise SSDs that are inexpensive and deliver good performance with predictable behavior. Keep in mind, though, that not all Intel drives have HET; this is what separates consumer from enterprise class drives.

This is one reason why Intel can claim “10 full drive writes per day over the 5-year life of the drive”. You will also notice that other manufacturers and vendors OEM and incorporate Intel’s 25nm MLC HET NAND into their products. The incorporation of HET sets Intel apart from the rest, but this doesn’t mean that there are no others to choose from. It’s the combination of price, reliability, performance, and customer satisfaction that currently leads many to the S3700.
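To put the “10 full drive writes per day” claim into perspective, here is a quick back-of-the-envelope calculation in Python. The 400 GB capacity is only an assumed example size, the function name is hypothetical, and I use decimal units (1 TB = 1000 GB) with 365-day years.

```python
def lifetime_writes_tb(capacity_gb, drive_writes_per_day=10, years=5, days_per_year=365):
    """Total data written over the drive's life, in TB (decimal units)."""
    total_gb = capacity_gb * drive_writes_per_day * days_per_year * years
    return total_gb / 1000

# Assumed example: a 400 GB drive at 10 drive writes per day for 5 years.
example_capacity_gb = 400
total_tb = lifetime_writes_tb(example_capacity_gb)
print(f"{example_capacity_gb} GB drive: {total_tb:.0f} TB written over 5 years")
```

For the assumed 400 GB drive that works out to 7,300 TB, or 7.3 PB, of writes over five years, which gives a feel for the kind of sustained write load this endurance rating implies.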

The other important aspect to consider when looking for an enterprise SSD is read/write performance consistency. Some drives are architected just for read performance consistency. So if you have workloads that are balanced between reads and writes, or are write heavy, then you want to look at a drive that provides consistency for both read and write.

As an example, the Intel S3500 gives better read performance consistency, while the Intel S3700 gives consistency for both read and write. (Keep in mind that the Intel S3500 doesn't use HET.)


Intel S3500 


Intel S3700


I recommend taking a look at Frank Denneman's current blog series, which goes into some other aspects of flash performance with FVP.