Give Me Back My Capacity

Last week I was preaching the PernixData message in Tampa, Florida! While there I received a question about a benefit of PernixData that I believe is often overlooked when deploying it in a virtualized environment.

The question related to how PernixData FVP can give you more capacity from your already deployed storage infrastructure. There are actually several ways that FVP can give you more capacity for your workloads, but today I will focus on two examples. In order to understand how FVP makes this possible, it's important to understand how writes are accelerated. FVP intercepts all writes from a given workload and commits each write to local server-side flash for fast acknowledgement. This takes a gigantic load off the storage array, since all write I/O is committed first to server-side flash. It's this new performance design that allows you to regain some of the storage capacity you have lost to I/O performance architectures that sit just too far from compute!
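
Here is a minimal Python sketch of the general write-back idea: acknowledge on local flash, destage to the array in the background. The class, API, and latency numbers are my own hypothetical toy model for illustration, not FVP's actual implementation.

```python
from collections import deque

# Hypothetical latencies for illustration only (not measured FVP numbers)
FLASH_WRITE_MS = 0.2   # commit to local server-side flash
ARRAY_WRITE_MS = 15.0  # round trip to the backing storage array

class WriteBackTier:
    """Toy model: acknowledge writes from flash, destage them to the array later."""

    def __init__(self):
        self.destage_queue = deque()  # committed to flash, not yet on the array

    def write(self, block_id, data):
        # Commit to local flash and acknowledge immediately.
        self.destage_queue.append((block_id, data))
        return FLASH_WRITE_MS         # latency the workload actually sees

    def destage(self):
        # Background task: flush pending writes to the backing array.
        total_ms = 0.0
        while self.destage_queue:
            self.destage_queue.popleft()
            total_ms += ARRAY_WRITE_MS  # this cost is paid off the VM's I/O path
        return total_ms

tier = WriteBackTier()
print("Workload-visible write latency:", tier.write(42, b"payload"), "ms")  # 0.2 ms
print("Array work deferred to destaging:", tier.destage(), "ms")            # 15.0 ms, paid later
```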

If you are “short stroking” your drives, there is no longer any need to waste that space; use FVP to get even better performance without the huge costs associated with short stroking. Another example is when you have chosen RAID 10 (also known as RAID 1+0) in order to increase performance through block striping and provide redundancy through block mirroring. Why not get up to 50% of your capacity back by moving to RAID 6 or RAID 5 for redundancy and using FVP as the performance tier? As you can see, this opens up a lot of other possibilities and allows you to save money on disk while gaining additional capacity for future growth.

Try this RAID calculator and see how much capacity you can get back when using an alternate RAID option with FVP! 
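
If you don't have a calculator handy, the back-of-the-envelope math is easy to sketch in a few lines of Python. The drive count and sizes below are made-up examples:

```python
def usable_tb(disks, disk_tb, raid):
    """Rough usable capacity for a single RAID group (ignores hot spares and formatting overhead)."""
    if raid == "10":
        return disks * disk_tb / 2    # half the disks hold mirror copies
    if raid == "5":
        return (disks - 1) * disk_tb  # one disk's worth of parity
    if raid == "6":
        return (disks - 2) * disk_tb  # two disks' worth of parity
    raise ValueError("unsupported RAID level")

disks, disk_tb = 24, 2  # hypothetical example: 24 x 2 TB drives
for level in ("10", "5", "6"):
    print(f"RAID {level}: {usable_tb(disks, disk_tb, level)} TB usable")

# RAID 10: 24 TB usable
# RAID 5:  46 TB usable  -> ~92% more capacity from the same spindles
# RAID 6:  44 TB usable  -> ~83% more capacity from the same spindles
```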

PernixData FVP & StorMagic SvSAN Use Case

In continuing to look at alternate ways to provide a capacity layer with a good ROI alongside PernixData FVP, Frank Denneman and I will be doing a couple of posts on some unique designs with FVP. As I demonstrated in a previous post, FVP accelerates reads and writes for virtual workloads, while a virtual storage appliance (VSA) can be a great technology for providing the primary storage and data services for those workloads.

With this post, I will focus on StorMagic and their iSCSI-based VSA product named SvSAN. A couple of interesting notes about SvSAN might actually surprise you! StorMagic claims to have one of the largest deployments of any VSA in the market; in 2013 alone they saw over 800 percent growth! They are also currently the only VSA that can start with two nodes without needing a local third host for a quorum response during host isolation situations. (More on this later.)

A few interesting features:

- vCenter plugin to manage all VSAs from a central point
- Multi-site support (ROBO/edge: remote office / branch office / enterprise edge)
- Active/active mirroring
- Unlimited storage and nodes per cluster

I think SvSAN and FVP combined can provide a great ROI for many environments. In order to demonstrate this, we need to go a little deeper into where each of these technologies fits into the virtualized stack.

Architecture:

SvSAN is deployed on a per-host basis as a VSA, while PernixData FVP is deployed as a kernel module extension to ESXi on each host. This means the two architectures do not conflict from an I/O path standpoint. The FVP module extension is installed on every host in the vSphere cluster, while SvSAN only needs to be installed on the hosts that have local storage. Hosts that don't have local storage can still participate in FVP's acceleration tier and access SvSAN's shared local storage presented from the other hosts via iSCSI.

Once both products have been fully deployed in the environment, it's important to understand how I/O is passed from FVP to SvSAN. I have drawn a simple diagram to illustrate this process.

You will notice that the only real difference from a traditional storage array design with FVP is that you are now able to use local disks in the host. SvSAN presents its storage as an iSCSI target, so the I/O passes through the local VSA to reach the local disks. Since virtual appliances have some overhead in processing I/O, it becomes advantageous in such a design to include PernixData FVP as the acceleration tier. This means that only unreferenced blocks need to be retrieved from SvSAN storage, while all other active blocks are served from FVP's local flash device. This takes a huge I/O load off of SvSAN and also provides lower latency to the application.
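
As a rough illustration of why that matters, effective read latency is just a weighted average of the flash hit latency and the VSA/disk miss latency. The figures below are illustrative assumptions, not measurements of FVP or SvSAN:

```python
def effective_read_latency_ms(hit_ratio, flash_ms=0.2, vsa_disk_ms=12.0):
    """Weighted-average read latency for a host-side flash tier in front of a VSA.
    flash_ms and vsa_disk_ms are illustrative guesses, not measured values."""
    return hit_ratio * flash_ms + (1 - hit_ratio) * vsa_disk_ms

for hit in (0.0, 0.5, 0.8, 0.95):
    print(f"{hit:>4.0%} hit ratio -> {effective_read_latency_ms(hit):5.2f} ms average read latency")

# 0% -> 12.00 ms, 50% -> 6.10 ms, 80% -> 2.56 ms, 95% -> 0.79 ms
```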

Fault Tolerance:

When any product sits in the data path, it becomes very important to provide fault tolerance and high availability for the workloads it serves. SvSAN provides data fault tolerance and high availability by creating a datastore mirror between two SvSAN VSA hosts.

This means that if a host goes down or its local storage fails, a VM can continue operating because SvSAN automatically switches the local iSCSI connection to the mirrored host, where a consistent duplicate of the data resides.

The mirroring is done synchronously and guarantees data acknowledgement on both sides of the mirror. I think the really cool part is that SvSAN can access either side of the mirror at any time without disrupting operations, even during FVP performance acceleration! The fault tolerance built into FVP is designed to protect writes that have been committed and acknowledged on local/remote flash but haven't yet been destaged to the SvSAN layer. Once FVP has destaged the required writes to SvSAN, SvSAN's mirrored datastore protection becomes relevant to the design.

Centralized Management in an Edge Environment:

As noted before, SvSAN requires only two hosts at the site, even for maintaining quorum during host isolation situations where a host or its local storage is lost. This is accomplished through a separate service (NSH – Neutral Storage Host) that can be installed in a central location on either a physical or virtual machine. It's this centralization of the quorum service that can alleviate additional localized costs and management overhead. Like FVP, SvSAN can be managed from a vCenter plugin for centralized management. This means one can manage hundreds of enterprise edge sites for primary storage, while also providing centralized FVP management for each performance cluster using SvSAN. This is illustrated in the diagram below.
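
To show why that remote witness matters in a two-node mirror, here is a simplified sketch of majority-based quorum. This is my own illustration of the general technique, not StorMagic's actual NSH logic:

```python
def has_quorum(votes_seen, total_voters=3):
    """A node may keep serving the mirror only if it can see a strict majority of voters."""
    return votes_seen > total_voters // 2

# Two mirror nodes plus one remote Neutral-Storage-Host-style witness = 3 voters.
# During a partition, only the side that can still reach the witness keeps serving.
node_a_sees = 2   # itself + the witness
node_b_sees = 1   # only itself (isolated)

print("Node A keeps the datastore online:", has_quorum(node_a_sees))  # True
print("Node B stops to avoid split-brain:", has_quorum(node_b_sees))  # False
```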

It's the low acquisition cost and simple management that have made VSAs popular in ROBO-type environments. This can be great for primary storage at the enterprise edge, but maybe not so great for applications needing increased localized performance. The options for achieving a high-performing, cost-effective storage solution for a virtualized remote environment have been limited in the past. It wasn't until PernixData FVP that there was a solution where you can use inexpensive primary storage, like a VSA, and also have a read/write performance tier that provides extremely low latency to the applications. The amazing part is that all of this is accomplished through software and not another physical box.

This post was just meant to be an introduction and a high-level look at using StorMagic's VSA technology alongside PernixData FVP. I hope to go much deeper technically into how each of these technologies works together in future posts.

This is a simple diagram showing centralized management with FVP and SvSAN in a single 2-host edge site. 

Asking the Right Questions - New Storage Design

In the era of Software Defined Storage (SDS), the right questions need to be asked when additional storage performance is required. The growth of virtualization in the enterprise has had a tremendous impact on storage performance and finances. It's arguable that the biggest line item in most IT budgets is storage purchased for performance reasons. In my opinion this has been a huge driver of the SDS movement today. The problem is that there are so many answers or solutions to what SDS is or isn't that it only confuses the market and delays the true benefits of such a solution. It is with this in mind that I believe we are on the cusp of something transformative.

There are several facets of SDS that could be discussed, but the focus of this post is the decoupling of storage performance from capacity.

It's not uncommon to hear that storage performance is the biggest bottleneck for virtualization. This is one reason why we pay big money for new arrays that promise a fix to this common problem. The reality is that it's just a patch on the underlying design and architecture. In fact, it has gotten so bad that some consumers have become blasé and a new normal has emerged. I often hear from virtual admins that they don't have storage performance problems. It's not until I dig deeper that their read/write latency turns out to be something of a surprise. An average of 15-20 milliseconds of latency with spikes of more than 50 milliseconds is the reality! How in the world did we get to this new normal? I personally believe we got here because it has been cost prohibitive for many to do anything different, and until recently there hasn't been a completely new architecture on the market to answer storage performance problems once and for all.
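
To put those numbers in perspective, here is a quick back-of-the-envelope calculation of what average latency does to an application issuing one outstanding I/O at a time. The 20 ms and 50 ms figures come from the paragraph above; the 0.2 ms flash figure is my own illustrative assumption:

```python
def max_iops_single_stream(latency_ms):
    """With one outstanding I/O, throughput is capped at 1 / latency."""
    return 1000.0 / latency_ms

for label, latency_ms in [("20 ms 'new normal'", 20.0),
                          ("50 ms spike", 50.0),
                          ("0.2 ms server-side flash", 0.2)]:
    print(f"{label}: ~{max_iops_single_stream(latency_ms):,.0f} IOPS per thread")

# ~50, ~20, and ~5,000 IOPS per thread respectively
```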

One analogy for this could be the smartphone phenomenon. Have you ever noticed how slow a previous smartphone generation seems when you pick it up after using your latest smartphone? It's very easy to become accustomed to something and have no idea what you're missing. With that in mind, we need to recognize what the new normal should be (microsecond latency) and understand what is possible!

Let's break down the three areas that make up the framework in which we consume storage today with regard to virtualization.

Characterized Attributes:

- Performance = Read/write acceleration for I/O
- Data Services = Replication, dedupe, snapshots, tiering, automation, management, etc.
- Capacity = Data-at-rest, storage pool, disk size, disk type

Looking at each of these three areas that make up a typical storage array, where do you spend the most money today? What if you could separate them from each other? What possibilities could emerge?

As you might expect, it's the decoupling of performance from capacity that brings the biggest ROI, and not surprisingly it is also the most difficult separation to achieve. It's this separation that allows us to move the performance of disk I/O to the compute cluster, close to the application. This means write acknowledgements happen very quickly, and low latency can be achieved by leveraging local flash devices in the hosts as a new clustered data-in-motion tier – aka PernixData FVP!

This new design eliminates the need to purchase storage just for performance reasons. It opens up a lot of other vendors and possibilities to choose from. Do you really need to purchase expensive storage now? Does the promise of SDS and commodity storage now become a reality? Do you really need to purchase a hybrid or all-flash array? Doesn't this mean that cheap rotating SATA is all I need for capacity's sake? If the array provides the needed data services on top of the required capacity, what else do I need to accomplish a true scale-out architecture for this new design? These are all important questions to ask in this new era of storage disruption.

If all performance needs can now be accomplished and realized from the host cluster, then I now have the ability to achieve the 90-100% virtualized datacenter. This is a realization that often happens once this new design has had time to sink in. So, I challenge each of you to investigate not only how this can make you a hero in your environment, but also how radically it can cut the time you spend working on storage performance problems!

- Disclaimer: This post is not sponsored or affiliated with PernixData or any other vendor -