Datrium Blanket Encryption

In part 1 of the Datrium Architecture series, I discussed how a split architecture opens up a huge amount of flexibility in modern datacenter designs. In part 2 of this blog series, I will be talking about an industry-first feature that everyone concerned with security in their datacenter will want to take notice of.


Sure, there are products on the market that provide encryption for data at rest in a storage platform, but as of today there are no converged products that provide government-grade (FIPS 140-2) data encryption end-to-end. Organizations must look for a solution that is FIPS 140-2 validated, not FIPS 140-2 certified. Validated means that NIST has evaluated the encryption scheme. Certified, on the other hand, is technically meaningless and mostly marketing: it may be done in the spirit of NIST's requirements, but it hasn't been validated.

Only with the Datrium DVX software platform is all I/O, from an application/workload perspective, encrypted upon creation using the AES-XTS-256 crypto algorithm in a solution validated for FIPS 140-2 compliance. Using the underutilized AES-NI instruction set built into modern microprocessors, Datrium encrypts data in use and on access in RAM and flash, and in flight when the second "write" is synchronously sent to the data node for block durability. This means your data is encrypted while in use, in flight and at rest, so there is no point in the I/O stack where it sits unencrypted.

There is also no need for SEDs (Self-Encrypting Drives): this implementation is software-based and is included at no added cost to the customer. The savings this brings to customers are huge, since SEDs are expensive to procure and, on top of that, you can't mix differing disk types in most systems today. Relying only on drive-based data-at-rest encryption therefore becomes an all-or-nothing implementation.

Blanket Encryption Use Cases: 

There are many use cases for Datrium's blanket encryption. The obvious ones are:

1) Drive or part replacement. 
2) Prevent network sniffing of I/O traffic. 
3) Rogue processes that tap into host memory. 
4) System theft. 
5) HIPAA & SLA compliance.

Today, Datrium uses an internal key management system for easy setup and management. With this, we support password rotation and both locked and unlocked startup modes. In full disclosure, you can also be assured that encryption keys are not stored in swap or in a core dump of the Datrium system.

Another cool feature is the shipping mode option, where the key is not stored persistently anywhere in the platform, so there is no risk of a data breach while the DVX platform is in transit. When the system is powered up in this locked mode, the administrator must provide the encryption password before the system will serve any data again.
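
To make the locked-mode idea concrete, here is a minimal Python sketch of a store that never persists its key and only serves data after the right password is supplied. The post does not describe how Datrium derives or checks its keys, so the class name, the PBKDF2 derivation, the iteration count and the verifier scheme are all assumptions made for illustration.

```python
import hashlib
import hmac
import os


class LockedStore:
    """Toy model of a 'startup locked' datastore. Hypothetical, not Datrium's code."""

    ITERATIONS = 100_000  # assumed work factor, chosen only for this sketch

    def __init__(self, password: str):
        # Only a random salt and a verifier are kept; the key itself is never stored.
        self.salt = os.urandom(16)
        key = self._derive(password)
        self.verifier = hashlib.sha256(b"verify" + key).digest()
        self._key = None  # lives in memory only, so it is gone after power-off

    def _derive(self, password: str) -> bytes:
        # PBKDF2-HMAC-SHA256 stretches the password into a 256-bit key.
        return hashlib.pbkdf2_hmac(
            "sha256", password.encode(), self.salt, self.ITERATIONS
        )

    def unlock(self, password: str) -> bool:
        key = self._derive(password)
        candidate = hashlib.sha256(b"verify" + key).digest()
        if hmac.compare_digest(candidate, self.verifier):
            self._key = key
            return True
        return False

    def can_serve_data(self) -> bool:
        return self._key is not None
```

A freshly constructed LockedStore plays the role of a system that has just arrived on site: can_serve_data() stays False until unlock() is called with the correct password.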

Enabling encryption is extremely easy to do on any Datrium DVX system. Just issue the command "datastore encryption set --fips-mode validated", which enables the FIPS 140-2 validated mode for your data. To verify, issue the show command: "datastore encryption show"
You can also verify in the DVX dashboard under durable capacity, where the green shield is. This will show that encryption is enabled with FIPS 140-2 compliance.

Now some may ask, "But wait, doesn't enabling encryption on Datrium mean that data reduction like dedupe and compression goes away?" Remember that Datrium implemented an always-on system when it comes to data reduction. In doing so, Datrium became the first converged platform to offer FIPS 140-2 validation without sacrificing data reduction through compression, dedupe or erasure coding.
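
Why ordering matters here can be shown with a small Python sketch. It assumes data reduction runs on plaintext before encryption, which is the general technique and not a confirmed description of the DVX pipeline. The toy_encrypt function is a stand-in keystream cipher built from SHA-256; it is not AES-XTS and is not secure.

```python
import hashlib
import os
import zlib


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Stand-in cipher: XOR with a SHA-256 counter keystream. NOT AES-XTS."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def reduce_then_encrypt(blocks, key):
    """Dedupe by content hash, compress each unique block, then encrypt it."""
    store = {}
    for block in blocks:
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint not in store:  # dedupe happens on plaintext
            store[fingerprint] = toy_encrypt(key, zlib.compress(block))
    return store


key = os.urandom(32)
block = b"the same log line repeated\n" * 150  # about 4 KB, very compressible
blocks = [block] * 10                          # ten identical copies

reduced = reduce_then_encrypt(blocks, key)
stored_bytes = sum(len(v) for v in reduced.values())

# Encrypting first turns the block into noise that compression cannot shrink.
encrypted_first = zlib.compress(toy_encrypt(key, block))
```

With reduction first, the ten copies collapse into one small encrypted block. With encryption first, the compressor sees random-looking bytes and saves nothing, which is the trade-off people expect when they ask the question above.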

I'm blown away that Datrium has not only done the right thing in offering FIPS 140-2 validation out of the gate, but has done so without sacrificing performance or any of the data reduction technology that customers love us for.

Additional reading: Datrium Blanket Encryption Whitepaper

My next post in this series will be about Datrium's Global Data Efficiency.


Datrium Architecture - Differentiator Series

Since coming to Datrium over a month ago, I have been amazed at the level of talent and product feature advancement that has been orchestrated in such a short time. This has kick started me to write a series of blog posts (in no particular order) on why the Datrium architecture makes so much sense for the modern datacenter. I will be discussing key differentiators and features that make up some of my favorite parts of the Datrium DVX software platform.  

Part 1: Split Architecture

One of the benefits of open convergence is that you are no longer tied to the traditional two-controller storage system, where scale is solely based on available storage and network resources. Datrium’s flexibility goes even beyond the traditional HCI stack (where compute and data operations share resources in the same appliance). By decoupling performance and data services from capacity, a whole new world of possibilities is introduced in the separation of compute and data storage.

It’s with this that Datrium’s split architecture pioneers a new breed of datacenter designs where flexibility is at the heart of what makes the DVX platform so remarkable. For example, you can...

•    Use commodity-based Flash resources in the host/server where it’s less expensive and more flexible for I/O processing. 
•    Use your own x86 hosts/servers with ESXi, RHV, or Docker as a platform of choice and flexibility for growth.
•    Upgrade performance easily with the latest and greatest flash technology in the hosts/servers. 
•    Lower East/West chatter and remove the vast majority of storage traffic from the network where continued scale can be realized and application performance isolation provides true data locality. 
•    Scale Compute and Storage truly independently.
•    Take advantage of under-utilized CPU Cores for IO processing and data services (Compression, Dedupe, Erasure Coding, Encryption) on your hosts/servers.
•    Utilize stateless hosts while still achieving data integrity and fault and performance isolation between compute and data. No quorum is needed, and the minimum number of hosts/servers is only one.
•    Get Secondary Storage in the same platform for a lower TCO.
•    Gain multiple storage controllers by virtue of every host/server introduced into the DVX platform.

Storage Operations in a Split Architecture:

One example of these benefits in a split architecture is realized when disks die or disk errors (bitrot or latent sector errors) occur on a primary or secondary storage system. A rebuild operation is the process needed to return any data system to a healthy, fault-tolerant state. Every storage system needs a process to rebuild data when corruption occurs; however, rebuilds can have side effects. One of the most common is the resource utilization needed to complete the rebuild successfully. During such an operation, your primary workloads could be slowed and hampered by higher latency.
Data rebuilds will result in lower performance, and the time to finish a rebuild can be lengthy depending on resource availability, architecture and how much data must be rebuilt to reach a healthy state again.
Another common storage operation is the rebalancing of data when additional capacity or nodes are added to a storage system. Rebalancing tasks can drain system resources, depending on the amount of data that needs to be rebalanced. This sometimes time-consuming task is important to keep the pool of data available and to avoid hot spots.

Datrium’s architecture is based on a distributed log-structured filesystem. This filesystem provides a platform where workloads are not hindered by decreased performance during rebuilds or other system operations. The decoupling of performance and data services from capacity has given customers the freedom to finally realize the benefits of a true cloud-like experience right in their own datacenter (more on this in a future post). By moving I/O operations to the hosts and keeping data at rest in what we call a data node, we achieve the best I/O latency for applications while keeping data safe and resilient on durable/secondary storage. So, when problems or system operations occur on durable storage, our intelligent DVX software uses underused CPU cores on the hosts for these operations without stealing any CPU resources from running workloads. You can think of each of your hosts/servers as a separate storage controller, all working together to facilitate storage operations and system processes.

The more hosts/nodes one adds to the DVX system, the faster rebuild and/or rebalance operations complete. During reconstruction of data, multiple hosts/servers can help with system operations. No host-to-host communication is needed. Each host has its own task and operation to facilitate a faster, more efficient rebuild/rebalance on the Datrium data nodes.
One very cool example of the intelligence in the DVX software is the built-in QoS during a rebuild operation on the data node(s). This is based on the number of failures or the number of disks to rebuild. For example, if only one disk fails, then fewer resources from the hosts are needed for the rebuild operation. This is a dynamic process that matches the level of urgency to smaller or larger failures and errors.
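
The two ideas above, more hosts shortening a rebuild and the rebuild share growing with the number of failures, can be put into a back-of-the-envelope Python model. Everything numeric here is hypothetical: the 10% step, the 50% ceiling, the 1 TB per hour per host and the linear scaling are assumptions for illustration, not DVX internals.

```python
def rebuild_share(failed_disks: int, max_share: float = 0.5,
                  step: float = 0.1) -> float:
    """Fraction of spare host CPU given to rebuild work.

    Grows with the number of failed disks up to a ceiling. The step and
    ceiling values are made up for this sketch.
    """
    if failed_disks <= 0:
        return 0.0
    return min(max_share, step * failed_disks)


def rebuild_hours(data_tb: float, hosts: int, failed_disks: int,
                  host_tb_per_hour: float = 1.0) -> float:
    """Estimated rebuild time when each host works on its own slice.

    Assumes hosts contribute independently (no host-to-host coordination),
    so throughput scales linearly with the host count.
    """
    share = rebuild_share(failed_disks)
    if share == 0.0 or hosts <= 0:
        raise ValueError("need at least one host and one failed disk")
    return data_tb / (hosts * host_tb_per_hour * share)
```

Under these assumed numbers, rebuilding 8 TB after a single disk failure takes about 20 hours with 4 hosts and about 10 hours with 8, while a second failed disk doubles the share each host contributes.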

Datrium took the best of traditional 3-tier and HCI architectures into a modern world where performance and system operations are no longer in disagreement. A customer can now utilize open convergence to achieve the performance they never had on-premises and will never get in the cloud. Think of it as future-proofing your datacenter for years to come.

In part two I will discuss the uniqueness of Datrium’s encryption and its powerful capabilities.

vBrownBag Build Day Videos

For those that didn’t get a chance to watch the vBrownBag Build Day at Datrium last week, the recording is now available on YouTube. The vBrownBag crew did a very professional job and our very own Devin Hamilton delivered a stellar performance! If you are on holiday this week, you won't be disappointed if you spend some time and watch this 3-hour video, intermixed with some great interviews and technical discussions.

Just released are additional videos that cover more deep-dive technical discussions from one of our engineers and one of our co-founders. I found them to be very interesting and thought-provoking.
Boris talks about the life of a Datrium DVX I/O while Lakshmi demonstrates the probability of failures in storage systems and how Datrium architects its system for data resiliency. Enjoy!

I/O Architecture of the Datrium DVX - Boris Weissman

The Science of Failure - Lakshmi N. Bairavasundaram

Joining Datrium

It’s with ebullience that I’m joining Datrium as their new Principal Systems Engineer. I’m a firm believer in God leading me to new endeavors and this is no different. He has led me undoubtedly throughout my life and so I know there are great things in store in this new opportunity.

As some of you already know, it’s rare when an awe-inspiring company culture comes together with great technology! It’s with this combination of greatness that I look forward to exemplifying and demonstrating how Datrium’s DVX platform can solve real business problems in the enterprise. There are so many exciting things to share about Datrium, so stay tuned!!

For those that are unfamiliar with Datrium the company, here is a quick snapshot. 

Founded: 2012
Exited Stealth: 7/2015
Location: Sunnyvale, CA
Investors: NEA, Lightspeed (Angel Investors: Diane Greene and Mendel Rosenblum, Frank Slootman, Kai Li, Ed Bugnion)
Funding to Date: $110M

•    Brian Biles, ex Data Domain founder / VP Product Mgmt. 
•    Hugo Patterson, ex Data Domain Original Chief Architect, EMC Fellow
•    Sazzala Reddy, 2nd Data Domain CTO (employee #15) 
•    Ganesh Venkitachalam, ex VMware Principal Engineer
•    Boris Weissman, ex VMware Principal Engineer

Datrium offers a new take on modern convergence called Open Converged Infrastructure. The (DVX) platform supports VMware, Linux KVM and bare metal Docker with host based flash and appliance based durable storage for cost-optimized secondary storage and archive to cloud capabilities. 

My Nutanix .NEXT 2017 Session

For those that have the privilege of attending the Nutanix .NEXT 2017 conference this week, I would like to cordially invite you to my session titled: “When AWS IS Not Forever” this Thursday at 11:35 am. (Cherry Blossom) #AW204

I have the pleasure of presenting along with Paul Harb, Sr. Director of Nutanix Consulting Services. We will be outlining AWS use cases and exposures when making those critical decisions around workload placement. Then we will demonstrate current tools and services available to transition workloads back to your on-prem datacenter when it makes sense. This session is jam-packed, as we will have two demos, a customer validation and something new to showcase!

If you are unable to attend .NEXT this year, make sure to at least sign up to hear the keynotes live. There are some big announcements this year! https://www.nutanix.com/next/live.html

Session Summary:
“As organizations embrace public cloud services, it is critical to maintain the portability of applications and data, which includes the freedom to bring applications back from the public cloud. In this cloud strategy session, we will examine the pertinent technical and business criteria you should take into account when migrating workloads between AWS® and private cloud environments. You’ll also hear about real-world projects where application owners have successfully made the transition back from public clouds. Attendees will walk away with a full understanding of the cloud decision making process, and a project check list for when it’s time to migrate workloads back from AWS®.”

My Next Journey

After reflecting on my career path over the last couple of years, I have come to a decision point where it’s important for me to exercise my passion to its fullest potential. Yes, this does involve a job change. I’m happy to announce that I am joining the Nutanix field team as a Sr. Systems Engineer in Atlanta.

This decision really came down to recognizing the importance of humanizing solutions and technology where a positive impact can be realized. If one humanizes a product or service, it can break down barriers and allow a solution to become more relatable. This to me means being on the front lines, where you hear and see firsthand how technologies can have a true positive impact. So job title, status, salary, how many Twitter followers one has, etc. are not really at the top of the list of importance. Yes, they are all nice to have, but if you are passionate about making a difference then all that stuff really comes by default.

Nutanix is a company that has some of the best and brightest minds delivering a true cloud experience to enterprises. The level of innovation around what it means to have an enterprise cloud and how this brings a hybrid approach to being a positive technology steward for your business is extremely important in my book. I also strongly believe in the importance of dynamic design and creative aesthetics where simplicity provides a template for what matters to the business. This is a pillar that Nutanix started on and has carried throughout its cycles of innovation, which, as some in the industry can attest, isn’t an easy thing to do.

I am also inspired with the Nutanix mantra of staying “Hungry, Humble, and Honest.” The phrase embodies values that I look forward to exemplifying.  I also look forward to working with some of my old PernixData friends and making some new ones along the way. 

I plan to keep this blog updated with useful nuggets and updates as I move forward in my role. 

Azure Stack Ignite Notes

As I noted in yesterday’s post, I have been intrigued by Microsoft’s approach to Azure Stack. I took a lot of notes during sessions and conversations at Ignite and so I decided to list them out in a new post. For those who are unfamiliar with Azure Stack, I would suggest this whitepaper as a starting point.

In a nutshell, Azure Stack is Microsoft’s hyper-converged integrated system for flexible service delivery using Azure-based management and automation. It’s Microsoft’s cloud in a box, using Azure’s framework to ultimately keep a consistent experience. Keep in mind this product is not GA yet, so anything I state below may or may not come to fruition.

At Ignite 2016, Microsoft announced the 2nd technical preview of Azure Stack, with general availability expected in the second half of 2017. I also heard an expectation of TP3 during the 1st quarter of 2017.
Currently Azure Stack offers IaaS but later this month Microsoft plans to release an update where Microsoft’s PaaS “App Service Resource Provider” will be available on Azure Stack TP2. 

•    Built on top of Server 2016 Server Core (Nano Server will be supported in the future). However, TP2 is currently using Server 2012 R2 Server Core.
•    Hyper-Converged Deployment model with a pair of ToR switches and a BMC switch to manage the server integrated stack.
•    Hyper-V is sealed under Azure Stack; the only access to Hyper-V is through the API.
•    Windows Server Failover Clustering is used along with Storage Spaces Direct.
•    Minimum Resource Req.: Dual Socket, 8 cores per socket, 256GB memory.
•    RDMA will be supported - converged NIC: SDN + Storage, or use two 10Gb NICs, with switch-embedded teaming for port link resiliency.
•    Storage Protocol Support: CSVFS with ReFS, SMB3, SMB Direct, SATA, SAS, NVMe
•    Multi-Resilient Volume: writes are mirrored for performance then as the data gets cold it will then write a large chunk to parity for a balanced approach of performance and capacity. 

Caching Capabilities:
•    All writes up to 256 KB are cached.
•    Reads of 64 KB or less are cached on first miss.
•    Reads larger than 64 KB are cached on second miss.
•    Writes are de-staged to HDD in an optimal order.
•    Sequential reads of 32 KB or more are not cached.
•    In an All Flash System, only writes are cached.
•    Minimum requirements: 2 cache devices and 4 capacity devices.
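
The caching rules in this list can be read as a small decision function. The Python sketch below is my own encoding of these notes, not Microsoft code: the function name, its parameters and the order in which the rules are checked (sequential reads are tested before the size-based read rules) are assumptions.

```python
KB = 1024


def should_cache(op: str, size_bytes: int, *, sequential: bool = False,
                 prior_misses: int = 0, all_flash: bool = False) -> bool:
    """Apply the cache-admission rules from the notes to a single I/O.

    prior_misses counts earlier misses for the same data, so a value of 1
    means the current read is the second miss.
    """
    if op == "write":
        return size_bytes <= 256 * KB      # writes up to 256 KB are cached
    if op != "read":
        raise ValueError(f"unknown op: {op!r}")
    if all_flash:
        return False                       # all-flash systems cache writes only
    if sequential and size_bytes >= 32 * KB:
        return False                       # large sequential reads are skipped
    if size_bytes <= 64 * KB:
        return True                        # small reads: cached on first miss
    return prior_misses >= 1               # larger reads: cached on second miss
```

For example, a random 128 KB read is not cached the first time it misses, but is cached when it misses again.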

Integrated Stack:
I heard a lot of people say that they didn’t like the direction of an integrated system requirement for Azure Stack. The belief is that it will lock them into a hardware vendor that may not provide enough flexibility or choice. I attended a few sessions where Microsoft made it clear it had heard this feedback. It was stated that the current plan is to continue to partner with Dell, HPE, and Lenovo to provide the hardware for a complete, certified Azure Stack integrated system. After this, once additional testing and assurance can be done against a defined HCL, Microsoft hopes to offer Azure Stack as a production-ready software solution for a wider array of hardware platforms. I personally had hoped for a software-only solution, but I do see why an integrated system will allow Microsoft to control the experience and deliver on the expected promises. I just hope that the integrated system is not priced out of reach of most customers, as we have seen similar examples of this in the past!

An Integrated System is part of what Microsoft calls a Scale Unit. Each scale unit must consist of homogeneous hardware; however, separate scale units can use heterogeneous hardware. Each scale unit must include at least 4 Azure Stack nodes, with one node dedicated to patching and updates. Each scale unit must also include the items in this list.

•    A pair of ToR Switches for availability and internal networking. 
•    One BMC switch for hardware management
•    An Aggregate switch for external connections
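
As a way of summarizing the scale unit rules, here is a small Python checker. It is only a sketch of the requirements as I noted them at Ignite; the Node class, the function name and the way switches are counted are hypothetical and do not correspond to any Azure Stack tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    hardware_model: str


def validate_scale_unit(nodes, tor_switches, bmc_switches, aggregate_switches):
    """Return a list of rule violations; an empty list means the unit passes."""
    problems = []
    if len(nodes) < 4:
        problems.append("a scale unit needs at least 4 nodes")
    if len({n.hardware_model for n in nodes}) > 1:
        problems.append("hardware within a scale unit must be homogeneous")
    if tor_switches != 2:
        problems.append("expected a pair of ToR switches")
    if bmc_switches != 1:
        problems.append("expected one BMC switch")
    if aggregate_switches < 1:
        problems.append("expected an aggregate switch for external connections")
    return problems
```

Two scale units built from different hardware models would each pass this check on their own, which matches the point that heterogeneity is only allowed across units.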

Each scale unit is part of one region (the same physical location). Multiple scale units can be part of the same region as well, and so are part of the same fault domain.

The cool part is that you can designate your Azure Stack deployment as a region and in the future this could become part of Azure when deploying new workloads with the option of not only designating Azure regions but your own on-premises Azure Stack regions. 

One surprise that I found was that the Azure Stack software framework actually runs in virtual machines on top of Hyper-V. I guess this surprised me because I thought that Azure Stack would have been developed as an application with native hooks into the server operating system. I can certainly understand why they chose this path, but it also makes me wonder about performance in this type of design. This of course can be easily rectified by an optimization product like the one from DataCore Software! :)

Currently authentication goes through Azure, but there are plans to support Facebook, Twitter, MS Account, and AD for authentication and authorization on Azure Stack.

If and when you use Azure Stack App Service you will have the ability to deploy new internal assemblies, like Go, Ruby, Java and even components like Oracle Clients. These are types of things that Azure Stack will have that the Azure public app service won't have support for. 

As you can see there is a lot going on here, but I'm only touching the surface. I hope to provide more interesting notes about the Azure Stack Architecture as I have more time to play with TP2. 



Microsoft Ignite 2016 Experience

Last week I attended Microsoft Ignite 2016 here in Atlanta. It was nice to have this event in my own back yard, as it’s always a bonus to sleep in your own bed at night! This was the first Microsoft Ignite/TechEd I’ve been to in a while, and here are a couple of things that I found motivating…

Here come the Women!!

I have never seen so many women in attendance at an IT convention. I’m not saying I have been to all of them, but I was encouraged to see so many women participating and attending infrastructure, virtualization, and cloud native sessions. Yes, we still had long restroom lines, but it was clear we weren’t alone. I don’t know the percentages, but I wouldn’t be surprised to see that at least 10% of the attendees were women! I find this extremely encouraging as this industry definitely needs a better balance!

The new Microsoft

With 23k-plus attendees at Ignite, it was noticeable that the new Microsoft (under Satya Nadella) has some new faces under the cloud native umbrella. It was also evident that a lot of the old faces of Microsoft’s yesteryear were present. Microsoft knows that it has a lot to do to convince and educate those who are resistant or reluctant to change. For example, there were a few sessions geared toward organizational change and digital transformation to the cloud. The sessions were not only introductory but also educational, in hopes of easing fears of the public cloud.

Since Microsoft has such a wide base of traditional applications and is deeply ingrained in on-premises infrastructures, this obviously creates a unique challenge but also provides an opportunity for Microsoft in the new era of cloud.

Introducing the opportunity: Azure Stack. I believe Azure Stack has the potential to not only put Microsoft on top of cloud revenues but also change the competitive direction of AWS and GCP. However, with longer-than-planned delays and/or technical missteps, Microsoft could leave enough room and time for competitive challengers to enter this new product category.

Stay tuned for my next post on the Azure Stack architecture, and why I think this new product could accelerate the move to cloud native applications. There are some exciting and noteworthy architectural features that I think will surprise you!  

Great job, Microsoft, on a well-organized event this year! I'm looking forward to attending next year in Orlando, FL.

VMworld 2016 Session Vote

This year I decided to make my first submission to the VMworld Call for Papers. I have been wanting to do this for some time, but my time and commitments never aligned. It's been a learning experience, so we will see how it goes! If you have an interest in learning more about migrating workloads to VVols and all the details that go with this, then please vote for this session.

Migrating Workloads to VMware VVols [9059]
Careful planning is needed for successful workload migration to VVol based storage. Depending on the scenario and the datacenter environment it’s important to understand what is expected and required when migrating Virtual Machines in a mixed environment of VMFS and VVols. We will look at the steps and options available in a virtualized heterogeneous storage infrastructure, in addition to available VMware partner solutions.

Tech Evangelism Commentary

Since joining DataCore as Technology Evangelist, I have been hearing many in the community asking me, “What exactly is a Tech Evangelist?” “What does a Tech Evangelist do?” I know every company most likely has a tweak or difference in how they articulate a job description or title. However, I believe there are some commonalities, which I wish to address in today’s post. 

Word Study

It is interesting that many people understand the word, “Evangelist,” to have some religious connection to it. That would be true, as I can attest to this first-hand, since my formative years were spent in a Pastor-Evangelist family. This led me later in my life to pursue a “Theology” degree where I was given additional exposure as to how an Evangelist communicates, listens, teaches, preaches, and relates in such a way as to inspire and influence the audience or Community. 

The word, Ευαγγελιστής, actually originated in ancient Greece prior to biblical times, where it was understood to mean a “reward given to a messenger for good news” or “bringer of good news.” Therefore, I can conclude that the way “Evangelist” is used in the phrase “Technology Evangelist” is laden with many values or connotations that I believe are necessary to bring about many positive benefits and results for a company wanting to educate its community.


Another common belief that I have heard is that a Tech Evangelist is just someone in marketing. In my opinion, for a Tech Evangelist to become a trusted advocate, he/she needs to have one foot in engineering and the other foot in marketing. This helps build a bridge where a deeper technical understanding of a product and/or service can properly be disseminated to the general public. However, this isn’t the only place where a Tech Evangelist roams. It’s common to see a Tech Evangelist support the goals and vision of the Corporate, Support, Sales, Engineering, and Marketing departments. It’s important to understand and interface with each area and apply learnings where the most help can be provided. I like the term “floater”: a good Evangelist is gifted with listening skills, used both internally and externally, to gather and craft creative content so as to tell the story most effectively. To be most successful, I believe that a Tech Evangelist needs to work in an organization, hierarchy, or department that allows for flexibility of thought and experimentation, so as to test the best new ways to communicate the content or message.

I love the lofty Wikipedia definition of a Tech Evangelist:

“A technology evangelist is a person who builds a critical mass of support for a given technology, and then establishes it as a technical standard in a market that is subject to network effects.”

I also believe the core part of the role of a Tech Evangelist is not just educating or communicating to the public, customers, or clients, about a particular technology, but about building an army of believers. One person cannot have a big impact in building critical mass in the industry without an army of supporters all passionately preaching the same message. This is one reason why I think one typically observes startups hiring a Tech Evangelist, or even a large corporation hiring a Tech Evangelist to focus on one product or service in a myriad of products or features. 

Types of Activities

There are many activities that a Tech Evangelist would be involved with; in fact others perform some of the same activities with different titles. The key is to explain how some form of tech works and why it matters rather than just describing a given feature. Below are just a few examples. 

  • Speak at technical conferences
  • Manage external technical awards programs
  • Produce technical blog posts
  • Produce informative technical white papers
  • Lead social media engagements
  • Participate in the industry community
  • Act as company ambassador at technical events
  • Serve as liaison between engineering and external technical experts
  • Drive internal awareness and conversation

In order to do some of these things well, I think there are several important qualities that someone needs to obtain or naturally possess. The list below is by no means exhaustive:

  • Informed about the industry and community around them
  • Passionate about what they are doing
  • Authentic & Humble about what they communicate
  • Listening skills to focus on priorities
  • Speaking & Communication skills to capture the essence of an exposition

If I missed any important areas that you think would add to this list, please feel free to comment or contact me.

I am listing below, in alphabetical order, other noted Tech Evangelists who are focused on the Virtualization or Storage industry. If I have missed someone, please let me know and I will add them to the list.

Update: Since many of you have sent me names to include, I have decided to create a Google Sheet. This sheet will list those that hold the "Tech Evangelist" title and those that don't hold the title but have been nominated and/or conduct activities of a Tech Evangelist in the community. If you find any errors, please let me know; I pulled details from Twitter and LinkedIn. Click Here...

Additional reading:

Official Announcement: New Calling

I’m excited to officially announce that I joined DataCore Software as their Tech Evangelist! This is a position that I’m truly excited about as it aligns with my goals and aspirations while still engaging in technical conversations. I will be working alongside engineering and with many other great individuals at DataCore. I can’t wait to tell you about some of the hidden gems that I have found in my short time already! There is some great innovation happening at DataCore that I think will amaze many people. 

For those who are unfamiliar, DataCore provides software that abstracts storage and server data services. This abstraction provides enhanced performance, availability, storage management and much more. DataCore was the first to develop storage virtualization and has proven to be very successful. They have over 11,000 customers with over 30,000 deployments around the world. Currently DataCore’s SANsymphony-V software product line has seen 10 generations of innovations and hardening to make it truly enterprise worthy. 

There is so much to tell and so much in the works that I will have to save it for another time! 

New Book Released

I’m proud to announce that I am a co-author on a new book just released. It’s available now for download and later via print edition. Frank Denneman announced on his blog the official release, as he was the main organizer and author for this project. This is not only my first published book, but also a first as a main co-author. It was indeed exciting and challenging at the same time. I can now at some level appreciate those that have tackled such a feat, as it wasn’t the easiest thing I have ever done! :) 

A 300-page book that covers the many architectural decisions involved in designing for storage performance is not for the faint of heart. The focus and desire was to articulate a deeper technical understanding of the software layers that intersect and interact with the differing hardware designs.

I would like to thank Frank for giving me the opportunity to work with him on this project. It was truly a delight and a rewarding experience!

You can directly download the book here. Enjoy!

Examining AeroFS

During VMworld 2015, I took time to learn more about AeroFS. This is a company that really caught me off guard, as I was surprised and excited to hear about a free version available with enterprise features. Before I jump into the details, here is a quick company snapshot.

Founded: 2010
Founders: Yuri Sagalov – CEO & Weihan Wang – CTO
Funding: $15.5 million from several private investors and firms, including the likes of Ashton Kutcher
Location: Palo Alto, CA
Customers: 2,000 Organizations – As I understand it even Apple uses them for file sharing! 

AeroFS is a file sharing/syncing software appliance that is deployed and operates locally on-premises. Think of it as your Dropbox, but run totally behind your firewall and without the added monthly expense. I was intrigued because AeroFS gives customers the flexibility to expand their storage without paying additional fees while gaining access to all the enterprise features. 

There is one feature that I think many will like. AeroFS can be deployed in two different ways. One option is to deploy an AeroFS Team Server, where a centralized copy of the data lives on a dedicated server. Think of this as your traditional file server but with modern file sync capabilities. The other option is a decentralized approach, where there is no centralized copy of the data; each copy is synced with its peers when changes are made. With this deployment method, no users will notice if your centralized file server is down for any reason. 

Besides the fact that AeroFS is deployed in your own private cloud, you will be happy to learn that additional security measures are built in. For one, all data transmitted client-to-client and client-to-server (whether on the LAN or on the Internet) is encrypted using AES-256 with 2048-bit RSA. In addition, you can manage your own PKI (public key infrastructure) with AeroFS as the root CA. 
I also found it very useful to manage public links with expiration limits and the options to password protect links. This is in addition to the ability to manage more permanent users in the AeroFS UI with two-factor authentication options. 

I found the installation very straightforward as I downloaded and installed their virtual appliance in my vSphere 6.0 U1 lab. In fact, the hardest part is making sure DNS is set up properly internally and externally. Once you have DNS set up, it's important that you open the proper firewall ports for connectivity from your devices or endpoints. However, this is an area where I think AeroFS could spend some more time. It would help new users if they were provided a blueprint of deployment best practices by use case. For example, are you going to use AeroFS for internal sharing or is it going to be public facing? What should the customer think about concerning DNS, security and redundancy?

After installation, you will have access to an HTML5 management interface and the ability to deploy your endpoints. You will notice in my screenshot that a lot of functionality is built into the Mac client. 

Supported Platforms:

  • OVA – vSphere & VirtualBox
  • QCow2 - OpenStack, EC2
  • VHD – Hyper-V

When it comes to supported devices, Windows/OSX and Apple/Google apps are available! This is another area where there could be some improvement. I only tried the iPhone app, but it was very basic and I would have liked to see some more advanced functionality. It did work as expected and was easy to deploy with just a quick scan of a QR code from my virtual appliance setup screen!

It was also with delight that I noticed they have a preview that you can download to run AeroFS on CoreOS and Docker. You can also use other flavors of Linux by using a short command line bash script to run AeroFS. 

I strongly recommend giving it a try in your lab. AeroFS has a free version with most of the features included for up to 30 users. I commend AeroFS for thinking outside the box on this, as it opens the doors for lab users and the SMB market to take advantage of a low cost, secure alternative to the Dropboxes of the world. 





FVP Freedom Edition Launch

As you know, we shipped PernixData FVP 3.0 yesterday, but what you might not know is that we also shipped PernixData FVP Freedom Edition. This in my opinion is an exciting addition to the product family, and based on the feedback we have already received it’s taking off in a major way!! Keep in mind this is totally free software with no restrictions or time limits. 


For those unfamiliar with the Freedom Edition I have outlined the supported features that come with this release. 

Supported Features
•    vSphere 5.1, 5.5 and 6.0
•    Maximum 128GB of Memory (DFTM) per FVP Cluster
•    Unlimited VM’s and Hosts
•    Write Through Configuration

If you want DFTM-Z (Memory Compression) or the ability to configure Write Back for your virtual machines then you can easily upgrade to our standard and enterprise licensing options. 

Freedom Community Forum
We are launching with the Freedom edition a brand new community forum. This is to provide support and collaboration among the Freedom users. As you might guess, we are planning to add a lot of content over the next few weeks, so the more questions or interaction you have on the forum, the more it will make it useful for the Freedom community. In order to access this forum, you can visit https://community.pernixdata.com and click sign-in. We have enabled SSO support, so all you have to use is your same PernixData download portal account and we will redirect you back into the community forum. 

If you haven’t already requested the Freedom edition, you can request access here. Once registered you will automatically receive an email with instructions on how to gain access to the software and portal. This is totally an automated process, so you will get your Freedom license key the same day you request it!!

PernixData FVP 3.0 - What's New

I’m pleased to announce that PernixData FVP 3.0 has been released to the masses! This has been a culmination of many long hours by our engineering team and staff in order to reach this unprecedented milestone.

Some of the highlighted features in this release are a result of a seasoned approach to solving storage performance problems while keeping a keen outlook toward what the future holds! In this post I will mention at a high-level what some of the new features are but look for more detailed posts coming soon.

Support for vSphere 6.0
We now have support for vSphere 6.0 using FVP 3.0! If you are running a previous version of FVP, you will need to upgrade to this release in order to gain full vSphere 6 support. If you are in the process of migrating to vSphere 6, we now have support for a migration plan from previous versions of ESXi running FVP. For example, FVP will support mixed environments of vCenter 6.0 with hosts running ESXi 5.1 or newer. However, keep in mind that FVP 3.0 will no longer support vSphere 5.0 as a platform.

New HTML5 based User Interface
FVP 3.0 offers a completely new user experience. FVP 3.0 introduces a brand new standalone webclient where you will be able to configure and monitor all your FVP clusters. In addition, the new standalone webclient now gives you visibility into other FVP clusters that may reside in a different vCenter or vSphere cluster!!

This doesn’t mean you won’t have visibility in the vSphere webclient; we still have a plugin available that will give you the basic FVP analytics. However, all configurations and detailed analytics will only be available in the new standalone webclient.

Some may ask why we built our own webclient which I think is a valid question. The truth is that in order for us to control the full user experience for FVP we had to grow our own while still supporting the vSphere client for those quick looksee’s. I think you will be pleasantly surprised how robust and extensible the new standalone webclient is.

New Audit Log

In addition to providing FVP actions and alarms through vCenter tasks/events, FVP 3.0 now has a separate audit log. This is where you can easily see all FVP related actions and alarms for a given FVP cluster. The part I like is the ease of just doing a quick review of what’s changed without having to visit each host in vCenter.


Redesigned License Activation Process

The license activation process has been streamlined to offer greater simplicity and ease of use.  You can now activate and manage all of your licensing online through the new PernixData UI. All you need is a license key while the new FVP licensing activation process will do the rest. You also have the ability to see more details on what is licensed and what isn’t in the new UI. 

As you can see, a lot of innovation has gone into this new release. In fact, there is so much to reveal that I'm going to do a series of posts over the next few weeks. To learn more and download the FVP 3.0 release, please visit: http://www.pernixdata.com/products or start a trial at: https://get.pernixdata.com/FVPTrial

Time Scale: Latency References

In a world where speed is of the utmost importance, it has become apparent to me that there is a notion of relativity in latency. In other words, one needs a form of measurement - a time scale to understand how fast something really was and/or is.

With high frequency, low latency trading as depicted in this video, milliseconds is the name of the game. A loss of 2 milliseconds, as an example, can mean losing millions of dollars or failing to prevent a catastrophic financial event.  

In using this example, how can one feel what 2 milliseconds feels like? Can one tell the difference between 2 or 3 milliseconds? I find it fascinating that we as humans sometimes base what is fast or slow on what it feels like. In fact, how do you measure a feeling anyway? We usually have to compare (base line) to determine if something is really faster or slower. I would argue that it’s sometimes the results or effect of latency that we measure against. In low latency trading, the effect or result can be devastating, and so there is a known threshold to not go past. However, this threshold is constantly being lowered or being challenged via competitive pressure. This means it’s important to constantly have latency references to measure against in order to determine if the combined effect will have positive or negative results.

This is why testing synthetic workloads (in order to determine what the performance is) can give an inaccurate picture of what is truly fast or slow. When one tests only one workload, it’s not depicting the combined effect of all disparate workloads and their interactions as a whole.  Another inaccurate way to measure is to base decisions solely on what the end users feel is faster or slower. I know it can be interesting to see what the end-user thinks, but it’s not an accurate way to measure the whole system. The results seen for all work done (perhaps for a whole project) are a better way to measure the effect. This can obviously complicate the process of measuring, but there are places to focus that will give a more accurate look at latency effects as a whole if one follows what I call the time scale reference.

Contrasting what we deem fast historically and what is on the horizon is not just interesting but important for baselines. Proper latency measurements become important milestones to feel the full effect of speed and acceleration.

Let’s say, for example, you had to take a trip from Atlanta, GA to San Francisco, CA in a truck carrying peaches. You had two routes to choose from. One would take 3 days and the other would take 6 months. Now, if you wanted to take the scenic route, and you had tons of time, without the peaches, you might want to take the longer route. However, if you took 6 months, those peaches would smell quite bad by the time you got to San Francisco!! Using real world analogies like this on a time scale that we can decipher is important in order to see the differences and the effect they may have.

Now why did I choose 3 days vs. 6 months for this example? A typical Solid State Disk has an average latency of around 100 microseconds. Compare that to a standard rotational hard drive at about 5 milliseconds. If I scale these to compare how drastic the time difference is between the two, it’s 3 days for the SSD and 6 months for the standard hard drive. Now I can really see and feel the variance between these two mediums, and why a simple choice like this can make a gigantic difference in the outcome.

Let’s now take it up another level. What if we now had the capability to travel with our truckload of peaches to San Francisco in 6 minutes instead of 3 days, or better yet, how about 40 seconds? Today 6 minutes is possible as it applies to standard DRAM, but 40 seconds isn’t too far off, as this is representative of the Intel and Micron announcement of 3D XPoint memory.
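For anyone who wants to play with the time scale themselves, here is a minimal sketch of the arithmetic behind the analogy. It assumes only the anchor given above (a 100 microsecond SSD access stretched to a 3-day trip); the function names are my own and purely illustrative.

```python
SECONDS_PER_DAY = 86_400

# Anchor taken from the post: a ~100 microsecond SSD access maps to a 3-day trip.
SSD_LATENCY_S = 100e-6
SSD_TRIP_S = 3 * SECONDS_PER_DAY

# Scale factor derived from that single anchor (an illustration, not a measurement).
SCALE = SSD_TRIP_S / SSD_LATENCY_S


def trip_seconds(latency_s: float) -> float:
    """Stretch a device latency onto the human-sized trip scale."""
    return latency_s * SCALE


def device_latency_s(trip_s: float) -> float:
    """Map a trip duration back to the device latency it represents."""
    return trip_s / SCALE


if __name__ == "__main__":
    hdd_days = trip_seconds(5e-3) / SECONDS_PER_DAY
    print(f"5 ms hard drive -> {hdd_days:.0f} days on the road")
    print(f"6 minute trip   -> {device_latency_s(6 * 60) * 1e9:.0f} ns")
    print(f"40 second trip  -> {device_latency_s(40) * 1e9:.0f} ns")
```

With this anchor, a 5 millisecond hard drive works out to about 150 days on the road, which is the same ballpark as the rounded 6 months used above. Running the scale backwards, the 6 minute trip corresponds to roughly 139 nanoseconds and the 40 second trip to roughly 15 nanoseconds.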

If I take these latency numbers and plug them into my datacenter, then I can start to see how simple choices can really have a negative or positive impact. You may now be saying to yourself, “Well if I go with SSD’s today, then tomorrow I basically need to rip and replace my entire stack to take advantage of the newer thresholds of latency, like the new 3D XPoint memory, or even whatever is next!!” The exciting part is that you don’t have to replace your entire stack to take advantage of the latest and greatest. Your truck carrying those peaches just needs a turbo boost applied to the engine. You don’t need to buy a new truck, which is why choosing the proper platform becomes very important. Choosing the right truck the first time means your hands aren’t tied by vendor lock-in when it comes to performance.

In conclusion, I hope that one can now understand why proper base lines need to be established, and real world measurements need to be referenced. It’s not just a feeling of what is faster. We now are past the point of recognizing the true speed of something as a feeling. It’s the cause and effect or result that determines the effective threshold. Then tomorrow the threshold could lower with new innovations, which is all the more reason to have a time scale of reference. Without a reference point people become accustomed to the world around them, missing out on what it really is like to travel from Atlanta to San Francisco in 40 seconds or less. Don’t miss out on the latency innovations of today and tomorrow; choose your platform wisely.

My daughter was inspired to draw this for me based on this post! :)

FVP Color Blindness Accessibility

Our PernixData engineers pay close and respectful attention to every facet of the customer experience. Something that may seem small to others can be a big deal to some. It’s with this in mind that PernixData thinks about every feature in a holistic manner that all can appreciate. One such feature that fits this model is providing visual accessibility to those with color blindness. With 1 in 12 men and 1 in 200 women having some form of color blindness, it becomes important that the FVP UI is readable and understandable no matter the impediment.

It was in FVP 2.0 that we made modifications to the colors in our UI to deal with the most common forms of color blindness: Deuteranopia (~5% of males), Protanopia (~2.5% of males), and Tritanopia (~0.3% of males and females). For example, the “Network Acceleration” line graph was made lime green. In addition, all colors were tested with “Color Oracle”, an application that simulates different forms of color blindness.

In addition, we made each line on a chart uniquely identifiable. This was accomplished by providing the ability to toggle lines on/off. For example, if you aren't sure which line is referring to the datastore, just toggle the others off or toggle the datastore selection off/on, and this will clearly show the datastore line.

When designing the FVP interface it was also recognized that color should be used as a secondary source of information that adds insight to the primary source. For example, in the host/flash device visualization, the color of the tiles (red, green, yellow) indicates the state of the relevant object. If there is a problem, however, alarms/warnings also show an exclamation point on the tile in addition to the coloring of the tiles.


In memory computing - a survey of implementations to date

In Memory Computing Blog Series
Part 1: In Memory Computing - A Walk Down Memory Lane

In the series’ previous post we took a walk down memory lane and reviewed the evolution of In Memory Computing (IMC) over the last few decades. In this blog post we will deep dive into a couple of specific implementations of IMC to better understand both the innovations that have happened to date and their limitations.

The Buffer Cache

Buffer caches, also sometimes referred to as page caches, have existed in operating systems and databases for decades now. A buffer cache can be described as a piece of RAM set aside for caching frequently accessed data so that one can avoid disk accesses. Performance, of course, improves the more one can leverage the buffer cache for data accesses as opposed to going to disk.

Buffer caches are read-only caches. In other words only read accesses can potentially benefit from a buffer cache. In contrast, writes must always commit to the underlying datastore because RAM is a volatile medium and an outage will result in the data residing in the buffer cache getting lost forever. As you can imagine this can become a huge deal for an ACID compliant database. ☺ 

The diagrams below depict how the buffer cache works for both reads and writes.

For reads, the first access must always go to the underlying datastore, whereas subsequent accesses can be satisfied from the buffer cache.

For writes we always have to involve the underlying datastore. The buffer cache provides no benefits.
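The read and write paths described above can be sketched in a few lines. This is a toy model only: a Python dict stands in for the underlying datastore, and the class and counter names are hypothetical rather than taken from any real operating system or database.

```python
class ReadCache:
    """Toy buffer cache in front of a slower datastore (a dict stands in for disk)."""

    def __init__(self, datastore):
        self.datastore = datastore
        self.cache = {}
        self.datastore_reads = 0
        self.datastore_writes = 0

    def read(self, key):
        # Cache hit: served from RAM, the datastore is not touched.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: the first access always goes to the datastore.
        self.datastore_reads += 1
        value = self.datastore[key]
        self.cache[key] = value
        return value

    def write(self, key, value):
        # Every write is committed to the datastore before it counts as done.
        self.datastore[key] = value
        self.datastore_writes += 1
        # Refresh the cached copy so later reads do not return stale data.
        self.cache[key] = value


if __name__ == "__main__":
    cache = ReadCache({"block-7": b"old"})
    cache.read("block-7")   # miss, goes to the datastore
    cache.read("block-7")   # hit, served from RAM
    cache.write("block-7", b"new")
    print("datastore reads:", cache.datastore_reads)
    print("datastore writes:", cache.datastore_writes)
```

The counters make the asymmetry visible: repeated reads of the same block cost one datastore access, while every write costs one no matter how warm the cache is.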

The fact that the buffer cache only helps reads is severely limiting. OLTP applications, for example, have a good percentage of write operations. These writes will track the performance of the underlying datastore and will most probably mask any benefits the reads are reaping from the buffer cache. 

Secondly, sizing the buffer cache is not trivial. Size the buffer cache too small and it is of no material benefit. Size it too large and you end up wasting memory. The irony of it all, of course, is that in virtualized environments the server resources are shared across VMs and are managed by the infrastructure team. Yet, sizing the buffer cache is an application specific requirement as opposed to an infrastructure requirement. One therefore ends up doing this with no idea of what other applications are running on the host or sometimes even without an awareness of how much RAM is present in the server.

API based in memory computing

Another common way to leverage IMC is to use libraries or APIs within one’s applications. These libraries or APIs allow the application to use RAM as a cache. A prevalent example of this is memcached. You can learn more about memcached here. Here is an example of how one would use memcached to leverage RAM as a cache:

function get_foo(foo_id)
    foo = memcached_get("foo:" . foo_id)
    return foo if defined foo
    foo = fetch_foo_from_database(foo_id)
    memcached_set("foo:" . foo_id, foo)
    return foo

This example shows you how you must rewrite your application to incorporate the API at the appropriate places. Any change to the API or the logic of your application means a rewrite.
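To make the pseudocode above concrete, here is a runnable sketch of the same pattern in Python. To keep it self-contained, a small in-process class stands in for a real memcached client, and another stands in for the database; both classes and all of their names are hypothetical and not part of any memcached library.

```python
class FakeMemcached:
    """Hypothetical in-process stand-in offering get/set like a cache client."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        # Returns None on a miss.
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


class SlowDatabase:
    """Hypothetical database stand-in that counts how often it is queried."""

    def __init__(self, rows):
        self.rows = rows
        self.calls = 0

    def fetch_foo(self, foo_id):
        self.calls += 1
        return self.rows[foo_id]


def get_foo(cache, database, foo_id):
    """Check the cache first; on a miss, read the database and populate the cache."""
    key = f"foo:{foo_id}"
    foo = cache.get(key)
    if foo is not None:
        return foo
    foo = database.fetch_foo(foo_id)
    cache.set(key, foo)
    return foo


if __name__ == "__main__":
    cache = FakeMemcached()
    database = SlowDatabase({42: {"id": 42, "name": "example"}})
    get_foo(cache, database, 42)  # miss: goes to the database
    get_foo(cache, database, 42)  # hit: served from the cache
    print("database calls:", database.calls)
```

Notice that the caching logic lives inside the application function itself, which is exactly why a change in the API or in the application logic forces the code to be revisited.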

Here is a similar example from SQL. 


In the SQL example note that you, as a user, need to specify which tables need to be cached in memory. You need to do this when the table is created. One presumes that if the table already exists you will need to use the ALTER TABLE command. As you know, DDL changes aren’t trivial. Also, how should a user decide which tables to keep in RAM and which not to? And do such static, schema-based definitions work in today’s dynamic world? Note also that it is unclear if these SQL extensions are ANSI SQL compliant.

In both cases enabling IMC means making fundamental changes to the application. It also means sometimes unnecessarily upgrading your database or your libraries to a version that supports the API you need. And, finally, it also means that each time the API behavior changes you need to revisit your application.

In Memory Appliances

More recently, ISVs have started shipping appliances that run their applications in memory. The timing for these appliances makes sense given servers with large amounts of DRAM are becoming commonplace. These appliances typically use RAM as a read cache only. In the section on buffer caches we’ve already discussed the limitations of read-only caches. 

Moreover, since many of these appliances run ACID compliant databases, consistency is a key requirement. This means that writes must be logged in non-volatile media in order to guarantee consistency. If there are enough writes in the system then overall performance will begin to track the performance of the non-volatile media as opposed to RAM. The use of flash in place of mechanical drives mitigates this to some extent, but if one is going to buy a purpose built appliance for in memory computing it seems unfortunate that one can only be guaranteed flash performance instead.
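A back-of-the-envelope model shows how quickly writes pull the average toward the non-volatile media. This is a simplified sketch, not a benchmark: it assumes every write waits for the media, reads either hit RAM or miss to the media, and the latency figures (0.1 microseconds for RAM, 100 microseconds for flash) are illustrative assumptions.

```python
def average_latency_us(write_fraction, read_hit_ratio, ram_us, media_us):
    """Average I/O latency when reads may hit RAM but every write waits for the media."""
    read_fraction = 1.0 - write_fraction
    read_latency = read_hit_ratio * ram_us + (1.0 - read_hit_ratio) * media_us
    write_latency = media_us  # writes are logged to non-volatile media
    return read_fraction * read_latency + write_fraction * write_latency


if __name__ == "__main__":
    RAM_US = 0.1      # assumed RAM access latency
    FLASH_US = 100.0  # assumed flash access latency
    for write_fraction in (0.0, 0.1, 0.3, 0.5):
        avg = average_latency_us(write_fraction, 1.0, RAM_US, FLASH_US)
        print(f"{write_fraction:.0%} writes -> {avg:.2f} us average")
```

Even with a perfect read hit ratio, a workload with 30% writes averages about 30 microseconds under these assumptions, which is hundreds of times slower than RAM and well on its way to flash territory.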

Most of these appliances also face tremendous startup times. They usually work by pulling all of their data into RAM during startup so that it can be cached. Instead of using an ‘on demand’ approach they preload the data into RAM. This means that startup times are exorbitantly high when data sizes are large. 
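The difference between preloading and an ‘on demand’ approach is easy to see in a small sketch. The classes below are hypothetical stand-ins (a dict plays the role of the persistent store), and the read counter is simply a proxy for startup work.

```python
class BackingStore:
    """Hypothetical persistent store that counts how many rows have been read."""

    def __init__(self, rows):
        self.rows = rows
        self.reads = 0

    def keys(self):
        return list(self.rows)

    def read(self, key):
        self.reads += 1
        return self.rows[key]


class PreloadedCache:
    """Pulls every row into RAM at startup, so startup cost grows with data size."""

    def __init__(self, store):
        self.data = {key: store.read(key) for key in store.keys()}

    def get(self, key):
        return self.data[key]


class OnDemandCache:
    """Starts empty and reads a row from the store only when it is first requested."""

    def __init__(self, store):
        self.store = store
        self.data = {}

    def get(self, key):
        if key not in self.data:
            self.data[key] = self.store.read(key)
        return self.data[key]


if __name__ == "__main__":
    rows = {i: f"row-{i}" for i in range(100_000)}

    eager_store = BackingStore(rows)
    PreloadedCache(eager_store)
    print("reads during preload startup:", eager_store.reads)

    lazy_store = BackingStore(rows)
    lazy = OnDemandCache(lazy_store)
    print("reads during on-demand startup:", lazy_store.reads)
    lazy.get(7)
    print("reads after first request:", lazy_store.reads)
```

The preloading variant touches every row before it can serve a single request, while the on-demand variant starts immediately and pays the cost only for the data that is actually used.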

Finally, from an operational perspective rolling in a purpose built appliance usually means more silos and more specialized monitoring and management that takes away from the tremendous value that virtualization brings to the table.


IMC as it exists today puts the onus on the end user/application owner to make tweaks for storage performance when in fact the problem lies within the infrastructure. The result is tactical and proprietary optimizations that either make sub-optimal use of the infrastructure or simply break altogether over time. Wouldn’t the right way instead be to fix the storage performance problem, via in memory computing, at the infrastructure level? After all, storage performance is an infrastructure service to begin with. For more information, check out http://www.pernixdata.com