VMware vSAN - A Complete HCI Solution

26 Nov 2017

Tracking the growth of HCI

Hyper Converged Infrastructure (HCI) is disrupting the traditional storage markets (SAN, NAS and DAS) as enterprise IT begins to replicate the infrastructure of hyperscale public cloud providers in its own data centres. Public cloud giants such as AWS, Microsoft Azure and now Google have developed their own scale-out, software-defined enterprise storage, built on commodity servers and disks, to power their respective cloud platforms. These storage platforms implement in software the enterprise features of existing SAN and NAS devices, which allows the use of cheaper commodity hardware whilst improving performance and scalability.

A report from 2016 shows the predicted change from traditional storage to HCI and Hyperscale technologies for the next 10 years:

A snapshot of the total storage market.

Source: http://wikibon.org/wiki/v/The_Rise_of_Server_SAN

Business Challenges

With traditional storage solutions, customers faced several challenges. The hardware was proprietary and bundled in by the storage vendor, which created storage silos that often lacked granular control. Enterprises therefore often needed storage teams with vendor-specific training who used vendor-specific configuration tools, adding to resource overhead. Changing storage vendor created further skill-set silos within those teams. Deploying traditional storage was also time-consuming, as it often involved multiple people from the storage, networking and server teams and lacked real automation.

By moving the storage into software, HCI solutions such as VMware Virtual SAN (vSAN) provide a linearly scalable solution that uses the same management and monitoring tools VMware admins already use, whilst providing a modern, policy-based and automated approach to storage.

VMware vSAN Architecture

vSAN is an object-based file system where VMs and snapshots are broken down into objects and each object has multiple components.  vSAN objects are:

  • VM Home Namespace (VMX, NVRAM)
  • VM Swap (virtual memory swap)
  • Virtual Disk (VMDK)
  • Snapshot Delta Disk
  • Snapshot Memory Delta

Other objects can exist, such as vSAN performance service database or VMDKs that belong to iSCSI targets.  An object can be made up of one or more components depending on different factors such as the size of the object and the storage policy assigned to the object. 

Traditional storage models use LUNs or volumes to present storage with a specific disk configuration, such as a RAID level or performance characteristics, that applies to every workload on that LUN or volume. This restricts flexibility and increases wastage. vSAN instead applies storage policies in software, per object and only where required. This enables precise control of virtual machines or even individual virtual disks while they share the same underlying datastore.

A storage policy defines availability and performance settings such as Failures to Tolerate (FTT) and stripe width. For example, an object with a failure tolerance of 1 using RAID 1 has two full copies of its data placed on two hosts and a witness component on a third host, so the object can tolerate a full host failure.
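The RAID 1 example above generalises: to tolerate n failures with mirroring, n + 1 full copies are needed, spread over at least 2n + 1 hosts. The Python sketch below is illustrative arithmetic only and is not a vSAN API. The names `MirrorLayout` and `raid1_layout` are hypothetical, and counting witnesses as "the hosts left over after the replicas" is a simplifying assumption rather than a description of how vSAN actually assigns votes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MirrorLayout:
    """Simplified placement of one mirrored (RAID 1) object."""
    replicas: int    # full copies of the data
    witnesses: int   # tie-breaker components (simplified, see lead-in)
    min_hosts: int   # hosts needed so a majority survives the failures


def raid1_layout(failures_to_tolerate: int) -> MirrorLayout:
    """Return the replica/witness/host counts for a given failure tolerance."""
    if failures_to_tolerate < 0:
        raise ValueError("failures_to_tolerate must be zero or greater")
    replicas = failures_to_tolerate + 1
    # A majority of components must remain after n failures: 2n + 1 hosts.
    min_hosts = 2 * failures_to_tolerate + 1
    # Assumption: every host not holding a replica holds one witness.
    witnesses = min_hosts - replicas
    return MirrorLayout(replicas, witnesses, min_hosts)


if __name__ == "__main__":
    for ftt in (0, 1, 2):
        print(ftt, raid1_layout(ftt))
```

With a failure tolerance of 1 this gives two replicas, one witness and three hosts, which matches the example in the text.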

Further factors can be defined such as striping for performance, availability level and the ability to limit IOPS. Rack awareness and multiple site fault domains can also be configured which can dictate how the objects are distributed across the vSAN datastore. 

vSAN uses disk groups to pool flash devices and magnetic disks into single management constructs. A disk group comprises one flash device for the cache tier (read cache and write buffer) and up to seven capacity devices, which can be magnetic disks (Hybrid mode) or flash (All-Flash mode). Every disk group must have both a cache tier and a capacity tier, and a host can have up to five disk groups. All disk groups across the cluster typically contribute to a single vSAN datastore that is available to all VMs in the cluster.
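The disk group limits described above can be expressed as a small sanity check when planning a host. This is a minimal Python sketch using only the limits stated in the text (one cache device per group, up to seven capacity devices per group, up to five groups per host). The function and constant names are hypothetical and it does not query a real vSAN cluster.

```python
MAX_DISK_GROUPS_PER_HOST = 5
CACHE_DEVICES_PER_GROUP = 1
MAX_CAPACITY_DEVICES_PER_GROUP = 7


def check_host_layout(capacity_devices_per_group: list[int]) -> list[str]:
    """Validate a planned host layout.

    Each list entry is the number of capacity devices in one disk group;
    every group is assumed to have exactly one cache device.
    Returns a list of problems (empty if the layout is within limits).
    """
    problems = []
    if len(capacity_devices_per_group) > MAX_DISK_GROUPS_PER_HOST:
        problems.append(
            f"{len(capacity_devices_per_group)} disk groups exceeds the "
            f"limit of {MAX_DISK_GROUPS_PER_HOST} per host"
        )
    for index, count in enumerate(capacity_devices_per_group, start=1):
        if count < 1:
            problems.append(f"disk group {index} has no capacity device")
        elif count > MAX_CAPACITY_DEVICES_PER_GROUP:
            problems.append(
                f"disk group {index} has {count} capacity devices "
                f"(limit {MAX_CAPACITY_DEVICES_PER_GROUP})"
            )
    return problems


def max_devices_per_host() -> int:
    """Upper bound on vSAN devices in one host: cache plus capacity."""
    return MAX_DISK_GROUPS_PER_HOST * (
        CACHE_DEVICES_PER_GROUP + MAX_CAPACITY_DEVICES_PER_GROUP
    )


if __name__ == "__main__":
    print(check_host_layout([7, 7, 4]))
    print(max_devices_per_host())
```

Under these limits a fully populated host holds 5 x (1 + 7) = 40 devices, of which 35 contribute capacity.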

Any supported hardware can be used for vSAN. VMware maintains an extensive Hardware Compatibility List, so anyone can design their own vSAN-backed virtualization platform for a specific requirement. VMware also offers vSAN Ready Nodes through many server hardware vendors; each node comes pre-built with a pre-validated set of supported hardware components. VMware Cloud Foundation (VCF) is another offering, in which the complete HCI stack of vSphere and vSAN is bundled with VMware NSX, a software-defined networking solution that brings to networking benefits similar to those vSAN brings to storage.

Other HCI providers typically require a virtual appliance, known as a Virtual Storage Appliance, to run on each host to handle storage. This typically requires CPU and memory to be reserved on each host in the cluster, which reduces the resources available for other workloads. vSAN is embedded directly into the vSphere hypervisor kernel through a vSphere Installation Bundle (VIB). Therefore, while vSAN does still require resources, typically up to 10% of the host’s compute, it does not run as a VM that competes with other VMs for resources. As it is integrated with vSphere, admins manage vSAN with the same tools they already use for vSphere. vSAN fully supports native vSphere features such as vMotion and DRS.

Advanced features such as deduplication, compression and erasure coding can be enabled to make efficient use of the storage. On an All-Flash configuration, deduplication and compression can be enabled with a single tick box. Storage savings vary with the type of data, but savings of as much as 7x are reported. Deduplication and compression are enabled and disabled together as a single cluster-wide setting. Deduplication occurs when data is de-staged from the cache tier to the capacity tier, and its scope is limited to the disk group. Compression is applied after deduplication, just before the data is written to the capacity tier.
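The ordering described above (deduplicate on de-stage, within one disk group, then compress before writing) can be illustrated with a toy model. This Python sketch is not how vSAN is implemented: the 4 KB block size, the SHA-1 fingerprint and the zlib compressor are stand-ins chosen for illustration, and `destage` is a hypothetical name.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def destage(blocks: list[bytes], capacity_tier: dict[str, bytes]) -> int:
    """Move blocks from cache to one disk group's capacity tier.

    capacity_tier maps a block fingerprint to the compressed block and
    represents a single disk group, so duplicates are only detected
    within that group. Returns the number of new blocks written.
    """
    written = 0
    for block in blocks:
        fingerprint = hashlib.sha1(block).hexdigest()
        if fingerprint in capacity_tier:
            continue  # duplicate within this disk group: nothing to write
        # Compression happens after deduplication, just before the write.
        capacity_tier[fingerprint] = zlib.compress(block)
        written += 1
    return written


if __name__ == "__main__":
    group_a: dict[str, bytes] = {}
    group_b: dict[str, bytes] = {}
    block = b"x" * BLOCK_SIZE
    print(destage([block, block], group_a))  # second copy is deduplicated
    print(destage([block], group_b))         # other group stores its own copy
```

Because each disk group keeps its own fingerprint table in this model, the same block landing in two disk groups is stored twice, which mirrors the per-disk-group deduplication scope described in the text.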

Erasure Coding is another feature available for All-Flash configurations. It provides the same level of redundancy as RAID 1 but reduces the capacity required by breaking the data into multiple pieces, spreading them across multiple nodes and adding parity data so that the data can be recreated if one or more of the pieces are lost. Erasure coding with RAID 5 requires a minimum of 4 hosts in the cluster, and erasure coding with RAID 6 requires a minimum of 6 hosts.
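To make the capacity saving concrete, the sketch below compares the raw capacity needed for the same usable data under mirroring and erasure coding. It assumes the layouts vSAN's erasure coding is generally described as using, namely 3 data + 1 parity for RAID 5 and 4 data + 2 parity for RAID 6; those layouts, the scheme labels and the function name are assumptions for illustration and are not taken from this article. The minimum host counts for RAID 5 and RAID 6 are the ones stated above.

```python
from fractions import Fraction

# scheme -> (raw capacity per unit of usable data, minimum hosts)
PROTECTION_SCHEMES = {
    "RAID1_FTT1": (Fraction(2, 1), 3),  # two full copies, tolerates 1 failure
    "RAID5": (Fraction(4, 3), 4),       # 3 data + 1 parity, tolerates 1 failure
    "RAID1_FTT2": (Fraction(3, 1), 5),  # three full copies, tolerates 2 failures
    "RAID6": (Fraction(6, 4), 6),       # 4 data + 2 parity, tolerates 2 failures
}


def raw_capacity_gb(usable_gb: float, scheme: str) -> float:
    """Raw capacity consumed to store usable_gb under the given scheme."""
    multiplier, _min_hosts = PROTECTION_SCHEMES[scheme]
    return float(usable_gb * multiplier)


if __name__ == "__main__":
    for name, (_multiplier, hosts) in PROTECTION_SCHEMES.items():
        raw = raw_capacity_gb(100, name)
        print(f"{name}: {raw:.1f} GB raw for 100 GB usable, min {hosts} hosts")
```

Under these assumptions, 100 GB of data protected against one failure consumes 200 GB with RAID 1 but roughly 133 GB with RAID 5, at the cost of needing one more host.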

A key VMware vSAN feature is Erasure Coding, demonstrated by this graph.

From version 6.0, vSAN introduced a new Virtual SAN on-disk filesystem (vSAN FS). It delivers a new VMDK delta format called vsanSparse, which takes advantage of the new on-disk format's writing and extended caching capabilities to deliver efficient performance. The result is high-performance snapshots and clones comparable to traditional array-level snapshots from the likes of NetApp, Nimble or EMC arrays.

The typical deployment methods supported include standard clusters (up to 64 nodes since vSAN 6.0) as well as 2-node Remote and Branch Office (ROBO) clusters with vSAN Standard licenses. As vSAN is all in software, these deployments can be scaled easily as required. Unlike with traditional storage, scaling up by adding disks incurs no additional vSAN cost, while scaling out requires additional nodes with additional vSAN licenses.

vSAN Stretched Clusters

An example of VMware vSAN's stretched clusters method across multiple locations

vSAN stretched clusters, enabled through vSAN Enterprise licenses, are a very popular option for dual-campus or geographically dispersed sites that require uninterrupted availability across both sites. A stretched vSAN cluster configuration enables high availability across data centres with a zero RPO (recovery point objective), and storage policies can specify redundancy locally within a site, across sites, or both, on a per-VM basis. A similar solution using a legacy SAN platform such as NetApp MetroCluster or EMC VPLEX would carry significant costs due to additional hardware requirements, whereas a vSAN stretched cluster is a low-cost solution that meets the same requirements through a software-defined approach, without the need for specialist hardware.

Automation

To give customers options for automation and rapid, efficient provisioning, vSAN features an extensive management API and multiple software development kits (SDKs). Using these, administrators can orchestrate all aspects of the vSAN environment, such as installation, life-cycle management, troubleshooting and monitoring.

vSAN APIs can also be leveraged through vSphere PowerCLI cmdlets. Repeatable deployment and configuration tasks, such as assigning storage policies and checking compliance, can all be automated to reduce human error and speed up the process. Automation can also be leveraged for life-cycle management tasks such as applying upgrades and patches.

The latest VMware vSAN features

VMware vSAN has been running on a 6-month update cycle since 2015. Recent feature updates include:

  • Native data-at-rest encryption
  • Resilient management independent of vCenter
  • Stretched clusters with local failure protection
  • Simple networking with Unicast
  • vSAN easy 1-click install

Final Thoughts

If you are looking at an HCI solution, VMware Virtual SAN (vSAN) offers existing VMware vSphere customers a low-risk migration path and is architecturally the best fit for them. Unlike other HCI solutions, a vSAN-based HCI solution doesn’t require additional tools or training either.
