One of the new products accompanying the vSphere 5.0 release is the Virtual Storage Appliance (VSA). The purpose of this product is to allow customers to utilize the local disks on their ESXi hosts to create a shared storage environment for their virtual infrastructure, and thus take advantage of advanced features such as HA and vMotion which rely on shared storage. The idea is to avoid the costs of a hardware-based SAN/NAS system, allowing SMB customers to implement vSphere and its advanced features at a more attractive price point.
VSA Cluster Storage Architecture:
The VSA cluster consists of two (or three) VMs that run in the ESXi environment. Depicted above is the architecture from a storage perspective, and it's important to understand the levels of abstraction and how we finally arrive at a shared storage resource.
At the very bottom of the stack is the physical ESXi host (physical server) which houses the local hard disks. Presumably there is some kind of hardware RAID capability in this server, either as a function of the BIOS or a RAID card, which combines all the disks using RAID protection into a single local volume. VMware says that RAID10 is a requirement here, but as far as I can tell this is not a hard and fast requirement; more on that below.
You then install ESXi onto this local volume, and by doing so format it with the VMFS file system. When you install the VSA, the installer takes the disk space not consumed by the ESXi install and the VSA VM itself and uses it for the "shared storage" capacity, presenting it to the VSA VM as a series of VMDKs. The VSA VM combines these using an LVM to form a primary and a secondary volume. The VSA VM runs an NFS server and exports this volume back to the ESXi host. Each VSA VM does this, so you end up with two NFS volumes (in a 2-node cluster): VSADs-1 and VSADs-2. Just a little bit of inception going on here! 🙂 It's important to note that only half the space is actually exported as an NFS volume due to the RAID10 protection.
To elaborate a little on the primary and secondary volumes in the VSA VM: remember that each volume exported by a VSA VM is protected via RAID10. So one half of the VSADs-1 RAID10 mirror lives on VSA1 and the other half lives on VSA2. In this way the environment can tolerate a disk failure as well as a node failure and still remain operational, thanks to the RAID10 protection. What I haven't been able to dig into yet is the replication mechanism for keeping the primary/secondary volumes in sync. I suspect it might be something like DRBD (not verified, just a guess).
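To make the mirroring behaviour concrete, here is a minimal Python sketch of synchronous primary/secondary writes. This is purely conceptual: all class and method names below are my own invention, and it does not reflect the VSA's actual replication implementation (which, as noted, I haven't been able to dig into):

```python
# Conceptual sketch of synchronous mirrored writes across two VSA nodes.
# All names are hypothetical; this is NOT the VSA's actual implementation.

class VolumeReplica:
    """One half of a mirrored volume, living on a single VSA node."""
    def __init__(self, node_name):
        self.node_name = node_name
        self.blocks = {}  # block number -> data

    def write(self, block, data):
        self.blocks[block] = data

class MirroredVolume:
    """A RAID10-style volume: writes land on both replicas before returning."""
    def __init__(self, primary, secondary):
        self.primary = primary      # e.g. half of VSADs-1, on VSA1
        self.secondary = secondary  # its mirror, on VSA2

    def write(self, block, data):
        # Synchronous mirroring: both copies are updated before the write
        # is considered complete, so either node can serve the data alone.
        self.primary.write(block, data)
        self.secondary.write(block, data)

    def read(self, block, primary_alive=True):
        # On a node failure, reads fall back to the surviving replica.
        replica = self.primary if primary_alive else self.secondary
        return replica.blocks[block]

vsads1 = MirroredVolume(VolumeReplica("VSA1"), VolumeReplica("VSA2"))
vsads1.write(0, b"vm-data")
print(vsads1.read(0, primary_alive=False))  # served from VSA2: b'vm-data'
```

The key point the sketch illustrates is that a write is only "done" once both halves of the mirror have it, which is what lets the cluster survive the loss of an entire node.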
Now that we better understand how the VSA works under the covers, it's important to note that there are a number of considerations and caveats to be aware of when deciding to utilize the VSA:
- the VSA Manager (the server side of the plug-in which allows you to manage the VSA) needs to be installed on a Windows-based vCenter server. This means you cannot utilize the vCenter Appliance (VCA) as it is Linux based. To me this is definitely a downside, as the VCA is extremely easy to set up (deployed via OVF, it can be up and managing an environment in minutes) and, with its internal database, perfectly suited to SMB environments. Hopefully this can be addressed in future releases. Looking at the VSA Manager, it appears to be all Tomcat/Java based, so there is no reason it could not run on the Linux-based VCA
- when setting up the VSA, each ESXi host must be a fresh install with no virtual machines running on it. Furthermore, each ESXi host must have only the default vSphere standard switches and port groups; you cannot create any additional switches or port groups. Once the VSA has been set up, you are then free to modify the networking
- the ESXi hosts must be on the same subnet as the vCenter server
- the ESXi hosts must not be in another HA cluster; the VSA setup utility creates its own HA cluster for the environment
- maximum supported hard disk capacity per ESXi host is 64TB
- there are specific requirements around networking: each ESXi host requires 4 NIC ports minimum, and you require 2 VLANs (one for front-end and one for back-end traffic)
- 72GB is the maximum supported and tested RAM configuration with the VSA
- memory overcommit on VMs is not supported when utilizing the VSA. VMware's reasoning is that if swapping occurs, there could be a severe performance slowdown. I don't necessarily agree with this: if you put enough spindles in the local host, it should not be an issue. But that is VMware's official support statement.
- VMware says you should have 8 or more hard disks in RAID10 in the ESXi hosts. I see no reason why you could not utilize RAID5 or a different disk count. In fact, in my testing I did not utilize any "local RAID" per se, as I was running in a nested ESXi environment where the actual LUN was backed by RAID5 in a FAST-VP pool. I suspect VMware recommends a minimum of 8 disks in RAID10 for performance reasons, but there is no reason why you wouldn't treat spindle count on the ESXi hosts' local drives just as you would when sizing a SAN LUN in a traditional environment: not enough spindles equals performance issues, whether they are local disks or SAN disks. That said, VMware's official support statement requires RAID10 and a minimum of 8 disks.
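For the curious, the back-of-the-envelope spindle math is the same whether the disks are local or in an array. Here is a quick sketch; the per-disk IOPS figure and the RAID write penalties are common rules of thumb, not VSA-specific numbers:

```python
import math

# Back-of-the-envelope spindle sizing, identical to sizing a SAN LUN.
# RAID write penalties and per-disk IOPS are generic rules of thumb,
# not VSA-specific figures.

RAID_WRITE_PENALTY = {"RAID10": 2, "RAID5": 4}  # backend writes per frontend write

def disks_needed(frontend_iops, write_ratio, iops_per_disk, raid_level):
    """Estimate how many spindles are needed to sustain a frontend IOPS target."""
    reads = frontend_iops * (1 - write_ratio)
    writes = frontend_iops * write_ratio
    backend_iops = reads + writes * RAID_WRITE_PENALTY[raid_level]
    return math.ceil(backend_iops / iops_per_disk)  # round up: no partial disks

# Example: 1000 frontend IOPS at 30% writes, ~150 IOPS per 10k SAS disk.
print(disks_needed(1000, 0.3, 150, "RAID10"))  # -> 9
print(disks_needed(1000, 0.3, 150, "RAID5"))   # -> 13
```

Note how RAID5's higher write penalty pushes the spindle count up for the same workload, which may be part of why VMware standardized on RAID10 here.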
- the VSA mirrors the data utilizing RAID10 (a primary and a replica volume, each on a different host). This is not configurable, so plan for it from a capacity perspective. If you have 8 disks in your ESXi host in a RAID10 giving you a 1TB volume, and you have 2 hosts for a total of 2TB, you will end up with 1TB of usable capacity in your environment: on VSA1, 500GB will be primary and 500GB will be secondary, and similarly for VSA2.
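The capacity arithmetic above can be expressed as a trivial helper (the function name is my own; it simply encodes the halving caused by the VSA's mirroring):

```python
# Usable capacity in a VSA cluster: the VSA's RAID10 mirroring halves the
# combined local volumes, since every volume has a replica on another host.

def vsa_usable_tb(local_volume_tb, hosts=2):
    """Usable NFS capacity, given the post-RAID local volume size per host."""
    total_local = local_volume_tb * hosts  # e.g. 1TB x 2 hosts = 2TB
    return total_local / 2                 # half is consumed by replica copies

# The example from the text: 1TB local volume per host, 2 hosts -> 1TB usable.
print(vsa_usable_tb(1.0))  # -> 1.0
```

Remember this halving is on top of whatever your local hardware RAID already costs you, so raw disk to usable NFS capacity can easily be a 4:1 ratio with RAID10 at both layers.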
- the VSA exports the volumes as NFS; there is no support for iSCSI
- if you are running vCenter as a VM, it CANNOT run on the hosts participating in the VSA cluster
Next, we will look at how to get the VSA up and running in a nested ESXi environment, and following that, some general tasks: what is and is not possible with the VSA compared to traditional shared storage, and how it handles some failure scenarios.