September 1, 2020; Connecticut, USA.
When all storage IO is cached to VMware host Flash or RAM, WAN latency does not contribute to VM latencies, even when a storage array fails in a VMware metro / stretched cluster.
This post discusses how VirtuCache did this for Mashantucket casinos, by explaining the read and write IO path in VMware Metro Cluster before and after installing VirtuCache.
WRITE IO PATH IN VMWARE METRO CLUSTER
At the customer, ESXi hosts in both datacenter locations are connected to storage appliances at both sites. The storage network between the ESXi hosts at one site and the storage array at the other site runs over a 1gbps WAN link, and the storage network between the ESXi hosts and the storage array at the same site runs over a 10gbps LAN.
The IO path from the ESXi hosts to the storage appliance at the same site is shorter than the IO path to the appliance at the remote site, so all reads and writes from VMs go to the array at the local site only. These local paths are called the active (or ALUA optimized) paths. The paths from the hosts to the array at the remote site, which is separated by a WAN link, are the inactive (or ALUA unoptimized) paths. Storage IO goes over the inactive paths only if the storage array with the active paths fails.
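The path preference described above can be sketched in a few lines of Python. This is only an illustrative model, not VMware's multipathing code: the `Path` class, the `select_paths` function, and the array names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Path:
    array: str             # storage array this path leads to (hypothetical name)
    optimized: bool        # True for the local (ALUA optimized) path
    array_up: bool = True  # False once the array behind this path has failed


def select_paths(paths):
    """Return the paths that storage IO should use right now."""
    live = [p for p in paths if p.array_up]
    optimized = [p for p in live if p.optimized]
    # Prefer the optimized (local) paths; fall back to the unoptimized
    # (remote, over the WAN) paths only when no optimized path is left.
    return optimized if optimized else live


paths = [
    Path("site-A-array", optimized=True),   # local array over the 10gbps LAN
    Path("site-B-array", optimized=False),  # remote array over the 1gbps WAN
]
normal = select_paths(paths)    # regular operation
paths[0].array_up = False       # simulate failure of the local array
failover = select_paths(paths)  # IO now goes over the WAN paths
```

During regular operation only the local array is selected; after the simulated failure the only remaining choice is the remote array.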
Write IO path in metro storage cluster with VirtuCache installed on every ESXi host
Once VirtuCache was installed on all hosts at both datacenters, every write from a VM is written to SSD / RAM in the local ESXi host, and a second copy is written to SSD / RAM in another host in the same datacenter. This happens whether or not the local storage array is operational. In other words, VirtuCache sends a write acknowledgement back to VMware as soon as it commits the write to cache media in the ESXi hosts, before the write is committed to the backend storage array.
A VirtuCache background job continuously syncs the cached writes to the backend storage array, but this write flush process does not contribute to VM write latency. For this reason, neither the local storage network latency nor the inter-datacenter WAN latency contributes to VM write latency when VirtuCache is in the IO path.
During regular operation, VirtuCache syncs the write cache to the local array. When the local storage array fails, VirtuCache syncs the write cache to the remote array. In either case, VM write latencies remain the same. Since WAN latency does not factor into VM write latency, a lower bandwidth / higher latency link between the datacenters works just fine.
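The write-back behaviour described above can be modelled with a short Python sketch. It is a toy model under stated assumptions, not VirtuCache's implementation: the `WriteBackCache` class, its method names, and the latency values are hypothetical, and the local and peer commits are simply summed because the source does not say whether they happen in parallel.

```python
from collections import deque


class WriteBackCache:
    """Toy model: acknowledge writes from host cache, flush to an array later."""

    def __init__(self, cache_latency_ms, peer_latency_ms):
        self.cache_latency_ms = cache_latency_ms  # commit to local SSD / RAM
        self.peer_latency_ms = peer_latency_ms    # copy to a peer host's cache
        self.dirty = deque()  # acknowledged writes not yet on a backend array

    def write(self, block, data):
        """Commit the write to cache and return the latency the VM sees."""
        self.dirty.append((block, data))
        # Backend array latency (LAN or WAN) is not part of this sum,
        # because the acknowledgement does not wait for the flush.
        return self.cache_latency_ms + self.peer_latency_ms

    def flush(self, array):
        """Background job: drain dirty writes to whichever array is reachable."""
        while self.dirty:
            block, data = self.dirty.popleft()
            array[block] = data


cache = WriteBackCache(cache_latency_ms=0.1, peer_latency_ms=0.3)  # assumed values
vm_latency = cache.write(7, b"payload")
local_array, remote_array = {}, {}
cache.flush(remote_array)  # local array has failed, so flush to the remote one
```

In this model, `write` returns the same latency whether `flush` later targets `local_array` or `remote_array`, which is the property the text relies on.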
Read IO path in vMSC with VirtuCache installed on every ESXi host
VirtuCache caches frequently and recently used reads to in-host cache media. Since most reads are serviced from in-host SSD / RAM, only a small number of reads go over the local storage network (during regular operation) or over the WAN link (when the local storage array fails). Since the volume of reads coming from the backend array is small, a lower bandwidth / higher latency link between the datacenters suffices.
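One way to reason about the read path is a simple weighted average of cache and backend latency. The function below is a rough sketch, and the hit ratio and latency figures are hypothetical inputs chosen for illustration, not measurements from this customer.

```python
def effective_read_latency_ms(hit_ratio, cache_ms, backend_ms):
    """Average read latency when a fraction `hit_ratio` is served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Only cache misses (1 - hit_ratio) pay the backend latency.
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * backend_ms


# Assumed inputs: 99% cache hits, 0.2 ms from in-host cache,
# 5 ms from the local array, 50 ms from the remote array over the WAN.
local_case = effective_read_latency_ms(0.99, 0.2, 5.0)
wan_case = effective_read_latency_ms(0.99, 0.2, 50.0)
```

The backend latency is weighted by the miss ratio only, so the higher the cache hit ratio, the less the WAN link matters to the average read.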
CUSTOMER COST / BENEFIT
With VirtuCache installed, the customer can now go with a lower bandwidth WAN link, stretch the cluster across longer distances, and tolerate WAN latency peaks of up to 200 milliseconds (much more than the 5-10 millisecond WAN latency that the storage vendor recommends), without adversely impacting VM read and write latencies.
The customer had 4 hosts at each site and decided to buy a 1gbps WAN link to the internet instead of a 10gbps point-to-point link between datacenter locations.
The table below lists the VirtuCache cost over 3 years and the customer's cost savings from buying a 1gbps internet WAN link instead of a more expensive 10gbps point-to-point link. As you can see, the cost savings more than make up for the money spent on VirtuCache.
| |VirtuCache with a 3.2TB Samsung NVME SSD deployed in each of the 8 hosts. Perpetual license with 3-year support.|Cost savings because the customer decided to buy a 1gbps WAN link instead of a 10gbps point-to-point link.|
|---|---|---|
|Cost components|$5K per host for VirtuCache and $1.1K for the 3.2TB Samsung NVME SSD.|Difference in cost between 10gbps P-2-P WAN link and 1gbps internet WAN link = $3.2K/month|
|Cost over 3 years|8 VirtuCache licenses + 25.6TB NVME SSD capacity = $49K|WAN cost savings over 3 years = $115K|
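
The totals in the table follow from the per-host and per-month figures. The short Python check below reproduces them; the variable names are illustrative, and the net figure is derived here rather than quoted from the customer.

```python
HOSTS = 8
LICENSE_PER_HOST = 5_000       # VirtuCache perpetual license with 3-year support
SSD_PER_HOST = 1_100           # 3.2TB Samsung NVME SSD
SSD_TB_PER_HOST = 3.2
WAN_SAVING_PER_MONTH = 3_200   # 10gbps P-2-P link cost minus 1gbps internet link cost
MONTHS = 36                    # 3 years

virtucache_cost = HOSTS * (LICENSE_PER_HOST + SSD_PER_HOST)  # 48,800, about $49K
ssd_capacity_tb = HOSTS * SSD_TB_PER_HOST                    # 25.6 TB in total
wan_savings = WAN_SAVING_PER_MONTH * MONTHS                  # 115,200, about $115K
net_savings = wan_savings - virtucache_cost                  # 66,400 over 3 years

print(f"VirtuCache cost: ${virtucache_cost:,}")
print(f"WAN savings:     ${wan_savings:,}")
print(f"Net savings:     ${net_savings:,}")
```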