Reducing write latencies in a VMware stretched SAN cluster
Only a few applications, financial trading software being one example, require latencies lower than even an all-flash array (AFA) can deliver. VirtuCache caching to in-host RAM results in lower VM latencies than an AFA, for two reasons. First, RAM latencies are an order of magnitude lower than those of NVMe SSDs. Second, with VirtuCache the cache media (RAM) is connected to the host CPU through a high-speed memory bus, whereas in an AFA the NVMe SSDs sit behind the network and the storage controller.
High write latencies in a stretched SAN cluster
Tourbillon Capital Partners is a hedge fund. They run proprietary trading software within VMware VMs that requires latencies under 5 milliseconds. Tourbillon has two VMware clusters with a few nodes in each cluster. Each ESXi cluster is connected to a Pure Storage SAN array. The two ESXi clusters are in different datacenters, connected to each other over a 10 Gbps WAN link. A stretched SAN cluster across these two ESXi clusters is created using DataCore software. Simply put, the DataCore stretched cluster ensures that every VM write is synchronously written to both Pure Storage arrays: the array in the same datacenter as the VM, and the remote array. In this way Tourbillon's IT team assures itself of an RPO and RTO of seconds to minutes in case of a datacenter outage.
The problem with this architecture was that VM write latencies sometimes exceeded the 5 ms ceiling required by their trading application. This was because every write had to cross the WAN link between datacenters before it could be acknowledged. Even though the WAN link was 10 Gbps, its latency would spike above 5 ms from time to time. Typically, the standard deviation of latencies on a long-distance WAN link is considerably higher than on a shorter LAN link of the same speed.
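To make the effect of synchronous mirroring concrete, here is a minimal sketch of how a WAN spike translates into a missed latency ceiling. It assumes a simplified model in which a write is acknowledged only once both arrays have it, so the acknowledgement time is the slower of the two paths. The function name, the 0.5 ms array service times, and the WAN samples are illustrative assumptions, not measurements from Tourbillon's environment.

```python
CEILING_MS = 5.0  # latency ceiling required by the trading application


def sync_mirrored_write_latency_ms(local_array_ms, remote_array_ms, wan_round_trip_ms):
    """Latency of a write that is acknowledged only after both copies land.

    Simplified model: the local and remote writes are issued in parallel,
    and the remote write additionally pays the WAN round trip.
    """
    return max(local_array_ms, remote_array_ms + wan_round_trip_ms)


# Hypothetical WAN round-trip samples (ms); the third one is a spike.
wan_samples_ms = [1.0, 1.5, 6.0, 1.0]

latencies = [
    sync_mirrored_write_latency_ms(0.5, 0.5, wan_ms) for wan_ms in wan_samples_ms
]
violations = [latency for latency in latencies if latency > CEILING_MS]

print(latencies)   # per-write acknowledgement latency in ms
print(violations)  # writes that breached the ceiling
```

In this model the local array is never the bottleneck: the WAN round trip sets the write latency, so a single spike on the link is enough to breach the ceiling regardless of how fast the arrays are.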
Caching VM writes to in-host RAM reduced write latencies considerably
Tourbillon deployed VirtuCache to fix this issue. VirtuCache was installed in every host, in both ESXi clusters. It was configured to cache reads and writes to in-host RAM, with the write cache replicated to another host in the same datacenter, which in turn resulted in sub-millisecond VM write latencies at all times. In this way, VirtuCache effectively papered over the underlying high WAN latencies when large volumes of writes were sent from VMs.
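The general technique described here is write-back caching with a local replica. The sketch below is a toy illustration of that idea only; it is not VirtuCache's implementation, and the class, method, and attribute names are hypothetical. A write is acknowledged once it sits in local RAM and in a peer host's RAM in the same datacenter, and the slow, WAN-mirrored backend is updated later by a separate flush step.

```python
from collections import deque


class WriteBackCache:
    """Toy write-back cache with one in-datacenter replica (illustrative only)."""

    def __init__(self):
        self.local = {}         # blocks held in this host's RAM
        self.peer_replica = {}  # copy of dirty blocks held in a peer host's RAM
        self.dirty = deque()    # blocks not yet written to the backend
        self.backend = {}       # stands in for the WAN-mirrored SAN storage

    def write(self, block, data):
        # The VM sees the write complete after the two RAM copies exist;
        # no WAN round trip is on this path.
        self.local[block] = data
        self.peer_replica[block] = data
        self.dirty.append(block)
        return "ack"

    def flush(self):
        # Runs in the background in a real system; here it is called explicitly.
        while self.dirty:
            block = self.dirty.popleft()
            self.backend[block] = self.local[block]
            # Once the backend has the data, the protective replica can go.
            self.peer_replica.pop(block, None)


cache = WriteBackCache()
status = cache.write(7, b"trade-record")
backend_before_flush = dict(cache.backend)
cache.flush()
backend_after_flush = dict(cache.backend)
```

The replica exists because dirty data in a single host's RAM would be lost if that host failed before the flush; keeping a second copy on a nearby host protects against that while adding only a LAN hop to the acknowledgement path.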