Our main design principles were to develop host-side caching software that is both the highest-performing storage tier on the market and the easiest to deploy and use.
VirtuCache is kernel mode software for VMware that clusters together any in-host SSDs (and/or in-host RAM) installed across VMware hosts in a VMware cluster and then caches frequently and recently used data from any SAN based primary storage appliance to this clustered pool of host based high speed media. Subsequently, by automatically serving more and more data from in-host SSDs or RAM, VirtuCache substantially improves storage performance for VMware from any SAN based storage appliance, thus improving the performance of applications running within VMs and increasing the density of VMs running on each host, without requiring an expensive upgrade to SSD based storage appliances.
Note: Though most VirtuCache deployments use in-host SSDs as caching media, we can also use in-host RAM either exclusively or in addition to in-host SSDs. However, since RAM is itself a constrained resource on the host and its price/performance for random IO is 4x that of SSDs, we recommend that our customers use RAM only if they have spare RAM on the host.
All references to SSDs in the sections below apply to caching to RAM as well.
All read requests from the VMs on the Host are intercepted by VirtuCache software in the VMware Kernel. VirtuCache first checks the local cache for the requested data. If the data is in the SSD, it is served to the VM from the SSD (a ‘cache hit’). If the data is not in the SSD, the I/O path proceeds along its original course, and VMware retrieves the data from the backend LUN/Volume. At that point VirtuCache also copies the data to the local SSD. If the same data is requested again on the Host, it is now served from the local SSD instead of from the backend storage appliance. In this way VirtuCache accelerates read operations by serving more and more data from in-Host SSDs.
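The read path described above can be sketched as a small read-through cache. This is only an illustration, not VirtuCache's implementation: the `ReadCache` class, the dictionary standing in for the SSD, and the dictionary standing in for the backend LUN are all assumptions made for the example.

```python
class ReadCache:
    """Toy read-through cache. A dict stands in for the in-host SSD."""

    def __init__(self, backend):
        self.backend = backend  # hypothetical stand-in for the backend LUN
        self.ssd = {}           # block_id -> data held in the local cache
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.ssd:
            # Cache hit: serve the block from the local SSD.
            self.hits += 1
            return self.ssd[block_id]
        # Cache miss: fetch from backend storage, then keep a local copy
        # so the next read of this block is served from the SSD.
        self.misses += 1
        data = self.backend[block_id]
        self.ssd[block_id] = data
        return data


backend = {0: b"boot", 1: b"data"}
cache = ReadCache(backend)
cache.read(0)  # miss, block 0 is copied to the cache
cache.read(0)  # hit
cache.read(1)  # miss
print(cache.hits, cache.misses)
```

The first read of a block always goes to the backend, so the hit rate climbs as the working set is pulled into the cache.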
All writes from VMs on the Host are written to the local SSD without synchronously writing to the backend storage appliance. Writing to the in-host SSD substantially accelerates writes. However, because the writes are not synchronously committed to the backend storage appliance, there is a risk of data loss or corruption if the local Host or SSD fails. To guard against this possibility, VirtuCache protects the local cache by replicating (mirroring) the writes across Hosts in a VMware cluster.
Syncing ‘Dirty’ Writes to backend storage
Dirty Writes are writes on the local SSD cache that have not yet been synced with backend storage. VirtuCache has a background task that continuously syncs Dirty Writes to the backend SAN based storage. VirtuCache adjusts the speed and frequency at which Dirty Writes are synced based on the latencies exhibited by the SAN, so as not to choke the SAN by trying to sync to the backend appliance too quickly. Also, at no point in time will more than a few minutes of Dirty Writes be stored on the local SSD. This is to avoid large amounts of Dirty Writes following the VM during a vMotion.
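One way to picture latency-aware syncing is a background loop that shrinks its batch size when the SAN is slow and grows it when the SAN is responsive. The sketch below assumes an additive-increase, multiplicative-decrease rule; that rule, the 10 ms target, and the function names are hypothetical choices for illustration, since the text above does not specify the algorithm VirtuCache uses.

```python
def next_sync_batch(current_batch, san_latency_ms, target_latency_ms=10.0,
                    min_batch=1, max_batch=256):
    """Halve the batch when the SAN is slow, otherwise grow it by one block."""
    if san_latency_ms > target_latency_ms:
        return max(min_batch, current_batch // 2)
    return min(max_batch, current_batch + 1)


def flush_dirty(dirty, backend, batch_size):
    """Write up to batch_size of the oldest dirty blocks to the backend.

    `dirty` is a dict, which preserves insertion order, so the blocks
    that were written first are flushed first.
    """
    flushed = 0
    for block_id in list(dirty)[:batch_size]:
        backend[block_id] = dirty.pop(block_id)
        flushed += 1
    return flushed


dirty = {10: b"a", 11: b"b", 12: b"c"}  # dirty writes on the local SSD
san = {}                                # hypothetical stand-in for the SAN
batch = 4
batch = next_sync_batch(batch, san_latency_ms=25.0)  # SAN is slow: back off
flush_dirty(dirty, san, batch)
print(batch, san, dirty)
```

Backing off multiplicatively reacts quickly when the SAN is struggling, while the slow additive growth probes for spare capacity without causing a new spike.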
Cache Replication to protect against local Host or SSD failure
One of the main benefits of clustering SSDs across VMware Hosts is being able to mirror the cache across VMware Hosts in a distributed fashion. The administrator specifies the number (0, 1, or 2) of copies of cache that need to be kept for each local cache in the cluster. The number of copies is the maximum number of node failures that can be sustained without data loss in the cluster. If a customer chooses to keep, say, 2 copies of cache for each local Host based cache, VirtuCache automatically replicates the dirty writes to two SSDs on two other VMware Hosts in the same cluster. We default to using the vMotion network for such replication, but a separate network can be configured as well. Cached reads are not replicated, since the backend storage appliance always holds a consistent copy of that data. In the event of a Host or SSD failure, VirtuCache syncs the backup copy of the dirty write cache from another Host to the backend storage appliance.
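The replication and recovery behavior can be illustrated with a toy model in which each host keeps its own dirty writes plus mirrored copies of its peers' dirty writes. The `Host` class, the ring-order choice of replica hosts, and the host names are assumptions made for this sketch; the text above does not say how VirtuCache selects peer hosts.

```python
class Host:
    def __init__(self, name):
        self.name = name
        self.dirty = {}   # block_id -> data written by local VMs
        self.mirror = {}  # (owner_name, block_id) -> data mirrored from peers


def pick_replicas(local, cluster, copies):
    """Pick `copies` distinct peers, walking the host list after `local`."""
    if copies not in (0, 1, 2):
        raise ValueError("copies must be 0, 1, or 2")
    start = cluster.index(local)
    ring = cluster[start + 1:] + cluster[:start]
    if copies > len(ring):
        raise ValueError("not enough peer hosts for the requested copies")
    return ring[:copies]


def write(host, replicas, block_id, data):
    """Write to the local cache and mirror the dirty block to each replica."""
    host.dirty[block_id] = data
    for peer in replicas:
        peer.mirror[(host.name, block_id)] = data


def recover(failed_name, survivors, backend):
    """Flush the failed host's mirrored dirty blocks to backend storage."""
    for peer in survivors:
        for (owner, block_id), data in list(peer.mirror.items()):
            if owner == failed_name:
                backend[block_id] = data
                del peer.mirror[(owner, block_id)]


hosts = [Host("esx1"), Host("esx2"), Host("esx3")]
replicas = pick_replicas(hosts[0], hosts, copies=2)
write(hosts[0], replicas, 7, b"x")

san = {}                          # hypothetical stand-in for the backend appliance
recover("esx1", hosts[1:], san)   # esx1 fails; survivors flush its dirty writes
print(san)
```

With two copies, the dirty block survives the loss of the owning host and one replica host, which matches the rule that the copy count bounds the number of tolerable node failures.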
Flow control to prevent write intensive VMs from taking over the SSD
Since the SSD capacity deployed within VMware Hosts is typically a small percentage of the total LUN capacity of the backend storage appliance, care needs to be taken to prevent write intensive VMs from taking over the entire SSD. VirtuCache allows bursty writes from VMs to be written to the SSD at native SSD write speeds without synchronously writing the data to the backend disk. For prolonged write intensive activity from VMs, however, VirtuCache’s flow control feature throttles write speeds to the SSD. This helps ensure fair allocation of SSD capacity to other VMs on the Host and orderly de-staging of writes from the SSD to the backend LUN.
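The "allow bursts, throttle sustained writes" behavior resembles a token bucket, so a token bucket is used below purely as an illustration. It is an assumed stand-in, not a description of VirtuCache's flow control; the class name, burst size, and refill rate are hypothetical.

```python
class TokenBucket:
    """Admit short bursts at full speed; throttle sustained write streams."""

    def __init__(self, burst_blocks, refill_per_sec):
        self.capacity = burst_blocks
        self.tokens = float(burst_blocks)
        self.refill_per_sec = refill_per_sec
        self.last = 0.0  # time of the previous call, in seconds

    def admit(self, blocks, now):
        """Return True if `blocks` may be written unthrottled at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill tokens for the time that has passed, up to the burst size.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_sec)
        if blocks <= self.tokens:
            self.tokens -= blocks
            return True
        return False  # caller should delay or slow down this VM's writes


bucket = TokenBucket(burst_blocks=100, refill_per_sec=10)
print(bucket.admit(100, now=0.0))  # initial burst is admitted
print(bucket.admit(50, now=1.0))   # sustained writing: only 10 tokens refilled
print(bucket.admit(50, now=5.0))   # after a quiet period the VM can burst again
```

A per-VM bucket of this kind would let each VM burst up to its allowance while keeping any single VM from filling the SSD faster than dirty data can be de-staged.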
Keeping the cache ‘fresh’
We use a combination of Least Recently Used (LRU) and First-in-First-Out (FIFO) algorithms to replace less frequently used older data with newer data in cache, much like how traditional Operating Systems have been using these algorithms for Disk-to-Memory caching.
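To show how the two policies differ, the sketch below implements a small cache that can evict in either LRU or FIFO order. How VirtuCache actually combines the two is not stated above, so the `EvictingCache` class and its `policy` switch are assumptions for illustration only.

```python
from collections import OrderedDict


class EvictingCache:
    """Fixed-capacity cache that evicts in LRU or FIFO order."""

    def __init__(self, capacity, policy="lru"):
        if policy not in ("lru", "fifo"):
            raise ValueError("policy must be 'lru' or 'fifo'")
        self.capacity = capacity
        self.policy = policy
        self.blocks = OrderedDict()  # oldest entry first

    def get(self, block_id):
        if block_id not in self.blocks:
            return None
        if self.policy == "lru":
            # LRU: a read refreshes the block's position in the queue.
            self.blocks.move_to_end(block_id)
        return self.blocks[block_id]

    def put(self, block_id, data):
        if block_id in self.blocks:
            self.blocks[block_id] = data
            if self.policy == "lru":
                self.blocks.move_to_end(block_id)
            return
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # evict the entry at the front
        self.blocks[block_id] = data


def surviving_blocks(policy):
    cache = EvictingCache(capacity=2, policy=policy)
    cache.put(1, b"a")
    cache.put(2, b"b")
    cache.get(1)        # touch block 1
    cache.put(3, b"c")  # cache is full, one block must go
    return list(cache.blocks)


print(surviving_blocks("lru"))   # block 2 was least recently used
print(surviving_blocks("fifo"))  # block 1 was first in
```

The only difference between the two policies here is whether an access moves a block to the back of the queue: LRU rewards recently read data, while FIFO evicts strictly by age.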
One of the biggest challenges in developing Kernel mode software for Host side storage acceleration for VMware is doing it in a way that VMware will certify, which meant using only publicly available APIs from VMware. Most of our competitors have taken the easier route of developing such software using a VM based approach, either requiring a VM per Host (Virtual Storage Appliance) or requiring agents in guest VMs.
VirtuCache is now certified as Partner Verified and Supported as attested by the below link:
Our Solution is managed using the VirtuCache management VM. Only one instance of the management VM needs to be deployed per vCenter instance. The management VM lets administrators centrally manage all the VirtuCache instances on ESXi hosts managed by that vCenter instance using either our web user interface or our vCenter plug-in. The management VM is not in the IO path and can be powered off without affecting caching behavior.
By accelerating reads and writes using a VMware kernel-only deployment, the performance improvement that VirtuCache brings to our customers’ existing storage appliances rivals the performance of an All Flash Array, without the pain and costs involved in an upgrade to an All Flash Array.