How to Select SSDs for Host-Side Caching for VMware — Interface, Model, Size, Source, and RAID Level?
In terms of price/performance, enterprise NVME SSDs are the best choice for in-VMware host caching media. The Samsung PM1735 (PCIe form factor) and the Micron 9400 MAX (2.5″ U.2/U.3 form factor) NVME SSDs are my favorites. If you don't have a spare U.2/U.3 NVME slot or a conventional PCIe slot in your ESXi host, which precludes you from using NVME SSDs, you can use SAS or enterprise SATA SSDs instead. If you go with SATA/SAS SSDs, you will also need a high queue depth RAID controller in the ESXi host. In the enterprise SATA category, the Micron 5400 MAX, and in the SAS category, the Samsung PM1653, are good choices. If you don't have any slots in the host to install an SSD, then the only choice is to use the more expensive but higher performing host RAM as cache media.
This blog article covers the following topics:
- Endurance and Write IOPS rating of the SSD are the two most important selection criteria
- Performance and cost for our recommended SATA and NVME SSDs
- Why not use consumer SSDs even though they have high IOPS ratings?
- Should you buy host SSDs from the server vendor or from retail?
- How to size the SSD?
- How many SSDs do you need?
- Why is Queue Depth so important?
- Do you need to RAID the SSD?
Question – What problem are you trying to solve?
You have no storage performance issues if you have less than 5 milliseconds (ms) latency at the VM level at all times. So there’s no need for you to go through this article if you don’t have this problem. 🙂
There are two types of storage performance problems in the VMware world. If the aggregate peak storage throughput from all VMs on a single ESXi host is greater than 100MBps, then the high latencies you are experiencing are a direct consequence of high throughput. However, such high throughput is rare, so we don't often run into this problem at customer sites. The more prevalent problem is when you experience high VM latencies (> 20ms peak latencies) even at low throughput (say < 1MBps peak throughput). Please note that you want to track VM level stats, not appliance or datastore level stats.
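To confirm that you have this problem, look at per-VM disk latency in vCenter's performance charts or esxtop. Below is a minimal pyVmomi sketch (assuming `pip install pyvmomi`) that pulls the VM-level `disk.maxTotalLatency.latest` counter; the vCenter hostname, credentials, and VM name are placeholders you'd replace with your own.

```python
# Minimal sketch: pull per-VM peak disk latency (ms) from vCenter using pyVmomi.
# The hostname, credentials, and VM name below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="********", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()

# Find the VM by name.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "my-vm")

# Map the counter name to its numeric id. 'disk.maxTotalLatency.latest' is the
# highest latency observed across all of the VM's virtual disks, in ms.
pm = content.perfManager
cid = next(c.key for c in pm.perfCounter
           if f"{c.groupInfo.key}.{c.nameInfo.key}.{c.rollupType}"
           == "disk.maxTotalLatency.latest")

spec = vim.PerformanceManager.QuerySpec(
    entity=vm, intervalId=20, maxSample=15,  # last ~5 minutes of 20s samples
    metricId=[vim.PerformanceManager.MetricId(counterId=cid, instance="")])
for series in pm.QueryPerf(querySpec=[spec])[0].value:
    print("Peak disk latency samples (ms):", series.value)

Disconnect(si)
```

If the reported peaks stay under 5ms, you don't have a latency problem worth solving.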
Both of these problems can be solved by host side caching software that caches to an in-VMware host SSD. If you are experiencing high latency at low throughput, then a SATA, SAS, or NVME SSD will all work. If you require low latencies at very high storage throughput (say > 100MBps of storage IO per host), then definitely go with write-intensive enterprise NVME or SAS SSDs, not SATA SSDs. More on SSD selection in later sections.
The two most important SSD selection criteria — Write IOPS rating and endurance
The Write IOPS rating for random 4KB blocks is the single most important parameter when selecting an in-VMware host SSD. This is because most storage IO from within VMware is random and uses small block sizes. All enterprise SSDs do well on reads, but only a few do well on small block (4KB) random writes. It's also almost always the case that an SSD with a higher random write IOPS rating performs better on random reads too. Lastly, since VirtuCache accelerates both reads and writes, we pay closer attention to the SSD's write IOPS specs.
The next most important parameter is the endurance of the SSD. Endurance is measured as the total amount of lifetime writes, in petabytes, that the SSD OEM warrants the SSD for. Since all enterprise SSDs are warranted for 5 years, this parameter is expressed either in petabytes written over 5 years or as DWPD (Drive Writes Per Day). DWPD is the number of times the entire capacity of the drive can be written every day for 5 years under the OEM warranty. Host side caching continuously replaces older, less frequently used data with newer, more frequently used data, and both deletes and new writes are write operations, so you need a high endurance SSD. The SSD OEM warrants the SSD for 5 years or until the write endurance limit is reached, whichever comes first.
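To convert a DWPD rating into lifetime petabytes written, multiply DWPD by the drive capacity and the warranty period. A quick sketch of the arithmetic is below; the 3 DWPD figure matches the published rating for the 3.2TB Micron 9400 MAX, but check your own drive's datasheet.

```python
# Back-of-envelope conversion from DWPD (Drive Writes Per Day) to total
# lifetime writes in petabytes over the warranty period.

def lifetime_writes_pb(capacity_tb: float, dwpd: float, warranty_years: int = 5) -> float:
    """Petabytes the OEM warrants the drive for over its warranty period."""
    return capacity_tb * dwpd * 365 * warranty_years / 1000

# A 3.2TB drive rated at 3 DWPD works out to ~17.5PB,
# which matches the endurance column in the table below.
print(f"{lifetime_writes_pb(3.2, 3):.1f} PB")  # -> 17.5 PB
```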
My recommended SATA, SAS, and NVME SSDs (as of 2024)
My favorite SSDs are the Samsung PM1735 NVME SSD (PCIe x8 form factor) and the Micron 9400 MAX NVME SSD (2.5″ U.2/U.3 form factor). Datacenter NVME SSDs come in either the 2.5″ U.2/U.3 form factor or the conventional PCIe add-in card form factor. If your server doesn't have either of these slots but has a spare SATA / SAS slot, you could go with the Micron 5400 MAX SATA SSD or the Samsung PM1653 SAS SSD. However, if you decide to go with SATA or SAS SSDs, ensure that the RAID controller in the host has a high queue depth. If you don't have any slot in the server for an SSD, then the only choice is to use the more expensive and higher performing host RAM as cache media.
Below is a table that compares key metrics for these SSDs and host RAM as tested by us, for storage IO generated within a VMware VM for 100% random 100% read tests using 4KB block size, and where VirtuCache was caching the entire Iometer test file to in-VMware host caching media (100% cache hit ratio).
| In-VMware Host Cache Media | Read Throughput (MBps) | Read Latency (ms) | Cost $/GB (in 2024) | Endurance (Petabytes Written) | Standard Deviation for Latencies |
|---|---|---|---|---|---|
| Host RAM | 630 | 0.4 | $3 | Not a concern. Very high. | Very low. |
| Micron 9400 MAX 3.2TB, U.2/U.3 form factor | 400 | 0.5 | $0.25 | 17.5 | Very low. |
| Samsung PM1735 NVME 3.2TB, PCIe form factor | 300 | 0.7 | $0.30 | 17.5 | Low. |
| Micron 5400 MAX 3.8TB, Enterprise SATA | 120 | 4 | $0.20 | 24 | Medium. |
| Samsung PM1653 3.8TB, Enterprise SAS | 210 | 1.4 | $0.25 | 7 | Low. |
What about SAS SSDs?
SAS SSDs tout better error control and lower failure rates than other SSDs, and they are lower latency and more consistent in performance than SATA SSDs. Also, all SAS SSDs are enterprise grade, so unlike with SATA SSDs you don't need to spend extra time confirming that you indeed have an enterprise grade SSD; with SATA, if you don't read the specs carefully, you might end up buying a consumer SSD (a big NO). The only drawback of a SAS SSD is that you need a high queue depth RAID controller to RAID0 the SSD and thereby assign it a high queue depth. So if you are buying new SSDs, I recommend an NVME SSD as the first choice, or SAS if you don't have an NVME / PCIe slot.
Consumer SSDs are cheaper and have high IOPS ratings, so why not use consumer SSDs?
First of all, consumer SSDs have low endurance (less than 500TB of lifetime writes in most cases). Also, consumer SSDs are warranted for 3 years, not 5 years like their enterprise counterparts.
Secondly, looking at the IOPS ratings of some consumer SSDs, you might get the impression that they are higher performing than enterprise SSDs. However, in a VMware environment you are better served by lower latencies and a low standard deviation for latencies than by simply comparing IOPS ratings across SSDs. Unfortunately, SSD OEMs don't list latencies; they only list IOPS or MBps throughput. For instance, a few Samsung consumer SATA SSDs have higher throughput / IOPS ratings than the enterprise grade Micron 5400 MAX, but the Micron 5400 MAX is much lower latency than these Samsung SSDs and far more consistent (low standard deviation for latencies) as well.
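The reason an IOPS rating can mislead is that it's measured at a high queue depth, and average latency is roughly outstanding IOs divided by IOPS (Little's law). A drive can post a big IOPS number while each individual IO waits a long time. The figures below are made up for illustration, not taken from any datasheet.

```python
# Little's law: average latency = outstanding IOs / IOPS.
# Drive A posts the bigger IOPS number, yet drive B completes each IO faster.
# Illustrative numbers only, not from any vendor datasheet.

def avg_latency_ms(outstanding_ios: int, iops: float) -> float:
    return outstanding_ios / iops * 1000

print(avg_latency_ms(256, 100_000))  # Drive A: 100k IOPS at QD 256 -> 2.56 ms
print(avg_latency_ms(32, 50_000))    # Drive B:  50k IOPS at QD 32  -> 0.64 ms
```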
Where to buy SSDs from?
You can buy host side SSDs from your server vendor or from retailers like Amazon.com, CDW, Newegg, etc. An SSD costs much less when bought from a retailer than when the same SSD is bought from the server vendor. Retailers also pass through the SSD OEM's 5-year warranty, whereas the same SSD rebranded by the server vendor is warranted by the server vendor for only 3 years. The one "advantage" of a server vendor branded SSD is that it makes the server management console light go green, whereas the same SSD bought from Amazon.com might not.
What size SSD?
My rule of thumb for generic IT workloads is that 10-20% of media serves 80-90% of storage requests. So the SSD capacity should be about 20% of the storage used by all the VMs on that host. While evaluating VirtuCache, if you notice (on the VirtuCache stats screen) that the cache hit ratio is low and the SSD is full, you should increase the SSD capacity until you get to a > 90% cache hit ratio. You can do that by replacing your existing SSD with a single higher capacity SSD (preferred) or by getting two smaller SSDs of equal size and creating a RAID0 array across the two.
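Here is the rule-of-thumb arithmetic as a quick sketch; the per-VM used capacities are hypothetical, and the 20% working set figure is the rule of thumb above.

```python
# Rule-of-thumb cache sizing: the SSD should be ~20% of the storage used by
# all VMs on the host. The per-VM used capacities below are hypothetical.

def cache_ssd_size_tb(vm_used_tb: list[float], working_set: float = 0.20) -> float:
    return sum(vm_used_tb) * working_set

vms_tb = [0.5, 1.2, 2.0, 0.8, 1.5]  # storage used per VM on this host, in TB
print(f"Suggested cache SSD capacity: {cache_ssd_size_tb(vms_tb):.1f} TB")  # 1.2 TB
```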
I wouldn't recommend using RAID controllers for NVME SSDs. First of all, RAID controllers for NVME SSDs are rarely fitted in servers, because a single NVME SSD is now as large as 15TB and NVME SSDs are also quite fast, so RAID is not needed for NVME SSDs for either capacity or performance reasons. Secondly, the current generation of NVME RAID controllers is not very good; they increase the latencies of the underlying NVME SSDs quite a bit.
How many SSDs do you need?
With VirtuCache, if you are caching only reads (Write-Through Caching), then you need only one SSD per host, and only for those hosts that need to be accelerated. If you are caching writes as well (Write-Back Caching), you will need one SSD per host for all the hosts in the VMware cluster. This is because in the case of write caching, VirtuCache commits a write to the local SSD and synchronously copies that same write to an SSD in another host in the same ESXi cluster. Writes are mirrored across two hosts in this fashion to protect against data loss in case of a host failure.
Also, I don't recommend using multiple SSDs in a single host; use a single SSD instead. This is because a RAID0 array of a single SSD is higher performing than a RAID0 array of multiple SSDs.
Why is Queue Depth so important?
Queue Depth is a number assigned by the device vendor to their storage device (or software) that advertises, to the component above it in the storage IO path, the maximum number of IO requests the device (or software) can process in parallel. Every device or software component in the storage IO path has a Queue Depth associated with it. IO requests sent to the device in excess of its Queue Depth get queued. You don't want queueing on any device in the storage IO path, else latencies go up. A higher Queue Depth means that the device is lower latency and generally higher performing.
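To see why queueing hurts, consider a toy model (an illustration, not a measurement): if a device works on at most QD requests at a time, each taking roughly the same service time, then requests beyond the Queue Depth wait in line and complete in waves.

```python
import math

# Toy queueing model: the device works on `qd` requests at a time, each taking
# `service_ms`; requests beyond the queue depth wait for an earlier wave to
# finish. Numbers are illustrative only.

def completion_ms(outstanding: int, qd: int, service_ms: float) -> float:
    waves = math.ceil(outstanding / qd)  # batches needed to drain the queue
    return waves * service_ms            # time until the last batch completes

print(completion_ms(1024, 32, 0.5))    # low queue depth controller -> 16.0 ms
print(completion_ms(1024, 1024, 0.5))  # high queue depth NVME path ->  0.5 ms
```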
If using SATA SSDs, please check the Queue Depth of both the SSD device and the RAID controller. The esxtop command in ESXi shows you the Adapter Queue Depth (a field called AQLEN) for the RAID controller and the Disk Queue Depth (a field called DQLEN) for the RAID0 SSD device. With cheap (low Queue Depth) RAID controllers, the controller becomes a bottleneck for the RAID0 SSD behind it, so it is very important that both AQLEN and DQLEN be greater than 512.
In the case of NVME SSDs, Queue Depths are always greater than 1200, but it's still a good idea to confirm the Queue Depth (AQLEN and DQLEN) of the NVME SSD.
Here is a related blog post on how Queue Depth affects storage latency and IOPS in VMware.
Do you need to RAID the SSD?
This section is not applicable to NVME SSDs, since the Queue Depth of an NVME SSD is very high without a RAID controller ahead of it.
Yes, you should RAID the SATA/SAS SSD in each host, but not for the conventional reason of protecting data. You need to RAID0 the SATA / SAS SSD to assign it a higher Queue Depth than what the default VMware SATA driver is capable of assigning to it; this is the only way to do so. With a higher Queue Depth, the SSD can process a larger number of requests in parallel, improving throughput and reducing latencies.
You shouldn't do RAID1 or a higher RAID level, since VirtuCache takes care of protecting against data loss if an SSD or host fails. For the read cache, the data on the local SSD is always in sync with the storage array. And in the case of write caching, VirtuCache protects the writes on the SSD by mirroring them over the network to an SSD in another host (write cache replication). So even if a host were to fail, you don't lose any data. Also, having multiple SSDs in RAID1 (or a higher RAID level) considerably worsens SSD latencies.
In summary:
- Use enterprise SSDs, not consumer SSDs.
- Use an NVME SSD if you have a spare PCIe slot or U.2/U.3 NVME slot in your server. If you don't have these slots, use a SAS SSD (or SATA as a second choice) behind a high queue depth RAID controller. If you don't even have a spare SATA / SAS slot in the server, then the only choice is to use host RAM as cache media.
- If using a SATA / SAS SSD, make sure that the RAID controller has a Queue Depth higher than 512. For NVME SSDs, Queue Depth is not a concern.
- For accelerating reads + writes, you need cache media (host RAM / SSD) in every host in the ESXi cluster. For accelerating only reads, cache media is needed only in those hosts needing acceleration.
- If using SATA / SAS SSDs, RAID0 a single SSD. Do not RAID1 (or use a higher RAID level across) multiple SSDs. If using an NVME SSD, RAID is not relevant.
- Though you can use multiple SSDs in a host, it's preferable to use only one SSD.
Disclaimer: The author and Virtunet have no affiliation with Samsung, Micron, or any other SSD OEM. No monetary compensation or free SSD samples were provided to the author or Virtunet by Samsung or Micron.