To Improve CEPH Performance for VMware, Install SSDs in VMware Hosts, NOT OSD Hosts

SSDs deployed for caching in CEPH OSD servers are not very effective. The problem lies not with the SSDs themselves, but with where they are deployed: at a point in the IO path that is downstream (relative to the VMs that run end user applications) of the IO bottleneck. This post looks at this performance shortcoming of CEPH and its solution.

There are two options for improving the performance of CEPH.

Option 1 is to deploy SSDs in CEPH OSD servers for journaling (write caching) and read caching.

Option 2 is to deploy SSDs in the VMware hosts (that connect to CEPH over iSCSI) along with host side caching software, which then automatically caches reads and writes from VMware Datastores created on CEPH volumes to the SSD in the VMware host.

Below are the reasons why we recommend Option 2.

How to Select SSDs for Host Side Caching for VMware – Interface, Model, Size, Source and RAID Level?

In terms of price/performance, enterprise NVMe SSDs have now become the best choice for in-VMware host caching media. They are higher performing and cost just a little more than their lower performing SATA counterparts. The Intel P4600/P4610 NVMe SSDs are my favorites. The Samsung PM1725a is my second choice. If you don’t have a spare 2.5” NVMe or PCIe slot in your ESXi host, which precludes you from using NVMe SSDs, you could use enterprise SATA SSDs. If you choose to go with SATA SSDs, you will also need a high queue depth RAID controller in the ESXi host. In the enterprise SATA SSD category, the Intel S4600/S4610 or Samsung SM863a are good choices. If you don't have a spare PCIe, NVMe, SATA, or SAS slot in the host, then the only choice is to use the much more expensive and higher performing host RAM as cache media.

This blog article covers the following topics.

- Write IOPS rating and lifetime endurance of SSDs.

- Sizing the SSD.

- How many SSDs are needed in a VMware host and across the VMware cluster?

- In case of SATA SSDs, the need to RAID0 the SSD.

- Queue Depths.

- Where to buy SSDs?

CEPH Storage for VMware vSphere

CEPH is a great choice for deploying large amounts of storage. Its biggest drawbacks are high storage latencies and the difficulty of making it work for VMware hosts.

The Advantages of CEPH.

CEPH can be installed on ordinary servers. It clusters these servers together and presents this cluster of servers as an iSCSI target. Clustering (of servers) is a key feature: it lets CEPH sustain component failures without causing a storage outage, and scale capacity linearly by simply hot adding servers to the cluster. You can build CEPH storage with off the shelf components - servers, SSDs, HDDs, NICs, essentially any commodity server or server components. There is no vendor lock-in for hardware. As a result, hardware costs are low. All in all, it offers better reliability and deployment flexibility at a lower cost than big brand storage appliances.

CEPH has Two Drawbacks - High Storage Latencies and Difficulty Connecting to VMware.

Improving Storage Performance of Dell VRTX

Dell's PowerEdge VRTX hyper-converged appliance can either have all hard drive datastores or all SSD datastores, but you can't have SSDs act as tiering or caching media for HDD volumes. That's where VirtuCache comes in.

VMware View Storage Accelerator (VSA), a.k.a. Content Based Read Cache (CBRC), versus Virtunet VirtuCache

VSA caches only those blocks that are shared by all VDI VMs, and it can use a maximum of only 2GB of host RAM. For these two reasons, it ends up caching only a small subset of blocks from the Master VM. In comparison, VirtuCache caches all storage IO, both reads and writes, whether it is from the Master VM, end user VDI VMs, server VMs, or the ESXi kernel, and it can cache to large amounts of in-host SSD and/or RAM. These two aspects of VirtuCache ensure that almost all storage IO is serviced from in-host cache media.

More details in the table below.

Improving Performance of Log Management Application at a Service Provider

Business Intelligence, Log Management, Security Information & Event Management (SIEM), and Search and Analytics software like Splunk, Elasticsearch, Cognos, HP Vertica, and HP Autonomy need to provide real-time visibility into large volumes of fast changing data. When these applications are deployed in traditional VMware VMs connected to centralized storage, the large volume of write and read operations puts pressure on the existing storage infrastructure, resulting in ingest and analysis speeds that are much slower than the real-time speeds expected of such applications.

GUI comparison, PernixData FVP vs Us – VMware Write Back Caching Software.

For the most part both our GUI and workflows are similar. This article compares, with screenshots, steps to install and configure VirtuCache and PernixData FVP.

The only differences between us and Pernix stem from the fact that we leverage VMware's own capability in the areas of clustering, license management, and logging, whereas Pernix programmed these features separately within their software. Overall these additional screens add a few clicks and pages in Pernix versus us, but again I want to emphasize that we are more similar than different, in terms of the GUI and workflow.

Replaced PernixData for vGPU based VDI

First of all - PernixData was a good host side caching software. Unfortunately for their customers, after they were acquired by Nutanix, Nutanix end-of-lifed their software. Our software, called VirtuCache, directly competes with PernixData FVP.

PernixData’s FVP vs. Virtunet – Both VMware Kernel Write Caching Software

More similar than different

Both PernixData and we differ from the rest of the host side caching vendors in similar ways: both are kernel mode software; both cache writes in addition to reads; both have data protection strategies in place to protect against data loss in case of multiple simultaneous hardware failures; neither requires networking or storage to be reconfigured; and neither requires an agent per VM or a VM per host.

This article is the first in a series of two articles that compares our software versus PernixData FVP. The second article compares (with screenshots) GUI and configuration steps for PernixData FVP and us.

Below is how we compare on important criteria.

Infinio’s Read Caching versus Virtunet’s Read+Write Caching Software

The biggest difference is that we accelerate both reads and writes, whereas Infinio accelerates only reads. There are a few other differences: with us you can apply caching policy at the datastore and/or VM level versus only at the VM level with Infinio, and we accelerate creation of VMs, snapshots, and other VMware kernel operations, which they don't. More details in the table below.

| Virtunet Systems | Infinio |
| --- | --- |
| Accelerates both reads and writes. | Accelerates only reads. By not caching writes, not only are writes not accelerated, but the reads that are behind writes on the same thread are not accelerated either, and so reads are slowed down as well. |
| Caching policy can be applied at the VM and/or Datastore level. Since the number of Datastores is typically much smaller than the number of VMs, it's quicker to configure caching for all the VMs in the cluster by assigning caching policy at the Datastore level than at the VM level. All VMs within the Datastore inherit the Datastore wide caching policy by default. You can of course apply a different caching policy at the VM level. | Caching policy can be applied only at the VM level. So if you have a large number of VMs, the process of configuring caching for all the VMs in the cluster becomes onerous. |
| When new VMs (server or desktop VMs) are created, these VMs automatically inherit the Datastore caching policy. | For all new server or desktop VMs, policy has to be manually applied to these VMs. This is especially problematic in non-persistent VDI, where new VMs are being continuously created without admin notification. |
| All storage IO, whether it originates within VMs or the VMware kernel, is accelerated. | Storage IO originating in the VMware kernel is not accelerated. So creation/deletion of VMs, snapshots, and linked clone operations are not accelerated, since these operations are initiated from within the VMware kernel and not from within any VM. |
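The Datastore level policy inheritance described in the table can be pictured with a small sketch. This is only an illustration of the inheritance rule, not VirtuCache's actual interface: the class name `Datastore`, the `CachePolicy` values, and the VM names are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class CachePolicy(Enum):
    # Hypothetical policy names, chosen only for this illustration.
    NO_CACHE = "no-cache"
    READ_ONLY = "read-only"
    READ_WRITE = "read-write"


@dataclass
class Datastore:
    """Hypothetical model of a Datastore with a default caching policy."""
    name: str
    policy: CachePolicy = CachePolicy.NO_CACHE
    # Optional per-VM overrides; VMs not listed here inherit `policy`.
    vm_overrides: Dict[str, CachePolicy] = field(default_factory=dict)

    def effective_policy(self, vm_name: str) -> CachePolicy:
        # A VM level policy wins if set; otherwise the VM inherits
        # the Datastore wide policy.
        return self.vm_overrides.get(vm_name, self.policy)


ds = Datastore("ceph-ds-01", policy=CachePolicy.READ_WRITE)
ds.vm_overrides["backup-vm"] = CachePolicy.NO_CACHE

# A newly created VM needs no configuration: it inherits the Datastore policy.
new_vm_policy = ds.effective_policy("new-vdi-vm")
override_policy = ds.effective_policy("backup-vm")
```

With a per-VM-only model, every new VM would need its own entry before any policy applied to it, which is the manual step the table describes for non-persistent VDI.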
