How to simulate production workloads running in VMware VMs using FIO or Iometer?

Here is a quick way to reproduce your entire ESXi cluster-wide production workload using only one VM running a storage IO testing tool like FIO or Iometer.
This exercise is useful when you evaluate new storage technologies to see how they might perform with your real-life workload, but without actually deploying those in production.

The goal of this post is to do this in under 30 minutes using the freely available Iometer or FIO tools.

STEP 1: If most of your workload runs in Linux VMs, use FIO. If it runs mostly in Windows VMs, use Iometer.

STEP 2: Listed below are the three most important characteristics of your production workload that you need in order to simulate it in Iometer or FIO, with instructions for collecting each one. Replace the fields shown in <> with the appropriate values in the FIO command line or the Iometer GUI.

Collect the storage IO parameters from any one of your ESXi hosts, preferably the host that’s doing the most IO.

SSH to that production host, run esxtop at the shell prompt, press d to open the disk adapter screen, and calculate the values using the equations listed below.

<Block Size>: Payload carried by each IO. (KiloBytes)

[(MBREAD/s + MBWRTN/s) ÷ (READS/s + WRITES/s)] X 1000
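As a worked example of the block size formula, here is a minimal Python sketch. The function name and the sample readings are made up for illustration; substitute the values you see in esxtop on your own host.

```python
def block_size_kb(mb_read_s, mb_written_s, reads_s, writes_s):
    """Average payload per IO in KB: total MB/s divided by total IOPS, times 1000."""
    total_iops = reads_s + writes_s
    if total_iops == 0:
        raise ValueError("no IO observed; sample esxtop while the host is busy")
    return (mb_read_s + mb_written_s) / total_iops * 1000


# Hypothetical readings: 12 MB/s read, 6 MB/s written, 3000 reads/s, 1500 writes/s.
size = block_size_kb(12.0, 6.0, 3000, 1500)
print(f"{size:.1f} KB")  # prints 4.0 KB
```

Round the result to the nearest common IO size (4K, 8K, 16K and so on) before plugging it into FIO or Iometer.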

<Read/Write Mix>: Proportion of read IOPS to total IOPS.

[READS/s ÷ (READS/s + WRITES/s)] X 100, which gives the read share as a percentage (the form FIO and Iometer expect).
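The read/write mix can be worked out the same way. This is a small sketch with a hypothetical function name and sample numbers; it returns a whole-number percentage because that is what the rwmixread option in FIO takes.

```python
def read_mix_percent(reads_s, writes_s):
    """Share of reads in total IOPS, as a whole-number percentage."""
    total_iops = reads_s + writes_s
    if total_iops == 0:
        raise ValueError("no IO observed; sample esxtop while the host is busy")
    return round(100 * reads_s / total_iops)


# Hypothetical readings: 3450 reads/s and 1550 writes/s.
print(read_mix_percent(3450, 1550))  # prints 69
```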

<IO Depth>: Number of simultaneous IO requests generated by your workload.

In the same esxtop session, press u to switch to the disk device screen, then sum the ACTV and QUED values across all your LUNs. If both ACTV and QUED for a LUN are zero while you are watching this screen, assume a value of 1 for ACTV and 0 for QUED for that LUN.
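The IO depth sum, including the rule for idle LUNs, can be sketched as follows. The function name and the list of (ACTV, QUED) pairs are illustrative assumptions, not esxtop output.

```python
def total_io_depth(luns):
    """Sum ACTV + QUED over all LUNs; luns is a list of (actv, qued) pairs."""
    total = 0
    for actv, qued in luns:
        if actv == 0 and qued == 0:
            actv = 1  # idle LUN: count it as ACTV=1, QUED=0
        total += actv + qued
    return total


# Hypothetical readings for three LUNs, the last one idle while watching.
print(total_io_depth([(5, 2), (3, 1), (0, 0)]))  # prints 12
```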

You will also need the below parameters to plug into FIO and Iometer:

  1. <Test File Size>: 5GB file size is a good balance between two competing objectives – quicker test completion times and ensuring that the dataset is large enough to overflow appliance memory buffers.

  2. <Random Workload>: Use 100% random workload since VMware workloads are random. Also, random workloads stress the storage infrastructure much more than sequential.

  3. <Number of Test VM Cores>: Since real-life applications are multi-threaded, you need to define the number of threads that Iometer or FIO will spawn, and it is over these threads that storage IO is generated by FIO and Iometer. The parameter is called ‘numjobs’ in FIO and ‘Workers’ in Iometer. It should be set to the number of CPU cores assigned to the VM. Since the number of threads that can be processed simultaneously cannot exceed the total CPU cores assigned to the VM, a higher value will just choke the CPU and not be of much use.

STEP 3: Run the test by following Step 3a if you are using FIO in Linux, or Step 3b if you are using Iometer in Windows.

STEP 3a: If using FIO in a Linux VM.

Run the command below from within your Linux VM:

sudo fio --size=<Test File Size> --direct=1 --rw=randrw --rwmixread=<Read/Write Mix> --bs=<Block Size> --ioengine=libaio --iodepth=<IO Depth> --runtime=1200 --numjobs=<Number of Test VM Cores> --time_based --group_reporting --name=my_production_workload_profile

For instance:

sudo fio --size=5GB --direct=1 --rw=randrw --rwmixread=69 --bs=4K --ioengine=libaio --iodepth=12 --runtime=1200 --numjobs=4 --time_based --group_reporting --name=my_production_workload_profile

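The above syntax means that FIO will run on a 5GB file, the workload is fully random with 69% reads and 31% writes, the payload per IO is 4KB, the test will run for 20 minutes (1200 seconds), and it uses 4 jobs (processes) running in parallel. Note that FIO applies size and iodepth to each job, so in this example each of the 4 jobs keeps up to 12 IO requests in flight against its own 5GB test file. To make the total across all jobs equal <IO Depth>, as in the Iometer setup in Step 3b, set iodepth to <IO Depth> ÷ <Number of Test VM Cores>.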
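If you collect the parameters in a script, the command line can be assembled from them. The sketch below is one possible way to do it; the function name and its defaults are assumptions for illustration, and it uses the double-dash spelling of the same FIO options shown above.

```python
import shlex


def fio_command(block_size, read_mix, io_depth, num_jobs,
                file_size="5GB", runtime_s=1200):
    """Return the fio argument list for the random read/write test."""
    return [
        "fio",
        f"--size={file_size}",
        "--direct=1",
        "--rw=randrw",
        f"--rwmixread={read_mix}",
        f"--bs={block_size}",
        "--ioengine=libaio",
        f"--iodepth={io_depth}",
        f"--runtime={runtime_s}",
        f"--numjobs={num_jobs}",
        "--time_based",
        "--group_reporting",
        "--name=my_production_workload_profile",
    ]


cmd = fio_command("4K", 69, 12, 4)
# Print a copy-pasteable command line; run it with sudo as in the example above.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the arguments as a list also makes it straightforward to pass them to subprocess.run without shell quoting issues.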

STEP 3b: If using Iometer in a Windows VM.

First, install the older 2006.07.27 release of Iometer. Don’t use the newer 1.1.0 release, which has bugs.

Here is a blog post with screenshots on how to run Iometer. If you are already familiar with Iometer, skip this link and proceed to the section below, which lists the fields and associated values to plug into the Iometer GUI.

  1. Create as many ‘Workers’ in Iometer as there are <Number of Test VM Cores>. An Iometer ‘Worker’ is a thread (process).
  2. For each ‘Worker’, on the ‘Disk Targets’ tab, set the ‘# of Outstanding I/Os’ to <IO Depth> ÷ <Number of Test VM Cores> and set the ‘Maximum Disk Size’ to 10 million sectors, which with 512-byte sectors is roughly a 5GB test file. Configure each worker with the same values.
  3. Assign ‘Access Specification’ to each ‘Worker’. ‘Access Specification’ is the workload pattern generated by each ‘Worker’. ‘Transfer Request Size’ should be equal to the <Block Size>. Set ‘Percent Random/Sequential Distribution’ to 100% Random. Set the ‘Percent Read/Write Distribution’ per the <Read/Write Mix>. Assign the same ‘Access Specification’ to all ‘Workers’.
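The per-worker numbers in the list above are simple arithmetic, sketched here in Python. The function names are hypothetical, the 512-byte sector size is an assumption about how Iometer counts ‘Maximum Disk Size’, and the division rounds down with a floor of 1, which is one reasonable choice when <IO Depth> does not divide evenly.

```python
SECTOR_BYTES = 512  # assumption: Iometer counts 512-byte sectors


def outstanding_ios_per_worker(io_depth, workers):
    """Split the total IO depth across workers, rounding down, minimum 1."""
    return max(1, io_depth // workers)


def sectors_to_gb(sectors):
    """Size in GB of a test file with the given number of sectors."""
    return sectors * SECTOR_BYTES / 1e9


print(outstanding_ios_per_worker(12, 4))  # prints 3
print(sectors_to_gb(10_000_000))          # prints 5.12
```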

STEP 4: Measuring Performance

Run the FIO or Iometer test for 20 minutes and then collect the read & write throughput and latency from the FIO or Iometer output. If the latencies are under 5ms, then your storage infrastructure is performing fine.
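For FIO, the readings can also be pulled out of its JSON report instead of being read by eye. The sketch below assumes fio 3.x run with --output-format=json and --group_reporting, where total latency appears under lat_ns in nanoseconds and bandwidth under bw in KiB/s; check the output of your own fio version before relying on these field names. The function name and the sample report are made up for illustration.

```python
import json


def summarize(fio_json_text, threshold_ms=5.0):
    """Extract IOPS, bandwidth and mean latency per direction from fio JSON."""
    job = json.loads(fio_json_text)["jobs"][0]  # one entry with group_reporting
    summary = {}
    for direction in ("read", "write"):
        stats = job[direction]
        latency_ms = stats["lat_ns"]["mean"] / 1e6  # nanoseconds to milliseconds
        summary[direction] = {
            "iops": stats["iops"],
            "bw_kib_s": stats["bw"],
            "latency_ms": latency_ms,
            "ok": latency_ms < threshold_ms,
        }
    return summary


# Hypothetical report fragment, not real measurements.
sample = json.dumps({"jobs": [{
    "read": {"iops": 8000.0, "bw": 32000, "lat_ns": {"mean": 1200000.0}},
    "write": {"iops": 3600.0, "bw": 14400, "lat_ns": {"mean": 6500000.0}},
}]})

for direction, values in summarize(sample).items():
    print(direction, values)
```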

If you are evaluating VirtuCache, wait till the ‘cache hit ratio’ field (VirtuCache GUI > ‘Performance’ tab) reaches 99%, before you take throughput and latency readings.

STEP 5: Scaling Iometer and FIO to mimic workload for your entire ESXi cluster.

The above steps showed how to mimic the workload for one of your hosts in the cluster. To scale this test to represent the storage workload for your entire ESXi cluster, multiply the <IO Depth> by the number of hosts in your cluster and run the test again using the new <IO Depth> value. If the new <IO Depth> exceeds 64, clone the test VM and distribute the <IO Depth> equally across the test VMs, keeping the <IO Depth> at or below 64 per test VM. With an <IO Depth> above 64, a single VM will not simulate your production workload properly.
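The scaling rule can be sketched as a small calculation. The function name is hypothetical, and treating 64 as an inclusive per-VM ceiling is an assumption based on the description above.

```python
import math


def scale_to_cluster(host_io_depth, hosts, max_per_vm=64):
    """Return (cluster IO depth, number of test VMs, IO depth per test VM)."""
    total = host_io_depth * hosts
    vms = max(1, math.ceil(total / max_per_vm))  # clone VMs only when needed
    per_vm = math.ceil(total / vms)              # spread the depth evenly
    return total, vms, per_vm


# Hypothetical cluster: 8 hosts, each with an IO depth of 12.
print(scale_to_cluster(12, 8))  # prints (96, 2, 48)
```

In this example the cluster-wide IO depth of 96 is split across two test VMs running at an IO depth of 48 each.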
