- SuperMicro servers for three ESXi hosts and three CEPH nodes.
- 38x 16TB SATA HDDs in each CEPH node, for a total raw storage capacity of 1.8PB.
- Samsung PM1725b 6.4TB PCIe SSD used as cache media in each host.
- 20Gbps VMware and iSCSI network.
- Need for Petabyte scale storage, which necessarily needs to be cheap on a $/GB basis.
- Need for faster processing of Genome files under analysis, with the added challenge that each file being processed could be hundreds of GB in size.
- VirtuCache with 6.4TB Samsung PCIe SSD was deployed on each host for caching 'hot' data from CEPH.
- A combination of CEPH storage software and VirtuCache host side caching software provided the customer with 1.8PB storage capacity and 1.5M IOPS throughput for only $164K.
The Virtunet Difference
The customer selected VirtuCache and CEPH because:
As it relates to storage, the genome sequencing workflow has two requirements:
- Need for cheap ($/GB) Petabyte scale storage.
- Need for faster processing of Genome files under analysis, many of which are a few hundred GB in size.
Why CEPH storage?
Distributed: Like most file systems currently used in Genomics, CEPH is clustered storage. It clusters together commodity servers and presents this cluster as centralized storage over iSCSI, NFS, or Object protocols, much like a traditional NAS / SAN array. It needs a minimum of 3 servers so that it can replicate data 2-way and sustain the loss of one node. The clustering of servers allows you to replace a failed server or add capacity by adding a new server, with all operations performed live.
Cheap at Petabyte scale: Since one HDD is now 16TB in capacity, a 3-node CEPH cluster can scale up to many Petabytes of storage, at a cost of only 10-20 US cents/GB, possibly making it the only widely accepted on-premises storage technology that can effectively compete with cloud storage on cost.
Because CEPH is open source, it can be deployed by your own IT team, or you can use Red Hat, SUSE, or our CEPH support options.
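The capacity figures in this case study can be checked with a little arithmetic. The sketch below is only an illustration: the function name `ceph_capacity_tb` is hypothetical, the drive counts come from the hardware list for this deployment, and it assumes plain 2-way replication with no allowance for filesystem overhead or free-space headroom, which a real CEPH cluster would need.

```python
def ceph_capacity_tb(nodes, drives_per_node, drive_tb, replicas):
    """Return (raw, usable) capacity in TB for a replicated cluster.

    Assumption: usable capacity is simply raw capacity divided by the
    number of replicas. Real clusters reserve additional headroom.
    """
    raw = nodes * drives_per_node * drive_tb
    usable = raw / replicas
    return raw, usable


# Figures from this deployment: 3 nodes, 38 x 16TB HDDs each, 2-way replication.
raw_tb, usable_tb = ceph_capacity_tb(nodes=3, drives_per_node=38, drive_tb=16, replicas=2)
print(f"raw: {raw_tb} TB, usable: {usable_tb:.0f} TB")
```

This gives roughly 1.8PB raw and a little over 900TB usable, in line with the rounded numbers quoted for this customer.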
Why do we recommend caching data in the compute nodes rather than in the backend array?
Need for parallelized access to data: Access to ‘hot’ data is more parallelized if each compute node / VM has access to a local SSD versus the conventional approach where the SSDs in the backend array are provisioned across many compute nodes connected to the array. Hence server-side cache performs better than array side cache. Genomics workflow requires highly parallelized access to files and at very high throughput.
Need to cache large datasets: Genomic datasets are large, and each file being processed can be over 100GB in size. A single PCIe SSD can be as large as 8TB (as of 2020) and can hence easily cache large datasets.
High throughput: Enterprise grade PCIe SSDs do 600K random read IOPS / 200K random write IOPS or more, so a single SSD delivers higher throughput than an entire mid-range all-flash appliance. Also, unlike in an array, where the SSD sits behind the network and storage controllers, with server side caching the SSD is right on the motherboard and connected to the CPU over a high speed PCIe slot. This combination of a high speed SSD on a high speed bus delivers very high throughput to the compute CPUs.
If you want even higher throughput you could use server memory in addition to a PCIe SSD. Memory is 20-30% faster than PCIe SSDs. For instance with VirtuCache, if you use both server RAM and SSD as cache media, VirtuCache tiers the cache between memory and SSD.
Cheap: The most popular SSD in our customer base for the Genomics use case is the 6.4TB Samsung PM1725b in the PCIe x8 form factor (2020 price of $3000). This SSD comes in sizes that range from 1.6TB to 7.8TB.
Choice of server-side caching software: The choices for server-side caching software depend on your compute OS. If you run bare-metal Linux, KVM, or Docker, then open source caching software like Bcache (Google) will work well. If your compute nodes run VMware, then our VirtuCache is the only software that caches both reads and writes. There are others that cache reads only, but those won't work for the Genomics use case because the workload is write intensive at many stages of this workflow.
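To make the ideas of tiering the cache between memory and SSD, and of caching writes as well as reads, more concrete, here is a toy model in Python. It is a minimal sketch under stated assumptions and not how VirtuCache or Bcache are actually implemented: the class `TieredWriteBackCache` is hypothetical, plain dictionaries stand in for RAM, SSD, and the CEPH datastore, and both tiers use simple LRU eviction.

```python
from collections import OrderedDict


class TieredWriteBackCache:
    """Toy two-tier (RAM over SSD) write-back block cache. Illustration only."""

    def __init__(self, backend, ram_blocks, ssd_blocks):
        self.backend = backend        # dict standing in for the backend datastore
        self.ram = OrderedDict()      # hottest blocks, oldest first
        self.ssd = OrderedDict()      # blocks demoted from RAM, oldest first
        self.ram_blocks = ram_blocks
        self.ssd_blocks = ssd_blocks
        self.dirty = set()            # written blocks not yet flushed to the backend
        self.backend_reads = 0

    def _insert(self, block, data):
        # Place the block in RAM, demote RAM overflow to SSD, evict SSD overflow.
        self.ssd.pop(block, None)
        self.ram[block] = data
        self.ram.move_to_end(block)
        while len(self.ram) > self.ram_blocks:
            old, old_data = self.ram.popitem(last=False)
            self.ssd[old] = old_data
        while len(self.ssd) > self.ssd_blocks:
            old, old_data = self.ssd.popitem(last=False)
            if old in self.dirty:     # write dirty data back before dropping it
                self.backend[old] = old_data
                self.dirty.discard(old)

    def read(self, block):
        if block in self.ram:
            data = self.ram[block]
        elif block in self.ssd:
            data = self.ssd[block]
        else:
            self.backend_reads += 1   # cache miss: go to the slow backend
            data = self.backend[block]
        self._insert(block, data)
        return data

    def write(self, block, data):
        # Write-back: acknowledge from cache, update the backend later.
        self.dirty.add(block)
        self._insert(block, data)

    def flush(self):
        for block in sorted(self.dirty):
            data = self.ram[block] if block in self.ram else self.ssd[block]
            self.backend[block] = data
        self.dirty.clear()
```

In this model a write completes as soon as it lands in the cache, and the backend only sees it on eviction or on `flush()`. A read-only cache would instead send every write to the backend straight away, which is why it helps little with write intensive stages.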
What if you use virtual GPUs for Genome analysis?
Currently, for storage IO generated or requested by a GPU, the IO request goes through the compute CPU, thus putting a burden on compute CPUs and introducing additional latency. VirtuCache is the only software that services storage IO requests initiated by vGPUs directly from the in-server SSD, bypassing the compute CPU entirely. This improves the storage throughput delivered to the GPU and does not burden the compute CPU.
Storage infrastructure details at Providence Health.
Providence Health is a large US healthcare organization with over 50 hospitals and 600 clinics. Their Genomics research wing is involved in the full Genome sequencing workflow from converting raw files to genome sequences and analyzing this data. The storage infrastructure listed below is a specialized ESXi and storage cluster that runs about a hundred VMs that continuously download and process genome sequences from many hospitals worldwide. The raw data along with post-processed information is then stored and backed up to the CEPH storage target. Some of their software applications also use the new VMware vGPU functionality to offload computational tasks to virtual GPUs attached to VMs.
The table below lists their new CEPH and VirtuCache related storage infrastructure along with costs.
| Component | Configuration | Purpose | Cost of h/w, s/w, 3-year support | Cost / unit |
|---|---|---|---|---|
| CEPH storage connected to ESXi over iSCSI | 3 SuperMicro servers with 600TB of hard drives in each. These servers run all CEPH software components (OSD, MON, and iSCSI gateway). Raw capacity = 1.8PB (usable 900TB). Their internal IT manages CEPH. | Storage for full Genome sequences. | | $0.15 / GB of usable data. |
| VirtuCache host side caching software | VirtuCache is configured to cache to a 6.4TB Samsung PM1725b PCIe SSD in each host. This combination provides 6 GB/sec throughput per host. | Improving the storage performance of CEPH. | $24,000 | $1.3 per MB/sec of throughput. |
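The per-unit costs in the table can be reproduced from figures quoted elsewhere in this case study. The sketch below rests on two assumptions that are not stated in the source: that the CEPH portion of the cost is the $164K total minus the $24,000 listed for VirtuCache, and that 1 GB/sec equals 1000 MB/sec. All variable names are illustrative.

```python
# Figures quoted in this case study.
total_cost_usd = 164_000          # CEPH plus VirtuCache, from the summary
virtucache_cost_usd = 24_000      # from the table
hosts = 3                         # ESXi hosts, one caching SSD in each
throughput_mb_s_per_host = 6_000  # 6 GB/sec per host, assuming 1 GB = 1000 MB
usable_gb = 900_000               # 900TB usable CEPH capacity

# Assumption: whatever is not VirtuCache is the CEPH hardware, software, and support.
ceph_cost_usd = total_cost_usd - virtucache_cost_usd

cost_per_gb = ceph_cost_usd / usable_gb
cost_per_mb_s = virtucache_cost_usd / (hosts * throughput_mb_s_per_host)

print(f"CEPH: ${cost_per_gb:.3f} per usable GB")
print(f"VirtuCache: ${cost_per_mb_s:.2f} per MB/sec")
```

Under these assumptions the CEPH cost works out to between $0.15 and $0.16 per usable GB, and the caching layer to about $1.3 per MB/sec, which is consistent with the table.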