Years ago I walked into a meeting with a customer, and as the CIO walked in he asked his staff what their prime directive was. Without skipping a beat, they all stated in unison: “To provide a reliable and consistent computing environment”. As I worked with this customer I discovered that this was a regular occurrence in their meetings.
I asked a staff member later over lunch what exactly this mantra meant to them. He described it in the context of his car. While he might want to modify his car to go 200mph, it was more important to have a car that got him to work on time every day and did not spontaneously slow down or crash on occasion, even if the modified car would give him a 2 minute commute on the other 4 days of the week.
Hybrid storage works by combining the speed of flash with slower but lower cost-per-capacity magnetic media to deliver fast, cost effective storage. There are a lot of tricks to try to hide the slower disk (coalescing writes together, large read caches, and read ahead cache algorithms), but fundamentally there will be workloads where read IO must be served from the magnetic disks, and this will introduce a certain variability. We call a read request for data that was not in the cache a “cache miss”. As these misses add up, the magnetic disks can become a bottleneck. There may be confusion about this in the industry, but hybrid storage systems fundamentally cannot cheat the laws of physics.
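To put rough numbers on how cache misses add up, here is a minimal sketch of the average read latency as a weighted blend of cache hits and misses. The latency figures (0.2 ms for a flash hit, 8 ms for a magnetic disk read) and the function name are illustrative assumptions, not measurements from any particular system.

```python
def average_read_latency_ms(hit_ratio, flash_ms=0.2, disk_ms=8.0):
    """Average read latency for a hybrid system.

    flash_ms and disk_ms are assumed, illustrative latencies, not
    measurements from a real array.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Hits are served from flash, misses fall through to magnetic disk.
    return hit_ratio * flash_ms + (1.0 - hit_ratio) * disk_ms


if __name__ == "__main__":
    for hit_ratio in (0.99, 0.95, 0.90, 0.50):
        latency = average_read_latency_ms(hit_ratio)
        print(f"hit ratio {hit_ratio:.0%}: average read latency {latency:.2f} ms")
```

Under these assumed numbers, dropping from a 99% to a 90% hit ratio takes the average from roughly 0.28 ms to roughly 0.98 ms, and the misses themselves are still served at full disk latency.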
The end result is inconsistency. In some cases a huge number of end user queries on an application may be lightning fast. When data in an untouched region is requested, however, things can change. When a doctor pulls up an old patient note and responses go from 1-2 seconds to 2 minutes, there is a noticeable shift in end user experience. Sometimes this difference in experience is acceptable. Other times it will result in countless calls to the helpdesk and lost productivity of expensive resources. You can put a rocket on an ordinary passenger car to make it go 200mph, but it can only sustain that for a certain length of time.
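The inconsistency shows up in the tail rather than in the typical request. The sketch below simulates a stream of reads against a cache and compares the median with the 99th percentile. The 95% hit ratio, the fixed latencies, and the helper names are all assumptions chosen for illustration.

```python
import random


def simulate_read_latencies(requests, hit_ratio, flash_ms=0.2, disk_ms=8.0, seed=42):
    """Return one latency per simulated read (assumed, fixed latencies)."""
    rng = random.Random(seed)
    # Each read is independently a hit (flash) or a miss (magnetic disk).
    return [flash_ms if rng.random() < hit_ratio else disk_ms
            for _ in range(requests)]


def percentile(samples, pct):
    """Simple nearest-rank style percentile over a list of samples."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, int(len(ordered) * pct / 100))
    return ordered[index]


latencies = simulate_read_latencies(10_000, hit_ratio=0.95)
median_ms = percentile(latencies, 50)
p99_ms = percentile(latencies, 99)

print(f"median read latency: {median_ms} ms")
print(f"99th percentile read latency: {p99_ms} ms")
```

With these assumptions the median request looks like flash while the 99th percentile looks like magnetic disk, which is the gap the end user notices when they happen to ask for cold data.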
As a former storage admin I am familiar with the endless tricks we employed to try to make magnetic disks perform consistently. We used wide striping and placed data on the outer (faster) tracks of the disks. We deployed smarter and smarter DRAM and NVRAM caching systems. We used log structured file systems and data structures (at the expense of streaming read performance). We partitioned cache and adjusted its block size. We used various “nerd knobs” and adjusted data reduction features for specific pools, workloads, or caches. Much like trying to make my 4 cylinder mid sized passenger car drive 200mph, you eventually hit a wall of diminishing returns. Hybrid is not the path forward for business critical applications that need highly consistent latency.
How do we transition to seamless, consistent low latency and the amazing end user experiences that come with it?
Despite claims to the contrary, the only real solution to this problem is to move away from magnetic storage to persistent solid state media such as flash. All flash systems can deliver amazingly low latency for even the most exotic of workloads, like in-memory databases. Previously all flash was reserved for only the most important applications for cost reasons, but now things are changing. The good news is that Virtual SAN’s space efficiency features can make all flash cheaper than competing hybrid solutions. While Bugattis have held their value, the price of all flash Virtual SAN has come down quite a bit. If you have not looked at an All Flash Virtual SAN with these new features, you may be shocked at how cost effectively you can deliver reliable and consistent infrastructure to more users and applications.
John Nicholson is a Senior Technical Marketing Manager in the Storage and Availability Business Unit. He can be found on Twitter @Lost_Signal.