3.018: fast vp - world's smartest storage tiering (part 1)
With the availability of VMAX Fully Automated Storage Tiering for Virtual Pools (FAST VP), there will undoubtedly be a raft of "we were first" and "me too" claims from competitors.
I will preemptively respond to both in this post.
As I've said many times before, being "first" in the market only really matters for as long as you are also "the only." As soon as there is more than one supplier of a feature, the discussion moves on to "which implementation is better."
I hereby assert that VMAX FAST VP is the smartest, most efficient, fastest,
easiest and most affordable sub-LUN automated tiering available in the market today
(and for the foreseeable future)
Second, I contend that no other vendor's automated tiering offering even comes close to VMAX FAST VP – and thus nobody has a basis for claiming "me too."
As I hope to explain, effective automated storage tiering requires much, much more than the basic ability to relocate data across tiers at a sub-LUN granularity. To even be considered as a contender, competitors will have to address three areas of FAST VP differentiation:
- Effective Implementation
- Granular Data Management
- Advanced Controls
For each of these I will propose some questions customers may want to consider when comparing implementations, along with the specific advantages unique to VMAX FAST VP.
I have split this post into two parts (it got a little longer than I planned).
Part 1 follows…
effective implementation
The first set of questions centers on how easy it is to get started with automated tiering and how well it will be able to handle various workloads:
- How do I get started?
- How do I determine the right policies?
- Why should I expect this will work?
- Does it consider current workloads?
- Does it optimize both reads and writes?
- How efficient is it to acquire and deploy?
- Does it work with (and for) everything?
FAST VP has been designed from the outset to be easy to implement, to automatically optimize the performance of a wide variety of applications, and to be extremely affordable. Let's look at how FAST VP addresses each of these questions.
how do i get started?
With VMAX FAST VP: you can be up and running literally in minutes, not months. First, of course, you need to have Enginuity 5875 installed on your VMAX (a non-disruptive upgrade for existing installations), and you will also need the appropriate FAST VP software licenses installed.
From there, the process couldn't be easier:
1. Place your open systems data on Virtually Provisioned LUNs (probably in an FC pool). These can be "thin" (sparsely allocated) or pre-allocated VP LUNs – your choice. If you're an existing VMAX customer, you are probably already using Virtual Provisioning – if so, you can start with step 2 or 3. If not, use thick-to-thin clones to relocate your LUNs into VP.
2. Group these VP LUNs into application-based Storage Groups to make things easier to manage.
3. Define one or two more tiers of storage (e.g., a Flash pool and a SATA pool). You'll need some unused space in at least one of these pools – ideally all of them (FAST VP does moves, not swaps, and so needs a place to move stuff to).
4. Use the supplied wizard to create and enable FAST VP policies for each storage group. These don't have to be the same, even for applications sharing the same pools (a flexibility that not all implementations offer).
5. Start up the application(s), stand back and watch FAST VP do its thing! After an initial monitor period (usually less than 2 hours), FAST VP will start moving data according to the policies you have defined.
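The setup steps above can be pictured as a small data model. The sketch below is purely illustrative: the class names, the fields, and the idea of expressing a policy as a per-pool percentage cap are assumptions made for this example, not the FAST VP wizard or any EMC API. It checks two things the steps call for: each storage group's policy leaves room for all of its data, and at least one pool has free space to move data into.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """A virtual pool backing one tier (hypothetical model)."""
    name: str
    capacity_gb: float
    used_gb: float

    @property
    def free_gb(self) -> float:
        return self.capacity_gb - self.used_gb


@dataclass
class StorageGroup:
    """Application-based group of VP LUNs with its own policy.

    policy maps pool name -> maximum percent of the group's capacity
    allowed in that pool (an assumed way to express a policy).
    """
    name: str
    size_gb: float
    policy: dict


def check_setup(pools, groups):
    """Return a list of human-readable problems; empty means ready."""
    problems = []
    known = {p.name for p in pools}
    # Moves (not swaps) need somewhere to land.
    if not any(p.free_gb > 0 for p in pools):
        problems.append("no pool has free space to move data into")
    for g in groups:
        unknown = set(g.policy) - known
        if unknown:
            problems.append(f"{g.name}: unknown pools {sorted(unknown)}")
        # Caps must add up to at least 100% or some data has no home.
        if sum(g.policy.values()) < 100:
            problems.append(f"{g.name}: tier caps sum to less than 100%")
    return problems
```

Because each storage group carries its own policy dictionary, two applications can share the same pools while being held to different caps.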
Compare this to competitive solutions that can require that existing LUNs be relocated into special new "hybrid" devices – a process that can itself take days. And many competitors also require as much as 24 hours of analysis before they start working…with FAST VP, your tiers will begin being optimized much, much sooner.
how do i determine the right policies?
To help in defining your starting policies, EMC provides a tool called Tier Advisor via our pre- and post-sales technical consultants. From experience and observation, we know that different applications and workloads will almost always require a different amount of capacity in each tier for the most cost-effective and optimally performing configuration.
Using performance data collected from existing workloads, Tier Advisor can model these workloads against various FAST VP policy configurations. These configurations can be optimized for cost, performance or both, providing as much as a 40% performance improvement and a 40% cost savings for the storage drives vs. an equivalent VMAX configuration using 100% 15K rpm drives. Some applications can be optimized for even more cost savings, while those that need the most performance possible can frequently attain significant gains at the same cost as an all-15K configuration.
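To make the cost side of that comparison concrete, here is a back-of-the-envelope sketch of how a tiered mix is weighed against an all-15K baseline. It is not Tier Advisor. The function name, the relative prices, and the 3% / 17% / 80% capacity split are hypothetical numbers chosen only to show the arithmetic.

```python
def blended_cost_per_gb(mix, cost_per_gb):
    """Capacity-weighted drive cost for a tier mix.

    mix: tier -> fraction of total capacity (must sum to 1.0)
    cost_per_gb: tier -> relative drive cost per GB
    """
    total = sum(mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("tier fractions must sum to 1.0")
    return sum(frac * cost_per_gb[tier] for tier, frac in mix.items())


# Hypothetical relative costs, normalised so a 15K rpm drive costs 1.0 per GB.
COST = {"flash": 8.0, "fc15k": 1.0, "sata": 0.25}

baseline = blended_cost_per_gb({"fc15k": 1.0}, COST)
tiered = blended_cost_per_gb({"flash": 0.03, "fc15k": 0.17, "sata": 0.80}, COST)
savings = 1.0 - tiered / baseline
```

With these made-up inputs the tiered mix costs about 0.61 per GB against a baseline of 1.0, a drive cost roughly 39% lower. Real prices and the right capacity split depend on the workload, which is what the modeling is for.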
FAST VP also supports truly dynamic reconfiguration, so if the workload (or its importance) changes, the storage administrator can make adjustments that are effected in minutes instead of days.
Without tools like Tier Advisor, you would probably be shooting into the darkness and taking hours or even days to figure out if you've guessed right. Trial and error is an expensive learning methodology.
why should i expect this to work?
When EMC announced our FAST intentions at the VMAX launch back in April 2009, we had already done months of research into what would be required for automated tiering to be successful. In fact, VMAX itself was designed, architected and built for FAST VP. The internal architecture included the hooks to support the tracking metadata for FAST V1 and FAST VP, and the plans for FAST VP influenced many internal design decisions for the hardware and Virtual Matrix interconnect.
The algorithms behind VMAX FAST VP are based upon (and validated against) traces of real-world customer applications against production data. The Symmetrix Performance Engineering team collected literally millions of I/Os from thousands of recorded customer workloads (Steve Todd talked about this in his Let the I/O Do the Talking blog post). These traces covered a wide variety of applications and market segments, across different operational windows (production, reporting, backup, etc.), giving us a large dataset upon which to model various algorithms.
Based on these traces, plus the extensive cache prefetch intelligence built up within Symmetrix over the past 20+ years, FAST VP is optimized around recognizable access patterns to get the right data to Flash even before it is accessed for the first time. FAST VP is designed to move the smallest amount of data possible up to a higher tier that will deliver the desired response time benefits, and it attempts to utilize as much of the fastest tier as the governing policies will permit. FAST VP is also biased to move as much unused capacity down to the lowest tiers as possible, in effect driving busy data up to the best $/IOPS tier and idle data down to the best $/GB tier.
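The "busy data up, idle data down" bias can be illustrated with a toy placement routine. This is a simple greedy ranking written for this post, not FAST VP's algorithm: it ignores the predictive promotion described above, and the function name, the `flash_slots` limit, and the `idle_threshold` value are all assumptions.

```python
def place_extents(extent_iops, flash_slots, idle_threshold=1.0):
    """Assign each extent to a tier from its observed I/O rate.

    extent_iops: extent id -> recent I/O rate
    flash_slots: how many extents the policy allows in Flash
    idle_threshold: rates below this count as idle (assumed value)
    """
    ranked = sorted(extent_iops, key=extent_iops.get, reverse=True)
    placement = {}
    for rank, ext in enumerate(ranked):
        if rank < flash_slots and extent_iops[ext] >= idle_threshold:
            placement[ext] = "flash"   # busiest data: best $/IOPS tier
        elif extent_iops[ext] < idle_threshold:
            placement[ext] = "sata"    # idle data: best $/GB tier
        else:
            placement[ext] = "fc"      # everything in between
    return placement
```

Only the hottest extents consume Flash, up to the policy limit, while anything below the idle threshold drops to the cheapest tier regardless of how much Flash is still free.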
Net result? The system becomes optimized to the most cost-effective utilization very, very quickly.
This contrasts starkly with competitor solutions that are much more of a bolt-on feature, apparently invented in the labs without any real-world application awareness (don't make me name names here). These approaches are often limited by their historical architectures to be far less efficient and optimized, forced into using extremely large units of relocation that are not optimal for any application (more on this later) and limited by their architectures to tracking only the most basic of metadata and/or responding very slowly to additions or changes to workloads. Some even force customers into a single effective "policy," an attribute that clearly doesn't fit in today's highly virtualized, heavily consolidated multi-tenant application environments.
does it consider current workloads?
FAST VP monitors I/O workloads and access patterns continuously during the defined statistics gathering windows (more later). These statistics are analyzed every 10 minutes and compared to the current policies to identify relocation candidates. Thus, changes to policy, workloads and/or capacity utilization are dynamically recognized and FAST VP adapts to them. The service processor (SVP) participates in this analysis by translating the collected data into move thresholds that Enginuity uses to identify relocation targets. Should the SVP go off-line, Enginuity will continue moving data based on the most recent thresholds. These typically do not change drastically after the initial analysis, so while not 100% optimal, the system will continue to adapt to workload change in such an event.
FAST VP monitors the effective response times per tier and per pool, and will stop moving data "up" to flash (for example) if the devices are responding slower than the tier where the data currently resides. Additionally, FAST VP monitors the write pending (WP) backlog of each tier's devices and will defer relocations until the WP backlog falls below the overload threshold of those devices.
does it optimize both reads and writes?
FAST VP actually optimizes for 3 different I/O types: Cache Read Miss, Write Miss, and Cache Prefetch. Based upon real-world observations, simple Read Miss optimization alone is insufficient to make optimal use of tiering. In fact, despite much that has been written to the contrary, Flash can significantly benefit write performance – especially in random I/O workloads. An overloaded hard disk drive will suffer extended response times as the workload increases, while Flash drives will deliver consistent read and write response times independent of the I/O queue.
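The two safeguards just described amount to a guard evaluated before each promotion. The sketch below is an assumed simplification: the function name, its parameters, and the three return values are invented for illustration and do not reflect Enginuity internals.

```python
def may_relocate(src_rt_ms, dst_rt_ms, dst_write_pending, wp_overload_limit):
    """Decide whether a promotion should proceed right now.

    Returns "move", "skip" (target is responding slower than the source),
    or "defer" (target's write-pending backlog is at or above its limit).
    """
    if dst_rt_ms > src_rt_ms:
        return "skip"
    if dst_write_pending >= wp_overload_limit:
        return "defer"
    return "move"
```

A "defer" result means the move is retried in a later analysis cycle once the backlog has drained, whereas "skip" means the move would not help at the moment.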
VMAX FAST VP also supports and assists Symmetrix' intelligent cache prefetch algorithms, making them even more effective by automatically pre-promoting data to higher tiers – often in advance of the cache prefetch!
how efficient is it to acquire and deploy?
It depends.
OK. Seriously: for some applications VMAX FAST VP can be up to 40% faster at up to 40% lower $/GB drive cost as compared to using 100% 15K rpm hard disk drives. Depending upon the application and SLA requirements, even better performance can be delivered at the same cost as HDDs, or the same performance can be delivered at a lower cost than the HDDs.
Your mileage may/will vary, of course – not all applications or price/performance objectives are the same. That's why we have Tier Advisor, of course!
VMAX FAST VP will typically require less Flash than competing implementations to deliver equal or better performance. This is partially due to FAST VP's algorithms and variable granularity of relocation. It is also thanks to the architecture of VMAX, which allows for as few as 4 Flash drives to be installed with full RAID protection. Some competitors' architectures require the installation of drives in groups of 8 or 16, driving up costs and reducing flexibility. Of course, 4 Flash drives won't be sufficient for every application or configuration – but Tier Advisor will tell you how much you'll really need.
Finally, FAST VP allows all capacity to be shared by multiple applications, with each application (storage group) having its own unique policy. This sharing allows for multiple different application SLAs to be consolidated into a single array (or a subset of an array) without the risks of one application "hogging" all the Flash from others – a very real risk with some competitive implementations.
does it work with (and for) everything?
Being built upon the tried and proven VMAX Virtual Provisioning, FAST VP works with everything that VP supports – transparently. This means you can use FAST VP with all drive and RAID types/combinations supported by VMAX. FAST VP supports sparsely-allocated (thin) or pre-allocated (fat) VP devices, optimally relocating across tiers with full awareness of unused blocks that haven't been written.
Some competitors limit their implementation to only fully-allocated LUNs, and others only to thinly-allocated LUNs. Of course, FAST VP supports all Symmetrix VMAX Open Systems functionality – TimeFinder, SRDF, ControlCenter, PowerPath, Symmetrix Management Console and Performance Analyzer, etc. All of the existing APIs (Solutions Enabler, SMI-S, etc.) and new APIs (VAAI, T10 UNMAP, etc.) that interface into Symmetrix are transparently supported by FAST VP. You can even use FAST VP on the new Data at Rest Encrypting engines.
Again, not all competitors support all their features with their automated tiering implementations.
And finally (for this section at least), FAST VP enables all I/Os by any application to be serviced more quickly, not only those for data areas that have been promoted to the Flash tier. By relocating the busiest part of the workload to Flash or FC, the lower tier devices are inherently less busy. With less of a workload backlog slogging the heads all over the drives, these tiers can actually get more work done, and they can get it done faster. This means that the average response times of 10K rpm FC drives will be closer to the theoretical 9ms, 15K drives will be closer to 6ms, and 7200 rpm SATA drives closer to 12ms. In a world where overloaded FC drives are often operating at 10-20ms response times (or worse), FAST VP can be a benefit to applications even if all of their working set capacity doesn't actually fit into the Flash tier.
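A rough way to see why offloading helps the lower tiers is a textbook single-queue approximation, in which mean response time is the service time divided by one minus utilization. The M/M/1 model and the utilization figures below are assumptions brought in for illustration. Only the 6ms service time for a 15K drive comes from the paragraph above.

```python
def response_time_ms(service_ms, utilization):
    """Mean response time under an M/M/1 approximation: R = S / (1 - rho)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_ms / (1.0 - utilization)


# A 15K rpm drive with a 6 ms service time, before and after its
# busiest data moves to Flash (utilization figures are hypothetical).
busy = response_time_ms(6.0, 0.70)
relieved = response_time_ms(6.0, 0.30)
```

Under these assumptions the drive averages about 20ms at 70% utilization and about 8.6ms at 30%, so removing even part of the load pulls response times back toward the drive's nominal figure.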
This is perhaps the hidden value of FAST VP…adding even a small amount of Flash to your existing workloads can provide a significant improvement to your overall performance.
In part 2, I will focus on the aspects of Granular Data Management and Advanced Controls.