0.059: bold, fast and green
Nope, I'm talking about EMC's introduction today of Flash Solid-State Drives for Symmetrix DMX-4 - the first-and-only enterprise-class application of Flash technology.
Now, if you've already read Chuck's blog post (The Enterprise Strikes Back) and Mark's early-morning coverage (Enterprise Flash for DMX-4), you should have a pretty good understanding of who needs these things and why, and of the technology itself. No need for me to rehash that ground. And even Stephen-the-Packrat has noticed that What's Old is New Again, reinforcing the significant differences between these enterprise-flash drives and the stuff that Apple slaps into its iProducts. Oh, and here's the obligatory link to the original WSJ "scoop" on today's news.
Since I have had a front-row seat to the accelerated evolution of this technology into what today is a truly enterprise-ready solid-state storage solution, I thought I'd share a little about the journey that has brought us to this point.
Sorry for the delayed posting. I've been technical reference support for the SSD part of today's launch, which kept me pretty busy all day. Judging by the nature of the questions and the early coverage, this Flash thing seems pretty hot (pardon the pun). I'll cover the rest of today's Symmetrix announcements in a separate post.
Oh, and if you haven't stopped by The New EMC.com, you definitely should - it's a whole new experience. Today's announcement landing page is an excellent example of how the new technologies behind EMC.com provide a more rich and engaging approach to the company's web presence.
flash didn't happen overnight
You may be surprised to learn that this all started long before there was much public discussion of using Flash Memory-based solid-state drives, and predates even the emergence of commercial NAND Flash drives from M-systems (now SanDisk), Samsung, STEC Inc. and others back in 2005-2006. I won't go so far as to imply that EMC was planning for this from the very beginning, as both Mark and Stephen have suggested, but I will say that the underlying architecture of Symmetrix has proven instrumental to the rapid adoption, integration and efficiency of this new technology.
Fact is, EMC has been preparing for the inevitable emergence of practical and cost-effective solid-state devices since before the initial Symmetrix DMX even shipped. Back then, nobody could predict when Flash SSDs might be cheap enough to replace the spinning rust the storage industry is currently built upon. But the engineers and architects designing Symmetrix knew that it would happen one day, and that the Symmetrix DMX generation needed to be ready for it when it did.
And there were quite a few things that could be predicted about persistent solid-state storage devices, even back then. For example, it was pretty clear we wouldn't be faced with the performance and reliability limitations of USB thumb drives or the CF flash cards that were the then de-facto standard. And although we did imagine a "box-o-flash" made out of hundreds of those CF cards, we all pretty much expected that the most practical packaging (at least initially) would be in the form factor of a disk drive, complete with a standard drive interface (be it Fibre Channel, SATA or SAS). Even though higher densities and perhaps even greater performance could be achieved in a more customized form factor, there's an entire industry that knows how to develop software, qualify and support "disk drives." So, like the early days of RAID, the engineers planned on integrating a "standard" device rather than hand-crafting a new one.
Back then most people didn't think much about efficient energy use within the data center, but it was obvious that an SSD that didn't need any energy to maintain persistence would have an advantage over both SDRAM-based solutions and electro-mechanical hard drives. That energy efficiency is Topping the CIO's Most Wanted List is an added bonus of this whole endeavor.
Of course everyone knew that Flash SSDs would be fast - significantly faster than hard drives, although still slower than SDRAM. Back then, write performance was still expected to be slow (as is true today for the laptop SSDs that have been getting most of the attention of late), so it was expected that the write caching of Symmetrix would be of benefit (it is, but with the performance of the STEC drives, write caching will actually play a different role, as I'll explain later).
But with sub-millisecond response times from the SSDs themselves, the role of Symmetrix read cache and pre-fetch algorithms would undoubtedly change. Algorithms designed to pre-fetch sequential reads, leverage "free reads" and take advantage of referential locality that occurs in most "random" workloads would be practically unnecessary with drives that could respond to ANY I/O request without the latency of head movement or rotational positioning.
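To put rough numbers on that, here's a back-of-the-envelope sketch in Python. The only hard fact in it is the arithmetic for rotational latency (on average, half a revolution). The 3.5 ms seek time and the 0.5 ms SSD response time are illustrative assumptions of mine, not measured or published figures for any particular drive.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: half of one full revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution


def random_read_ms(seek_ms: float, rpm: float) -> float:
    """Mechanical cost of one random read: head movement plus rotation.

    Transfer time and controller overhead are deliberately ignored.
    """
    return seek_ms + avg_rotational_latency_ms(rpm)


# Assumed figures for illustration only.
ASSUMED_SEEK_MS = 3.5   # hypothetical average seek for a 15K RPM drive
ASSUMED_SSD_MS = 0.5    # hypothetical "sub-millisecond" SSD response

hdd_ms = random_read_ms(ASSUMED_SEEK_MS, rpm=15_000)
print(f"15K RPM rotational latency: {avg_rotational_latency_ms(15_000):.1f} ms")
print(f"Random read on spinning rust: {hdd_ms:.1f} ms")
print(f"Ratio vs. assumed SSD: {hdd_ms / ASSUMED_SSD_MS:.0f}x")
```

Under those assumptions the mechanical delays alone are an order of magnitude larger than the whole SSD response, which is why algorithms built to hide them have so little left to do.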
And it was pretty clear that the initial enterprise-class SSDs would be very expensive and lower in capacity per device as compared to hard drives, meaning that early adopters probably wouldn't be able to even consider an all-Flash storage array - Symmetrix was going to have to run Flash alongside normal hard drives in the same array, at least in the beginning.
Each of these predictions directly influenced the evolution of Symmetrix DMX and Enginuity (the Symmetrix operating system software, aka microcode), and both have been optimized over the past several years for the eventuality of today's announcements.
a key component of in-the-box tiered storage
In fact, Flash has been a key component of the Symmetrix in-the-box strategy for tiered storage from the start, although until today it probably appeared this strategy was focused only on the lower SATA-end of tiered storage.
Not so, grasshopper.
No, in fact, virtually every one of the key enhancements we've added to Symmetrix DMX since 2003 has been in preparation for Flash SSDs. Oh sure, they've had immediate benefit for spinning rust disk drives as well, but they're all really pre-requisites for bringing Flash SSD technology to the massively-consolidated, performance-intensive, yet cost-conservative enterprise storage market.
Let's take a quick review of some of the externally-visible things that have been added to Symmetrix and Enginuity since 2003 and how they apply to Flash SSDs:
- RAID5: while "mirror everything" used to be the way-of-Symmetrix, you just can't justify the cost for every application any more, and it's probably overkill for enterprise Flash SSDs anyway. (So is RAID 6, but that was added pretty much just for the fat-and-slow SATA drives.)
- TimeFinder/Snaps: Space-saving snapshots. With the cost of SSD, you don't want to make any more copies of your data than you need to. The recent Asynchronous Copy on First Write enhancements in 5772 ensure that the Snaps have minimal impact on the response times of the primary volumes on the Flash SSD.
- Modular Packaging: Symmetrix DMX-3 and DMX-4 are "enterprise-modular" arrays, allowing for almost unlimited flexibility of configuration - you can have one "quadrant" supporting as many as 600 drives for maximum capacity, or you can have a quadrant optimized for performance with as few as 32 drives. This approach now lets you dedicate a quadrant to Flash SSDs to maximize their performance (you'll still need the 32 regular disk drives in that quadrant to support DMX's PowerVault, but you can use the space on those drives for other things as well).
- Cache Partitioning: With Flash SSDs, you don't really need a lot of cache for reads, but you do want to have a modicum of cache for pending writes (I'll explain why in a moment). In an interesting twist, you might actually want to decrease cache to a bare minimum for read-intensive applications. Dynamic Cache Partitioning helps to ensure that your memory is used where it's needed most, even as the system dynamically reallocates based on actual workloads.
- Symmetrix Priority Controls: Similarly, you want to be sure that the Flash drives receive the appropriate relative priority to everything else in the system, and internally, Enginuity uses the underlying mechanisms to protect "normal" disk drives from starvation caused by the hyper-responsive SSDs.
- Virtual Provisioning: This one's probably obvious, but with the cost of SSDs, you really want to buy as little of it as possible so thin provisioning is almost imperative to maximize utilization. Over-provisioning allows for future growth with a minimum of hassle - just add another group of SSDs to the pool before expanding your databases.
- Switched Infrastructure: In addition to the inherent fault-isolation and reliability improvements afforded by the new point-to-point DMX-4 back-end, it also serves to minimize the latency overhead for the Flash SSDs. While the overhead of an arbitrated loop is minimal and practically undetectable for a regular hard drive, even a little latency is noticeable with SSDs. And if/when future enterprise-class SSDs hit the market with a SATA interface instead of Fibre Channel, the DMX-4 will be ready.
- Asynchronous Replication: while clearly justifiable on the merits of being able to replicate data a significantly longer distance than possible with synchronous replication, asynchronous replication is expected to be the preferred method of protecting data stored on Flash SSDs, for a very simple reason: after you've paid to attain minimal response times, the last thing you're probably going to do is add another millisecond or two of transmission time to your writes.
- SRDF/S Response Time improvements in 5772: But if your application DOES require synchronous replication, you'll want the fastest possible response times, so the enhancements made in 5772 (also in the upcoming 5773) could well make a lot of difference for Flash SSDs.
the real magic is inside
The real key to the "enterprise-ness" of the STEC Flash SSDs that EMC announced today is pretty much equally divided between the drive itself and the optimizations that have gone into Enginuity in preparation for them.
inside the drives
As you've probably read by now, the STEC ZeusIOPS drives themselves are in fact optimized for random AND sequential I/O patterns, unlike the lower cost flash drives aimed at the laptop market. They use a generously sized SDRAM cache to improve sequential read performance and to delay and coalesce writes. They implement a massively parallel internal infrastructure that simultaneously reads (or writes) a small amount of data from a large number of Flash chips to overcome the inherent Flash latencies. Every write is remapped to a different bank of Flash as part of the wear leveling, and they employ a few other tricks that I've been told I can't disclose to maximize write performance. They employ multi-bit EDC (Error Detection) and ECC (Error Correction) and bad-block remapping into reserved capacity of the drives. And yes, they have sufficient internal backup power to destage pending writes (and the mapping tables) to persistent storage in the event of a total power failure.
(Of course, in a Symmetrix, they'll be powered by the integral standby power through any momentary power outages or through an orderly shutdown in a total power fail situation. But it's nice to know there's a backup to the backup).
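For anyone who hasn't seen wear leveling before, the general idea of remapping every write can be sketched in a few lines of Python. To be clear, this is a toy of my own and NOT the STEC algorithm (which I couldn't share even if I knew every detail). The class name, the "pick the least-worn free block" policy and the block counts are all assumptions made for illustration.

```python
class ToyWearLeveler:
    """Toy remap-on-write wear leveler (illustrative, not any real drive's logic).

    Assumes there are more physical blocks than logical blocks in use,
    i.e. some reserved capacity, so a free block always exists.
    """

    def __init__(self, physical_blocks: int) -> None:
        self.erase_counts = [0] * physical_blocks
        self.free = set(range(physical_blocks))
        self.mapping = {}   # logical block -> physical block
        self.data = {}      # physical block -> payload

    def write(self, logical: int, payload: bytes) -> None:
        # Always land the write on the least-worn free block.
        target = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(target)
        old = self.mapping.get(logical)
        self.mapping[logical] = target
        self.data[target] = payload
        if old is not None:
            # The previous copy is stale: erase it and return it to the pool.
            del self.data[old]
            self.erase_counts[old] += 1
            self.free.add(old)

    def read(self, logical: int) -> bytes:
        return self.data[self.mapping[logical]]


# Hammer a single logical block; the wear still spreads across all banks.
leveler = ToyWearLeveler(physical_blocks=4)
for i in range(8):
    leveler.write(0, f"version {i}".encode())
print(leveler.erase_counts)
print(leveler.read(0))
```

The point of the toy is that rewriting one "hot" logical block over and over doesn't burn a hole in one spot of the Flash. The erase counts stay within one of each other across all the physical blocks.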
Perhaps the most oft-repeated questions have been about Flash wear out. To my knowledge, there isn't a drive-level rating for these drives (yet). But even at the rated minimum 100,000 writes per cell, we know that they'll endure several years for all practical use cases except perhaps a pathological 100% pure random write stress test (which would probably kill a hard drive sooner than an SSD). And experience shows that SLC flash will handle significantly more writes than the rated minimum and there are SLC NAND flash parts coming soon that are even more resilient.
Bottom line: It makes no sense for EMC to sell a drive that would cost them time, money or reputation, so I'm pretty much convinced by the mathematicians who tell me not to lose any sleep over this.
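If you'd like to play mathematician yourself, the wear-out reasoning above boils down to simple arithmetic. The sketch below assumes perfect wear leveling and uses the 100,000 writes-per-cell figure mentioned above. The 73 GB capacity, the 50 MB/s sustained write rate and the write amplification factor are illustrative assumptions of mine, not drive specifications.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_until_wearout(capacity_gb: float,
                        cycles_per_cell: int,
                        write_mb_per_s: float,
                        write_amplification: float = 1.0) -> float:
    """Estimate years of life under a constant write load.

    Assumes ideal wear leveling, so every cell absorbs an equal share
    of the writes. write_amplification > 1 models the drive writing
    more to Flash internally than the host actually sent.
    """
    total_writable_bytes = capacity_gb * 1e9 * cycles_per_cell
    flash_bytes_per_year = (write_mb_per_s * 1e6
                            * write_amplification * SECONDS_PER_YEAR)
    return total_writable_bytes / flash_bytes_per_year


# Assumed workload: writing 50 MB/s around the clock, every day of the year.
estimate = years_until_wearout(capacity_gb=73,
                               cycles_per_cell=100_000,
                               write_mb_per_s=50)
print(f"Estimated life: {estimate:.1f} years")
```

Even with a nonstop 50 MB/s write stream, which very few real applications sustain, the assumed drive lasts several years. Plug in your own numbers, and raise `write_amplification` if you want to be pessimistic.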
On the Enginuity side, several optimizations have been made to maximize performance and extend the life of the drives.
One good example is the aforementioned write caching. With effective write performance that pretty much matches read latencies, there's not a lot to be gained - performance wise, that is - by caching writes to the disk. BUT, buffering writes can help reduce the wear and tear on the drive. See, the longer Enginuity can delay sending writes to the drive, the higher the probability that a subsequent write supersedes an earlier one. This "write folding" is a key foundation of reducing the amount of data SRDF/A has to transmit, and it will have a similar effect on reducing the number of "writes" a Flash SSD has to deal with.
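Here's a minimal Python sketch of the write folding idea, for those who like to see it in code. This is emphatically not Enginuity's implementation. The class, its oldest-first destage policy and the tiny buffer size are hypothetical, chosen only to show how a later write to the same block replaces a pending one before it ever reaches the drive.

```python
from collections import OrderedDict


class WriteFoldingBuffer:
    """Toy write-back buffer that folds repeated writes to the same block."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.pending = OrderedDict()  # block address -> newest pending data
        self.drive = {}               # stands in for the Flash SSD
        self.host_writes = 0
        self.drive_writes = 0

    def write(self, block: int, data: bytes) -> None:
        self.host_writes += 1
        if block in self.pending:
            # Fold: the newer data supersedes the pending write in place.
            self.pending[block] = data
            return
        if len(self.pending) >= self.capacity:
            self._destage_oldest()
        self.pending[block] = data

    def _destage_oldest(self) -> None:
        # Oldest-first destage is an assumed policy for this sketch.
        block, data = self.pending.popitem(last=False)
        self.drive[block] = data
        self.drive_writes += 1

    def flush(self) -> None:
        while self.pending:
            self._destage_oldest()


buf = WriteFoldingBuffer(capacity=4)
for n, block in enumerate([1, 2, 1, 1, 3, 2, 4, 1]):
    buf.write(block, f"payload {n}".encode())
buf.flush()
print(f"host writes: {buf.host_writes}, drive writes: {buf.drive_writes}")
```

In this little run the host issues eight writes but the drive only ever sees four, and each block on the drive ends up holding the newest data. The longer the buffer can hold a write, the more chances it has to fold.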
Other enhancements minimize code paths when the source of a read is an SSD, skipping any logic to determine whether pre-fetch algorithms should be engaged. The aforementioned "free reads" are skipped outright, knowing that the drive itself has already fetched the "rest of the track" into its SDRAM buffer should it be needed. No need for re-ordering I/Os, either, since there's no rotational latency or seek times to optimize. And the kid gloves are removed for the rare drive rebuild as well, effectively rebuilding ALL the hypers at once instead of sequentially, since there's no real performance difference or overhead for totally random vs. sequential requests.
And then there's the logic to ensure that "lesser drives" aren't starved in the never-ending quest for performance, leveraging the underpinnings of the Priority Controls feature. And on the "imagine that" front, it turns out that these drives can actually complete an I/O request before the requesting code was ready to accept a response. Code written to handle the tortoise-like drives of the 90s would stand no chance with these drives; fortunately this was expected and the code was easily optimized around this new operational model.
And there's more, but I'll leave them for another day...I've delayed this post long enough, and the details can wait.
a new day has dawned
So, as this First Day of the New Era of Storage winds down, I have to say that the positive responses and near universal support for the significance of today's announcement are appreciated and welcomed. And though it sounds self-serving and pompous, I personally believe that Flash SSD technology will become ubiquitous in the not-so-distant future as prices decline and performance improves (thanks to Moore's Law) at rates unheard of in the entire history of the hard disk drive.
The simple fact is nobody expected this announcement to happen until probably 2009 at the earliest. The competition has been caught flat-footed, and EMC has a head-start in addressing a very real (and previously unmet) customer requirement. That can't be popular with the anti-EMC crowd, but then, they don't buy much EMC storage anyway. What *IS* important is that there are more customers than the competition wants to admit that need the response times that these drives can deliver, and they need it NOW! And while I don't know for sure, I am guessing that EMC's direct, channel and partner sales folks around the globe have met more of those interested customers today than even they thought existed.
(And for those of you who carry a bag for EMC, I second DaveD's advice that you should get out and talk to every single one of your customers about this technology TODAY. I'm betting you'll find more interest than you can imagine. And if not, heck, it gives you something new to talk to your customers about. So get out there!)
Inevitably, customer demand will drive today's nay-sayers and sideline sitters to follow the path that EMC is trailblazing, just like they did over 17 years ago when EMC introduced the first Intelligent Cached Disk Array (ICDA), and created a whole new market for external storage where none existed before.
And trust me, they will follow, no matter what FUD and misdirection they may spout while they're trying to catch up.
If I may be so bold.
Thanks for understanding my silence on this subject up until today.
There are many of you who read and participate in my blog who have been scratching around the edges of today's announcement on multiple occasions, both here and in other storage blogs, and I can only offer my heartfelt apologies that I was unable to engage or respond on this subject until now. As you can now probably deduce, everyone within EMC who knew anything about this was sworn to absolute silence (even 'Zilla, who sniffed this out early last summer). And as surprised as many are that we actually pulled this "cone of silence" thing off, I believe that silence served to focus people's efforts and actually helped to accelerate the project.
But now that it has been announced, I am hopeful that we can engage in productive discussion about the technology, its future and its practical applications. Collaboration will only help to accelerate the economics and the adoption of this exciting new technology, and I sincerely look forward to the dialog. For my part, I'll endeavor to answer any questions that I can on the subject.