1.023: it's just a flash-y science experiment
And now, my oft-requested take on the 1 Meellyun IOPS flash technology science experiment that IBM is promoting so heavily:
That's right - Barry Whyte and IBM's Almaden Lab team are to be congratulated for their accomplishment, as I actually did in the first comment to BarryW's boastful blog post on the event. This is indeed an important milestone on the road to wide-scale commercialization of solid-state persistent storage, even if it isn't an actual product announcement (IBM admits you can't buy their experimental configuration for at least 9-12 months).
Commendations all around...
But surely you don't think that's all I have to say now, do you...
why can't we just be friends?
What I did find somewhat uncalled for about IBM's bluster around this was, well - the antagonistic tone of IBM's bluster. From BarryW's blog title ("...Actions speak louder than words"),
to TonyP's misrepresentations of SPC-1 as the workload that was used (it was actually a non-standard version of the SPC-1 workload, as BarryW so honestly explains in his post) EDIT Sep 5, '08: BarryW has clarified that it wasn't the SPC-1 after all (see comments below), and TonyP has corrected his post (presumably to avoid the Wrath of the SPC), to the quotes from IBM executive Charlie Andrews in the Byte and Switch article where he asserts that putting flash drives into a storage array is "like taking a jet and putting it on a two-lane road."
I guess Charlie was using the aging and decrepit DS8000 as his reference point...oops, sorry.
The posturing and positioning from IBM is totally expected - like everyone else, they're playing catch-up in the flash game, and the obvious response when you get caught flat-footed is to try to redefine the rules. So it's in their best interests to make it sound like flash requires special treatment, belongs in the server, won't work in the array, etc. etc. etc. It all just helps to justify IBM's continuing delay in getting flash to market.
But still, it was an odd thing for Charlie to say, especially since EMC has been delivering jet-speed performance with flash drives in the DMX-4 since the beginning of the year. The fact is that EMC has figured out how to integrate flash drives into both Symmetrix DMX and CLARiiON, while accommodating and mitigating the characteristics of NAND flash and the inherent differences between solid-state and spinning rust, while IBM has yet to reach that milestone. The advanced architecture of Symmetrix and the decades of architectural and algorithmic optimizations in dealing with I/O prioritization, queue depths and error correction afford EMC more than a head-start over the competition.
What IBM is really admitting is that you can't simply install a flash drive in your storage platform and immediately take advantage of it.
Unless, that is, your storage platform's architecture was designed properly in the first place! (Many seem to forget that the predecessor of the Symmetrix was in fact a solid-state storage accelerator (DRAM based), and many of the issues that IBM and others are fretting over today were addressed by EMC more than 2 decades ago.)
ok. it wasn't really real...
When pressed on his blog, BarryW admitted that the experiment wasn't anything close to real-world. It was merely an experiment, performed with an artificial 4KB/block workload (70% read / 30% write), using more than the officially supported number of SVC nodes (max supported is four 2-node clusters), fronting a home-brewed storage device concocted of P-servers with FusionIO PCI-bus flash devices, operating with ZERO data protection (no RAID5, no mirroring, no Data Integrity Bits, no nothing), and yes, operating without any of the normal software overheads of mirroring, replication, or any of that other messy reality that gets in the way of science.
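For a sense of scale, here is a rough back-of-envelope sketch of what that workload implies. The 4096-byte block size, the even 70/30 split across the full million IOPS, and the 1 ms response time fed to the Little's law helper are assumptions for illustration, not figures IBM published, and the function names are made up for this example.

```python
# Back-of-envelope arithmetic for a 1,000,000 IOPS, 4KB, 70/30 workload.
# Assumptions (not published by IBM): 4KB means 4096 bytes, and the
# 70% read / 30% write mix applies evenly across all I/Os.

TOTAL_IOPS = 1_000_000
BLOCK_BYTES = 4096
READ_FRACTION = 0.70


def workload_summary(total_iops, block_bytes, read_fraction):
    """Split the I/O rate into reads and writes and compute raw bandwidth."""
    read_iops = round(total_iops * read_fraction)
    write_iops = total_iops - read_iops
    return {
        "read_iops": read_iops,
        "write_iops": write_iops,
        "bytes_per_sec": total_iops * block_bytes,
    }


def outstanding_ios(iops, response_time_s):
    """Little's law: average I/Os in flight = I/O rate x response time."""
    return iops * response_time_s


if __name__ == "__main__":
    summary = workload_summary(TOTAL_IOPS, BLOCK_BYTES, READ_FRACTION)
    print(summary)
    # Assumed response time of 1 ms, purely for illustration.
    print(outstanding_ios(TOTAL_IOPS, 0.001))
```

Under those assumptions the aggregate is roughly 4 GB/s of small-block traffic, and sustaining it at about 1 ms per I/O would take on the order of a thousand I/Os in flight at once, which says more about parallelism than about how fast any single I/O completes.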
The most interesting thing is that even in this controlled lab experiment, IBM wasn't able to attain response times from their customized solution that were significantly faster than what EMC has been delivering with the standard STEC ZeusIOPS Enterprise Flash Drives integrated with the DMX-4 for months now. That's right, despite all of the hand-tuning, and putting storage on the PCI bus instead of behind a disk drive interface, the SVC's best 4K read response times were essentially the same as the DMX-4: just slightly less than 1ms.
The only difference? The DMX-4 numbers include the operational overheads of RAID5 and checksum verification that the data being read is actually what was written, while the SVC did not.
FYI - End-to-end data integrity verification is a critical requirement for any storage device to protect against undetected data corruption caused by faulty logic or even sub-atomic particle bombardment (which gets worse at higher elevations). StorageMojo reported last year about CERN's data corruption research, in which they found that 1 in every 1500 data files contained undetected corruptions. Overlapped error correction is required throughout the data path to protect against such corruptions, and this is especially important with flash devices. This is why both Symmetrix and CLARiiON have always employed data integrity validation on every read from storage - an unfortunate but necessary overhead. And I'll note that the SVC has no such data protection - but then, neither do most non-EMC storage platforms. Does yours?
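To make the idea of validating every read concrete, here is a minimal sketch of the general technique: store a checksum with each block at write time and recompute it at read time. This is a toy illustration only, not how Symmetrix or CLARiiON implement it; the `ChecksummedStore` class, its methods, and the choice of CRC32 are all assumptions made for the example.

```python
import zlib


class ChecksummedStore:
    """Toy in-memory block store that keeps a CRC32 alongside each block.

    Hypothetical example: real arrays use their own integrity formats
    and check them at several points along the data path.
    """

    def __init__(self):
        self._blocks = {}  # logical block address -> (data, checksum)

    def write(self, lba, data):
        # Compute the checksum once, at write time, and store it with the data.
        self._blocks[lba] = (bytes(data), zlib.crc32(data))

    def read(self, lba):
        # Recompute on every read; a mismatch means silent corruption.
        data, stored_crc = self._blocks[lba]
        if zlib.crc32(data) != stored_crc:
            raise IOError(f"checksum mismatch on block {lba}")
        return data

    def flip_bit(self, lba, byte_index):
        # Test helper: simulate an undetected single-bit flip in the media.
        data, stored_crc = self._blocks[lba]
        damaged = bytearray(data)
        damaged[byte_index] ^= 0x01
        self._blocks[lba] = (bytes(damaged), stored_crc)


if __name__ == "__main__":
    store = ChecksummedStore()
    store.write(7, b"\x00" * 4096)
    assert store.read(7) == b"\x00" * 4096
    store.flip_bit(7, 100)
    try:
        store.read(7)
    except IOError as err:
        print("caught:", err)
```

The extra checksum computation on each read is the overhead in question: it costs a little time on every I/O, and in exchange a flipped bit surfaces as an error instead of being handed back as good data.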
But I digress...
To me, what is most disheartening about IBM's announcement isn't actually what IBM said or didn't say - it is instead the continuing misrepresentation of the technology by the press and analysts covering it. For some reason the Byte and Switch article covering IBM's milestone concluded with findings by a Citigroup analyst who asserts that read/write endurance of MLC flash "remains challenging" - as if that had anything to do with the subject.
For the record, both the STEC ZeusIOPS drive that EMC is shipping today and the FusionIO device that IBM used in their SVC science experiment are SLC-based devices. Read/write endurance is an issue that the suppliers of both devices have solved already. The real challenge is the effective integration of these devices into servers and storage so as to maximize the response time benefits of the drives.
And despite what Charlie asserts on behalf of IBM, flash storage really is all about I/O response time.