1.044: ibm's amazing splash dance
Leave it to the folks over at Big Blue to throw cold water on the whole flash storage revolution.
The link gets you this white paper: Solid state disks for enterprise storage - IBM’s approach to new storage technology.
UPDATE: Just in case IBM moves or withdraws the referenced white paper, I have saved a copy of it here on my blog site.
With a title like that, I figured this paper would be the long-awaited IBM response to my previous Flashdance post, even though it was probably at least in draft weeks before I started my post.
I wasn't to be disappointed.
I take that back.
In fact, I was INCREDIBLY disappointed by this paper. Coming from a supposed technical powerhouse with vertical integration from storage to server to implementation services (and the stranglehold that means they hold over customers), I had expected that IBM would deliver to us a well thought out explanation about how they would be helping to accelerate and leverage this new technology across all dimensions of their products to improve performance, reduce costs, and deliver innovative solutions.
What we have been given instead is a paper full of excuses, misrepresentations of the technology, outright inaccuracies and banal promises of something better still to come. Heck, where most competitors have squatted down on one of the three "excuses" that Chuck Hollis and I both predicted over the past year, IBM has instead decided to invoke all three:
- The technology isn't ready for the enterprise yet
- There aren't any valid use cases
- We're working on something even better, just wait
And though I'm sure that BarryW will spring to the defense of this paper, there is no denying that the authors are not in sync with the realities of flash. In fact, the IBM white paper contradicts SNIA's Solid State Storage Initiative's SSD white paper (Solid State Storage 101 – An introduction to Solid State Storage). Which is peculiar, since the chairperson of the SSSI is in fact an IBM employee (Phillip Mills). Phil didn't write the paper, and I have no clue whether he reviewed it, so I'm not throwing him under the bus. But I'd like to think that IBM would try to keep the Left Hand and the Right Hand working together toward the same objectives.
but what’s the objective?
Usually ‘white papers’ are written for one of two reasons. Most frequently the intent is to inform and educate readers as to a specific differentiating feature, product or technology. Less often, vendors use ‘white papers’ as a means of explaining why their products aren’t yet supporting some new technology. This form of paper is usually written in the form of educational prose to camouflage the real message: you can’t get this technology from us yet.
By my read, this paper is in the latter category.
There’s really no new news about the technology of solid state storage or Flash drives in this paper – in fact, the aforementioned SNIA SSSI paper offers a far more comprehensive (and unbiased) perspective.
But this paper contains a plethora of excuses for why it has taken IBM over a year to bring flash technology to the market. You might have missed them, so I’ll put the spotlight on a few to help you pick them out amongst the distractions:
- Flash write endurance and performance are a particular challenge for IBM’s DS8K, hence all the detail about this in the paper.
- The primary reason is that the DS8K has a relatively tiny write cache (max 8GB), requiring data to be destaged more frequently to the drive. In a large-cache array like Symmetrix, which can use more than 200GB of cache to buffer writes, EFDs are written to less frequently, which both minimizes wear and improves performance.
- A secondary reason is that the DS8K’s typical I/O block size is smaller than the internal page size of most EFDs, a fact that exaggerates a phenomenon known as write amplification, causing accelerated wear-out of the media. The fact the paper doesn’t even mention this is indicative of a camo white paper – you wouldn’t want to mention something bad that you have no way to mitigate.
FYI: Write amplification means that when you write (say) a 512Byte block to the SSD, it actually has to internally perform a Read-Modify-Write of its nominal page size (say 4KB) to actually store the data. So although only 512 bytes of data changed, 4096 bytes of flash undergo an erase+write cycle, reducing the expected lifetime of the drive.
- The paper makes several assertions as to the “complexity” of using Flash as a tier, and asserts that IBM will mitigate this with “management software.” These statements are the mask behind which IBM is hiding three facts that you may have overlooked:
- On the DS8K you cannot non-disruptively relocate an FBA LUN or a CKD Volume from one class of disk drive to another. If you created the device on 15Krpm HDDs, that’s where it will stay, unless you use some external means to copy the data to a different device. With Symmetrix you can easily and non-disruptively relocate a LUN or Volume to Flash; clearly IBM doesn’t want to highlight their inabilities in a white paper like this.
- IBM *has* announced how it intends to get around this limitation – for mainframe users at least. If you use DFSMS or DB2 on your mainframe (neither of which are free), you will “soon” be able to relocate datasets using what is in effect “host-based copy.” No announcements for how you’ll solve the problem on open systems or the AS400, though.
- In IBM lingo, the term “management software” is usually a euphemism for “proprietary lock-in solutions” – to get the benefits of IBM’s vertical integration, you basically have to commit to using IBM products top-to-bottom. That’s fine, I guess, but the rest of the world is moving to open standards to attain integration. For example, T10 recently completed what should be the final draft of the latest round of updates to the SCSI specs, and they now include a standard SCSI sense code to identify a device as being a Flash Device. Once implemented, any vendor’s database or file system will be able to determine if a target LUN resides on Flash, and self-optimize to maximize the performance advantage.
- In addition to the DS8K, the paper asserts that Flash SSDs are coming on the other storage platforms they sell. Outside of the (single-tiered) XIV and the (Frankenstorage) SVC, IBM doesn’t make any other storage devices. So this claim is a thinly veiled attempt to capitalize on the efforts of their OEM suppliers like NetApp, LSI and Dot Hill. Of course, those guys are all even slower than IBM themselves in supporting flash technology, so that’s not a really inspiring revelation.
- IBM isn’t being very aggressive in pushing the technology or applications of Flash. Throughout the paper, there are assertions that Flash won’t ever replace 15K drives, that it will be a long time before we see MLC in the enterprise, and that “the next” technology is the one we really need. This isn’t the language of a company that is working hard to accelerate the technology or its adoption; instead it says “please wait until we figure out how to leverage this technology” – I suspect they haven’t figured out how to monopolize customer dollars with Flash, and hope to stall things long enough for “the next thing.” Given that IBM has proprietary interests (e.g., protected IP) over both “next things” they reference (Racetrack and Phase Change), I suspect the delay is anything but altruistic.
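For readers who want to check the write amplification arithmetic from the FYI above, here is a minimal sketch. It assumes the simplified model used in that note: one isolated, page-aligned host write, with the drive rewriting whole pages. The function name and parameters are mine, purely for illustration; real drives also erase in blocks larger than a page and their controllers coalesce writes, so treat the result as a rough upper-bound illustration rather than a measurement of any particular EFD.

```python
def write_amplification(host_write_bytes: int, page_bytes: int) -> float:
    """Flash bytes rewritten per host byte written, for one isolated write.

    Simplified model (an assumption, not a drive specification): the write
    is page-aligned and the drive must read-modify-write every page touched.
    """
    if host_write_bytes <= 0 or page_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: number of whole pages the write touches.
    pages_touched = -(-host_write_bytes // page_bytes)
    return pages_touched * page_bytes / host_write_bytes


if __name__ == "__main__":
    # The example from the text: a 512-byte write into a 4KB page.
    print(write_amplification(512, 4096))   # 8.0
    # A write that exactly fills one page incurs no amplification.
    print(write_amplification(4096, 4096))  # 1.0
```

In this model a 512-byte write costs eight times its size in flash wear, which is why an array that issues many small destages will wear the media faster than one that buffers and combines them in a large cache.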
The truth is that not only was IBM caught flat-footed last year with EMC’s announcement of Flash, but their sloth continues to this day. They’re doing everything they can to steer customers away from flash drives, even overpricing them, in an effort to hide the fact that the aged DS8K is a poor architecture on which to use “super-fast” storage (IBM’s terminology, not mine).
Sad, but true.
Hopefully readers of IBM’s papers won’t be misled, and they’ll just look elsewhere for both the real facts and working products that actually deliver the price/performance advantage of EFDs today. And as of this writing, the only place to get EFDs in every class of commercial external storage array is where it all started – at EMC. Only here are EFDs already integrated into products and solutions that are benefitting literally hundreds of Symmetrix, CLARiiON and Celerra customers around the world.
fixin’ the facts
I know first hand that IBM is a big company (I used to work there), and I know how hard it is to coordinate across a mass of people. But to get simple facts wrong in a white paper – well, it just makes me wonder if the authors and reviewers actually took the time to research their paper before writing it. I found several factual inaccuracies in this paper, so in the interest of keeping us aligned on the facts, I’m offering several corrections. Maybe someone at IBM will see this and fix the paper (if they do, I’d appreciate a citation noting my contributions):
Page 3: Enterprise-class flash drives actually made their “appearance” in 2007, not 2008. As BarryW likes to harp, he actually blogged about the STEC ZeusIOPS drives that he had in his labs in September 2007. I trust BarryW wasn’t lying, so the paper’s authors are off by a year.
Page 9: Using multiple channels to parallelize writes inside an Enterprise-class Flash Drive not only benefits write performance, it also allows an EFD to deliver sustained READ performance at over 200MB/s (on the latest generation STEC ZeusIOPS drives). Without this fact, the last paragraph on this page might lead the reader to believe that EFDs are slower than the disk drives shown in the chart on Page 5.
Page 10: The Endurance paragraph asserts that SLC NAND has a WRITE endurance of 1 million cycles, when in fact typical SLC NAND components are rated at 100,000 WRITE cycles (and frequently exhibit better than 300,000 cycles). Likewise, I’ve never heard of MLC NAND being rated as high as 100,000 cycles, with 10,000 being typical.
Page 10: The last paragraph references a Flash SSD rated at 50K READs/sec and 17K WRITEs/sec with a 5-year rated lifecycle, and asserts that these drives are “new, and … will experience a learning curve that will improve reliability and price.” While that statement isn’t entirely inaccurate, the STEC ZeusIOPS is far from “new” – especially in terms of technology maturity cycles. In fact, ZeusIOPS is actually in its second generation, with millions of hours of production run-time under its belt. (OK yes, new to IBM, but that’s really irrelevant).
Page 11: One asserted application for Flash SSDs is database logs, but logs are typically 100% sequential WRITE. This sort of workload will see relatively little benefit from flash drives, and since disk drives can sustain similar write performance (especially if dedicated to the logs), using HDDs is probably more cost-effective. Positioning Flash for database logs is a common mistake, often made by people who have had no practical experience with the devices. My advice to most EFD prospects is to save money by using HDDs for the logs and put the frequently accessed index tables on the EFDs.
Page 11: The implied statement in the second paragraph is that 28 EFDs cost more than 500 15Krpm HDDs – in 2008. In point of fact, the only vendor who could sell you these EFDs in 2008 was EMC, and by year’s end those 28 EFDs actually cost about 1/3 less than 500 15K HDDs. IBM is still pricing the ZeusIOPS for the DS8K at 30-40x the 15K drives (with an announced April availability on new systems only), so if you want to save money, you’ll have to buy a new Symmetrix instead. You’ll be pleasantly surprised that EMC is now selling EFDs for even less (on a $/GB basis) than they were in 2009.
Page 12: The paper implies that “new” SMART and ECC management are needed to support EFDs – true. But EMC has been supporting these in Symmetrix for over a year, and on CLARiiON since last Fall. And there’s really nothing more special than what it took to support SATA drives (which IBM also just recently introduced for the DS8K, over a year and a half after Symmetrix delivered support), so I understand the need for IBM to explain the delays.
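To make the Page 11 pricing comparison concrete, here is a rough sketch of the arithmetic, expressed in units of "one 15K HDD". It uses only the figures quoted above (28 EFDs, 500 HDDs, a 30-40x per-drive multiple, and "about 1/3 less"). The function names are hypothetical, and the model assumes all HDDs carry the same unit price and ignores enclosures, power and software, so the per-drive multiples it derives are implications of those figures, not published prices.

```python
HDD_COUNT = 500  # 15Krpm drives in the comparison configuration
EFD_COUNT = 28   # flash drives in the comparison configuration


def efd_config_cost_in_hdd_units(price_multiple: float,
                                 efd_count: int = EFD_COUNT) -> float:
    """Cost of the EFD configuration, in units of one HDD's price."""
    return efd_count * price_multiple


def implied_price_multiple(relative_cost: float = 1.0,
                           hdd_count: int = HDD_COUNT,
                           efd_count: int = EFD_COUNT) -> float:
    """Per-EFD price multiple at which the EFD configuration costs
    `relative_cost` times the HDD configuration (1.0 = break-even)."""
    return hdd_count * relative_cost / efd_count


if __name__ == "__main__":
    # At 30x and 40x per drive, 28 EFDs cost as much as 840 and 1120 HDDs.
    print(efd_config_cost_in_hdd_units(30), efd_config_cost_in_hdd_units(40))
    # Break-even against 500 HDDs is roughly 17.9x per drive.
    print(round(implied_price_multiple(), 1))
    # "About 1/3 less" than 500 HDDs implies roughly 11.9x per drive.
    print(round(implied_price_multiple(relative_cost=2 / 3), 1))
```

Under these assumptions, anything above roughly 18x per drive makes the 28-EFD configuration more expensive than 500 HDDs, which is consistent with the paper's implied conclusion at a 30-40x multiple and with the opposite conclusion at the lower pricing described here.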
Page 13: As I’m sure BarryW will attest, the authors of this paper made the same mistake as did TonyP about the benchmark used for Project Quicksilver – it in fact was NOT a variant of the SPC test, but instead was a workload generated by an IBM-internal mixed-workload generator tool
IOmeter (corrected 9 March 09). The SPC would have forbidden IBM from publishing the 1 million IOP claim using an unreleased configuration (and an unaudited test) for the SPC, hence IOmeter instead of a real-world “official SPC” benchmark. Frequent readers will recall that I recently pointed out how IOmeter could be used to report fictitious and irrelevant results, so there’s a bit of skepticism in the industry about 1M IOPS claims of Quicksilver <shields on>. (retracted 9 March 09)
let’s stick to the facts, please
Look – I understand that enterprise-class Flash drives are relatively new to the industry. But the technology isn’t, so there’s no real excuse for getting the facts wrong here. Rather than play on the naiveté of the market with misleading positioning, we vendors have a responsibility to stick to the facts. I honestly don’t have a problem with IBM’s attempts at camouflage, but I’d have preferred to NOT have had to do the second half of this post to correct the blatant mistakes and misrepresentations of fact. Whether these were intentional or simple errors of research, no matter – the market deserves better.
Hopefully, this post has helped, or will help, to set things right again.