
March 07, 2009

1.044: ibm's amazing splash dance

Leave it to the folks over at Big Blue to throw cold water on the whole flash storage revolution.

On the same day that both IDC and Gartner confirmed that IBM is losing share in the external storage market while EMC is gaining, the following Tweet from "ibmstorage" floated across my TweetDeck:

IBM's approach to new storage technology
"Solid state disks for enterprise storage"
http://tinyurl.com/acom2s (pdf)
ibmstorage, Fri 06 Mar 10:32 via web

The link gets you this white paper: Solid state disks for enterprise storage - IBM’s approach to new storage technology.

UPDATE: Just in case IBM moves or withdraws the referenced white paper, I have saved a copy of it here on my blog site.

With a title like that, I figured this paper would be the long-awaited IBM response to my previous Flashdance post, even though it was probably already in draft weeks before I published mine.

I wasn't to be disappointed.
 

I take that back

In fact, I was INCREDIBLY disappointed by this paper. Coming from a supposed technical powerhouse with vertical integration from storage to server to implementation services (and the stranglehold that gives them over customers), I had expected that IBM would deliver a well-thought-out explanation of how they would help accelerate and leverage this new technology across all dimensions of their products to improve performance, reduce costs, and deliver innovative solutions.

What we have been given instead is a paper full of excuses, misrepresentations of the technology, outright inaccuracies and banal promises of something better still to come. Heck, where most competitors have squatted down on one of the three "excuses" that Chuck Hollis and I both predicted over the past year, IBM has instead decided to invoke all three:

  1. The technology isn't ready for the enterprise yet
  2. There aren't any valid use cases
  3. We're working on something even better, just wait

And though I'm sure that BarryW will spring to the defense of this paper, there is no denying that the authors are not in sync with the realities of flash. In fact, the IBM white paper contradicts the SNIA Solid State Storage Initiative's SSD white paper (Solid State Storage 101 – An Introduction to Solid State Storage) – which is peculiar, since the chairperson of the SSSI is in fact an IBM employee (Phillip Mills). Phil didn't write the paper, and I have no clue whether he reviewed it, so I'm not throwing him under the bus. But I'd like to think that IBM would try to keep the Left Hand and the Right Hand working together toward the same objectives.

but what’s the objective?

Usually ‘white papers’ are written for one of two reasons. Most frequently the intent is to inform and educate readers about a specific differentiating feature, product or technology. Less often, vendors use ‘white papers’ as a means of explaining why their products don’t yet support some new technology. These are usually dressed up as educational prose to camouflage the real message: you can’t get this technology from us yet.

By my read, this paper is in the latter category.

There’s really no new news about the technology of solid state storage or Flash drives in this paper – in fact, the aforementioned SNIA SSSI paper offers a far more comprehensive (and unbiased) perspective.

But this paper contains a plethora of excuses for why it has taken IBM over a year to bring flash technology to the market. You might have missed them, so I’ll put the spotlight on a few to help you pick them out amongst the distractions:

  • Flash write endurance and performance are a particular challenge for IBM’s DS8K, hence all the detail about this in the paper.
    • The primary reason is that the DS8K has a relatively tiny write cache (max 8GB), requiring data to be destaged to the drives more frequently. In a large-cache array like Symmetrix, which can use more than 200GB of cache to buffer writes, EFDs are written to far less often, which both minimizes wear and improves performance.
    • A secondary reason is that the DS8K’s typical I/O block size is smaller than the internal page size of most EFDs, a fact that exacerbates a phenomenon known as write amplification, causing accelerated wear-out of the media. The fact that the paper doesn’t even mention this is indicative of a camo white paper – you wouldn’t want to mention something bad that you have no way to mitigate.

      FYI: Write amplification means that when you write (say) a 512-byte block to the SSD, it internally has to perform a Read-Modify-Write of its nominal page size (say 4KB) to actually store the data. So although only 512 bytes of data changed, 4,096 bytes of flash undergo an erase+write cycle, reducing the expected lifetime of the drive (there’s a worked example in the first sketch just after this list).
  • The paper makes several assertions as to the “complexity” of using Flash as a tier, and asserts that IBM will mitigate this with “management software.” These statements are the mask behind which IBM is hiding three facts that you may have overlooked:
    • On the DS8K you cannot non-disruptively relocate an FBA LUN or a CKD Volume from one class of disk drive to another. If you created the device on 15Krpm HDDs, that’s where it will stay, unless you use some external means to copy the data to a different device. With Symmetrix you can easily and non-disruptively relocate a LUN or Volume to Flash; clearly IBM doesn’t want to highlight their inabilities in a white paper like this.
    • IBM *has* announced how it intends to get around this limitation – for mainframe users at least. If you use DFSMS or DB2 on your mainframe (neither of which are free), you will “soon” be able to relocate datasets using what is in effect “host-based copy.” No announcements for how you’ll solve the problem on open systems or the AS400, though.
    • In IBM lingo, the term “management software” is usually a euphemism for “proprietary lock-in solutions” – to get the benefits of IBM’s vertical integration, you basically have to commit to using IBM products top-to-bottom. That’s fine, I guess, but the rest of the world is moving to open standards to attain integration. For example, T10 recently completed what should be the final draft of the latest round of updates to the SCSI specs, and they now include a standard SCSI sense code to identify a device as being a Flash Device. Once implemented, any vendor’s database or file system will be able to determine whether a target LUN resides on Flash, and self-optimize to maximize the performance advantage (the second sketch after this list shows the idea).
  • In addition to the DS8K, the paper asserts that Flash SSDs are coming on the other storage platforms they sell. Outside of the (single-tiered) XIV and the (Frankenstorage) SVC, IBM doesn’t make any other storage devices. So this claim is a thinly veiled attempt to capitalize on the efforts of their OEM suppliers like NetApp, LSI and Dot Hill. Of course, those guys are all even slower than IBM themselves in supporting flash technology, so that’s not a really inspiring revelation.
  • IBM isn’t being very aggressive in pushing the technology or applications of Flash. Throughout the paper, there are assertions that Flash won’t ever replace 15K drives, that it will be a long time before we see MLC in the enterprise, and that “the next” technology is the one we really need. This isn’t the language of a company that is working hard to accelerate the technology or its adoption; instead it says “please wait until we figure out how to leverage this technology” – I suspect they haven’t figured out how to monopolize customer dollars with Flash, and hope to stall things long enough for “the next thing.” Given that IBM has proprietary interests (e.g., protected IP) in both “next things” they reference (Racetrack and Phase Change), I suspect the delay is anything but altruistic.
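To make the write-amplification arithmetic above a bit more concrete, here’s a minimal sketch in Python. The page size, block size, capacity, endurance rating and write rate are all assumed, illustrative numbers (not specs from any particular drive), and it deliberately ignores wear leveling, over-provisioning and write coalescing in cache – so treat it as a back-of-the-envelope model, not a datasheet:

# Illustrative write-amplification math - all numbers are assumed, not vendor specs.

def write_amplification(host_block_bytes, flash_page_bytes):
    """A host write smaller than a flash page still costs a full
    read-modify-write of the page, so more flash gets programmed
    than the host actually changed."""
    pages_touched = -(-host_block_bytes // flash_page_bytes)  # ceiling division
    return (pages_touched * flash_page_bytes) / host_block_bytes

def lifetime_years(capacity_gb, endurance_cycles, host_write_mb_per_sec, amplification):
    """Total program/erase budget divided by the (amplified) write rate."""
    total_writable_bytes = capacity_gb * 1e9 * endurance_cycles
    effective_write_rate = host_write_mb_per_sec * 1e6 * amplification
    return total_writable_bytes / effective_write_rate / (365 * 24 * 3600)

wa = write_amplification(host_block_bytes=512, flash_page_bytes=4096)
print("512-byte writes onto 4KB pages -> %.0fx write amplification" % wa)

# Hypothetical 73GB SLC drive rated at 100,000 cycles, hammered around the
# clock with 20 MB/s of small-block writes:
print("Estimated media lifetime: %.1f years" % lifetime_years(73, 100_000, 20, wa))

Run it with a bigger host block size, or with writes coalesced into full pages by a large array cache before they ever reach the drive, and the amplification factor – and the wear – drops accordingly.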
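As for self-optimizing once software can tell flash from spinning disk: I can’t demonstrate the T10 mechanism itself here, but as an analogy, a Linux host already exposes a per-device “rotational” flag that software can key off today. A minimal sketch (assuming a Linux system; the device names are purely hypothetical):

# Minimal sketch: pick a placement policy based on whether a block device
# reports itself as non-rotational (flash) on a Linux host.
from pathlib import Path

def is_flash(device):
    # The kernel reports "0" for non-rotational (solid state) devices.
    flag = Path("/sys/block/%s/queue/rotational" % device).read_text().strip()
    return flag == "0"

def placement_hint(device):
    # A database or file system could use this to self-optimize, e.g. put
    # frequently-read index tables on flash and sequential logs on HDD.
    return "hot, randomly-read data" if is_flash(device) else "sequential logs / cold data"

for dev in ("sda", "sdb"):  # hypothetical device names
    try:
        print(dev, "->", placement_hint(dev))
    except FileNotFoundError:
        print(dev, "-> not present on this host")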

The truth is that not only was IBM caught flat-footed last year with EMC’s announcement of Flash, but their sloth continues to this day. They’re doing everything they can to steer customers away from flash drives, even overpricing them, in an effort to hide the fact that the aged DS8K is a poor architecture on which to use “super-fast” storage (IBM’s terminology, not mine).

Sad, but true.

Hopefully readers of IBM’s papers won’t be misled, and they’ll just look elsewhere for both the real facts and working products that actually deliver the price/performance advantage of EFDs today. And as of this writing, the only place to get EFDs in every class of commercial external storage array is where it all started – at EMC. Only here are EFDs already integrated into products and solutions that are benefiting literally hundreds of Symmetrix, CLARiiON and Celerra customers around the world.

fixin’ the facts

I know firsthand that IBM is a big company (I used to work there), and I know how hard it is to coordinate across a mass of people. But to get simple facts wrong in a white paper – well, it just makes me wonder if the authors and reviewers actually took the time to research their paper before writing it. I found several factual inaccuracies in this paper, so in the interest of keeping us aligned on the facts, I’m offering several corrections. Maybe someone at IBM will see this and fix the paper (if they do, I’d appreciate a citation noting my contributions):

Page 3: Enterprise-class flash drives actually made their “appearance” in 2007, not 2008. As BarryW likes to harp, he actually blogged about the STEC ZeusIOPS drives that he had in his labs in September 2007. I trust BarryW wasn’t lying, so the paper’s authors are off by a year.

Page 9: Using multiple channels to parallelize writes inside an Enterprise-class Flash Drive not only benefits write performance, it also allows an EFD to deliver sustained READ performance of over 200MB/s (on the latest-generation STEC ZeusIOPS drives). Without this fact, the last paragraph on this page might lead the reader to believe that EFDs are slower than the disk drives shown in the chart on Page 5.

Page 10: The Endurance paragraph asserts that SLC NAND has a WRITE endurance of 1 million cycles, when in fact typical SLC NAND components are rated at 100,000 WRITE cycles (and frequently exhibit better than 300,000 cycles). Likewise, I’ve never heard of MLC NAND being rated as high as 100,000 cycles; 10,000 is typical.
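To put those cycle ratings in perspective, here’s a quick back-of-the-envelope comparison. The drive capacity and write rate are assumed purely for illustration, and write amplification and wear leveling are ignored:

# Rough write-budget comparison for the endurance ratings discussed above.
# Capacity and write rate are assumed for illustration only.

RATINGS = {
    "SLC (typical rating)": 100_000,   # program/erase cycles
    "SLC (often observed)": 300_000,
    "MLC (typical rating)": 10_000,
}

CAPACITY_GB = 73        # hypothetical drive capacity
WRITE_MB_PER_SEC = 10   # hypothetical sustained host write rate

for name, cycles in RATINGS.items():
    total_pb = CAPACITY_GB * cycles / 1e6   # petabytes writable over the drive's life
    years = (CAPACITY_GB * 1e9 * cycles) / (WRITE_MB_PER_SEC * 1e6) / (365 * 24 * 3600)
    print("%-22s ~%5.1f PB write budget, ~%4.1f years at %d MB/s"
          % (name + ":", total_pb, years, WRITE_MB_PER_SEC))

Even at the (correct) 100,000-cycle rating, SLC’s write budget is plenty comfortable for enterprise duty cycles; the paper’s 1-million-cycle figure is simply off by an order of magnitude.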

Page 10: The last paragraph references a Flash SSD rated at 50K READs/sec and 17K WRITEs/sec with a 5-year rated lifecycle, and asserts that these drives are “new, and … will experience a learning curve that will improve reliability and price.” While that statement isn’t entirely inaccurate, the STEC ZeusIOPS is far from “new” – especially in terms of technology maturity cycles. In fact, the ZeusIOPS is actually in its second generation, with millions of hours of production run-time under its belt. (OK yes, new to IBM, but that’s really irrelevant.)

Page 11: One asserted application for Flash SSDs is database logs, but logs are typically 100% sequential WRITE. This sort of workload will see relatively little benefit from flash drives, and since disk drives can sustain similar write performance (especially if dedicated to the logs), using HDDs is probably more cost-effective. Positioning Flash for database logs is a common mistake, often made by people who have had no practical experience with the devices. My advice to most EFD prospects is to save money by using HDDs for the logs and put the frequently accessed index tables on the EFDs.

Page 11: The implied statement in the second paragraph is that 28 EFDs cost more than 500 15Krpm HDDs – in 2008. In point of fact, the only vendor who could sell you these EFDs in 2008 was EMC, and by year’s end those 28 EFDs actually cost about 1/3 less than 500 15K HDDs. IBM is still pricing the ZeusIOPS for the DS8K at 30-40x the 15K drives (with an announced April availability on new systems only), so if you want to save money, you’ll have to buy a new Symmetrix instead. You’ll be pleasantly surprised to learn that EMC is now selling EFDs for even less (on a $/GB basis) than they were in 2008.
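To see why the per-drive multiplier matters so much, here’s the arithmetic with hypothetical unit prices. The ~12x premium is inferred from the “about 1/3 less” figure above rather than quoted from any price list, so treat the dollar amounts as placeholders; only the ratios matter:

# Hypothetical prices, purely to show how the per-drive premium flips the
# 28-EFDs-vs-500-HDDs comparison one way or the other.

HDD_PRICE = 1_000               # hypothetical 15Krpm HDD price
HDD_COUNT, EFD_COUNT = 500, 28  # drive counts from the scenario on page 11

for label, premium in (("~12x premium (inferred late-2008 EMC pricing)", 12),
                       ("~35x premium (midpoint of 30-40x DS8K pricing)", 35)):
    efd_total = EFD_COUNT * HDD_PRICE * premium
    hdd_total = HDD_COUNT * HDD_PRICE
    print("%s: 28 EFDs cost %.2fx as much as 500 HDDs" % (label, efd_total / hdd_total))

At roughly 12x, the 28 EFDs come in around a third cheaper than the 500 HDDs; at 30-40x they cost nearly twice as much – which is exactly why the per-drive markup, not the technology, is what makes flash look “expensive.”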

Page 12: The paper implies that “new” SMART and ECC management are needed to support EFDs – true. But EMC has been supporting these in Symmetrix for over a year, and on CLARiiON since last fall. And there’s really nothing more to it than what it took to support SATA drives (which IBM also only recently introduced for the DS8K, over a year and a half after Symmetrix delivered support), so I understand the need for IBM to explain the delays.

Page 13: As I’m sure BarryW will attest, the authors of this paper made the same mistake as TonyP did about the benchmark used for Project Quicksilver – it in fact was NOT a variant of the SPC test, but instead was a workload generated by an IBM-internal mixed-workload generator tool, not IOmeter as I originally wrote (corrected 9 March 09). The SPC would have forbidden IBM from publishing the 1 million IOPS claim using an unreleased configuration (and an unaudited test), hence that internal tool instead of a real-world “official SPC” benchmark.

Frequent readers will recall that I recently pointed out how IOmeter could be used to report fictitious and irrelevant results, so there’s a bit of skepticism in the industry about 1M IOPS claims of Quicksilver <shields on>. (retracted 9 March 09)

let’s stick to the facts, please

Look – I understand that enterprise-class Flash drives are relatively new to the industry. But the technology isn’t, so there’s no real excuse for getting the facts wrong here. Rather than play on the naiveté of the market with misleading positioning, we vendors have a responsibility to stick to the facts. I honestly don’t have a problem with IBM’s attempts at camouflage, but I’d have preferred not to have had to write the second half of this post to correct the blatant mistakes and misrepresentations of fact. Whether these were intentional or simple errors of research, no matter – the market deserves better.

Hopefully, this post will help to set things right again.

 



Comments


drakpzone

I read the paper, and have to agree with you that it sounds very much like a "stay away from flash so that we can figure out how to manage this thing", rather than an objective analysis of flash drives in today's storage environments. The "don't go there" feeling you get while reading the document is pretty depressing, at least from a storage administrator's point of view.

SRJ

WHOA! Do my eyes deceive me?!

"hence IOmeter instead of a REAL-WORLD BENCHMARK."

Did you just call the SPC a "real-world" benchmark?! Is EMC finally going to come out and play in the "real world" with the rest of the gang?!

Or were you just saying that because it suits your current purpose?

the storage anarchist

Nope, sorry - I still don't think the SPC tests are anything close to "real world."

In fact, given the accelerating adoption of server virtualization, and the rather drastic change in IO profiles that comes with it, I'd say that the SPC is getting more irrelevant every day. I'm pretty sure that you won't find ANY real-world workload that even closely resembles the archaic SPC-1 these days.

And who really cares which system can best support a workload that hasn't been "real" since the 1980s?

Seriously...

SRJ

So you only refer to the SPC as a "real-world benchmark" when it suits your purpose then....got it. Thanks for clearing that up!

BTW - You say that the authors of the paper mistakenly called the Quicksilver IOmeter performance test a "variant" of the SPC benchmark. Perhaps you should re-read it...they clearly neither said nor implied any such thing.

After seeing this example of your blatant misrepresentation of what the authors said and/or implied, I'm sure your readers will want to verify everything else you (mis)characterized in your review of the paper. I probably will too when time permits, but for now I've learned (from you) that it's safe to make some assumptions, take bits of information out of context, draw some conclusions that best serve me, and then accept them as, and state them as, the absolute truth.

Thanks BarryB!

the storage anarchist

You're always such fun to have a discussion with, SRJ.

For the record, the statement in question is (and I quote):

"Although it is not yet an official SPC benchmark, an SVC controller was demonstrated by the Project Quicksilver "Proof of Concept" to provide LUN managementand SAN access to SSD data with over 1,000,000 IO/sec—four times the current SPC record, also owned by SVC."

By implication, that is claiming that Quicksilver delivered over 1 million SPC IOPS - which is absolutely not the case. Pick on my reflection of the words (indeed, there is no mention of "variant" - my bad). But the fact is, the authors imply that Quicksilver demonstrated 1M SPC IOPS...which BarryW has previously confirmed was not the case.

SRJ

It doesn't imply anything...the sentence *begins* by stating that it *isn't* an SPC benchmark!

However, after reading it again, I could see how you could come to that conclusion if you wanted to. Their use of the word "official" could cause confusion. Therefore, I would retract the word "blatant" from my previous comment.

See - I'm not just fun...reasonable too! =)


the storage anarchist

OK, so now the record's straight.

But the real point is, as BarryW confirmed, IBM didn't report 1M SPC IOPS for Quicksilver, they reported 1M IOmeter IOPS!

So, it was neither an "Official" nor an "Unofficial" SPC test...and if it HAD been an SPC test, it would not have met the testing requirements of the SPC, and thus it would have had to have been a "variant."

Barry Whyte

Sorry, nowhere did I mention IOmeter.

This is a tool much like the one EMC has for generating customer-like OLTP workloads (I forget what you call yours; I've come across it at various customers). We have a similar tool - so this is not a "zero" generator like IOmeter; it generates mixed workloads of your choosing at various xfer sizes and so represents a real customer workload.

I have not reviewed this paper - until I saw the tweet - I have some feedback of my own going internally!

the storage anarchist

BarryW - sorry for the mistake, musta got XIV and SVC confuzeled in my brain. I just corrected that paragraph.

Funny, Phil Mills also said he hadn't seen the paper yet (as of Friday). Shouldn't the authors be reviewing such papers with IBM's most prominent flash personalities before publication? B^)

Stephen Foskett

It is a shame that a paper has to come out casting EFD in general in a negative light. But there is some good news here!

Those of us who are not at all bothered by OEM storage (especially that of NetApp, LSI, and Dot Hill) are very pleased that IBM will probably leverage their work and make it available to their customers. The many many users of SVC will be similarly happy to have support on that platform!

And I'm happy to learn that EMC EFD now costs between 11 and 12 times as much as a 15k drive instead of 30 times as much!

When will someone buy STEC already?
