
January 16, 2008

0.060: blinded by the light

For those of you who were so breath-taken by EMC's unexpected "viper on steroids" lightning strike with Enterprise Flash SSDs, here's my perspective on the rest of Monday's Symmetrix announcements:

They were pretty neat, too, although clearly not as disruptive as the enterprise-class flash drives will be.

And so, before I dig into the rest of the neat that was announced, you gotta admit - it is truly exhilarating to be totally surprised by the announcement of a disruptive technology that could very well supersede the performance, power (and hopefully the cost) limitations of spinning disk drives!

Of course, the competition has responded with the expected aplomb. Hitachi has gone on record with the assertion that this is all an uninteresting niche play limited to the needs of the Fortune 50. Meanwhile, IBM's designated storage blogger is gleefully cheering from the sidelines that EMC is retreating to its roots in solid-state storage.

Methinks perhaps they've been blinded by the flash (if not outright blind-sided).

From my perspective, the roots of the so-called EMC Specialty Shop aren't in solid-state storage at all, but rather they are entwined with a proven track record of out-innovating competitors in the storage space for nearly 3 decades. You need only look at EMC's Innovation Timeline to see the legacy of being the first to deliver solutions to very real and broad-based customer problems over that timeframe - from RAID to ICDA to SRDF to DMX and now flash drives.

Even IBM's recent XIV acquisition is an admission of that fact, coming months after Joe Tucci let the world know that EMC had set its sights on the cloud storage market with the impending Hulk & Maui products. And given that it is likely to be at least a year before the IBM Blue logo goes on the Nextra box and it gets into the bags of IBM's mainstream sales machine, I suspect that Hulk/Maui will technically beat IBM into that market as well.

That said, rest assured that neither IBM nor Hitachi are internally treating enterprise-ready flash drives as another Al Capone's vault. Inside they all (now) know that enterprise flash drives are very real, that they serve a very real and current customer problem, that they will inevitably change the way we think about storage in the future, and that they need to scramble to catch up to the lead that EMC has established. They're not really stoopid - they'll be trying to get into the game as quickly as they can.

And while today's enterprise-flash drive benefits may primarily be their incredibly fast response times and energy-efficient IOPS/watt, we all know that customer demand and cost erosion will rapidly expand the market. The future of flash-based storage is inarguably ahead of us.

As to why TonyP would try (in his blog) to compare the 73GB & 146GB enterprise flash drives that EMC just announced to the new "larger" 31.5GB (and 10x slower) consumer-grade flash drives that IBM just announced this week for their blade servers (the drives that come with only a one-year, limited warranty)?

I honestly haven't a clue.

OK - enough of that fun. On with the new Symmetrix stuff...as usual, there's lots to talk about! 

it just keeps getting better

Not even 6 months since the announcement of the DMX-4 and a wealth of enhancements for both DMX-3 and DMX-4, Monday's announcements present another plateful of new features for both platforms. No matter how you slice it, this represents a healthy dose of investment protection for the large (and expanding) DMX-3/DMX-4 installed base.

So let's look at the list:

  • Enterprise flash drives (DMX-4 only)
  • 1TB 7200 rpm SATA-II drives (DMX-4 only)
  • New GigE Channel Director with hardware-assisted IPv6 & IPsec encryption
  • Cascaded SRDF
  • HyperPAVs
  • Array-based Native Compatible Flash(copy)
  • Virtual Provisioning

Admittedly, the flash drives were announced as only qualified for the DMX-4, and the DMX-3 doesn't support SATA drives so it can't support the new 1TB SATA-II drives either. But everything else above will indeed be supported across both platforms.

I'll summarize each of these below, skipping over the flash drives (lest I be accused of chest-thumping).

1tb sata-ii drives

There's not really a lot that needs to be said here. DMX-4 was the first enterprise-class storage array to support SATA-II drives when it began shipping in Q3'07. At the time, it was announced that the DMX-4 would support a 750GB SATA-II drive in Q4'07. However, the powers that be within EMC decided instead to skip over the 750GB drive, and go straight to the 1TB in Q1'08, avoiding the inherent support and sparing costs for the drive in return for a better $/GB offering for customers.

Personally, I think that was a brilliant idea (even if it wasn't mine).

hardware-accelerated ipv6 and ipsec

This one is really straight-forward as well, although I can't for the life of me explain why, after nearly 5 years in the market, Symmetrix DMX is still the only enterprise storage array to offer native support for Gigabit Ethernet, not to mention native iSCSI host connectivity and native replication over IP networks. Native support saves customers thousands and thousands of dollars in external channel converters.

Oops! I know the competition reads my blog religiously, so I should probably shut up about that.

If your badge says IBM or Hitachi, please just forget I even mentioned it.

Suffice to say that this new GigE director improves upon the existing offering with native IPv6 support (as required by the US Federal Government Procurement Directives, among others), and it also can be licensed to run data-in-flight encryption using the industry standard IPsec encryption protocol for either iSCSI or SRDF connections (yes, that's right, IPsec is a separately licensed feature, licensed and enabled per GigE port). Importantly, the hardware will also perform data compression prior to the encryption (or without the encryption, for that matter) to reduce the traffic overhead. And of course, it also still runs on IPv4 networks, for maximum flexibility.
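The compress-before-encrypt ordering matters: once data is encrypted it looks random and no longer compresses, so the director squeezes the traffic first. Here's a minimal sketch of that pipeline in Python, using stdlib zlib and a toy XOR stand-in for the cipher (real IPsec uses standard ciphers; the function name and key here are mine, purely for illustration):

```python
import zlib

def prepare_frame(payload: bytes, encrypt: bool, key: int = 0x5A) -> bytes:
    """Compress the payload, then optionally apply a (toy) cipher."""
    compressed = zlib.compress(payload)           # compression happens first
    if not encrypt:
        return compressed
    return bytes(b ^ key for b in compressed)     # XOR stand-in, NOT real IPsec

payload = b"SRDF replication data " * 100
frame = prepare_frame(payload, encrypt=True)
print(f"{len(payload)} bytes in, {len(frame)} bytes on the wire")
```

Swapping the two steps would leave the compressor almost nothing to squeeze, which is why the hardware compresses before it encrypts.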

cascaded srdf

This one is a cost-saving twist on Symmetrix multi-hop replication (SRDF/AR), as well as an alternative to Concurrent SRDF for three-site replication.

Like SRDF/AR, SRDF/Cascaded replication is configured A-->B---//-->C, where the A-B link uses synchronous replication (SRDF/S), and the B-C link uses (asynchronous) adaptive copy. The difference lies in the fact that SRDF/Cascaded requires only a single copy of the data volumes in B - this replica acts both as the target (R2) of the source device(s) on array A, and (simultaneously) as the source (R1) to the target(s) on array C. This replica on array B is conveniently referred to as an R21 (get it)?
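To make the R21's dual role concrete, here's a toy model of the three-array topology (the Device class and the role-set representation are my own illustration, not an actual SRDF API):

```python
from dataclasses import dataclass, field

@dataclass
class Device:
    array: str
    roles: set = field(default_factory=set)

# A --SRDF/S--> B --adaptive copy--> C
a = Device("A", {"R1"})         # production source on array A
b = Device("B", {"R2", "R1"})   # the "R21": target of A and source to C, at once
c = Device("C", {"R2"})         # remote target on array C

print("B is an R21:", b.roles == {"R1", "R2"})
```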

Cascaded requires less storage on the B site than SRDF/AR. And while it does require new SRDF/Cascaded licenses on all three arrays, it also can eliminate the need for a TimeFinder license on array B; combined, these can reduce the total acquisition cost of a cascaded 3-site solution.

Operationally, Cascaded SRDF can significantly improve the RPO of the remote site (array C) as compared to SRDF/AR. On the other hand, it likely will require more bandwidth than SRDF/AR, although Cascaded can be configured to use either SRDF/A or SRDF/DM (adaptive copy) between B--C, giving you some flexibility to balance bandwidth vs. RPO.

And as I said, Cascaded can also be considered as an alternative to running concurrent SRDF/S and SRDF/A off of the primary array - this is something several customers have asked for so as to simplify the operations and distribute the replication workload off of the production system.

With now three different ways to implement 3-site replication, the appropriate choice depends on several factors, including acquisition price, the cost of bandwidth, the target RPO and the amount of changes that have to be replicated. The characterization rules are beyond the scope of what I can cover here, so I encourage you to work with your EMC technical resources to make that decision. But know that you have a new option.

One important FAQ about Cascaded: at a minimum, array B has to be running 5773 to support Cascaded. Right now, I've heard that the plan is to qualify with array C running 5772, and maybe even array A. But I don't know for sure that this will all get done by GA - be sure to check with your EMC technical folks on this as well.

hyperpavs and native compatible flash

(Note to reader - if you don't grok mainframe-speak, just skip this section)

HyperPAVs are a feature of the latest z/OS releases for the zSeries that allow the host/channel interface to dynamically create and dissolve parallel access volumes on the fly to improve I/O parallelism. Like IBM's own implementation for the DS8000, Symmetrix support for HyperPAV requires a per-Symmetrix license for HyperPAVs on top of the separate Symmetrix license for standard PAV aliases.

Native Compatible Flash is a new array-based implementation of IBM-compatible FlashCopy replication semantics for the mainframe. Leveraging significant enhancements to the underlying copy engine within Enginuity that supports the volume- and extent-based local replication capabilities of Symmetrix, this new implementation allows mainframe applications and utilities to leverage the advanced capabilities of Symmetrix using standard FlashCopy semantics. I understand that this new array-based implementation (also separately licensed) will ultimately replace the previously-offered host-based Compatible Flash, so you may want to start discussions with your account team about a transition plan.

It is perhaps important to note that these new features come to you as a result of EMC's renewed technology licensing and support agreements. Although I won't go into the specific terms of these agreements, they do serve to assure EMC customers that these technologies are being built with full knowledge of IBM's specifications and with their support - coopetition at its finest!

virtual provisioning

And finally (yup, I saved the best for last), EMC announced that indeed Virtual Provisioning for Symmetrix will ship this quarter, with support for both the DMX-3 and DMX-4 platforms.

Not a lot has changed since EMC first announced its intent to deliver this back when the DMX-4 was introduced - everything is basically the same as when I covered the DMX-4 launch, as you can see for yourself (next to the last section of this post).

No, nothing has really changed.

Except the name, that is.

Joe Tucci (aka the EMC Naming Committee-of-One) decided somewhere over Maryland, on a flight from Boston to Orlando, that the storage anarchist was right, and the moniker for this feature really should be consistent with other terms like virtual memory and virtual servers - just like I said in my second-ever blog post (storage virtualization: naming gone awry). And when he got off that plane, everyone immediately scrambled to change their presentations to reflect the new name before we presented (this was at the EMC TC Conference that I mentioned back in October).

OK. Alright. Joe probably never read my blog. And he didn't ask my opinion (I *wasn't* on his plane). He probably just figured we'd already used the name on the Celerra implementation that's been shipping since January 2006, and there was no good reason to change the name. Whatever. Doesn't matter.

The Naming Committee has spoken: Virtual Provisioning it is.

(And yes, personally, I think it's a damn good name)

a look inside svp

A few key points about Symmetrix Virtual Provisioning. First, it really is much more than just "thin provisioning." Oh sure, it enables customers to improve their storage utilization. And it allows for "just in time" addition of new capacity. There are comprehensive (and unavoidable) alerts and alarms as utilization thresholds are exceeded, both on the thin devices and on the storage pools from which they assign capacity, so you know when it's time to add "just in time" capacity.

And like other implementations, you can even oversubscribe storage: the amount of "virtual" but unallocated storage assigned to "thin devices" can in fact be greater than the pool that supports the thin device, it can be greater than the array's currently unused capacity - and it can even be greater than the array could possibly ever hold (which, of course, will make it very difficult to add capacity "just in time").
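The oversubscription arithmetic is simple enough to sketch. In this illustrative example (the function name, thresholds and capacities are all my own, not actual Symmetrix defaults), a pool that is 2.5x oversubscribed trips its first utilization alert:

```python
# Hypothetical sketch of thin-pool oversubscription math; the threshold
# values and capacities are illustrative, not Symmetrix defaults.

def pool_status(pool_gb, allocated_gb, subscribed_gb, thresholds=(0.6, 0.8)):
    """Return utilization, oversubscription ratio, and any threshold alerts."""
    utilization = allocated_gb / pool_gb
    oversubscription = subscribed_gb / pool_gb   # > 1.0 means oversubscribed
    alerts = [f"pool {int(t * 100)}% threshold exceeded"
              for t in thresholds if utilization >= t]
    return utilization, oversubscription, alerts

# A 10TB pool backing 25TB of thin devices, with 7TB actually written:
util, ratio, alerts = pool_status(10_000, 7_000, 25_000)
print(f"utilization={util:.0%} oversubscription={ratio:.1f}x alerts={alerts}")
```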

And as a result of customer input, you can also prohibit over-subscription, ensuring that nobody oversubscribes a pool unintentionally. Clever people, those customers - in fact we received several neat ideas like this one from the Technical Advisory Panels that we've been running (I probably haven't mentioned those before - maybe I'll let you in on this secret during a slow week sometime).

But like I said, Virtual Provisioning is so much more than just optimizing utilization. In fact, 9 out of 10 customers we asked about thin provisioning said that their primary interest wasn't the "thin" part at all - what they wanted most was the fast-and-easy storage provisioning that was inherent in thin provisioning implementations, like those from 3PAR, EqualLogic and even Centera. In fact many said that they didn't WANT "thin" - they just wanted "fast."

EMC listened.

And the Symmetrix Engineers responded.

So Symmetrix Virtual Provisioning is not only space efficient, it is FAST and SIMPLE. Faster even than creating a standard Symmetrix volume - and even that is now over 6x faster than when the DMX-3 first shipped (with 5773 it takes less than 10 minutes to allocate a terabyte).

And for those customers who want the wide-striping benefits of thin provisioning without the risks of over-subscription, the engineers included the ability to pre-allocate the maximum defined capacity of the (not-so-) "thin" device.


Rounding out the feature set of Symmetrix Virtual Provisioning, the initial release will include full support for virtually all Symmetrix functionality. This includes both TimeFinder and SRDF replication support (thin-to-thin only in this release) for maximum utilization savings. This is something that the developers over in USP-V land seemingly still have not delivered, despite promises to numerous customers to ship by year's end 2007.

On Day 1, Symmetrix Virtual Provisioning will support Open Replicator, Cache Partitions, Priority Controls, flash drives, and 1TB drives. It will have full support from Symmetrix Management Console, as well as reporting and trending with ControlCenter - in fact, about the only major things not supported in this release are mainframe CKD and iSeries OS400 volumes (and though support for at least CKD will likely be added in the future, the feedback from mainframe customers is that this is far less important than some of the other things we're working on for them).

"chunk"(tm?) size

I've put this in its own separately labeled section because everybody seems to be trying to find out what the actual "chunk" size is for Symmetrix Virtual Provisioning. In fact, so many press and industry analysts are asking that I'm beginning to think one of EMC's competitors is behind the question. I mean, seriously, who else but someone who's already implemented thin provisioning would really understand the implications of "chunk" size enough to care?

For those of you who don't know what the heck "chunk size" means (now listen up you folks over at IBM who have yet to implement thin provisioning on your own storage products), a "chunk" is the term used (and I think even trademarked by 3PAR) to refer to the unit of actual storage capacity that is assigned to a thin device when it receives a write to a previously unallocated region of the device.

For reference, Hitachi USP-V uses I think a 42MB chunk, XIV NEXTRA is definitely 1MB, and 3PAR uses 16K or 256K (depending upon how you look at it).

Back at the office, they've taken to calling these "chunks" Thin Device Extents (note the linkage back to EMC's mainframe roots), and the big secret about the actual Extent size is...(wait for it...w.a.i.t...for....it...)...the engineers haven't decided yet!

That's right...being the smart bunch they are, they have implemented Symmetrix Virtual Provisioning in a manner that allows the Extent size to be configured so that they can test the impact on performance and utilization of different sizes with different applications, file systems and databases. Of course, they will choose the optimal setting before the product ships, but until then, there will be a lot of modeling, simulation, and real-world testing to ensure the setting is "optimal."

But like I said, the curiosity level is so acute, I just gotta figure that someone is trying to find out the answer so that they can start preparing their competitive response FUD.

Now, many customers and analysts I've spoken to have in fact noted that Hitachi's "chunk" size is almost ridiculously large; others have suggested that 3PAR's chunks are so small as to create performance problems (I've seen data that supports that theory, by the way).

Well, here's the thing: the "right" chunk size is extremely dependent upon the internal architecture of the implementation, and the intersection of that ideal with the actual write distribution pattern of the host/application/file system/database.

Given that, you should understand that Symmetrix is basically a track-oriented system (a DMX-3 or DMX-4 track is 64KBytes). While Symmetrix never reads or writes an entire track unless it needs to, the system inherently looks at disks not as a collection of 512 byte blocks, but as a series of blocks organized into 64KB tracks (arranged as cylinders, but that's not relevant to this discussion). What's important here is that the smallest practical Extent size for Symmetrix is 64KB. And the largest? 64GB, but you can bet that the Extent size won't be anything near that big.
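A quick bit of track arithmetic makes the candidate sizes concrete (the extent_kb helper is my own; only the 64KB track size comes from the discussion above):

```python
# Track-to-bytes arithmetic for candidate extent sizes, assuming the
# 64KB DMX-3/DMX-4 track as the minimum unit. Illustrative only.

TRACK_KB = 64  # DMX-3/DMX-4 track size

def extent_kb(tracks):
    """Size in KB of an extent spanning the given number of tracks."""
    return tracks * TRACK_KB

# Candidates from 1 track up to a Hitachi-style 42MB chunk (672 DMX tracks)
for tracks in (1, 10, 12, 16, 24, 64, 144, 672):
    kb = extent_kb(tracks)
    print(f"{tracks:4d} tracks = {kb:6d} KB = {kb / 1024:7.2f} MB")
```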

The optimal Extent size is dependent upon several factors. First is the pattern in which sequential tracks are laid out across the members of a RAID group, coupled with the number of members in the RAID group (you know: 1+1, 3+1, 7+1, 6+2, 14+2). Then comes consideration of the way that pre-fetch algorithms want to handle sequential reads - the Extent size needs to align with all of these algorithms to maximize the efficiency of cache and thus to minimize the overhead of actually doing I/O to the thin device.

Then there are the considerations of the amount of memory it will require to support the thin device virtual mapping tables to the actual storage. The smaller the Extent size, the larger these mapping tables and the more global memory required to support them, BUT the less data manipulation required to effect the addition of each new Extent to the thin device, although that work will have to be done more frequently. Conversely, the larger the Extent, the smaller the memory footprint and the more complex the task of adding Extents, but the process will occur less frequently.
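A back-of-the-envelope model shows how steeply the mapping-table footprint scales with extent size. The 8-byte entry and 2TB device here are assumptions for illustration only, not Symmetrix internals:

```python
# Rough model of mapping-table memory vs. extent size. The entry size and
# device capacity are illustrative assumptions, not Symmetrix internals.

ENTRY_BYTES = 8              # assumed bytes per mapping-table entry
DEVICE_BYTES = 2 * 1024**4   # a hypothetical 2TB thin device

def mapping_table_bytes(extent_kb):
    """Memory needed to map the whole device at the given extent size."""
    entries = DEVICE_BYTES // (extent_kb * 1024)
    return entries * ENTRY_BYTES

for extent in (64, 256, 768, 43_008):   # 1, 4, 12 tracks, and ~42MB
    mb = mapping_table_bytes(extent) / 1024**2
    print(f"{extent:6d} KB extents -> {mb:8.2f} MB of mapping table")
```

At one-track extents the table for a single 2TB device runs to hundreds of megabytes under these assumptions, which is exactly the pressure that pushes implementations toward larger chunks.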

Given that, if the extents are too small, RAID calculations and/or prefetch will be sub-optimal. Too large, and I/Os may time out waiting for the allocation of a new Extent. And larger Extents could in fact waste a boatload of capacity if the host database/file system/application "scribbles" across the volume instead of clustering writes close together.
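And the flip side of the tradeoff - the worst-case waste when a host "scribbles" small writes randomly across the device - can be modeled just as simply (again, a hypothetical sketch with made-up numbers, not measured behavior):

```python
# Worst-case thin-device allocation when every small write lands in its own
# previously-untouched extent. All numbers are illustrative assumptions.

def allocated_for_random_writes(num_writes, write_kb, extent_kb):
    useful = num_writes * write_kb       # data the host actually wrote
    allocated = num_writes * extent_kb   # capacity the array had to assign
    return useful, allocated

for extent in (64, 43_008):   # 1 DMX track vs. a 42MB-style chunk
    useful, allocated = allocated_for_random_writes(10_000, 8, extent)
    print(f"{extent:6d} KB extents: {useful/1024:.0f} MB useful, "
          f"{allocated/1024:.0f} MB allocated")
```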

So if you were able to follow all that, tell me - can you predict what the "best" Extent size for Symmetrix is going to be: 10 tracks? 12 tracks? 16? 24? 64? 144? 42MB?

Give up?

I'll admit it - I don't know either. At least, not yet.

So to all of you who are pressing for an answer - I ask: why do you care? Surely you don't know what's "best" or "right" - at least, you don't know better than the men and women who are writing the code and testing the applications every day.

I say we just wait and see what they decide.

And when they do, I promise: I'll explain the reasoning.


don't be surprised if there's more than meets the eye

I have to admit that there is likely to be more in the release of Enginuity 5773 than I've discussed here. Qualification may well exceed expectations, and there are indeed more than a couple of new features "on the bubble" for making GA this quarter. But with all the revenue recognition rules, these things won't be discussed until EMC knows that they'll GA this quarter.

Still, I'm pretty sure I'll have a few more interesting things to talk about when 5773 ships in the middle of March 2008.

In closing, I sincerely hope you found this post interesting and informational, whether ye be employee, investor, customer or competitor! We're at the dawn of a brave new era, and it's undoubtedly going to be exciting.

Please, drop me an email or leave a comment - I'd love to hear your thoughts on this week's announcements!




Barry Whyte

Just on the SSD side of things, as you no doubt know, STEC touted these round all of us last year, and it's not the 'just stick them in the box we have' approach that's of real interest - anyone can, and I'm sure will, do that.

It's what else we can use them for that has really excited me and mine. Are some really fast hard-wired blocks (by that I mean fixed to certain LUNs) the only, or even the best, customer use? If I'm paying $80/GB+ I'd want to not only spread them around, but get every single one of those IOPS for my money... think bigger picture... dare I say outside the box... especially the big power-hungry monolithic box...

For sure these are groovy packaging, but we need more suppliers with the same level of performance to truly force the price down; otherwise storage vendors will continue to command whatever price they can get for them. Everyone has to make a profit, and the patented wear-leveling, ECC and block-level migration within the STEC drive does command a premium over "USB stick NAND" - if we want to see these prevail outside the Fortune 100 then the price has to drop, and fast.

the storage anarchist

I couldn't agree with you more - this is just the beginning of the ways that flash technology will be leveraged in the storage industry...that's what's so neat about it!


Lots of good information there - I like detailed posts like that, keep it up.

But 10 minutes to allocate 1TB? Try less than 10 seconds using a certain competitor's product... (but I guess it all depends when you start measuring)




Hi Barry,

Interesting post as always. I find it interesting that you say the Symm engineers have not decided on 'chunk' size yet but are testing what works best. I'm sure the guys at Hitachi did the same and didn't just pick the 42MB number out of the air. As I'm sure you know, the Hitachi implementation borrows heavily from their tried and tested Copy On Write technology, which they have been shipping, and learning from, for years.

Now I won't pretend to know the 'chunk' size used for COW pools, so maybe they didn't cut and paste that piece of code. But I can be sure of one thing - namely, the overhead that is placed on the front-end processors when COW is used is considerable. Far more than with other software such as ShadowImage and TrueCopy. Obviously the more pages that are demanded from the COW pool, the busier the front-end MPs. They can get pretty hot. Note, I have only looked at this performance on a USP and not a USP-V, which obviously has faster processors.

My point being, Hitachi no doubt has considerable experience with the effect of on-demand allocations from pools like those used in their dynamic provisioning. My guess would be that the good folks in Symmetrix engineering will also come out with a fairly large chunk size. Although I suppose you guys have a lot more room for movement since you guys use superior processors and write tighter ucode ;-)

Oh and of course track size on the USP is 256K as opposed to 64K on the DMX, 4 times the size. So if the Symm was to go with 42MB it would have to source 4 times as many tracks as the USP. So my guess as to the number that the Symm engineers finally decide on is around 10MB. What's the prize for the correct guess?

Oh and surely there are times when the Symm does read and write an entire track when only a sector or two are required – read ahead, SRDF…..?

Oh and BTW - nice move on the SSD!!

the storage anarchist

Nigel - nice to hear from you again, and thanks for the kudos!

Good guess on the chunk size, but no cigar - your logic took you down the wrong path.

I see that 3PAR is having a boatload of fun picking on USP-V's "chubby" allocation size, though. I have to admit that I hadn't thought it related to COW; my bet was that it was a prime multiple of their RAID stripe sizes and their track size... maybe you're right.

And yeah, Symm will take advantage of "free reads" whenever possible - caching the rest of the track behind the desired sector. Turns out to really help increase cache hit ratios for a lot of applications.

And SRDF today is far more efficient than days of old WRT partial tracks. Always room for improvement, to be sure.

I am unabashedly an employee of EMC, but the opinions expressed here are entirely my own. I am a blogger who works at EMC, not an EMC blogger. This is my blog, and not EMC's. Content published here is not read or approved in advance by EMC and does not necessarily reflect the views and opinions of EMC.
