April 01, 2008

0.073: 5773 > c

In case you've been wondering, the previously announced Q1'08 Symmetrix enhancements, including Enginuity 5773, the 73GB and 146GB enterprise flash drives, the 1 TB SATA-II drive and the new GigE I/O director all shipped on schedule last week. As usual, there's quite a bit to talk about, because in addition to what has been announced already, there are several additional features in this code release that revenue recognition rules prohibited EMC from disclosing until everyone was confident that they would actually make the GA release in Q1.

But discussion of perhaps the most significant new feature in 5773 was held back for another reason: to get all the patent applications filed before it was disclosed. This one new feature could well prove to be the foundation of a whole new era in remote replication - potentially changing the nature of distance replication more than flash drives will change the storage media end of the equation.

But it'll take a little explaining (hint: it has something to do with the title of this post), so before I get into the details, a little background...
 

the quest to accelerate remote replication

When presenting the Symmetrix technology roadmap to customers and prospects under NDA, I've frequently joked that EMC has assigned not one but two separate engineering teams to work on improving the performance (both throughput and response times) of distance replication. The first team, I joke, is working on breaking the speed of light barrier (another hint: that'd be the "c" in the title), while the other is focused on the more promising technologies of data compression and de-duplication to improve the amount of data that can be replicated per unit of time. And the punch line is that if the former team ever succeeds, you can be assured that EMC^2 won't be in the storage business any more :-D

Now before you turn on your April Fool's day filter, let me assure you that I'm not claiming that the speed-of-light team has come up with a breakthrough (at least not yet, anyway).

No, in fact, it was the de-dupe team that came up with a radical innovation that seemingly "cheats" the transmission speeds of remote replication, and it's pretty exciting (remember, I told you that I work with smart people!).

the tachyon advantage

OK - I'll admit it - I'm having fun with titles (it IS April Fool's day, after all).

No, this isn't about using tachyons (those hypothetical particles that travel at superluminal speeds) to transmit data, but rather the actual Tachyon Fibre Channel Controller from PMC-Sierra that Symmetrix DMX incorporates for its front- and back-end Fibre Channel connectivity.

What the de-dupe team found is that there is a hidden feature within recent generations of this chip that allows a single bit, under certain circumstances, to represent TWO bits of information. Think about that for a minute - twice as much data per bit...even though it's only applicable some of the time, that's pretty radical.

Of course, to work, there has to be a PMC-Sierra Tachyon chip on both sides of the communication link, along with software to take advantage of this somewhat hidden feature. Net-net, it's pretty easy to implement for SRDF on Symmetrix, but it would be more difficult to implement for storage-host FC links, since most HBAs are built around QLogic or Emulex chipsets. Also, neither Hitachi nor IBM can use this in most of their storage arrays for the same reason - most of their products don't use Tachyon chips for their FC interfaces, and those other chips just don't support this 2-for-1 capability.

So this is another EMC-unique feature (for now, anyway). Aren't you glad you bought the best?

there are lots of zeros out there

One of the most challenging things about this discovery was determining which two-bit pattern we would use this feature for. Obviously, we had four choices ("00", "01", "10", "11"). By taking line traces of active SRDF links, EMC's engineers were able to determine that the vast majority of the bits transmitted are in fact zero-bits - and this even though all-zero data blocks are typically compressed by the protocols into a short-hand ("write "00000000" 8092 times"). On average, nearly 65% of the bits transmitted over SRDF were "0".

But far fewer are adjacent ("00") and aligned.

Still, almost 34% of the total bits transferred were in fact aligned double-zeros, far more than all other bit combinations - and most importantly, these were quite frequently byte-aligned, as required by this new-found capability. Makes sense, if you think about it - most of those 32- and 64-bit integers are used to store numbers that are relatively small (years, months, days, credit charges, account balances, etc.). So that's why the team decided to use this new two-fer bit to represent "00".
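For the curious, here's a minimal sketch of how you might take that kind of measurement on a captured buffer. Everything in it is an assumption for illustration: the function name `bit_stats` is made up, "aligned" is taken to mean a two-bit pair starting at an even bit offset within a byte, and the sample data is just a handful of small integers, not a real SRDF line trace.

```python
import struct
from typing import Tuple


def bit_stats(data: bytes) -> Tuple[float, float]:
    """Return (fraction of zero bits, fraction of bits inside aligned "00" pairs).

    Hypothetical helper: "aligned" is assumed to mean a two-bit pair that
    starts at an even bit offset within its byte.
    """
    total_bits = len(data) * 8
    if total_bits == 0:
        return 0.0, 0.0

    zero_bits = 0
    double_zero_pairs = 0
    for byte in data:
        # Zero bits are whatever is left after counting the one bits.
        zero_bits += 8 - bin(byte).count("1")
        # Check the four aligned two-bit pairs in this byte.
        for shift in (0, 2, 4, 6):
            if (byte >> shift) & 0b11 == 0:
                double_zero_pairs += 1

    # Each matching pair accounts for two bits of the stream.
    return zero_bits / total_bits, (double_zero_pairs * 2) / total_bits


# Illustrative sample only: small values stored as 32-bit big-endian integers.
sample = struct.pack(">4I", 2008, 4, 1, 73)
zero_fraction, aligned_00_fraction = bit_stats(sample)
print(f"zero bits: {zero_fraction:.0%}, aligned 00 bits: {aligned_00_fraction:.0%}")
```

Small numbers in wide integer fields are mostly leading zeros, which is the intuition behind the figures above.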

Mathematically, if you can transmit 34% of the data using half as many bits, you reduce the number of bits you have to transfer in total by 17%. Which, while not necessarily earth-shattering, is nothing to be ashamed of. On top of the SRDF performance enhancements delivered in 5772 (30% reduction in latency or 2x the distance), this new enhancement adds another 17% latency improvement (or ~1.4x more distance at the same latency). Combined with 5772, SRDF/S customers could see a 50% reduction in latency. And 5773 allows SRDF/A cycle times to be set below 5 seconds (with RPQ) - this new feature adds a little headroom to maximize bandwidth efficiency for the shortest possible RPO.
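The 17% figure is just the eligible share of the traffic multiplied by the saving on that share. A quick sketch of the arithmetic (the function name `bits_saved_fraction` and its parameters are mine, purely for illustration):

```python
def bits_saved_fraction(eligible_fraction: float, encoded_size_ratio: float = 0.5) -> float:
    """Fraction of total bits saved when `eligible_fraction` of the stream
    can be sent using `encoded_size_ratio` times as many bits."""
    return eligible_fraction * (1.0 - encoded_size_ratio)


# 34% of the bits are aligned "00" pairs, each sent with half as many bits.
saving = bits_saved_fraction(0.34)
print(f"total bits saved: {saving:.0%}")
```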

hats off to the smart guys

Of course, there's a lot more to be discussed about 5773 and flash drives and the like, and I'll try to cover the rest later this week. But I think this new "data compression" feature deserves the most attention. Given the benefits, I expect that other vendors will work night-and-day to get this thing into their chip-sets and HBAs as quickly as possible. And who knows - perhaps someone will even find a way to compress more than just one two-bit pattern (à la quantum computing, where each bit actually has THREE different values).

But for now, that's all science fiction, while DMX-3 and DMX-4 customers who move to Enginuity 5773 will see an immediate improvement in their SRDF performance. So hats off to Dr. Lirpa Sloof and the team that figured this out - you're well on your way to EMC's 2008 Innovator of the Year Award!

Like I said - I work with smart people.
 


Comments

Nigel

Barry,
Interesting post.

Quick question if you dont mind -

Isnt SRDF replication based on tracks? You know, if you update a sector on a track of a large LUN does the Symm not send the entire track over the wire to the remote site?

Surely there are times when this is wasteful and this must in some ways be a cost cutter so that too much control memory is not used up keeping track of (no pun intended) individual sectors etc??

Im no authority on this and certainly not on EMC replication. But with the Symm being essentially track oriented I imagine this may happen. Couldnt you cut the number of unnecessary bits being transmitted by not transmitting full tracks when only a sector or two has changed.....??? Complicated I imagine.

Im asking, as I say Im no authority, and may be the Symm doesnt work this way, but I imagine it does.

Oh and about Hitachi not using Tachyon chips in their kit..... Im open to being wrong about this, but..... Im sure they do. Im under the impression they use Tachyon DX2 and DX4 chips depending on 2Gbps or 4Gbps. I will have to ask around about this one. Id be interested to know how you know this?

Nigel

the storage anarchist

1) SRDF was optimized to do partial track updates over the wire several generations ago.

2) You might wanna Google "Lirpa Sloof" ;*)

Nigel

I'll put it down to the time difference, traditionally we only do april fools in england up until noon. I only saw your post at about 23:00 my time ;-)

Thats my excuse anyway - doh!

BTW - I didnt know that about partial track updates. Does that apply to all LUNs even very large ones?

the storage anarchist

OK, well, sorry then - I posted this at about 8AM EDT, so it was past noon already. You're excused.

The whole track-centric thing is grossly misunderstood - most people believe it still works the way it did in the early years. But today most I/O operations are performed at a much smaller granularity (4K or 8K, which aligns well to the majority of file systems and database engines). About the only thing that works on larger increments is pre-fetch, which will read "the rest of the track" if there is no higher priority demand for the spindle (BTW this has an uncanny benefit to cache hit ratios, even today).

SRDF is really based on Symm Devices - large LUNs are made up of metas of multiple Symm Devices. So yes, partial track updates are applied independent of LUN size (or type).

about
the storage anarchist


Barry Burke

disclaimer

I am unabashedly an employee of EMC, but the opinions expressed here are entirely my own. I am a blogger who works at EMC, not an EMC blogger. This is my blog, and not EMC's. Content published here is not read or approved in advance by EMC and does not necessarily reflect the views and opinions of EMC.
