July 06, 2011

4.002: Does page size matter -- a rebuttal

Hitachi Data Systems' CTO Hu Yoshida continues to try to defend the 42MB page size utilized by Hitachi Dynamic Tiering (HDT) on the VSP. Apparently his first attempts to put lipstick on the pig didn't go over so well, so now he has resorted to good old competitive FUD as he tries to convince his readers that the smaller granularity employed by VMAX FAST VP (7.5MB) delivers poorer "cost performance" than HDT.

Hu's basic premise is that the smaller the page size, the larger the amount of metadata that has to be maintained. He (incorrectly) asserts that VMAX FAST VP requires 54 times as much metadata as does VSP HDT. Further, he claims that managing and checking all that metadata requires 54 times more CPU cycles, reducing performance. He also makes a rather outlandish claim that the smaller page size requires 54 times more data movement.

With all due respect to Hu, his claims are total hogwash!
 

let's look at the facts

Hu conveniently overlooks several very relevant facts (and tries to misdirect away from them in his response to my comments):

facts:
  1. FAST VP's metadata is (by design and intent) significantly smaller than the size of the data extent itself. Recognizing that relocating between tiers would always involve drives of different performance capabilities and cost, FAST VP was optimized to minimize wasted capacity on expensive Flash drives and to minimize the CPU overhead by moving the smallest amount of data possible.
  2. The CPU overhead of FAST VP's metadata is virtually insignificant: basically incrementing a counter on any cache miss operation at the granularity of the FAST VP extent (7.5MB).
  3. When it decides to promote an extent, FAST VP only has to search for the busiest FAST VP extent and then copy only those 64KB tracks within the 7.5MB extent that are non-zero (tracks that have never been written by the host are never copied).
  4. While HDT has to search through a presumably smaller amount of metadata, it then has to copy a much larger 42MB of data to a different tier. A copy operation takes CPU cycles too, plus there's the overhead of recalculating RAID protection if the source and destination don't use the same RAID geometry.
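
To make facts 2 and 3 concrete, here is a minimal sketch of what per-extent tracking could look like. It is purely illustrative: the class, function names and data layout are hypothetical and are not taken from FAST VP or HDT internals. Only the 7.5MB extent size and 64KB track size come from the facts above.

```python
# Hypothetical sketch: names and structure are illustrative only,
# not taken from FAST VP or HDT internals.
EXTENT_BYTES = 7680 * 1024      # 7.5MB extent, per the post
TRACK_BYTES = 64 * 1024         # 64KB track, per the post
TRACKS_PER_EXTENT = EXTENT_BYTES // TRACK_BYTES


class ExtentStats:
    """Keeps one cache-miss counter per extent of a volume."""

    def __init__(self, volume_bytes):
        n_extents = -(-volume_bytes // EXTENT_BYTES)  # ceiling division
        self.miss_counts = [0] * n_extents

    def record_cache_miss(self, byte_offset):
        # The only per-miss cost in this sketch: one integer increment.
        self.miss_counts[byte_offset // EXTENT_BYTES] += 1

    def busiest_extent(self):
        # Worst case: a linear scan over every counter.
        return max(range(len(self.miss_counts)),
                   key=self.miss_counts.__getitem__)


def tracks_to_copy(written_tracks):
    """Return indexes of the tracks in an extent that the host has written.

    `written_tracks` is an assumed per-extent list of booleans;
    never-written tracks are skipped rather than copied.
    """
    return [i for i, written in enumerate(written_tracks) if written]
```

In this sketch the hot path is a single increment, and the promotion path copies only the tracks flagged as written, which is the trade-off the list above describes.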

The beauty of FUD when it is done well is that it is virtually indistinguishable from fact or truth (the two aren't the same) – especially if you WANT to believe it. I'm sure there will be many among the Hitachi installed base that will use Hu's FUD as "evidence" that FAST isn't as good as HDT.

But allow me to translate all this into a relevant and meaningful comparison:

comparison example

  • Let's say both VMAX FAST VP and VSP HDT use 8 bytes of metadata per page/extent

I have no idea how much metadata HDT uses, so I'll use a reasonable number for this example.

  • HDT will thus require only 8 bytes of metadata for each 42MB of data
  • FAST VP will require (42MB/7.5MB)*8 bytes = 5.6*8 = 44.8 bytes of metadata (48 bytes, or 6x more, with the ratio rounded up)

I have no idea where Hu got 54x more metadata, but the incremental FAST VP metrics are tracked in metadata at a granularity of 7.5MB. VP- and track–level metadata are separate and tracked at different granularities.
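
The per-page ratio can be checked with a few lines of Python. This is only a sketch of the arithmetic; the 8 bytes per entry is the assumed figure from this example, not a published number for either product.

```python
# Assumption from the example: 8 bytes of metadata per page/extent.
BYTES_PER_ENTRY = 8
HDT_PAGE_MB = 42.0
FAST_VP_EXTENT_MB = 7.5

ratio = HDT_PAGE_MB / FAST_VP_EXTENT_MB      # FAST VP extents per HDT page
hdt_meta = BYTES_PER_ENTRY                   # one entry covers 42MB
fast_vp_meta = ratio * BYTES_PER_ENTRY       # entries covering the same 42MB

print(f"extents per 42MB page: {ratio:.1f}")
print(f"HDT metadata per 42MB: {hdt_meta} bytes")
print(f"FAST VP metadata per 42MB: {fast_vp_meta:.1f} bytes")
```

Under that assumption the ratio is simply the quotient of the two page sizes, whatever the per-entry size turns out to be.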

  • For a 1TB volume, HDT will require (1TB/42MB)*8=24,966*8=199,728=195KB of metadata
  • For the same 1TB volume, FAST VP will require 6 times as much metadata= 1,170KB
  • Worst case – both will have to look at the ENTIRE metadata in order to find the busiest page/extent to promote
  • That's 195KB worth of compares vs. 1,170KB worth of compares, in HDT's favor
  • But once found, HDT has to MOVE 43,008KB of data, while FAST VP moves only 7,680KB of data.
  • So, HDT has to "touch" 43,203KB of data (195KB of metadata plus 43,008KB moved), while FAST VP has to "touch" (worst case) only 8,850KB of data (FAST VP only has to move non-zero blocks, so the actual KB touched can be even less).
  • You might even say that the VSP CPUs have to do almost 5x as much work as do the VMAX CPUs.
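
For anyone who wants to re-run the 1TB comparison, here is a small sketch of the arithmetic. The function name is made up for illustration, and the 8 bytes per entry is the same assumed figure used throughout this example. Note that it uses the unrounded 5.6x ratio, so FAST VP's metadata comes out slightly below the rounded 6x figure in the list.

```python
BYTES_PER_ENTRY = 8            # assumed, as in the example
KB_PER_MB = 1024
VOLUME_MB = 1024 * 1024        # 1TB expressed in MB


def promotion_cost_kb(page_mb):
    """Return (metadata_kb, moved_kb, touched_kb) for one worst-case promotion.

    Worst case means scanning all of the metadata and then moving one
    full page/extent with no zero tracks skipped.
    """
    entries = VOLUME_MB / page_mb
    metadata_kb = entries * BYTES_PER_ENTRY / 1024
    moved_kb = page_mb * KB_PER_MB
    return metadata_kb, moved_kb, metadata_kb + moved_kb


hdt = promotion_cost_kb(42.0)
fast_vp = promotion_cost_kb(7.5)

for name, (meta, moved, touched) in (("HDT", hdt), ("FAST VP", fast_vp)):
    print(f"{name}: metadata {meta:,.0f}KB, moved {moved:,.0f}KB, "
          f"touched {touched:,.0f}KB")
print(f"HDT / FAST VP touched ratio: {hdt[2] / fast_vp[2]:.2f}")
```

Changing `BYTES_PER_ENTRY` shows how little the metadata term matters next to the data that has to be moved.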

And just so it isn't overlooked, in this example the added metadata of FAST VP takes up just over 1MB of VMAX global memory, while HDT potentially MOVES and WASTES at least 42-7.5=34.5MB more data cache memory for every page relocated.

efficiency, speed or 'cost performance'?

What's more, FAST VP clearly delivers better performance and quicker reaction times to changes, as I discussed in my immediately prior post. That's the "performance" part of the "cost performance" analysis that Hu referenced from ESG, Gartner and Wikibon.

Folks, I have to tell you that I have no idea how Hu calculates the "cost" part of the equation, but I suspect it uses algorithms based on Hitachi Math*. In my book, when Product H has to do more work and requires more memory to deliver a benefit that is clearly inferior to that delivered by Product E, then Product E is clearly the more cost-effective solution – no matter how much lipstick you try to put on Product H.

And Product H is, well, as they say: It is what it is…

End of story…

* Hitachi Math: a modernistic form of algebra that arrives at irreproducible results that also have the unique property of having absolutely no bearing on reality
 



about
the storage anarchist


Barry Burke

disclaimer

I am unabashedly an employee of EMC, but the opinions expressed here are entirely my own. I am a blogger who works at EMC, not an EMC blogger. This is my blog, and not EMC's. Content published here is not read or approved in advance by EMC and does not necessarily reflect the views and opinions of EMC.
