
February 24, 2011

3.021: spec sfs wars

I won't be the first to observe that most benchmarks are unrelated to anything anyone would see in the real world – no matter how hard we try, it borders on impossible to build truly representative artificial workloads. And even if we could, the second-order challenge with benchmarking in the IT world is that the desire to win leads us, all too often, to create test configurations that would never exist in the real world.

Yet it is a dominant gene of some sort that drives us carbon-based life forms to use such artificial comparisons as the basis of what we hope are informed decisions.

The two latest entries in the SPECsfs benchmark listings, both released within the past 24 hours or so, provide an interesting comparison. In this machine-vs-machine battle, we have IBM's SONAS pitted against EMC's new VNX platform, each boasting that it has shattered SPECsfs performance records.

Now, so far as I am aware, nobody runs SPECsfs as a production workload, so the direct meaning of these results is debatable. All they really tell us is which system configuration is better at running this specific benchmark – I personally would not want to extract anything more than that from the results.

But if you look closely at the results, there is perhaps another story…

First, take a look at this excerpt from the IBM SONAS SPECsfs results:

IBM Corporation : IBM Scale Out Network Attached Storage, Version 1.2
SPECsfs2008_nfs.v3 = 403326 Ops/Sec (Overall Response Time = 3.23 msec)

sfs2008-20110204-00176

Throughput (ops/sec)   Response (msec)
      42508                  1.5
      85093                  2.0
     127276                  2.4
     170410                  2.7
     213185                  3.1
     254902                  3.5
     298717                  4.1
     340996                  4.5
     384590                  5.8
     403326                 11.3

 

Next, the EMC VNX VG8 Gateway w/ EMC VNX5700 storage results:

EMC Corporation : EMC VNX VG8 Gateway/EMC VNX5700, 5 X-Blades (including 1 stdby)
SPECsfs2008_nfs.v3 = 497623 Ops/Sec (Overall Response Time = 0.96 msec)

sfs2008-20110207-00177

Throughput (ops/sec)   Response (msec)
      49030                  0.4
      98167                  0.4
     147368                  0.5
     196664                  0.7
     245819                  0.8
     295373                  0.9
     344974                  1.1
     394453                  1.3
     444426                  1.7
     497623                  3.3

 

On the surface, you see that IBM's product did just over 400,000 SPECsfs ops/second, while the EMC product did just under 500,000. So you'd probably assume that the EMC product is roughly 23% faster than IBM's.

But look a little closer, and focus on the Overall Response Time. That figure is the average response time across all of the observations, and it is an important SPECsfs metric – so important that the results call it out right in the header of each disclosure.

  • IBM SONAS average response time: 3.23 milliseconds
  • EMC VNX average response time: 0.96 milliseconds

Folks, that says that the EMC solution is more than 3 times faster than IBM's.
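
For the skeptics, here's a trivial sanity check of those two comparisons – a minimal sketch using nothing but the peak throughput and Overall Response Time figures quoted from the disclosures above:

```python
# A quick sanity check of the two headline comparisons, using nothing
# but the peak throughput and Overall Response Time figures from the
# SPECsfs2008 disclosures excerpted above.

ibm_peak_ops, ibm_ort_ms = 403326, 3.23   # IBM SONAS
emc_peak_ops, emc_ort_ms = 497623, 0.96   # EMC VNX VG8 / VNX5700

print(f"peak throughput ratio: {emc_peak_ops / ibm_peak_ops:.2f}x")   # ~1.23x
print(f"response time ratio:   {ibm_ort_ms / emc_ort_ms:.2f}x")       # ~3.36x
```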

but wait!

That's pretty amazing, but not totally unexpected, because the EMC config used almost 100% solid state drives which can deliver a pretty consistent response time of less than 1 millisecond (as little as 0.4 milliseconds in fact, as we see above), while IBM used 15K rpm hard disk drives which have an average response time of around 6msec under nominal workloads.

Apples vs. oranges, the naysayers will say, and I will not argue with that – if we wanted to have a fair comparison, we should ask IBM to rerun their SONAS product using all SSDs instead of the 1,475 HDDs they used.

But…we don't necessarily have to wait for them to do that (I doubt they ever will, actually).

insight

SPECsfs is a somewhat cache-friendly workload – that is, up to a certain point, most of the I/Os will be cache hits, especially for a cache-rich product like IBM's SONAS, which was configured with over 1.4 TERABYTES of DRAM cache, all told. Looking at the results, we can guess that most of the benchmark I/Os were being serviced out of cache, at least up until the next-to-last observation.

This is probably a safe guess, because we know that a 15K rpm disk drive's average response time under nominal load is about 6 msec. Since all but the last observation were LESS than 6 msec, a (steadily decreasing) majority of those I/Os were likely cache hits. But as soon as the benchmark working set exceeded the cache size, the workload swung over to being predominantly cache-miss; the I/O load presented to the HDDs went beyond "nominal", and the measured response time nearly doubled.
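
To make that reasoning a bit more concrete, here is a back-of-the-envelope sketch (my own simplification, not anything published by SPEC or IBM) that assumes a two-tier service time model – roughly 0.5 msec for a DRAM cache hit and the ~6 msec nominal HDD service time cited above – and solves for the cache-hit ratio implied by each observed response time:

```python
# A deliberately simple two-tier service time model (my own back-of-the-
# envelope assumption, not anything from the SPEC disclosures):
#
#   observed = hit_ratio * R_CACHE + (1 - hit_ratio) * R_DISK
#
# Solving for hit_ratio hints at how much of the benchmark had to be
# serviced from DRAM cache to produce each observed response time.

R_CACHE_MS = 0.5   # assumed service time for a DRAM cache hit
R_DISK_MS = 6.0    # nominal 15K rpm HDD service time cited above

def implied_hit_ratio(observed_ms: float) -> float:
    ratio = (R_DISK_MS - observed_ms) / (R_DISK_MS - R_CACHE_MS)
    return max(0.0, min(1.0, ratio))   # clamp to [0, 1]

# A few of the IBM SONAS observations from the table above (ops/sec, msec)
for ops, resp in [(42508, 1.5), (254902, 3.5), (384590, 5.8), (403326, 11.3)]:
    print(f"{ops:>7} ops/sec @ {resp:>5.1f} ms -> ~{implied_hit_ratio(resp):.0%} cache hits")
```

The model ignores queueing entirely (which is why the 11.3 msec observation bottoms out at zero), but it illustrates the trend: the lower the observed response time, the larger the fraction of I/Os that had to be coming out of DRAM.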

different architectures, different approaches

Now, the VNX is not a large-cache storage platform – so comparing it head-to-head against the cache-rich SONAS product isn't necessarily apples-to-apples either (did I forget to mention that DRAM is expensive?).

VNX does, however, natively support solid state flash drives, and SSDs support a much higher "nominal" workload – that's why the VNX configuration used only 436 SSDs vs. IBM's 1,475 HDDs. As a result, the benchmark working set doesn't have to fit into cache to deliver great response times on the VNX, and the gently rising slope of the VNX results shows response time increasing along a much more gradual curve.
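
A crude way to see the effect of that drive-count difference is to divide each configuration's peak throughput by the number of devices behind it – a rough illustration only, since it ignores RAID overhead and the I/Os absorbed by cache:

```python
# Crude per-device load at peak throughput, straight from the two
# disclosures above. This ignores RAID/parity overhead and the I/Os
# absorbed by DRAM cache, so treat it purely as an illustration.

configs = {
    "IBM SONAS (1,475 x 15K rpm HDD)": (403326, 1475),
    "EMC VNX (436 x SSD)":             (497623, 436),
}

for name, (peak_ops, drives) in configs.items():
    print(f"{name}: ~{peak_ops / drives:,.0f} SPECsfs ops/sec per drive")
```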

It's not apples to apples, but where IBM effectively throws cache at the problem, VNX uses flash. It's a tale of two architectures, each using its differentiated capabilities to deliver the best possible results.

What else would you expect?

in closing

I ask you to consider this: look back at how many IOPS each platform was able to support when delivering 3.3msec response time:

  • IBM SONAS: ~234,043 SPECsfs ops/sec (interpolated halfway between the 3.1 msec and 3.5 msec observations – see the sketch below)
  • EMC VNX: 497,623 SPECsfs ops/sec
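
The IBM figure is just a linear interpolation between the two bracketing observations in its results table (213,185 ops/sec at 3.1 msec and 254,902 ops/sec at 3.5 msec); a few lines reproduce it:

```python
# Linear interpolation of the IBM SONAS throughput at a 3.3 msec response
# time, using the two bracketing observations from its results table.
lo_ops, lo_ms = 213185, 3.1
hi_ops, hi_ms = 254902, 3.5

target_ms = 3.3
ibm_at_3_3 = lo_ops + (hi_ops - lo_ops) * (target_ms - lo_ms) / (hi_ms - lo_ms)
emc_at_3_3 = 497623   # the VNX delivered its peak throughput at 3.3 msec

print(f"IBM SONAS @ 3.3 ms: ~{ibm_at_3_3:,.1f} ops/sec")   # ~234,043.5
print(f"EMC VNX   @ 3.3 ms:  {emc_at_3_3:,} ops/sec")
print(f"advantage: ~{emc_at_3_3 / ibm_at_3_3:.1f}x")       # ~2.1x
```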

Remembering (as you must) that this is an artificial benchmark, run on entirely different hardware configurations with markedly different costs, and that the results bear no definitive correlation to any known real-world workload…can you answer this question:

Which product is faster?

 



Comments


Alex Miroshnichenko

All good points.
However... my biggest beef with SPECsfs – since the days it was called LADDIS! – is the absence of cost metrics.
I really would like to see a price-per-IOPS metric. I understand why it is politically hard to do, but TPC somehow managed to make it happen.

Mike Riley

Which is fastest? This one:
http://www.spec.org/sfs97r1/results/res2006q2/sfs97r1-20060522-00263.html.

Come on, guys. Don't bother submitting something that is twice as slow. When you reach 927,000 at 2.7ms, give us a call!

the storage anarchist

Umm...and just who do you think is going to believe that 1997 SPECsfs97_R1.v3 results are in any way comparable to 2008 SPECsfs2008_nfs.v3 results?

Come on, NTAP - you're going to have to do better than that!

Account Deleted

Coming from the world of clustered file systems, SPECsfs seems to be the standard benchmark that is most easily available – and the most widely abused benchmark of all time. Now, before the SPEC guys get upset with me: the problem isn't with the SPECsfs tool-set, the problem is with the vendors who use it (I am personally guilty of this vendor deception).

David666

Would it be fair to assume that at the lowest I/O rate all I/Os are served from cache? If this is the case, IBM's response time is 1.5 ms vs. VNX's 0.4 ms. So perhaps the correct answer is that – comparing apples to apples – the VNX is almost 4x faster.

Tomer Perry

Hmm,
Luckily, the SPEC report also shows the "fileset size" – which means how much data SPEC used.
In the IBM case, it's 49,805.3 GB.
Last time I checked, that's a bit bigger than 1.4 TB of cache.
So... the data can't be fully cached (that's the main purpose of SPECsfs – don't be cache friendly!!!)

the storage anarchist

Tomer -

Anyone who has run SPECsfs2008 knows that the actual working set of the benchmark is a fraction of the total fileset, and that indeed the benchmark is VERY cache friendly - if your cache is large enough, that is.

There is no other way that you could ever attain response times less than the 6ms nominal response time of a 15K rpm hard drive if cache were not playing a SIGNIFICANT role in buffering both reads AND writes...

Tomer Perry

So, in SPEC you have the create phase, which creates the entire fileset; the warmup, which gives one the chance to cache some data (assuming you couldn't cache it all during the create phase); and only then the actual run. So the chances that one would be able to cache the data are really small (and this is usually the case in SPEC, based on my experience).
So, data caching is rare. That's what I meant about SPEC trying to avoid DATA cache hits.
When it comes to metadata (which is most of the ops in SPEC), it's a different story. And since, AFAIR, only 30% or so are actual data ops (read/write), this is the part that will cause higher latency.
So, it might be the case that the read/write ops come in with 20 ms latency, but the other 70% of the workload gets a much better response time.
Unfortunately, the SPEC disclosure doesn't show latency per op type (you do have it in the SPEC run log).
Another piece of info that the SPEC disclosure doesn't reveal is pricing information...
So, how much does an SSD-based system cost? I guess a lot. But it will surely provide a much better response time on most ops.

Kaneda San

I understand what you're saying, but your ending tagline reminds me so much of Fox News: "We Report, You Decide!"

the storage anarchist

Tomer - thank you for making my point.

Kaneda San - Welcome to my blog. I try to mix it up here - sometimes expressing strong support for my company's products, sometimes pointing out the weaknesses of competitors' products/marketing/FUD, and sometimes just trying to enlighten readers with perspectives on relevant topics so that they (and you) can make better informed decisions. I hope you like it (not everyone does).


