2.002: meh – ibm really, really doesn't get flash
Someone sent me this today:
Word to the wise, though – if you don't understand something, don't blog about it as if you do.
I've repeatedly tried to get IBM's Tony Pearson to understand this over the years, and he just keeps making the same mistakes. He probably despises me as much as he does that other blogger with the same first name, because whenever he slips up, I'm usually there to correct him before his misinformation gets any traction.
Be sure to take the time to read the comments, and you'll see that TonyP clearly didn't take the time to understand the STEC ZeusIOPS drive or its wear-leveling algorithms. As a result, he pretty much embarrasses himself and his employer (not to mention the IBM Distinguished Engineers he throws under the bus) in the process.
At least he didn't try to drag Master Scientist BarryW down with him!
So, knowing that TonyP wouldn't dare to actually do the math for his readers, I will…
hey tony! here are the answers to the quiz!
Using the architectural definitions and modeling tools for the STEC ZeusIOPS wear-leveling algorithms, and assuming that the SLC NAND flash will tolerate exactly 100,000 Program/Erase (P/E) cycles, the math says the following. Exposed to a 100% 8KB write workload with 0% internal cache hit at a constant arrival rate of 5,000 IOs per second, the latest version of the 256GB (raw) STEC ZeusIOPS drive will wear out below its rated usable capacity in 4.92 years when configured at 200GB, and in 8.91 years when configured at 146GB (yeah, I was off by .08 years).
Unfortunately, I cannot share the actual data or spreadsheet used to compute these numbers because they contain STEC proprietary information about their architecture and wear-leveling algorithms. So you'll have to trust me on this, and trust that IBM and EMC are in fact using the same STEC drives with the identical wear-leveling algorithms, just formatted at different capacities.
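For readers who want something to play with in the meantime, here is a rough, non-proprietary sanity check. To be clear, this is NOT the STEC model: it is a naive upper bound that assumes perfect wear leveling, a write amplification of 1.0 unless you say otherwise, and decimal gigabytes. The function name and its parameters are made up for illustration. Because it ignores formatted capacity entirely, it cannot reproduce the 4.92 and 8.91 year figures above; it only shows which inputs matter.

```python
# Naive NAND endurance bound. An illustrative sketch only, not STEC's
# proprietary wear-leveling model.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def naive_wearout_years(raw_bytes, pe_cycles, iops, io_bytes,
                        write_fraction=1.0, write_amplification=1.0):
    """Years until every NAND cell has used up its P/E budget.

    Assumes perfectly even wear across all raw capacity (an assumption,
    real drives fall short of this).
    """
    if not 0 < write_fraction <= 1:
        raise ValueError("write_fraction must be in (0, 1]")
    if write_amplification < 1:
        raise ValueError("write_amplification cannot be below 1")
    # Total host bytes the NAND can absorb before the P/E budget is gone.
    total_writable = raw_bytes * pe_cycles / write_amplification
    # Host bytes written per second by the workload.
    write_rate = iops * io_bytes * write_fraction
    return total_writable / write_rate / SECONDS_PER_YEAR


# Workload from the text: 256GB raw, 100,000 P/E cycles, 5,000 IOPS of 8KB writes.
ideal_years = naive_wearout_years(256e9, 100_000, 5000, 8 * 1024)
# Same workload with a hypothetical write amplification of 2.0.
amplified_years = naive_wearout_years(256e9, 100_000, 5000, 8 * 1024,
                                      write_amplification=2.0)
print(f"ideal: {ideal_years:.1f} years, WA=2.0: {amplified_years:.1f} years")
```

Doubling the write amplification halves the bound, which is the whole reason buffering and write alignment matter.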
At a 50/50 read/write mix, the projected life of the drive is 9.84 years @ 200GB and 17.8 years @ 146GB. And for what TonyP asserts is the "traditional business workload" (70% read / 30% write), the projected life expectancy is a healthy 16 years @ 200GB and 30 years @ 146GB.
Now, that's long enough for the drives to be downright ancient. More likely, they will have been replaced with newer, faster technology long before they are even halfway through their P/E life expectancy under those conditions.
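The mixed-workload figures look consistent with simply dividing the 100% write life by the fraction of IOs that are writes, on the assumption that reads cause no meaningful wear and the total IO rate stays constant. That scaling rule is an inference from the published numbers, not something taken from the STEC model, and the names below are hypothetical. A minimal sketch:

```python
# Projected life at 100% write, in years, keyed by formatted capacity in GB.
# These two values are the figures quoted in the text.
FULL_WRITE_LIFE_YEARS = {200: 4.92, 146: 8.91}


def projected_life_years(full_write_life_years, write_fraction):
    """Scale a 100%-write life to a mixed workload.

    Assumes wear is proportional to the share of IOs that are writes
    (an assumption inferred from the quoted numbers).
    """
    if not 0 < write_fraction <= 1:
        raise ValueError("write_fraction must be in (0, 1]")
    return full_write_life_years / write_fraction


for capacity_gb, base_years in FULL_WRITE_LIFE_YEARS.items():
    for write_fraction in (1.0, 0.5, 0.3):
        years = projected_life_years(base_years, write_fraction)
        print(f"{capacity_gb}GB @ {write_fraction:.0%} write: {years:.2f} years")
```

At 30% write this gives roughly 16.4 and 29.7 years, which round to the 16 and 30 quoted above.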
So in the Real World that we all actually live in, nothing is ever 100% write – even database logs (which are not recommended for Flash drives) will not typically generate a 100% constant write workload at max drive IOPS. And the current generation of SLC NAND has been observed to easily exceed 100,000 P/E cycles, so even the above numbers are extremely conservative.
No, the truth is, the difference between the projected life at 146GB and 200GB on a 256GB (raw) ZeusIOPS is truly insignificant...and your data is no more at risk for the expected life of the drive either way.
Unless, of course, your array can't adequately buffer writes, or frequently writes blocks smaller than 8KB, either of which will drive up the write amplification factor...two issues I suspect the DS8K in fact suffers from. Which, of course, would explain why IBM's Distinguished Engineers wouldn't want to take the risk with the DS8K. They don't get to be DEs by leaving things to chance, to be sure.
Symmetrix, on the other hand, isn't subject to these risk factors. Writes are more deeply buffered and delayed by the larger write cache of Symmetrix (DS8K is limited to 4GB or 8GB of non-volatile write cache vs. 80% of 256GB on DMX4 and 80% of 512GB on V-Max). Symmetrix writes are always aligned to the ZeusIOPS' logical page size to minimize write amplification, and the P/E cycles experienced by the NAND in the drive are proactively monitored to enable pre-emptive replacement should a drive exhibit premature or runaway wear-out.
Not so the DS8K, apparently…hence the conservative approach.
But don't be fooled – the deficiencies of the DS8K mean you will pay more Dollars per Usable GB of SSD on a DS8K than for Symmetrix DMX4 or V-Max EFDs.
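The capacity side of that claim is easy to check. Assuming, purely for illustration, that the same physical drive costs the same on either array (the price below is a made-up placeholder, not a real quote from anyone), the cost per usable GB depends only on how the drive is formatted:

```python
def cost_per_usable_gb(drive_price, usable_gb):
    """Dollars per usable GB for a single drive."""
    if usable_gb <= 0:
        raise ValueError("usable_gb must be positive")
    return drive_price / usable_gb


# Hypothetical per-drive price; it cancels out of the ratio below.
HYPOTHETICAL_DRIVE_PRICE = 10_000.0

at_146 = cost_per_usable_gb(HYPOTHETICAL_DRIVE_PRICE, 146)
at_200 = cost_per_usable_gb(HYPOTHETICAL_DRIVE_PRICE, 200)
premium = at_146 / at_200  # equals 200/146 whatever the price

print(f"146GB format costs {premium:.2f}x as much per usable GB as 200GB")
```

Under that equal-price assumption, the 146GB format works out to about 37% more per usable GB. Actual pricing will vary by vendor and deal, so treat this as the arithmetic only.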
Oh – and for the record, TonyP, I don't think I ever said EMC was using newer or different EFDs than IBM. I just asserted that EMC knows more than IBM about these EFDs and how they actually work in a storage array under real-world workloads. Thus, EMC is able to ship drives configured with more usable capacity per device without increasing the risks to customer data.
See, while IBM was playing catch-up,
EMC DID THE MATH!