
August 23, 2007

0.031: inside tiered storage - part 2 (options)

Part 2 of a planned 4-part series exploring the concepts and implementation of tiered storage. If you missed it, you should probably read part 1 (definitions) first.

Several weeks ago I was invited to sit in on a "Peer Incite" conference call with the folks behind Wikibon.org. The topic had been advertised as a peer review of EMC's 7/07 announcements, but the topic leader (Peter Burris - or was it David Vellante?) chose instead to focus the discussion on the implications of EMC's announcement that the Symmetrix DMX-4 would be the first high-end storage array to offer native support for SATA-II disk drives, and specifically the 750GB devices. (You can read the collective results of that conference call here.)

One interesting aspect of this discussion was the clarity offered around the differing approaches for implementing tiered storage. I personally thought this one of the more valuable parts of the discussion, but it seemingly was not included in the posted summary. In fact, it was that omission that initiated the idea for this series in the first place (admittedly, I had initially thought I'd be able to cover the topic in a single post, but I couldn't manage to pull that off).

So in this chapter I'll explore the four different options for implementing tiered storage, as discussed on that original call.

Of course, I'll add a little of my own color along the way.

And maybe even a surprise ending...

four ways to implement tiered storage

(Unless the Open Systems Storage Guy corrects me again, that is.)

On the call, four partially-overlapping alternatives were discussed for implementing tiered storage.

  1. Use separate storage arrays (devices), each optimized for the tier they support
     
  2. Use a "virtualization" engine (hardware or software) to front-end two or more tier-optimized storage arrays
     
  3. Use one tier-optimized storage array that can also front-end other (external) tier-optimized storage arrays for the other tiers
     
  4. Use a storage array optimized to simultaneously support multiple tiers within a single array

Now, practically speaking, only the first approach fully encompasses all of the Tiers I defined back in part 1. I am not aware of any single virtualization offering that effectively integrates on-line, active archive, near-line and tape into a single unified infrastructure, nor one that concurrently supports the varied interfaces of block SCSI/FC/FICON/iSCSI (disk), NFS/CIFS (file), CAS (object), and block SCSI/ESCON/FC (tape).

OK - I take that back. The exception to that assertion is probably a zSeries mainframe - from any mainframe application perspective, all attached storage is equally and readily accessible, be it tape, disk or even core memory. Would that we could do the same in the open systems/Windows world - right?

So there - no angry cards and letters from the mainframe constituency, OK?

So right from the outset, I'll assert that anyone with any open systems/Windows hosts and whose information storage requirements demand all of the tiers I've described will have no choice but to employ Strategy 1 for at least part of their storage infrastructure (no matter which definition of Tiers you choose to use).

That said, let's look a little closer at each of these approaches:

separate tier-optimized arrays

This is really where it all started, although perhaps more out of necessity than strategy, per se.

Some will likely argue this point, but Symmetrix was really the first practical tier-optimized external storage device - most will agree that the product, if not the company and the technology, actually created the external storage market. Back when Symmetrix was first introduced, IBM held a virtual strangle-hold on the mainframe storage market.

Until EMC challenged that with the introduction of a plug-compatible, channel-attached, low-cost (!) external RAID device for IBM mainframes, that is.

Yup, even before SRDF and later TimeFinder, the sheer cost advantage of Symmetrix allowed mainframe users to store data more cost effectively than when their only choice for storage was IBM. And the lower cost allowed for more to be stored on-line, instead of on tape...improving performance and scale while growing the total addressable market at the same time.

Fast forward 15 years, and we're faced with new dynamics that are forcing more and more information to be stored on-line. But today, not everyone can justify the expense of a Symmetrix (or a USPV) for all of their data. Nor do they necessarily need all the capabilities of a high-end block storage mega-array for all of their data, or perhaps even for any of their data.

So whether driven to reduce costs or to meet specific capabilities, the tier-optimized approach affords customers solutions that match almost perfectly with their requirements. Using a CAS device for archiving and compliance and a NAS device with low-cost drives for general-purpose file share optimizes cost for function - different horses for different courses, so to speak.

EMC wisely recognized this trend back in the 1990s, and once they realized that the technology cost curve wasn't going to enable Symmetrix to scale down, they acquired CLARiiON to serve the (then) mid-tier market. And during the first half of this decade, CLARiiON arrays fulfilled the "second tier" application needs in most Symmetrix customer shops. And of course, competitors followed: IBM addressed the mid-tier through a series of false starts and ultimately by OEM'ing smaller vendors' midrange products, while HP & Sun combined Hitachi high-end gear with their own mid-tier products. And Hitachi added its own mid-tier products, although they weren't picked up by either of their primary distribution partners.

If you expand the discussion to include NAS, CAS and VTL in addition to block, I think only EMC offers the full lineup of products. Well, I guess IBM has a WORM thing and Sun has this Thumper-based concoction that they each pass off as CAS. And Hitachi doesn't offer a VTL as far as I can tell. And while NetApp might offer support for everything I've mentioned, the limited scale of their kit will inevitably drive you to require multiple boxes, equating pretty much to "tier optimized" for each application.

virtualization in front of separate tier-optimized arrays

The second approach for implementing tiered storage is to put a virtualization engine (with no storage of its own) in front of the tier-optimized storage arrays. Generally speaking, the goal of this approach is to insulate the hosts and applications from the physical location of the data volumes (a subject we'll explore more in part 3 of this series). Additionally, the virtualization engine may include additional storage services, such as the ability to migrate/relocate data without host or application integration, to make local and/or remote replicas of data volumes, and perhaps even thin provisioning.

Products in this arena fall into two camps, and no - I'm not talking about the "in-band" vs. "out-of-band" debate (I'm going to try and avoid that subject altogether). To me, the more significant differentiation is between block-based and file-based virtualization engines.

In the block-based camp, we have products like EMC's Invista and IBM's SAN Volume Controller (SVC). Although their implementations are radically different, they both basically do the same thing - redirect SCSI block-level raw I/O requests transparently to one or more storage devices. Hosts see LUNs that are created by the virtualization engine, and not the ones on the storage devices themselves, effectively creating a platform-neutral view of storage capacity.
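To make the redirection idea a bit more concrete, here is a minimal Python sketch of the kind of mapping table a block-based engine might keep. It is not how Invista or SVC actually work internally; the class names, the extent layout and the array names are all invented for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """A contiguous run of blocks on a back-end LUN (hypothetical layout)."""
    virtual_start: int   # first virtual block covered by this extent
    length: int          # number of blocks in the extent
    array: str           # back-end array name (illustrative only)
    backend_lun: int     # LUN number on that array
    backend_start: int   # first block on the back-end LUN


class VirtualLun:
    """Host-visible LUN whose blocks live on one or more back-end arrays."""

    def __init__(self, extents):
        self.extents = sorted(extents, key=lambda e: e.virtual_start)
        self._starts = [e.virtual_start for e in self.extents]

    def resolve(self, block):
        """Translate a virtual block address into (array, LUN, block)."""
        i = bisect_right(self._starts, block) - 1
        if i < 0:
            raise ValueError(f"block {block} is not mapped")
        ext = self.extents[i]
        offset = block - ext.virtual_start
        if offset >= ext.length:
            raise ValueError(f"block {block} is not mapped")
        return ext.array, ext.backend_lun, ext.backend_start + offset


# One host-visible LUN spread across two (made-up) tier-optimized arrays.
vlun = VirtualLun([
    Extent(0, 1000, "tier1-array", 7, 5000),
    Extent(1000, 4000, "tier2-array", 3, 0),
])
print(vlun.resolve(1500))
```

The host only ever addresses the virtual LUN; in this toy model, moving an extent to a different array means changing a table entry rather than touching the host configuration.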

By the way, IBM's Barry Whyte is in the middle of a rather comprehensive explanation of block-based storage virtualization over on his blog, and I encourage you to follow along. Be sure not to miss all the related bits: part 1, part 2 and part 3 are up already, and I expect there is more still to come.

On the other hand, the file-based storage virtualization offerings present "network shares," typically using NAS networked file sharing protocols such as NFS or CIFS. Celerra and NetApp are two products that come to mind. For example, you could put both a Symmetrix and a CLARiiON behind a Celerra engine, and implement tiered storage that way. Adding Rainfinity to the environment makes it even easier to distribute and relocate files, directories and volumes across different classes of storage by hiding the server & share names from the hosts, effectively creating one huge namespace for all the shares in the environment.
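The "one huge namespace" idea can be sketched in a few lines of Python. This is only an illustration of the concept, not Rainfinity's or anyone else's design; the class, the logical paths and the server and share names are all hypothetical.

```python
class GlobalNamespace:
    """Maps logical paths to (server, share) pairs; all names are made up."""

    def __init__(self):
        self._mounts = {}  # logical prefix -> (server, share)

    def map(self, prefix, server, share):
        """Attach (or re-point) a logical prefix to a physical share."""
        self._mounts[prefix.rstrip("/")] = (server, share)

    def resolve(self, path):
        """Return (server, share, relative path) for the longest matching prefix."""
        best = None
        for prefix in self._mounts:
            if path == prefix or path.startswith(prefix + "/"):
                if best is None or len(prefix) > len(best):
                    best = prefix
        if best is None:
            raise KeyError(path)
        server, share = self._mounts[best]
        return server, share, path[len(best):].lstrip("/")


ns = GlobalNamespace()
ns.map("/corp/projects", "nas-fast", "proj")
ns.map("/corp/projects/archive", "nas-cheap", "arch")
print(ns.resolve("/corp/projects/archive/2005/plan.doc"))
```

Hosts keep using the logical path; relocating a directory to a cheaper tier amounts to another `map` call for its prefix, assuming the data itself has been copied first.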

tier 1 array virtualizing external tier-optimized arrays

The third alternative is the one most commonly associated with (and actively promoted by) Hitachi and Mr. T - a.k.a. controller-based virtualization. With this (typically block-based) approach, in addition to the included internal disk drives, the storage array also "front-ends" other external storage arrays by connecting additional ports to the other storage's front-end I/O ports. Storage LUNs on the external arrays can be presented as "pass through," where the hosts see the same geometry as they would if directly connected to the external array (but accessed via a different WWN). Or the capacity can be presented "in bulk" to the (aptly named) controller (primary array) and then segmented and reassigned to create sub-devices for the hosts. The hosts only deal with the named SCSI targets on the primary array, insulating them from name/target/location changes that may occur - allowing LUNs to be readily (and often non-disruptively) moved from one physical array to another (should you have a need to do that - and be willing to accept the associated risks).
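The "in bulk" mode described above is essentially capacity pooling followed by carving. Here is a toy Python sketch of that bookkeeping, assuming a simple first-fit allocator; it does not reflect how UVM (or any real controller) manages external capacity, and every name in it is invented.

```python
class ExternalPool:
    """Toy pool of capacity presented 'in bulk' by external arrays."""

    def __init__(self):
        self.free = []  # list of (array, lun, start_gb, size_gb) regions

    def add_external_lun(self, array, lun, size_gb):
        """Register an external LUN's full capacity as free space."""
        self.free.append((array, lun, 0, size_gb))

    def carve(self, size_gb):
        """Carve a sub-device from the first free region large enough."""
        for i, (array, lun, start, size) in enumerate(self.free):
            if size >= size_gb:
                remaining = size - size_gb
                if remaining:
                    self.free[i] = (array, lun, start + size_gb, remaining)
                else:
                    del self.free[i]
                return (array, lun, start, size_gb)
        raise ValueError("not enough contiguous external capacity")


pool = ExternalPool()
pool.add_external_lun("old-array", 0, 500)   # hypothetical 500GB external LUN
backup_dev = pool.carve(200)                 # sub-device presented to a host
print(backup_dev)
```

A pass-through LUN, by contrast, would need no carving at all: the controller would simply re-present the external LUN's geometry under its own WWN.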

People are probably most familiar with Hitachi's implementation of this approach as the Universal Volume Manager (UVM) on the USP, NSC and USPV platforms (note that the diskless NSC technically belongs under the prior category, but I chose to mention it here because its implementation is identical to the USP/USPV). With the exception that this approach employs storage natively supported within the controller, though, it's practically the same as the prior one anyway (I'll explore some of the subtle differences in the next chapter on challenges).

Once you cut through all the hype, the fact is this approach is really intended to deliver a solution justified by the Tier 1 storage requirements, with the added bonus of being able to use other (old) disk storage arrays for light-duty Tier 2 applications, such as near-line backups. Sure, you could in fact constantly run I/Os through UVM to the tier 2 stuff behind the USP, but most agree that the more practical application of this technology is either said backups, or to migrate data on a one-way trip into the USPV. (Insert Vampire and Trojan Horse observations here).

array optimized to support multiple tiers

Finally, and as I've contrasted before (in my DMX-4 overview), is the in-the-box tiered storage approach enabled by the Symmetrix DMX family - an approach unique to the DMX, in the high-end anyway (several mid-tier arrays, including EMC's own CLARiiON, also support a wide variety of disk drive speeds & technologies within the same array). But Symmetrix was inarguably the first high-end storage platform to support the 500GB low-cost fibre channel drives, and the DMX-4 remains the only high-end array with announced support for SATA-based storage (500GB at GA, 750GB by year's end).

Importantly, this in-the-box approach requires more than just support for fat+slow+cheap disk drives in order to deliver tiered storage. For example, the system needs to provide a means for segmenting resources so as to insulate the "tier 1" applications from the other tiers. Performance optimizations (such as I discussed here) have to be adjusted to compensate for slower rotational speeds and transfer rates. And best practices have to be defined to help people understand the implications of (for example) trying to use a RAID 6 7200rpm volume as the SRDF/S R2 for a mirrored 15Krpm R1 (hint - rarely a good idea).

The fundamental differentiating idea behind tiering within a single device is to reduce cost, complexity and overhead as compared to any solution that requires multiple storage arrays, not to mention additional hardware to support virtualization. Ideally, adding low-cost SATA drives to an existing Tier 1 storage platform should cost no more than purchasing a separate SATA-only storage device. In fact, the 5-year TCO should be lower through incremental cost avoidance:

  • You don't have to buy additional ports, CPUs, cache and possibly FC switches just to connect the virtualization device to the separate storage devices, saving both CAPEX (acquisition) and OPEX costs (electricity, floorspace, cooling, maintenance/service contracts, etc.)
     
  • You don't have to pay higher maintenance/service costs for storage no longer under warranty
     
  • You have fewer different things to manage and operate
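The cost-avoidance argument in the list above boils down to simple arithmetic, sketched here in Python. The figures are arbitrary placeholder units chosen only to show the shape of the comparison; they are not real prices from EMC or anyone else, and the function name is made up.

```python
def five_year_cost(capex, annual_opex, years=5):
    """Acquisition cost plus operating cost over the period (arbitrary units)."""
    return capex + annual_opex * years


# Placeholder assumption: the SATA capacity itself costs the same either way.
sata_capacity = five_year_cost(capex=100, annual_opex=10)

# Placeholder assumption: the separate-array route also needs extra ports,
# switches and a virtualization layer, each with its own upkeep.
connect_overhead = five_year_cost(capex=30, annual_opex=8)

in_the_box = sata_capacity
separate_plus_virtualization = sata_capacity + connect_overhead

print(in_the_box, separate_plus_virtualization)
```

With these made-up inputs the in-the-box option comes out lower simply because the connection overhead term disappears; real numbers would obviously have to come from actual quotes.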

In addition, you reduce your data integrity risks, but I'll defer that discussion until chapter 3 as well.

and one more makes five

OK - I promised a surprise ending...and here it is. Fact is, there's (at least) one more option - combining an in-the-box tiered storage array (like the DMX) with the ability to hang other tiers off the back (as does the USP/USPV).

In his post entitled Choices, SATA and a touch of DMX-4, Nigel Mackem (over at RupturedMonkey) challenged both EMC and Hitachi to meet in the middle - presenting the customer case for Hitachi to add SATA support to the USP, and for EMC to add external storage virtualization to the DMX.

OK. Done. And EMC's been shipping it for over 3 years already.

well, sorta kinda

See, here's the thing. Back when we introduced 5671 code (Q1, 2005), we also introduced a new product called Open Replicator for Symmetrix (ORfS). When I describe it, it sounds a lot like Hitachi's UVM - ORfS has a mode (called "Live Migrator") that allows you to insert a Symmetrix in between your current host and storage. You take front-end FA ports on the Symmetrix, connect them to the front-end ports of any external array (directly or via a switch), map the existing LUN(s) to the Symmetrix, and reconfigure the host to talk to the Symmetrix's WWN instead of the external storage's. Turn it all on, and the host is running all its I/Os against the DMX.

The "sorta kinda" part is that we built this first (and intentionally) as a data migration tool - you actually have to create a target set of LUNs on the DMX's internal storage; ORfS/LM's objective is to migrate (transparently) the data off of the old storage and into the DMX - all with only the initial disruption to retarget the host to the new WWN (the exact same disruption, by the way, that you'll be forced to take with Hitachi's UVM for every volume you want to intercept with your USP).

Now, the interesting observation is that while it's running, ORfS/LM works pretty much the exact same way as does UVM: read misses are resolved by regenerating I/O read requests against the external storage, and writes are synchronously mirrored to both the old and the new destinations. We do this just in case something happens to disrupt the migration - you always know your "original" copy is also "current," even though your "new" may not yet be "complete."
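The read-miss and mirrored-write behavior described above can be modeled in a few lines of Python. This is a toy model of the general technique only, with dictionaries standing in for volumes; it is not ORfS code, and the class and method names are invented.

```python
class LiveMigration:
    """Toy model of a host-transparent migration from old to new storage."""

    def __init__(self, old_blocks):
        self.old = dict(old_blocks)   # external (source) storage
        self.new = {}                 # internal (target) storage

    def read(self, block):
        """Serve from the target if migrated, otherwise go to the old storage."""
        if block in self.new:
            return self.new[block]
        return self.old[block]        # read miss: re-issue against the source

    def write(self, block, data):
        """Mirror every write so the original copy always stays current."""
        self.old[block] = data
        self.new[block] = data

    def copy_next(self):
        """Background-copy one not-yet-migrated block; False when none remain."""
        for block in self.old:
            if block not in self.new:
                self.new[block] = self.old[block]
                return True
        return False

    @property
    def complete(self):
        return self.old.keys() <= self.new.keys()


m = LiveMigration({0: "a", 1: "b"})
m.write(1, "B")          # lands on both copies
print(m.read(0), m.complete)
```

If the migration is interrupted at any point in this model, `m.old` still holds every write the host has made, which is the "original is also current" property described above.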

Oh - and ORfS also has a secondary utility - the ability to push or pull a replica of a Symmetrix volume to/from 3rd party storage - vendor-neutral remote replication. This proves to be a very useful utility in tiered storage environments, because it makes it easy to make copies of Tier "1" data onto Tier "2" storage devices (ORfS can actually track and push incremental updates, helpful if you want to keep the Tier 2 replicas in sync with the primary).
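Incremental updates of the sort mentioned above are commonly built on change tracking: remember which tracks were written since the last push and send only those. A minimal Python sketch of that general idea follows; it is an assumption about the technique, not a description of how ORfS tracks changes, and all names are hypothetical.

```python
class IncrementalPush:
    """Tracks changed tracks on a source volume and pushes only those."""

    def __init__(self, source):
        self.source = source          # dict: track -> data (the Tier 1 volume)
        self.dirty = set(source)      # everything is dirty before the first push

    def write(self, track, data):
        """Host write: update the source and remember the track changed."""
        self.source[track] = data
        self.dirty.add(track)

    def push(self, replica):
        """Copy only changed tracks to the replica; return how many were sent."""
        sent = len(self.dirty)
        for track in self.dirty:
            replica[track] = self.source[track]
        self.dirty.clear()
        return sent


tier1 = {0: "a", 1: "b", 2: "c"}
tier2_replica = {}
pusher = IncrementalPush(tier1)
first = pusher.push(tier2_replica)    # full copy the first time
pusher.write(1, "B")
second = pusher.push(tier2_replica)   # only the changed track this time
print(first, second)
```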

that's two out of three use cases

Admittedly, ORfS doesn't enable everything that UVM does - it doesn't support the full-time pass-through of I/Os to external storage.

But understand, our plans have always been to natively support lower cost storage devices within the DMX - and we did it before anyone else. And because we knew we could offer ATA-class storage in the DMX at a competitive price, we just didn't see a practical use case for permanently routing I/O through the array. More importantly, given the inherent latency and data integrity issues with regenerating I/O requests instead of merely redirecting them, we acknowledged that Invista was the more appropriate, reliable and practical way to address customers with that need.

But ORfS does cover the migrate in (vampire) use case, and it can be used to make external backups of Symmetrix devices onto cheaper storage. We can argue the utility of the third use case (oh, and we will, I'm sure). But just because we didn't tack the "virtualization" label on Open Replicator doesn't mean that EMC has been ignoring this market requirement altogether. Indeed, while Invista is our Virtualization offering, we have also invested in addressing many of the same use cases (and challenges) that virtualization can solve across all of our products.

ttfn!

I think I'll climb down off the soap box right there for now, and go gear myself up for the "challenges" chapter. Given where Barry Whyte has gotten with his deep-dive on virtualization, this should get interesting Real Soon Now.



Comments


Thanks for pointing out the Wikibon call. Sorry the tiered storage summary was difficult to locate-- here it is - 'EMC: Fewer tears for tiered storage':

http://www.wikibon.org/EMC:_Fewer_tears_for_tiered_storage

I plan to update the Wikibon definitions with the excellent inputs of this blog.

A better surprise ending would've been YottaYotta's block-based virtualization engine, the NetStorager GSX 3000. Its controller-based approach supports heterogeneous back-end arrays without compromising the performance of single arrays with multiple tiers or requiring redundant storage.

Large enterprises such as AOL are already using YottaYotta clusters to address their storage tiering needs:

http://searchstorage.techtarget.com/originalContent/0,289142,sid5_gci1257509,00.html


about
the storage anarchist

 
Barry Burke

disclaimer

I am unabashedly an employee of EMC, but the opinions expressed here are entirely my own. I am a blogger who works at EMC, not an EMC blogger. This is my blog, and not EMC's. Content published here is not read or approved in advance by EMC and does not necessarily reflect the views and opinions of EMC.
