0.002: storage virtualization: naming gone awry
Naming a product or capability is probably one of the hardest things to do well, especially in high tech where you try to strike a balance between descriptive and inspirational monikers. As one of my mentors oft noted, "All the good names are already taken; the trick is to find the available one that stinks the least" (rest in peace, JH).
I think we, as an industry, failed miserably with the term "storage virtualization."
Here's why:
Shall we play a game?
No, not Global Thermonuclear War. Or even Tic-Tac-Toe. (Already been done.)
Let's play: One of these things is not like the others. One of these things just doesn't belong.
- Virtual Memory
- Server Virtualization
- Storage Virtualization
- Virtual Reality
- None of the Above
Give up? You might be surprised...
when "virtual" isn't
OK. Remember that this is a high-tech, storage-centric blog.
In my book, the correct answer is #3.
The definition of "virtual" in the context of information technology that I cut my teeth on was along the lines of "temporarily simulated or extended by computer software." As in:
- Virtual Memory: Simulating you have more memory than there really is.
- Server Virtualization: Simulating multiple dedicated servers within a single physical server (timesharing on steroids)
- Virtual Reality: Simulating an environment that doesn't actually exist
- Storage Virtualization: Simulating you can store more than the physical storage you have installed could actually hold.
No wait: - not! -
With Storage Virtualization, there isn't any mention of making a little bit of storage look like a lot more storage. We have Fake Memory. And Artificial Servers. And Imaginary Playgrounds like World of Warcraft and Second Life where there are whole parallel realities.
But Virtual Storage is an absolute misnomer. When you buy it, you don't suddenly get access to more storage than you own. Your storage admins still have to manually re-allocate storage to meet growing demands. Sure, the technology makes it easier to move data around to new physical storage, but you don't get the impression of having more storage than you really do!
The term "Storage Virtualization" just doesn't belong.
Heck, the two biggest storage virtualization pundits came close to agreeing with this over the last week. Mark Lewis asserted that in-band virtualization is really just Volume Management under a different moniker. Hu Yoshida retorted that storage virtualization goes beyond volume pooling, and then listed as examples all the things that external storage arrays have delivered on top of "virtual LUNs" since their inception (replication, protection, migration, etc.).
Fact is, they're both (at least partially) correct. What we (the industry) have described as Storage Virtualization is really just good-ol' logical volume management - nothing more than the I/O redirection layer pioneered by the likes of Veritas. Yeah sure, today there are additional services connected to these logical volumes that didn't exist a decade ago, and you can now have logical volumes hidden behind other logical volume managers (behind even other logical volume managers).
But the descriptive name is still just a volume manager, no matter how you twist it.
Of course, Hu missed (sidestepped) Mark's point that the USP "controller" is merely acting as an in-band volume manager, asserting that Hitachi has no aspirations to be SAN based. This even though you can't read anything about the USP without reference to the "crossbar switch," "millions of (meaningless) IOPS" and "virtual ports" - all lifted directly from the taxonomy of SAN switches and directors.
And Mark eloquently (and predictably) argued that true virtualization belonged in the network, not on an array-that-would-be-an-I/O-redirector, even though most customers probably don't want to buy more hardware to solve a problem that has been demonstrated to be effectively solved already (on some scale, at least) with array-based and network-based in-band solutions.
But for those of us playing at home, neither of them really explained why "virtual" doesn't mean the same thing when associated with Storage that it does when attached to Servers, Memory or Reality.
what's in a name?
Oh, I understand that "virtualization" is inspirational, but that's only half the objective of naming. It's also important that the name actually describes the product or service being provided (remember the Chevy Nova urban folklore?).
Why can't we all just agree to call volume management what it really is? Whether the implementation is a host-based logical volume manager (LVM), a SAN (-based) volume controller (SVC), or an array-based ultra-special volume manager (UVM), they all do essentially the same thing. In fact, they do the same thing that storage arrays already do - they present a logical view of physical storage that masks (and simplifies) the physical implementation. And they do it for the same reasons - so that the hosts, networks and storage admins don't have to deal with the down-and-dirty physical management (and protection) of spinning rust.
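To make the "logical view of physical storage" concrete, here is a minimal Python sketch of the I/O redirection a volume manager performs. The names (Extent, LogicalVolume, diskA, diskB) are hypothetical and not taken from any product. It only illustrates the idea that a logical block address is redirected to a physical one, and that the logical size never exceeds the physical capacity behind it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    device: str   # hypothetical physical device name
    start: int    # first physical block of this extent
    length: int   # number of blocks in this extent


class LogicalVolume:
    """Concatenates physical extents into one logical block address space."""

    def __init__(self, extents):
        self.extents = list(extents)

    @property
    def size(self):
        # The logical size is exactly the sum of the physical extents.
        return sum(e.length for e in self.extents)

    def resolve(self, logical_block):
        """Redirect a logical block number to (device, physical block)."""
        if not 0 <= logical_block < self.size:
            raise IndexError("block outside the logical volume")
        offset = logical_block
        for extent in self.extents:
            if offset < extent.length:
                return extent.device, extent.start + offset
            offset -= extent.length
        raise AssertionError("unreachable: size check guarantees a match")


# Two physical extents presented to the host as one 80-block volume.
vol = LogicalVolume([Extent("diskA", 100, 50), Extent("diskB", 0, 30)])
print(vol.size)          # 80
print(vol.resolve(50))   # ('diskB', 0)
```

The host sees one volume and the mapping hides where the blocks live, but the host is never shown more than the 80 blocks that physically exist.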
And I agree - in-band and out-of-band simply aren't the issue for volume managers; they're just another implementation choice.
No, the only thing that really differs between implementations is the number of touch points - one on every host vs. at least one on each network vs. perhaps fewer if implemented on the array (if the array can scale big enough, that is). And just to be clear - today there really isn't any virtualization (oops) Volume Management device scalable enough to virtualize even just ONE of today's high-end storage arrays, much less the dozens and hundreds that exist in many of the world's IT shops.
But none of the volume managers make a little bit of storage look like a LOT of storage, or anything like storage that isn't actually there.
[It's] a strange game [where] the only winning move ... is not to play.
maybe we got "thin provisioning" wrong, too.
Oddly enough, the technologies that DO provide a perception of storage that doesn't really exist DON'T use any derivatives of "virtual" - when in fact, they probably should. Think about it: "thin provisioning" actually belongs with the others - it gives hosts the impression they have access to far more storage than has been physically allocated.
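For contrast with a plain volume manager, here is a rough Python sketch of thin provisioning. The class and attribute names (ThinVolume, advertised_blocks, pool_free) are made up for illustration and do not reflect any vendor's implementation. The assumption is simply that physical blocks are taken from a shared pool only on first write.

```python
class ThinVolume:
    """Advertises more blocks than are physically backed (illustrative only)."""

    def __init__(self, advertised_blocks, pool_blocks):
        self.advertised_blocks = advertised_blocks  # what the host is told
        self.pool_free = pool_blocks                # what physically exists
        self.mapping = {}                           # logical block -> data

    def _check(self, block):
        if not 0 <= block < self.advertised_blocks:
            raise IndexError("block outside the advertised volume")

    def write(self, block, data):
        self._check(block)
        if block not in self.mapping:
            # Physical space is consumed only on the first write to a block.
            if self.pool_free == 0:
                raise RuntimeError("physical pool exhausted")
            self.pool_free -= 1
        self.mapping[block] = data

    def read(self, block):
        self._check(block)
        # Blocks never written read back as a zero placeholder.
        return self.mapping.get(block, b"\x00")

    @property
    def allocated(self):
        return len(self.mapping)


# The host sees 1000 blocks; only 10 physical blocks back them.
thin = ThinVolume(advertised_blocks=1000, pool_blocks=10)
thin.write(999, b"x")
print(thin.advertised_blocks, thin.allocated, thin.pool_free)  # 1000 1 9
```

Here the host really is given the impression of more storage than exists, which is the sense of "virtual" used for memory and servers.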
And consider compressed file systems, data deduplication and RDE (redundant data elimination) - by compressing out duplicate (redundant) blocks of data, these technologies can fit terabytes of data on a fraction of the real storage. (Another mentor often jokes that de-dup really only needs storage for a "1" and a "0" - everything else can be compressed down to an ordered pattern of those two bits ;^).
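The same effect can be sketched for block-level deduplication. This is a toy Python example under stated assumptions: fixed-size blocks, SHA-256 digests as block identities, and hypothetical names (DedupStore, put, get). Real products differ in chunking, hashing and collision handling.

```python
import hashlib


class DedupStore:
    """Stores each distinct block once and keeps a per-file list of digests."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = {}   # digest -> block bytes (the physical storage)
        self.files = {}    # file name -> ordered list of digests

    def put(self, name, data):
        digests = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # A duplicate block is not stored again, only referenced.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])

    def physical_bytes(self):
        return sum(len(b) for b in self.blocks.values())


store = DedupStore(block_size=4)
payload = b"ABCD" * 100          # 400 logical bytes, one distinct block
store.put("report", payload)
print(len(payload), store.physical_bytes())  # 400 4
```

In this contrived case 400 logical bytes occupy 4 physical bytes plus the digest list, so the store appears to hold far more than it physically does.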
Maybe it's not too late to correct our mistake - maybe we should admit our mistake, apologize to the market, and reclaim the name "storage virtualization" for the appropriate technology. And change the "virtualization wars" back into the "volume management" wars that they really have been these past few years.
I think we owe it to our customers and the market watchers to pick names that stink the least. We can do better.
How about a nice game of chess?


Barry, it's too late. The ship sailed a long time ago on this one. But that doesn't stop some from trying to redefine terms on a whim as some seem to think (I've been trying to maintain some order in this wild wild west on my blog: http://www.equallogic.com/blog/default.aspx?id=2816). If we continue to be undisciplined with the language we use in this business eventually we'll need Vulcan mind-meld technology in order to have a conversation.
I see your point on thin provisioning. It's mapping technology, just as striping, aggregation and subdivision are. But this ship sailed already too. I think the best we can do is to talk about thin provisioning as a form of storage virtualization.
Posted by: MarcFarley | May 11, 2007 at 01:03 PM