
April 14, 2009

1.056: inside the virtual matrix architecture

This is the third in a series of posts on EMC's Overtake the Future launch of 14 April 2009.

The cornerstone of today’s Overtake the future launch is of course the new EMC Virtual Matrix Architecture, the foundation upon which the virtual data center will scale and thrive henceforth.

Combining the market-proven functionality that has made Symmetrix the World’s Most Trusted storage platform with the latest in industry standard compute and I/O technologies, the Virtual Matrix Architecture liberates the power of Symmetrix from the physical barriers of backplane-based monolithic storage arrays and redefines ease-of-use for storage in today’s increasingly virtualized data centers.

But while this new architecture is inarguably revolutionary in the world of storage, the Virtual Matrix is in fact born of a Darwin-esque evolution of the same Symmetrix architecture that launched the external storage market over 18 years ago. The result is the first storage architecture that integrates the performance and efficiency of traditional scale-up architectures with the cost-effective flexibility of scale-out, blurring the distinction between modular and monolithic while redefining the scope of scalable enterprise storage.

In this post I will explain the path that led EMC to the Virtual Matrix, and along the way I'll highlight several of the key features of this revolutionary new architecture.


global memory – the foundation of symmetrix, both old and new

The single most distinguishing feature of Symmetrix for the past 18+ years has been its global memory. In every Symmetrix, memory is a central shared resource that is accessible by every single processor and I/O stream in the system.

Over the years, the interconnect between memory and the I/O processors, and the way those processors communicate with each other, have both changed – but the operational utility of global memory hasn't. Write requests received on front-end ports are stored in global memory for the back-end disk directors to destage to disk, and host read requests are fulfilled by the disk directors placing payloads in global memory for the front-end directors to deliver back to the requestor.
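
To make that flow concrete, here's a minimal sketch of a global-memory-mediated I/O path – purely illustrative C, not Enginuity code, with every name, size and structure invented for the example:

    /* Illustrative sketch only -- not Enginuity code. Every name, size
     * and structure here is hypothetical, invented to show how global
     * memory mediates between front-end and back-end directors.        */
    #include <stdio.h>
    #include <string.h>

    #define SLOTS     8         /* hypothetical global-memory cache slots */
    #define SLOT_SIZE 64        /* hypothetical slot payload size (bytes) */

    struct gm_slot {            /* one slot of shared global memory       */
        int  valid, dirty;      /* cached? modified since last destage?   */
        long lba;               /* disk address this slot is caching      */
        char data[SLOT_SIZE];
    };

    static struct gm_slot global_memory[SLOTS]; /* the shared resource    */
    static char disk[SLOTS][SLOT_SIZE];         /* stand-in for the drives */

    /* Front-end director: a host write lands in global memory and can be
     * acknowledged immediately; a disk director destages it later.
     * (A real array would destage a dirty slot before reusing it.)       */
    void fe_write(long lba, const char *payload)
    {
        struct gm_slot *s = &global_memory[lba % SLOTS];
        s->valid = 1; s->dirty = 1; s->lba = lba;
        memcpy(s->data, payload, SLOT_SIZE);
    }

    /* Back-end disk director: sweep global memory, flushing dirty slots. */
    void be_destage(void)
    {
        for (int i = 0; i < SLOTS; i++)
            if (global_memory[i].valid && global_memory[i].dirty) {
                memcpy(disk[global_memory[i].lba % SLOTS],
                       global_memory[i].data, SLOT_SIZE);
                global_memory[i].dirty = 0;
            }
    }

    /* Front-end read: a cache hit is served straight from global memory;
     * on a miss, the disk director stages the block into global memory
     * and the front-end director delivers it from there.                 */
    void fe_read(long lba, char *out)
    {
        struct gm_slot *s = &global_memory[lba % SLOTS];
        if (!s->valid || s->lba != lba) {           /* cache miss          */
            memcpy(s->data, disk[lba % SLOTS], SLOT_SIZE);
            s->valid = 1; s->dirty = 0; s->lba = lba;
        }
        memcpy(out, s->data, SLOT_SIZE);            /* return via GM       */
    }

    int main(void)
    {
        char msg[SLOT_SIZE] = "hello, global memory";
        char buf[SLOT_SIZE];
        fe_write(42, msg);
        be_destage();
        fe_read(42, buf);
        printf("read back: %s\n", buf);
        return 0;
    }

The key property is that neither side talks to the other directly: the front-end directors only ever touch global memory, and the disk directors pick the work up from there.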

Early Symmetrix systems (before the DMX era) used a communications bus to transport data and inter-processor communications between processors and memory. The processing complexes themselves were separate and purpose-built – front-end SCSI, ESCON and later Fibre Channel directors connected to hosts, while first SCSI and later FC disk directors connected to the disks. Each of these directors presented different emulations, and over the years the directors evolved to be more like blade servers, with each “blade” supporting 2-4 independent CPU “slices”, each able to transfer data to and from central memory over the backplane.

In 2003, EMC introduced the first significant architectural change to Symmetrix since its birth – the Direct Matrix Architecture. While still maintaining the front <—> memory <—> back implementation, the bus interconnect of the prior generations was replaced with the Direct Matrix: dedicated I/O transports between each front- and back-end director and the global memory directors. These dedicated paths eliminated the contention bottlenecks of the previous bus architecture, allowing the Symmetrix DMX series to deliver levels of performance and capacity scalability that have still not been equaled by any other high-end storage array. Today, no other high-end array supports as many disk drives, and only recently did one competitor match the amount of global memory supported by the Symmetrix DMX-4.

As you should expect, the Virtual Matrix Architecture is also structured around the central resource of global memory – but the implementation is radically different from prior generations. Before I explain how it differs, though, let's discuss some of the reasons for changing things in the first place.

beyond the backplane

Over the past decade, competitors have portrayed Symmetrix as “monolithic,” especially those seeking to differentiate from the fixed-frame disk complexes that Symmetrix employed up until the introduction of the DMX3 in 2005. Pre-DMX3, every Symmetrix came in a limited set of sizes – the 5th-generation Symmetrix 8000 series was available in 96-drive and 384-drive packages, for example. DMX1 & DMX2 were slightly better in that there were four different sizes (DMX800, DMX1000, DMX2000 and DMX3000), but you still couldn't grow from one size to the next as your needs changed.

The DMX3 and DMX4 changed all that. Customers can start with the 2-disk-director DMX4-1500 and grow all the way up to the full-blown 4-disk-director DMX4500, supporting up to 2,400 drives. Back-end performance scales up as disk directors are added, and the customer's investment is protected as needs change.

But in talking with customers, even now, one quickly realizes that incremental growth alone isn't enough – customers also need more flexibility in how they deploy their storage across the data center.

In addition, many of EMC's largest customers needed even larger configurations than the DMX4 could offer – in fact, many of them run multiple independent DMX4s to meet their performance SLAs where they would prefer to have only one.

Unfortunately, the laws of physics get in the way of stretching signals across a physical backplane, and the DMX4 is already at the practical limits of today's technology. To scale Symmetrix even larger, it had to move beyond the limitations of a passive backplane architecture.

So the Virtual Matrix was born of two key customer requirements: support ever-larger-capacity storage arrays, and remove the requirement that all the bays of the array be physically adjacent.

And though switching to optical (glass) interconnects to the storage bays might solve the latter, that is an expensive alternative which unfortunately does nothing to increase the scale limits of the backplane.

And there was one more significant motivator behind the Virtual Matrix:

those costly dis-integrated directors

In the early days of Symmetrix, it was generally necessary for every computer vendor to build its own processor complexes. Though Intel was dominating the desktop/laptop market, these were the (end of the) glory days of DEC, Prime, Wang, Data General, HP, Sun, Honeywell, Bull, Apollo, Stratus (et al.), and almost every one of them had its own proprietary processors. Those that didn't used “standard” CPUs and surrounded them with custom logic designs.

Symmetrix was just a big I/O server, so it naturally followed suit: Symmetrix arrays through today's DMX-4 were built with a healthy helping of custom hardware, most of it centered on accelerating data movement through the system and on continuous error checking to ensure data integrity. This logic was purpose-built for the task each director would support, and the design worked well: you could mix and match front-end directors to get the connectivity you needed, you could add memory directors to improve cache hit rates, and you could scale the back-end to deliver more cache-miss IOPS if that was what you needed.

But the downside of this purpose-built approach wasn't just that you could only put so many director blades on the backplane. The bigger challenge was that the design required unique, custom hardware for every generation: every director had to be built up from the chips, every interconnect invented and implemented from the ground up. And in recent years the gap between “custom” and “industry standard” closed, leaving little leverage in custom hardware.

Back before even the first DMX came to market (the DMX1000 and DMX2000 were launched in February 2003), Symmetrix engineers were drafting proposals to integrate the functions of the front-end, back-end and memory controllers onto a single controller form factor. That idea evolved into the notion of switching processors from the now-niche PowerPC CPUs from IBM/Motorola, with their largely custom infrastructure, to the far more widely used Intel x86/IA64 processors and their lower-cost industry-standard infrastructures. And by virtue of changing processor complexes, the new “unified director” could more readily integrate standard I/O cards, memory DIMMs and interconnects, becoming more cost-efficient and agile as new processors and components were introduced. The engineers just needed to figure out how to connect several of these new unified directors together.

Another benefit of putting the memory local to the CPU within the unified directors was that data transfers were no longer limited to the speed of the bus or the Direct Matrix – I/O could be received and sent with far lower latency and overhead than in prior Symmetrix generations. But embedding the memory in the unified director posed a new challenge: the memory was now “local” to the processors, while Symmetrix and the Enginuity storage OS were built on the tried-and-true foundation of “global” memory.

enter the virtual matrix

Summarizing the requirements, the next architectural evolution of Symmetrix had some pretty tough targets:

  • memory must be globally accessible
  • get the maximum performance benefit of local memory access
  • maintain/extend the incremental scalability of the DMX3/DMX4 (add directors to scale)
  • leverage a unified director built on industry-standard components for simplicity and flexibility
  • accommodate future scaling to ever-larger system images
  • allow for distributing cabinets around the data center
  • minimize the impact on Enginuity software
  • deliver all this without compromising reliability or availability

Simple, huh?

Well, the solution did in fact turn out to be rather straightforward.

In essence, the Virtual Matrix Architecture “virtualizes” processor access to memory. That is, the code treats access to remote memory exactly the same as access to local memory. The trick is that a small layer of software, assisted by EMC custom hardware logic (an ASIC on the Virtual Matrix Interface), presents any location in memory as “local” to the processor complexes. If the target memory locations are indeed local, then access is direct, at memory-bus speeds. If not, the request is packetized and parallelized to be sent over the Virtual Matrix Interconnect fabric to the Virtual Matrix Interface on the director that owns the specified memory target.
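
For illustration, here's a minimal sketch of that dispatch in C – my own NUMA-style approximation, not the actual ASIC or Enginuity logic, with all names and constants hypothetical:

    /* Illustrative sketch only: a NUMA-style dispatch in the spirit of
     * the Virtual Matrix Interface. The real logic lives in an EMC ASIC
     * and is not public; every name and constant here is hypothetical.   */
    #include <stdio.h>
    #include <stdint.h>

    #define NODES      2
    #define NODE_WORDS 4

    /* Each director owns a private bank of DIMMs; together the banks
     * form the one "global" memory that Enginuity expects to see.        */
    static uint64_t bank[NODES][NODE_WORDS];
    static int      local_node = 0;  /* the node this code is running on  */

    /* A global address is (owning node, offset within that node's bank). */
    struct gm_addr { int node; int offset; };

    /* Stand-in for packetizing a request onto the interconnect fabric
     * and waiting for the owning node's interface to service it.         */
    static uint64_t fabric_remote_read(struct gm_addr a)
    {
        printf("  [fabric] read request routed to node %d\n", a.node);
        return bank[a.node][a.offset];
    }

    /* The virtualized access: a local target goes straight to the memory
     * bus; a remote target transparently takes the fabric path. The
     * caller cannot tell the difference -- that is the whole point.      */
    uint64_t gm_read(struct gm_addr a)
    {
        if (a.node == local_node)
            return bank[a.node][a.offset];    /* direct, memory-bus speed */
        return fabric_remote_read(a);         /* packetized over fabric   */
    }

    int main(void)
    {
        bank[0][1] = 111;                     /* a word this node owns    */
        bank[1][2] = 222;                     /* a word a peer node owns  */

        printf("local  read -> %llu\n",
               (unsigned long long)gm_read((struct gm_addr){0, 1}));
        printf("remote read -> %llu\n",
               (unsigned long long)gm_read((struct gm_addr){1, 2}));
        return 0;
    }

The caller of gm_read() can't tell which path was taken – which is precisely what lets Enginuity keep treating memory as global even though it is physically distributed.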

For the curious, the first generation of the Symmetrix V-Max uses two active-active, non-blocking, serial RapidIO v1.3-compliant private networks as the inter-node Virtual Matrix Interconnect, supporting up to 2.5 GB/s of full-duplex data transfer per connection – each “director” has two connections, and thus each “engine” has four in the first-gen V-Max.
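
Taking those figures at face value, here's a quick back-of-the-envelope calculation of aggregate fabric bandwidth – note that the 8-engine maximum is my assumption from the V-Max launch materials, not something stated above:

    /* Back-of-the-envelope fabric bandwidth from the figures above.
     * NOTE: the 8-engine maximum is an assumption from the V-Max launch
     * materials -- it is not stated in this post.                        */
    #include <stdio.h>

    int main(void)
    {
        const double gb_per_link       = 2.5; /* GB/s full duplex, per post */
        const int links_per_director   = 2;   /* per post                   */
        const int directors_per_engine = 2;   /* per post                   */
        const int engines              = 8;   /* ASSUMED max configuration  */

        double per_engine = gb_per_link * links_per_director
                                        * directors_per_engine;
        printf("per engine: %.1f GB/s; %d engines: %.1f GB/s aggregate\n",
               per_engine, engines, per_engine * engines);
        return 0;
    }

That works out to 10 GB/s per engine and 80 GB/s across a full system under these assumptions – treat it strictly as an order-of-magnitude sketch, since how you count the two directions of a full-duplex link changes the headline number.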

Why RapidIO, you ask? Primarily because it is non-blocking, low-latency, high-bandwidth, parallel and cost-efficient – RapidIO has been used in a broad range of embedded applications, from MRI systems to military fighter jets. You can learn more about RapidIO at www.RapidIO.org.

a matrix designed for the future

In order to keep up with the exponential growth projections of both storage and virtualized server complexes, the Virtual Matrix Architecture is designed to scale well beyond the limits of the first-generation Symmetrix V-Max. In fact, the gen-1 Virtual Matrix Interconnect could easily address more than 256 nodes. And the Architecture doesn't limit the fabric to two RapidIO networks: it could be 4 or 8 RapidIO networks running in parallel, or it could be built on a different infrastructure altogether – InfiniBand, FCoE/DCE, or whatever comes along in the coming years.
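
One way to picture that transport independence – strictly my own illustration, not EMC's design – is an interconnect layer bound to an interface rather than to RapidIO itself:

    /* Strictly an illustration, not EMC's design: the point is that the
     * matrix binds to a transport-neutral interface, not to RapidIO.     */
    #include <stdio.h>

    struct fabric_ops {          /* what the matrix needs from any        */
        const char *name;        /* underlying transport                  */
        int (*send)(int node, const void *pkt, int len);
    };

    static int rapidio_send(int node, const void *pkt, int len)
    {
        (void)pkt;               /* a real transport would serialize here */
        printf("RapidIO   : %d bytes to node %d\n", len, node);
        return 0;
    }

    static int infiniband_send(int node, const void *pkt, int len)
    {
        (void)pkt;
        printf("InfiniBand: %d bytes to node %d\n", len, node);
        return 0;
    }

    int main(void)
    {
        /* Swapping transports never touches the memory-virtualization
         * logic above this layer -- only this table changes.             */
        struct fabric_ops fabrics[] = {
            { "RapidIO",    rapidio_send    },
            { "InfiniBand", infiniband_send },
        };
        char pkt[64] = {0};
        for (int i = 0; i < 2; i++)
            fabrics[i].send(7, pkt, (int)sizeof pkt);
        return 0;
    }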

More importantly, on top of the two dimensions of memory access that the Virtual Matrix Interface implements today (direct to local memory, and over the fabric to memory in peer nodes), the Architecture allows for a third dimension of interconnect: a connection between different V-Max systems. This interconnect would not necessarily extend shared memory across all the nodes of two (or more) separate V-Max arrays, but it would allow multiple V-Max arrays to perform high-speed data transfers and even redirect I/O requests between different Symmetrix V-Max “virtual partitions.” This capability will be leveraged in the future to “federate” different generations of V-Max arrays in order to scale to even greater capacities and performance, and it will also be used to simplify technology refreshes. In the future, you'll be able to federate a new V-Max with the one on your floor and non-disruptively relocate workloads, data and host I/O ports.

It ain’t magic…but it’s close.
 



Comments


TimC

You're right, it isn't magic, it's NUMA. C'mon, almost magic?

It's funny, all the companies you proclaimed the death of above were doing this in the 90's. I guess because they died you can claim you re-invented it?

TimC

Ok, so I'll stop being debbie downer for a minute. I didn't see it in the spec sheets, are these the new nehalem xeon's? If not, back to the drawing boards! (OK, I guess that was a bit of a downer) :)

Oh, and where's the FCOE? This doesn't appear to be fitting in to the cisco datacenter of the future unified fabric yadda yadda yadda vision very well!

Nigel

Great post Barry, finally some interesting detail. I'm beginning to get a picture now and it looks impressive (prior to the detail I was struggling to see the big deal).

Looking like a good platform built for the future, a solid foundation for building for the future, making the future look exciting and surely changing the playing field.

No questions that this is no longer Moshe's Symmetrix ;-)
