0.012: a terabyte isn't enough for my home
Just read Hu Yoshida's brief blurb about how he thinks enterprise-class storage obviates the need for storage in the home.
NOT!
As I noted in my introductory post to this blog, I have multiple terabytes of storage in my home. Today. Let's see - a quick run-down:
| HD TiVo Series 3 | 750GB |
| TiVo Series 2 | 600GB |
| TiVo Series 2 | 250GB |
| MP3 File Server | 500GB |
| Laptop 1 | 500GB |
| Laptop 2 | 40GB |
| Desktop | 360GB |
| Portable USB Drive | 300GB |
That's 3.3 TERABYTES of storage! And that's just what's spinning at my home on a regular basis. On top of that there's an old Linux PC I boot up from time to time (to upgrade hard drives in my TiVos), a 40GB Creative Nomad 3 portable MP3 player, my 8GB iPod Nano, and the 2+GB of CompactFlash drives for my Canon Digital SLR. And not to forget my "remote replicas" - the 300GB MP3 file server and 250GB TiVo Series 2 that live at the beach house I share with my cousins.
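For anyone who wants to check my math, here's a quick sketch that adds up the table. The capacities are copied from the list above; the dictionary keys are just labels I made up to keep the two Series 2 boxes apart, and I'm counting in decimal terabytes (1 TB = 1000 GB), the way drive vendors do.

```python
# Capacities in GB, copied from the table above.
# The labels are illustrative only; the two TiVo Series 2 units are
# distinguished by their size so the dictionary keys stay unique.
devices = {
    "HD TiVo Series 3": 750,
    "TiVo Series 2 (600GB)": 600,
    "TiVo Series 2 (250GB)": 250,
    "MP3 File Server": 500,
    "Laptop 1": 500,
    "Laptop 2": 40,
    "Desktop": 360,
    "Portable USB Drive": 300,
}

total_gb = sum(devices.values())
total_tb = total_gb / 1000  # decimal terabytes (assumption: 1 TB = 1000 GB)

print(f"{total_gb} GB = {total_tb:.1f} TB")
```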
While this is some serious storage, I'm sure my list isn't the largest among my readership (I know one guy who has a whole SGI file server in his basement). So for anyone to suggest that enterprise-class storage on the other end of some magical unlimited-bandwidth data link is going to obviate the need for storage in the house?
I seriously think not.
Let me explain why...
i really need a household storage appliance
In fact, I'm convinced we will ALL soon need home storage appliances, simply because we (the home-bound consumer) are probably the largest creator AND consumer of digital information. Like I discussed in my article on the Exabyte-enterprise, the vast majority of impending growth in storage is coming from "rich media" stuff like digital photos, music and video, be they stuff we create or simply stuff we collect.
And given that we're going to have all this new "stuff", we gotta find a place to keep our stuff (BTW - if you haven't seen George Carlin's "Stuff" skit, you should definitely catch it - it's just as funny if you think "digital" instead of "physical" every time he mentions "Stuff"...see it here on Google Video or here on YouTube).
Anyway, we need a place to store our stuff, and I firmly believe that means we need a storage solution especially adapted for the home. And I think it has to be a special-purpose "storage appliance" for the same reasons that there are SOHO, mid-tier and enterprise-class storage today - one size simply cannot fit all.
While I surely don't want to have a USP or a DMX in my house, I would immediately welcome an appliance that could centralize and consolidate all the different storage that is currently scattered around my house. And for the same reasons that enterprises moved away from storage embedded inside their servers and into external storage arrays in the first place: consolidation, availability and scalability.
Consolidation to improve my utilization - today NONE of the above devices are constantly 100% full, but when I go away for a week, my HD TiVo inevitably fills up with all kinds of neat HD stuff it finds as Suggestions. It's a pity that the extra space on my MP3 server can't be used as a temporary storage for those travel weeks.
Availability for more efficient protection of my assets. I have at least 3 copies of my MP3 collection (175GB each), sort of a poor-man's mirroring. That has saved me from losing my entire library a few times, and I use the portable drive to keep the library at the beach house in sync with the one at home. I have similar backups of my photos, videos and documents. I'd prefer the protection of RAID 5 or RAID 6, plus just one external copy of my most critical data.
Scalability is a big one, and if you've read my other articles, you know it's one of the hardest things to get right. But my own personal data repository is growing perhaps as much as 50% a year as well, if only with new music and photos. I need a solution that makes it easy to expand my capacity, WITHOUT having to rip-and-replace drives. And if I could avoid the hours it takes to migrate onto a new drive, I'd appreciate that - especially when my HD TiVo has to be off-line for the length of the migration (which means I can't do any upgrades during Heroes or any of my wife's favorite shows - my service windows might actually be shorter than those of some enterprises).
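To put that 50%-a-year figure in perspective, here's a rough sketch of what compound growth does to a home repository. The 1 TB starting point and the five-year horizon are hypothetical round numbers I picked for illustration, not measurements; only the 50% rate comes from the paragraph above.

```python
def projected_capacity(start_tb, annual_growth, years):
    """Capacity needed at the end of each year, with year 0 as the start.

    Simple compound growth: each year's footprint is the previous
    year's footprint multiplied by (1 + annual_growth).
    """
    return [start_tb * (1 + annual_growth) ** y for y in range(years + 1)]


# Assumptions: a hypothetical 1 TB of personal data today, growing 50% a year.
projection = projected_capacity(start_tb=1.0, annual_growth=0.50, years=5)

for year, tb in enumerate(projection):
    print(f"year {year}: {tb:.2f} TB")
```

At that rate the footprint doubles in under two years, and the hypothetical 1 TB becomes roughly 7.6 TB after five. That's why I want capacity I can add incrementally instead of a drive swap every time I run out.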
And why not just put it out in a Google- or Yahoo-drive, like Hu suggests?
Well, I can think of at least a few reasons why I'd prefer to keep my data (at least my primary copy of it) within the boundaries of my own home:
Privacy. It's mine, nobody else has the right to see it, and I don't want to ever have to worry that someone hacked into the "public" repository and helped themselves to my music, home movies or pictures of my children. And not that I have anything to hide, but my lawyer and I would prefer that the search warrant for my data be served on me, and not executed through some loophole requiring a third party to turn over my data.
Actually, that's probably excuse enough. But I have more...
Availability. Read the fine print on Google Mail or any of the other "free" storage repositories, and you'll notice that these repositories actually don't guarantee that your data is available 24/7/365. Nor do they guarantee to protect your data from irretrievable loss or damage, or even to attempt to recover it if it disappears. (I could make a side comment here about how little Hitachi's Availability Guarantee actually guarantees, but...oh, I guess I just did).
Backups. Related, but I'd like to know there are backups being made. Maybe onto a DVD I drop into the mail, or maybe using an on-line backup service. And I'd like to choose what gets backed up - maybe we don't need to waste the space for backing up episodes of Desperate Housewives, but I can assure you I don't want to have to rip my CD collection all over again.
Performance. Last I checked, the pipe to my house wasn't unlimited. Or cheap. Digital video to my HD TiVo consumes more download bandwidth than I have available upload bandwidth on my broadband cable, and Verizon's FIOS isn't any better (and isn't available to either of my homes, either).
Security. Kinda related to #1, and I won't claim that my firewall is necessarily bullet-proof. But I'm pretty sure that hackers are more likely to target YouTube than Barry's Home File Server.
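On the Performance point, a back-of-the-envelope sketch shows why the pipe matters. The 175GB figure is the size of my MP3 library mentioned earlier; the uplink speeds are hypothetical values chosen for illustration (plug in your own), and the calculation ignores protocol overhead, so real transfers would be slower.

```python
def transfer_days(size_gb, link_mbit_per_s):
    """Days needed to move size_gb (decimal GB) over a link of the given speed.

    Ignores protocol overhead and assumes the link is fully dedicated
    to the transfer, so this is a best case.
    """
    bits = size_gb * 1e9 * 8
    seconds = bits / (link_mbit_per_s * 1e6)
    return seconds / 86400  # seconds per day


# 175 GB is the MP3 library size from the post.
# The uplink speeds below are hypothetical examples, not measurements.
for uplink in (0.5, 1, 5):
    print(f"{uplink} Mbit/s uplink: {transfer_days(175, uplink):.1f} days")
```

Even at a steady 1 Mbit/s upstream, pushing that one library to a remote repository would tie up the link for a little over two weeks, and that's before any video.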
they're (almost) here!
Now, the good news is that vendors are starting to recognize that there actually is an emerging Terabyte-in-the-home market, and products are on the horizon that could solve our needs. And things have come a long way since the days of the (hacked) single-drive LinkSys NSLU2 that I currently use as my file servers. A few weeks ago the DROBO popped up - interesting, but maybe a little too dumb for my needs. Microsoft is beta testing Windows Home Server (read about it here on Paul Thurrott's site), and there will obviously be several new hardware platforms for that solution - but honestly, I don't think they've actually hit the mark yet either.
See, the Terabyte Home needs more than just a simple file server that doesn't require a PhD to operate. We need the storage appliance to deliver a basic set of services and capabilities to cover all the data in our home.
For me, the requirements would look something like this:
- Source-agnostic: Ability to replace ALL my embedded storage, not just the stuff I need for my PC. That means that TiVo and cable boxes and PlayStations and XBox360's and Wii's all have to be supported. Tricky that, but it's what I need. Importantly, the challenges, especially the Digital Rights Management, will have to be solved whether I'm using a home storage appliance or some centralized repository.
- Client-agnostic: I'd like to be able to use a SINGLE copy of my MP3's to support any and all media playing devices. My iPod insists on getting music only via iTunes; my AudioTron can play anything off of a CIFS share, while AppleTV wants DAAP servers and XBox supports only a special flavor of UPNP server. I've *almost* got my hacked NSLU2's covering everything - a commercial replacement has to work out of the box.
- Protocol-agnostic: 1Gb/10Gb Ethernet, USB-2, FireWire and eSATA so I can hook it up to anything in my house. NFS, CIFS, iSCSI, DAAP, UPNP, HTTP, FTP so I can share files & data, both within my house and externally.
- Scalable performance - my TiVos record 2-4 HD data streams 24/7 (mostly collecting stuff I *might* like to watch, if and when I have the time to veg-out in front of the DLP). On top of that, I need enough bandwidth and IOPs to handle 2-3 viewing streams (for the kids), an MP3 stream or two, downloads of HD video content (in preparation for the looming demise of physical DVD rentals), backups, iPod copies, video editing, (legal) CD&DVD ripping, and whatever else that might join the home data manipulation arsenal in the coming years.
- Integrated support for both local and remote backups, including remote copies to either another appliance, or (more likely) to a centralized service provider. Let me control the security and encryption keys so I'm reasonably sure nobody else can see my data, and I'll be good.
- Remote replication - I need a better way to keep my beach house MP3 library in sync with my home library. Incremental, changes-only resyncs initiated from either side are mandatory, so I can download and/or rip new music from either location.
- Secure Access Controls - to keep hackers (and my neighbors) out of my stuff. If I want to share anything with friends and family, I will indeed probably upload it somewhere, but I may also want to host my own private sharing server. So I'll need safe and simple security/access controls/user authentication for my circle of friends and family.
- Data Encryption - inevitably I'll want to add my own layer of encryption on my data, just in case someone breaks into my house and steals my appliance.
- De-dup and compression - also inevitably I won't want to pay for, power or cool any more storage than I have to. The ideal solution will automatically reduce my storage to the barest minimum required, power down drives when they're not being used, and generally be as Green as possible.
- Thin Provisioning. Yeah, since Windows and Linux are dragging their feet towards supported dynamic resizing of volumes, I'll definitely want to take advantage of storage-based thin provisioning, just to maximize future flexibility and reduce the guesswork about how much space I'll need for my photos next year.
- And all the rest - Hot sparing, Hot code loads, Non-disruptive Migrations onto new drives. RAID 6. Incrementally scalable capacity and performance. No vendor lock-in for disk drives. Support for an external UPS. Brownout ride-through. Scalable global memory/cache. Flash SSD support.
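The "incremental, changes-only resync initiated from either side" item in the list above can be sketched in a few lines. This is only an illustration of the idea, not how any shipping product does it: the function name, the manifest format (file path mapped to a content hash) and the sample file names are all hypothetical, and anything ambiguous, including deletions, is simply flagged for a human to sort out.

```python
def plan_sync(home, beach, last_synced):
    """Return (copy_to_beach, copy_to_home, conflicts) as sorted lists of paths.

    Each argument maps a file path to a content hash. last_synced is the
    manifest both sites agreed on at the end of the previous sync, which
    lets us tell which side changed a file.
    """
    to_beach, to_home, conflicts = [], [], []
    for path in sorted(set(home) | set(beach)):
        h, b, base = home.get(path), beach.get(path), last_synced.get(path)
        if h == b:
            continue                 # identical on both sides, nothing to send
        if b == base and h is not None:
            to_beach.append(path)    # only home changed or added the file
        elif h == base and b is not None:
            to_home.append(path)     # only the beach house changed or added it
        else:
            conflicts.append(path)   # both sides changed, or a deletion
    return to_beach, to_home, conflicts


# Hypothetical manifests: hashes are placeholders, not real digests.
last = {"a.mp3": "h1", "b.mp3": "h2"}
home = {"a.mp3": "h1", "b.mp3": "h2-retagged", "c.mp3": "h3"}
beach = {"a.mp3": "h1", "b.mp3": "h2", "d.mp3": "h4"}

to_beach, to_home, conflicts = plan_sync(home, beach, last)
print("send to beach house:", to_beach)
print("send to home:", to_home)
print("needs attention:", conflicts)
```

Only the files that differ ever cross the wire, and it doesn't matter which end kicks off the sync, which is the whole point when new music can get ripped at either house.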
Interesting - review the above list of requirements, and it sounds an awful lot like what our enterprise customers are asking for (at a larger scale). I guess it's not really all that surprising though, since both the Terabyte-Home and the Exabyte-Enterprise are operating under similarly limited budgets, and neither can count on having sufficient PhD's around to make all this stuff Just Work.
Yeah - I know this isn't all going to happen overnight. And I'm pretty sure that a certain level of Storage Anarchy and probably some more TWINE are going to be required (see last week's Strategies... entry for the background on these code names). But despite Hu's assertion, I seriously doubt that today's enterprise storage arrays alone are going to adequately address either the Exa-prise or the Tera-home.
At least not at a price we can all afford.
-----
Update: I wouldn't normally do this, but Marc Farley's been having a little fun with an off-shoot of this discussion over at his blog, and it's definitely worth checking out!


How do you feel about Solaris? How about a multi-terabyte server running NAS using ZFS as a backend? It seems to me that you would meet *most* of your criteria for a home setup - compression, scalability, backups (snapshots), data fault protection, hot spares, mirroring, etc. A lot of features you want are already in ZFS.
Couple that with the fact that Solaris (or OpenSolaris) is free from SUN and ZFS costs nothing and you have an enticing home solution for a technical user (not quite easy enough for anyone to set up yet :P).
Posted by: Lee Hinman | June 21, 2007 at 04:49 PM
Some assembly required - hardly what I'd call an appliance. I'm technical, but the mass market needs a turn-key solution that Just Works, I think.
Thanks for the suggestion, though.
Posted by: the storage anarchist | June 21, 2007 at 08:57 PM
Excellent post. Certainly have come to similar conclusions. How can we make use of all that storage as well as maintain backups of the important bits?
Posted by: Michael Fox | July 11, 2007 at 07:40 PM
Another thing people probably don't consider is to future proof things. You can buy a decent NAS box now, but I think spares for such an item in 5 years might be difficult. Especially if the PSU dies and you have no access to the unit to get the data off.
Friends have considered getting a decent hardware raid pci-e card, but the problem I have with this is that the card itself might fail in years to come, thus the data on the drives cannot be accessed.
I am leaning towards software raid on something like linux, since it can be moved from system to system and still work. So it's not tied to any one thing. And processors are fast enough now that the bottleneck of software raid is probably not that bad anyways. I'd feel happy knowing I can migrate the drives across systems as things die and no longer can be replaced.
Posted by: Michael Fox | July 14, 2007 at 12:25 AM
Barry - as someone who also has multiple TB of home storage, I concur with your sentiments, but one thing that you missed that I really want for my home storage is ILM-light.
I don't necessarily want to pay for backing up >ALL< my data. If I lose the gigs of pictures I have from my Halloween party, brother's wedding, etc... I will survive. I don't want to lose them so RAID6 is a good idea, but I don't need to be schlepping them around the universe.
However, I want my Quicken and other sensitive files to be backed up in multiple places (hopefully something solid state locally as well as remote and encrypted) and some intermediate tier that is backed up, but might take a while to restore.
End users need to be able to intelligently specify what types of data are important to them (I know some friends who have a meltdown if even one of the multiple thousands of pictures of their 1-year old is lost) and have it tiered accordingly.
Just my $0.02
C
Posted by: Colin Gallagher | July 18, 2007 at 02:57 PM
Check out Infrant's ReadyNAS NV+ (they're a division of Netgear now I think):
http://www.infrant.com/products/products_details.php?name=ReadyNAS%20NVPlus
It supports up to 4x750 GB in RAID 5 (they're working on 1 TB drive support). You can start with one or two drives and expand dynamically without losing data.
It's based on Linux and supports most file sharing protocols (including sending to remote sites via rsync). GigE, USB2, streaming iTunes.
I'm planning on getting one, but am debating whether to wait for the next firmware release (which will support 1 TB drives), or get one now and upgrade later.
Posted by: David Magda | July 20, 2007 at 07:47 PM