Explain NetApp to me
September 26, 2008 10:43 AM

Can someone explain to me as if I were an idiot (which I am) what the company NetApp does?

I don't understand what filers are, what NAS and all that stuff is, FYI. Thanks.
posted by sholdens12 to Computers & Internet (6 answers total)
In your computer, you have a hard drive to store data.

If that hard drive fails, you lose your data. So, you set up a Redundant Array of Independent Disks (a RAID) on your machine. Now you have multiple drives in a single computer, so if one drive (or more, depending on the setup) fails, the others keep your data available.

You want to access that data over your local network from another computer. So, you set up file sharing so that another machine -- or many other machines -- on your network can access the data stored on that RAID. That's called Network Attached Storage (NAS). If someone sells you a pre-built computer to do that type of work -- as opposed to a general-purpose computer -- it's called an appliance. One could say... a "network appliance."

Now, if that computer fails, you lose access to that data, at least until you transfer your hard drives to another server. Some businesses need continual access to their data. It's possible to distribute the data not only across redundant drives, but also across redundant servers, so that -- barring an extremely significant catastrophe -- you can have continual uptime on your data. NetApp specializes in servers and software that give a network secure, redundant, high-uptime access to storage. A NetApp filer is one of the servers that make up the complete "Storage Area Network" (SAN) that redundantly holds all of that network-accessible data.
posted by eschatfische at 11:01 AM on September 26, 2008

I'm sure someone smarter than me will do a better job, but speaking as someone who is responsible for an application that we use NetApp stuff with:

1. We fairly recently replaced a room-sized storage system with a NetApp cabinet.

2. Our application has tons of files that need to be shared across multiple application servers, like terabytes of stuff and millions of files. Those servers need to be able to read and write quickly and reliably and backups need to be painless. Also, making a duplicate of all that data needs to be painless. Making more storage available needs to be painless. Re-allocating storage needs to be painless. NetApp does all those things.

Someone else can explain filers and NAS, NFS, SAN, etc. It tends to make my head hurt. Short version is that our application servers have pretty small hard disks (they're "blades") and the majority of their storage is on the network. That means that the network storage has got to really perform.
posted by idb at 11:02 AM on September 26, 2008

And like eschatfische says, hardware failure has to be painless, i.e. individual disks can fail and end users won't notice.
posted by idb at 11:03 AM on September 26, 2008

Strictly speaking, NAS and SAN are different types of storage on a network.

NAS (network attached storage) is as eschatfische described. It's a box with hard drives in it, holding files that can be accessed directly as files. Windows file shares (technically SMB and CIFS) and NFS for Linux/Unix are examples of NAS shares. A NAS server, or appliance if it's dedicated to the purpose, is accessed directly by users. Sometimes there's a Windows server or the like in front of it, which handles access permissions etc. but itself gets the user's files from a NAS server. NAS servers can range from single-drive boxes for your home network to systems with hundreds of hard drives at once.

A SAN (storage area network) is block-level access. When your computer writes data to a hard disk, it first writes the files to a file-system (NTFS, for Windows). The computer then writes those changes in the file-system to the hard disk, using a particular syntax. The hard disk knows nothing about the files themselves, only the blocks of file-system that change. A SAN allows this to be done over a network. Using the appropriate hardware and software, a server can treat a remote SAN store as if it was a hard drive, and write its own file-system to it. Of course, the SAN store isn't sharing real individual disks, but virtual chunks of big arrays of them. This means that SAN arrays can be a lot quicker than individual disks in individual servers, and lots of different servers can all use the central SAN store to store, well, everything, making it much more efficient. It is, however, quite expensive to build a proper SAN.
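To make the file-level versus block-level split concrete, here is a minimal Python sketch. Everything in it is made up for illustration (the names `BlockStore` and `TinyFileSystem`, the 16-byte block size); it is not how NetApp or any real file-system works. The point is only that the store sees numbered blocks, while the file names live entirely on the server side.

```python
BLOCK_SIZE = 16  # bytes per block; tiny on purpose (real disks use 512 or 4096)


class BlockStore:
    """Hypothetical stand-in for a SAN volume: it only knows numbered blocks."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, index, data):
        assert len(data) == BLOCK_SIZE
        self.blocks[index] = data

    def read_block(self, index):
        return self.blocks[index]


class TinyFileSystem:
    """Hypothetical file-system on the server: maps file names onto blocks."""

    def __init__(self, store):
        self.store = store
        self.table = {}     # file name -> (list of block numbers, length in bytes)
        self.next_free = 0  # naive allocator: never reuses or frees blocks

    def write_file(self, name, content):
        used = []
        for offset in range(0, len(content), BLOCK_SIZE):
            # Pad the last chunk so every write is exactly one full block.
            chunk = content[offset:offset + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
            self.store.write_block(self.next_free, chunk)
            used.append(self.next_free)
            self.next_free += 1
        self.table[name] = (used, len(content))

    def read_file(self, name):
        used, length = self.table[name]
        data = b"".join(self.store.read_block(i) for i in used)
        return data[:length]  # drop the padding added on write


store = BlockStore(num_blocks=8)
fs = TinyFileSystem(store)
fs.write_file("notes.txt", b"NetApp sells storage boxes")
print(fs.read_file("notes.txt"))  # the server sees a file
print(store.read_block(0))        # the store only sees a block of bytes
```

In a real SAN the calls to `write_block` and `read_block` would travel over the network; in a NAS it would be the `write_file` and `read_file` level that does.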

This form of block-level traffic is often more efficient than file-level shares, so it's suitable as a backend for other servers to use, which then store the user's files in a file-system stored on that remote disk. Examples of SAN network traffic are iSCSI, Fibre Channel and ATA over Ethernet.

NetApp sell hardware and software to do both. As virtual servers grow in popularity (one physical server with lots of individual fake servers running on it, so you can use the hardware more efficiently), so do NAS and SAN networks, as it's a lot easier to virtualize a server when you can store a whole bunch of them on a central storage system.
posted by ArkhanJG at 12:16 PM on September 26, 2008

I'll take a stab, too, even though the answers so far are good.

The simplest possible answer is that NetApps are basically computers with lots of hard drives, designed to be accessed over the network, so that lots of people within an organization can access them at once. That's basically a "filer," or "Network Attached Storage."

NetApps use something called RAID, which solves two common problems with hard drives: they're slow, and they're mechanical and thus wear out over time.

A hard drive might, on average, go 5 years without failing, but if you have 1,000 of them, you can expect a failure somewhere in the pile about 1,000 times as often. When the hard drive has vital company data, this is an unacceptable risk, so the same data is written to multiple hard drives, so that it doesn't matter if one fails.
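A rough back-of-the-envelope version of that arithmetic, in Python. This is a sketch under loud assumptions: drives fail independently, and the 5-year average is treated as the mean of an exponential lifetime. Real drives don't behave this neatly, and the function names are invented for the example.

```python
import math

# From the comment above; treating it as an exponential mean is an assumption.
MEAN_LIFETIME_YEARS = 5.0


def p_any_failure(num_drives, window_years):
    """Chance that at least one of num_drives fails within window_years."""
    p_one_survives = math.exp(-window_years / MEAN_LIFETIME_YEARS)
    return 1.0 - p_one_survives ** num_drives


def mean_days_between_failures(num_drives):
    """Average gap between failures across the whole pile of drives."""
    return MEAN_LIFETIME_YEARS * 365 / num_drives


one_week = 7 / 365
print(f"1 drive, one week:     {p_any_failure(1, one_week):.2%}")
print(f"1000 drives, one week: {p_any_failure(1000, one_week):.2%}")
print(f"1000 drives: a failure every {mean_days_between_failures(1000):.1f} days or so")
```

Under these assumptions a single drive is very unlikely to die in any given week, while a pile of 1,000 is almost certain to lose at least one, which is why the data has to live on more than one drive.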

The other problem is that hard drives are slow. (Think of how long your computer takes to boot, for example: most of that's reading from the hard drive.) In many cases, the computer can "go" much faster than a hard drive, so that in busy situations, your computer is sitting around waiting for data from the hard drive. So instead of writing a file to one hard drive, you split it up into several chunks, and write each of those chunks to a different hard drive. This is called striping, and it makes disk access much faster. (But remember, you're actually writing each chunk to multiple hard drives, so that if one hard drive fails, it doesn't matter.)
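Here is a toy Python sketch of striping combined with mirroring. It is illustrative only: the "drives" are plain dictionaries, the 4-byte chunk size and the function names are made up, and NetApp's actual RAID layout is not described in this thread and is not what this models.

```python
CHUNK_SIZE = 4  # bytes per chunk; tiny so the effect is visible


def write_striped_mirrored(data, num_pairs):
    """Split data into chunks and spread them over num_pairs mirrored pairs.

    Returns (pairs, number_of_chunks). Each pair is a list of two "drives"
    (dicts mapping chunk number -> bytes) holding identical copies.
    """
    pairs = [[{}, {}] for _ in range(num_pairs)]
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    for number, chunk in enumerate(chunks):
        primary, mirror = pairs[number % num_pairs]  # striping: rotate across pairs
        primary[number] = chunk                      # mirroring: write both copies
        mirror[number] = chunk
    return pairs, len(chunks)


def read_back(pairs, num_chunks):
    """Reassemble the data, using whichever copy of each chunk still exists."""
    out = []
    for number in range(num_chunks):
        primary, mirror = pairs[number % len(pairs)]
        drive = primary if primary is not None else mirror
        out.append(drive[number])
    return b"".join(out)


data = b"vital company data"
pairs, count = write_striped_mirrored(data, num_pairs=3)
pairs[1][0] = None  # simulate one drive dying
print(read_back(pairs, count))  # the data is still complete
```

The speed-up doesn't show in a toy like this, but the idea is that the three pairs can each be reading or writing their own chunks at the same time. If both drives of a pair die, the sketch simply crashes, which is roughly what happens to your data too.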

Since no one knows the difference between NAS and SAN, and since getting RAID "tuned" just right for maximum performance takes copious amounts of wizardry, NetApp packages computers doing hard-core NAS and RAID into cute boxes and sells them to companies for obscenely high prices.

This is, of course, a simplification, but hopefully it helps!
posted by fogster at 12:33 PM on September 26, 2008

In NetApp land a "filer" is specifically what other storage vendors would call a controller. It is the controlling device to which you attach a bunch of disks. You would usually (not always) have two filers in a cluster for redundancy.

NetApp's filers are not only capable of NAS (file protocols - NFS / CIFS) but also SAN (FC / iSCSI).
posted by lych at 2:48 PM on September 26, 2008
