Why did WinFS fail?
August 9, 2009 2:45 PM
The Wikipedia article is surprisingly in depth and is a great sales pitch for WinFS. It does not, oddly, explain why WinFS failed. I assume if it was axed to simply get Vista out of the door more quickly, and not just an engineering problem, the open source community would adopt it. It appears that there was an open source project called Storage for Gnome that mirrored the capabilities of WinFS, but support for that dropped too.
The only thing, as far as I can discern, that came out of WinFS is filestream in SQL 2008 which is kind of ho-hum.
What's the deal?
In my professional experience, I found the idea of the "build train" to be quite interesting.
At a major OS developer like Microsoft, you have hundreds if not thousands of interrelated projects in parallel development for a future release.
This mass slouching towards release is the build train.
As the release date nears, risk management dictates that more fundamental pieces get locked down before higher projects that depend on them do.
While I am not knowledgeable about the internal development story of WinFS, I assume it was axed from Vista when it was so as not to add disproportionate risk to the schedule, for either performance or reliability reasons.
Why it was not picked up for Win7 is also an interesting question.
posted by @troy at 3:03 PM on August 9, 2009
My naive understanding is that it's just a lot harder than people think. Making something quickly store and fetch blocks from a disk is a very different problem from making a semantically aware database with full text search on arbitrarily typed blobs. I'd love to hear a more detailed explanation, though.
posted by Nelson at 3:27 PM on August 9, 2009
One big issue was the number of Windows applications that would have broken on a native WinFS system, necessitating various kludges to ensure backwards compatibility. So, by 2003, according to Paul Thurrott, WinFS was conceived as a layer that ran on top of NTFS, not a full file system in itself. As such, without a rewrite of the antiquated SCSIport low level file system driver, later delivered in 64 bit versions of XP and Vista as Storport, there were bad performance hits to using WinFS.
And so long as NTFS is the underlying file system, running a RDBMS as a data store on top of that, even with Storport, isn't going to be particularly efficient or quick, from a hardware perspective. Since any Windows developer that wants an RDBMS store can implement any of a number of SQL alternatives as part of that specific application, and since some of them are paid Microsoft product options, there's not a lot of gain to MS for sticking it in an OS, for free.
posted by paulsc at 4:55 PM on August 9, 2009
It is a truism in the software development world that as databases age they tend to become more and more like filesystems, and as filesystems age they tend to become more and more like databases.
Filesystems don't keep enough metadata and relational information, and databases are way too slow. The only way to have it both ways at this point is faster hardware. While CPUs have been speeding up, and RAM sizes have been increasing drastically, storage seek times and data pipe speeds (i.e. the limitations for databases and filesystems) are relatively stagnant in comparison.
posted by idiopath at 6:03 PM on August 9, 2009
It does not, oddly, explain why WinFS failed.
It's a closed source in-house product, they're not going to document why they failed unless they get something out of it.
In the past I thought that database filesystems like WinFS and ReiserFS had a place but now I think that they can't overcome the inertia that the current file metaphor has. The problem is when you transfer a file with filesystem-level metadata to another system you need a wrapper format (which doesn't exist) and you need the target platform to support database filesystem semantics (which they don't) so you may have to discard the metadata or have a separate metadata file sitting beside it (which reduces the usefulness of it). The practicalities of interoperability doom such a change in the file metaphor that a database filesystem imposes. Imagine how much modification of software you'd need to make files anything more than paths, dates, and a binary. It's simply a mammoth task.
Instead, a few years ago we started seeing filesystems offering standardised function calls into user-space metadata extraction tools. This way you can make available metadata in, for example, MP3s by reading their ID3 metadata. It's not as fast and the data isn't separated at the filesystem level because instead you're just reading parts of the file. This is an elegant hack but from the user's perspective it can look the same and this simpler metaphor works right now and it's portable across systems. You'd transfer a file and the metadata is in the binary rather than being separate. This does mean that in most cases a database filesystem is relegated to just being a cache of the file's inbuilt metadata.
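To make the extraction idea concrete, here is a minimal sketch in Python. It assumes the older ID3v1 layout (a fixed 128-byte block at the very end of the file, starting with the bytes "TAG"); real MP3s often carry ID3v2 tags instead, which this does not handle. The names read_id3v1 and make_sample are made up for illustration, and the sample file is fake data, not real audio.

```python
import os
import tempfile

def read_id3v1(path):
    """Return ID3v1 fields from the last 128 bytes of a file, or None if absent."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() < 128:
            return None
        f.seek(-128, os.SEEK_END)
        tag = f.read(128)
    if tag[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed-width and padded with NULs or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
    }

def make_sample(path, title, artist):
    """Write filler bytes followed by a hand-built ID3v1 tag (test data only)."""
    tag = (
        b"TAG"
        + title.encode("latin-1").ljust(30, b"\x00")
        + artist.encode("latin-1").ljust(30, b"\x00")
        + b"".ljust(30, b"\x00")   # album left empty
        + b"2009"                  # year
        + b"".ljust(30, b"\x00")   # comment left empty
        + b"\xff"                  # genre byte: unset
    )
    with open(path, "wb") as f:
        f.write(b"\x00" * 256 + tag)

sample = os.path.join(tempfile.mkdtemp(), "sample.mp3")
make_sample(sample, "Example Song", "Example Artist")
info = read_id3v1(sample)
print(info)
```

The point is that the metadata travels inside the file itself, so copying the file to any other system keeps it intact; an indexer only has to know where in the bytes to look.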
WinFS was massively overhyped and for most of the scenarios (exposing email subject lines and authors in the GUI) people would be better served by the metadata extraction approach mentioned above, or by a custom database. Any cross-platform application now just bundles sqlite if it needs that kind of capability. The dumber, simpler, extraction approach is probably what will win despite it being an inelegant hack.
It's almost like how a few years ago there was some talk of resolution independent GUIs and how all these applications would need to be rewritten for high DPI displays. The problem with people advocating this approach is that they didn't provide a way to migrate software to the new approach. Instead what we've seen is the widget toolkits slowly becoming more liquid and the icons being done in SVG (or other vector formats) and it's now clear that with newer widgets and icons we'll get most of the way towards resolution independence.
I've got my hopes on btrfs, a filesystem that has some features of databases and is considerably smarter and faster than anything we saw of WinFS.
posted by holloway at 7:13 PM on August 9, 2009
Maybe someone got smart and realized that a filesystem is supposed to keep files where you put them, quickly and reliably, and that [some other piece of software (or, the horrors, the user)] should deal with the where and the why. Until they can come up with a way to add features without compromising data integrity, and without interfering with people who prefer to organize their own files, they should really concentrate on just keeping the thing stable.
posted by gjc at 7:47 PM on August 9, 2009
the latest version of OSX has a search thing that does what WinFS was supposed to do
Presumably that's Spotlight which is quite different. It's the kind of metadata extraction tool that I was talking about (it's a server that indexes what's in binaries and provides system calls to search it).
posted by holloway at 7:49 PM on August 9, 2009
I had a co-op job at MSFT in '91 where I worked on "Windows 4" - the successor to Windows 3. We cross-compiled on OS/2 because that was before NT. Maybe we had pre-releases of NT, it's fuzzy in my memory. I used to telnet into a Xenix box to read my email.
Anyway, two of the groups under Win 4 were "Venus" a new "OO" UI and "Vulcan" which was - guess what - an OO database filesystem.
Microsoft has been trying to get this to float for nearly two decades now. I don't know why it keeps failing but this project has been failing for a looong time.
posted by GuyZero at 8:15 PM on August 9, 2009
Maybe someone got smart and realized that a filesystem is supposed to keep files where you put them, quickly and reliably, and that [some other piece of software (or, the horrors, the user)] should deal with the where and the why.
Or not. The hierarchical "file/folder" or "file/directory" metaphor has been craptastic for a while now, which makes the whole "putting them" thing problematic. WinFS was announced in the context of the Be File System, which was notable for bringing lots of server-grade stuff -- journalling, indexing, metadata -- to a desktop FS.
You have a point that the most successful attempts to get past hierarchy and present files in a database-like space have been in standalone applications -- iTunes is a good example -- but that doesn't mean that all attempts to get beyond hierarchical organisation at a fundamental level in consumer OSes are dumb by definition. Or, basically, what holloway said, though I think that the file metaphor may well have an expiry date, on account of the iLife apps and things like the iPhone (and many mobile platforms) where there is no user-exposed filesystem unless you hack it to the surface.
But on point: filesystems are hard. Taxonomies are also hard. And WinFS cropped up in perhaps the most dysfunctional period of Microsoft's OS development history.
posted by holgate at 10:57 PM on August 9, 2009
Afroblanco: many popular newer database storage engines give up relational / reliability / rollback features in favor of performance, thus becoming more like a traditional filesystem.
For example sqlite, one of the most deployed databases, if not the most pervasively deployed, is really more of a replacement for fopen() than it is a replacement for oracle (as they claim on their own website). Sqlite is pretty close to being a userland file system driver, which you use to load a disk image; it is popularly used by programs like firefox as a replacement for a configuration directory.
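A rough sketch of the "replacement for fopen()" idea, using Python's standard sqlite3 module: one file on disk holds an application's settings instead of a directory of config files. The table name settings, the helper names, and the keys are all invented for this example; this is not how firefox actually lays out its data.

```python
import os
import sqlite3
import tempfile

def open_settings(path):
    """Open (or create) a single-file settings store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value TEXT)"
    )
    return conn

def set_setting(conn, key, value):
    # Using the connection as a context manager commits on success
    # and rolls back if the statement raises.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
            (key, value),
        )

def get_setting(conn, key, default=None):
    row = conn.execute(
        "SELECT value FROM settings WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else default

db_path = os.path.join(tempfile.mkdtemp(), "prefs.sqlite")

conn = open_settings(db_path)
set_setting(conn, "homepage", "about:blank")
set_setting(conn, "homepage", "https://example.com")  # overwrites the first value
conn.close()

# Reopen the file, as a program would on its next start.
conn = open_settings(db_path)
homepage = get_setting(conn, "homepage")
theme = get_setting(conn, "theme", "default")
print(homepage, theme)
```

Compared with hand-rolled config files, the application gets atomic updates and keyed lookups for free, while the whole store is still just one ordinary file that can be copied around.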
posted by idiopath at 12:00 AM on August 10, 2009
It failed because it tried to solve the wrong part of the problem, which happens to be the easy part that we have solutions for. We already have XML, Schema, RDF and a couple more not-really-concrete-yet layers for representing structured data and trying to apply semantics to it in a machine-readable way, so having a proprietary storage system that replicates this sort of functionality is a waste. MS probably realised that but wouldn't tell you so, of course.
The really hard bit is to come up with an ontology that can be meaningfully shared between applications and attempt to be globally applicable. Achieving that would be nearly equivalent to achieving strong artificial intelligence - it's a massively hard problem because you can't really describe everything in the world in a consistent mechanical way.
This problem is part of the reason the layers above XML/Schema (the ones that represent semantics as opposed to structure) aren't really finalised yet: information scientists haven't really figured out how to do it. We don't know how we know what we know, nor how to represent it in such a way that a computer can "know" in the same way.
posted by polyglot at 6:48 AM on August 10, 2009
Or not. The hierarchical "file/folder" or "file/directory" metaphor has been craptastic for a while now, which makes the whole "putting them" thing problematic. WinFS was announced in the context of the Be File System, which was notable for bringing lots of server-grade stuff -- journalling, indexing, metadata -- to a desktop FS.
Except at the top level where you select the disk drive (*), the file/directory metaphor is just another database anyway. With most modern file systems there is no assumption that a file occupies a contiguous space on a medium, much less that all files located within a directory are grouped together. Folders, directories, and even filenames are just metadata tags, and for most modern filesystems a file can easily have multiple tags associated with it.
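One way to see the "names are just tags" point is with hard links, sketched below in Python. This is only one reading of the paragraph above, and it assumes a filesystem that supports hard links (most POSIX filesystems and NTFS do); the directory and file names are made up.

```python
import os
import tempfile

root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "music"))
os.mkdir(os.path.join(root, "favorites"))

first = os.path.join(root, "music", "song.txt")
second = os.path.join(root, "favorites", "best-song.txt")

with open(first, "w") as f:
    f.write("one copy of the data")

# Add a second directory entry that points at the same underlying file.
os.link(first, second)

same = os.path.samefile(first, second)   # both names resolve to one file
links = os.stat(first).st_nlink          # number of names attached to it

# Appending through one name is visible through the other.
with open(second, "a") as f:
    f.write(", two names")
with open(first) as f:
    content = f.read()

print(same, links, content)
```

Neither name is the "real" one: the data lives once on disk, and each directory entry is a label pointing at it.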
To me, there are two big problems with attempts to make file databases into SQL databases. First, there is the "everything looks like a hammer" approach of treating everything according to relational tables which runs into problems when those tables are likely to be sparsely-populated and you have hierarchical relationships between objects. (Not that you can't deal with those kinds of data models, but normalizing those data models is a big pain in the ass.) Second, a common way of dealing with a collection of files is to pack them all into a compressed form that can be treated as a file (attached to email or sent over http), or as a data node (browsed, edited, etc., etc.).
And on preview, I think another problem as polyglot points out is that it's next to impossible to create an ontology that works equally well across all application domains.
(*) and network shares, SSDs, and RAID devices are often virtual interfaces on top of multiple physical storage containers.
posted by KirkJobSluder at 6:56 AM on August 10, 2009
Thanks all! I think I have a pretty clear understanding of the reasons why development on WinFS stopped. I've been playing around with FILESTREAM some and it is a bit more sophisticated than it first appeared. I guess it could be summed up as being an over-engineered solution to a problem that really didn't exist in the first place. I do wonder how much of a part the internet played in killing WinFS, which wasn't addressed here. The more I think about it, the more I realize that things I'd set up to be best consumed in a relational database are already in a relational database managed on the web (e-mail, music, etc.).
posted by geoff. at 12:44 PM on August 10, 2009
Two bits:
Why did WinFS fail?
Because Longhorn was cancelled. The WinFS project was a part of Longhorn, and when Longhorn died, WinFS did as well. The history of software projects is littered with stories of promising sub-projects going down with the ship.
The problem is when you transfer a file with filesystem-level metadata to another system you need a wrapper format (which doesn't exist) and you need the target platform to support database filesystem semantics (which they don't) so you may have to discard the metadata or have a separate metadata file sitting beside it (which reduces the usefulness of it).
I think this is overly pessimistic. Apple’s consumer and server OSes have been shipping with arbitrarily extensible metadata on HFS+ volumes (that’s the typical Mac hard drive format, for the uninitiated) for over four years. They use it to implement Unix file permissions.
posted by Ptrin at 7:13 PM on August 10, 2009
posted by scruss at 3:01 PM on August 9, 2009