In a journaling file system, why are journal writes more reliable than file system writes?
April 10, 2007 3:56 PM
In a journaling file system like ext3, a journal entry is written before each file system change, describing the change about to be carried out. This allows quick recovery if the actual file change is interrupted or not carried out due to power outage or whatever. But why is the act of writing the journal entry not susceptible to the exact same threat of being interrupted?
I've read several descriptions of journaling file systems, but I can't find an explanation as to why you're not just pushing the same interruption/inconsistency problem further up the chain.
I can guess some possible answers: (a) the journal entry is physically smaller or takes less time to write than the actual file change, thus decreasing the chance of something going wrong while it's happening; (b) a single journal entry write can represent the two or more file system writes necessary to keep the system consistent; (c) the file system driver somehow 'prioritises' writes to the journal over normal file system writes; (d) the journal operates on some kind of transaction basis so that partially-written entries can be recognised as such and ignored.
These are all nice theories out of my head as to why journal writes are more reliable and more atomic than the actual file system writes. But I can't confirm these hunches anywhere. So what are the actual reasons (in, say, ext3, if a concrete example is needed)?
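To make guess (d) concrete, here is a toy sketch of a journal whose partially-written entries can be recognised and ignored. It is not ext3's on-disk format; the length-plus-CRC32 header, `encode_entry`, and `replay` are all assumptions made up for illustration.

```python
import struct
import zlib
from typing import List

# Hypothetical entry layout: 4-byte payload length, 4-byte CRC32, then payload.
HEADER = struct.Struct("<II")


def encode_entry(payload: bytes) -> bytes:
    """Prefix a payload with its length and checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def replay(journal: bytes) -> List[bytes]:
    """Return every complete entry, stopping at the first torn or corrupt one."""
    entries = []
    offset = 0
    while offset + HEADER.size <= len(journal):
        length, checksum = HEADER.unpack_from(journal, offset)
        start = offset + HEADER.size
        payload = journal[start:start + length]
        # A short payload or a checksum mismatch means the write was interrupted.
        if len(payload) < length or zlib.crc32(payload) != checksum:
            break
        entries.append(payload)
        offset = start + length
    return entries


journal = encode_entry(b"set inode 7 size=4096") + encode_entry(b"link dir 2 -> inode 7")
torn = journal[:-5]  # simulate losing power partway through the second entry
```

Replaying `torn` yields only the first entry: the second one fails the length check and is dropped, so an interrupted journal write leaves nothing half-applied.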
posted by chrismear to computers & internet (10 comments total)
1) You find a new location on the disk to write your new data.
(If a failure happens here, you lose the new data.)
2) You save the old location in the journal.
(If a failure happens here, you lose the new data.)
3) You update the actual file allocation table.
(If a failure happens here, you can read the journal and recover the old data, but you lose the new data.)
4) You update the journal to indicate that the data is written fully.
(If a failure happens here, do the same thing as in step three.)
That's how I would implement a journaling file system, anyway.
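The four steps above can be sketched as a small in-memory simulation. This is only a rough model: `ToyDisk`, `Crash`, and `run` are hypothetical names, the "disk" is a pair of dicts, and each individual step is assumed to be atomic, which a real disk does not guarantee.

```python
class Crash(Exception):
    """Stands in for a power failure."""


class ToyDisk:
    def __init__(self):
        self.blocks = {0: b"old contents"}  # block number -> data
        self.table = {"file.txt": 0}        # file name -> block number
        self.journal = None                 # at most one pending record

    def update(self, name, data, crash_after=None):
        def step(n):
            # Simulate losing power right after step n completes.
            if crash_after == n:
                raise Crash(f"power lost after step {n}")

        new_block = max(self.blocks) + 1
        self.blocks[new_block] = data       # step 1: write data to a new location
        step(1)
        self.journal = {"name": name, "old": self.table[name], "done": False}
        step(2)                             # step 2: old location saved in journal
        self.table[name] = new_block        # step 3: update the allocation table
        step(3)
        self.journal["done"] = True         # step 4: mark the journal entry complete
        step(4)

    def recover(self):
        # An entry that exists but is not marked done means the update was cut short.
        if self.journal is not None and not self.journal["done"]:
            self.table[self.journal["name"]] = self.journal["old"]
        self.journal = None

    def read(self, name):
        return self.blocks[self.table[name]]


def run(crash_after):
    """Attempt one update, crash where requested, recover, and read the file back."""
    disk = ToyDisk()
    try:
        disk.update("file.txt", b"new contents", crash_after=crash_after)
    except Crash:
        pass
    disk.recover()
    return disk.read("file.txt")
```

In this model, `run(1)`, `run(2)`, and `run(3)` all read back the old contents, while `run(4)` and `run(None)` read back the new contents. Either way the file is never left pointing at something half-written.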
posted by delmoi at 4:05 PM on April 10, 2007