There are 10 kinds of people in this world: those who can deal in binary and those who can't.
July 18, 2008 12:06 PM

How is data on my hard drive retrievable even after it is written over?

I try to follow the accepted wisdom regarding personal or sensitive data. I have a program that erases files to DoD standards. My understanding is that it over-writes the data multiple times with randomly generated bits and bytes.

But then you hear people say that's not good enough, that if you really want the file to be completely gone you have to use a sledgehammer to break the disk, then an acetylene torch to burn it, then pee on the ashes and dilute the mixture in 3 gallons of bleach (possibly I'm paraphrasing).

So how do forensic computer experts do their thing? If I have a file on Monday that is represented by 01010101 and I save over it on Thursday with a file that is 10101010, how can they examine my computer and say "Well, it reads 10101010 right now, but I can tell that last Monday it read 01010101, and that's illegal"?

Am I over-simplifying? Isn't it an either/or, binary state? There is no history to a binary state. It's either a 1 or a 0. How would you know it was a 1 three months ago? I have read previous AskMe's on how to erase data and what works and doesn't work, but I am more interested in the actual theory behind magnetic media and the permanence of data, I guess.
posted by pixlboi to Computers & Internet (16 answers total) 3 users marked this as a favorite
While software sees only 1s and 0s, the hard drive is ultimately an analog device. The magnetic field of a given bit is not necessarily aligned 100% one way or the other. When a bit is flipped, only most of the bit will be changed. Careful probing combined with statistical analysis can suggest older values. There is a good Wikipedia article on the subject and another one on data recovery in general.

Here's a video of clean room data recovery.
posted by jedicus at 12:20 PM on July 18, 2008

In short, this is possible because the data is binary, but the storage method is analog (hard drive technology is based on magnetism, which isn't an on/off thing). So, it's possible for there to be remnants of older signals which are beneath the threshold of the read head; these are what people are looking at when they retrieve written-over data.

"The problem lies in the fact that when data is written to the medium, the write head sets the polarity of most, but not all, of the magnetic domains. [...] The recovery of at least one or two layers of overwritten data isn't too hard to perform by reading the signal from the analog head electronics with a high-quality digital sampling oscilloscope, downloading the sampled waveform to a PC, and analysing it in software to recover the previously recorded signal. What the software does is generate an "ideal" read signal and subtract it from what was actually read, leaving as the difference the remnant of the previous signal. Since the analog circuitry in a commercial hard drive is nowhere near the quality of the circuitry in the oscilloscope used to sample the signal, the ability exists to recover a lot of extra information which isn't exploited by the hard drive electronics."
posted by vorfeed at 12:21 PM on July 18, 2008
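
For anyone who wants to see the "generate an ideal signal and subtract it" step from the quote above in miniature, here is a toy sketch. The 5% residue, the bounded noise, and every function name are assumptions made up for illustration; real recovery works on sampled head waveforms, not tidy lists of numbers.

```python
import random

RESIDUE = 0.05  # assumed share of the old signal that survives one overwrite


def to_signal(bits):
    """Map bits to an idealized analog level: 0 -> -1.0, 1 -> +1.0."""
    return [1.0 if b else -1.0 for b in bits]


def overwrite(old_bits, new_bits, residue=RESIDUE, noise=0.02, seed=0):
    """Simulate the analog readback after new_bits are written over old_bits.

    Noise is uniform and bounded below the residue so the toy stays readable.
    """
    rng = random.Random(seed)
    return [
        (1 - residue) * new + residue * old + rng.uniform(-noise, noise)
        for old, new in zip(to_signal(old_bits), to_signal(new_bits))
    ]


def recover_previous(readback, residue=RESIDUE):
    """Subtract the ideal signal for the current data and read the remnant."""
    current_bits = [1 if v > 0 else 0 for v in readback]  # what the drive reports
    ideal = [(1 - residue) * s for s in to_signal(current_bits)]
    remnant = [v - i for v, i in zip(readback, ideal)]
    previous_bits = [1 if r > 0 else 0 for r in remnant]
    return current_bits, previous_bits


monday = [0, 1, 0, 1, 0, 1, 0, 1]
thursday = [1, 0, 1, 0, 1, 0, 1, 0]
readback = overwrite(monday, thursday)
current, previous = recover_previous(readback)
print("drive reports:", current)
print("remnant suggests:", previous)
```

In this toy the remnant is always larger than the noise, so the old bits fall out cleanly. On a real platter the margin is far thinner, which is why the technique is described as statistical.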

These guys are trying to sell a wiping app, but they have a decent explanation of the problem.

Essentially, the heads of a drive aren't 100% accurate, so there can be some overlap that could possibly be read by someone with enough time/money/motivation.
posted by jjb at 12:21 PM on July 18, 2008

The explanations are correct, but mostly theoretical.
Maybe the CIA or the US military could retrieve the data. A computer magazine once wiped several hard drives and gave them to companies that recover data from damaged or overwritten drives. Not one was able to retrieve the data.
posted by yoyo_nyc at 12:39 PM on July 18, 2008

"But then you hear people say that's not good enough"

It is good enough. Every pass overwrites a slightly different part. At 3 passes the chances of recovery are extremely thin.

FWIW, Gutmann doesn't even endorse his own method anymore because modern drives don't work like the older drives he did his research on. Plain Jane wipes still do wonders.
posted by damn dirty ape at 12:40 PM on July 18, 2008

Try and find a company, anywhere, for any amount of money that claims to be able to pull data off a modern hard-drive that's been erased to DoD standards.

You won't.

With old tech hard-drives, it is spoken in hushed tones that the magnetic zones that make up the 1/0 were big enough that the head would only have a decent effect on one area of that domain when it overwrote, or that the head spilled some magnetic interference on the neighbouring 'bit'. So if you took an expensive enough machine you could just read (say) the top corner of each of those zones and potentially pull something back. To my knowledge, no-one ever actually practically demonstrated this. The closest I've heard is that someone could say 'Yeah, there was some data there, but I've got no idea what it was'.

These days, with the smaller tolerances and perpendicular recording techniques, it's just not feasible. It would take a quantum leap in measurement techniques, and I suppose a larger quantum leap in pattern recognition, to run against all potential magnetic remnants for each grain in the 'bit' (tens of grains, maybe a hundred?). Even then, you could argue the pattern recognition would just be like a monkey at a typewriter creating a Cartesian product of all possible states.
posted by Static Vagabond at 12:44 PM on July 18, 2008 [1 favorite]

The problem is that when you overwrite a 0 with a 1 you don't end up with a perfect 1, you end up with a 0.95. Same for overwriting a 1 with a 0, you end up with a 0.05. So by looking at what the "official" value of a bit is, and what it actually is (an analogue measurement as jedicus says) you can work out what it was before. The more times you overwrite the closer you get to a true 1 or true 0, but you never quite get all the way there.
posted by alby at 12:46 PM on July 18, 2008
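
A quick way to play with the 0.95 / 0.05 picture in alby's comment is a toy model where each write keeps a fixed 5% of whatever level was there before. The 5% figure comes from the comment; treating it as a constant, linear carry-over is purely an assumption for illustration, and the function names are hypothetical.

```python
CARRY_OVER = 0.05  # assumed: share of the previous level that survives one write


def write(level, bit, carry_over=CARRY_OVER):
    """Return the analog level after writing `bit` over an existing `level`."""
    return (1 - carry_over) * bit + carry_over * level


def trace_of_original(pattern, carry_over=CARRY_OVER):
    """How far apart two cells end up if one started as 1 and the other as 0,
    after both receive the same sequence of overwrite bits."""
    was_one, was_zero = 1.0, 0.0
    for bit in pattern:
        was_one = write(was_one, bit, carry_over)
        was_zero = write(was_zero, bit, carry_over)
    return was_one - was_zero


print("0 overwritten with 1:", write(0.0, 1))
print("1 overwritten with 0:", write(1.0, 0))
for passes in (1, 2, 3):
    pattern = [1, 0, 1][:passes]
    print(passes, "pass(es), trace of original:", trace_of_original(pattern))
```

Under this assumption the trace of the original bit shrinks by a factor of 20 with every pass, whatever pattern is written, so it never reaches exactly zero but quickly drops below anything measurable.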

Modern drives can mark a block as bad, and invisibly re-map requests from software to access those blocks to somewhere else on disk. That's a great feature and works well, but it does leave the original block (which is probably perfectly readable) with the original data it contained. That original block cannot be written to by any software - not by the operating system, not by your erasure program, not by the BIOS, not by a low-level format - because the remapping occurs at a very low level inside the hard disk. But a suitably equipped data forensics person can get at those blocks fairly easily, by directly controlling the hard disk actuators.

Another issue is that different temperatures, different drive positions, and different amounts of wear on the components of the drive, between when the data was written and when it was erased, might mean that some of the original data track is still detectable at one edge or the other. This can potentially be accessed by mounting the platters from your hard disk on specialist reader hardware.

Finally, if the pattern used to overwrite the data track is anything other than completely random, it's possible to use statistical techniques to "subtract" the erasure pattern and recover a weak analogue representation of the original data.

All in all, these are expensive and difficult techniques with low probability of success. For most purposes, software data erasure will be adequate. On the other hand, if you would lose your family, job, or liberty over whatever's on that hard disk, I'd stick with the hammer + acetylene method ;)
posted by standbythree at 12:49 PM on July 18, 2008 [1 favorite]
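
The bad-block remapping standbythree describes can be pictured with a small toy model. Everything here (the `ToyDrive` class, its methods, the idea that the firmware copies the data to a spare before redirecting) is a hypothetical simplification, not how any particular drive firmware works.

```python
class ToyDrive:
    """Hypothetical drive with firmware-level sector remapping."""

    def __init__(self, sectors, spares):
        self.sectors = sectors
        self.physical = [b""] * (sectors + spares)  # what is really on the platter
        self.remap = {}             # logical sector -> spare physical sector
        self.next_spare = sectors   # index of the first unused spare

    def _resolve(self, logical):
        # Software never sees this step; it happens inside the drive.
        return self.remap.get(logical, logical)

    def write(self, logical, data):
        self.physical[self._resolve(logical)] = data

    def read(self, logical):
        return self.physical[self._resolve(logical)]

    def mark_bad(self, logical):
        """Retire a sector: copy its data to a spare and redirect future access."""
        spare = self.next_spare
        self.next_spare += 1
        self.physical[spare] = self.read(logical)
        self.remap[logical] = spare


drive = ToyDrive(sectors=4, spares=1)
drive.write(2, b"secret")
drive.mark_bad(2)                      # firmware quietly retires physical sector 2

for sector in range(drive.sectors):    # the "wipe", as any software sees it
    drive.write(sector, b"\x00" * 6)

print("software reads sector 2:", drive.read(2))
print("platter still holds:", drive.physical[2])
```

The wipe touches every sector the software can address, yet the retired physical sector keeps its old contents because no logical address points to it any more.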

Related question: So far, all the answers have assumed the magnetic over-writing is done by the drive head. What about external magnetization, such as the field from a strong current or a powerful magnet? In the past, I've dragged hard drive magnets (left over from the sledgehammer method) across the case of a drive I wanted to wipe well, but not destroy. This was after a thorough, multiple-pass software wipe. Would the external field take care of any artifacts left over from the software?
posted by dinger at 3:29 PM on July 18, 2008

Related answer: any external field achievable without superconducting electromagnets would do a less thorough job of scrambling the data bits than the write head. Any external field strong enough to affect data would also scramble the embedded servo information badly enough to confuse the drive's track positioning mechanism, which would make the drive unusable and data recovery difficult, though still less difficult than recovering data that had been overwritten in the usual way.
posted by flabdablet at 5:37 PM on July 18, 2008

Maybe I'm completely uninformed here, but couldn't you achieve much the same thing as these data wiping apps much more cheaply by simply probing the hard drive with a reasonably powerful magnet?
posted by Rhaomi at 5:52 PM on July 18, 2008

Note to self: preview is useful, especially when the tab has been open for twenty minutes or so prior to commenting.
posted by Rhaomi at 5:54 PM on July 18, 2008

To amplify what flabdablet says, generally, writing a very wide, very low frequency signal (e.g. moving a handheld permanent magnet over a platter) is not an effective method of removing narrow, high frequency signals (the data tracks on the platter). The signal level would be reduced to the point that the drive itself probably wouldn't be able to read the tracks, but it would be an easy day in the office for a data forensics firm asked to recover the data.
posted by standbythree at 7:13 PM on July 18, 2008

As others have said, there's some talk about how this might still be possible on modern drives, but if anyone's actually done it, they're buried deep inside some intelligence agency somewhere. Any remanent magnetism holding erased data is a piece of the disk that could be holding real data and making more money for IBM (or Hitachi, or whoever). There's a lot of research money poured into getting those bits as dense as possible. So any palimpsest margin is going to be really narrow. My guess is it isn't possible to get very much data back from an erased drive at all. Still, depending on the circumstances, “not very much” could still be enough data to hang you.

On the first hand again, though, the organizations I know of that care about their data simply never consider a drive to be “clean” once it's ever held sensitive stuff. By the time they're getting rid of it the drive is obsolete anyway, so it's cheaper to physically destroy it (hammers, sandblasters, furnaces, whatever) than to ensure that it's properly erased before selling it off.

And, every now and then, someone does surprise us all with a simple effective technique for retrieving supposedly-vanished data. So if you're in the business of being paranoid, it does pay to be paranoid.
posted by hattifattener at 12:18 AM on July 19, 2008

I think the motivation for using a sledgehammer or torch or angle grinder or whatever is not so much that conventional wiping is unreliable, but rather
  1. a drive turned into slag is immediately identifiable as such, whereas it's possible to mistakenly think that a drive has been wiped when it has not
  2. physically destroying a drive is quicker than waiting for a long multi-pass wipe process
  3. geeks like excuses for destroying things

posted by Rhomboid at 12:43 AM on July 19, 2008

Oh, and 4. if a drive has suffered a failure it may not be possible to wipe it.
posted by Rhomboid at 12:45 AM on July 19, 2008
