Single device encryption - it sounds like a SHAm.
May 16, 2012 3:19 PM   Subscribe

Help me understand how encryption works when there's just one device involved.

Okay, I think I finally understand at a high level how public key encryption works: each side generates a public/private key pair, sends the public key to the other party, messages encrypted using a public key can only be decrypted using the corresponding private key, and it's secure because determining a private key from a public key is extremely difficult. (Or something like that.) I can see how it's secure - if an attacker gets their hands on a public key and data encrypted with that key, it's extremely difficult to decrypt the data, and presumably private keys are kept secure on each side.

But what I don't understand is how encryption that lives entirely on a single device (say, a computer or a smartphone) can ever be secure - at least, more than marginally secure from a determined attacker. Note that when I say attacker, I'm not talking about "kid who knows how to use computers," I'm talking about "determined criminals seeking banking information for many, many people." For the sake of the question, assume I have a large amount of valuable data (I don't, but that's what makes the question interesting). I can see how local encryption could prevent a run-of-the-mill person from decrypting it, but my question is: what if it's someone with some powerful (but not NSA-powerful) computers and some real know-how?

3 examples:

My smartphone has an email app* that claims to store its data with AES-256 encryption, so if someone got a hold of my SD card or my entire phone and they don't know the PIN I use to open the application they won't be able to pull my email off my device. But it seems to me that the app has to keep the decryption key around somewhere in order to decrypt the information when I want to see it. If someone gets a hold of my phone, why can't they just find the key stored somewhere on my phone and use it to decrypt my data?

Similarly, my employer gave me a laptop with PGP encryption on the hard disk. I assume this relies on keys and reversible encryption functions so it can decrypt the data on my HD when I want to use the computer. But the key(s) must live on my computer, and I believe the rest of the encryption process is known or at least could be discovered. If someone gets a hold of my laptop and doesn't know my password, couldn't they get a hold of the keys and run my data through the same decryption steps that PGP uses?

Finally, I've played around with encrypted 7z files using 7-Zip. According to Wikipedia, it creates a key by running a password through SHA-256 524,288 times. But that still means that the password is the weak point and subject to a brute-force attack - an attacker could just generate all the keys for passwords up to 20 characters in length (starting with likely passwords), then try all the keys. Sure, creating all of those keys will take a long time. But it's a one-time operation, and then the attacker essentially has a rainbow table of keys that they can use elsewhere. Maybe I'm underestimating the amount of time that it would take to generate all of the keys and try them out, but since the hard work of generating the keys is a one-time operation, it seems that an attacker could throw a lot of resources at it and then re-use the keys in future brute-force attacks. So how is that secure, unless I have an insanely-long password?

This question doesn't have any practical application for me - I'm not trying to design a secure system. I'm just interested to know why these encryption mechanisms are considered secure. Maybe I'm underestimating the difficulty of these attacks, or maybe I don't understand how the keys are stored or generated.

*The specific app doesn't matter, but it's Touchdown by Nitrodesk.
posted by Tehhund to Technology (13 answers total) 4 users marked this as a favorite
For your phone app, it's very possible there's a device-specific ID/GUID that's hardwired into the device that they use as part of the key to encrypt the data stored on the SD card. No device, no decryption. That at least saves you from the "lost SD card" attack, and possibly saves you from a naive "rogue app shipped off all my data" attack that doesn't include the per-device ID.

"Generating all the keys for passwords up to 20 characters" is computationally infeasible. Even if the passwords are only lowercase alphabetic characters, and let's say it's 15 characters instead of 20, and let's say the output is only 1 byte per key, that's still about 1,700 exabytes of storage.
posted by 0xFCAF at 3:34 PM on May 16, 2012 [2 favorites]

Best answer:
an attacker could just generate all the keys for passwords up to 20 characters in length
There are over three thousand trillion trillion trillion possible passwords of length 20, using only the ASCII printable characters. And to be clear that doesn't count those of less than length 20.

If someone could magically generate keys for a trillion of them per second, it would take quintillions of years to get them all.

Disclaimer: maybe my math sucks
posted by Flunkie at 3:35 PM on May 16, 2012 [4 favorites]

In all your examples, the password is the point of security. It is the element hackers need to decrypt the data. If your password is short and guessable (e.g. in the dictionary, or a date), it can be brute-forced quickly.

There are 839,299,365,868,340,224 possible 10-character passwords composed of upper- and lower-case letters and digits. Trying 1,000 passwords a second, it would take over 26 million years to try them all. And a rainbow table would use more storage than exists on the planet.

Trying to brute force a 20 character password on a billion computers would take many times the age of the universe.

That's where the security lies. Also, some encryption schemes (like the one 7z uses) have a built-in "work factor", meaning they deliberately make turning a password into a key computationally expensive, so brute-force attacks take longer.
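(You can sanity-check these numbers yourself with a few lines of Python; the guess rates are illustrative:)

```python
# Back-of-the-envelope keyspace arithmetic, using only built-in math.
# 62 characters = 26 lowercase + 26 uppercase + 10 digits.

keyspace_10 = 62 ** 10                  # all 10-character alphanumeric passwords
print(keyspace_10)                      # 839299365868340224

# At 1,000 guesses per second:
years = (keyspace_10 / 1000) / (60 * 60 * 24 * 365)
print(round(years / 1e6, 1))            # 26.6 (million years)

# A 20-character password over the 95 printable ASCII characters:
keyspace_20 = 95 ** 20
print(f"{keyspace_20:.1e}")             # 3.6e+39 possibilities
```
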
posted by justkevin at 3:36 PM on May 16, 2012 [3 favorites]

You're on the right track with your thinking about 7zip: in each of these cases, what's happening is that the software is using your password/PIN to generate a private key; that key is used to decrypt the contents. So, when you enter your PIN for the encrypted email program, it's converted into a private key used to decrypt your email. Similarly, PGP takes your password, passes it through an algorithm, and uses the key out the other side to decrypt your data.

The term here is a key derivation function: it's a way of taking a shorter key and turning it into a much longer one (e.g. 256 bits for AES-256).

In each case, you're completely right that the weak point is the password. Breaking PGP itself will take many (many) orders of magnitude longer than brute-forcing the password. So the key to real security in all of these situations is picking an appropriately strong password.

Along these lines: if your email app really is using a 4-digit PIN as a lock, then the AES stuff is essentially bullshit. If that's the case, they're selling you an iron safe (AES) with a glass door (the PIN).
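(A rough sketch of both points, using PBKDF2 via Python's `hashlib` as a stand-in key derivation function; the salt and iteration counts are just illustrative:)

```python
import hashlib

# Turning a password into a 256-bit key suitable for AES-256.
password = b"correct horse battery staple"
salt = b"stored-alongside-the-ciphertext"   # would be random in practice
key = hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)
print(len(key) * 8)                          # 256 -- an AES-256-sized key

# Deterministic: same password + same salt always yields the same key,
# so the software never needs to store the key itself.
assert key == hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)

# The glass door: a 4-digit PIN can only ever produce 10,000 distinct
# keys, however strong the cipher behind it. (Tiny iteration count here,
# just so the demo runs quickly.)
pin_keys = {hashlib.pbkdf2_hmac("sha256", f"{i:04d}".encode(), salt, 10)
            for i in range(10_000)}
print(len(pin_keys))                         # 10000
```
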
posted by jacobian at 3:37 PM on May 16, 2012 [2 favorites]

I started to type justkevin's (and everyone else's) response, but he got there first. I'd add to that: once you decrypt the data with a candidate password, you still have to know whether the result is valid data or gobbledygook. In various historical cryptography challenges involving shorter messages, it mattered a great deal what language the message was in, so that you could scan candidate decryptions for patterns that looked like a real message.

All of this reiterates that, beyond the password itself, what you're trying to do is make trying all of the variations computationally expensive enough that nobody bothers. If an attacker has to generate a candidate password, derive a key, decrypt a file (or several files), and then examine the result to see whether it looks like real data, and you can make each attempt take a sizeable fraction of a second for the foreseeable future, that's the goal.
posted by straw at 3:42 PM on May 16, 2012 [2 favorites]

Best answer: So let's deal with these cases one by one.

Your smartphone likely picks a random encryption key that's used with AES-256, which is a symmetric cipher (not PKI). That key is encrypted with your PIN. So, no PIN, no key to decrypt. That said, if I can get malware on your device, I can log input, capture your PIN, and get your email. This is why most mobile devices have settings to wipe your device after N bad PIN entries: the security is all in the PIN.
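(The structure of that scheme can be sketched in a few lines. Real apps would wrap the key with AES; XORing against a PBKDF2-derived pad is a toy stand-in here, just to show that only the *wrapped* key is ever stored:)

```python
import hashlib, secrets

def derive(pin, salt):
    # PIN -> 32-byte wrapping key (iteration count illustrative)
    return hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, 100_000)

def xor(a, b):
    # Toy "encryption" of one 32-byte key under another -- NOT real key wrap
    return bytes(x ^ y for x, y in zip(a, b))

data_key = secrets.token_bytes(32)   # random key that actually encrypts the email
salt = secrets.token_bytes(16)
wrapped = xor(data_key, derive("4271", salt))   # only this ciphertext is stored

# Right PIN recovers the data key; wrong PIN yields garbage:
assert xor(wrapped, derive("4271", salt)) == data_key
assert xor(wrapped, derive("0000", salt)) != data_key
```
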

With your laptop, you're thinking of public key encryption, but you're missing a step. When you encrypt a message using a public key, what you're actually doing is selecting a session key, encrypting the message with the session key, encrypting the session key with the public key, and then sending the encrypted message and encrypted key to the other party. The private key is used to decrypt the session key, and the session key is used to symmetrically decrypt the message.
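(Here's that hybrid structure as a toy, using textbook RSA with absurdly tiny numbers -- wildly insecure, but it shows where the session key fits:)

```python
# Textbook RSA key pair with classic demo primes (never use numbers this small).
p, q = 61, 53
n, e = p * q, 17                       # public key: (n, e)
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent

session_key = 42                        # stand-in for a random symmetric key
encrypted_key = pow(session_key, e, n)  # "encrypt the session key with the public key"
# ... the message itself would be encrypted symmetrically under session_key ...

recovered = pow(encrypted_key, d, n)    # receiver uses the private key on the session key
print(recovered)                        # 42
```

The point of the design: the slow public-key operation only ever touches the small session key, while the bulk data gets the fast symmetric cipher.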

There are a couple of different ways to do whole disk encryption on your laptop. One way is to prompt you for a password at boot, which is used to decrypt things. It's pretty easy to understand how the password keeps things safe with this method. Another method, used by Microsoft's BitLocker and others, relies on a special chip in your lappy called a trusted platform module (TPM).

Encryption with the TPM chip works like this: the password gets stored in the chip, which you can't just read out of. On boot, the TPM chip runs a checksum on special files in the OS to make sure tampering (malware, rootkit) hasn't occurred. If it has, halt, no boot. If no tampering is detected, it releases the key, which is used to decrypt the disk and get things functioning.

Regarding 20 character passwords: you and others are not thinking about the problem correctly, or about what rainbow tables are actually used for. Storing a password is very insecure -- anyone who gets the file can read it. So we hash the password. You can use rainbow tables to attack these hashes -- precompute every hash for a given keyspace (e.g., all 8 character possibilities), and then do a lookup. We (the security community) know about this, though, so we also add a salt -- whenever we want to store a password securely, we generate a random value, combine it with the password in some way (normally concatenation, because this increases the size of the keyspace), and hash the result. A successful rainbow table against a salted password would require precomputing every key within the keyspace concatenated with every possible salt value. We're talking end of the Milky Way timescales here.
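(Salting in miniature, with plain SHA-256 for brevity -- real systems would use a slow, iterated hash on top of this:)

```python
import hashlib, os

def store(password):
    # New random salt per password; salt is stored in the clear next to the hash.
    salt = os.urandom(16)
    return salt, hashlib.sha256(salt + password.encode()).digest()

def verify(password, salt, digest):
    return hashlib.sha256(salt + password.encode()).digest() == digest

salt, digest = store("hunter2")
print(verify("hunter2", salt, digest))   # True
print(verify("hunter3", salt, digest))   # False

# Two users with the same password get different hashes, so a rainbow
# table keyed on the bare password matches neither:
salt2, digest2 = store("hunter2")
print(digest != digest2)                 # True (with overwhelming probability)
```
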

In your 7zip example, this isn't even useful. The 7zip password isn't stored anywhere. When you type in the password, 7zip decrypts the start of the file and checks for a 7zip header. No header == bad password. Also, the computational cost isn't in generating every possible password -- that's dirt cheap. The computational cost is in the decryption operation, and you still have to pay that in your example.

In the security field, whenever I or someone else is assessing a product, the encryption implementation is the first thing we look at. This is why software engineers are idiots if they don't use existing encryption libraries. They are _extremely_ well vetted, and because so many products use those libraries it's a _major_ event when someone finds an attack against a commonly used encryption lib. Basically, it makes a very sexy target to security folk.
posted by bfranklin at 3:57 PM on May 16, 2012 [4 favorites]

I should also note with the phone example that a smart engineer wouldn't store the decryption key on removable media like an SD card. They'd put it in the phone's onboard mem.
posted by bfranklin at 4:01 PM on May 16, 2012

Response by poster: These answers really help. Let me see if I can sum it up at a level that I can remember easily:

In any properly-implemented encryption scheme, the decryption key is not stored on the device (except maybe for TPM, which bfranklin pointed out as an unusual case). Instead, the decryption key is calculated from my password (or PIN) every time I want to access my data. Even though there is a deterministic mapping of passwords --> keys, with a sufficiently-strong password trying to brute force the key by starting with passwords is prohibitively difficult (i.e., computationally expensive).

Is that the take-away? Because if so, that exactly answers my question and makes a lot of sense. In addition to assuming that keys are stored locally, I assumed that the password --> key mapping limited the number of possible keys to a small enough number that generating all keys for very-long passwords would be feasible, but it sounds like that assumption was wrong.
posted by Tehhund at 4:29 PM on May 16, 2012

Actually, TPM is not particularly unusual. If anything, it's the gold standard at the moment.

Additionally, your wording is a little ambiguous. Brute forcing the key is expensive because decryption is expensive (checking that first encrypted block to see if the header is correct), not because deriving the key from the password is expensive.
posted by bfranklin at 5:17 PM on May 16, 2012 [2 favorites]

One thing to expand on with full-disk encryption: it's extremely unlikely that the encryption key is actually derived from your password. For one thing, this would mean that the entire disk would need to be decrypted and re-encrypted every time you change your password. bfranklin touches on this above, but a good example is Apple's FileVault 2 full-disk encryption in OS X Lion.

With FileVault 2, as I understand it, there's a single 128-bit encryption key for the entire encrypted volume. Lion lets you designate specific users who can boot the encrypted system volume, and each one has to enter their password when they're added to the list. It then stores a separate copy of the encryption key for each user on the disk, encrypted with that user's password (or a derived key). It also stores one additional copy of the key, encrypted with the "recovery key", a randomly-generated string of characters that Lion gives to you when you first encrypt the volume.

When the system boots, the boot loader asks for one of the authorized users to enter their password, at which point it uses the password to decrypt the volume encryption key, and then uses the key to decrypt the volume (or rather, to decrypt/encrypt any block that it needs to read or write) so the system can boot. Of course, you can also use the recovery key (which is not stored anywhere, although you can opt to let Apple keep a copy for you) to decrypt that extra copy of the volume key if needed.

Of course, whenever a user changes their password, the system needs to decrypt and re-encrypt the stored copy of the volume encryption key for that user, but that's a much smaller task than doing the same thing for the entire contents of the volume.
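(A toy model of that layout -- one volume key, several wrapped copies. As in the earlier example, XOR against a PBKDF2-derived pad stands in for real key wrapping, and all the user names and passwords are made up:)

```python
import hashlib, secrets

def kdf(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 50_000)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

volume_key = secrets.token_bytes(32)     # the one key that encrypts the disk
salt = secrets.token_bytes(16)

# One wrapped copy of the volume key per authorized unlocker:
stored = {user: xor(volume_key, kdf(pw, salt))
          for user, pw in [("alice", "al1ce!"),
                           ("bob", "b0b pw"),
                           ("recovery", "RANDOM-RECOVERY-KEY")]}

# Any authorized password unwraps the same volume key:
assert xor(stored["bob"], kdf("b0b pw", salt)) == volume_key

# Changing alice's password re-wraps her 32-byte copy -- not the whole disk:
stored["alice"] = xor(volume_key, kdf("new password", salt))
assert xor(stored["alice"], kdf("new password", salt)) == volume_key
```
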
posted by McCoy Pauley at 7:46 PM on May 16, 2012 [2 favorites]

Additionally, your wording is a little ambiguous. Brute forcing the key is expensive because decryption is expensive (checking that first encrypted block to see if the header is correct), not because deriving the key from the password is expensive.

this is exactly back to front. decrypting the data is computationally cheap - we want access to the data to be fast. deriving the key from your password is computationally expensive - you only do it once each time you access your data, so it doesn't matter if it takes a while, and if it's slow then it slows down brute force searches of the password space.

see PBKDF2 and key stretching.
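(you can see the stretching directly by timing PBKDF2 at different iteration counts - exact times will vary by machine:)

```python
import hashlib, time

# Key stretching in action: each guess pays the full iteration cost,
# but a legitimate user pays it only once per unlock.
password, salt = b"hunter2", b"demo-salt-1234"

for iters in (1_000, 100_000):
    t0 = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", password, salt, iters)
    print(iters, "iterations:", round(time.perf_counter() - t0, 4), "s")

# Raising the count 100x multiplies an attacker's per-guess cost 100x,
# while adding only a fraction of a second to a single legitimate unlock.
```
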
posted by russm at 2:23 AM on May 17, 2012 [2 favorites]

Response by poster: Thanks all! This has been very helpful, not to mention interesting.

This clarification was very helpful:

it's extremely unlikely that the encryption key is actually derived from your password... It then stores a separate copy of the encryption key for each user on the disk, encrypted with that user's password (or a derived key). It also stores one additional copy of the key, encrypted with the "recovery key"

I read a few things that mentioned encrypting keys with other keys, and I couldn't understand why that was useful. But now I see: encrypting the master key with each user's keys ensures that there's just 1 master decryption key. It also helps me understand how recovering a TrueCrypt volume works (by storing an additional copy of the master decryption key, but encrypted with my password) - I had originally shied away from TrueCrypt because I didn't understand how recovery works, but now I might be able to use it confidently (with good backups, of course).
posted by Tehhund at 5:05 AM on May 17, 2012

Indeed, you are right Russ. Oddly, I know that, I know I knew it last night, and I'm staring at my own comment wondering how the hell I wrote that. I blame pre-baby stress.
posted by bfranklin at 7:55 AM on May 17, 2012 [2 favorites]
