
What's the Secret Password?
April 27, 2007 4:51 PM

How exactly does authentication work in a website like Basecamp or, more generally, a site built on Rails or LAMP? When I sign up, I enter a username and password. I presume this is stored in a database table. But after that, how does the server know who I am during the course of my 'visits', and how does the SQL database know what I have access to (only my projects) and what I don't have access to (other people's projects)? Where do cookies, if at all, come into play? If cookies do come into play, can they not simply be forged? I am completely clueless regarding the subtleties of authentication, user sessions, and security. Please enlighten me.
posted by kaizen to Computers & Internet (20 answers total) 10 users marked this as a favorite
Yes, cookies come into play. The cookie will either have something like your username and (encrypted) password, which will be used to authenticate you, or a session id of some kind. Typically the values of the cookies are selected to be difficult to falsify: you'd have to know something "secret" to figure out what value to send for the cookie. Some (maybe most) cookies are vulnerable to being spoofed if an attacker can find out the value of your cookie.

Check out the cookies that you have stored on your computer some time. Firefox used to have an option to require you to accept every cookie given to you, checking that option if it still exists would be illuminating because you'd see every time that a site tries to give you a cookie, and what's in it.
posted by RustyBrooks at 5:00 PM on April 27, 2007

When you create an account, your username and password are stored in a database.

When you login to your account, you provide a username and password. If they match, a record is created in a sessions database table. The record contains your username, and a long random string which is created just then. Your browser is given a cookie, which contains that long random string.

When you request a webpage from the same site that you've logged-in to, your browser provides that cookie with the long random string (automatically). The webpage looks up the long random string in the sessions table, and if it finds the string you provided, it assumes you are the username associated with that same entry in the sessions table. And of course the webpage will give you access or not, to various things, depending on the privileges associated with your username.

Sometimes your log-in will time out after a while. All the site has to do is delete your entry from the sessions table, and presto, the long random string that your browser provides no longer works, and you need to log in again.

Since the string is long and random, other people can't just guess it - or at least, it's at least as hard to guess the random string as it is to just guess your password in the first place. If the whole website uses SSL (https), no one can eavesdrop on it.

There is no 100% standard way to do the above. Generally each web application does authentication slightly differently. But the basic steps above are always present.
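
A minimal sketch of the steps above, assuming an in-memory dict in place of the real users and sessions tables. The names USERS, SESSIONS, log_in, who_is and log_out are made up for illustration, and a real site would store a password hash rather than the password itself.

```python
import secrets

# Hypothetical stand-ins for the users and sessions database tables.
USERS = {"kaizen": "correct horse"}   # username -> password (a real site stores a hash)
SESSIONS = {}                         # long random string -> username

def log_in(username, password):
    """Return a new session token to put in the cookie, or None if login fails."""
    if USERS.get(username) != password:
        return None
    token = secrets.token_hex(32)     # the "long random string", 64 hex characters
    SESSIONS[token] = username
    return token

def who_is(token):
    """Look up the cookie value; None means 'not logged in'."""
    return SESSIONS.get(token)

def log_out(token):
    """Deleting the row is all it takes to end or time out a session."""
    SESSIONS.pop(token, None)

token = log_in("kaizen", "correct horse")
print(who_is(token))   # kaizen
log_out(token)
print(who_is(token))   # None
```

The browser only ever sees the token, so forging a login means guessing 256 random bits.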
posted by jellicle at 5:03 PM on April 27, 2007

(That is to say, in many authentication schemes, your cookie is all that is needed to authenticate you into the system).

An alternative authentication mechanism that is somewhat rare these days, but does exist, is done through the use of an authentication "header". The client sends a request to the server. The server notes that there is no auth header. The server sends back a response saying "sorry, I need an auth from you" and your browser pops up a window asking for user/password. When you enter it, it sends the auth string back to the server, and the server accepts or rejects it.
Here is a key to this scheme that most people miss: the client stores the auth string that you entered and sends it back to the server each time you visit a page, generally until you close the browser. Every request you send to the site has your username and password, whether the site needs it for this page or not. It's kind of crude but I've seen big websites use it.
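
To make the "auth string" concrete, here is a rough sketch of what a browser sends for HTTP Basic authentication: the username and password joined by a colon and Base64-encoded. The function names and the example credentials are invented for illustration. The second function shows that anyone who sees the header can reverse it without any key.

```python
import base64

def basic_auth_header(username, password):
    """Build the Authorization header value a browser sends for Basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

def decode_basic_auth(header_value):
    """Recover (username, password) from the header; no secret is needed."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic auth header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password

header = basic_auth_header("kaizen", "s3cret")
print(header)                     # Basic a2FpemVuOnMzY3JldA==
print(decode_basic_auth(header))  # ('kaizen', 's3cret')
```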
posted by RustyBrooks at 5:04 PM on April 27, 2007

Well, as far as knowing what projects you have access to: all of those projects will have some sort of UserID or owner field indicating who they "belong" to and who has access to them.

Authentication can be done in a lot of ways. Some sites will just store your password, or an encrypted version as a cookie. Those are easy to forge, but in order to do it, you have to know the password. And if you know the password, then you have access anyway.

Another way to do it is to give each user a session key as a cookie. A database lookup will retrieve all the information associated with that session.

If you want extra security, you can tie the session to a specific IP address, so if someone copies it, they won't be able to use it unless they're on the same machine or accessing the site through the same proxy.
posted by delmoi at 5:04 PM on April 27, 2007

how does the server know who I am during the course of my 'visits' and how does the SQL database know what I have access to ( only my projects ) and what I don't have access to ( Other peoples projects)?

As others have said, two different security principles are in play here. First of all, people are mentioning that "cookies have either your encrypted password, or a session ID," as if this is a neutral choice. Storing an encrypted password is bad, bad, BAD design (sorry Matt); session IDs are seriously preferable if you're designing a new app. Yes, they technically can be forged, but a session is typically associated with an IP, and each new session will send a new pseudo-random hash to the client, so you can't forge with an expired session cookie. Since you're interested in PHP, see more about the $_SESSION variable and the session functions.

Controlling access to "my" projects is based on database design. If only certain users are supposed to have access to certain objects (say, projects) in a database, the database will have tables to represent "users" and "projects" as well as, say sub-units of a project called "tasks." Since each project is owned by a user, each will be linked to precisely one user; since each task is associated with one project, a task gets its user link from its project link. Read up on database design to make sure you get all the right links / foreign keys in place to ensure security.
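
A small sketch of that design using SQLite from Python, assuming made-up table names, column names and sample rows. The point is only that every query for "my" data is filtered by the owner link, with the user ID coming from the session rather than from anything the browser can edit.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE projects (id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                           owner_id INTEGER NOT NULL REFERENCES users(id));
    CREATE TABLE tasks    (id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                           project_id INTEGER NOT NULL REFERENCES projects(id));
    INSERT INTO users    VALUES (1, 'kaizen'), (2, 'rkent');
    INSERT INTO projects VALUES (10, 'Site redesign', 1), (20, 'Tax return', 2);
    INSERT INTO tasks    VALUES (100, 'Pick colours', 10), (200, 'Find receipts', 20);
""")

def tasks_for(user_id):
    """Return only the tasks whose project is owned by the given user."""
    rows = db.execute(
        """SELECT tasks.title
             FROM tasks JOIN projects ON tasks.project_id = projects.id
            WHERE projects.owner_id = ?
            ORDER BY tasks.id""",
        (user_id,),
    )
    return [title for (title,) in rows]

print(tasks_for(1))   # ['Pick colours']
```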
posted by rkent at 5:24 PM on April 27, 2007 [1 favorite]

If the web app was competently written, it will never store the user's password verbatim, just a one-way hash (such as SHA-1) of the password. By doing it that way, it is still possible to verify that the user specified the correct password, but impossible for the server admin (or a malicious user who gets access to the DB through a flaw in the code) to look at the DB and see the passwords of every user.
posted by Rhomboid at 5:25 PM on April 27, 2007

Whoa, RustyBrooks! "HTTP Authentication" (which is what you're talking about) is no more crude or rare than cookie-based authentication! (Part of what I do for a living is HTTP auth and security, so excuse my petulance.)

You know how when you visit some sites, your browser pops up a dialog asking you for a username and password? Not a text entry field in a web page, but an actual browser dialog. That's HTTP Auth in action.

(The difference between using cookies and HTTP Auth is that a cookie is completely up to the server to interpret; HTTP says nothing about it. It's up to PHP or whatever to deal with it. HTTP Auth is part of the protocol, so Apache deals with it. Sending auth credentials with every request is just as crude as sending every cookie with every request. I feel "real" HTTP authentication is better than cookies, because I know it's not stored on disk and disappears when I quit the browser.)
posted by phliar at 5:36 PM on April 27, 2007

It is possible to use session ID information without cookies. What I describe below also doesn't imply HTTP authentication, either. You can just include the session ID's in all internal site links on a generated page; when the user clicks a link, that link includes, as part of the URI, a session ID.

Those long, complicated links include (I believe) a session ID. Even if you have cookies turned off, you can fill a shopping cart and complete an Amazon order (unless this has changed recently).

Cookies are a good backup for this kind of system; if, for example, a user leaves and looks at other sites for a while, then loads a page, Amazon's site/server can detect a cookie and use the same session ID again. Without stored cookie information, loading any page without clicking on a link in a previously-generated page (which would have those special URIs with session IDs included in them) would lead to a fresh, new session being created, and old session information being lost.

Using both the URL-encoded session ID _and_ the cookie-encoded session ID is a good way to ensure that session/identity information is preserved even if a) the web browser does not have cookies enabled (admittedly rare, but still possible), and/or b) the user leaves a site and returns to it later.
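
Roughly, the link rewriting and the cookie fallback could look like the sketch below. The parameter name "sid" and both function names are assumptions for illustration, not what Amazon or any particular site uses.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def add_session_id(url, session_id, param="sid"):
    """Return url with the session ID added (or replaced) in its query string."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))

def session_id_from(url, cookie_value=None, param="sid"):
    """Prefer the ID carried in the URL; fall back to the cookie if there is none."""
    query = dict(parse_qsl(urlsplit(url).query))
    return query.get(param) or cookie_value

link = add_session_id("/cart/view?item=42", "abc123")
print(link)                                      # /cart/view?item=42&sid=abc123
print(session_id_from("/cart/view", "abc123"))   # abc123
```

Keep in mind that a session ID in the URL is visible wherever the URL ends up (printouts, history, logs), which a later answer in this thread lists as a real-world failure.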
posted by amtho at 6:07 PM on April 27, 2007

Incidentally, this is one of those programming problems that people invariably solve in the wrong way, repeatedly, until they converge on the right answer as given above.
posted by smackfu at 6:31 PM on April 27, 2007

Incidentally, this is one of those programming problems that people invariably solve in the wrong way, repeatedly, until they converge on the right answer as given above.

Amen brother!

Just as a contrast to "how does it work", I've just done a couple of security assessments where the organizations have done pretty much everything wrong you can imagine (from a security perspective), so just for kicks these are some examples I've seen recently of things going wrong with Web Authentication and Authorisation.

1. Login allowed (and on some sites default) over HTTP rather than SSL/TLS. (Fine for your blog, not so much for companies handling sensitive data.) I don't care if you have a pretty lock icon next to the login form field.

2. Basic Mode Authentication over HTTP to devices on the belief that Base64 is "encryption".

3. Allowing enumeration of valid web app users by inference from a "password is incorrect" error message versus a "username is not known" error message.

4. Interestingly, by using 3., finding that roughly 15% of enumerated accounts have passwords set to their user name or "password". Guessing there is no password strength system in many applications.

5. Inserting the username without input validation on the "username is not known" error page - resulting in cross site scripting ability of the login screen.

6. Even better than 5., having an account registration sign-up page which doesn't validate input *and* displays the sign-up information back to administrators (for them to authorize access), allowing the placement of a persistent cross site scripting attack for the administrators without requiring them to do something dumb (like click on an email link etc). Insert AJAX keystroke logger or session cookie capture script here.

7. Having a Javascript based navigation system that discloses the "hidden" menu options (sort of "if user=admin then show this link to secret admin screen" logic). Doesn't take a criminal genius to just type these in and - yip now I'm the admin. Or of course you can just brute force page names and find it with DirB.

8. Being all pleased that SSL is used, then having HTTP based assets on the same server, which of course discloses the session ID in the clear because no one made the cookie an SSL only/secure cookie.

9. Being all pleased that SSL is used then having the session ID in the URL (for everyone to see over your shoulder, for you to include as the footer whenever you print a page, and to show up nicely in your Internet Cache). Extra marks in this case because sessions of this application were expected to stay open for the entire work day without cookies expiring, so if you printed off a page for a work colleague (URL at bottom of page) you were in effect saying "here is my session ID - go become me".

10. Leaving WS_FTP.log and CVS entries files on servers (effectively giving directory listings) so we can find the *super secret* configuration files with administrative usernames and passwords.

11. Having a /admin directory with no authentication

12. Locking down the web application but leaving the JBoss/Tomcat/WebSphere/whatever admin screen accessible to the real world (with weak or default credentials)

I'm sure there were others (SQL injection in login form etc) - but I'm done ranting. The OWASP is your friend.
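
A few of the items above have short fixes. This is a hedged sketch only, with invented function names and messages: one error message for both failure cases (item 3), HTML-escaping anything echoed back to the page (item 5), and marking the session cookie Secure and HttpOnly so it is not sent over plain HTTP or readable from script (item 8).

```python
import html
from http.cookies import SimpleCookie

# Same wording whether the username or the password was wrong (item 3).
LOGIN_FAILED = "Incorrect username or password."

def login_failed_page(username):
    """Echo the submitted name only after HTML-escaping it (item 5)."""
    return f"<p>{LOGIN_FAILED} You entered: {html.escape(username)}</p>"

def session_cookie_header(session_id):
    """Build a Set-Cookie value the browser will only send over HTTPS (item 8)."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["secure"] = True      # never sent over plain HTTP
    cookie["session"]["httponly"] = True    # not readable from page scripts
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()

print(login_failed_page("<script>alert(1)</script>"))
print(session_cookie_header("abc123"))
```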
posted by inflatablekiwi at 7:53 PM on April 27, 2007 [6 favorites]

Authentication and security are really orthogonal issues that intersect in the typical use case of controlling access to privileged resources in web applications. Authentication is handled by storing a user’s name and password in a database, then matching these against user input at login. So that login only needs to happen once per session, a cookie is usually set in the browser, containing some identifier which should be unforgeable and unguessable.

Security is usually managed by an Access Control List (ACL), which is essentially a table containing mappings from user accounts or groups to actions on privileged resources. ACL security has been the de facto standard for decades, but it does have problems, and there are other options.
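
As a toy illustration of such a table, assuming made-up users, groups and resource names:

```python
# Hypothetical ACL: (user or group, resource) -> set of permitted actions.
ACL = {
    ("kaizen", "project:10"): {"read", "write"},
    ("staff", "project:10"): {"read"},
}
GROUPS = {"rkent": {"staff"}}   # user -> groups the user belongs to

def is_allowed(user, action, resource):
    """True if the user, or any group the user is in, is granted the action."""
    principals = {user} | GROUPS.get(user, set())
    return any(action in ACL.get((p, resource), set()) for p in principals)

print(is_allowed("kaizen", "write", "project:10"))  # True
print(is_allowed("rkent", "write", "project:10"))   # False
print(is_allowed("rkent", "read", "project:10"))    # True
```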

Capability security is a model in which a user may gain access to any resource she can name. For example, a privileged resource in a web application may be addressed by a URI containing an unforgeable, unguessable key. Any user who knows that key may access the resource, so security is maintained by only telling users the URI of resources to which they should have access.

Capability security has a number of implications, including: 1) Authentication is not strictly necessary in a capability model, because it is sort of implicit. 2) There is nothing to prevent a user from sharing a privileged URI. (Note that in traditional models, there is nothing to prevent a user from sharing their password.) 3) The ACL model can be implemented in terms of capabilities, but not vice versa. 4) In a web-based environment, all requests should be made via SSL, because it is the URI itself which contains the privileges and should be protected.

A while back, I came across a web-based, collaborative text editing application that used the capability model, but I cannot remember where it was now… There was no need to create a user account to use the application. You simply request a URI for a new document editor, and then share the URI with others whom you want to be able to edit the document. In the document editor, you can also request a new "read-only" URI, which can be shared with others whom you want only to be able to read the document.
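
Here is a rough sketch of how an editor like that might hand out capability URIs. It is not how that application actually works; the host name, the dict-based storage and all function names are placeholders.

```python
import secrets

BASE = "https://example.invalid/doc/"   # placeholder host, not a real service
DOCUMENTS = {}      # doc_id -> text
CAPABILITIES = {}   # unguessable key -> (doc_id, can_write)

def _grant(doc_id, can_write):
    key = secrets.token_urlsafe(32)     # 32 random bytes, URL-safe text
    CAPABILITIES[key] = (doc_id, can_write)
    return BASE + key

def new_document():
    """Create an empty document and return its editing URI. No account needed."""
    doc_id = len(DOCUMENTS) + 1
    DOCUMENTS[doc_id] = ""
    return _grant(doc_id, can_write=True)

def read_only_uri(uri):
    """Mint a weaker, read-only URI for the same document."""
    doc_id, _ = CAPABILITIES[uri[len(BASE):]]
    return _grant(doc_id, can_write=False)

def write(uri, text):
    doc_id, can_write = CAPABILITIES[uri[len(BASE):]]   # KeyError: unknown key
    if not can_write:
        raise PermissionError("this URI is read-only")
    DOCUMENTS[doc_id] = text

def read(uri):
    doc_id, _ = CAPABILITIES[uri[len(BASE):]]
    return DOCUMENTS[doc_id]

edit_uri = new_document()
view_uri = read_only_uri(edit_uri)
write(edit_uri, "hello")
print(read(view_uri))   # hello
```

Knowing the URI is the whole permission check, so there is no user table at all.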
posted by ijoshua at 8:02 PM on April 27, 2007

Ah, here it is: WideWord.

Here is a test document that I’ve created:


Note the URIs in those links. They use the SSL protocol, and the capability “key” is the last part: that long string of seemingly random letters and digits. That key would be very nearly impossible to forge or guess.
posted by ijoshua at 8:21 PM on April 27, 2007

The above stuff is basically right (inflatablekiwi in particular nails the ways this goes wrong), but I want to step back a little and provide some higher level context.

The basic problem here is that you want a way for clients to prove their identity using some sort of credentials. For your house, your credentials are your key - on the web, credentials can be lots of different things. Username / password combinations are the most obvious example, but most of the systems described above use a combination because (obviously) you don't want to provide your username and password for each page.

So the issue is, how do you issue a new credential to users when they log in? You want this new credential to have a few properties: The session_id system is one way to do it, but it's not so great for performance. Having to maintain a list of extant sessions is sort of annoying. The basic strategy that I use (described at great length in this very excellent paper on this topic, along with real-life examples of how weak security is on most websites) is to put a user-id and expiration date in cleartext in the cookie, as well as a hash of those two values + some random string known only to the server (called the "salt"). When you submit that cookie, the server can verify it's legit by taking the first half (in clear text), adding the salt, and hashing it. If the result matches the hash in the cookie, that cookie was generated by the server and you can trust that the values in clear text are legitimate. Otherwise, the cookie has been tampered with and you should force the user to present a username and password. This makes the cookie validation process super easy: you don't need to look up information anywhere else.
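
A sketch of that cookie scheme, with one substitution: it uses HMAC-SHA-256 rather than a plain hash of values + secret, since HMAC is the standard way to key a hash. The secret value, the "|" separator and the function names are assumptions, and user IDs are assumed not to contain "|".

```python
import hashlib
import hmac
import time

SERVER_SECRET = b"change-me-and-never-send-me-to-the-client"   # hypothetical value

def _sign(payload):
    return hmac.new(SERVER_SECRET, payload.encode("utf-8"), hashlib.sha256).hexdigest()

def make_cookie(user_id, lifetime_seconds=3600, now=None):
    """Return 'user_id|expires|signature'; the first two parts are clear text."""
    issued = now if now is not None else time.time()
    payload = f"{user_id}|{int(issued) + lifetime_seconds}"
    return f"{payload}|{_sign(payload)}"

def check_cookie(cookie, now=None):
    """Return the user ID if the cookie is genuine and unexpired, else None."""
    payload, _, signature = cookie.rpartition("|")
    expected = _sign(payload)
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None                     # tampered with, or not issued by this server
    user_id, _, expires = payload.partition("|")
    current = now if now is not None else time.time()
    if int(expires) < current:
        return None                     # expired: ask for username and password again
    return user_id

cookie = make_cookie("42")
print(check_cookie(cookie))                            # 42
print(check_cookie(cookie.replace("42|", "43|", 1)))   # None
```

No sessions table is consulted, which is the performance point being made. The trade-off is that a single cookie cannot be revoked before its expiry date.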

Whew. Hope that helps.

PS - this is kind of a tangent, but well designed websites never ever store your password in a database. Instead, they store a hash of your password. Hashes (which I mentioned earlier, but dunno if you're familiar with) are basically one-way functions. They're very easy to calculate going forward (totally made up example: "foo" turns into "ab3534cef53") but there's no way given the result to go backwards (i.e. given "ab3534cef53", it's practically impossible to figure out that it was originally "foo"). This makes sure that if someone breaks into their database, all they'll see is your hashed password, which is useless. Also, very good databases will salt passwords, like I mentioned for cookies, to prevent rainbow table attacks. I think that's pretty rare, though.
posted by heresiarch at 9:26 PM on April 27, 2007 [1 favorite]

phliar: the basic http authentication that browsers use is entirely unencrypted, it's your password being sent to every page in the clear, requested or not. It's bad. And I say this as a person who works for a company that uses it as their means of authentication. With cookies you at least have the opportunity to use some more sophisticated methods of authentication, like those described above. And it is uncommon. I'd say 90% of the sites that have logins do it via a specialized form and cookie combination, not the built in http auth. Some of them are also bad, some of them are good.
posted by RustyBrooks at 10:47 PM on April 27, 2007

I've said it before in response to other AskMe questions, and I'll say it again: while theoretical good practices say that tying a session to an IP address is more secure, it's also true that it then limits certain users from being able to use your website. There are far, far more edge proxies out there -- AOL being the biggest, but many large companies also being examples -- that proxy each and every request made by users of the internal network through a random specific proxy in a pool of available ones. What that means is that requests from a single user behind one of these network configs appear to come from multiple IP addresses -- the IP addresses of each proxy -- and the session breaks. I spent weeks debugging this once, and while I don't like that it's a reality, it is one.
posted by delfuego at 8:51 AM on April 28, 2007

The session functionality in modern versions of PHP makes authentication exceptionally simple. My website, for example, currently uses a single session_start(); statement to track user status, and only fifteen additional lines of code to handle sign in/out requests.

And even fewer lines of code if I followed established protocols for code formatting.

And though my authentication code does not currently use MySQL, even that addition (assuming a closed registration process) would add no more than four or five net lines of code.
posted by The Confessor at 9:25 AM on April 28, 2007

Just to reiterate: please sweet God in heaven do not store literal copies of users' passwords in the database. This is what Reddit did, and now some happy hacker has all their users' passwords.

Use a secure hash function, like SHA-2, and use a random salt to prevent rainbow table attacks.
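
For what that can look like in practice, here is a sketch using Python's standard library. It goes one step beyond a single salted SHA-2 hash by running PBKDF2 (many rounds of HMAC-SHA-256), which makes brute-force guessing slower. The iteration count and function names are assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000   # assumed work factor; tune for your hardware

def hash_password(password):
    """Return (salt, digest) to store in the users table instead of the password."""
    salt = os.urandom(16)   # fresh random salt per user defeats precomputed tables
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password, salt, digest):
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password("hunter2")
print(verify_password("hunter2", salt, digest))   # True
print(verify_password("hunter3", salt, digest))   # False
```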
posted by Coda at 9:56 AM on April 28, 2007


13. Doing everything else right, but using a logging level that records the form variables, so the username/password end up available in plaintext in the log even though it's not stored that way on the DB.
posted by smackfu at 2:10 PM on April 28, 2007

For what it's worth, the products from 37Signals do not seem to encrypt the passwords at all. I haven't used any of their paid services, but the couple of free services have sent me my password by email in plain text, which leads me to believe that there is no hashing of the password stored in the database. This is a security risk. I only point it out because you specifically referred to one of their products in your question.
posted by madman at 3:13 PM on April 28, 2007

madman, it is possible to store passwords in a recoverable ciphertext. A keyed, salted, symmetric cipher algorithm is probably secure enough for most web applications, as long as the cipher key is kept secret.
posted by ijoshua at 7:13 AM on April 29, 2007
