A Guide to Web Authentication Alternatives

2. Authentication Options Supported by HTTP Servers and Browsers

The HTTP protocol standards to which all HTTP server programs and web browsers are expected to conform (RFC1945, RFC2068, RFC2069) define two authentication methods, "basic authentication" and "digest authentication."

2.1. Basic Authentication

"Basic authentication" is supported by essentially all HTTP server daemons and web browsers. (At least, I've never encountered one that doesn't support it, and don't hope to.) It is thus by far the most widely used authentication method.

2.1.1. How Basic Authentication Works

To use basic authentication, you must configure your HTTP server daemon to know that certain documents require authentication to access. How to do this varies with different HTTP servers. We will describe briefly how this is done with Apache. Most other commonly used Unix servers work similarly.

First, all documents to which access is to be restricted are placed in some common directory under your server's document root. That directory (and all beneath it) can be configured either by placing commands in a file named .htaccess that resides in that directory, or by placing the same commands in an appropriate <Directory> block in the global configuration file. The directives will be the same in either case, giving at least the following information:

Authorization Realm Name - Some label which identifies which service this authorization is for.
User Database Name - This describes where the database of valid users and user passwords is stored. (IMPORTANT: It should not be stored anywhere under the server's document root, since any data there could possibly be viewed by the user, and you don't want people viewing your password database.)
Restricted Operation - A list of which kinds of HTTP transactions authentication is required for.

Now, suppose the user attempts to access an HTML page or a CGI program which is stored in (or below) the directory. The sequence of events defined by the HTTP protocol specification for basic authentication goes like this:

(1) The user's browser sends an HTTP request to the server, asking for the page.
(2) The HTTP daemon on the server starts up. It notices that the request is for a file in a restricted directory, so it checks the configuration for that directory. It discovers that authentication is required to access files in this directory, and it sees that no authentication was included in the request. It therefore rejects the request with a response that says authentication is required and gives the authentication realm name that was in the .htaccess file or <Directory> block.
(3) The user's web browser receives the rejection. It searches its memory to see if it has a login/password combination saved for that authentication realm name on that server. If it does not, it pops up a dialog box, asking the user for a login and password to use for that authentication realm name at that server. If the user supplies them, it stores them in its memory, so it will have them in the future.
(4) Now that the browser has a login and password to use for this particular realm at this particular site, it resends the same request that it previously sent in step (1), but this time it includes the login and password.
(5) The HTTP daemon on the server starts up again to process the new request, just as in step (2). It again discovers and reads the .htaccess file or <Directory> block, and again discovers that an authentication is required. But this time, it finds a login and password included in the request.
(6) The HTTP server daemon then looks up the login name in its user database, and checks the password that was in the request against the one it has stored in its database. If the account doesn't exist or the passwords don't match, the HTTP server daemon sends a rejection back to the browser (which normally responds by asking the user again for a new login and password to try).
(7) If they do match, the HTTP server daemon may make some further checks (it is possible to permit documents only to some groups of users). If these fail it may still reject the request.
(8) If all access tests succeed, and the target is an HTML page, the HTTP server transmits the page. If the target is a CGI program, then it runs the CGI program, passing the user name (but not the password) to the CGI program in the REMOTE_USER environment variable. This lets the CGI program know who the user is.
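The server's side of this exchange can be sketched in a few lines of Python. This is only an illustration of the sequence described above, not how Apache actually implements it. The user database, the realm name, the function name handle_request, and the use of unsalted SHA-256 for stored passwords are all assumptions made for the example.

```python
import base64
import hashlib
import hmac

# Hypothetical user database: login -> SHA-256 hex digest of the password.
# A real server would use a salted, purpose-built password hash.
USERS = {"alice": hashlib.sha256(b"wonderland").hexdigest()}
REALM = "Members Area"  # assumed realm name


def handle_request(headers):
    """Return (status, response_headers, remote_user) for a protected page."""
    challenge = {"WWW-Authenticate": 'Basic realm="%s"' % REALM}
    scheme, _, payload = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "basic" or not payload:
        # No credentials in the request: reject and name the realm.
        return 401, challenge, None
    try:
        decoded = base64.b64decode(payload, validate=True).decode("utf-8")
    except ValueError:
        # Malformed base64 or text: treat it like missing credentials.
        return 401, challenge, None
    login, _, password = decoded.partition(":")
    stored = USERS.get(login)
    supplied = hashlib.sha256(password.encode("utf-8")).hexdigest()
    if stored is None or not hmac.compare_digest(stored, supplied):
        # Unknown account or wrong password: reject again.
        return 401, challenge, None
    # Success: the login (never the password) becomes REMOTE_USER.
    return 200, {}, login
```

The first call with no Authorization header yields the 401 challenge, and the browser's second request, carrying the header, yields 200 along with the user name.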

Theoretically this whole process would be repeated for each new page accessed. The first request for each page would be rejected, and only the second request for each page would include the login and password. In practice, most browsers seem to try to guess which pages will need authentication based on past experience, and will send the login and password on the first request if they have one cached. Mostly this is invisible to the user, because only for the first page is the user actually asked to enter a login and password. On later pages, the browser automatically resends its saved copy.
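One plausible form of that guessing is sketched below. It assumes the browser treats every path at or below the directory of an already authenticated page as belonging to the same realm, which is the heuristic the HTTP authentication specification suggests to clients. The class and method names are hypothetical, and real browsers may well behave differently.

```python
class CredentialCache:
    """Toy model of a browser's in-memory store of logins and passwords."""

    def __init__(self):
        self._by_realm = {}  # (host, realm) -> (login, password)
        self._spaces = []    # (host, path_prefix, realm)

    def remember(self, host, path, realm, login, password):
        """Record credentials that worked for `path` on `host`."""
        self._by_realm[(host, realm)] = (login, password)
        # Assume everything in the same directory (or deeper) is protected too.
        prefix = path.rsplit("/", 1)[0] + "/"
        self._spaces.append((host, prefix, realm))

    def for_challenge(self, host, realm):
        """Credentials to use after a rejection that names `realm`."""
        return self._by_realm.get((host, realm))

    def guess(self, host, path):
        """Credentials to send on the first request, or None if no guess."""
        for known_host, prefix, realm in self._spaces:
            if known_host == host and path.startswith(prefix):
                return self._by_realm[(known_host, realm)]
        return None
```

Because the guess is keyed on the host and a path prefix rather than on a realm name the server has actually sent, this is also where a wrong guess could occur if two realms overlap on one server.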

I wonder if there might be some circumstances, with multiple different authentication domains on the same server, where the browser might guess wrong and send a login and password from one authentication domain with a request to another domain. I haven't confirmed this, but I suspect the risk is minimal.

2.1.2. Advantages and Disadvantages of Basic Authentication

From a security point of view, basic authentication is not very satisfactory. It means sending the user's password over the network in clear text for every single page accessed (unless a secure lower-level protocol, like SSL, is used to encrypt all transactions). Thus the user is very vulnerable to any packet sniffers on the net. (The login and password are base64-encoded in the request, but that encoding is trivially decodable.)
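To see how little protection the encoding offers, here is a short Python sketch. The function names are made up for the example, and the login and password are the sample values used in the HTTP specifications.

```python
import base64


def encode_basic(login, password):
    """Build the Authorization header value a browser would send."""
    token = base64.b64encode(("%s:%s" % (login, password)).encode("utf-8"))
    return "Basic " + token.decode("ascii")


def sniff_basic(header_value):
    """Recover the login and password from a captured header value."""
    _, _, token = header_value.partition(" ")
    decoded = base64.b64decode(token).decode("utf-8")
    login, _, password = decoded.partition(":")
    return login, password
```

Anyone who captures the header can call something like sniff_basic on it. No key or secret is involved at any point.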

From an efficiency point of view it is also less than impressive. If the browser isn't clever enough to guess that authentication information will be needed when sending the original request, then each page must be requested twice, so that the ordinary delays due to net latency are doubled. On pages with frames, where the top-level frame and the contents of each subframe must be fetched separately, this may require two requests for each frame (note that if some of the frames don't include sensitive information you may improve performance by making those frames available without authentication).

The work required by the HTTP server is also significant. It has to process each request twice, rejecting it once, and then accepting it. It also has to look up the user in the user database again for every new page accessed. If the user database is large, this can be very slow.

A well-designed HTTP server can reduce some of these problems. It can cache information about what access is needed for which files, so that repeated requests can be processed more quickly. It can use user databases designed for rapid lookups, such as the hashed dbm files supported by Apache. Still, the protocol is fundamentally cumbersome enough that it must be slow.

Another important problem with basic authentication as currently implemented is that there is no way for a user to log off without exiting the browser. Suppose you have been accessing a site requiring authentication through your web browser. Even if you have moved on to a different site, your web browser still has your login and password cached. So if you leave the computer and someone else tries to access that site, that person will get on right away, without being asked for a login and password. On most operating systems another person wouldn't easily be able to see your password, but as long as your browser is running they could continue to use it.

Neither Netscape nor Internet Explorer offers the user any way to flush authentications out of memory short of exiting the browser. Many web sites suggest to users that they should "exit all browser windows" after they are finished to make sure they are logged off. Even users who remember to do this may fail to notice all windows, including minimized and iconified ones. On a Macintosh, exiting all browser windows isn't even sufficient - the browser doesn't automatically exit when all windows are closed. Most browsers have a command menu option to actually exit the browser (closing all windows) but it takes some user education to teach people to actually use this. Thus there is a very large chance that users will fail to effectively log out when leaving their computer. This can be a significant security problem in environments where users commonly share computers. It's probably reason enough, by itself, to avoid using basic authentication for applications with very sensitive data and large numbers of untrained users.

Finally, many web sites prefer not to use basic authentication because they want more control over the appearance of the login screen. With basic authentication, you get whatever ugly little login box that the browser chooses to pop up. Usually the only text in this box that you have any control over is the authentication realm name (some sites try to jam all sorts of information into that).

2.2. Digest Authentication

Digest authentication was added to the HTTP standard to provide a method of authenticating users without sending passwords over the network in clear text. This fixes the major security weakness in basic authentication.

Digest authentication, however, has only recently begun to catch on. Apache's web server has long included support for it, but until recently the only browser that implemented it was W3C's reference browser, Amaya. Now support for it has appeared in Internet Explorer 5.0, Mozilla 0.9.7, Netscape 7, Opera 4.0, and Safari 1.0. That's pretty much all current browsers. Microsoft's IIS 5.0 server also supports it.

There have, however, been some problems with compatibility of different digest authentication implementations. Not all browsers work with all servers. When Microsoft finally implemented it, they implemented it differently ("incorrectly" would probably not be too harsh), so IE browsers would work with IIS servers, but no other browsers would work with IIS, and IE wouldn't work with other servers. Apache servers 2.0.51 and later have an "AuthDigestEnableQueryStringHack" setting that can be turned on to work around this problem, allowing IE browsers to be used with Apache.

So, almost a decade after the standard was introduced, it is now just becoming practical to use digest authentication on a web site targeted to the general public.

2.2.1. How Digest Authentication Works

For the most part, digest authentication works just like basic authentication. The browser requests a page, which is rejected. But the rejection message is a bit different, in that it says a digest authentication is required and also gives a string called a "nonce," which is some string (generally based on the time of day and the IP address of the requester) which is different for each request made.

As with basic authentication, the browser gets a password (either from the user or from its cache memory). Instead of just sending that information, the browser does the following:

Concatenates the user name, the authentication realm name and the password, and then computes an MD5 checksum of that whole string.
Concatenates the URL requested and the method for the request, and then computes an MD5 checksum of that string.
Concatenates the two previous checksums with the "nonce" string supplied by the server, and then computes a third MD5 checksum of that string.

The checksum resulting from the last step is sent with the request for the new page, as are the clear text of the login name and the nonce value.
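The three checksums can be sketched in Python as follows. The colon separators, and the ordering of method before URL and of the nonce between the two checksums, follow my reading of RFC 2069. The description above does not spell them out, so treat them as assumptions. The function names are made up for the example.

```python
import hashlib


def md5_hex(text):
    """MD5 checksum of a string, as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(login, realm, password, method, uri, nonce):
    """Compute the value the browser sends in place of the password."""
    # First checksum: user name, realm name and password.
    ha1 = md5_hex("%s:%s:%s" % (login, realm, password))
    # Second checksum: request method and requested URL.
    ha2 = md5_hex("%s:%s" % (method, uri))
    # Third checksum: both earlier checksums combined with the server's nonce.
    return md5_hex("%s:%s:%s" % (ha1, nonce, ha2))
```

Since the nonce is part of the third checksum, the same password produces a different value whenever the server issues a new nonce.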

MD5 is an algorithm that takes text strings of arbitrary length and generates a 16 byte checksum. It is designed so that if you are given only an MD5 checksum, it is extremely difficult to find a block of text that would result in that checksum. It can be considered a one-way encryption algorithm.

When the server receives this new request, it looks up the user name in its password database and gets the user's real password. It then computes the same three checksums above, using that real password. If the result is the same as the one the browser sent, then the user either supplied the correct password, or they got very lucky and found another password that has the same checksum (one chance in 340,282,366,920,938,463,463,374,607,431,768,211,456). (Actually, the server should store the result of the first MD5 checksum in the password database instead of storing the clear-text password - this saves computation, and, more importantly, protects the user's password).
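A sketch of the server's check, assuming the password database stores the first checksum rather than the clear-text password, might look like this in Python. The database layout, the sample account, and the function names are hypothetical, and the colon-separated format is again my reading of RFC 2069.

```python
import hashlib
import hmac


def md5_hex(text):
    """MD5 checksum of a string, as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical password database: (login, realm) -> stored first checksum.
PASSWORD_DB = {("alice", "Members"): md5_hex("alice:Members:wonderland")}


def verify(login, realm, method, uri, nonce, response):
    """Return True if `response` matches what the stored checksum predicts."""
    ha1 = PASSWORD_DB.get((login, realm))
    if ha1 is None:
        return False  # no such account in this realm
    ha2 = md5_hex("%s:%s" % (method, uri))
    expected = md5_hex("%s:%s:%s" % (ha1, nonce, ha2))
    # Constant-time comparison of the two hexadecimal strings.
    return hmac.compare_digest(expected.encode("ascii"),
                               response.encode("utf-8"))
```

The odds quoted above follow from the checksum size: an MD5 checksum is 16 bytes, or 128 bits, and 2**128 evaluates to exactly 340,282,366,920,938,463,463,374,607,431,768,211,456.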

The digest authentication standard includes some other features. It allows for an MD5 checksum of the entire request or response to be included, enabling the server or browser to detect if their messages have been tampered with somewhere on the net between them. Also it allows the server to specify which other pages the same authentication can be used for.

2.2.2. Advantages and Disadvantages of Digest Authentication

In spite of all its many improvements over basic authentication, digest authentication is not an all-around security solution. It does make it far more difficult to steal a user's password, since passwords are never sent over the net, but snoopers on the network could still see all the text of the request and the response, so it does not protect the secrecy of the actual data sent or received. If you want to protect that, you need to be using something like SSL or SHTTP to provide fully encrypted communications.

Though a snooper cannot see a user's password, which would make it possible to fully impersonate the user, it might be possible to just save a copy of the whole request and resend it. If the target is a static HTML page, replaying the request in this way just gets the snooper another copy of the response that they presumably saw while snooping on the original transaction. If the target is a CGI program, this could be a bigger problem.

The "nonce" value can be used to solve this problem. For complete security, the HTTP server could keep a record of the nonce values it has sent, and allow each to be used only once. This is difficult, however, since it requires a lot of record keeping. A simpler, though less secure, approach is to include data like the user's IP address and a time stamp in the nonce string. Then the HTTP server would only accept authentication requests if the IP address the request appears to originate from matches the one in the nonce string, and if the nonce string is not too old. Thus the person sending the replayed request would have to arrange to appear to be coming from the same IP address as the original user (generally difficult, but not impossible), and the whole thing would stop working after the time stamp expires. (Note that there are problems with using IP addresses this way that will be discussed below.)
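The simpler approach could be sketched like this in Python. Binding the time stamp and IP address together with a keyed hash under a server-side secret is my own addition, so that a client cannot forge a fresh nonce. The secret, the five-minute lifetime, and the function names are all assumptions made for the example.

```python
import hashlib
import hmac
import time

SERVER_SECRET = b"change-me"  # hypothetical secret known only to the server
MAX_AGE = 300                 # assumed nonce lifetime in seconds


def _mac(stamp, client_ip):
    """Keyed hash tying a time stamp to one client IP address."""
    message = ("%s:%s" % (stamp, client_ip)).encode("utf-8")
    return hmac.new(SERVER_SECRET, message, hashlib.sha256).hexdigest()


def make_nonce(client_ip, now=None):
    """Create a nonce of the form '<time stamp>:<keyed hash>'."""
    stamp = str(int(time.time() if now is None else now))
    return "%s:%s" % (stamp, _mac(stamp, client_ip))


def nonce_is_valid(nonce, client_ip, now=None):
    """Accept a nonce only from the same IP address and only while fresh."""
    stamp, sep, mac = nonce.partition(":")
    if not sep:
        return False
    try:
        issued = int(stamp)
    except ValueError:
        return False
    expected = _mac(stamp, client_ip)
    if not hmac.compare_digest(expected.encode("ascii"), mac.encode("utf-8")):
        return False  # wrong IP address or tampered nonce
    current = time.time() if now is None else now
    return 0 <= current - issued <= MAX_AGE
```

The server needs no per-nonce records here, which is the attraction of the approach. The cost is that a replay from the same address within the lifetime is still accepted.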

Another problem with digest authentication is that it allows for very little flexibility in the way that passwords are stored in the server's password database. The server needs to be able to generate the MD5 checksum of the concatenation of the user name, the authentication realm name, and the password. So the server either needs to have access to the plain text password to generate this checksum, or the checksum itself needs to be stored in the password database. That means that you can't do digest authentication out of databases where the passwords are encrypted by any other one-way encryption method. Since virtually all well-designed password databases use some other one-way encryption method to store their passwords, this covers nearly all pre-existing databases.

There is one way in which digest authentication is less secure than basic authentication. If unauthorized people gain access to the password database on a system with basic authentication, they don't get anything very useful. Only encrypted passwords are stored there, and they aren't very useful because to get authenticated they have to supply the real password, not the encrypted version. So they'd have to decrypt the passwords before they could use them (which is easy for badly chosen passwords, but fairly hard for well-chosen ones).

With digest authentication, the password database contains the MD5 checksums of the concatenation of the user name, the authentication realm name, and the password. It's still hard to obtain the password from this, but you don't necessarily need the password. Just use this value as the result of the first checksum instead of actually computing it. This first checksum is all villains need to be able to pretend to be that user on your system. The only small consolation is that since the authentication realm is part of the checksum, they cannot use it to access other sites where the same user may use the same password. Thus it is much more important to protect the password database under digest authentication.
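The point can be made concrete with a small Python sketch. It assumes the colon-separated RFC 2069 format, and the account, realm names, and function names are invented for the example. A response built from a stolen first checksum is identical to one built from the real password.

```python
import hashlib


def md5_hex(text):
    """MD5 checksum of a string, as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def response_from_ha1(ha1, method, uri, nonce):
    """Build a digest response starting from the first checksum alone."""
    ha2 = md5_hex("%s:%s" % (method, uri))
    return md5_hex("%s:%s:%s" % (ha1, nonce, ha2))


def response_from_password(login, realm, password, method, uri, nonce):
    """Build the same response the honest way, from the password."""
    ha1 = md5_hex("%s:%s:%s" % (login, realm, password))
    return response_from_ha1(ha1, method, uri, nonce)
```

The stolen value is tied to one realm, though. The first checksum for the same login and password under a different realm name comes out differently, so it is of no use at another site.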

Thus, digest authentication fixes several weaknesses in basic authentication but falls well short of being a fully secure protocol. Still, it's more secure than most of the other authentication methods discussed in this report (at least if they are not combined with SSL).

2.2.3. Why Digest Authentication Isn't Used

Why have the major browser vendors been slow to support digest authentication? I don't know. I can only speculate.

Some people have suggested that it was because digest authentication is based on encryption and thus the US export restrictions on encryption software made it unattractive. It was one of the goals of the team that designed digest authentication to devise a protocol whose use would not be limited by patent or export restrictions. That's why digest authentication does not use two-way encryption algorithms, but only one-way MD5 encryption. The US export regulations explicitly did not restrict export of such programs. This makes sense because although algorithms like MD5 can be used to prove that you know something (a password in this case) to someone else who also knows it, they cannot be used to transmit secret information to someone who doesn't already know it. Source code for MD5 is included in many packages that have been freely distributed world-wide without interference from the US government (including the Apache HTTP server). (Note: some people claimed that export of one-way encryption systems was restricted. I'm not an expert on this and could be wrong. The high level of confusion even among professionals over what was and was not legal to export from the US was a not insignificant part of the problem with this law - it retarded progress even in areas it did not regulate.) At this point US export restrictions have been loosened so it may be less of an issue - though some other nations still have similar regulations.

My guess is that the browser manufacturers didn't support digest authentication because it is simply too much of a half measure. Rather than offer all sorts of intermediate levels of security, they'd prefer to offer a more complete security solution, like Netscape's SSL protocol. This can solve all the problems with snooping (unless the snoops have access to seriously heavy-duty decryption technology, probably available mainly to various secretive government agencies).

Last update: Tue Mar 29 12:07:52 EST 2005