Tuesday, 9 October 2007

kx509, kerberos and cosign

One of the things I've been doing over the summer is working on implementing some additions for our web authentication system. I thought I'd take a few moments to discuss these changes, and to describe the way that we're using them.

Historically, we used client certificates for web authentication. Generally speaking, these client certificates were obtained using the University of Michigan's kx509 system, run transparently from the PAM stack at user login. When it worked, our users were unaware of a separate authentication step to use web applications and all was well. However, we were (and are) seeing an increasing demand to provide web applications to clients that aren't under our control. These clients don't have the kx509 utilities installed, and don't have PAM (let alone having fancy things integrated into the stack). We implemented a solution which would download client certificates into the browser, but pretty soon ran up against the fact that most browsers have incredibly poor user interfaces for dealing with certificate expiry and selection. Implementing a replacement has been on the cards for years, but we'd limped along (using a locally developed kx509 implementation that worked with the Mac OS X keyring, to allow Safari to download credentials, and the new kx509 plugin for NIM developed by Secure Endpoints).

We decided upon Cosign (again from the University of Michigan) as a replacement web authentication system, and others set about building a production system around this for our environment. However, Cosign has the major drawback that it requires users to authenticate! Rather than our existing system, where web authentication occurs transparently (as long as the user uses a supported browser on a managed platform ...), they had to explicitly authenticate to the Cosign portal. Initial investigations looked at using x509 certificates (delivered by the kx509 mechanism) to authenticate users with those certificates to Cosign, and then allow Cosign to authenticate the user to the application. However, we'd always had the problem with kx509 that it wasn't possible to perform certificate delegation, without running a service called 'kct' on the KDCs. We'd always been a little wary of kct's code quality and, in fact, had never deployed it in production. This lack of delegation appeared to rule out kx509-based Cosign for many of the web applications we were interested in building, all of which seemed to benefit from some form of credentials delegation. I'll talk more about those later.

So, despite the fact that, ironically, Cosign had been originally chosen because of its kx509 support, we had to look elsewhere. The NegotiateAuth HTTP authentication mechanism allows browsers to perform Kerberos authentication, and was a promising fit. We control the installation settings of Firefox on all of our managed machines, so we could ensure that NegotiateAuth was enabled for our weblogin servers (one problem with Firefox's NegotiateAuth mechanism is that its configuration settings aren't exposed in any UI, and are therefore hard to modify). This minimal support would ensure that our local user experience was no worse than that with kx509. So, I spent a few days implementing NegotiateAuth support (the new negotiate directive) in Cosign's login script. This was relatively straightforward, especially compared to the issues with arranging for transparent fallback that followed.

The fallback issues are, as with most things on the web, down to the differences in browser behaviour and UI. The simplest way to achieve fallback is to present the page to the browser with the required headers, and let the browser render the failure text if it can't perform the authentication. However, the way that browsers react, firstly if they don't support NegotiateAuth, secondly if they're not configured to support NegotiateAuth for that domain, and thirdly if they don't have credentials is highly variable, and often suboptimal. Usability testing fairly rapidly showed that this wouldn't be a viable option across the set of browsers we needed to support for remote users. So, we started looking for a mechanism to allow 'testing' for NegotiateAuth support, without alerting the browser.

The solution we ended up with uses some JavaScript, and the XMLHttpRequest method, to perform a 'background' test of a NegotiateAuth protected page from the server. If this fetch succeeds, then we redirect the user's "main" login page to a NegotiateAuth protected copy of cosign.cgi, which proceeds to authenticate them based on their Kerberos credentials. This works on all of the browsers we tested (Firefox, Safari, Opera, Konqueror) with the exception of Internet Explorer. When IE is prompted to perform NegotiateAuth and doesn't have credentials, it produces a Basic login dialog box, which it then uses to try NTLM against the server. Our solution to this is to browser sniff in the redirect script, and to not even try NegotiateAuth if the browser is IE. We also disable the check for Safari, as this doesn't support credential delegation, which we require later in the authentication process. The (rather clunky, I'm afraid) production version of this script is available from https://weblogin.inf.ed.ac.uk/cosign/js/redirect.js
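The background probe and browser sniff described above, written out as an added example (this is not the production redirect.js, just an illustration of the logic). The probe URL, the path of the Negotiate-protected cosign.cgi copy, and the XMLHttpRequest factory are all assumptions; the factory is injected so the decision logic can be exercised outside a browser.

```javascript
// Added example, not the weblogin.inf.ed.ac.uk redirect.js. URLs are assumed.
var URLS = {
  probe: "/cosign/negotiate-probe",              // assumed page behind mod_auth_kerb
  negotiate: "/cosign-bin/negotiate/cosign.cgi"  // assumed Negotiate-protected copy
};

// Browsers that must not be probed: IE without a ticket pops a Basic dialog
// and then tries NTLM; Safari cannot delegate credentials, which the later
// stages of the login need. (The Chrome exclusion is an addition for modern
// user agents, whose UA strings also contain the "Safari" token.)
function shouldProbeNegotiate(userAgent) {
  var ua = String(userAgent || "");
  if (/MSIE|Trident\//.test(ua)) return false;
  if (/Safari\//.test(ua) && !/Chrom(e|ium)\//.test(ua)) return false;
  return true;
}

// 200 from the protected URL means the browser completed SPNEGO. A 401 (no
// ticket, mechanism off for this domain, or unsupported), a status of 0 from
// a failed request, or anything else leaves the user on the ordinary form.
function negotiateProbeSucceeded(status) {
  return status === 200;
}

// Fire the background GET; onResult receives true or false exactly once.
function probeNegotiate(xhrFactory, probeUrl, onResult) {
  var xhr = xhrFactory();
  var done = false;
  function finish(ok) { if (!done) { done = true; onResult(ok); } }
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4) finish(negotiateProbeSucceeded(xhr.status));
  };
  try {
    xhr.open("GET", probeUrl, true);
    xhr.send(null);
  } catch (e) {
    finish(false);
  }
}

// callback gets the URL to redirect to, or null to stay on the login form.
// In a page: chooseLogin(navigator.userAgent, function () { return new
// XMLHttpRequest(); }, URLS, function (d) { if (d) window.location.replace(d); });
function chooseLogin(userAgent, xhrFactory, urls, callback) {
  if (!shouldProbeNegotiate(userAgent)) { callback(null); return; }
  probeNegotiate(xhrFactory, urls.probe, function (ok) {
    callback(ok ? urls.negotiate : null);
  });
}
```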

Needless to say, there are further complications. We have cosign deployed across multiple web servers for resilience, all of which answer to requests for https://weblogin.inf.ed.ac.uk/. Firstly, different browsers perform Kerberos service name lookups in different ways. Firefox always uses the canonical name of the host (that is, it uses the DNS to resolve the name in the URL, and uses the results of that resolution). Safari always uses the name entered in the URL. This means that our webservers must have keys for both HTTP/weblogin.inf.ed.ac.uk, as well as HTTP/theirhostname. Firefox then throws an additional spanner in the works. The DNS lookup is performed twice - once for Kerberos, and once to determine the IP address of the host to connect to. If the names are being allocated in a round-robin fashion, then you will end up using HTTP/host-A as the service principal, whilst connecting to host-B. So, all of our web login servers also have to have each other's keys in their keytabs. This Firefox bug is in bugzilla.mozilla.org as bug #383312. The final problem is that the Apache NegotiateAuth module mod_auth_kerb only supports authenticating against a single, chosen key from a keytab. In our situation, we want it to use any key from the keytab. I've implemented a simple patch which adds the KrbServiceName Any directive, allowing the use of any key that's in the server's keytab.
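A check for the keytab requirement just described, as an added example (not from the post or from Cosign): every login server behind the round-robin name needs the alias key, its own key, and each peer's key. The realm, hostnames and the `klist -k` style listing are constructed for the example.

```javascript
// Added example with constructed names; the listing mimics "klist -k" output.
var REALM = "EXAMPLE.ORG";                 // assumed realm
var ALIAS = "weblogin.example.org";        // assumed round-robin name in the URL
var SERVERS = ["login-a.example.org", "login-b.example.org"];

// Safari uses the name typed in the URL (alias key), Firefox canonicalises
// (own key), and Firefox resolves twice, so the principal it picks may belong
// to a peer that the connection never reaches (every peer's key).
function requiredPrincipals(alias, servers, realm) {
  return [alias].concat(servers).map(function (h) {
    return "HTTP/" + h + "@" + realm;
  });
}

// Parse "klist -k": entry lines look like "   2 HTTP/host@REALM".
function principalsInKeytab(klistOutput) {
  var found = {};
  klistOutput.split("\n").forEach(function (line) {
    var m = /^\s*\d+\s+(\S+@\S+)\s*$/.exec(line);
    if (m) found[m[1]] = true;
  });
  return Object.keys(found);
}

// Principals this server's keytab still lacks; empty means it is complete.
function missingPrincipals(klistOutput, alias, servers, realm) {
  var have = principalsInKeytab(klistOutput);
  return requiredPrincipals(alias, servers, realm).filter(function (p) {
    return have.indexOf(p) < 0;
  });
}

// Constructed keytab for login-a: alias and own key present, peer key absent,
// so a Firefox client that resolved login-b for the principal but connected
// to login-a would be refused.
var SAMPLE_KLIST = [
  "Keytab name: FILE:/etc/krb5.keytab",
  "KVNO Principal",
  "---- ------------------------------------------------------------",
  "   3 host/login-a.example.org@EXAMPLE.ORG",
  "   2 HTTP/weblogin.example.org@EXAMPLE.ORG",
  "   2 HTTP/login-a.example.org@EXAMPLE.ORG"
].join("\n");
```

With the patched mod_auth_kerb directive KrbServiceName Any from the post, the server can then accept whichever of those keys the client chose.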

This is all now running as a stable service. I'll talk in a future post about some of the additions we've made to this in order to support Friend or Guest accounts, and more about the need for delegation.
