Software Design Document for an MDAS/NPACI/DOCT Authentication/Privacy Mechanism

Chaitanya Baru, Markus Jakobsson, Wayne Schroeder

Introduction

This document, combined with the "Authentication Mechanisms" paper, describes an Authentication/Privacy software system under development by SDSC/UCSD for MDAS/DOCT. It describes the general requirements, the options considered and the choices made, and gives a general description of the software. The "Authentication Mechanisms" paper provides an intellectual framework for the project. A third paper, "The SDSC Encryption/Authentication System (SEAS)", describes the completed system in more detail.

Requirements

The following outlines the requirements of the software:

Provide user authentication and secure communication mechanisms for the MDAS Storage Resource Broker (SRB) for a variety of projects (NPACI: brain image, Earth science, etc.; DOCT: interface to Legion authentication, which is being developed). This includes user-to-SRB authentication and encryption, and SRB-to-SRB authentication and possibly encryption.
Provide this authentication and communications security in a manner such that it can be used in other projects.
Complete at least a partial implementation within about three months: needed by mid-September for a DOCT demo and by October 1 for NPACI.
Must be usable as part of a Batch system on some computer systems. On more trusted hosts, passwords will not be required (no passwords in batch scripts) and tickets will not expire over a period of many days or weeks (batch jobs can be restarted and continue with the authentication that they had).
Easily deployable across DOCT/NPACI Unix systems, and perhaps eventually Windows NT.
For DOCT, usable by inventors applying for patents on a wide variety of hosts across the Internet.
For DOCT applications, and also NPACI, provide a mechanism whereby users can generate their own initial identification/authentication. (ie an introduction message, "Hi, I'm Bob, from now on you can identify me using this public key.")
Provide encryption for communication between the SRB and the MDAS database.
MDAS will implement user Access Control Lists, enforced by the SRB and maintained in the MDAS DB catalog.
Light-weight/simple if feasible

This system is intended to operate at the communications level. That is, it will provide encryption and authentication between two running computer programs that are communicating via TCP/IP sockets.

It should be noted that the SRB will have access to data stored in various storage systems (HPSS, UniTree, Unix, FTP, DB2, Illustra) as user 'SRB', not as any other user. That is, the SRB will not need to "forward authentication" or "su" to another user. It will act as a data gateway but not as an authentication gateway. If the storage system allows access to a particular file to user 'SRB' _and_ the MDAS ACL allows access for the user communicating with the SRB, then the SRB will provide the access. Thus the user must securely authenticate to the SRB, the SRB must securely access its ACL data, and access to the SRB login account must be controlled and restricted.

Component Choices

Although there are a number of higher-level software systems that meet some of these requirements, nothing exists that provides all the needed features/characteristics. Kerberos, PGP, SSH, and SSL each cover some of the requirements, but not all.

All of these higher-level systems utilize common lower-level components to implement their functions. These are the various standard encryption/signature/hash algorithms such as RSA, DES, Triple-DES, IDEA, MD5, RC5, etc.

To achieve what we need, we'll have to combine existing software components and write some software that uses them. The question is whether we should use components at the Kerberos/SSL level, or the lower-level building blocks such as RSA and RC5.

Kerberos provides user authentication and secure application-to-application communications, but does not have the Batch or Introduction features. Also, it is not easily deployable, especially to a large number of hosts for people applying for patents (it is designed for use within a trusted realm). It is fairly large and difficult to work with, and it would be difficult to add Kerberos encryption to many applications within the time constraints.

Similarly, SSL can encrypt socket communications but does not readily handle user authentication. Much like SSH, SSL uses RSA to distribute session keys for socket sessions. The SSL protocol appears to provide for user authentication, but it is an external function and is not part of the Netscape SSL 3.0 reference source. SSL has the following primary drawbacks:

The Netscape reference implementation is for Solaris and Windows systems. We would have to port it to others (or maybe find other implementations). This could be fairly difficult and time-consuming.
SSL does more (and less) than we need. It is moderately complex, making it difficult to port and debug (particularly to the C90). Since it does less than everything we need, we'll have to combine it with other software anyway, and it would be better to work with smaller and simpler components.
Even on Solaris systems, the recently acquired source does not compile correctly. There is a problem with the Makefile. Perhaps this could be easily resolved, but there could well be a long series of problems to fix which could be very time-consuming.

Similarly, SSH and PGP utilize encryption algorithms to provide services, but do not cover all the features we need.

While it may be possible to use SSL or other higher-level systems as part of this solution, the trade-offs are not favorable. They would add significantly more complexity while contributing only a limited amount of functionality.

The GSSAPI (Generic Security Service API) is another possibility. The Globus team is doing some interesting work with GSSAPI and SPKM (Simple Public-Key GSS-API Mechanism). Both of these, though, are fairly complex, provide more than we need, and would likely be quite difficult to work with. As is frequently the case, it is likely that we would have to port the software ourselves to the Crays, and that would be difficult (for a C90-type machine).

The purpose of implementing GSSAPI would be to make it easier for applications to interface to it. But since we are currently interested in only one, or a few, applications (primarily the SRB), there is little need to conform to an industry standard. In fact, the simple interface we envision will be easy to add to the SRB and much easier to implement than GSSAPI.

Over time, we may wish to migrate toward the GSSAPI, perhaps building on the work Globus is doing. But this would be a multi-year goal.

Case Larsen and William Johnston of LBNL have done some interesting work using SSH (the SSH-agent) to produce pluggable backends to GSSAPI incorporating SPKM. (The Globus team will be using some of this software as a foundation.) This, however, is partially written in C++ and would need to be ported to the Cray. Since GSSAPI is not a key requirement for us at this time, we are probably better off using other software, for now.

RSA Data Security, Inc. offers a BSAFE 3.0 package that includes DES, Triple DES, DESX, RC2, RC4, and RC5. However, it is unlikely that they support the Cray architecture, and we have already ported an RC5 implementation to the Cray.

So we believe that the most effective approach will be to work with existing software components at the encryption level, at least for most of the functionality. We will develop software layers on top of these to provide the needed authentication and encryption infrastructure. The level of effort to develop this software is reasonable, and we expect that this approach is the best alternative for enabling significant progress over the coming months.

Design

Encryption Library API

For the initial implementation, we're using RC5 for symmetric key encryption and RSAREF 2.0 for the public/private key encryption.

SEA Library Encryption API

These routines are designed to be easily added to the SRB and other software packages. Layered on the encryption routines, these routines provide two functions: encryption of socket connections and authentication of users on those sockets.

We had considered a slightly different API, but the current design is as follows:

The routines are currently called the 'sea' library, for SDSC Encryption and Authentication. These two routines establish an encrypted session on a TCP/IP socket:

seaBeginEncryptionServer(fd)
seaBeginEncryptionClient(fd)

The fd is a socket connecting a client and server. The client calls seaBeginEncryptionClient and the server calls seaBeginEncryptionServer. The routines perform a handshake, using RSA to exchange a random session key to begin encrypting data on the link.
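The key exchange in this handshake can be pictured with a toy sketch in Python. Everything in it is an illustrative assumption rather than the SEA library's actual protocol: the function names, the tiny textbook RSA key, and the direction of the exchange (the client picks the session key and sends it under the server's public key) are all hypothetical. The real library uses RSAREF 2.0, and numbers this small offer no security.

```python
import secrets

# Toy textbook RSA key built from p=61, q=53. Hypothetical values, insecure by design.
N = 61 * 53                  # modulus (3233)
E = 17                       # public exponent
D = pow(E, -1, 60 * 52)      # private exponent (needs Python 3.8+)

def client_make_session_key():
    """Client side: pick a random session key and wrap it with the public key."""
    session_key = secrets.randbelow(N - 2) + 2   # random value in [2, N-1]
    wrapped = pow(session_key, E, N)             # RSA-encrypt with (N, E)
    return session_key, wrapped

def server_unwrap_session_key(wrapped):
    """Server side: recover the session key using the private exponent."""
    return pow(wrapped, D, N)

# Only the wrapped value would cross the socket.
client_key, wire_value = client_make_session_key()
server_key = server_unwrap_session_key(wire_value)
```

After the exchange both ends hold the same session key, which would then drive the symmetric cipher for the rest of the connection.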

These two routines are called instead of write and read to send and receive encrypted data on a socket:

seaWrite(fd,buf,len)
seaRead(fd,buf,len)

The application can establish encryption at any point in the session. Both sides just need to know to call the seaBeginEncryption routine at the same time. Once encryption is established, the application calls seaRead and seaWrite just like read and write to exchange information that is encrypted. If data does not need to be encrypted, the application can use read and write instead of seaRead and seaWrite, even after encryption is established on the socket. Multiple sockets can be encrypted via multiple calls to seaBeginEncryptionServer/Client.

Normally, authentication (described below) will be performed on encrypted sockets but this is not necessary. The authentication exchange uses RSA to securely identify users without exchanging any plain-text passwords. However, first encrypting and then authenticating is the preferred method, as it does provide a little more security. Control information to the SRB should be encrypted.

seaWrite and seaRead will transfer data without encryption if the seaBeginEncryption routines have not been called. We may also add a seaEndEncryption routine at some point.
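The pass-through behavior described above can be sketched in Python as follows. This is a minimal model under stated assumptions, not the SEA implementation: the names sea_begin_encryption, sea_write and sea_read are hypothetical stand-ins for the C routines, the session key is simply handed to both ends instead of being negotiated, and the cipher is a SHA-256 based XOR keystream used only as a placeholder for RC5.

```python
import hashlib
import socket

_sessions = {}   # socket fd -> session state; no entry means "not encrypted"

def _keystream(key, offset, length):
    """Placeholder keystream (SHA-256 in counter mode). This is NOT RC5."""
    out = bytearray()
    block = offset // 32
    while len(out) < (offset % 32) + length:
        out += hashlib.sha256(key + block.to_bytes(8, "big")).digest()
        block += 1
    start = offset % 32
    return bytes(out[start:start + length])

def sea_begin_encryption(sock, key):
    """Hypothetical stand-in for seaBeginEncryptionServer/Client (no handshake)."""
    _sessions[sock.fileno()] = {"key": key, "sent": 0, "received": 0}

def sea_write(sock, data):
    state = _sessions.get(sock.fileno())
    if state is not None:                     # encrypt only once a session exists
        ks = _keystream(state["key"], state["sent"], len(data))
        state["sent"] += len(data)
        data = bytes(a ^ b for a, b in zip(data, ks))
    sock.sendall(data)

def sea_read(sock, length):
    data = sock.recv(length)
    state = _sessions.get(sock.fileno())
    if state is not None:                     # decrypt only once a session exists
        ks = _keystream(state["key"], state["received"], len(data))
        state["received"] += len(data)
        data = bytes(a ^ b for a, b in zip(data, ks))
    return data

client, server = socket.socketpair()
sea_write(client, b"hello")                   # no session yet: sent in the clear
plain = sea_read(server, 5)
sea_begin_encryption(client, b"shared session key")
sea_begin_encryption(server, b"shared session key")
sea_write(client, b"secret")
wire = server.recv(6, socket.MSG_PEEK)        # bytes as they crossed the socket
secret = sea_read(server, 6)
client.close()
server.close()
```

Keeping the state in a table indexed by descriptor is one way to let several sockets be encrypted independently, as the text describes.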

SEA Library Authentication API

There are multiple distinct authentication environments.

In the DOCT environment, we may wish to allow users to introduce themselves to the system and from then on it is sufficient that the system knows that the same person is communicating with the system (ie, they have the private key).

For the DOCT environment, we can simply have the software generate public/private keys and send the public key to the key manager (see below). In effect, the user is saying "here's my public key, from now on you can identify me with it".

In the NPACI/SDSC environment, a userid can have a higher level of trust if the SEA authentication system confirms that the user is actually running on one of our trusted hosts. Like passwordless access to HPSS from the C90, the MDAS system just needs to confirm that the user is logged onto the C90 as a particular user to acquire the access privileges of that user. So for the introduction function (when initial user public/private keys are established), the system has to confirm that the user is actually logged in on a trusted host (eg the c90) and running as that user. Once that is done, users can be confident that a user registered as user@SDSC actually is that SDSC user.

The planned solution is to run an agent on the trusted hosts. This agent would identify itself to the key manager via our normal public/private key mechanism (ie as some trusted object, eg "c90-agent"). The introduction utility will create the public/private key and tell this agent to access the public key in a publicly accessible file under the user's home directory. The agent will access this file, confirming that the file is owned by the user and is in the user's home directory, and will then pass the information on to the key manager. Once this is complete, the introduction utility can remove the public key file.
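The two checks the agent makes on the public key file (ownership and location) might look like the following Python sketch. The function name, file name, and the use of temporary directories as stand-ins for home directories are assumptions made for illustration; the real agent would also need to look up the user's home directory and uid from the system.

```python
import os
import stat
import tempfile

def key_file_is_acceptable(path, home_dir, expected_uid):
    """True if path is a regular file under home_dir that is owned by expected_uid."""
    real_path = os.path.realpath(path)       # resolve symlinks before checking
    real_home = os.path.realpath(home_dir)
    if os.path.commonpath([real_path, real_home]) != real_home:
        return False                         # file lies outside the home directory
    try:
        info = os.stat(real_path)
    except OSError:
        return False                         # missing or unreadable file
    return stat.S_ISREG(info.st_mode) and info.st_uid == expected_uid

# Demo: temporary directories play the role of home directories.
with tempfile.TemporaryDirectory() as home, tempfile.TemporaryDirectory() as elsewhere:
    inside = os.path.join(home, "sea_public_key")
    outside = os.path.join(elsewhere, "sea_public_key")
    for name in (inside, outside):
        with open(name, "w") as f:
            f.write("public key data\n")
    me = os.getuid()
    accepted = key_file_is_acceptable(inside, home, me)
    wrong_place = key_file_is_acceptable(outside, home, me)
    wrong_owner = key_file_is_acceptable(inside, home, me + 1)
```

Resolving symbolic links first is a design choice in this sketch, so that a link inside the home directory pointing elsewhere does not pass the location check.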

In this case, the user will not be allowed to choose an SRB username. Instead, the SRB username will be the user's username on the host followed by the domain name, eg "schroede@SDSC".

These two routines are called to authenticate a user:

seaAuthServer(fd,userid)
Performs the server-side authentication protocol. If successful, it returns the authenticated user's identifying string in the userid argument. This is the user on the other side of the socket, or the client application's name.

seaAuthClient(fd, objectname, password)
Performs the client-side authentication protocol. If objectname is a NULL pointer or contains a NULL string (default), the library will generate and use the user's id (of the form username@domain); if non-NULL, this is the name that the client will attempt to authenticate as (the server must have a public key matching that string and the client must have the private key that goes with it). Similarly, if password is NULL or contains a NULL string, the library uses the default method to decrypt the private key; if provided, password is used as the private key decryption pass phrase.

There are also two support functions:

seaCheckUserPrivateKeyUnencrypted(name)
Returns true if an Unencrypted private key exists for the user, or (if name is non-null), for the named process.
seaCheckUserPrivateKeyEncrypted(name)
Returns true if an Encrypted private key exists for the user, or (if name is non-null), for the named process.

See the "User private key files" section below for a related description.
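The defaulting rules for seaAuthClient, together with the two key-check routines, suggest client logic along the lines of the Python sketch below. The function names and the selection order in choose_key_source are hypothetical; the document does not spell out what the library's "default method" is, so this is only one plausible reading.

```python
def resolve_object_name(objectname, username, domain):
    """None or an empty string selects the default id, username@domain."""
    if not objectname:
        return f"{username}@{domain}"
    return objectname

def choose_key_source(password, has_unencrypted_key, has_encrypted_key):
    """Hypothetical selection order for the private key; an assumption, not SEA's rule."""
    if password:
        # A pass phrase was supplied, so an encrypted key file is expected.
        return "encrypted" if has_encrypted_key else None
    if has_unencrypted_key:
        return "unencrypted"           # batch case: no pass phrase needed
    if has_encrypted_key:
        return "needs-pass-phrase"     # caller must obtain a pass phrase first
    return None                        # no usable key: introduction not done yet

default_name = resolve_object_name(None, "schroede", "SDSC")
explicit_name = resolve_object_name("SRB", "schroede", "SDSC")
batch_choice = choose_key_source(None, True, True)
```

In a real client the two boolean arguments would come from seaCheckUserPrivateKeyUnencrypted and seaCheckUserPrivateKeyEncrypted.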

Trust Models

For an initial "introduction", a SEA utility generates a pair of RSA keys, stores the private key locally (encrypted), and sends the public key to the key manager to be recorded for future use. Once this is done, the user can authenticate to the programs using SEA (the SEA library accesses these keys to securely authenticate).

The SEA system provides for three "Trust Models" for the initial introduction.

Password - The administrator sets up a password for each user's initial introduction and provides this to the user via a secure mechanism (phone or mail). The SEA authentication program and key manager daemon will verify this password before allowing a new RSA key to be registered.
Trusted Host - A "Trusted Agent" process runs on each system that has a unique user home directory. This agent communicates with the key manager daemon to verify that the public key being provided is indeed the key owned by the stated user. This agent program registers itself with the key manager at startup. For systems that share user home directories (most SDSC workstations, for example), only one trusted agent needs to run on one system to handle the whole collection of computers. The security of this trust model relies on the security of the hosts involved and is appropriate for a centrally managed, security-aware, administrative environment.
Self-introduction - The system allows the introduction of new RSA keys as long as the name does not match an existing key (this uniqueness is required in the Password and Trusted Host models too). This could be appropriate in an environment where one is only concerned with verifying the continued identity of the user; ie once introduced they are the same individual.
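The three trust models differ only in what the key manager requires before it accepts a new key. A minimal Python sketch of that decision, with hypothetical names and with the password and agent checks reduced to boolean inputs, could read:

```python
def may_register(model, name, existing_names, *, password_ok=False, agent_confirmed=False):
    """Decide whether a new public key may be registered under the given trust model.

    password_ok and agent_confirmed stand for checks performed elsewhere
    (password verification, trusted agent confirmation); both are assumptions
    of this sketch.
    """
    if name in existing_names:
        return False                   # name uniqueness is required in every model
    if model == "password":
        return password_ok
    if model == "trusted_host":
        return agent_confirmed
    if model == "self_introduction":
        return True
    raise ValueError(f"unknown trust model: {model}")

registered = {"schroede@SDSC"}
new_self_intro = may_register("self_introduction", "bob@DOCT", registered)
duplicate = may_register("self_introduction", "schroede@SDSC", registered)
```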

Higher-level functions

A set of utilities and daemons will be needed to provide the key generation and management:

A utility to generate an RSA public/private key.

Stores the private key in a predetermined path with user read/write permission (no access by group or other) (accessible by the socket/encryption library in processes run as that user)
Can store this private key itself either encrypted or not. For Batch jobs (trusted hosts), users will store this unencrypted so that the library can access the private key without a password.

A utility to pass a user's public key to a secure daemon (this will be executed by a user only once, ie you can introduce yourself to the system once and only once.)
A daemon to receive these public keys and write them into a data file (preventing duplicates or unauthenticated changes). (Read access to this data is not a concern, and the s_accept routine will read this data directly.)
A utility to update a user's public key (ie replace an existing key with a new one, if the user can prove possession of the current private key) (not needed initially)

Originally, we were considering PGP or ssh-keygen for the first utility. However, PGP would have to be ported to the Cray, and we found that converting from PGP's or SSH's key file format to RSAREF's is fairly involved. So instead, we have developed our own utility to generate and store keys.
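The permission requirement on the private key file (user read/write, no access by group or other) can be illustrated with a short Python sketch. The function name and file name are hypothetical, and the key material here is a placeholder rather than a real RSA key.

```python
import os
import stat
import tempfile

def store_private_key(path, key_bytes):
    """Write key_bytes to path with mode 0600 (user read/write only)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        os.fchmod(f.fileno(), 0o600)   # enforce the mode even if the file pre-existed
        f.write(key_bytes)

with tempfile.TemporaryDirectory() as d:
    key_path = os.path.join(d, "private_key")
    store_private_key(key_path, b"placeholder key material")
    mode = stat.S_IMODE(os.stat(key_path).st_mode)
    with open(key_path, "rb") as f:
        stored = f.read()
```

Creating the file with the restrictive mode from the start, rather than tightening it afterwards, avoids a window in which the key is readable by others.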

Key Management

The user public keys have to be available to the authentication library on the Server side. These, being public keys, are not particularly confidential, but updates need to be controlled.

We were considering two implementations for key storage, one using the local Unix file system and one using the MDAS catalog (database). We decided that the former is all that is needed. Storing keys in the MDAS catalog would have some advantages but would be somewhat less secure.

The Key manager process will respond to requests to store new keys by first checking that the user does not already have a key. If not, the key is accepted and stored.
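The "accept only if the user has no key yet" rule maps naturally onto an exclusive file create. The Python sketch below assumes a layout of one file per user name in a key directory; that layout, the function name, and the name validation are illustrative assumptions, not the key manager's actual format.

```python
import os
import tempfile

def store_public_key(key_dir, user, public_key):
    """Store a new public key; return False if this user already has one."""
    if not user or "/" in user or user.startswith("."):
        raise ValueError("unsuitable key name")
    path = os.path.join(key_dir, user)
    try:
        # O_EXCL makes the existence check and the create a single atomic step.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as f:
        f.write(public_key)
    return True

with tempfile.TemporaryDirectory() as key_dir:
    first = store_public_key(key_dir, "schroede@SDSC", b"first key")
    second = store_public_key(key_dir, "schroede@SDSC", b"other key")
    with open(os.path.join(key_dir, "schroede@SDSC"), "rb") as f:
        kept = f.read()
```

The second attempt is refused and the original key is left untouched, which matches the once-only introduction described earlier.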

The SEA library will access the public key files directly, although eventually we may want this to go through the key manager.

User private key files

User private key files have to be available to the authentication library on the Client side. For batch jobs, they can not be encrypted in a way such that passwords are needed to use them.

Currently, we plan to store user-password-encrypted private keys in the users' home directories (eg NFS mounted), and unencrypted private keys in a local file system (so they won't cross a network). Users will be able to create unencrypted private keys from their encrypted private keys. These can be used for batch jobs or interactive sessions (ie the library will be able to automatically authenticate them). The "unencrypted" private keys will actually be encrypted but only in a "semi-secure" manner. The SEA routines will encrypt and decrypt the user private key data as it is written to and read from the private key files. This algorithm will take constant information (such as UID and host), generate a key (via MD5), and use this key to encrypt. This will provide a little additional security, but only because the algorithm is not widely known.
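The "semi-secure" scheme can be sketched in Python as below. The key derivation follows the description above (MD5 over constant information such as UID and host), but the exact input format and the cipher are assumptions: a repeating XOR is used here purely as a stand-in for whatever cipher the SEA routines apply. As the text notes, this is obfuscation rather than real protection.

```python
import hashlib

def obfuscation_key(uid, host):
    """Derive a 16-byte key from constant information (the input format is an assumption)."""
    return hashlib.md5(f"{uid}:{host}".encode()).digest()

def obfuscate(data, uid, host):
    """XOR data with the repeating derived key; applying it twice restores the input."""
    key = obfuscation_key(uid, host)
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

private_key_data = b"placeholder private key bytes"
on_disk = obfuscate(private_key_data, 1001, "c90.example")
recovered = obfuscate(on_disk, 1001, "c90.example")
```

Because the key depends only on the UID and host, the library can recover the private key without any user input, which is what batch jobs require.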

We will make the private key storage location a configurable option. On the C90, for example, the home directories would be fine for the unencrypted private keys as they are not NFS mounted. On SDSC workstations, storing these keys in /tmp is an option, since that file system is local.

Server Identification

For the SRB, there is little need to confirm that a client is actually talking to the SDSC SRB. If they are not, they will not be able to access data.

For the USPTO, however, one does need to confirm that the client is indeed talking to the USPTO before divulging sensitive patent or trademark information. This can be accomplished via RSA, but a public key for the USPTO will have to be available from a trusted source. Then, library routines could confirm the identity of the USPTO by encrypting a random string with the public key, sending it to the server, and getting a decrypted response back. The server can only accomplish this if it has access to the USPTO private key.

This will not be implemented, at least initially.
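Although it is not planned initially, the challenge-response check described above can be sketched in Python. The key is a toy textbook RSA key (p=61, q=53) and all names are hypothetical; a real implementation would use RSAREF with full-size keys and a public key obtained from a trusted source.

```python
import secrets

# Toy textbook RSA key standing in for the server's key pair. Insecure by design.
N, E = 3233, 17          # public key, assumed to come from a trusted source
D = 2753                 # private exponent, known only to the genuine server

def make_challenge():
    """Client: pick a random nonce and encrypt it with the server's public key."""
    nonce = secrets.randbelow(N - 2) + 2
    return nonce, pow(nonce, E, N)

def server_respond(challenge):
    """Server: decrypt the challenge with the private key and return the result."""
    return pow(challenge, D, N)

def verify(nonce, response):
    """Client: the server is accepted only if it returned the original nonce."""
    return response == nonce

nonce, challenge = make_challenge()
genuine = verify(nonce, server_respond(challenge))
rejected = verify(nonce, (nonce + 1) % N)     # any other answer is refused
```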

SRB User Access Control

Once a user is identified, access to data will be controlled via that id. Access Control Lists will be maintained in the MDAS catalog specifying who has what type of access to each particular dataset.
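Combined with the earlier requirement that the storage system must also allow access to user 'SRB', the access decision reduces to two conditions that must both hold. A minimal Python sketch follows; the ACL layout (dataset to user to set of operations), the dataset names, and the function name are hypothetical, since the MDAS catalog schema is not described here.

```python
def srb_allows(user, dataset, operation, acl, storage_allows_srb):
    """Grant access only if the storage system admits 'SRB' and the MDAS ACL admits the user."""
    user_ok = operation in acl.get(dataset, {}).get(user, set())
    return storage_allows_srb and user_ok

# Hypothetical ACL contents, for illustration only.
acl = {
    "brain-image-01": {"schroede@SDSC": {"read", "write"}, "bob@DOCT": {"read"}},
}
owner_write = srb_allows("schroede@SDSC", "brain-image-01", "write", acl, True)
guest_write = srb_allows("bob@DOCT", "brain-image-01", "write", acl, True)
blocked_by_storage = srb_allows("schroede@SDSC", "brain-image-01", "read", acl, False)
```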

Portions of this have been developed but substantial additional development is required. This will largely be developed by Raja, with assistance from Wayne.

MDAS Web Interface

Unfortunately, since Java can not access local files, we can not implement our authentication mechanism in JavaScript for use with our web interface to the MDAS catalog.

Alternatively, we can use passwords. For demo purposes, we can send these passwords unencrypted but soon we'll need to encrypt them. Two alternatives exist:

1) Run a secure version of the Netscape/WWW server (with SSL). These perform automatic socket-level encryption by generating a random key and exchanging it via RSA (as is commonly done). I'm not sure if these can be supported in SDSC's environment, but possibly.

2) Develop JavaScript code and CGI scripts to transfer the user password via RSA. The server-side CGI scripts would send a public key, and the JavaScript software could take the user's password input, encrypt it with the public key, and send it. Implementations of this may already exist that we could use.

Demo

We need to develop an SRB authentication demo for DOCT by mid-September. Goals may include:

register a new user to the SRB from a remote (local to the demo) workstation
demo access to public datasets, no access to ACL-controlled datasets
grant access rights to a data set (granted by the owner)
demo access
de-register the user (to rerun the demo)

This may not be a particularly exciting demo, but if we can pull it off I think it will demonstrate considerable capability.

Licensing Restrictions

Even without using PGP, the use of RSA and RC5 may eventually require some attention to licensing. However, the use of RSA by U.S. Universities for non-commercial purposes is allowed without a direct license from the patent holder (RSA Data Security). As we move into a more production mode, this may require more attention.

SRB to SRB authentication

The SRB to SRB authentication could be implemented by using the User to SRB authentication capabilities using the "SRB" objectname. This is very similar to user authentication, but the application can specify the name by which it is authenticating. A second (or later) SRB in a chain of SRBs (for a particular connection) could trust the information passed to it when that connection has been identified as from object "SRB". The user would need to authenticate to the first SRB, and the second SRB would know that the user has authenticated. Details for this implementation need to be examined.
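Since the details are still open, the following Python sketch shows only the basic trust rule: a forwarded user identity is honored when, and only when, the peer on the connection has authenticated as the object "SRB". The function and parameter names are hypothetical.

```python
def effective_user(authenticated_peer, forwarded_user=None):
    """Return the user id the second SRB should act for on this connection.

    authenticated_peer is the name returned by the authentication step;
    forwarded_user is the id the peer claims to have authenticated earlier.
    """
    if authenticated_peer == "SRB" and forwarded_user:
        return forwarded_user          # trust the first SRB's authentication
    return authenticated_peer          # ordinary client: ignore forwarded claims

via_srb = effective_user("SRB", "schroede@SDSC")
direct = effective_user("bob@DOCT", "schroede@SDSC")
```

An ordinary user who tries to supply a forwarded identity is simply treated as themselves, so only holders of the "SRB" private key can vouch for others.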

Other web pages of interest

RSA
Kerberos Users' Frequently Asked Questions
International PGP Home Page
SSH Home Page
SSH at SDSC
International Cryptography Pages

Other important sources

We are using sample source code from "Applied Cryptography" by Bruce Schneier, via associated diskettes.