E*TRADE Secure Data Exchange:
Using an SSL-based Web Server and Browser to Securely Exchange Files
By Ross Oliver
Published in the December 1999 issue of ;login:
The e-commerce industry is awash in mergers, strategic alliances, partnerships, and outsourcing. This organizational networking creates an increasing demand for sharing of data among organizations. Much of the data is sensitive or confidential. At the same time, traditional secure data paths of direct dial-up links, private networks, and dedicated communications lines are giving way to the Internet as the all-purpose information conduit. Since the Internet is a public data medium, the challenge is how to give members of the organization the capability to move data about quickly and easily, while still protecting data integrity and confidentiality.
This article describes a simple file exchange service called Secure Data Exchange (SDX) that I developed to fulfill this data transport need. My employer is E*TRADE Group (parent company of E*TRADE Securities), so speed and efficiency of online data flow and maintaining security and confidentiality are both critical factors.
For two-way file exchange, an FTP server is probably the most common service used. It was certainly the one requested the most by users. However, I was opposed to FTP because of these weaknesses:
No built-in encryption. The FTP protocol sends all data as cleartext. It is possible to encrypt the file before transfer, but this requires both the sending and receiving parties to obtain, install, and use the same encryption software, as well as manage the password, key, or other method of encryption/decryption. This extra encryption step might sometimes be accidentally (or intentionally) omitted, allowing the data to transit the Internet in the clear. Even when the file itself is encrypted, the FTP login and password still cross the Internet as cleartext, exposing them to possible eavesdropping.
Difficult to implement in a firewalled environment. Because FTP requires multiple connections on multiple ports, the FTP protocol is difficult to get through a firewall. I wanted a service that kept all its traffic on a single well-known port, with every connection initiated by the client.
Archaic user interface. The UNIX-inspired command structure of FTP can be difficult for non-technical users. This is mitigated somewhat by newer GUI-based Windows FTP clients, but the process can still be obscure.
Difficult to limit operations. I wanted to be able to restrict the ability to upload, download, and delete files on a per-directory basis.
Account management. E*TRADE being a typical fast-growing company, I wanted a better way to manage user accounts than the usual UNIX utilities useradd, usermod, and userdel. Ultimately, the requesting department or group should have the ability to manage accounts and permissions for their own directory on the server.
My first attempt at a solution was to look for commercial FTP servers which offered encryption capability. One product is called FileDrive made by a company called Differential (www.filedrive.com). This product looked promising, but was expensive, and would require all users both inside and outside of E*TRADE to obtain and install the FileDrive client software. At the time, the company also lacked a UNIX client.
I also considered Secure Shell (ssh), but as with FileDrive, licensing and installation of client software would be required. In addition, ssh key management would require significant time and effort, and the user interface would not be much better than FTP.
Another reason I wanted to stay away from commercial software packages was to avoid going into the business of end-user support. This service had to be as simple and fool-resistant as possible.
The Light Bulb
For about a year, I had been toying with the idea of using an SSL-based web server for secure file exchange. Most Internet users are familiar with downloading files from web sites: click on a link to a document, and the browser retrieves the file, then either displays it, invokes the proper external viewer, or offers to store the file to the local disk drive.
Uploading would be done using an HTML form with an input field of type FILE. Here is an HTML fragment showing how the File type is used:
<form enctype="multipart/form-data" method="post" action="receivefile.cgi">
File to upload:
<input type="file" size="35" name="upload_file">
<input type="submit" value="Upload">
</form>
To upload a file, the user enters a file name in the text box of a form (newer browsers can also offer a file selection box), and when the form is submitted, the browser sends the contents of the file as part of the form data.
A web-based file exchange service would have several advantages over FTP:
Using SSL would automatically encrypt data, logins, and passwords, eliminating the possibility of eavesdropping.
No software would need to be purchased or installed by the users.
A simple GUI-based user interface would be easy for non-technical users to master.
Since E*TRADE already used the Netscape Enterprise Server, no new software would need to be purchased for the server side.
Most sites permit web traffic, so the firewall problems of FTP would be avoided.
At first, I had hoped to find a complete package somewhere on the net to implement this scheme. A lot of searching and browsing did not turn up any results. The lack of any examples of file uploading may be the result of lingering suspicions about the file input type. When Netscape first introduced the file upload feature, many people criticized it as a security risk.
I spent some time trying to write Perl code from scratch to parse the multipart-MIME form data, but abandoned that approach when I discovered the Perl module CGI.pm implemented file reception. In just a few hours, I had a functioning web page consisting of a single input field and a "Submit" button, along with a CGI script that would receive the file and write it to disk. A new service was born.
User Interface Design
The next step after proof-of-concept was to flesh out the service and design the user interface and associated functions. The user interface is a single page with four stacked sections. The top section contains the E*TRADE logo, the SDX title, and a standard warning: "Unauthorized access is prohibited." The next section contains a directory name or title, and a text description of this particular directory. The third section contains the file upload field (with a Browse button, if the browser supports it) and an "Upload" button. The bottom section contains the list of files available for download. The list includes the file names and the modification dates and times of the files. If file deletion is permitted, a "DELETE" link beside each file name allows users to delete individual files.
My personal preference for web page visual design is rather utilitarian. I like 'em fast and clean, without a lot of eye candy. Nevertheless, a batch of text floating on a sea of white background is a little sterile. To add some visual interest, I added a vertical color bar (adopted from the company intranet) to the left side of the page, and the company logo to the top section.
To keep the service as generic as possible and keep management to a minimum, no navigation links are provided to move among directories. Users must either bookmark or enter URLs manually.
Since the Netscape Enterprise server was already used to run the main E*TRADE web site, and I had plenty of experience with it in my previous position as a UNIX systems administrator for E*TRADE, I chose this as the web server for SDX as well. The host machine is a Sun Ultra 2 running Solaris 2.6.
Each department or group that uses SDX is given one or more file exchange areas. Each area has a corresponding unique URL, and maps to a directory on the host filesystem.
Within each of these directories are the scripts and configuration files that implement the service. To hide the inner workings of SDX, and prevent conflicts between configuration files and data files, the actual data files are stored in a subdirectory immediately below each file area directory.
Two Perl scripts implement the primary SDX functions. The script index.cgi generates all the HTML to display the page (there are no static HTML files) and receives file uploads. The script delete.cgi handles file deletions.
I named the main script index.cgi so it would be automatically invoked by Netscape when users entered the directory URL. This allows users to treat the URLs as directories or file folders, and prevents direct listings of directory contents.
Three zero-length files may also be present in each directory. They serve as on-off flags for the upload, download, and delete functions. My original plan was for these files to contain lists of LDAP groups that are permitted to perform each operation. This has yet to be implemented.
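The flag-file check can be sketched as follows. This is a minimal illustration in Python rather than the Perl used for SDX, and the flag-file names are hypothetical, since the article does not give the real ones.

```python
from pathlib import Path

# Hypothetical flag-file names; the actual SDX names are not given in the article.
FLAG_FILES = {
    "upload": ".allow_upload",
    "download": ".allow_download",
    "delete": ".allow_delete",
}


def permitted_operations(area_dir):
    """Return the set of operations whose flag file exists in area_dir.

    Only the existence of each file matters; its contents are ignored,
    which matches the zero-length on-off flags described above.
    """
    area = Path(area_dir)
    return {op for op, name in FLAG_FILES.items() if (area / name).is_file()}
```

A page generator could call permitted_operations() once per request and emit the upload field or the "DELETE" links only for the operations returned.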
Access to SDX requires individual logins and passwords. To maintain accurate activity logs and accountability, I don't allow shared or generic accounts.
For user management, the Netscape LDAP service included with the Enterprise Server 3.5 is used. Originally, I used the Netscape Administrator interface to create user accounts, but this became too cumbersome when adding groups of more than a few users. So I developed scripts that use the Netscape LDAP interface utility "ldapmodify" to perform batch additions of users. I also created a page and CGI script that uses ldapmodify to give the users the ability to change their passwords.
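A batch addition of this kind amounts to generating LDIF text and feeding it to ldapmodify. The sketch below is an assumption-laden illustration in Python, not the SDX scripts themselves: the base DN, the choice of the inetOrgPerson object class, and the function name are all hypothetical, and it handles only plain ASCII values (anything else would need base64 encoding in LDIF).

```python
def ldif_add_entries(users, base_dn="ou=People, o=example.com"):
    """Build LDIF text that adds one entry per user.

    users is an iterable of (uid, full_name, surname) tuples.
    base_dn is a placeholder; a real directory has its own suffix.
    """
    blocks = []
    for uid, full_name, surname in users:
        blocks.append("\n".join([
            f"dn: uid={uid}, {base_dn}",
            "changetype: add",
            "objectClass: top",
            "objectClass: person",
            "objectClass: organizationalPerson",
            "objectClass: inetOrgPerson",
            f"uid: {uid}",
            f"cn: {full_name}",
            f"sn: {surname}",
        ]))
    # LDIF separates entries with a blank line.
    return "\n\n".join(blocks) + "\n"
```

The resulting text would be saved to a file and passed to ldapmodify along with the directory administrator's credentials. Initial passwords are deliberately left out of the sketch.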
Once users and groups are defined, access to the directories is controlled by the Enterprise Server access control lists (ACLs). Having directory ACLs separated from user and group definitions is less convenient than I would like. Ideally, a single interface would manage all elements. My future goal is to give the index.cgi script direct access to the LDAP database, and store the directory ACLs as LDAP entries. Defining my own ACLs would also bypass a limitation of Netscape's ACLs: the server can't determine a user's permissions until he actually attempts an operation. I want to be able to determine in advance the user's permitted operations, so forbidden operations are not even displayed.
Tools for Automating the File Transfer Process
Once I had the basic system up, the need quickly arose for a way to perform non-interactive file transfers, such as from scripts and cron jobs. For an earlier project involving benchmarking web site performance, I had written a Perl script that used the SSLeay Perl module to perform HTTP retrievals of web pages. I was able to quickly adapt this script to automated downloading by adding a few lines to write the retrieved file to disk.
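The shape of such a download tool can be sketched in a few lines. This is a Python illustration using the standard library, not the original Perl/SSLeay script, and it assumes the server uses HTTP Basic authentication, which the article does not state explicitly. The function names are hypothetical.

```python
import base64
import shutil
import urllib.request


def build_request(url, user, password):
    """Build an authenticated GET request, refusing non-HTTPS URLs."""
    if not url.lower().startswith("https://"):
        raise ValueError("refusing to send credentials over a non-HTTPS URL")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    # Assumption: the server accepts HTTP Basic credentials.
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def download(url, user, password, dest_path):
    """Retrieve url and stream the response body to dest_path."""
    request = build_request(url, user, password)
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

Rejecting plain "http" URLs up front keeps a mistyped URL in a cron job from sending the login and password in the clear.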
The upload script took more time, mainly in working out all the nuances of the multi-part MIME format required for the form submission. Once again, finding no examples on the web, I was working from scratch. I also added a flag variable to the upload form so when the server received a submission from a script, its reply was a small, easily-parsed text message rather than a full-blown HTML page.
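For readers facing the same task, the multipart/form-data layout can be sketched as below. This is a Python illustration, not the original Perl script. The field name upload_file comes from the form fragment shown earlier; the name of the flag field that requests a plain-text reply is not given in the article, so the "script" field in the usage note is purely hypothetical.

```python
import uuid


def build_multipart(fields, file_field, filename, file_bytes):
    """Return (content_type, body) for a multipart/form-data POST.

    fields is a dict of ordinary text fields; the file is sent as one
    extra part carrying a filename and a generic binary content type.
    """
    boundary = uuid.uuid4().hex          # random string unlikely to occur in the data
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"),
            b"",                         # blank line ends the part headers
            value.encode("utf-8"),
        ]
    lines += [
        delimiter,
        (f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{filename}"').encode("utf-8"),
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
    ]
    lines += [delimiter + b"--", b""]    # closing delimiter, then a final CRLF
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(lines)
```

A caller might use build_multipart({"script": "1"}, "upload_file", "report.xls", data) and send the returned body with the returned value as the Content-Type header. The details that are easy to get wrong are the CRLF line endings, the blank line after each part's headers, and the trailing "--" on the final delimiter.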
The Perl scripts worked well in our predominantly Solaris UNIX environment. But many potential users wanted to transfer files to and from Windows NT systems. Obtaining or building Perl, then adding the SSLeay module on Windows would be beyond the capabilities of most of my intended audience, so the Perl scripts would be of no use. I decided to build standalone Windows executable versions.
I had not done any C coding in Windows in quite some time, so as an interim step, I first built C versions of the programs on UNIX, using as a template the demo SSLeay client cli.cpp included in the SSLeay package. These standalone UNIX binaries would later prove to be useful on UNIX hosts where neither Perl nor the SSLeay module was available.
Once the UNIX versions were working well, I began the port to Windows using Microsoft Visual C++ 6.0. The most time-consuming part of the port was getting the SSLeay environment built under Visual C++. This took several hours of searching the online C++ help files and much trial-and-error. Once this was done, however, porting the actual programs went fairly smoothly. Most changes had to do with differences in the TCP/IP socket library functions.
One final change made later was to restrict the utilities to using only the DES encryption algorithm, to avoid the need for an RSA license.
A brief description of some of the problems I encountered:
Windows users do not hesitate to use spaces and all sorts of special characters in their file names. I had to take this into account when translating file names into URLs, translating non-alphanumeric characters into hex. Path names also had to be trimmed off. This requires two separate passes, one for UNIX paths, and another for Windows.
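The two-pass trimming and the hex translation can be sketched as follows, in Python rather than the Perl used for SDX. The function names are hypothetical. Note that quote() leaves letters, digits, and the characters "_", ".", "-", and "~" unescaped, which is slightly more permissive than encoding every non-alphanumeric character.

```python
from urllib.parse import quote


def base_name(client_path):
    """Strip any directory part from a client-supplied path."""
    name = client_path.rsplit("/", 1)[-1]    # pass 1: UNIX separators
    name = name.rsplit("\\", 1)[-1]          # pass 2: Windows separators
    return name


def url_name(file_name):
    """Percent-encode a file name for use as the last segment of a URL."""
    # safe="" so that "/" and other reserved characters are encoded too.
    return quote(file_name, safe="")
```

For example, a browser on Windows might submit C:\My Documents\Q3 report.xls; base_name() reduces that to "Q3 report.xls", and url_name() turns the space into %20 for the download link.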
The mime.types file included in the UNIX version of Netscape Enterprise Server defines the file extensions .bat and .exe as the CGI file type. This meant that whenever someone tried to download a file with either of these extensions, the server tried to execute the file locally (on the UNIX host) rather than sending it as data.
A frequent mistake is to omit the "s" from "https." This can happen if the user simply forgets the "s" when typing the URL, or relies on the "feature" of most browsers to assume "http" if only the domain name is entered. The usual result is a call to me complaining that the SDX server is down. A future enhancement would be to set up another web server instance on port 80 that returns a redirect to port 443.
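The redirect idea is simple enough to sketch. The following Python illustration is not part of SDX (which would more likely use a second Netscape server instance); the host name is a placeholder and the function and class names are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

SECURE_HOST = "sdx.example.com"  # placeholder, not the real SDX host name


def https_location(path, host=SECURE_HOST):
    """Return the HTTPS URL that corresponds to a plain-HTTP request path."""
    return f"https://{host}{path}"


class RedirectToHTTPS(BaseHTTPRequestHandler):
    """Answer every GET or HEAD with a permanent redirect to the HTTPS URL."""

    def do_GET(self):
        self.send_response(301)
        self.send_header("Location", https_location(self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_HEAD = do_GET


def serve(port=80):
    """Run the redirector; binding to port 80 normally requires privileges."""
    HTTPServer(("", port), RedirectToHTTPS).serve_forever()
```

The redirector holds no data and needs no authentication, so a user who forgets the "s" lands on the secure login prompt instead of an error.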
I originally had the server on a high-numbered IP port, because port 443 was already in use on the host machine. This caused a problem for some outside users because their sites' firewalls allowed outbound connections only on ports 80 and 443. Using the Solaris virtual interface capability, I added another IP address, and moved the SDX service to port 443. Once I had a dedicated IP address, I also assigned a dedicated DNS name.
I severely underestimated the amount of disk space that would be needed. Some users are transferring files several hundred megabytes in size. The original host for the service was a Sun Ultra 2 with two 2-gigabyte disk drives (which were also shared with several other services). I am in the process of moving the service to a dedicated Sun Enterprise 250 server with 100 gigabytes of disk space. So that users would not be surprised by running out of disk space during an upload, I also added a message to the upload portion of the form showing the amount of free disk space.
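Producing the free-space message takes only a filesystem query. A minimal Python sketch, with hypothetical wording for the message and a hypothetical function name:

```python
import shutil


def free_space_message(path):
    """Describe the free space on the filesystem that contains path."""
    free_mb = shutil.disk_usage(path).free // (1024 * 1024)
    return f"Free space available for uploads: {free_mb} MB"
```

The path passed in should be the directory where uploaded files are stored, so the figure reflects the filesystem that will actually receive the data.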
The CGI.pm Perl module uses a temp file to store the uploaded file as it is being read from the posted form. If the filesystem containing the temp file runs out of space, the module silently truncates the file. The default location for these temp files is /var/tmp, which on my original host was a rather small filesystem. To permit larger temp files, I created a temp directory on the main file storage filesystem. The following Perl statement tells the CGI.pm module which directory to use for temp files:
$TempFile::TMPDIRECTORY = '/opt/usr/tmp';
If the upload does not complete (the network connection is broken or user presses the Stop button), CGI.pm leaves the temp file behind. To keep the temp space from filling up because of this, I created a daily cron job to clean up any leftover files.
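The cleanup job can be as small as the following Python sketch. It is an illustration rather than the actual SDX cron job, and the one-day age threshold and function name are assumptions.

```python
import os
import time


def remove_stale_files(tmp_dir, max_age_seconds=24 * 3600, now=None):
    """Delete regular files in tmp_dir not modified within max_age_seconds.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in os.scandir(tmp_dir):
        # Skip directories and symbolic links; only plain files are candidates.
        if not entry.is_file(follow_symlinks=False):
            continue
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

Because the test is based on modification time, a temp file belonging to an upload still in progress keeps getting touched and is left alone.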
I originally placed the GIF graphics files used on all the pages in a dedicated "images" subdirectory of the document root directory. However, some versions of the Netscape browser presented the user with two separate authentication prompts, one for the images directory and one for the actual directory being accessed. Newer versions don't have this problem, but to solve the problem for the affected users, I created beneath each document directory an "images" symbolic link to the actual "images" directory.
One of the biggest challenges is educating potential users about the availability and use of SDX. It is not enough to simply put some documentation on the Intranet and wait passively for requests to come in. I plan to use email messages, presentations, and maybe even this paper to raise awareness about the virtues of SDX.
Along with education about the generic service, I plan to produce documentation geared toward application and web site developers about how to use my techniques in their own web sites.
Improvements in Access Control
Having directory ACLs stored separately from user and group definitions is less convenient than I would like. Ideally, a single interface would manage all access control elements. My goal is to give the CGI scripts direct access to the LDAP database, and store the directory ACLs as LDAP entries. A frequent request is to restrict download/upload/delete access by user or group rather than by entire directory. Defining my own ACLs would also give me the flexibility to do this.
One of the weaknesses of server ACLs is that the server doesn't determine a user's permissions until he actually attempts an operation. I would prefer to have that information in advance, so when the dynamic page is generated, forbidden operations are not even displayed.
Delegated User and Access Management
To reduce my administrative workload, I would like to be able to delegate "sub-administrators" who can add, change, and delete users in their designated groups. This will become even more critical as the number of users grows, and perhaps multiple servers at our multiple data centers are established. I am currently reviewing a product for this purpose called SiteMinder, by Netegrity.
Integration with Other Access Control Methods
The last thing any of us need is yet another login and password (YALAP?). Someday I would like to be able to point the server at an authentication server, and not have to manage user accounts at all.
SDX has been in active service for over a year. There are about 10 different groups within E*TRADE who use SDX on a regular basis. The most frequent users use SDX to exchange spreadsheets, Microsoft Word files, and documents. Other groups have successfully built automated processes to exchange files using the tools I provide. One group has adopted my techniques to use on their own web site for serving their clients.
Some developers still cling to FTP, especially if they already have scripts built to use it. But acceptance is growing, especially with the increasing reliability of the automated transfer scripts.
Here are the URLs for the tools and products mentioned in this article:
CGI.pm Perl module
Secure Shell (ssh)