A Common LISP Hypermedia Server

John C. Mallery (JCMA@AI.MIT.EDU)
Artificial Intelligence Laboratory
Massachusetts Institute of Technology

5 May, 1994

Proceedings of The First International Conference on The World-Wide Web, Geneva: CERN, May 25, 1994. See also the published and PostScript versions.

Keywords: Common Lisp, HTML, HTTP, Hypermedia, Interactivity, Server.

Abstract: A World-Wide Web (WWW) server was implemented in Common LISP in order to facilitate exploratory programming in the interactive hypermedia domain and to provide access to complex research programs, particularly artificial intelligence systems. The server was initially used to provide interfaces for document retrieval and for email servers. More advanced applications include interfaces to systems for inductive rule learning and natural-language question answering. Continuing research seeks to more fully generalize automatic form-processing techniques developed for email servers to operate seamlessly over the Web. The conclusions argue that presentation-based interfaces and more sophisticated form processing should be moved into the clients in order to reduce the load on servers and provide more advanced interaction models for users.

Availability: http://www.ai.mit.edu/projects/iiip/doc/cl-http/home-page.html

Overview

Introduction

A World-Wide Web (WWW) server was implemented in Common LISP (see the FAQs, the Association of Lisp Users, and the ANSI specification) in order to facilitate exploratory programming in the interactive hypermedia domain and to provide access to complex research systems over the Web. The general motivation for developing this server was to provide a computational tool that would strengthen the link between artificial intelligence researchers and the distributed hypermedia community. As the amount of information available over the World-Wide Web grows, it will become increasingly necessary to deploy intelligent methods to store, retrieve, analyze, filter, and present information. At the same time, high-productivity programming tools employed by AI researchers will become increasingly relevant for testing new ideas in servers before incorporating them into standards-based clients. A Common LISP HTTP server provides a bridge that allows AI researchers to plug their systems into the WWW, while it affords developers of distributed hypermedia standards a vehicle through which they can import effective and relevant technologies. In this spirit, the conclusion proposes that HTML borrow ideas from the CLIM Presentation System in order to modernize the user interfaces in clients, to upgrade the interaction paradigm, and to achieve a number of efficiency gains for servers.

More specific motivations for this work in the context of the Intelligent Information Infrastructure Project at M.I.T. were to develop an HTTP server meeting these criteria:

The current server is excellent for rapid prototyping precisely because it builds on Common LISP and provides a fine-grained vocabulary of operators which are easily combined and modified according to evolving application requirements and draft protocol standards. The server itself is an example of rapid prototyping, since its 2700 lines of LISP were written from scratch and debugged by one person in about two weeks. A high-profile demonstration at the end of the period was well received.

More recently, an authoring tool for email-based forms was generalized to emit HTML (Renaud, 1994). This graphical tool is now used to author forms that work over both email and the WWW. The authoring tool is important for the generalization of email forms across the Web because form processing in email servers performs automatic retries when user input is syntactically or semantically incorrect. To implement automatic retries over the WWW, it is necessary to dynamically generate retry forms that explain the problem, incorporate all correct answers from the original form, and allow the user to correct the erroneous answers. Computing form retries requires data structures beyond those normally associated with scripting languages. [Other research (Houh, Lindblad, & Wetherall, 1994) arrives at similar conclusions concerning the limitations of scripting languages for more advanced, dynamic WWW applications.] In general, more advanced form-based applications will need the richer data structures offered by full-featured programming languages like Common LISP.

Initial Applications

The Common LISP HTTP Server implementation was driven initially by the desire to provide WWW access to email servers and associated document or survey databases, which were built on the COMLINK System. Development of a native server offered WWW access not just for these specific applications but also for the COMLINK substrate systems in general. Thus, any application developed on top of COMLINK would automatically have the World-Wide Web as a user interface.

The initial applications included:

All these applications handle invalid or incoherent user input by returning dynamically generated HTML that explains the problem and guides the user's efforts to resubmit.

Advanced Applications

Two artificial intelligence applications were fielded by the end of the development period:

Integration with both of these large LISP systems was possible within several hours of work each.

Server Features

The main implemented features of the server include:

Classes of HTTP URLs

Uniform Resource Locators are implemented as CLOS objects, to which URL strings are mapped upon receipt. Although the URL implementation handles all URL schemes defined by the URL specification, only the base classes for the HTTP scheme will be described here:
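
The mapping from URL strings to objects can be illustrated with a minimal sketch. It is written in Python rather than the server's Common LISP and CLOS, and the class names (HttpPath, HttpObject, HttpSearch), the function intern_url, and the "+" separator for search keys are assumptions made for illustration only; they are not taken from the server's source.

```python
from urllib.parse import urlsplit


class HttpUrl:
    """Base class for HTTP URLs (hypothetical name)."""

    def __init__(self, host, port, path):
        self.host, self.port, self.path = host, port, path


class HttpPath(HttpUrl):
    """A directory-like URL whose path ends in "/"."""


class HttpObject(HttpUrl):
    """A URL that denotes a single resource."""


class HttpSearch(HttpUrl):
    """A URL that carries search keys after "?"."""

    def __init__(self, host, port, path, search_keys):
        super().__init__(host, port, path)
        self.search_keys = search_keys


_URL_TABLE = {}  # normalized key -> the unique URL object


def intern_url(url_string):
    """Map a URL string to its unique object, creating it on first receipt."""
    parts = urlsplit(url_string)
    if parts.scheme.lower() != "http":
        raise ValueError(f"not an HTTP URL: {url_string!r}")
    host = parts.hostname or ""   # urlsplit lowercases the host name
    port = parts.port or 80       # default HTTP port
    path = parts.path or "/"
    key = (host, port, path, parts.query)
    url = _URL_TABLE.get(key)
    if url is None:
        if parts.query:
            url = HttpSearch(host, port, path, parts.query.split("+"))
        elif path.endswith("/"):
            url = HttpPath(host, port, path)
        else:
            url = HttpObject(host, port, path)
        _URL_TABLE[key] = url
    return url
```

Because equivalent strings map to the same object, per-URL state such as the export computation can be attached once and found again on every later request.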

Exporting URLs

URLs become accessible via the server once they have been exported with the function EXPORT-URL. The export function always takes an external URL name and an export type, which defines the computation used to serve the data denoted by the URL. In most cases, some additional arguments are provided, depending on export type. For example, when data resides in a file, the physical pathname where the data is stored is passed as an argument to export. Similarly, when a database is searched, the database is passed as an export argument along with the URL. The following export types are defined:
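
The idea of an export type selecting the computation that serves a URL can be sketched as a small registry. This is a Python illustration, not the signature of EXPORT-URL itself; the export type names "file" and "search", the keyword arguments pathname, database, and keys, and the helper serve are all hypothetical.

```python
_EXPORT_TYPES = {}  # export type name -> function that serves the data
_EXPORTS = {}       # exported URL -> (export type, export arguments)


def define_export_type(name):
    """Register the computation used to serve URLs of one export type."""
    def register(fn):
        _EXPORT_TYPES[name] = fn
        return fn
    return register


@define_export_type("file")
def serve_file(url, pathname):
    # The data resides in a file; the pathname was given at export time.
    with open(pathname, encoding="utf-8") as stream:
        return stream.read()


@define_export_type("search")
def serve_search(url, database, keys=()):
    # The database was given at export time; the keys arrive per request.
    return [item for item in database
            if all(key.lower() in item.lower() for key in keys)]


def export_url(url, export_type, **export_args):
    """Make url accessible, served by the computation for export_type."""
    if export_type not in _EXPORT_TYPES:
        raise ValueError(f"unknown export type: {export_type!r}")
    _EXPORTS[url] = (export_type, export_args)


def serve(url, **request_args):
    """Serve an exported URL; an unexported URL raises KeyError."""
    export_type, export_args = _EXPORTS[url]
    return _EXPORT_TYPES[export_type](url, **export_args, **request_args)
```

Under these assumptions, export_url("/docs/search", "search", database=titles) followed by serve("/docs/search", keys=["lisp"]) returns the titles that contain the key.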

The COMLINK System

Over the past year and a half, the author developed a system that performs sophisticated operations over email, routes documents, and automatically surveys users. The COMLINK system uses a transaction-controlled, persistent-object database to represent users, hosts, mailing lists, document categories, documents, and more. For the purposes of this paper, three aspects of this system are relevant.

Document Routing and Retrieval

The COMLINK system supports document universes with associated taxonomies of categories. As documents are distributed through document universes, they are archived and indexed under one or more categories. Users can subscribe to a publications stream or retrieve documents by means of boolean combinations of categories. When retrieving documents, users may also provide temporal and quantity constraints to further circumscribe the documents found. The document retrieval and publication code that stands behind the initial WWW applications uses this technology as a back-end.
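
Retrieval by boolean combinations of categories, narrowed by temporal and quantity constraints, can be sketched as follows. The sketch is in Python and does not reflect COMLINK's persistent-object database; the Document record, the nested-tuple query notation, and the parameters since, before, and limit are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    title: str
    categories: frozenset
    published: date


def matches(query, categories):
    """A query is a category name or a tuple ("and" | "or" | "not", ...)."""
    if isinstance(query, str):
        return query in categories
    operator, *operands = query
    if operator == "and":
        return all(matches(q, categories) for q in operands)
    if operator == "or":
        return any(matches(q, categories) for q in operands)
    if operator == "not":
        return not matches(operands[0], categories)
    raise ValueError(f"unknown operator: {operator!r}")


def retrieve(documents, query, since=None, before=None, limit=None):
    """Return matching documents, newest first, within the constraints."""
    hits = [d for d in documents
            if matches(query, d.categories)
            and (since is None or d.published >= since)
            and (before is None or d.published < before)]
    hits.sort(key=lambda d: d.published, reverse=True)
    return hits if limit is None else hits[:limit]
```

For example, the query ("and", "lisp", ("not", "www")) selects documents indexed under lisp but not under www.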

Automatic Form Processing

The COMLINK system interacts with users via textual forms exchanged in email, as well as conventional "subject line" commands, much like those found in standard listservers. Email servers are associated with a command interface that provides access to all the forms and subject line commands.

Forms are composed of a series of queries. In addition to a question or instructions, a query has an associated CLIM presentation type that presents any default value and parses any new value supplied by the user. Forms are written to a stream by calling the WRITE-FORM function with a set of value bindings for each query of the form. As WRITE-FORM calls the generic operation to present each query, the queries present themselves by calling the PRESENT method for their presentation type.
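
The relationship between forms, queries, and presentation types can be sketched in a few lines. This Python illustration only mirrors the structure described above; it is not CLIM or the COMLINK WRITE-FORM function, and the class names, the write_form signature, and the textual layout of the form are assumptions.

```python
import io


class IntegerType:
    """A stand-in for a presentation type: presents and parses integers."""

    def present(self, value):
        return "" if value is None else str(value)

    def accept(self, text):
        return int(text.strip())


class YesNoType:
    """A stand-in presentation type for boolean answers."""

    def present(self, value):
        if value is None:
            return ""
        return "yes" if value else "no"

    def accept(self, text):
        answer = text.strip().lower()
        if answer not in ("yes", "no"):
            raise ValueError("expected yes or no")
        return answer == "yes"


class Query:
    def __init__(self, name, prompt, presentation_type):
        self.name = name
        self.prompt = prompt
        self.presentation_type = presentation_type

    def present(self, stream, value):
        # The query presents itself through its presentation type.
        text = self.presentation_type.present(value)
        stream.write(f"{self.prompt}\n{self.name}: {text}\n")


class Form:
    def __init__(self, name, queries):
        self.name = name
        self.queries = queries


def write_form(form, stream, bindings):
    """Write form to stream, defaulting each query from bindings."""
    stream.write(f"Form: {form.name}\n")
    for query in form.queries:
        query.present(stream, bindings.get(query.name))
```

Calling write_form with an io.StringIO stream and the bindings {"copies": 2} writes the form with the copies query defaulted and any unbound query left blank for the user to fill in.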

Forms are parsed by finding queries and converting the textual input associated with each query into its internal representation. The basic procedure for parsing a query is:

When query values are successfully parsed, the form's response function is applied to the parsed query values to perform the computation associated with the form. If there are query parsing errors, the system returns to the user a form with all the correct values defaulted and an explanation about how to correct the failing queries for successful resubmission.
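
The success and retry paths can be sketched together. The following Python illustration assumes a simple "name: value" line format for submitted forms; the function names, the error messages, and the layout of the retry form are hypothetical and are not taken from COMLINK.

```python
def parse_integer(text):
    try:
        return int(text)
    except ValueError:
        raise ValueError("expected an integer") from None


def parse_yes_no(text):
    answer = text.strip().lower()
    if answer not in ("yes", "no"):
        raise ValueError("expected yes or no")
    return answer == "yes"


def parse_form(queries, text):
    """queries maps query names to parsers. Returns (values, errors, raw)."""
    raw = {}
    for line in text.splitlines():
        name, separator, value = line.partition(":")
        if separator and name.strip() in queries:
            raw[name.strip()] = value.strip()
    values, errors = {}, {}
    for name, parser in queries.items():
        if not raw.get(name):
            errors[name] = "no answer supplied"
            continue
        try:
            values[name] = parser(raw[name])
        except ValueError as problem:
            errors[name] = str(problem)
    return values, errors, raw


def process_form(queries, text, response):
    """Apply response to the parsed values, or return a retry form."""
    values, errors, raw = parse_form(queries, text)
    if not errors:
        return response(**values)
    lines = ["Please correct the entries marked below and resubmit."]
    for name in queries:
        if name in errors:
            lines.append(f"{name}:    <-- {errors[name]}")
        else:
            lines.append(f"{name}: {raw[name]}")  # keep the correct answer
    return "\n".join(lines)
```

The retry form carries every correct answer forward as a default, so the user only has to repair the failing queries before resubmitting.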

Graphical Form Authoring Tool

A graphical form authoring tool was written by Renaud (1994) for the COMLINK system. Coded in CLIM, the interface defines meta-level abstractions for forms, queries, and presentation types that allow users to define automatic surveys and forms without having to write LISP code. At the same time, the data structures are abstracted in a way that forms can be defined dynamically under program control. For the set of presentation types previously defined for COMLINK, Renaud defined new presentation and accept multimethods that dispatch on the HTML presentation view. This means that the same presentation type, which already worked for the email view, could now operate for the HTML view, displaying itself in HTML and accepting its input with the HTML form-processing facilities.
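
Dispatching on both the presentation type and the view can be sketched with a table keyed on the pair. This Python illustration imitates the effect of CLOS multimethods rather than using them; the names defpresent and present, the view names "email" and "html", and the generated markup are assumptions, not Renaud's implementation.

```python
import html

_PRESENT_METHODS = {}  # (presentation type, view) -> presenting function


def defpresent(presentation_type, view):
    """Register a present method for one (type, view) pair."""
    def register(fn):
        _PRESENT_METHODS[(presentation_type, view)] = fn
        return fn
    return register


def present(presentation_type, view, name, value):
    """Dispatch on both the presentation type and the view."""
    try:
        method = _PRESENT_METHODS[(presentation_type, view)]
    except KeyError:
        raise TypeError(
            f"no present method for {presentation_type!r} in view {view!r}"
        ) from None
    return method(name, value)


@defpresent("boolean", "email")
def _(name, value):
    return f"{name}: {'yes' if value else 'no'}"


@defpresent("boolean", "html")
def _(name, value):
    checked = " checked" if value else ""
    return f'<input type="checkbox" name="{html.escape(name)}"{checked}>'


@defpresent("string", "email")
def _(name, value):
    return f"{name}: {value}"


@defpresent("string", "html")
def _(name, value):
    return (f'<input type="text" name="{html.escape(name)}" '
            f'value="{html.escape(value)}">')
```

Adding a new view then means registering one more method per presentation type, while the forms and queries that use those types stay unchanged.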

Unlike HTML form input types, CLIM has a rich basic set of presentation types, which are routinely extended by application programs. After pairing up the presentation types that match between CLIM and HTML form input types, Renaud defined the appropriate cliches to accept input from the user for CLIM presentation types via HTML cliches. Once this small correspondence set was exhausted, the rest of the task reduced to accepting a text string from the client and applying the CLIM accept method to parse the input from the string. Unfortunately, all the computation required to parse input strings must remain on the server because there is currently no defined way to tell clients how to accept the input or check its validity.
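
The fallback described above can be sketched as a small correspondence table plus server-side parsing. This Python illustration is hypothetical throughout: the type names, the pairing with HTML input types, and the ISO date format are assumptions, not the correspondence set Renaud defined.

```python
from datetime import date

# Hypothetical pairing of presentation types with HTML form input types.
HTML_INPUT_FOR = {"boolean": "checkbox", "member": "radio", "string": "text"}

# Server-side parsers for types that have no HTML counterpart.
SERVER_ACCEPT = {"integer": int, "date": date.fromisoformat}


def html_input_type(presentation_type):
    """Use the paired input type if one exists, else a plain text field."""
    return HTML_INPUT_FOR.get(presentation_type, "text")


def accept_from_client(presentation_type, raw_text):
    """Parse the string sent by the client; raises ValueError if invalid."""
    parser = SERVER_ACCEPT.get(presentation_type, str)
    return parser(raw_text.strip())
```

Every type outside the table is shown as a text field, so its validity is only discovered when the server parses the submitted string.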

Conclusions

An immediate goal for the Common LISP server is to develop seamless form processing over both email and the Web. At first, CLIM presentation types will be evaluated by the server as it checks the values returned in HTML forms for validity and parses them into the appropriate internal representations. After a period of experimentation, during which a good set of presentation types will be identified, it should be possible to propose a set of presentation types that clients can handle without undue difficulty. If servers could transfer definitions to clients, the presentation types available to remote applications could be extended or refreshed dynamically. This would require extensions to the HTTP protocol and agreement on one or more safe languages for transferring presentation parsers and generators to clients. By relocating the main responsibility for validating user input to clients, considerable computational load and unnecessary connections can be offloaded from servers and distributed among clients.

The Common LISP HTTP server makes it possible to interface complex LISP systems often found in artificial intelligence applications to the World-Wide Web. Although the immediate goal of the server was to serve as a research tool for a project on intelligent information routing at M.I.T., the server can be useful for many other members of the world-wide LISP community and allow them to participate in the explosion of interesting WWW applications. At the same time, the rapid prototyping features of Common LISP can now become available to the WWW research community so that they can more quickly mock up and test out new ideas.

Availability

The server home page at http://www.ai.mit.edu/projects/iiip/doc/cl-http/home-page.html explains how to obtain copies of the server and provides further information, including source access.

Acknowledgments

This paper and the server were improved by comments from Marc Andreessen, Mark Nahabedian, Benjamin Renaud, Howard Shrobe, and Robert Thau. Benjamin Renaud implemented the WWW interface to the rule learning system. Boris Katz adapted his natural language system to allow WWW access. This paper describes research done at the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology. Support for the M.I.T. Artificial Intelligence Laboratory's artificial intelligence research is provided in part by the Advanced Research Projects Agency of the Department of Defense under contract number MDA972-93-1-003N7.